Science.gov

Sample records for molecular features predicting

  1. Genes associated with histopathologic features of triple negative breast tumors predict molecular subtypes.

    PubMed

    Purrington, Kristen S; Visscher, Daniel W; Wang, Chen; Yannoukakos, Drakoulis; Hamann, Ute; Nevanlinna, Heli; Cox, Angela; Giles, Graham G; Eckel-Passow, Jeanette E; Lakis, Sotiris; Kotoula, Vassiliki; Fountzilas, George; Kabisch, Maria; Rüdiger, Thomas; Heikkilä, Päivi; Blomqvist, Carl; Cross, Simon S; Southey, Melissa C; Olson, Janet E; Gilbert, Judy; Deming-Halverson, Sandra; Kosma, Veli-Matti; Clarke, Christine; Scott, Rodney; Jones, J Louise; Zheng, Wei; Mannermaa, Arto; Eccles, Diana M; Vachon, Celine M; Couch, Fergus J

    2016-05-01

    Distinct subtypes of triple negative (TN) breast cancer have been identified by tumor expression profiling. However, little is known about the relationship between histopathologic features of TN tumors, which reflect aspects of both tumor behavior and tumor microenvironment, and molecular TN subtypes. The histopathologic features of TN tumors were assessed by central review and 593 TN tumors were subjected to whole genome expression profiling using the Illumina Whole Genome DASL array. TN molecular subtypes were defined based on gene expression data associated with histopathologic features of TN tumors. Gene expression analysis yielded signatures for four TN subtypes (basal-like, androgen receptor positive, immune, and stromal) consistent with previous studies. Expression analysis also identified genes significantly associated with the 12 histological features of TN tumors. Development of signatures using these markers of histopathological features resulted in six distinct TN subtype signatures, including an additional basal-like and stromal signature. The additional basal-like subtype was distinguished by elevated expression of cell motility and glucose metabolism genes and reduced expression of immune signaling genes, whereas the additional stromal subtype was distinguished by elevated expression of immunomodulatory pathway genes. Histopathologic features that reflect heterogeneity in tumor architecture, cell structure, and tumor microenvironment are related to TN subtype. Accounting for histopathologic features in the development of gene expression signatures, six major subtypes of TN breast cancer were identified. PMID:27083182

  2. Semen molecular and cellular features: these parameters can reliably predict subsequent ART outcome in a goat model

    PubMed Central

    Berlinguer, Fiammetta; Madeddu, Manuela; Pasciu, Valeria; Succu, Sara; Spezzigu, Antonio; Satta, Valentina; Mereu, Paolo; Leoni, Giovanni G; Naitana, Salvatore

    2009-01-01

    Currently, the assessment of sperm function in a raw or processed semen sample is not able to reliably predict sperm ability to withstand freezing and thawing procedures and in vivo fertility and/or assisted reproductive biotechnologies (ART) outcome. The aim of the present study was to investigate which parameters among a battery of analyses could predict subsequent spermatozoa in vitro fertilization ability and hence blastocyst output in a goat model. Ejaculates were obtained by artificial vagina from 3 adult goats (Capra hircus) aged 2 years (A, B and C). A logistic regression analysis was used to assess the predictive value of viability, computer assisted sperm analyzer (CASA) motility parameters and intracellular ATP concentration before and after thawing, and of DNA integrity after thawing, on subsequent embryo output after an in vitro fertility test. Individual differences in semen parameters were evident for semen viability after thawing and DNA integrity. Results of the IVF test showed that spermatozoa collected from A and B led to higher cleavage rates (p < 0.01) and blastocyst output (p < 0.05) compared with C. The logistic regression model explained a deviance of 72% (p < 0.0001), directly related to the mean percentage of rapid spermatozoa in fresh semen (p < 0.01), semen viability after thawing (p < 0.01), and two of the three comet parameters considered, i.e., tail DNA percentage and comet length (p < 0.0001). DNA integrity alone had a high predictive value on IVF outcome with frozen/thawed semen (deviance explained: 57%). The model proposed here represents one of the many possible ways to explain differences found in embryo output following IVF with different semen donors and may represent a useful tool to select the most suitable donors for semen cryopreservation. PMID:19900288

  3. Features of CD44+/CD24-low phenotypic cell distribution in relation to predictive markers and molecular subtypes of invasive ductal carcinoma of the breast.

    PubMed

    Gudadze, M; Kankava, Q; Mariamidze, A; Burkadze, G

    2014-03-01

    Breast cancer is the most widespread pathology among women. Despite current progress in research and treatment of metastatic breast cancer, mortality caused by this disease is still high, because therapy is limited by the existence of therapy-resistant cells. Cancer stem cells are the only cells with the ability of unlimited proliferative activity and cancerous potential; thus, they participate in the growth, progression and dissemination of cancer. Cancer stem cells are resistant to various forms of therapy, including chemotherapy and radiotherapy. Results of examination showed that 50% of all cases are positive for so-called stem cell markers, thus 45% of cases are negative. CD44+/CD24-low cases (cases that reveal stem cell phenotype) in the group of invasive ductal carcinoma of Luminal A molecular subtype are almost as many as CD44+/CD24+ and CD44-/CD24+ phenotype cancers. In this group non-stem phenotype cases are 65%, so 5 times more than stem cell phenotype cancers. 1324 postoperative breast materials studied from 2008 to 2012 at the laboratory of "Pathgeo-Union of Pathologists" LTD and Academician N. Kipshidze Central University Clinic were used as test materials, and specimens from 393 patients with invasive ductal carcinoma were selected. The relationship between CD44/CD24 marker expression in phenotypically different cancers and clinicopathologic parameters, as well as various biological features, was assessed by Pearson's correlation analysis and the χ² test. Statistical analysis of the numerical data was performed using SPSS V.19.0. Confidence interval of 95% was considered statistically significant. Stem cell phenotype-positive cases are represented at the highest percentage in the Luminal B and basal-like molecular subgroups, which in our view is associated with their aggressive behavior and resistance to chemotherapy. Relatively good prognosis and response to chemotherapy of Luminal A molecular subtype cancers are to be stipulated by lower

  4. Communication: Finding destructive interference features in molecular transport junctions

    SciTech Connect

    Reuter, Matthew G.; Hansen, Thorsten

    2014-11-14

    Associating molecular structure with quantum interference features in electrode-molecule-electrode transport junctions has been difficult because existing guidelines for understanding interferences only apply to conjugated hydrocarbons. Herein we use linear algebra and the Landauer-Büttiker theory for electron transport to derive a general rule for predicting the existence and locations of interference features. Our analysis illustrates that interferences can be directly determined from the molecular Hamiltonian and the molecule–electrode couplings, and we demonstrate its utility with several examples.

  5. Communication: Finding destructive interference features in molecular transport junctions.

    PubMed

    Reuter, Matthew G; Hansen, Thorsten

    2014-11-14

    Associating molecular structure with quantum interference features in electrode-molecule-electrode transport junctions has been difficult because existing guidelines for understanding interferences only apply to conjugated hydrocarbons. Herein we use linear algebra and the Landauer-Büttiker theory for electron transport to derive a general rule for predicting the existence and locations of interference features. Our analysis illustrates that interferences can be directly determined from the molecular Hamiltonian and the molecule-electrode couplings, and we demonstrate its utility with several examples. PMID:25399124
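
For orientation, the Landauer-Büttiker quantities that the abstract refers to are usually written as below. This is the standard textbook form, not the rule derived in the paper; the symbols are assumptions introduced here: $H$ is the molecular Hamiltonian, $\Sigma_{L/R}$ are the self-energies describing the molecule-electrode couplings, and $T(E)$ is the transmission function.

```latex
\begin{align}
  G(E) &= \bigl[E\,I - H - \Sigma_L(E) - \Sigma_R(E)\bigr]^{-1}, \\
  \Gamma_{L/R}(E) &= i\bigl[\Sigma_{L/R}(E) - \Sigma_{L/R}^{\dagger}(E)\bigr], \\
  T(E) &= \operatorname{Tr}\bigl[\Gamma_L(E)\, G(E)\, \Gamma_R(E)\, G^{\dagger}(E)\bigr].
\end{align}
```

In this notation a destructive interference feature is an energy $E$ at which $T(E) = 0$, which is why it can be located from $H$ and the couplings alone.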

  6. Predicting discovery rates of genomic features.

    PubMed

    Gravel, Simon

    2014-06-01

    Successful sequencing experiments require judicious sample selection. However, this selection must often be performed on the basis of limited preliminary data. Predicting the statistical properties of the final sample based on preliminary data can be challenging, because numerous uncertain model assumptions may be involved. Here, we ask whether we can predict "omics" variation across many samples by sequencing only a fraction of them. In the infinite-genome limit, we find that a pilot study sequencing 5% of a population is sufficient to predict the number of genetic variants in the entire population within 6% of the correct value, using an estimator agnostic to demography, selection, or population structure. To reach similar accuracy in a finite genome with millions of polymorphisms, the pilot study would require ∼15% of the population. We present computationally efficient jackknife and linear programming methods that exhibit substantially less bias than the state of the art when applied to simulated data and subsampled 1000 Genomes Project data. Extrapolating based on the National Heart, Lung, and Blood Institute Exome Sequencing Project data, we predict that 7.2% of sites in the capture region would be variable in a sample of 50,000 African Americans and 8.8% in a European sample of equal size. Finally, we show how the linear programming method can also predict discovery rates of various genomic features, such as the number of transcription factor binding sites across different cell types. PMID:24637199
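
As a rough illustration of the jackknife idea, the sketch below implements the classic first-order jackknife richness estimator. It is a swapped-in textbook estimator, not the jackknife or linear programming method of the paper; the function name `jackknife1_total` and the toy variant IDs are hypothetical.

```python
from collections import Counter


def jackknife1_total(samples):
    """First-order jackknife estimate of the total number of distinct variants.

    samples: list of sets, one set of variant IDs per sequenced individual.
    Classic form: S_obs + f1 * (n - 1) / n, where f1 counts the variants
    seen in exactly one individual. This is NOT the paper's estimator.
    """
    n = len(samples)
    counts = Counter(variant for sample in samples for variant in sample)
    observed = len(counts)                                   # S_obs
    singletons = sum(1 for c in counts.values() if c == 1)   # f1
    return observed + singletons * (n - 1) / n


# Hypothetical pilot data: four individuals, variants labelled a-d.
pilot = [{"a", "b"}, {"a", "c"}, {"a", "d"}, {"a", "b"}]
estimate = jackknife1_total(pilot)
```

The estimate exceeds the observed count whenever singletons are present, which reflects the intuition that rare variants in a pilot sample signal more undiscovered variants in the full population.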

  7. Feature Selection for Wheat Yield Prediction

    NASA Astrophysics Data System (ADS)

    Ruß, Georg; Kruse, Rudolf

    Carrying out effective and sustainable agriculture has become an important issue in recent years. Agricultural production has to keep up with an ever-increasing population by taking advantage of a field’s heterogeneity. Nowadays, modern technology such as the global positioning system (GPS) and a multitude of developed sensors enable farmers to better measure their fields’ heterogeneities. For this small-scale, precise treatment the term precision agriculture has been coined. However, the large amounts of data that are (literally) harvested during the growing season have to be analysed. In particular, the farmer is interested in knowing whether a newly developed heterogeneity sensor is potentially advantageous or not. Since the sensor data are readily available, this issue should be seen from an artificial intelligence perspective. There it can be treated as a feature selection problem. The additional task of yield prediction can be treated as a multi-dimensional regression problem. This article aims to present an approach towards solving these two practically important problems using artificial intelligence and data mining ideas and methodologies.

  8. Molecular Dynamics Simulations Of Nanometer-Scale Feature Etch

    SciTech Connect

    Vegh, J. J.; Graves, D. B.

    2008-09-23

    Molecular dynamics (MD) simulations have been carried out to examine fundamental etch limitations. Beams of Ar⁺, Ar⁺/F and CFₓ⁺ (x = 2, 3) with 2 nm diameter cylindrical confinement were utilized to mimic 'perfect' masks for small feature etching in silicon. The holes formed during etch exhibit sidewall damage and passivation as a result of ion-induced mixing. The MD results predict a minimum hole diameter of ∼5 nm after post-etch cleaning of the sidewall.

  9. Feature Selection for Neural Network Based Stock Prediction

    NASA Astrophysics Data System (ADS)

    Sugunnasil, Prompong; Somhom, Samerkae

    We propose a new methodology of feature selection for stock movement prediction. The methodology is based upon finding those features which minimize the correlation relation function. We first produce all combinations of features and evaluate each of them using our evaluation function. We then search through the generated set with a hill-climbing approach. A self-organizing map based stock prediction model is utilized as the prediction method. We conduct the experiment on data sets of the Microsoft Corporation, General Electric Co. and Ford Motor Co. The results show that our feature selection method can improve the efficiency of the neural network based stock prediction.
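
The hill-climbing search described above can be sketched roughly as follows. The abstract does not define the "correlation relation function", so the score used here (mean absolute Pearson correlation between the selected features) is a hypothetical stand-in, as are the names `redundancy` and `hill_climb` and the toy data.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def redundancy(columns, subset):
    """Hypothetical score: mean |correlation| over all feature pairs in subset."""
    return mean(abs(pearson(columns[i], columns[j]))
                for i, j in combinations(sorted(subset), 2))


def hill_climb(columns, start, min_size=2):
    """Flip one feature in or out per step; stop at the first local optimum."""
    current = frozenset(start)
    current_score = redundancy(columns, current)
    while True:
        # Neighbours differ from the current subset by exactly one feature.
        neighbours = [current ^ {i} for i in range(len(columns))]
        candidates = [(redundancy(columns, s), sorted(s))
                      for s in neighbours if len(s) >= min_size]
        if not candidates:
            return current, current_score
        best_score, best_subset = min(candidates)
        if best_score >= current_score:
            return current, current_score  # no neighbour improves the score
        current, current_score = frozenset(best_subset), best_score


# Toy data: feature 1 is an exact multiple of feature 0; feature 2 is
# uncorrelated with both.
columns = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [1, -1, 0, -1, 1]]
selected, score = hill_climb(columns, start={0, 1, 2})
```

Hill climbing only guarantees a local optimum, so the result can depend on the starting subset; restarting from several subsets is a common mitigation.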

  10. Predicting the molecular complexity of sequencing libraries.

    PubMed

    Daley, Timothy; Smith, Andrew D

    2013-04-01

    Predicting the molecular complexity of a genomic sequencing library is a critical but difficult problem in modern sequencing applications. Methods to determine how deeply to sequence to achieve complete coverage or to predict the benefits of additional sequencing are lacking. We introduce an empirical Bayesian method to accurately characterize the molecular complexity of a DNA sample for almost any sequencing application on the basis of limited preliminary sequencing. PMID:23435259

  11. Molecular Markers Predictive of Chemotherapy Response in Colorectal Cancer

    PubMed Central

    Shiovitz, Stacey; Grady, William M.

    2015-01-01

    Recognition of the molecular heterogeneity of colorectal cancer (CRC) has led to the classification of CRC based on a variety of clinical and molecular characteristics. Although the clinical significance of the majority of these molecular alterations is still being ascertained, it is widely anticipated that these characteristics will improve the accuracy of our ability to determine the prognosis and therapeutic response of CRC patients. A few of these markers, such as microsatellite instability and the CpG island methylator phenotype (CIMP), show promise as predictive markers for cytotoxic chemotherapy. KRAS is a validated biomarker for EGFR-targeted therapy, while NRAS and PIK3CA are evolving markers for targeted therapies. Multiple new actionable drug targets are being identified on a regular basis, but most are not ready for clinical use at this time. This review focuses on key molecular features of CRCs and the application of these molecular alterations as predictive biomarkers for CRC. PMID:25663616

  12. Learning through Feature Prediction: An Initial Investigation into Teaching Categories to Children with Autism through Predicting Missing Features

    ERIC Educational Resources Information Center

    Sweller, Naomi

    2015-01-01

    Individuals with autism have difficulty generalising information from one situation to another, a process that requires the learning of categories and concepts. Category information may be learned through: (1) classifying items into categories, or (2) predicting missing features of category items. Predicting missing features has to this point been…

  13. Molecular features of cellular reprogramming and development.

    PubMed

    Smith, Zachary D; Sindhu, Camille; Meissner, Alexander

    2016-03-01

    Differentiating somatic cells are progressively restricted to specialized functions during ontogeny, but they can be experimentally directed to form other cell types, including those with complete embryonic potential. Early nuclear reprogramming methods, such as somatic cell nuclear transfer (SCNT) and cell fusion, posed significant technical hurdles to precise dissection of the regulatory programmes governing cell identity. However, the discovery of reprogramming by ectopic expression of a defined set of transcription factors, known as direct reprogramming, provided a tractable platform to uncover molecular characteristics of cellular specification and differentiation, cell type stability and pluripotency. We discuss the control and maintenance of cellular identity during developmental transitions as they have been studied using direct reprogramming, with an emphasis on transcriptional and epigenetic regulation. PMID:26883001

  14. Outer packet sets and feature prediction of computer virus

    NASA Astrophysics Data System (ADS)

    Zhang, Ling

    2014-10-01

    The packet sets model was proposed by Prof. Shi in 2008. A packet set is a set pair composed of an internal and an outer packet set, and it has dynamic characteristics. Using packet sets theory, this paper gives the feature prediction of computer viruses based on outer packet sets. The concept of virus screening-filtering is given; furthermore, the virus screening-filtering order theorem, the composite virus screening-filtering theorem and the virus screening-filtering rule are presented. A prediction method for computer virus features is given based on these results. The outer packet set is a new tool in research on the prediction of dynamic virus features.

  15. Predicting Clinical Outcomes Using Molecular Biomarkers

    PubMed Central

    Burke, Harry B.

    2016-01-01

    Over the past 20 years, there has been an exponential increase in the number of biomarkers. At the last count, there were 768,259 papers indexed in PubMed.gov directly related to biomarkers. Although many of these papers claim to report clinically useful molecular biomarkers, embarrassingly few are currently in clinical use. It is suggested that a failure to properly understand, clinically assess, and utilize molecular biomarkers has prevented their widespread adoption in treatment, in comparative benefit analyses, and their integration into individualized patient outcome predictions for clinical decision-making and therapy. A straightforward, general approach to understanding how to predict clinical outcomes using risk, diagnostic, and prognostic molecular biomarkers is presented. In the future, molecular biomarkers will drive advances in risk, diagnosis, and prognosis, they will be the targets of powerful molecular therapies, and they will individualize and optimize therapy. Furthermore, clinical predictions based on molecular biomarkers will be displayed on the clinician’s screen during the physician–patient interaction, they will be an integral part of physician–patient-shared decision-making, and they will improve clinical care and patient outcomes. PMID:27279751

  16. Predicting beef tenderness using color and multispectral image texture features.

    PubMed

    Sun, X; Chen, K J; Maddock-Carlin, K R; Anderson, V L; Lepper, A N; Schwartz, C A; Keller, W L; Ilse, B R; Magolski, J D; Berg, E P

    2012-12-01

    The objective of this study was to investigate the usefulness of raw meat surface characteristics (texture) in predicting cooked beef tenderness. Color and multispectral texture features, including 4 different wavelengths and 217 image texture features, were extracted from 2 laboratory-based multispectral camera imaging systems. Steaks were segregated into tough and tender classification groups based on Warner-Bratzler shear force. The texture features were submitted to STEPWISE multiple regression and support vector machine (SVM) analyses to establish prediction models for beef tenderness. A subsample (80%) of tender or tough classified steaks was used to train models, which were then validated on the remaining (20%) test steaks. For color images, the SVM model correctly identified tender steaks with 100% accuracy, while the STEPWISE equation identified 94.9% of the tender steaks correctly. For multispectral images, the SVM model achieved 91% and STEPWISE 87% average accuracy in predicting beef tenderness. PMID:22647652

  17. Feature selection for splice site prediction: A new method using EDA-based feature ranking

    PubMed Central

    Saeys, Yvan; Degroeve, Sven; Aeyels, Dirk; Rouzé, Pierre; Van de Peer, Yves

    2004-01-01

    Background: The identification of relevant biological features in large and complex datasets is an important step towards gaining insight in the processes underlying the data. Other advantages of feature selection include the ability of the classification system to attain good or even better solutions using a restricted subset of features, and a faster classification. Thus, robust methods for fast feature selection are of key importance in extracting knowledge from complex biological data. Results: In this paper we present a novel method for feature subset selection applied to splice site prediction, based on estimation of distribution algorithms, a more general framework of genetic algorithms. From the estimated distribution of the algorithm, a feature ranking is derived. Afterwards this ranking is used to iteratively discard features. We apply this technique to the problem of splice site prediction, and show how it can be used to gain insight into the underlying biological process of splicing. Conclusion: We show that this technique proves to be more robust than the traditional use of estimation of distribution algorithms for feature selection: instead of returning a single best subset of features (as they normally do) this method provides a dynamical view of the feature selection process, like the traditional sequential wrapper methods. However, the method is faster than the traditional techniques, and scales better to datasets described by a large number of features. PMID:15154966
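
One simple way to derive a feature ranking from an estimation of distribution algorithm is sketched below, using a univariate marginal model (one inclusion probability per feature). This is an illustrative assumption, not necessarily the variant used in the paper; the function `eda_feature_ranking`, its parameters, and the toy fitness function are all hypothetical.

```python
import random


def eda_feature_ranking(fitness, n_features, pop_size=60, n_select=20,
                        generations=25, seed=0):
    """Rank features by their inclusion probability in a univariate EDA.

    fitness: callable taking a list of booleans (feature mask) and returning
    a number, where higher is better. In practice this would be a classifier
    score; here it is supplied by the caller.
    """
    rng = random.Random(seed)
    probs = [0.5] * n_features  # start with no preference for any feature
    for _ in range(generations):
        # Sample a population of feature masks from the current distribution.
        population = [[rng.random() < p for p in probs] for _ in range(pop_size)]
        # Keep the fittest masks and re-estimate the distribution from them.
        population.sort(key=fitness, reverse=True)
        elite = population[:n_select]
        probs = [min(0.95, max(0.05, sum(mask[i] for mask in elite) / n_select))
                 for i in range(n_features)]
    # The estimated distribution itself yields the ranking.
    ranking = sorted(range(n_features), key=lambda i: probs[i], reverse=True)
    return ranking, probs


# Toy fitness: features 0 and 3 help, every other feature costs a little.
INFORMATIVE = {0, 3}


def toy_fitness(mask):
    return sum((1.0 if i in INFORMATIVE else -0.5)
               for i, used in enumerate(mask) if used)


ranking, probs = eda_feature_ranking(toy_fitness, n_features=6)
```

Reading the ranking off the probability vector, rather than returning only the single best mask, is what allows features to be discarded iteratively from the bottom of the list.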

  18. Molecular predictive and prognostic factors in ependymoma.

    PubMed

    Benson, Rony; Mallick, Supriya; Julka, Pramod K; Rath, Goura K

    2016-01-01

    An ependymoma is an uncommon glial tumor, which arises from different parts of the neuroaxis. Considerable variation in presentation and survival in tumors in different locations after an optimum treatment indicates inherent molecular and genetic differences in tumorigenesis between them. A number of genetic aberrations have been identified to distinctly characterize different subgroups of ependymomas that include a posterior fossa tumor, a supratentorial tumor, and a pediatric tumor. These different groups have substantial genetic alterations, and also distinct demography, clinical characteristics, and prognosis. This article is intended to review the diverse molecular and genetic aberrations that may be helpful in prognostication and prediction of survival in patients suffering from an ependymoma. PMID:26954807

  19. Tumors of the Testis: Morphologic Features and Molecular Alterations.

    PubMed

    Howitt, Brooke E; Berney, Daniel M

    2015-12-01

    This article reviews the most frequently encountered tumor of the testis: pure and mixed malignant testicular germ cell tumors (TGCT), with emphasis on adult (postpubertal) TGCTs and their differential diagnoses. We additionally review TGCT in the postchemotherapy setting, and findings to be integrated into the surgical pathology report, including staging of testicular tumors and other problematic issues. The clinical features, gross pathologic findings, key histologic features, common differential diagnoses, the use of immunohistochemistry, and molecular alterations in TGCTs are discussed. PMID:26612222

  20. NSCLC tumor shrinkage prediction using quantitative image features.

    PubMed

    Hunter, Luke A; Chen, Yi Pei; Zhang, Lifei; Matney, Jason E; Choi, Haesun; Kry, Stephen F; Martel, Mary K; Stingo, Francesco; Liao, Zhongxing; Gomez, Daniel; Yang, Jinzhong; Court, Laurence E

    2016-04-01

    The objective of this study was to develop a quantitative image feature model to predict non-small cell lung cancer (NSCLC) volume shrinkage from pre-treatment CT images. 64 stage II-IIIB NSCLC patients with similar treatments were all imaged using the same CT scanner and protocol. For each patient, the planning gross tumor volume (GTV) was deformed onto the week 6 treatment image, and tumor shrinkage was quantified as the deformed GTV volume divided by the planning GTV volume. Geometric, intensity histogram, absolute gradient image, co-occurrence matrix, and run-length matrix image features were extracted from each planning GTV. Prediction models were generated using principal component regression with simulated annealing subset selection. Performance was quantified using the mean squared error (MSE) between the predicted and observed tumor shrinkages. Permutation tests were used to validate the results. The optimal prediction model gave a strong correlation between the observed and predicted tumor shrinkages with r = 0.81 and MSE = 8.60×10⁻³. Compared to predictions based on the mean population shrinkage, this resulted in a 2.92-fold reduction in MSE. In conclusion, this study indicated that quantitative image features extracted from existing pre-treatment CT images can successfully predict tumor shrinkage and provide additional information for clinical decisions regarding patient risk stratification, treatment, and prognosis. PMID:26878137
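
The evaluation quantities named in the abstract (shrinkage ratio, MSE, and fold reduction relative to a mean-shrinkage baseline) can be computed as in the sketch below. The function names and the numbers are made up for illustration, and the paper may have computed its baseline differently (for example within a cross-validation loop).

```python
from statistics import mean


def shrinkage(deformed_gtv_volume, planning_gtv_volume):
    """Tumor shrinkage as defined in the abstract: deformed / planning volume."""
    return deformed_gtv_volume / planning_gtv_volume


def mse(predicted, observed):
    """Mean squared error between predicted and observed shrinkages."""
    return mean((p - o) ** 2 for p, o in zip(predicted, observed))


def fold_reduction(predicted, observed):
    """Baseline MSE (always predict the population mean) over the model MSE."""
    baseline = [mean(observed)] * len(observed)
    return mse(baseline, observed) / mse(predicted, observed)


# Hypothetical shrinkage ratios for three patients.
observed = [0.5, 0.7, 0.9]
predicted = [0.6, 0.7, 0.8]
improvement = fold_reduction(predicted, observed)
```

A fold reduction above 1 means the model beats the mean-shrinkage baseline; the abstract reports 2.92 for the optimal model.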

  1. A protein structural class prediction method based on novel features.

    PubMed

    Zhang, Lichao; Zhao, Xiqiang; Kong, Liang

    2013-09-01

    In this study, a 12-dimensional feature vector is constructed to reflect the general contents and spatial arrangements of the secondary structural elements of a given protein sequence. Among the 12 features, 6 novel features are specially designed to improve the prediction accuracies for α/β and α + β classes based on the distributions of α-helices and β-strands and the characteristics of parallel β-sheets and anti-parallel β-sheets. To evaluate our method, the jackknife cross-validation test is employed on two widely-used datasets, 25PDB and 1189, with sequence similarity lower than 40% and 25%, respectively. Our method outperforms the recently reported methods in most cases, and the 6 newly-designed features have a significant positive effect on the prediction accuracies, especially for α/β and α + β classes. PMID:23770446
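
The jackknife test mentioned above is commonly understood as leave-one-out cross-validation: each protein is held out once and predicted from all the others. A minimal sketch follows; the abstract does not name the classifier, so the nearest-neighbour rule here is a placeholder assumption, and the two-dimensional toy vectors stand in for the 12-dimensional features.

```python
def nearest_neighbour_predict(train, query):
    """Placeholder classifier: label of the closest training vector."""
    closest = min(train,
                  key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return closest[1]


def jackknife_accuracy(samples, predict=nearest_neighbour_predict):
    """Leave-one-out test: hold out each sample once, train on the rest."""
    hits = 0
    for i, (features, label) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        hits += predict(training, features) == label
    return hits / len(samples)


# Hypothetical toy dataset: (feature vector, structural class).
toy = [
    ((0.0, 0.1), "alpha"),
    ((0.1, 0.0), "alpha"),
    ((1.0, 0.9), "beta"),
    ((0.9, 1.0), "beta"),
]
accuracy = jackknife_accuracy(toy)
```

Because every sample is tested exactly once and no random split is involved, the jackknife result is deterministic for a given dataset, which is one reason it is popular for comparing methods on fixed benchmarks.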

  2. Stabilizing l1-norm prediction models by supervised feature grouping.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2016-02-01

    Emerging Electronic Medical Records (EMRs) have reformed modern healthcare. These records have great potential to be used for building clinical prediction models. However, a problem in using them is their high dimensionality. Since a lot of information may not be relevant for prediction, the underlying complexity of the prediction models may not be high. A popular way to deal with this problem is to employ feature selection. Lasso and l1-norm based feature selection methods have shown promising results. But in the presence of correlated features, these methods select features that change considerably with small changes in data. This prevents clinicians from obtaining a stable feature set, which is crucial for clinical decision making. Grouping correlated variables together can improve the stability of feature selection; however, such grouping is usually not known and needs to be estimated for optimal performance. Addressing this problem, we propose a new model that can simultaneously learn the grouping of correlated features and perform stable feature selection. We formulate the model as a constrained optimization problem and provide an efficient solution with guaranteed convergence. Our experiments with both synthetic and real-world datasets show that the proposed model is significantly more stable than Lasso and many existing state-of-the-art shrinkage and classification methods. We further show that in terms of prediction performance, the proposed method consistently outperforms Lasso and other baselines. Our model can be used for selecting stable risk factors for a variety of healthcare problems, so it can assist clinicians toward accurate decision making. PMID:26689771
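
For reference, the baseline Lasso estimator that the abstract compares against is usually written as below. This is the standard formulation, not the grouped model proposed in the paper; the symbols are assumptions introduced here: $X$ is the $n \times p$ feature matrix, $y$ the response vector, and $\lambda \ge 0$ the regularization strength.

```latex
\[
  \hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}}
  \; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^{2} \;+\; \lambda\,\lVert \beta \rVert_1
\]
```

The $\ell_1$ penalty drives some coefficients exactly to zero, which performs feature selection; when two columns of $X$ are highly correlated, which of them keeps a nonzero coefficient can change with small perturbations of the data, which is the instability the abstract describes.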

  3. BDDCS Class Prediction for New Molecular Entities

    PubMed Central

    Broccatelli, Fabio; Cruciani, Gabriele; Benet, Leslie Z.; Oprea, Tudor I.

    2012-01-01

    The Biopharmaceutics Drug Disposition Classification System (BDDCS) was successfully employed for predicting drug-drug interactions (DDIs) with respect to drug metabolizing enzymes (DMEs), drug transporters and their interplay. The major assumption of BDDCS is that the extent of metabolism (EoM) predicts high versus low intestinal permeability rate, and vice versa, at least when uptake transporters or paracellular transport are not involved. We recently published a collection of over 900 marketed drugs classified for BDDCS. We suggest that a reliable model for predicting BDDCS class, integrated with in vitro assays, could anticipate disposition and potential DDIs of new molecular entities (NMEs). Here we describe a computational procedure for predicting BDDCS class from molecular structures. The model was trained on a set of 300 oral drugs, and validated on an external set of 379 oral drugs, using 17 descriptors calculated or derived from the VolSurf+ software. For each molecule, a probability of BDDCS class membership was given, based on predicted EoM, FDA solubility (FDAS) and their confidence scores. The accuracy in predicting FDAS was 78% in training and 77% in validation, while for EoM prediction the accuracy was 82% in training and 79% in external validation. The actual BDDCS class corresponded to the highest ranked calculated class for 55% of the validation molecules, and it was within the top two ranked more than 92% of the time. The unbalanced stratification of the dataset didn’t affect the prediction, which showed highest accuracy in predicting classes 2 and 3 with respect to the most populated class 1. For class 4 drugs a general lack of predictability was observed. A linear discriminant analysis (LDA) confirmed that the degree of accuracy for the prediction of the different BDDCS classes is tied to the structure of the dataset. This model could routinely be used in early drug discovery to prioritize in vitro tests for NMEs (e.g., affinity to transporters

  4. Classification performance prediction using parametric scattering feature models

    NASA Astrophysics Data System (ADS)

    Chiang, Hung-Chih; Moses, Randolph L.; Potter, Lee C.

    2000-08-01

    We consider a method for estimating classification performance of a model-based synthetic aperture radar (SAR) automatic target recognition system. Target classification is performed by comparing an unordered feature set extracted from a measured SAR image chip with an unordered feature set predicted from a hypothesized target class and pose. A Bayes likelihood metric that incorporates uncertainty in both the predicted and extracted feature vectors is used to compute the match score. Evaluation of the match likelihoods requires a correspondence between the unordered predicted and extracted feature sets. This is a bipartite graph matching problem with insertions and deletions; we show that the optimal match can be found in polynomial time. We extend the results in [1] to estimate classification performance for a ten-class SAR ATR problem. We consider a synthetic classification problem to validate the classifier and to address resolution and robustness questions in the likelihood scoring method. Specifically, we consider performance versus SAR resolution, performance degradation due to mismatch between the assumed and actual feature statistics, and performance impact of correlated feature attributes.

  5. How to Predict Molecular Interactions between Species?

    PubMed Central

    Schulze, Sylvie; Schleicher, Jana; Guthke, Reinhard; Linde, Jörg

    2016-01-01

    Organisms constantly interact with other species through physical contact which leads to changes on the molecular level, for example the transcriptome. These changes can be monitored for all genes, with the help of high-throughput experiments such as RNA-seq or microarrays. The adaptation of the gene expression to environmental changes within cells is mediated through complex gene regulatory networks. Often, our knowledge of these networks is incomplete. Network inference predicts gene regulatory interactions based on transcriptome data. An emerging application of high-throughput transcriptome studies are dual transcriptomics experiments. Here, the transcriptome of two or more interacting species is measured simultaneously. Based on a dual RNA-seq data set of murine dendritic cells infected with the fungal pathogen Candida albicans, the software tool NetGenerator was applied to predict an inter-species gene regulatory network. To promote further investigations of molecular inter-species interactions, we recently discussed dual RNA-seq experiments for host-pathogen interactions and extended the applied tool NetGenerator (Schulze et al., 2015). The updated version of NetGenerator makes use of measurement variances in the algorithmic procedure and accepts gene expression time series data with missing values. Additionally, we tested multiple modeling scenarios regarding the stimuli functions of the gene regulatory network. Here, we summarize the work by Schulze et al. (2015) and put it into a broader context. We review various studies making use of the dual transcriptomics approach to investigate the molecular basis of interacting species. Besides the application to host-pathogen interactions, dual transcriptomics data are also utilized to study mutualistic and commensalistic interactions. Furthermore, we give a short introduction into additional approaches for the prediction of gene regulatory networks and discuss their application to dual transcriptomics data. We

  6. Improving Protein Expression Prediction Using Extra Features and Ensemble Averaging

    PubMed Central

    Fernandes, Armando; Vinga, Susana

    2016-01-01

    This article focuses on improving machine learning models that predict protein expression levels from codon encoding. Support vector regression (SVR) and partial least squares (PLS) were used to create the models. SVR yields predictions that surpass those of PLS. It is shown that the models' predictive ability can be improved by using two more input features, codon identification number and codon count, in addition to the previously used codon bias and minimum free energy. Applying ensemble averaging to the SVR or PLS models improves the results further. The present work motivates testing different ensembles and features with the aim of improving the prediction models, whose correlation coefficients are still far from perfect. These results are relevant for the optimization of codon usage and the enhancement of protein expression levels in synthetic biology problems. PMID:26934190
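
    The idea of ensemble averaging mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: a one-feature least-squares line stands in for SVR or PLS, the ensemble members are trained on bootstrap resamples, and the toy values for codon bias and expression level are invented for illustration only.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x for a single input feature."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        # Degenerate resample (all x identical): fall back to a constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def train_ensemble(xs, ys, n_models=25, seed=0):
    """Train n_models base learners, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def predict_ensemble(models, x):
    """Ensemble averaging: the prediction is the mean of the members' predictions."""
    return mean(a + b * x for a, b in models)


# Hypothetical toy data (not from the paper).
codon_bias = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
expression = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

models = train_ensemble(codon_bias, expression)
print(round(predict_ensemble(models, 0.35), 2))
```

    Averaging reduces the variance contributed by any single resample, which is one plausible reason an ensemble can outperform its individual members.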

  7. Clinical and molecular features of young-onset colorectal cancer

    PubMed Central

    Ballester, Veroushka; Rashtak, Shahrooz; Boardman, Lisa

    2016-01-01

    Colorectal cancer (CRC) is one of the leading causes of cancer-related mortality worldwide. Although young-onset CRC raises the possibility of a hereditary component, hereditary CRC syndromes explain only a minority of young-onset CRC cases. There is evidence to suggest that young-onset CRC has a different molecular profile than late-onset CRC. While the pathogenesis of young-onset CRC is well characterized in individuals with an inherited CRC syndrome, knowledge regarding the molecular features of sporadic young-onset CRC is limited. Understanding the molecular mechanisms of young-onset CRC can help us tailor specific screening and management strategies. While the incidence of late-onset CRC has been decreasing, a trend mainly attributed to an increase in CRC screening, the incidence of young-onset CRC is increasing. Differences in the molecular biology of these tumors and low suspicion of CRC in young symptomatic individuals may be possible explanations. Currently there is no evidence that screening of average-risk individuals less than 50 years of age will translate into early detection or increased survival. However, increasing understanding of the underlying molecular mechanisms of young-onset CRC could help us tailor specific screening and management strategies. The purpose of this review is to evaluate the current knowledge about young-onset CRC, its clinicopathologic features, and the newly recognized molecular alterations involved in tumor progression. PMID:26855533

  8. Sequence-based feature prediction and annotation of proteins

    PubMed Central

    Juncker, Agnieszka S; Jensen, Lars J; Pierleoni, Andrea; Bernsel, Andreas; Tress, Michael L; Bork, Peer; von Heijne, Gunnar; Valencia, Alfonso; Ouzounis, Christos A; Casadio, Rita; Brunak, Søren

    2009-01-01

    A recent trend in computational methods for annotation of protein function is that many prediction tools are combined in complex workflows and pipelines to facilitate the analysis of feature combinations, for example, the entire repertoire of kinase-binding motifs in the human proteome. PMID:19226438

  9. Prediction of acoustic feature parameters using myoelectric signals.

    PubMed

    Lee, Ki-Seung

    2010-07-01

    It is well-known that a clear relationship exists between human voices and myoelectric signals (MESs) from the area of the speaker's mouth. In this study, we utilized this information to implement a speech synthesis scheme in which MES alone was used to predict the parameters characterizing the vocal-tract transfer function of specific speech signals. Several feature parameters derived from MES were investigated to find the optimal feature for maximization of the mutual information between the acoustic and the MES features. After the optimal feature was determined, an estimation rule for the acoustic parameters was proposed, based on a minimum mean square error (MMSE) criterion. In a preliminary study, 60 isolated words were used for both objective and subjective evaluations. The results showed that the average Euclidean distance between the original and predicted acoustic parameters was reduced by about 30% compared with the average Euclidean distance of the original parameters. The intelligibility of the synthesized speech signals using the predicted features was also evaluated. A word-level identification ratio of 65.5% and a syllable-level identification ratio of 73% were obtained through a listening test. PMID:20172775

  10. Molecular classification and prediction in gastric cancer

    PubMed Central

    Lin, Xiandong; Zhao, Yongzhong; Song, Won-min; Zhang, Bin

    2015-01-01

    Gastric cancer, a highly heterogeneous disease, is the second leading cause of cancer death and the fourth most common cancer globally, with East Asia accounting for more than half of cases annually. Alongside TNM staging, clinical practice in gastric cancer has two well-recognized classification systems: the Lauren classification, which subdivides gastric adenocarcinoma into intestinal and diffuse types, and the alternative World Health Organization system, which divides gastric cancer into papillary, tubular, mucinous (colloid), and poorly cohesive carcinomas. Both classification systems enable a better understanding of the histogenesis and the biology of gastric cancer, yet they have limited clinical utility in guiding patient therapy because of the molecular heterogeneity of gastric cancer. Unprecedented whole-genome-scale data have been catalyzing and advancing the molecular subtyping approach. Here we cataloged and compared the published gene expression profiling signatures in gastric cancer. We summarized recent integrated genomic characterization of gastric cancer based on additional data on somatic mutation, chromosomal instability, EBV infection, and DNA methylation. We identified the consensus patterns across these signatures and identified the underlying molecular pathways and biological functions. The identification of molecular subtypes of gastric adenocarcinoma and the development of integrated genomics approaches for clinical applications, such as prediction of clinical intervention, emerge as an essential phase toward personalized medicine in treating gastric cancer. PMID:26380657

  11. Volumetric feature extraction and visualization of tomographic molecular imaging.

    PubMed

    Bajaj, Chandrajit; Yu, Zeyun; Auer, Manfred

    2003-01-01

    Electron tomography is useful for studying large macromolecular complexes within their cellular context. The associated problems include crowding and complexity. Data exploration and 3D visualization of complexes require rendering of tomograms as well as extraction of all features of interest. We present algorithms for fully automatic boundary segmentation and skeletonization, and demonstrate their applications in feature extraction and visualization of cell and molecular tomographic imaging. We also introduce an interactive volumetric exploration and visualization tool (Volume Rover), which encapsulates implementations of the above volumetric image processing algorithms and additionally uses efficient multi-resolution interactive geometry and volume rendering techniques for interactive visualization. PMID:14643216

  12. Nonstationary time series prediction combined with slow feature analysis

    NASA Astrophysics Data System (ADS)

    Wang, G.; Chen, X.

    2015-07-01

    Almost all climate time series have some degree of nonstationarity due to external driving forces perturbing the observed system. Therefore, these external driving forces should be taken into account when constructing the climate dynamics. This paper presents a new technique of obtaining the driving forces of a time series from the slow feature analysis (SFA) approach, and then introduces them into a predictive model to predict nonstationary time series. The basic theory of the technique is to consider the driving forces as state variables and to incorporate them into the predictive model. Experiments using a modified logistic time series and winter ozone data in Arosa, Switzerland, were conducted to test the model. The results showed improved prediction skills.

  13. Exploiting Information Diffusion Feature for Link Prediction in Sina Weibo

    PubMed Central

    Li, Dong; Zhang, Yongchao; Xu, Zhiming; Chu, Dianhui; Li, Sheng

    2016-01-01

    The rapid development of online social networks (e.g., Twitter and Facebook) has promoted research related to social networks, in which link prediction is a key problem. Although numerous attempts have been made at link prediction based on network structure, node attributes and so on, few current studies have considered the impact of information diffusion on link creation and prediction. This paper mainly addresses Sina Weibo, which is the largest microblog platform with Chinese characteristics, proposes the hypothesis that information diffusion influences link creation, and verifies the hypothesis through analysis of real data. We also detect an important feature from the information diffusion process, which is used to improve link prediction performance. Finally, the experimental results on a Sina Weibo dataset demonstrate the effectiveness of our methods. PMID:26817436

  16. Common features of microRNA target prediction tools.

    PubMed

    Peterson, Sarah M; Thompson, Jeffrey A; Ufkin, Melanie L; Sathyanarayana, Pradeep; Liaw, Lucy; Congdon, Clare Bates

    2014-01-01

    The human genome encodes over 1800 microRNAs (miRNAs), which are short non-coding RNA molecules that function to regulate gene expression post-transcriptionally. Due to the potential for one miRNA to target multiple gene transcripts, miRNAs are recognized as a major mechanism to regulate gene expression and mRNA translation. Computational prediction of miRNA targets is a critical initial step in identifying miRNA:mRNA target interactions for experimental validation. The available tools for miRNA target prediction encompass a range of different computational approaches, from the modeling of physical interactions to the incorporation of machine learning. This review provides an overview of the major computational approaches to miRNA target prediction. Our discussion highlights three tools for their ease of use, reliance on relatively updated versions of miRBase, and range of capabilities: DIANA-microT-CDS, miRanda-mirSVR, and TargetScan. In comparison across all miRNA target prediction tools, four main aspects of the miRNA:mRNA target interaction emerge as common features on which most target prediction is based: seed match, conservation, free energy, and site accessibility. This review explains these features and identifies how they are incorporated into currently available target prediction tools. MiRNA target prediction is a dynamic field with increasing attention on development of new analysis tools. This review attempts to provide a comprehensive assessment of these tools in a manner that is accessible across disciplines. Understanding the basis of these prediction methodologies will aid in user selection of the appropriate tools and interpretation of the tool output. PMID:24600468
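
    Of the four features named in this abstract, the seed match is the simplest to illustrate. The sketch below is not taken from any of the reviewed tools. It assumes the common convention that the seed spans miRNA nucleotides 2 to 8 (counted from the 5' end) and looks for perfectly complementary sites in a 3' UTR. The UTR sequence is made up, and conservation, free energy, and site accessibility are ignored.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Return the mRNA site (5'->3') complementary to miRNA positions start..end.

    Positions are 1-based and counted from the miRNA 5' end; 2..8 is an
    assumed seed definition, and other definitions (e.g. 2..7) also exist.
    """
    seed = mirna[start - 1:end]
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start offsets of every perfect seed site in the UTR."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)  # allow overlapping sites
    return hits


let7a = "UGAGGUAGUAGGUUGUAUAGUU"        # mature let-7a sequence
toy_utr = "AAGCUACCUCAGGGACUACCUCUU"    # hypothetical 3' UTR fragment

print(seed_site(let7a))                  # CUACCUC
print(find_seed_matches(let7a, toy_utr)) # [3, 15]
```

    A real predictor would score each candidate site further, for example by weighting conservation across species or the predicted duplex free energy.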

  17. Automated Analysis and Classification of Histological Tissue Features by Multi-Dimensional Microscopic Molecular Profiling

    PubMed Central

    Riordan, Daniel P.; Varma, Sushama; West, Robert B.; Brown, Patrick O.

    2015-01-01

    Characterization of the molecular attributes and spatial arrangements of cells and features within complex human tissues provides a critical basis for understanding processes involved in development and disease. Moreover, the ability to automate steps in the analysis and interpretation of histological images that currently require manual inspection by pathologists could revolutionize medical diagnostics. Toward this end, we developed a new imaging approach called multidimensional microscopic molecular profiling (MMMP) that can measure several independent molecular properties in situ at subcellular resolution for the same tissue specimen. MMMP involves repeated cycles of antibody or histochemical staining, imaging, and signal removal, which ultimately can generate information analogous to a multidimensional flow cytometry analysis on intact tissue sections. We performed an MMMP analysis on a tissue microarray containing a diverse set of 102 human tissues using a panel of 15 informative antibody and 5 histochemical stains plus DAPI. Large-scale unsupervised analysis of MMMP data, and visualization of the resulting classifications, identified molecular profiles that were associated with functional tissue features. We then directly annotated H&E images from this MMMP series such that canonical histological features of interest (e.g. blood vessels, epithelium, red blood cells) were individually labeled. By integrating image annotation data, we identified molecular signatures that were associated with specific histological annotations and we developed statistical models for automatically classifying these features. The classification accuracy for automated histology labeling was objectively evaluated using a cross-validation strategy, and significant accuracy (with a median per-pixel rate of 77% per feature from 15 annotated samples) for de novo feature prediction was obtained. These results suggest that high-dimensional profiling may advance the development of computer

  18. Predictive features of breast cancer on Mexican screening mammography patients

    NASA Astrophysics Data System (ADS)

    Rodriguez-Rojas, Juan; Garza-Montemayor, Margarita; Trevino-Alvarado, Victor; Tamez-Pena, José Gerardo

    2013-02-01

    Breast cancer is the most common type of cancer worldwide. In response, breast cancer screening programs are becoming common around the world, and public programs now serve millions of women. These programs are expensive, requiring many specialized radiologists to examine all images. Nevertheless, many countries, such as Mexico, lack trained radiologists, which is a barrier to decreasing breast cancer mortality and points to the need for a triaging system that prioritizes high-risk cases for prompt interpretation. We therefore explored, in an image database of Mexican patients, whether high-risk cases can be distinguished using image features. We collected a set of 200 digital screening mammography cases from a hospital in Mexico and assigned low- or high-risk labels according to their BIRADS scores. Breast tissue segmentation was performed using an automatic procedure. Image features were obtained considering only the segmented region on each view and comparing the bilateral differences of the obtained features. Predictive combinations of features were chosen using a genetic-algorithm-based feature selection procedure. The best model found was able to classify low-risk and high-risk cases with an area under the ROC curve of 0.88 on a 150-fold cross-validation test. The features selected were associated with the differences in signal distribution and tissue shape on bilateral views. The model found can be used to automatically identify high-risk cases and trigger the necessary measures to provide prompt treatment.

  19. Quantitative imaging features to predict cancer status in lung nodules

    NASA Astrophysics Data System (ADS)

    Liu, Ying; Balagurunathan, Yoganand; Atwater, Thomas; Antic, Sanja; Li, Qian; Walker, Ronald; Smith, Gary T.; Massion, Pierre P.; Schabath, Matthew B.; Gillies, Robert J.

    2016-03-01

    Background: We propose a systematic methodology to quantify incidentally identified lung nodules based on observed radiological traits on a point scale. A classification model based on these quantitative traits was used to predict cancer status. Materials and Methods: We used low dose computed tomography (LDCT) images from 102 patients for this study; 24 semantic traits were systematically scored from each image. We built a machine learning classifier in a cross-validation setting to find the best predictive imaging features to differentiate malignant from benign lung nodules. Results: The best feature triplet to discriminate malignancy was based on long axis, concavity and lymphadenopathy, with an average AUC of 0.897 (Accuracy of 76.8%, Sensitivity of 64.3%, Specificity of 90%). A similar semantic triplet optimized on Sensitivity/Specificity (Youden's J index) included long axis, vascular convergence and lymphadenopathy, which had an average AUC of 0.875 (Accuracy of 81.7%, Sensitivity of 76.2%, Specificity of 95%). Conclusions: Quantitative radiological image traits can differentiate malignant from benign lung nodules. These semantic features, along with size measurement, enhance the prediction accuracy.
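
    The evaluation metrics quoted in this abstract (AUC, sensitivity, specificity, Youden's J index) can be computed as in the following minimal sketch. The labels and classifier scores are invented toy values, not data from the study, and the function names are illustrative.

```python
def confusion_rates(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold is called malignant (1)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """Area under the ROC curve as the probability that a random positive
    outscores a random negative (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_youden_threshold(labels, scores):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1.
    If several thresholds tie, the lowest one is returned."""
    def youden_j(t):
        sens, spec = confusion_rates(labels, scores, t)
        return sens + spec - 1.0
    return max(sorted(set(scores)), key=youden_j)


# Hypothetical toy data: 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(auc(labels, scores))                    # 0.9375
print(confusion_rates(labels, scores, 0.7))   # (0.75, 1.0)
print(best_youden_threshold(labels, scores))  # 0.4
```

    Optimizing on Youden's J trades sensitivity against specificity at a single operating point, whereas AUC summarizes performance over all thresholds, which is why the two criteria can favor different feature triplets.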

  20. Application of optimal prediction to molecular dynamics

    SciTech Connect

    Barber IV, John Letherman

    2004-12-01

    Optimal prediction is a general system reduction technique for large sets of differential equations. In this method, which was devised by Chorin, Hald, Kast, Kupferman, and Levy, a projection operator formalism is used to construct a smaller system of equations governing the dynamics of a subset of the original degrees of freedom. This reduced system consists of an effective Hamiltonian dynamics, augmented by an integral memory term and a random noise term. Molecular dynamics is a method for simulating large systems of interacting fluid particles. In this thesis, I construct a formalism for applying optimal prediction to molecular dynamics, producing reduced systems from which the properties of the original system can be recovered. These reduced systems require significantly less computational time than the original system. I initially consider first-order optimal prediction, in which the memory and noise terms are neglected. I construct a pair approximation to the renormalized potential, and ignore three-particle and higher interactions. This produces a reduced system that correctly reproduces static properties of the original system, such as energy and pressure, at low-to-moderate densities. However, it fails to capture dynamical quantities, such as autocorrelation functions. I next derive a short-memory approximation, in which the memory term is represented as a linear frictional force with configuration-dependent coefficients. This allows the use of a Fokker-Planck equation to show that, in this regime, the noise is δ-correlated in time. This linear friction model reproduces not only the static properties of the original system, but also the autocorrelation functions of dynamical variables.
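
    The short-memory picture described above, a linear frictional force plus noise that is δ-correlated in time, has the form of a Langevin equation. The sketch below is a generic illustration and not the thesis's reduced model: it treats a single free particle in one dimension, uses a constant friction coefficient gamma instead of configuration-dependent coefficients, and integrates with a simple Euler-Maruyama step. All parameter names are assumptions.

```python
import math
import random


def langevin_step(v, gamma, kT_over_m, dt, rng):
    """One Euler-Maruyama step of dv = -gamma*v*dt + sqrt(2*gamma*kT/m)*dW.

    The noise amplitude follows the fluctuation-dissipation relation, so the
    stationary velocity variance is kT/m. Successive noise draws are
    independent, which is the discrete counterpart of delta-correlated noise.
    """
    noise = math.sqrt(2.0 * gamma * kT_over_m * dt) * rng.gauss(0.0, 1.0)
    return v - gamma * v * dt + noise


def simulate(v0, gamma, kT_over_m, dt, n_steps, seed=0):
    """Return the velocity trajectory [v_0, v_1, ..., v_n]."""
    rng = random.Random(seed)
    v = v0
    trajectory = [v]
    for _ in range(n_steps):
        v = langevin_step(v, gamma, kT_over_m, dt, rng)
        trajectory.append(v)
    return trajectory


# Illustrative parameters in arbitrary units.
traj = simulate(v0=1.0, gamma=0.5, kT_over_m=1.0, dt=0.01, n_steps=20000)
second_half = traj[len(traj) // 2:]
variance = sum(v * v for v in second_half) / len(second_half)
print(round(variance, 2))  # expected to fluctuate around kT/m = 1.0
```

    For this model the velocity autocorrelation function decays exponentially at rate gamma, which is the kind of dynamical quantity that a first-order reduction without memory and noise terms cannot reproduce.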

  1. Proteomic Features Predict Seroreactivity against Leptospiral Antigens in Leptospirosis Patients

    PubMed Central

    2015-01-01

    With increasing efficiency, accuracy, and speed we can access complete genome sequences from thousands of infectious microorganisms; however, the ability to predict antigenic targets of the immune system based on amino acid sequence alone is still needed. Here we use a Leptospira interrogans microarray expressing 91% (3359) of all leptospiral predicted ORFs (3667) and make an empirical accounting of all antibody reactive antigens recognized in sera from naturally infected humans; 191 antigens elicited an IgM or IgG response, representing 5% of the whole proteome. We classified the reactive antigens into 26 annotated COGs (clusters of orthologous groups), 26 JCVI Mainrole annotations, and 11 computationally predicted proteomic features. Altogether, 14 significantly enriched categories were identified, which are associated with immune recognition including mass spectrometry evidence of in vitro expression and in vivo mRNA up-regulation. Together, this group of 14 enriched categories accounts for just 25% of the leptospiral proteome but contains 50% of the immunoreactive antigens. These findings are consistent with our previous studies of other Gram-negative bacteria. This genome-wide approach provides an empirical basis to predict and classify antibody reactive antigens based on structural, physical–chemical, and functional proteomic features and a framework for understanding the breadth and specificity of the immune response to L. interrogans. PMID:25358092

  2. Identifying predictive morphologic features of malignancy in eyelid lesions

    PubMed Central

    Leung, Christina; Johnson, Davin; Pang, Renee; Kratky, Vladimir

    2015-01-01

    Objective: To determine features of eyelid lesions most predictive of malignancy, and to design a key to assist general practitioners in the triaging of such lesions. Design: Prospective observational study. Setting: Department of Ophthalmology at Queen’s University in Kingston, Ont. Participants: A total of 199 consecutive periocular lesions requiring biopsy or excision were included. Main outcome measures: First, potential features suggestive of malignancy for eyelid lesions were identified based on a survey sent to Canadian oculoplastic surgeons. The sensitivity, specificity, and odds ratios (ORs) of these features were then determined using 199 consecutive photographed eyelid lesions of patients who presented to the Department of Ophthalmology and underwent biopsy or excision. A triage key was then created based on the features with the highest ORs, and it was pilot-tested by a group of medical students. Results: Of the 199 lesions included, 161 (80.9%) were benign and 38 (19.1%) were malignant. The 3 features with the highest ORs in predicting malignancy were infiltration (OR = 18.2, P < .01), ulceration (OR = 14.7, P < .01), and loss of eyelashes (OR = 6.0, P < .01). The acronym LUI (loss of eyelashes, ulceration, infiltration) was created to assist in memory recall. After watching a video describing the LUI triage key, the mean total score of a group of medical students for correctly identifying malignant lesions increased from 46% to 70% (P < .001). Conclusion: Differentiating benign from malignant eyelid lesions can be difficult even for experienced physicians. The LUI triage key provides physicians with an evidence-based, easy-to-remember system for assisting in the triaging of these lesions. PMID:25756148
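
    Sensitivity, specificity, and the odds ratio of a single feature all follow from a 2x2 table of feature presence against biopsy result. The sketch below shows the arithmetic. Only the totals (38 malignant, 161 benign) come from the abstract; the split of each group into feature-present and feature-absent lesions is hypothetical, so the printed values do not correspond to any feature in the study.

```python
def two_by_two_stats(tp, fp, fn, tn):
    """Summary statistics for one binary feature.

    tp: malignant lesions with the feature    fp: benign lesions with the feature
    fn: malignant lesions without the feature tn: benign lesions without the feature
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Odds ratio = (odds of the feature among malignant) / (odds among benign).
    if fp * fn == 0:
        odds_ratio = float("inf")
    else:
        odds_ratio = (tp * tn) / (fp * fn)
    return sensitivity, specificity, odds_ratio


# Hypothetical counts: the feature is present in 20 of 38 malignant lesions
# and in 10 of 161 benign lesions.
sens, spec, odds = two_by_two_stats(tp=20, fp=10, fn=18, tn=151)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} OR={odds:.1f}")
```

    A feature can have a high odds ratio while still missing many malignant lesions, which is one reason a triage key would combine several features.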

  3. Predicting Malignancy in Thyroid Nodules: Molecular Advances

    PubMed Central

    Melck, Adrienne L.; Yip, Linwah

    2016-01-01

    Over the last several years, a clearer understanding of the genetic alterations underlying thyroid carcinogenesis has developed. This knowledge can be utilized to tackle one of the greatest challenges facing thyroidologists: management of the indeterminate thyroid nodule. Despite the accuracy of fine needle aspiration cytology, many patients undergo invasive surgery in order to determine if a follicular or Hurthle cell neoplasm is malignant, and better diagnostic tools are required. A number of biomarkers have recently been studied and show promise in this setting. In particular, BRAF, RAS, PAX8-PPARγ, microRNAs and loss of heterozygosity have each been demonstrated as useful molecular tools for predicting malignancy and can thereby guide decisions regarding surgical management of nodular thyroid disease. This review summarizes the current literature surrounding each of these markers and highlights our institution’s prospective analysis of these markers and their subsequent incorporation into our management algorithms for thyroid nodules. PMID:21818817

  4. A Prediction Model for Membrane Proteins Using Moments Based Features.

    PubMed

    Butt, Ahmad Hassan; Khan, Sher Afzal; Jamil, Hamza; Rasool, Nouman; Khan, Yaser Daanial

    2016-01-01

    The most expedient unit of the human body is its cell. Encapsulated within the cell are many infinitesimal entities and molecules which are protected by a cell membrane. The proteins that are associated with this lipid-based bilayer cell membrane are known as membrane proteins and are considered to play a significant role. These membrane proteins exert their effects in cellular activities inside and outside of the cell. According to scientists in pharmaceutical organizations, these membrane proteins perform key tasks in drug interactions. In this study, a technique is presented that is based on various computationally intelligent methods used for the prediction of membrane proteins without the experimental use of mass spectrometry. Statistical moments were used to extract features, and a Multilayer Neural Network was trained using backpropagation for the prediction of membrane proteins. Results show that the proposed technique performs better than existing methodologies. PMID:26966690

  6. Characterization of statistical features for plant microRNA prediction

    PubMed Central

    2011-01-01

    Background: Several tools are available to identify miRNAs from deep-sequencing data; however, only a few of them, like miRDeep, can identify novel miRNAs and are also available as a standalone application. Given the differences between plant and animal miRNAs, particularly in the distribution of hairpin length and the nature of complementarity with the duplex partner (or miRNA star), the underlying statistical features of miRDeep and of other tools that use similar features are likely to be affected. Results: The potential effects on features such as minimum free energy, stability of secondary structures, and excision length were examined, and the parameters of those displaying sizable changes were estimated for plant-specific miRNAs. We found that most of these features acquired a new set of values or distributions for plant-specific miRNAs. While the length of conserved positions (nucleus) in mature miRNAs was relatively longer in plants, the difference in the distribution of minimum free energy between real and background hairpins was marginal. However, the choice of source (species) of background sequences was found to affect both the minimum free energy and miRNA hairpin stability. The new parameters were tested on an Illumina dataset from maize seedlings, and the results were compared with those obtained using default parameters. The newly parameterized model was found to have much improved specificity and sensitivity over its default counterpart. Conclusions: In summary, the present study reports the behavior of a few general and tool-specific statistical features for improving the prediction accuracy of plant miRNAs from deep-sequencing data. PMID:21324149

  7. Universal molecular features of refractory dissolved organic matter in fresh- and seawater

    NASA Astrophysics Data System (ADS)

    Dittmar, T.; Blasius, B.; Steinbrink, C.; Feenders, C.; Stumm, M.; Christoffers, J.; Niggemann, J.; Gerdts, G.; Osterholz, H.; Seibt, M.; Seidel, M.; Vähätalo, A.

    2012-04-01

    Dissolved organic matter (DOM) is among the largest pools of reduced carbon on Earth's surface. Its molecular structure and the reasons behind its stability in the aquatic environment are unknown. We present a mathematical model that predicts essential molecular features of refractory dissolved organic matter in fresh- and seawater. The model has only eight input variables and can accurately reproduce the presence and abundance of up to 10,000 molecular formulae in aquatic systems. The model was established with ultrahigh-resolution mass spectrometry data of North Pacific deep water (obtained on a 15 Tesla Fourier-transform ion cyclotron resonance mass spectrometer, FT-ICR-MS). We determined the molecular formulae of DOM with the help of FT-ICR-MS in >1,000 samples from around the globe, covering a wide variety of open ocean, freshwater and coastal systems. The molecular formulae predicted from our North Pacific deep water model were present in all seawater and freshwater samples. In terrigenous DOM, we detected a second group of compounds that could also be accurately predicted with our model by using a different set of eight input variables. This exclusively terrigenous compound group was more photo-reactive than the universal compound group. During a two-year sampling period at a continental shelf station, the universal DOM compounds were always present at their predicted abundance. During plankton blooms, additional compounds were produced that did not match our model and that did not persist over the longer term. The universal DOM pattern was also not observed in mesocosm experiments where algal and bacterial blooms were artificially induced. Refractory DOM in any aquatic system not only shares the same molecular formulae at the same relative abundance, but compounds with the same molecular formulae most likely have the same molecular structure, independent of the origin of DOM. Fragmentation experiments in the FT-ICR-MS on a wide range of molecular formulae revealed

  8. Assist feature printability prediction by 3-D resist profile reconstruction

    NASA Astrophysics Data System (ADS)

    Zheng, Xin; Huang, Jensheng; Chin, Fook; Kazarian, Aram; Kuo, Chun-Chieh

    2012-06-01

    Sub-resolution Assist Features (SRAFs) are powerful tools to enhance the focus margin of drawn patterns. SRAFs are placed and sized so they do not print on the wafer, but the larger the SRAF, the more effective it becomes at enhancing through-focus stability. The size and location of an SRAF that will image on a wafer is highly dependent upon neighboring patterns and models of SRAF printability are, at present, unreliable. Model-based SRAF placement has been used to enhance resolution at 20nm node processes and below with stringent requirements that inserted SRAFs will not be imaged on wafer. However, despite widespread SRAF use and hard data as to SRAF effectiveness, it has been very difficult to develop a process model that accurately predicts under what process conditions an SRAF will image on a wafer. More accurate models of SRAF printing should allow model based SRAF placement to be relaxed, resulting in more effective SRAF placement and broader focus margins. One of the first problems with the concept of SRAF printability is the definition of an SRAF printing on a wafer. This is not obvious because two different states of printing exist. The first print state is when a residue is left on a wafer from the SRAF. The first state can be considered printing from the point of view that photoresist is on the wafer and the photoresist may even lift off and cause defects. However, the first state can be considered non-printing because the over etch from the etch process will generally remove the photoresist residual and the material underneath. The second state is when a pattern is formed and etched into the substrate, a state at which the pattern has clearly printed on the wafer. Of course, intermediate states may also be defined. In order to be applicable, an SRAF printability model must be able to predict both printing states. In addition, the model must be able to extrapolate to configurations beyond those used to develop the model in the first place. These model

  9. Evaluating a variety of text-mined features for automatic protein function prediction with GOstruct.

    PubMed

    Funk, Christopher S; Kahanda, Indika; Ben-Hur, Asa; Verspoor, Karin M

    2015-01-01

    Most computational methods that predict protein function do not take advantage of the large amount of information contained in the biomedical literature. In this work we evaluate both ontology term co-mention and bag-of-words features mined from the biomedical literature and analyze their impact in the context of a structured output support vector machine model, GOstruct. We find that even simple literature based features are useful for predicting human protein function (F-max: Molecular Function =0.408, Biological Process =0.461, Cellular Component =0.608). One advantage of using literature features is their ability to offer easy verification of automated predictions. We find through manual inspection of misclassifications that some false positive predictions could be biologically valid predictions based upon support extracted from the literature. Additionally, we present a "medium-throughput" pipeline that was used to annotate a large subset of co-mentions; we suggest that this strategy could help to speed up the rate at which proteins are curated. PMID:26005564

  10. The formation of discrete high velocity molecular features

    NASA Astrophysics Data System (ADS)

    Hartquist, T. W.; Dyson, J. E.

    1987-10-01

    Clumps embedded in a flowing diffuse medium will be dissipated before ram pressure accelerates them substantially. Molecular hydrogen can be accelerated to high speeds by passing through a slow shock leading a shell at the edge of a wind-driven bubble if the density in the ambient medium drops rapidly enough to allow the shell to accelerate subsequently. The shell will be subject to the Rayleigh-Taylor instability which will drive transonic turbulence but will not initiate the formation of fragments having large density contrasts until the shell reaches sufficient speeds to become thermally unstable. The existence of high velocity discrete features in and the magnitude of the linewidth of the H2 emission from CRL 618 are explained with this acceleration mechanism. High velocity water masers may be formed in a similar fashion, but not Herbig-Haro objects.

  11. Clinical and molecular genetic features of ARC syndrome.

    PubMed

    Gissen, Paul; Tee, Louise; Johnson, Colin A; Genin, Emmanuelle; Caliebe, Almuth; Chitayat, David; Clericuzio, Carol; Denecke, Jonas; Di Rocco, Maja; Fischler, Björn; FitzPatrick, David; García-Cazorla, Angeles; Guyot, Delphine; Jacquemont, Sebastien; Koletzko, Sibylle; Leheup, Bruno; Mandel, Hanna; Sanseverino, Maria Teresa Vieira; Houwen, Roderick H J; McKiernan, Patrick J; Kelly, Deirdre A; Maher, Eamonn R

    2006-10-01

    Arthrogryposis, renal dysfunction and cholestasis (ARC) syndrome (MIM 208085) is an autosomal recessive multisystem disorder that may be associated with germline VPS33B mutations. VPS33B is involved in regulation of vesicular membrane fusion by interacting with SNARE proteins, and evidence of abnormal polarised membrane protein trafficking has been reported in ARC patients. We characterised clinical and molecular features of ARC syndrome in order to identify potential genotype-phenotype correlations. The clinical phenotype of 62 ARC syndrome patients was analysed. In addition to classical features described previously, all patients had severe failure to thrive, which was not adequately explained by the degree of liver disease and 10% had structural cardiac defects. Almost half of the patients who underwent diagnostic organ biopsy (7/16) developed life-threatening haemorrhage. We found that most patients (9/11) who suffered severe haemorrhage (7 post biopsy and 4 spontaneous) had normal platelet count and morphology. Germline VPS33B mutations were detected in 28/35 families (48/62 individuals) with ARC syndrome. Several mutations were restricted to specific ethnic groups. Thus p.Arg438X mutation was common in the UK Pakistani families and haplotyping was consistent with a founder mutation with the most recent common ancestor 900-1,000 years ago. Heterozygosity was found in the VPS33B locus in some cases of ARC providing the first evidence of a possible second ARC syndrome gene. In conclusion we state that molecular diagnosis is possible for most children in whom ARC syndrome is suspected and VPS33B mutation analysis should replace organ biopsy as a first line diagnostic test for ARC syndrome. PMID:16896922

  12. Molecular Features Related to HIV Integrase Inhibition Obtained from Structure- and Ligand-Based Approaches

    PubMed Central

    de Carvalho, Luciana L.; Maltarollo, Vinícius G.; de Lima, Emmanuela Ferreira; Weber, Karen C.; Honorio, Kathia M.; da Silva, Albérico B. F.

    2014-01-01

    Among several biological targets to treat AIDS, HIV integrase is a promising enzyme that can be employed to develop new anti-HIV agents. The aim of this work is to propose a mechanistic interpretation of HIV-1 integrase inhibition and to rationalize the molecular features related to the binding affinity of studied ligands. A set of 79 HIV-1 integrase inhibitors and its relationship with biological activity are investigated employing 2D and 3D QSAR models, docking analysis and DFT studies. Analyses of docking poses and frontier molecular orbitals revealed important features on the main ligand-receptor interactions. 2D and 3D models presenting good internal consistency, predictive power and stability were obtained in all cases. Significant correlation coefficients (r2 = 0.908 and q2 = 0.643 for 2D model; r2 = 0.904 and q2 = 0.719 for 3D model) were obtained, indicating the potential of these models for untested compounds. The generated holograms and contribution maps revealed important molecular requirements to HIV-1 IN inhibition and several evidences for molecular modifications. The final models along with information resulting from molecular orbitals, 2D contribution and 3D contour maps should be useful in the design of new inhibitors with increased potency and selectivity within the chemical diversity of the data. PMID:24416129

  13. Molecular features of hypothalamic plaques in Alzheimer's disease.

    PubMed Central

    Standaert, D. G.; Lee, V. M.; Greenberg, B. D.; Lowery, D. E.; Trojanowski, J. Q.

    1991-01-01

    The pathology of Alzheimer's disease (AD) involves subcortical as well as cortical structures. The authors have used immunohistochemical methods to study the molecular composition of AD plaques in the hypothalamus. In contrast to previous studies using histochemical methods, the authors observed large numbers of diffuse plaques in the AD hypothalamus labeled with an antiserum to the beta-amyloid, or A4 peptide, of the beta-amyloid precursor proteins (beta APPs), whereas A4-immunoreactive plaques were uncommon in the hypothalamus of patients without AD. Unlike plaques in the cortex and hippocampus of AD patients, hypothalamic plaques did not contain epitopes corresponding to other regions of the beta APPs, nor did they contain tau-, neurofilament-, or microtubule-associated protein-reactive epitopes, and did not disrupt the neuropil or produce astrogliosis. These findings demonstrate that there are substantial molecular and cellular differences in the pathologic features of AD in the hypothalamus compared with those observed in hippocampal and cortical structures, which may provide insight into the pathogenetic mechanisms of AD. PMID:1653521

  14. Predicting the Presence of Large Fish through Benthic Geomorphic Features

    NASA Astrophysics Data System (ADS)

    Knuth, F.; Sautter, L.; Levine, N. S.; Kracker, L.

    2013-12-01

    Marine Protected Areas are critical in sustaining the resilience of fish populations to commercial fishing operations. Using acoustic data to survey these areas promises efficiency, accuracy, and minimal environmental impact. In July, 2013, the NOAA Ship Pisces collected bathymetric, backscatter and water column data for 10 proposed MPA sites along the U.S. Southeast Atlantic continental shelf. A total of 205 km2 of seafloor were mapped between Mayport, FL and Wilmington, NC, using the SIMRAD ME70 and EK60 echosounder systems. These data were processed in Caris HIPS, QPS FMGT, MATLAB and ArcGIS. The backscatter and bathymetry reveal various benthic geomorphic features, including flat sand, rippled sand, and rugose hard bottom. Water column data directly above highly rugose hardbottom contains the greatest counts for large fish populations. Using spatial statistics, such as a geographically weighted regression model, we aim to identify features of the benthic profile, including rugosity, curvature and slope, that can predict the presence of large fish. The success of this approach will greatly expedite fishery surveys, minimize operational cost and aid in making timely management decisions.

  15. Enhanced Protein Fold Prediction Method Through a Novel Feature Extraction Technique.

    PubMed

    Wei, Leyi; Liao, Minghong; Gao, Xing; Zou, Quan

    2015-09-01

    Information on protein 3-dimensional (3D) structures plays an essential role in molecular biology, cell biology, biomedicine, and drug design. Protein fold prediction is considered as an immediate step for deciphering the protein 3D structures. Therefore, protein fold prediction is one of the fundamental problems in structural bioinformatics. Recently, numerous taxonomic methods have been developed for protein fold prediction. Unfortunately, the overall prediction accuracies achieved by existing taxonomic methods are not satisfactory although much progress has been made. To address this problem, we propose a novel taxonomic method, called PFPA, which is featured by combining a novel feature set through an ensemble classifier. Particularly, the sequential evolution information from the profiles of PSI-BLAST and the local and global secondary structure information from the profiles of PSI-PRED are combined to construct a comprehensive feature set. Experimental results demonstrate that PFPA outperforms the state-of-the-art predictors. To be specific, when tested on the independent testing set of a benchmark dataset, PFPA achieves an overall accuracy of 73.6%, which is the leading accuracy ever reported. Moreover, PFPA performs well without significant performance degradation on three updated large-scale datasets, indicating the robustness and generalization of PFPA. Currently, a webserver that implements PFPA is freely available on http://121.192.180.204:8080/PFPA/Index.html. PMID:26335556

  16. Prediction of cell-penetrating peptides with feature selection techniques.

    PubMed

    Tang, Hua; Su, Zhen-Dong; Wei, Huan-Huan; Chen, Wei; Lin, Hao

    2016-08-12

    Cell-penetrating peptides are a group of peptides which can transport different types of cargo molecules such as drugs across plasma membrane and have been applied in the treatment of various diseases. Thus, the accurate prediction of cell-penetrating peptides with bioinformatics methods will accelerate the development of drug delivery systems. The study aims to develop a powerful model to accurately identify cell-penetrating peptides. At first, the peptides were translated into a set of vectors with the same dimension by using dipeptide compositions. Secondly, the Analysis of Variance-based technique was used to reduce the dimension of the vector and explore the optimized features. Finally, the support vector machine was utilized to discriminate cell-penetrating peptides from non-cell-penetrating peptides. The five-fold cross-validated results showed that our proposed method could achieve an overall prediction accuracy of 83.6%. Based on the proposed model, we constructed a free webserver called C2Pred (http://lin.uestc.edu.cn/server/C2Pred). PMID:27291150
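    The first step described above, turning peptides of different lengths into vectors of the same dimension, can be sketched as follows. This is a minimal illustration only: it assumes the 20 standard amino acids and relative-frequency normalization, neither of which the abstract specifies, and the function name `dipeptide_composition` is hypothetical. The ANOVA-based feature selection and the support vector machine are not reproduced here.

```python
from itertools import product

# Assumption: only the 20 standard amino acids are counted.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered residue pairs, in a fixed order.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide frequencies.

    Every peptide maps to a vector of the same length regardless of
    its own length. Pairs containing non-standard residues are skipped.
    """
    sequence = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]
```

    A vector built this way could then be passed to any feature-selection and classification routine; which 400-dimensional subset is kept is decided by the selection step, not by this encoding.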

  17. Prognostic Significance and Molecular Features of Colorectal Mucinous Adenocarcinomas

    PubMed Central

    Wang, Mo-Jin; Ping, Jie; Li, Yuan; Holmqvist, Annica; Adell, Gunnar; Arbman, Gunnar; Zhang, Hong; Zhou, Zong-Guang; Sun, Xiao-Feng

    2015-01-01

    Mucinous adenocarcinoma (MC) is a special histology subtype of colorectal adenocarcinoma. The survival of MC is controversial and the prognostic biomarkers of MC remain unclear. We aimed to analyze the prognostic significance and molecular features of colorectal MC. This study included 755,682 and 1001 colorectal cancer (CRC) patients from the Surveillance, Epidemiology, and End Results program (SEER, 1973–2011) and Linköping Cancer (LC, 1972–2009) databases, respectively. We investigated independently the clinicopathological characteristics, survival, and variety of molecular features from these 2 databases. MC was found in 9.3% and 9.8% patients in SEER and LC, respectively. MC was more frequently localized in the right colon compared with nonmucinous adenocarcinoma (NMC) in both SEER (57.7% vs 37.2%, P < 0.001) and LC (46.9% vs 27.7%, P < 0.001). Colorectal MC patients had significantly worse cancer-specific survival (CSS) than NMC patients (SEER, P < 0.001; LC, P = 0.026), prominently in stage III (SEER, P < 0.001; LC, P = 0.023). The multivariate survival analysis showed that MC was independently related to poor prognosis in rectal cancer patients (SEER, hazard ratios [HR], 1.076; 95% confidence intervals [CI], 1.057–1.096; P < 0.001). In LC, the integrated analysis of genetic and epigenetic features showed that strong expression of PINCH (HR, 3.954; 95% CI, 1.493–10.47; P = 0.013) and weak expression of RAD50 (HR 0.348, 95% CI, 0.106–1.192; P = 0.026) were significantly associated with poor CSS of colorectal MC patients. In conclusion, the colorectal MC patients had significantly worse CSS than NMC patients, prominently in stage III. MC was an independent prognostic factor associated with worse survival in rectal cancer patients. PINCH and RAD50 were prognostic biomarkers for colorectal MC patients. PMID:26705231

  18. Predictive Features of a Cockpit Traffic Display: A Workload Assessment

    NASA Technical Reports Server (NTRS)

    Wickens, Christopher D.; Morphew, Ephimia

    1997-01-01

    Eighteen pilots flew a series of traffic avoidance maneuvers in an experiment designed to assess the support offered and workload imposed by different levels of traffic display information in a free flight simulation. Three display prototypes were compared which differed in traffic information provided. A BASELINE (BL) display provided current and (2nd order) predicted information regarding ownship and current information of an intruder aircraft, represented on lateral and vertical displays in a coplanar suite. An INTRUDER PREDICTOR (IP) display augmented the baseline display by providing lateral and vertical prediction of the intruder aircraft. A THREAT VECTOR (TV) display added to the IP display a vector that indicates the direction from ownship to the intruder at the predicted point of closest contact (POCC). The length of the vector corresponds to the radius of the protected zone, and the distance of the intersection of the vector with ownship predictor corresponds to the time available till POCC or loss of separation. Pilots time shared the traffic avoidance task with a secondary task requiring them to monitor the top of the display for faint targets. This task simulated the visual demands of out-of-cockpit scanning, and hence was used to estimate the head-down time required by the different display formats. The results revealed that both display augmentations improved performance (safety) as assessed by predicted and actual loss of separation (i.e., penetration of the protected zone). Both enhancements also reduced workload, as assessed by the NASA TLX scale. The intruder predictor display produced these benefits with no substantial impact on the qualitative nature of the avoidance maneuvers that were selected. The threat vector produced the safety benefits by inducing a greater degree of (effective) lateral maneuvering, thus partially offsetting the benefits of reduced workload. The three displays did not differ in terms of their effect on performance of

  19. Prediction of substrate-enzyme-product interaction based on molecular descriptors and physicochemical properties.

    PubMed

    Niu, Bing; Huang, Guohua; Zheng, Linfeng; Wang, Xueyuan; Chen, Fuxue; Zhang, Yuhui; Huang, Tao

    2013-01-01

    It is important to correctly and efficiently predict substrate-enzyme interactions and their products in metabolic pathways. In this work, a novel approach was introduced to encode substrate/product and enzyme molecules with molecular descriptors and physicochemical properties, respectively. Based on this encoding method, KNN was adopted to build the substrate-enzyme-product interaction network. After selecting the optimal features that are able to represent the main factors of substrate-enzyme-product interaction in our prediction, a total of 160 features out of 290 were retained, which can be clustered into ten categories: elemental analysis, geometry, chemistry, amino acid composition, predicted secondary structure, hydrophobicity, polarizability, solvent accessibility, normalized van der Waals volume, and polarity. As a result, our predicting model achieved an MCC of 0.423 and an overall prediction accuracy of 89.1% for 10-fold cross-validation test. PMID:24455714
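    The two figures reported above, MCC (Matthews correlation coefficient) and overall accuracy, measure different things, which is why a model can reach a high accuracy with a moderate MCC when the classes are imbalanced. The sketch below computes both from a binary confusion matrix using the standard definitions. The counts in the usage example are hypothetical and are not taken from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention: MCC is taken as 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denominator


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, imbalanced counts (100 positives, 900 negatives).
example = dict(tp=10, tn=880, fp=20, fn=90)
print("accuracy:", accuracy(**example))
print("MCC:", matthews_corrcoef(**example))
```

    With these illustrative counts most negatives are classified correctly, so accuracy is high, while the poor recovery of positives keeps the MCC low.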

  20. Delta hepatitis: molecular biology and clinical and epidemiological features.

    PubMed Central

    Polish, L B; Gallagher, M; Fields, H A; Hadler, S C

    1993-01-01

    Hepatitis delta virus, discovered in 1977, requires the help of hepatitis B virus to replicate in hepatocytes and is an important cause of acute, fulminant, and chronic liver disease in many regions of the world. Because hepatitis delta virus depends on this helper function, infection with it occurs either as a coinfection with hepatitis B or as a superinfection of a carrier of hepatitis B surface antigen. Although the mechanisms of transmission are similar to those of hepatitis B virus, the patterns of transmission of delta virus vary widely around the world. In regions of the world in which hepatitis delta virus infection is not endemic, the disease is confined to groups at high risk of acquiring hepatitis B infection and high-risk hepatitis B carriers. Because of the propensity of this viral infection to cause fulminant as well as chronic liver disease, continued incursion of hepatitis delta virus into areas of the world where persistent hepatitis B infection is endemic will have serious implications. Prevention depends on the widespread use of hepatitis B vaccine. This review focuses on the molecular biology and the clinical and epidemiologic features of this important viral infection. PMID:8358704

  1. PredictProtein—an open resource for online prediction of protein structural and functional features

    PubMed Central

    Yachdav, Guy; Kloppmann, Edda; Kajan, Laszlo; Hecht, Maximilian; Goldberg, Tatyana; Hamp, Tobias; Hönigschmid, Peter; Schafferhans, Andrea; Roos, Manfred; Bernhofer, Michael; Richter, Lothar; Ashkenazy, Haim; Punta, Marco; Schlessinger, Avner; Bromberg, Yana; Schneider, Reinhard; Vriend, Gerrit; Sander, Chris; Ben-Tal, Nir; Rost, Burkhard

    2014-01-01

    PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms (metastudent), comprehensive subcellular localization prediction (LocTree3), protein–protein binding sites (ISIS2), protein–polynucleotide binding sites (SomeNA) and predictions of the effect of point mutations (non-synonymous SNPs) on protein function (SNAP2). Our goal has always been to develop a system optimized to meet the demands of experimentalists not highly experienced in bioinformatics. To this end, the PredictProtein results are presented as both text and a series of intuitive, interactive and visually appealing figures. The web server and sources are available at http://ppopen.rostlab.org. PMID:24799431

  2. Beyond λ_max Part 2: Predicting Molecular Color

    ERIC Educational Resources Information Center

    Williams, Darren L.; Flaherty, Thomas J.; Alnasleh, Bassam K.

    2009-01-01

    A concise roadmap for using computational chemistry programs (i.e., Gaussian 03W) to predict the color of a molecular species is presented. A color-predicting spreadsheet is available with the online material that uses transition wavelengths and peak-shape parameters to predict the visible absorbance spectrum, transmittance spectrum, chromaticity…
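    The idea of building a visible spectrum from transition wavelengths and peak-shape parameters can be illustrated with a short sketch. It is not the authors' spreadsheet: it assumes, purely for illustration, that each transition contributes a Gaussian band in wavelength described by a hypothetical (center, peak absorbance, full width at half maximum) triple, that the bands add, and that transmittance follows from absorbance as T = 10^(-A).

```python
import math


def absorbance(wavelength_nm, bands):
    """Sum of Gaussian bands evaluated at one wavelength.

    bands: iterable of (center_nm, peak_absorbance, fwhm_nm) tuples.
    These peak-shape parameters are illustrative assumptions.
    """
    total = 0.0
    for center_nm, peak_absorbance, fwhm_nm in bands:
        # Convert full width at half maximum to a standard deviation.
        sigma = fwhm_nm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
        total += peak_absorbance * math.exp(
            -0.5 * ((wavelength_nm - center_nm) / sigma) ** 2
        )
    return total


def transmittance(absorbance_value):
    """Fraction of light transmitted for a given absorbance."""
    return 10.0 ** (-absorbance_value)


# Hypothetical single transition at 500 nm, sampled across the visible range.
bands = [(500.0, 1.0, 50.0)]
spectrum = [(nm, transmittance(absorbance(nm, bands))) for nm in range(380, 781, 50)]
```

    Converting such a transmittance spectrum into a perceived color would additionally require color-matching functions and an illuminant, which are outside this sketch.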

  3. Molecular Evolution and Structural Features of IRAK Family Members

    PubMed Central

    Gosu, Vijayakumar; Basith, Shaherin; Durai, Prasannavenkatesh; Choi, Sangdun

    2012-01-01

    The interleukin-1 receptor-associated kinase (IRAK) family comprises critical signaling mediators of the TLR/IL-1R signaling pathways. IRAKs are Ser/Thr kinases. There are 4 members in the vertebrate genome (IRAK1, IRAK2, IRAKM, and IRAK4) and an IRAK homolog, Pelle, in insects. IRAK family members are highly conserved in vertebrates, but the evolutionary relationship between IRAKs in vertebrates and insects is not clear. To investigate the evolutionary history and functional divergence of IRAK members, we performed extensive bioinformatics analysis. The phylogenetic relationship between IRAK sequences suggests that gene duplication events occurred in the evolutionary lineage, leading to early vertebrates. A comparative phylogenetic analysis with insect homologs of IRAKs suggests that the Tube protein is a homolog of IRAK4, unlike the anticipated protein, Pelle. Furthermore, the analysis supports that an IRAK4-like kinase is an ancestral protein in the metazoan lineage of the IRAK family. Through functional analysis, several potentially diverged sites were identified in the common death domain and kinase domain. These sites have been constrained during evolution by strong purifying selection, suggesting their functional importance within IRAKs. In summary, our study highlighted the molecular evolution of the IRAK family, predicted the amino acids that contributed to functional divergence, and identified structural variations among the IRAK paralogs that may provide a starting point for further experimental investigations. PMID:23166766

  4. Molecular biology of testicular germ cell tumors: unique features awaiting clinical application.

    PubMed

    Boublikova, Ludmila; Buchler, Tomas; Stary, Jan; Abrahamova, Jitka; Trka, Jan

    2014-03-01

    Testicular germ cell tumors (TGCTs) are the most common solid tumors in young adult men characterized by distinct biologic features and clinical behavior. Both genetic predispositions and environmental factors probably play a substantial role in their etiology. TGCTs arise from a malignant transformation of primordial germ cells in a process that starts prenatally, is often associated with a certain degree of gonadal dysgenesis, and involves the acquirement of several specific aberrations, including activation of SCF-CKIT, amplification of 12p with up-regulation of stem cell genes, and subsequent genetic and epigenetic alterations. Their embryonic and germ origin determines the unique sensitivity of TGCTs to platinum-based chemotherapy. Contrary to the vast majority of other malignancies, no molecular prognostic/predictive factors nor targeted therapy is available for patients with these tumors. This review summarizes the principal molecular characteristics of TGCTs that could represent a potential basis for development of novel diagnostic and treatment approaches. PMID:24182421

  5. Neural Network predictions of Diatomic and Triatomic Molecular Data

    NASA Astrophysics Data System (ADS)

    Blake Laing, W.

    1997-11-01

    The arrangement of molecules in periodic systems offers an enhanced comprehension of trends in molecular properties, a more efficient method of sorting and searching of molecular databases, and bases for the prediction of new data. Neural networks have the ability to "learn" existing data and to forecast a large amount of new data without a smoothing equation. (R. Hefferlin, B. Davis, W. B. Laing, "The Learning and Prediction of Triatomic Molecular Data with Neural Networks," International Arctic Seminar 1997, Murmansk, Russia) (J. Wohlers, W. B. Laing, R. Hefferlin, and B. Davis, "Least-Squares and Neural-Network Forecasting from Critical Data: Diatomic Molecular Internuclear Separations and Triatomic Heats of Atomization and Ionization Potentials," Advances in Molecular Similarity: JIA book series, in press) This report will present periodic systems of molecules as well as neural network predictions for additional properties of diatomic and triatomic molecules.

  6. Lung Cancer Prediction Using Neural Network Ensemble with Histogram of Oriented Gradient Genomic Features

    PubMed Central

    Adetiba, Emmanuel; Olugbara, Oludayo O.

    2015-01-01

    This paper reports an experimental comparison of artificial neural network (ANN) and support vector machine (SVM) ensembles and their “nonensemble” variants for lung cancer prediction. These machine learning classifiers were trained to predict lung cancer using samples of patient nucleotides with mutations in the epidermal growth factor receptor, Kirsten rat sarcoma viral oncogene, and tumor suppressor p53 genomes collected as biomarkers from the IGDB.NSCLC corpus. The Voss DNA encoding was used to map the nucleotide sequences of mutated and normal genomes to obtain the equivalent numerical genomic sequences for training the selected classifiers. The histogram of oriented gradient (HOG) and local binary pattern (LBP) state-of-the-art feature extraction schemes were applied to extract representative genomic features from the encoded sequences of nucleotides. The ANN ensemble and HOG best fit the training dataset of this study with an accuracy of 95.90% and mean square error of 0.0159. The result of the ANN ensemble and HOG genomic features is promising for automated screening and early detection of lung cancer. This will hopefully assist pathologists in administering targeted molecular therapy and offering counsel to early stage lung cancer patients and persons in at risk populations. PMID:25802891
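    The Voss encoding mentioned above represents a nucleotide sequence as four binary indicator sequences, one per base. The sketch below shows that mapping only. The function name `voss_encode` is hypothetical, and how the study arranged the encoded sequences before applying HOG or LBP is not stated in the abstract, so that step is not shown.

```python
def voss_encode(sequence):
    """Map a DNA string to four binary indicator sequences (Voss representation).

    Position i of the indicator for base X is 1 if the i-th nucleotide is X,
    otherwise 0. Symbols other than A, C, G, T (for example N) are 0 in
    all four indicators.
    """
    sequence = sequence.upper()
    return {
        base: [1 if nucleotide == base else 0 for nucleotide in sequence]
        for base in "ACGT"
    }


encoded = voss_encode("ACGTA")
# encoded["A"] is [1, 0, 0, 0, 1]; encoded["T"] is [0, 0, 0, 1, 0]
```

    Because each position is set in at most one indicator, the four sequences together preserve the original sequence while giving purely numerical input to downstream feature extractors.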

  7. Linking molecular feature space and disease terms for the immunosuppressive drug rapamycin.

    PubMed

    Bernthaler, Andreas; Mönks, Konrad; Mühlberger, Irmgard; Mayer, Bernd; Perco, Paul; Oberbauer, Rainer

    2011-10-01

    In addition to the development of novel drugs, drug repositioning appears promising for tackling unmet clinical needs. Here, Omics has provided the ground for novel analysis strategies for linking drug and disease by integrating profiles on the molecular as well as the clinical data level. We developed a workflow for linking drugs and diseases for identifying repositioning options, and exemplify the procedure for the immunosuppressive drug rapamycin. Our strategy rests on delineating a drug-specific molecular profile by combining Omics data reflecting the drug's impact on the cellular status as well as drug-associated molecular features extracted from the scientific literature. For rapamycin the respective profile held 905 unique molecular features reflecting defined molecular processes as identified by molecular pathway and process enrichment analysis. Literature mining identified 419 diseases significantly associated with this rapamycin molecular feature list, and transforming the significance of gene-disease associations into a continuous score allowed us to compute ROC and precision-recall for comparing this disease list with diseases already undergoing clinical trials utilizing rapamycin. The AUC of this assignment was computed as 0.84, indicating excellent recovery of relevant disease terms solely based on the drug molecular feature profile. We verified relevant indications by comparing molecular feature sets characteristic for the identified diseases to the drug molecular feature profile, demonstrating highly significant overlaps. The presented workflow allowed positive identification of diseases associated with rapamycin utilizing the drug-specific molecular feature profile, and may be well applicable to other drugs of interest. PMID:21789336

  8. Radiogenomic analysis of breast cancer: dynamic contrast enhanced - magnetic resonance imaging based features are associated with molecular subtypes

    NASA Astrophysics Data System (ADS)

    Wang, Shijian; Fan, Ming; Zhang, Juan; Zheng, Bin; Wang, Xiaojia; Li, Lihua

    2016-03-01

    Breast cancer is one of the most common malignant tumors in females, with rising incidence. The key to decreasing mortality is early diagnosis and reasonable treatment. Molecular classification could provide better insights into patient-directed therapy and prognosis prediction of breast cancer. It is known that different molecular subtypes have different characteristics in magnetic resonance imaging (MRI) examination. Therefore, we assumed that imaging features can reflect molecular information in breast cancer. In this study, we investigated associations between dynamic contrast-enhanced MRI (DCE-MRI) features and molecular subtypes in breast cancer. Sixty patients with breast cancer were enrolled and the MR images were pre-processed for noise reduction, registration and segmentation. Sixty-five dimensional imaging features including statistical characteristics, morphology, texture and dynamic enhancement in breast lesion and background regions were semiautomatically extracted. The associations between imaging features and molecular subtypes were assessed by using statistical analyses, including univariate logistic regression and multivariate logistic regression. The results of multivariate regression showed that imaging features are significantly associated with molecular subtypes of Luminal A (p=0.00473), HER2-enriched (p=0.00277) and Basal like (p=0.0117), respectively. The results indicated that three molecular subtypes are correlated with DCE-MRI features in breast cancer. Specifically, patients with a higher level of compactness or lower level of skewness in breast lesion are more likely to be Luminal A subtype. Besides, a higher value of the dynamic enhancement at T1 time in the normal side reflects a higher likelihood of the HER2-enriched subtype in breast cancer.

  9. Embedded prediction in feature extraction: application to single-trial EEG discrimination.

    PubMed

    Hsu, Wei-Yen

    2013-01-01

    In this study, an analysis system embedding neuro-fuzzy prediction in feature extraction is proposed for brain-computer interface (BCI) applications. Wavelet-fractal features combined with neuro-fuzzy predictions are applied for feature extraction in motor imagery (MI) discrimination. The features are extracted from the electroencephalography (EEG) signals recorded from participants performing left and right MI. Time-series predictions are performed by training 2 adaptive neuro-fuzzy inference systems (ANFIS), one for left and one for right MI data. Features are then calculated from the difference in multi-resolution fractal feature vector (MFFV) between the predicted and actual signals through a window of EEG signals. Finally, a support vector machine is used for classification. The performance of the proposed method is compared with that of the linear adaptive autoregressive (AAR) model and AAR time-series prediction for 6 participants from 2 data sets. The results indicate that the proposed method is promising in MI classification. PMID:23248335

  10. Clinical Risk Prediction by Exploring High-Order Feature Correlations

    PubMed Central

    Wang, Fei; Zhang, Ping; Wang, Xiang; Hu, Jianying

    2014-01-01

    Clinical risk prediction is an important problem in medical informatics, and logistic regression is one of the most widely used approaches for it. In many cases, the number of potential risk factors is fairly large and the actual set of factors that contribute to the risk is small. Sparse logistic regression has therefore been proposed, which can not only predict the clinical risk but also identify the set of relevant risk factors. The inputs of logistic regression and sparse logistic regression are required to be in vector form. This limits the applicability of these models to problems where the data cannot be naturally represented as vectors (e.g., medical images are two-dimensional matrices). To handle cases where the data are in the form of multi-dimensional arrays, we propose HOSLR: High-Order Sparse Logistic Regression, which can be viewed as a high-order extension of sparse logistic regression. Instead of solving for one classification vector as in conventional logistic regression, we solve for K classification vectors in HOSLR (K is the number of modes in the data). A block proximal descent approach is proposed to solve the problem, and its convergence is guaranteed. Finally, we validate the effectiveness of HOSLR on predicting the onset risk of patients with Alzheimer’s disease and heart failure. PMID:25954428

  11. Interval Prediction of Molecular Properties in Parametrized Quantum Chemistry

    NASA Astrophysics Data System (ADS)

    Edwards, David E.; Zubarev, Dmitry Yu.; Packard, Andrew; Lester, William A.; Frenklach, Michael

    2014-06-01

    The accurate evaluation of molecular properties lies at the core of predictive physical models. Most reliable quantum-chemical calculations are limited to smaller molecular systems while purely empirical approaches are limited in accuracy and reliability. A promising approach is to employ a quantum-mechanical formalism with simplifications and to compensate for the latter with parametrization. We propose a strategy of directly predicting the uncertainty interval for a property of interest, based on training-data uncertainties, which sidesteps the need for an optimum set of parameters.

  12. Prediction of OCR accuracy using simple image features

    SciTech Connect

    Blando, L.R.; Kanai, Junichi; Nartker, T.A.

    1995-04-01

    A classifier for predicting the character accuracy of a given page achieved by any Optical Character Recognition (OCR) system is presented. This classifier is based on measuring the amount of white speckle, the amount of character fragments, and overall size information in the page. No output from the OCR system is used. The given page is classified as either good quality (i.e., high OCR accuracy expected) or poor (i.e., low OCR accuracy expected). Six OCR systems processed two different sets of test data: a set of 439 pages obtained from technical and scientific documents and a set of 200 pages obtained from magazines. For every system, approximately 85% of the pages in each data set were correctly predicted. The performance of this classifier is also compared with the ideal-case performance of a prediction method based upon the number of reject markers in OCR generated text. In several cases, this method matched or exceeded the performance of the reject based approach.

  13. Personalized Cancer Medicine: Molecular Diagnostics, Predictive biomarkers, and Drug Resistance

    PubMed Central

    Gonzalez de Castro, D; Clarke, P A; Al-Lazikani, B; Workman, P

    2013-01-01

    The progressive elucidation of the molecular pathogenesis of cancer has fueled the rational development of targeted drugs for patient populations stratified by genetic characteristics. Here we discuss general challenges relating to molecular diagnostics and describe predictive biomarkers for personalized cancer medicine. We also highlight resistance mechanisms for epidermal growth factor receptor (EGFR) kinase inhibitors in lung cancer. We envisage a future requiring the use of longitudinal genome sequencing and other omics technologies alongside combinatorial treatment to overcome cellular and molecular heterogeneity and prevent resistance caused by clonal evolution. PMID:23361103

  14. Epileptic Seizure Prediction based on Ratio and Differential Linear Univariate Features

    PubMed Central

    Rasekhi, Jalil; Mollaei, Mohammad Reza Karami; Bandarabadi, Mojtaba; Teixeira, César A.; Dourado, António

    2015-01-01

    Bivariate features, obtained from multichannel electroencephalogram recordings, quantify the relation between different brain regions. Studies based on bivariate features have shown promising results for tackling the epileptic seizure prediction problem in patients suffering from refractory epilepsy. A new bivariate approach using univariate features is proposed here. Differences and ratios of 22 linear univariate features were calculated using pairwise combinations of 6 electroencephalogram channels, to create 330 differential and 330 relative features. The feature subsets were classified separately using support vector machines into one of two classes, preictal or nonpreictal. Furthermore, the minimum Redundancy Maximum Relevance feature reduction method is employed to improve the predictions and reduce the number of false alarms. The studies were carried out on features obtained from 10 patients. For a reduced subset of 30 features and using the differential approach, the seizures were on average predicted in 60.9% of the cases (28 out of 46 in 737.9 h of test data), with a low false prediction rate of 0.11 h−1. Results of the bivariate approaches were compared with those achieved from the original linear univariate features, extracted from 6 channels. The advantage of the proposed bivariate features is the smaller number of false predictions in comparison to the original 22 univariate features. In addition, the reduction in feature dimension could provide a less complex and more cost-effective algorithm. Results indicate that applying machine learning methods to a multidimensional feature space resulting from relative/differential pairwise combination of 22 univariate features could predict seizure onsets with high performance. PMID:25709936
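
The pairwise construction described in this abstract can be sketched as follows: 6 channels give 15 channel pairs, and 15 pairs times 22 univariate features gives the 330 differential and 330 relative features reported. This is a minimal Python sketch, not the authors' code. The function name `pairwise_features`, the direction of each difference and ratio (first channel of the pair relative to the second), the zero-denominator guard `eps`, and the placeholder input values are all assumptions made for illustration.

```python
from itertools import combinations


def pairwise_features(values, eps=1e-12):
    """Build differential and relative features from per-channel features.

    values: list with one entry per channel; each entry is a list of
            univariate feature values for that channel.
    Returns (differential, relative), each of length
    n_pairs * n_features, where n_pairs = C(n_channels, 2).
    """
    n_channels = len(values)
    n_features = len(values[0])
    differential, relative = [], []
    for a, b in combinations(range(n_channels), 2):
        for k in range(n_features):
            num, den = values[a][k], values[b][k]
            differential.append(num - den)
            # Assumed guard: replace an exactly-zero denominator by eps.
            relative.append(num / (den if den != 0 else eps))
    return differential, relative


# Placeholder data (not real EEG features): 6 channels x 22 features.
channels = [[float(c + 1) * (k + 1) for k in range(22)] for c in range(6)]
diff, ratio = pairwise_features(channels)
print(len(diff), len(ratio))
```

In a pipeline like the one described, each of the two feature sets would then be passed to a feature-reduction step and a classifier; those stages are outside this sketch.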

  15. Analysis of motion features for molecular dynamics simulation of proteins

    NASA Astrophysics Data System (ADS)

    Kamada, Mayumi; Toda, Mikito; Sekijima, Masakazu; Takata, Masami; Joe, Kazuki

    2011-01-01

    Recently, a new method for time series analysis using the wavelet transformation has been proposed by Sakurai et al. We apply it to molecular dynamics simulation of Thermomyces lanuginosa lipase (TLL). Introducing indexes to characterize collective motion of the protein, we have obtained the following two results. First, the time evolution of the collective motion involves not only dynamics within a single potential well but also wandering among multiple conformations. Second, the correlation of collective motion between secondary structures shows that collective motion involving multiple secondary structures exists. We discuss future prospects of our study involving 'disordered proteins'.

  16. Rational Prediction with Molecular Dynamics for Hit Identification

    PubMed Central

    Nichols, Sara E; Swift, Robert V; Amaro, Rommie E

    2012-01-01

    Although the motions of proteins are fundamental for their function, for pragmatic reasons, the consideration of protein elasticity has traditionally been neglected in drug discovery and design. This review details protein motion, its relevance to biomolecular interactions and how it can be sampled using molecular dynamics simulations. Within this context, two major areas of research in structure-based prediction that can benefit from considering protein flexibility, binding site detection and molecular docking, are discussed. Basic classification metrics and statistical analysis techniques, which can facilitate performance analysis, are also reviewed. With hardware and software advances, molecular dynamics in combination with traditional structure-based prediction methods can potentially reduce the time and costs involved in the hit identification pipeline. PMID:23110535

  17. Skeletal Muscle Laminopathies: A Review of Clinical and Molecular Features.

    PubMed

    Maggi, Lorenzo; Carboni, Nicola; Bernasconi, Pia

    2016-01-01

    LMNA-related disorders are caused by mutations in the LMNA gene, which encodes for the nuclear envelope proteins, lamin A and C, via alternative splicing. Laminopathies are associated with a wide range of disease phenotypes, including neuromuscular, cardiac, metabolic disorders and premature aging syndromes. The most frequent diseases associated with mutations in the LMNA gene are characterized by skeletal and cardiac muscle involvement. This review will focus on genetics and clinical features of laminopathies affecting primarily skeletal muscle. Although only symptomatic treatment is available for these patients, many achievements have been made in clarifying the pathogenesis and improving the management of these diseases. PMID:27529282

  18. Molecular Pathogenesis and Diagnostic, Prognostic and Predictive Molecular Markers in Sarcoma.

    PubMed

    Mariño-Enríquez, Adrián; Bovée, Judith V M G

    2016-09-01

    Sarcomas are infrequent mesenchymal neoplasms characterized by notable morphological and molecular heterogeneity. Molecular studies in sarcoma provide refinements to morphologic classification, and contribute diagnostic information (frequently), prognostic stratification (rarely) and predict therapeutic response (occasionally). Herein, we summarize the major molecular mechanisms underlying sarcoma pathogenesis and present clinically useful diagnostic, prognostic and predictive molecular markers for sarcoma. Five major molecular alterations are discussed, illustrated with representative sarcoma types, including 1. the presence of chimeric transcription factors, in vascular tumors; 2. abnormal kinase signaling, in gastrointestinal stromal tumor; 3. epigenetic deregulation, in chondrosarcoma, chondroblastoma, and other tumors; 4. deregulated cell survival and proliferation, due to focal copy number alterations, in dedifferentiated liposarcoma; 5. extreme genomic instability, in conventional osteosarcoma as a representative example of sarcomas with highly complex karyotype. PMID:27523972

  19. Extraction of Molecular Features through Exome to Transcriptome Alignment.

    PubMed

    Mudvari, Prakriti; Kowsari, Kamran; Cole, Charles; Mazumder, Raja; Horvath, Anelia

    2013-08-22

    Integrative Next Generation Sequencing (NGS) DNA and RNA analyses have very recently become feasible, and the studies published to date have discovered critical disease-implicated pathways, and diagnostic and therapeutic targets. A growing number of exomes, genomes and transcriptomes from the same individual are quickly accumulating, providing unique venues for analysis of mechanistic and regulatory features and, at the same time, requiring new exploration strategies. In this study, we have integrated variation and expression information of four NGS datasets from the same individual: normal and tumor breast exomes and transcriptomes. Focusing on SNP-centered variant allelic prevalence, we illustrate analytical algorithms that can be applied to extract or validate potential regulatory elements, such as expression or growth advantage, imprinting, loss of heterozygosity (LOH), somatic changes, and RNA editing. In addition, we point to some critical elements that might bias the output and recommend alternative measures to maximize the confidence of findings. The need for such strategies is especially recognized within the growing appreciation of the concept of systems biology: integrative exploration of genome and transcriptome features reveals mechanistic and regulatory insights that reach far beyond linear addition of the individual datasets. PMID:24791251

  20. Extraction of Molecular Features through Exome to Transcriptome Alignment

    PubMed Central

    Mudvari, Prakriti; Kowsari, Kamran; Cole, Charles; Mazumder, Raja; Horvath, Anelia

    2014-01-01

    Integrative Next Generation Sequencing (NGS) DNA and RNA analyses have very recently become feasible, and the studies published to date have discovered critical disease-implicated pathways, and diagnostic and therapeutic targets. A growing number of exomes, genomes and transcriptomes from the same individual are quickly accumulating, providing unique venues for analysis of mechanistic and regulatory features and, at the same time, requiring new exploration strategies. In this study, we have integrated variation and expression information of four NGS datasets from the same individual: normal and tumor breast exomes and transcriptomes. Focusing on SNP-centered variant allelic prevalence, we illustrate analytical algorithms that can be applied to extract or validate potential regulatory elements, such as expression or growth advantage, imprinting, loss of heterozygosity (LOH), somatic changes, and RNA editing. In addition, we point to some critical elements that might bias the output and recommend alternative measures to maximize the confidence of findings. The need for such strategies is especially recognized within the growing appreciation of the concept of systems biology: integrative exploration of genome and transcriptome features reveals mechanistic and regulatory insights that reach far beyond linear addition of the individual datasets. PMID:24791251

  1. Prediction of reactive hazards based on molecular structure.

    PubMed

    Saraf, S R; Rogers, W J; Mannan, M S

    2003-03-17

    There is considerable interest in prediction of reactive hazards based on chemical structure. Calorimetric measurements to determine reactivity can be resource consuming, so computational methods to predict reactivity hazards present an attractive option. This paper reviews some of the commonly employed theoretical hazard evaluation techniques, including the oxygen-balance method, ASTM CHETAH, and calculated adiabatic reaction temperature (CART). It also discusses the development of a study table to correlate and predict calorimetric properties of pure compounds. Quantitative structure-property relationships (QSPR) based on quantum mechanical calculations can be employed to correlate calorimetrically measured onset temperatures, T(o), and energies of reaction, -ΔH, with molecular properties. To test the feasibility of this approach, the QSPR technique is used to correlate differential scanning calorimeter (DSC) data, T(o) and -ΔH, with molecular properties for 19 nitro compounds. PMID:12628775
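
As a rough illustration of the simplest screening technique named in this abstract, the oxygen-balance method, the sketch below uses the commonly quoted formula for a compound CxHyNwOz: OB% = -1600 (2x + y/2 - z) / MW, where MW is the molecular weight. The function name `oxygen_balance`, the rounded atomic masses, and the two example compounds (TNT and nitromethane) are assumptions chosen for illustration; they are not taken from the paper's data set.

```python
# Rounded standard atomic masses in g/mol (assumed values for this sketch).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def oxygen_balance(c, h, n, o):
    """Oxygen balance (percent) for a compound C_c H_h N_n O_o.

    Negative values mean the molecule lacks the oxygen needed to convert
    all carbon to CO2 and all hydrogen to H2O.
    """
    mw = (c * ATOMIC_MASS["C"] + h * ATOMIC_MASS["H"]
          + n * ATOMIC_MASS["N"] + o * ATOMIC_MASS["O"])
    return -1600.0 * (2 * c + h / 2 - o) / mw


# Illustrative compounds only: TNT (C7H5N3O6), nitromethane (CH3NO2).
for name, formula in {"TNT": (7, 5, 3, 6), "nitromethane": (1, 3, 1, 2)}.items():
    print(f"{name}: OB = {oxygen_balance(*formula):.1f}%")
```

Oxygen balance is only a coarse screening index; the abstract's QSPR approach instead correlates measured onset temperatures and reaction energies with computed molecular properties.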

  2. Structural and Molecular Modeling Features of P2X Receptors

    PubMed Central

    Alves, Luiz Anastacio; da Silva, João Herminio Martins; Ferreira, Dinarte Neto Moreira; Fidalgo-Neto, Antonio Augusto; Teixeira, Pedro Celso Nogueira; de Souza, Cristina Alves Magalhães; Caffarena, Ernesto Raúl; de Freitas, Mônica Santos

    2014-01-01

    Currently, adenosine 5′-triphosphate (ATP) is recognized as the extracellular messenger that acts through P2 receptors. P2 receptors are divided into two subtypes: P2Y metabotropic receptors and P2X ionotropic receptors, both of which are found in virtually all mammalian cell types studied. Due to the difficulty in studying membrane protein structures by X-ray crystallography or NMR techniques, there is little information about these structures available in the literature. Two structures of the P2X4 receptor in truncated form have been solved by crystallography. Molecular modeling has proven to be an excellent tool for studying ionotropic receptors. Recently, modeling studies carried out on P2X receptors have advanced our knowledge of the P2X receptor structure-function relationships. This review presents a brief history of ion channel structural studies and shows how modeling approaches can be used to address relevant questions about P2X receptors. PMID:24637936

  3. Clinical and Molecular Features of POLG-Related Mitochondrial Disease

    PubMed Central

    Stumpf, Jeffrey D.; Saneto, Russell P.; Copeland, William C.

    2013-01-01

    The inability to replicate mitochondrial genomes (mtDNA) by the mitochondrial DNA polymerase (pol γ) leads to a subset of mitochondrial diseases. Many mutations in POLG, the gene that encodes pol γ, have been associated with mitochondrial diseases such as myocerebrohepatopathy spectrum (MCHS) disorders, Alpers-Huttenlocher syndrome, myoclonic epilepsy myopathy sensory ataxia (MEMSA), ataxia neuropathy spectrum (ANS), and progressive external ophthalmoplegia (PEO). This chapter explores five important topics in POLG-related disease: (1) clinical symptoms that identify and distinguish POLG-related diseases, (2) molecular characterization of defects in polymerase activity by POLG disease variants, (3) the importance of holoenzyme formation in disease presentation, (4) the role of pol γ exonuclease activity and mutagenesis in disease and aging, and (5) novel approaches to therapy and avoidance of toxicity based on primary research in pol γ replication. PMID:23545419

  4. Prediction using step-wise L1, L2 regularization and feature selection for small data sets with large number of features

    PubMed Central

    2011-01-01

    Background: Machine learning methods are nowadays used for many biological prediction problems involving drugs, ligands or polypeptide segments of a protein. To build a prediction model, a so-called training data set of molecules with measured target properties is needed. For many such problems the size of the training data set is limited, as measurements have to be performed in a wet lab. Furthermore, the problems considered are often complex, so that it is not clear which molecular descriptors (features) may be suitable to establish a strong correlation with the target property. In many applications all available descriptors are used. This can lead to difficult machine learning problems when thousands of descriptors are considered and only a few (e.g. fewer than a hundred) molecules are available for training. Results: The CoEPrA contest provides four data sets, which are typical for biological regression problems (few molecules in the training data set and thousands of descriptors). We applied the same two-step training procedure to all four regression tasks. In the first stage, we used optimized L1 regularization to select the most relevant features. Thus, the initial set of more than 6,000 features was reduced to about 50. In the second stage, we used only the features selected in the preceding stage and applied a milder L2 regularization, which generally yielded further improvement of prediction performance. Our linear model employed a soft loss function which minimizes the influence of outliers. Conclusions: The proposed two-step method showed good results on all four CoEPrA regression tasks. Thus, it may be useful for many other biological prediction problems where only a small number of molecules, described by thousands of descriptors, are available for training. PMID:22026913
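
The two-step idea in this abstract (L1 regularization to select features, then a milder L2-regularized fit on the selected features only) can be sketched in a few lines of plain Python. This is a hedged illustration, not the authors' implementation: it swaps in an ordinary squared loss where the abstract describes a soft, outlier-resistant loss, uses fixed rather than optimized regularization strengths, and all function names and the synthetic data are assumptions.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(X, y, l1=0.0, l2=0.0, n_iter=100):
    """Minimize (1/2n)||y - Xw||^2 + l1*||w||_1 + (l2/2)*||w||^2."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if z[j] == 0.0:
                continue
            rho = sum(X[i][j] * residual[i] for i in range(n)) / n + z[j] * w[j]
            new_w = soft_threshold(rho, l1) / (z[j] + l2)
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_w
    return w


def two_stage_fit(X, y, l1, l2):
    """Stage 1: L1 fit selects features. Stage 2: L2 fit on those only."""
    w_l1 = coordinate_descent(X, y, l1=l1)
    selected = [j for j, wj in enumerate(w_l1) if wj != 0.0]
    X_sel = [[row[j] for j in selected] for row in X]
    return selected, coordinate_descent(X_sel, y, l2=l2)


# Synthetic stand-in data: 30 "molecules", 100 descriptors, 2 informative.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(100)] for _ in range(30)]
y = [3.0 * row[3] - 3.0 * row[7] + random.gauss(0, 0.1) for row in X]
selected, weights = two_stage_fit(X, y, l1=0.5, l2=0.05)
print("selected features:", selected)
```

Separating the stages lets the strong L1 penalty do the selection while the weaker L2 penalty avoids over-shrinking the coefficients that survive.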

  5. Clinical impact of molecular features in diffuse large B-cell lymphoma and follicular lymphoma.

    PubMed

    Pon, Julia R; Marra, Marco A

    2016-01-14

    Our understanding of the pathogenesis and heterogeneity of diffuse large B-cell lymphoma (DLBCL) and follicular lymphoma (FL) has been dramatically enhanced by recent attempts to profile molecular features of these lymphomas. In this article, we discuss ways in which testing for molecular features may impact DLBCL and FL management if clinical trials are designed to incorporate such tests. Specifically, we discuss how distinguishing lymphomas on the basis of cell-of-origin subtypes or the presence of other molecular features is prognostically and therapeutically significant. Conversely, we discuss how the molecular similarities of DLBCL and FL have provided insight into the potential of both DLBCL and FL cases to respond to agents targeting alterations they have in common. Through these examples, we demonstrate how the translation of our understanding of cancer biology into improvements in patient outcomes depends on analyzing the molecular correlates of treatment outcomes in clinical trials and in routinely treated patients. PMID:26447189

  6. Molecular features in arsenic-induced lung tumors

    PubMed Central

    2013-01-01

    Arsenic is a well-known human carcinogen, which potentially affects ~160 million people worldwide via exposure to unsafe levels in drinking water. Lungs are one of the main target organs for arsenic-related carcinogenesis. These tumors exhibit particular features, such as squamous cell-type specificity and high incidence among never smokers. Arsenic-induced malignant transformation is mainly related to the biotransformation process intended for the metabolic clearing of the carcinogen, which results in specific genetic and epigenetic alterations that ultimately affect key pathways in lung carcinogenesis. Based on this, lung tumors induced by arsenic exposure could be considered an additional subtype of lung cancer, especially in the case of never-smokers, where arsenic is a known etiological agent. In this article, we review the current knowledge on the various mechanisms of arsenic carcinogenicity and the specific roles of this metalloid in signaling pathways leading to lung cancer. PMID:23510327

  7. Synthesis of a specified, silica molecular sieve by using computationally predicted organic structure-directing agents.

    PubMed

    Schmidt, Joel E; Deem, Michael W; Davis, Mark E

    2014-08-01

    Crystalline molecular sieves are used in numerous applications, where the properties exploited for each technology are the direct consequence of structural features. New materials are typically discovered by trial and error, and in many cases, organic structure-directing agents (OSDAs) are used to direct their formation. Here, we report the first successful synthesis of a specified molecular sieve through the use of an OSDA that was predicted from a recently developed computational method that constructs chemically synthesizable OSDAs. Pentamethylimidazolium is computationally predicted to have the largest stabilization energy in the STW framework, and is experimentally shown to strongly direct the synthesis of pure-silica STW. Other OSDAs with lower stabilization energies did not form STW. The general method demonstrated here to create STW may lead to new, simpler OSDAs for existing frameworks and provide a way to predict OSDAs for desired, theoretical frameworks. PMID:24961789

  8. Prediction and Analysis of Quorum Sensing Peptides Based on Sequence Features

    PubMed Central

    Rajput, Akanksha; Gupta, Amit Kumar; Kumar, Manoj

    2015-01-01

    Quorum sensing peptides (QSPs) are the signaling molecules used by Gram-positive bacteria in orchestrating cell-to-cell communication. In spite of their enormous importance in the signaling process, a detailed bioinformatics analysis of them is lacking. In this study, QSPs and non-QSPs were examined according to their amino acid composition, residue position, motifs and physicochemical properties. Compositional analysis shows that QSPs are enriched in aromatic residues such as Trp, Tyr and Phe. At the N-terminal, Ser was the dominant residue at the most positions, namely the first, second, third and fifth, while Phe was a preferred residue at the first, third and fifth positions from the C-terminal. A few motifs from QSPs were also extracted. Physicochemical properties like aromaticity, molecular weight and secondary structure were found to be distinguishing features of QSPs. Exploiting the above properties, we have developed a Support Vector Machine (SVM) based predictive model. During 10-fold cross-validation, the SVM achieves a maximum accuracy of 93.00%, Matthews correlation coefficient (MCC) of 0.86 and Receiver operating characteristic (ROC) of 0.98 on the training/testing dataset (T200p+200n). The developed models performed equally well on the validation dataset (V20p+20n). The server also integrates several useful analysis tools like “QSMotifScan”, “ProtFrag”, “MutGen” and “PhysicoProp”. Our analysis reveals important characteristics of QSPs and, on the basis of these unique features, we have developed a prediction algorithm “QSPpred” (freely available at: http://crdd.osdd.net/servers/qsppred). PMID:25781990
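
Two quantities mentioned in this abstract are easy to make concrete: amino acid composition as a sequence feature, and the Matthews correlation coefficient as a performance measure. The sketch below is illustrative only. The function names and the example peptide are hypothetical, and the confusion-matrix counts are an assumption: the abstract does not report them, but 186 true positives, 186 true negatives and 14 errors of each kind on 200 positives and 200 negatives is one split that would reproduce both the reported 93% accuracy and MCC of 0.86.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(peptide):
    """Fraction of each of the 20 standard amino acids in a peptide."""
    counts = Counter(peptide.upper())
    return {aa: counts[aa] / len(peptide) for aa in AMINO_ACIDS}


def aromatic_fraction(peptide):
    """Combined fraction of Trp (W), Tyr (Y) and Phe (F)."""
    comp = aa_composition(peptide)
    return comp["W"] + comp["Y"] + comp["F"]


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical peptide, used only to exercise the feature functions.
print(aromatic_fraction("SWFY"))

# Assumed counts (see the note above), not figures from the paper.
tp, tn, fp, fn = 186, 186, 14, 14
print("accuracy:", (tp + tn) / (tp + tn + fp + fn))
print("MCC:", round(mcc(tp, tn, fp, fn), 2))
```

Composition vectors of this kind are a typical input to an SVM classifier, although the published model also draws on positional, motif and physicochemical features.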

  9. Clinical Relevance of Prognostic and Predictive Molecular Markers in Gliomas.

    PubMed

    Siegal, Tali

    2016-01-01

    Sorting and grading of glial tumors by the WHO classification provide clinicians with guidance as to the predicted course of the disease and choice of treatment. Nonetheless, histologically identical tumors may have very different outcomes and responses to treatment. Molecular markers that carry both diagnostic and prognostic information add useful tools to traditional classification by redefining tumor subtypes within each WHO category. Therefore, molecular markers have become an integral part of tumor assessment in modern neuro-oncology and biomarker status now guides clinical decisions in some subtypes of gliomas. The routine assessment of IDH status improves histological diagnostic accuracy by differentiating diffuse glioma from reactive gliosis. It carries a favorable prognostic implication for all glial tumors and it is predictive for chemotherapeutic response in anaplastic oligodendrogliomas with codeletion of 1p/19q chromosomes. Glial tumors that contain chromosomal codeletion of 1p/19q are defined as tumors of oligodendroglial lineage and have a favorable prognosis. MGMT promoter methylation is a favorable prognostic marker in astrocytic high-grade gliomas and it is predictive for chemotherapeutic response in anaplastic gliomas with wild-type IDH1/2 and in glioblastoma of the elderly. The clinical implications of other molecular markers of gliomas, like mutations of EGFR and ATRX genes and BRAF fusion or point mutation, are highlighted. The potential of molecular biomarker-based classification to guide future therapeutic approaches is discussed and accentuated. PMID:26508407

  10. Wiring and Molecular Features of Prefrontal Ensembles Representing Distinct Experiences.

    PubMed

    Ye, Li; Allen, William E; Thompson, Kimberly R; Tian, Qiyuan; Hsueh, Brian; Ramakrishnan, Charu; Wang, Ai-Chi; Jennings, Joshua H; Adhikari, Avishek; Halpern, Casey H; Witten, Ilana B; Barth, Alison L; Luo, Liqun; McNab, Jennifer A; Deisseroth, Karl

    2016-06-16

    A major challenge in understanding the cellular diversity of the brain has been linking activity during behavior with standard cellular typology. For example, it has not been possible to determine whether principal neurons in prefrontal cortex active during distinct experiences represent separable cell types, and it is not known whether these differentially active cells exert distinct causal influences on behavior. Here, we develop quantitative hydrogel-based technologies to connect activity in cells reporting on behavioral experience with measures for both brain-wide wiring and molecular phenotype. We find that positive and negative-valence experiences in prefrontal cortex are represented by cell populations that differ in their causal impact on behavior, long-range wiring, and gene expression profiles, with the major discriminant being expression of the adaptation-linked gene NPAS4. These findings illuminate cellular logic of prefrontal cortex information processing and natural adaptive behavior and may point the way to cell-type-specific understanding and treatment of disease-associated states. PMID:27238022

  11. Clinical and molecular features of Joubert syndrome and related disorders

    PubMed Central

    Parisi, Melissa A.

    2009-01-01

    Joubert syndrome (JBTS; OMIM 213300) is a rare, autosomal recessive disorder characterized by a specific congenital malformation of the hindbrain and a broad spectrum of other phenotypic findings that is now known to be caused by defects in the structure and/or function of the primary cilium. The complex hindbrain malformation that is characteristic of JBTS can be identified on axial magnetic resonance imaging and is known as the molar tooth sign (MTS); other diagnostic criteria include intellectual disability, hypotonia, and often, abnormal respiratory pattern and/or abnormal eye movements. In addition, a broad spectrum of other anomalies characterize Joubert syndrome and related disorders (JSRD), and may include retinal dystrophy, ocular coloboma, oral frenulae and tongue tumors, polydactyly, cystic renal disease (including cystic dysplasia or juvenile nephronophthisis), and congenital hepatic fibrosis. The clinical course can be variable, but most children with this condition survive infancy to reach adulthood. At least 8 genes cause JSRD, with some genotype-phenotype correlations emerging, including the association between mutations in the MKS3 gene and hepatic fibrosis characteristic of the JSRD subtype known as COACH syndrome. Several of the causative genes for JSRD are implicated in other ciliary disorders, such as juvenile nephronophthisis and Meckel syndrome, illustrating the close association between these conditions and their overlapping clinical features that reflect a shared etiology involving the primary cilium. PMID:19876931

  12. Interaction of proteases with legume seed inhibitors. Molecular features.

    PubMed

    de Seidl, D S

    1996-12-01

    In the sixties, after it was found that raw black beans (Phaseolus vulgaris) were toxic, while cooked ones constitute the basic diet of the underdeveloped peoples of the world, our research, directed by Dr. Jaffé, concentrated mainly on the detection and identification of the heat-labile toxic factors in legume seeds. A micromethod for the detection of protease inhibitors (PI) in individual seeds was developed, for the purpose of establishing that the multiple trypsin inhibitors (TI) found in the Cubagua variety were expressions of single seeds and not a mixture from a non-homogeneous bean lot. Six isoinhibitors were isolated and purified, all of which were "double-headed" and interacted with trypsin (T) and chymotrypsin (CHT) independently and simultaneously, as shown by electrophoresis of their binary and ternary complexes with each and both enzymes. However, their affinity for the enzymes, including elastases, was rather variable, as was their amino acid composition, which consisted of 51 units for inhibitor V, the smallest, and 83 amino acids for inhibitor I, the largest. A low molecular weight protein fraction that inhibited subtilisin (S), but recognized neither T, CHT nor pancreatic elastase, was detected in 63 varieties of Phaseolus vulgaris as well as in broad beans (Vicia faba), chick peas (Cicer arietinum), jack beans (Canavalia ensiformis), kidney beans (Vigna aureus), etc. It was absent, though, in soybeans (Glycine max), lentils (Lens culinaris), green peas (Pisum sativum), cowpea (Vigna sinensis) and lupine seeds (Lupinus sp). Subtilisin inhibitors (SI) were isolated from black beans, broad beans, chick peas and jack beans. Their Mr is between 8 and 9 kDa and they show rather high stability in the presence of denaturing agents. They are specific toward microbial proteases: in addition to subtilisins Carlsberg and BPN', they inhibit the alkaline protease from Tritirachium album (Protease K), from Aspergillus oryzae and one isolated from

  13. Adaptive modelling of structured molecular representations for toxicity prediction

    NASA Astrophysics Data System (ADS)

    Bertinetto, Carlo; Duce, Celia; Micheli, Alessio; Solaro, Roberto; Tiné, Maria Rosaria

    2012-12-01

    We investigated the possibility of modelling structure-toxicity relationships by direct treatment of the molecular structure (without using descriptors) through an adaptive model able to retain the appropriate structural information. With respect to traditional descriptor-based approaches, this provides a more general and flexible way to tackle prediction problems that is particularly suitable when little or no background knowledge is available. Our method employs a tree-structured molecular representation, which is processed by a recursive neural network (RNN). To explore the realization of RNN modelling in toxicological problems, we employed a data set containing growth impairment concentrations (IGC50) for Tetrahymena pyriformis.

  14. Circular features with predictable size on Xanadu region of Titan

    NASA Astrophysics Data System (ADS)

    Kochemasov, G. G.

    2008-09-01

    Planets' satellites in the Solar system (rocky and icy) have in common one fundamental property: all of them move simultaneously in two orbits - around the Sun and around their planets (planets have only one orbit in the Solar system). As was shown by wave planetology [1-6], "orbits make structures". This means that movements in elliptical Keplerian orbits imply periodically changing increasing and decreasing accelerations. Multiplied by celestial body mass this produces inertia-gravity forces (Newton: F=m • a). These forces warp celestial bodies in the form of standing waves propagating in rotating bodies in four interfering orthogonal and diagonal directions. This interference gives three kinds of regularly disposed tectonic blocks: uprising (+), subsiding (-), neutral (0) (Fig. 1). Their size depends on warping wavelengths. The fundamental wave1 and its first overtone wave2 (and weaker ones) are responsible for ubiquitous tectonic dichotomy - two hemispheres - segments and sectoring. These superimposed global tectonic features are adorned by tectonic granulations, the size of which is inversely proportional to orbital frequencies: higher frequency - smaller granule, lower frequency - larger granule. A row of the planets' granulations is as follows: Mercury πR/16, Venus πR/6, Earth πR/4, Mars πR/2, asteroids πR/1, Jupiter 3πR, Saturn 7.5πR, Uranus 21πR, Neptune 41πR, Pluto 62πR (a granule size is a half of a wavelength; a scale is Earth with πR/4 granule corresponding to 1/1 year orbital frequency; R-radius). So, orbits make structures. They are simpler for planets, but much more complicated for moons. Their surfaces are saturated with granules related to two main frequencies and at least two modulated side frequencies. Two orbits imply a wave modulation. The lower circum-Sun frequency modulates the higher circum-planet frequency by dividing and multiplying it thus producing two side frequencies with corresponding waves and granules. In case of Titan for the

  15. Clinical, Epidemiologic, Histopathologic and Molecular Features of an Unexplained Dermopathy

    PubMed Central

    Pearson, Michele L.; Selby, Joseph V.; Katz, Kenneth A.; Cantrell, Virginia; Braden, Christopher R.; Parise, Monica E.; Paddock, Christopher D.; Lewin-Smith, Michael R.; Kalasinsky, Victor F.; Goldstein, Felicia C.; Hightower, Allen W.; Papier, Arthur; Lewis, Brian; Motipara, Sarita; Eberhard, Mark L.

    2012-01-01

    Background: Morgellons is a poorly characterized constellation of symptoms, with the primary manifestations involving the skin. We conducted an investigation of this unexplained dermopathy to characterize the clinical and epidemiologic features and explore potential etiologies. Methods: A descriptive study was conducted among persons at least 13 years of age and enrolled in Kaiser Permanente Northern California (KPNC) during 2006–2008. A case was defined as the self-reported emergence of fibers or materials from the skin accompanied by skin lesions and/or disturbing skin sensations. We collected detailed epidemiologic data, performed clinical evaluations and geospatial analyses and analyzed materials collected from participants' skin. Results: We identified 115 case-patients. The prevalence was 3.65 (95% CI = 2.98, 4.40) cases per 100,000 enrollees. There was no clustering of cases within the 13-county KPNC catchment area (p = .113). Case-patients had a median age of 52 years (range: 17–93) and were primarily female (77%) and Caucasian (77%). Multi-system complaints were common; 70% reported chronic fatigue and 54% rated their overall health as fair or poor with mean Physical Component Scores and Mental Component Scores of 36.63 (SD = 12.9) and 35.45 (SD = 12.89), respectively. Cognitive deficits were detected in 59% of case-patients and 63% had evidence of clinically significant somatic complaints; 50% had drugs detected in hair samples and 78% reported exposure to solvents. Solar elastosis was the most common histopathologic abnormality (51% of biopsies); skin lesions were most consistent with arthropod bites or chronic excoriations. No parasites or mycobacteria were detected. Most materials collected from participants' skin were composed of cellulose, likely of cotton origin. Conclusions: This unexplained dermopathy was rare among this population of Northern California residents, but associated with significantly reduced health-related quality of

  16. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    The subcellular location of a protein is valuable information for determining its function, screening for drug candidates, vaccine design, annotation of gene products and selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full-length protein sequence to incorporate more biological information. A new biological feature, the distribution of atomic composition, is used together with multiple physicochemical properties, amino acid composition, three-part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and the prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100, 82.47 and 88.81 for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that prediction based on biological features derived from the full-length amino acid sequence gives better accuracy than prediction based on features derived from the N-terminal alone. Considering the features as a distribution within the entire sequence brings out the underlying property distribution in greater detail and enhances the prediction accuracy.
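
    The weighted voting step mentioned in the abstract above can be pictured with a minimal sketch. The abstract does not say how the four modules are defined or weighted, so the module names, the weights and the function `weighted_vote` below are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def weighted_vote(module_predictions, module_weights):
    """Return the label whose supporting modules carry the largest total weight."""
    totals = defaultdict(float)
    for module, label in module_predictions.items():
        totals[label] += module_weights[module]
    # Iterating over sorted labels makes ties resolve alphabetically.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical outputs of four classifier modules for one protein.
predictions = {
    "atomic_composition": "nucleus",
    "physicochemical": "cytoplasm",
    "amino_acid_composition": "nucleus",
    "sequence_similarity": "cytoplasm",
}
# Hypothetical weights, e.g. derived from each module's validation accuracy.
weights = {
    "atomic_composition": 0.2,
    "physicochemical": 0.3,
    "amino_acid_composition": 0.4,
    "sequence_similarity": 0.25,
}
winner = weighted_vote(predictions, weights)
```

    Here "nucleus" collects a total weight of about 0.6 against 0.55 for "cytoplasm", so it wins even though the modules are split two against two.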

  17. Bankruptcy prediction using SVM models with a new approach to combine features selection and parameter optimisation

    NASA Astrophysics Data System (ADS)

    Zhou, Ligang; Keung Lai, Kin; Yen, Jerome

    2014-03-01

    Due to the economic significance of bankruptcy prediction of companies for financial institutions, investors and governments, many quantitative methods have been used to develop effective prediction models. Support vector machine (SVM), a powerful classification method, has been used for this task; however, the performance of SVM is sensitive to model form, parameter setting and features selection. In this study, a new approach based on direct search and features ranking technology is proposed to optimise features selection and parameter setting for 1-norm and least-squares SVM models for bankruptcy prediction. This approach is also compared to the SVM models with parameter optimisation and features selection by the popular genetic algorithm technique. The experimental results on a data set with 2010 instances show that the proposed models are good alternatives for bankruptcy prediction.

  18. Structural class prediction of protein using novel feature extraction method from chaos game representation of predicted secondary structure.

    PubMed

    Zhang, Lichao; Kong, Liang; Han, Xiaodong; Lv, Jinfeng

    2016-07-01

    Protein structural class prediction plays an important role in protein structure and function analysis, drug design and many other biological applications. Extracting good representation from protein sequence is fundamental for this prediction task. In recent years, although several secondary structure based feature extraction strategies have been specially proposed for low-similarity protein sequences, the prediction accuracy still remains limited. To explore the potential of secondary structure information, this study proposed a novel feature extraction method from the chaos game representation of predicted secondary structure to mainly capture sequence order information and secondary structure segments distribution information in a given protein sequence. Several kinds of prediction accuracies obtained by the jackknife test are reported on three widely used low-similarity benchmark datasets (25PDB, 1189 and 640). Compared with the state-of-the-art prediction methods, the proposed method achieves the highest overall accuracies on all the three datasets. The experimental results confirm that the proposed feature extraction method is effective for accurate prediction of protein structural class. Moreover, it is anticipated that the proposed method could be extended to other graphical representations of protein sequence and be helpful in future research. PMID:27084358
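
    As a rough illustration of a chaos game representation for a three-state secondary structure string, the sketch below moves a point halfway toward the corner assigned to each state in turn. The triangular corner layout, the starting point and the function name `chaos_game` are assumptions, and the features the authors derive from the resulting points are not reproduced here.

```python
import math

# Assumed layout: one corner of an equilateral triangle per state
# (H = helix, E = strand, C = coil).
VERTICES = {
    "H": (0.0, 0.0),
    "E": (1.0, 0.0),
    "C": (0.5, math.sqrt(3) / 2),
}


def chaos_game(ss_string, start=None):
    """Map a secondary structure string to a list of 2D points."""
    if start is None:
        start = (0.5, math.sqrt(3) / 6)  # centroid of the triangle
    x, y = start
    points = []
    for state in ss_string:
        vx, vy = VERTICES[state]
        # Move halfway from the current point toward the state's corner.
        x, y = (x + vx) / 2, (y + vy) / 2
        points.append((x, y))
    return points


points = chaos_game("HHHEEC")
```

    Because every point depends on all preceding states, the point cloud retains sequence order information, which is the property the abstract highlights.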

  19. Predictive Value of Morphological Features in Patients with Autism versus Normal Controls

    ERIC Educational Resources Information Center

    Ozgen, H.; Hellemann, G. S.; de Jonge, M. V.; Beemer, F. A.; van Engeland, H.

    2013-01-01

    We investigated the predictive power of morphological features in 224 autistic patients and 224 matched-pairs controls. To assess the relationship between the morphological features and autism, we used the receiver operator curves (ROC). In addition, we used recursive partitioning (RP) to determine a specific pattern of abnormalities that is…

  20. Using random forest to classify linear B-cell epitopes based on amino acid properties and molecular features.

    PubMed

    Huang, Jian-Hua; Wen, Ming; Tang, Li-Juan; Xie, Hua-Lin; Fu, Liang; Liang, Yi-Zeng; Lu, Hong-Mei

    2014-08-01

    Identification and characterization of B-cell epitopes in target antigens was one of the key steps in epitope-driven vaccine design, immunodiagnostic tests, and antibody production. Experimental determination of epitopes was labor-intensive and expensive. Therefore, there was an urgent need for computational methods for reliable identification of B-cell epitopes. In the current study, we proposed a novel peptide feature description method which combined peptide amino acid properties with chemical molecular features. Based on these combined features, a random forest (RF) classifier was adopted to classify B-cell epitopes and non-epitopes. RF is an ensemble method that uses recursive partitioning to generate many trees and aggregates their results; it always produces highly competitive models. The classification accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and area under the curve (AUC) values for the current method were 78.31%, 80.05%, 72.23%, 0.5836, and 0.8800, respectively. These results showed that an appropriate combination of peptide amino acid features and chemical molecular features with an RF model could enhance the prediction performance for linear B-cell epitopes. Finally, a free online service was available at http://sysbio.yznu.cn/Research/Epitopesprediction.aspx. PMID:24721579
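
    For readers unfamiliar with the reported measures, the following sketch computes accuracy, sensitivity, specificity and MCC from a binary confusion matrix using their standard definitions. The counts are invented for illustration and are not the study's data; the function name is likewise an assumption.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    total = tp + fp + tn + fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        # MCC is conventionally set to 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denominator if denominator else 0.0,
    }


# Made-up counts: 100 epitopes and 100 non-epitopes.
metrics = classification_metrics(tp=80, fp=28, tn=72, fn=20)
```

    AUC is not included because it needs the ranked classifier scores rather than a single confusion matrix.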

  1. Toward Fully in Silico Melting Point Prediction Using Molecular Simulations

    SciTech Connect

    Zhang, Y; Maginn, EJ

    2013-03-01

    Melting point is one of the most fundamental and practically important properties of a compound. Molecular simulation methods have been developed for the accurate computation of melting points. However, all of these methods need an experimental crystal structure as input, which means that such calculations are not really predictive since the melting point can be measured easily in experiments once a crystal structure is known. On the other hand, crystal structure prediction (CSP) has become an active field and significant progress has been made, although challenges still exist. One of the main challenges is the existence of many crystal structures (polymorphs) that are very close in energy. Thermal effects and kinetic factors make the situation even more complicated, such that it is still not trivial to predict experimental crystal structures. In this work, we exploit the fact that free energy differences are often small between crystal structures. We show that accurate melting point predictions can be made by using a reasonable crystal structure from CSP as a starting point for a free energy-based melting point calculation. The key is that most crystal structures predicted by CSP have free energies that are close to that of the experimental structure. The proposed method was tested on two rigid molecules and the results suggest that a fully in silico melting point prediction method is possible.

  2. Selecting radiomic features from FDG-PET images for cancer treatment outcome prediction.

    PubMed

    Lian, Chunfeng; Ruan, Su; Denœux, Thierry; Jardin, Fabrice; Vera, Pierre

    2016-08-01

    As a vital task in cancer therapy, accurately predicting the treatment outcome is valuable for tailoring and adapting a treatment plan. To this end, multiple sources of information (radiomics, clinical characteristics, genomic expressions, etc.) gathered before and during treatment are potentially profitable. In this paper, we propose such a prediction system primarily using radiomic features (e.g., texture features) extracted from FDG-PET images. The proposed system includes a feature selection method based on Dempster-Shafer theory, a powerful tool to deal with uncertain and imprecise information. It aims to improve the prediction accuracy, and reduce the imprecision and overlaps between different classes (treatment outcomes) in a selected feature subspace. Considering that training samples are often small-sized and imbalanced in our applications, a data balancing procedure and specified prior knowledge are taken into account to improve the reliability of the selected feature subsets. Finally, the Evidential K-NN (EK-NN) classifier is used with selected features to output prediction results. Our prediction system has been evaluated on synthetic and clinical datasets, consistently showing good performance. PMID:27236221

  3. Widespread convergence in toxin resistance by predictable molecular evolution

    PubMed Central

    Ujvari, Beata; Casewell, Nicholas R.; Sunagar, Kartik; Arbuckle, Kevin; Wüster, Wolfgang; Lo, Nathan; O’Meally, Denis; Beckmann, Christa; King, Glenn F.; Deplazes, Evelyne; Madsen, Thomas

    2015-01-01

    The question about whether evolution is unpredictable and stochastic or intermittently constrained along predictable pathways is the subject of a fundamental debate in biology, in which understanding convergent evolution plays a central role. At the molecular level, documented examples of convergence are rare and limited to occurring within specific taxonomic groups. Here we provide evidence of constrained convergent molecular evolution across the metazoan tree of life. We show that resistance to toxic cardiac glycosides produced by plants and bufonid toads is mediated by similar molecular changes to the sodium-potassium-pump (Na+/K+-ATPase) in insects, amphibians, reptiles, and mammals. In toad-feeding reptiles, resistance is conferred by two point mutations that have evolved convergently on four occasions, whereas evidence of a molecular reversal back to the susceptible state in varanid lizards migrating to toad-free areas suggests that toxin resistance is maladaptive in the absence of selection. Importantly, resistance in all taxa is mediated by replacements of 2 of the 12 amino acids comprising the Na+/K+-ATPase H1–H2 extracellular domain that constitutes a core part of the cardiac glycoside binding site. We provide mechanistic insight into the basis of resistance by showing that these alterations perturb the interaction between the cardiac glycoside bufalin and the Na+/K+-ATPase. Thus, similar selection pressures have resulted in convergent evolution of the same molecular solution across the breadth of the animal kingdom, demonstrating how a scarcity of possible solutions to a selective challenge can lead to highly predictable evolutionary responses. PMID:26372961

  4. Feature maps driven no-reference image quality prediction of authentically distorted images

    NASA Astrophysics Data System (ADS)

    Ghadiyaram, Deepti; Bovik, Alan C.

    2015-03-01

    Current blind image quality prediction models rely on benchmark databases comprised of singly and synthetically distorted images, thereby learning image features that are only adequate to predict human perceived visual quality on such inauthentic distortions. However, real world images often contain complex mixtures of multiple distortions. Rather than a) discounting the effect of these mixtures of distortions on an image's perceptual quality and considering only the dominant distortion or b) using features that are only proven to be efficient for singly distorted images, we deeply study the natural scene statistics of authentically distorted images, in different color spaces and transform domains. We propose a feature-maps-driven statistical approach which avoids any latent assumptions about the type of distortion(s) contained in an image, and focuses instead on modeling the remarkable consistencies in the scene statistics of real world images in the absence of distortions. We design a deep belief network that takes model-based statistical image features derived from a very large database of authentically distorted images as input and discovers good feature representations by generalizing over different distortion types, mixtures, and severities, which are later used to learn a regressor for quality prediction. We demonstrate the remarkable competence of our features for improving automatic perceptual quality prediction on a benchmark database and on the newly designed LIVE Authentic Image Quality Challenge Database and show that our approach of combining robust statistical features and the deep belief network dramatically outperforms the state-of-the-art.

  5. Prediction of Conversion from Mild Cognitive Impairment to Alzheimer's Disease Using MRI and Structural Network Features.

    PubMed

    Wei, Rizhen; Li, Chuhan; Fogelson, Noa; Li, Ling

    2016-01-01

    Optimized magnetic resonance imaging (MRI) features and abnormalities of brain network architectures may allow earlier detection and accurate prediction of the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD). In this study, we proposed a classification framework to distinguish MCI converters (MCIc) from MCI non-converters (MCInc) by using a combination of FreeSurfer-derived MRI features and nodal features derived from the thickness network. At the feature selection step, we first employed sparse linear regression with stability selection, for the selection of discriminative features in the iterative combinations of MRI and network measures. Subsequently the top K features of available combinations were selected as optimal features for classification. To obtain unbiased results, support vector machine (SVM) classifiers with nested cross validation were used for classification. The combination of 10 features including those from MRI and network measures attained accuracies of 66.04, 76.39, 74.66, and 73.91% for mixed conversion time, 6, 12, and 18 months before diagnosis of probable AD, respectively. Analysis of the diagnostic power of different time periods before diagnosis of probable AD showed that short-term prediction (6 and 12 months) achieved more stable and higher AUC scores compared with long-term prediction (18 months), with K-values from 1 to 30. The present results suggest that meaningful predictors composed of MRI and network measures may offer the possibility for early detection of progression from MCI to AD. PMID:27148045
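
    The nested cross validation mentioned in the abstract above separates model selection from performance estimation. The sketch below only generates the index splits; the fold counts, the interleaved fold assignment and the function names are assumptions, since the abstract does not state them. In such a scheme the inner folds would be used to choose settings such as the number of top features K, and the outer test fold would be used only once to estimate accuracy.

```python
def k_fold_indices(n_items, k):
    """Split positions 0..n_items-1 into k interleaved folds."""
    return [list(range(start, n_items, k)) for start in range(k)]


def nested_cv_splits(n_samples, outer_k, inner_k):
    """Return (train, test, inner) triples; inner holds (fit, validation) pairs."""
    splits = []
    for test in k_fold_indices(n_samples, outer_k):
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        inner = []
        # Inner folds are drawn from the outer training set only,
        # so the outer test samples never influence model selection.
        for val_positions in k_fold_indices(len(train), inner_k):
            val = [train[p] for p in val_positions]
            val_set = set(val)
            fit = [i for i in train if i not in val_set]
            inner.append((fit, val))
        splits.append((train, test, inner))
    return splits


splits = nested_cv_splits(n_samples=12, outer_k=3, inner_k=2)
```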

  6. Prediction of Conversion from Mild Cognitive Impairment to Alzheimer's Disease Using MRI and Structural Network Features

    PubMed Central

    Wei, Rizhen; Li, Chuhan; Fogelson, Noa; Li, Ling

    2016-01-01

    Optimized magnetic resonance imaging (MRI) features and abnormalities of brain network architectures may allow earlier detection and accurate prediction of the progression from mild cognitive impairment (MCI) to Alzheimer's disease (AD). In this study, we proposed a classification framework to distinguish MCI converters (MCIc) from MCI non-converters (MCInc) by using a combination of FreeSurfer-derived MRI features and nodal features derived from the thickness network. At the feature selection step, we first employed sparse linear regression with stability selection, for the selection of discriminative features in the iterative combinations of MRI and network measures. Subsequently the top K features of available combinations were selected as optimal features for classification. To obtain unbiased results, support vector machine (SVM) classifiers with nested cross validation were used for classification. The combination of 10 features including those from MRI and network measures attained accuracies of 66.04, 76.39, 74.66, and 73.91% for mixed conversion time, 6, 12, and 18 months before diagnosis of probable AD, respectively. Analysis of the diagnostic power of different time periods before diagnosis of probable AD showed that short-term prediction (6 and 12 months) achieved more stable and higher AUC scores compared with long-term prediction (18 months), with K-values from 1 to 30. The present results suggest that meaningful predictors composed of MRI and network measures may offer the possibility for early detection of progression from MCI to AD. PMID:27148045

  7. Prediction of structural features and application to outer membrane protein identification

    NASA Astrophysics Data System (ADS)

    Yan, Renxiang; Wang, Xiaofeng; Huang, Lanqing; Yan, Feidi; Xue, Xiaoyu; Cai, Weiwen

    2015-06-01

    Protein three-dimensional (3D) structures provide insightful information in many fields of biology. One-dimensional properties derived from 3D structures such as secondary structure, residue solvent accessibility, residue depth and backbone torsion angles are helpful to protein function prediction, fold recognition and ab initio folding. Here, we predict various structural features with the assistance of neural network learning. Based on an independent test dataset, protein secondary structure prediction generates an overall Q3 accuracy of ~80%. Meanwhile, the prediction of relative solvent accessibility obtains the highest mean absolute error of 0.164, and prediction of residue depth achieves the lowest mean absolute error of 0.062. We further improve the outer membrane protein identification by including the predicted structural features in a scoring function using a simple profile-to-profile alignment. The results demonstrate that the accuracy of outer membrane protein identification can be improved by ~3% at a 1% false positive level when structural features are incorporated. Finally, our methods are available as two convenient and easy-to-use programs. One is PSSM-2-Features for predicting secondary structure, relative solvent accessibility, residue depth and backbone torsion angles, the other is PPA-OMP for identifying outer membrane proteins from proteomes.
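
    The two evaluation measures quoted in the abstract above have simple standard definitions: Q3 is the fraction of residues whose three-state secondary structure is predicted correctly, and the mean absolute error averages the absolute differences between predicted and observed values. The sketch below uses invented toy data; the function names are assumptions.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues with the correct three-state (H/E/C) label."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def mean_absolute_error(predicted, observed):
    """Average absolute difference, e.g. for relative solvent accessibility."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Toy example: 8 of 10 residues are predicted correctly.
q3 = q3_accuracy("HHHECCCEEC", "HHHHCCCEEE")
mae = mean_absolute_error([0.1, 0.5, 0.9], [0.2, 0.4, 0.6])
```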

  8. Clinicopathological and molecular features of malignant optic pathway glioma in an adult.

    PubMed

    Nagaishi, Masaya; Sugiura, Yoshiki; Takano, Issei; Tanaka, Yoshihiro; Suzuki, Kensuke; Yokoo, Hideaki; Hyodo, Akio

    2015-01-01

    Malignant gliomas of the optic pathway are rare, and their genetic alterations are poorly understood. We describe a 64-year-old woman with anaplastic astrocytoma originating from the optic pathway, together with the molecular features. She presented with progressive visual field loss, and a biopsy sample was obtained from the lesion in the optic chiasm. She underwent radiosurgery concomitant with temozolomide chemotherapy, and subsequently remained stable for 10 months after initial presentation. Molecular analysis indicated that the mass may have shared common molecular genetic features with conventional primary astrocytic gliomas but not pilocytic gliomas, which supported the morphologic diagnosis of anaplastic astrocytoma. Molecular analysis of malignant optic pathway gliomas in adults is useful for distinguishing between high-grade gliomas and anaplastic pilocytic astrocytomas, and for determining further therapy. PMID:25150758

  9. Acinar Cell Carcinoma of the Pancreas: Overview of Clinicopathologic Features and Insights into the Molecular Pathology

    PubMed Central

    La Rosa, Stefano; Sessa, Fausto; Capella, Carlo

    2015-01-01

    Acinar cell carcinomas (ACCs) of the pancreas are rare pancreatic neoplasms accounting for about 1–2% of pancreatic tumors in adults and about 15% in pediatric subjects. They show different clinical symptoms at presentation, different morphological features, different outcomes, and different molecular alterations. This heterogeneous clinicopathological spectrum may give rise to difficulties in the clinical and pathological diagnosis with consequential therapeutic and prognostic implications. The molecular mechanisms involved in the onset and progression of ACCs are still not completely understood, although in recent years, several attempts have been made to clarify the molecular mechanisms involved in ACC biology. In this paper, we will review the main clinicopathological and molecular features of pancreatic ACCs of both adult and pediatric subjects to give the reader a comprehensive overview of this rare tumor type. PMID:26137463

  10. In silico prediction of major drug clearance pathways by support vector machines with feature-selected descriptors.

    PubMed

    Toshimoto, Kouta; Wakayama, Naomi; Kusama, Makiko; Maeda, Kazuya; Sugiyama, Yuichi; Akiyama, Yutaka

    2014-11-01

    We have previously established an in silico classification method ("CPathPred") to predict the major clearance pathways of drugs based on an empirical decision with only four physicochemical descriptors (charge, molecular weight, octanol-water distribution coefficient, and protein unbound fraction in plasma) using a rectangular method. In this study, we attempted to improve the prediction performance of the method by introducing a support vector machine (SVM) and increasing the number of descriptors. The data set consisted of 141 approved drugs whose major clearance pathways were classified into metabolism by CYP3A4, CYP2C9, or CYP2D6; organic anion transporting polypeptide-mediated hepatic uptake; or renal excretion. With the same four default descriptors as used in CPathPred, the SVM-based predictor (named "default descriptor SVM") resulted in higher prediction performance compared with a rectangular-based predictor judged by 10-fold cross-validation. Two SVM-based predictors were also established by adding some descriptors as follows: 1) 881 descriptors predicted in silico from the chemical structures of drugs in addition to 4 default descriptors ("885 descriptor SVM"); and 2) selected descriptors extracted by a feature selection based on a greedy algorithm with default descriptors ("feature selection SVM"). The prediction accuracies of the rectangular-based predictor, default descriptor SVM, 885 descriptor SVM, and feature selection SVM were 0.49, 0.60, 0.72, and 0.91, respectively, and the overall precision values for these four methods were 0.72, 0.77, 0.86, and 0.98, respectively. In conclusion, we successfully constructed SVM-based predictors with limited numbers of descriptors to classify the major clearance pathways of drugs in humans with high prediction performance. PMID:25128502
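
    One common form of greedy feature selection is forward selection: start from a fixed set of descriptors and repeatedly add the candidate that improves a score the most, stopping when nothing helps. The sketch below assumes that form; the abstract does not specify the exact greedy procedure, and the descriptor names and the toy scoring function are hypothetical. In practice the score would be something like a cross-validated SVM accuracy.

```python
def greedy_forward_selection(defaults, candidates, score):
    """Add candidates one at a time while the score strictly improves."""
    selected = list(defaults)
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top_candidate = max(trials, key=lambda pair: pair[0])
        if top_score <= best:
            break  # no remaining candidate improves the score
        selected.append(top_candidate)
        remaining.remove(top_candidate)
        best = top_score
    return selected, best


# Toy score: each descriptor contributes a fixed gain; unknown ones cost 0.05.
GAINS = {"charge": 0.2, "mw": 0.2, "logD": 0.1, "fu": 0.1, "d1": 0.2, "d2": 0.1}


def toy_score(features):
    return sum(GAINS.get(name, -0.05) for name in features)


selected, best = greedy_forward_selection(
    defaults=["charge", "mw", "logD", "fu"],
    candidates=["d1", "d2", "d3"],
    score=toy_score,
)
```

    With this toy score the procedure keeps the four default descriptors, adds "d1" and "d2", and rejects "d3" because it would lower the score.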

  11. Clinical and Molecular Cytogenetic Characterisation of Children with Developmental Delay and Dysmorphic Features

    PubMed Central

    BERTOK, Sara; ŽERJAV TANŠEK, Mojca; KOTNIK, Primož; BATTELINO, Tadej; VOLK, Marija; PECILE, Vanna; CLEVA, Lisa; GASPARINI, Paolo; KOVAČ, Jernej; HOVNIK, Tinka

    2015-01-01

    Introduction: Developmental delay and dysmorphic features affect 1–3% of the paediatric population. In the last few years, molecular cytogenetic high-resolution techniques (comparative genomic hybridization arrays and single-nucleotide polymorphism arrays) have been proven to be a first-tier choice for clinical diagnostics of developmental delay and dysmorphic features. Methods and results: In the present article we describe the clinical advantages of the molecular cytogenetic approach (comparative genomic hybridization arrays and single-nucleotide polymorphism arrays) in the diagnostic procedure of two children with developmental delay, dysmorphic features and additional morphological phenotypes. Additionally, we demonstrate the necessity of using fluorescent in situ hybridization to identify the localisation and underlying mechanism of the detected chromosomal rearrangement. Conclusions: Two types of chromosomal abnormalities were identified and confirmed using different molecular genetic approaches. Comparative genomic hybridization arrays and single-nucleotide polymorphism arrays are hereby presented as important methods to identify chromosomal imbalances in patients with developmental delay and dysmorphic features. We emphasize the importance of molecular genetic testing in patients' parents for the demonstration of the origin and clinical importance of the aberrations previously determined in the patients. The results obtained using molecular cytogenetic high-resolution techniques are the cornerstone for proper genetic counselling of the affected families.

  12. Modified Logistic Regression Models Using Gene Coexpression and Clinical Features to Predict Prostate Cancer Progression

    PubMed Central

    Zhao, Hongya; Logothetis, Christopher J.; Gorlov, Ivan P.; Zeng, Jia; Dai, Jianguo

    2013-01-01

    Predicting disease progression is one of the most challenging problems in prostate cancer research. Adding gene expression data to prediction models that are based on clinical features has been proposed to improve accuracy. In the current study, we applied a logistic regression (LR) model combining clinical features and gene co-expression data to improve the accuracy of the prediction of prostate cancer progression. The top-scoring pair (TSP) method was used to select genes for the model. The proposed models not only preserved the basic properties of the TSP algorithm but also incorporated the clinical features into the prognostic models. Based on the statistical inference with the iterative cross validation, we demonstrated that prediction LR models that included genes selected by the TSP method provided better predictions of prostate cancer progression than those using clinical variables only and/or those that included genes selected by the one-gene-at-a-time approach. Thus, we conclude that TSP selection is a useful tool for feature (and/or gene) selection to use in prognostic models and our model also provides an alternative for predicting prostate cancer progression. PMID:24367394
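
    The top-scoring pair idea can be sketched as follows: for every gene pair, compare how often the first gene is expressed below the second in each class, and keep the pair whose frequencies differ most. The code uses the usual TSP score with invented expression values and hypothetical gene names; how the selected genes enter the logistic regression model together with clinical features is not shown.

```python
from itertools import combinations


def top_scoring_pair(expression, labels):
    """Return the gene pair maximising |P(a < b | class 1) - P(a < b | class 0)|.

    expression maps a gene name to its per-sample values; labels holds 0 or 1.
    """
    def frequency(gene_a, gene_b, cls):
        samples = [i for i, label in enumerate(labels) if label == cls]
        hits = sum(expression[gene_a][i] < expression[gene_b][i] for i in samples)
        return hits / len(samples)

    best_pair, best_score = None, -1.0
    for gene_a, gene_b in combinations(sorted(expression), 2):
        score = abs(frequency(gene_a, gene_b, 1) - frequency(gene_a, gene_b, 0))
        if score > best_score:
            best_pair, best_score = (gene_a, gene_b), score
    return best_pair, best_score


# Invented data: three genes, six samples (three per class).
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "g1": [1, 2, 1, 5, 6, 5],
    "g2": [3, 4, 3, 2, 1, 2],
    "g3": [2, 3, 5, 3, 2, 6],
}
pair, score = top_scoring_pair(expression, labels)
```

    In the toy data the ordering of g1 and g2 flips completely between the two classes, so that pair reaches the maximum score of 1.0. Because only the relative ordering within a sample matters, the score is unaffected by monotone normalisation of the expression values.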

  13. Cellular automata with object-oriented features for parallel molecular network modeling.

    PubMed

    Zhu, Hao; Wu, Yinghui; Huang, Sui; Sun, Yan; Dhar, Pawan

    2005-06-01

    Cellular automata are an important modeling paradigm for studying the dynamics of large, parallel systems composed of multiple, interacting components. However, to model biological systems, cellular automata need to be extended beyond the large-scale parallelism and intensive communication in order to capture two fundamental properties characteristic of complex biological systems: hierarchy and heterogeneity. This paper proposes extensions to a cellular automata language, Cellang, to meet this purpose. The extended language, with object-oriented features, can be used to describe the structure and activity of parallel molecular networks within cells. Capabilities of this new programming language include object structure to define molecular programs within a cell, floating-point data type and mathematical functions to perform quantitative computation, message passing capability to describe molecular interactions, as well as new operators, statements, and built-in functions. We discuss relevant programming issues of these features, including the object-oriented description of molecular interactions with molecule encapsulation, message passing, and the description of heterogeneity and anisotropy at the cell and molecule levels. By enabling the integration of modeling at the molecular level with system behavior at cell, tissue, organ, or even organism levels, the program will help improve our understanding of how complex and dynamic biological activities are generated and controlled by parallel functioning of molecular networks. Index Terms: cellular automata, modeling, molecular network, object-oriented. PMID:16117022

  14. MINT: Mutual Information Based Transductive Feature Selection for Genetic Trait Prediction.

    PubMed

    He, Dan; Rish, Irina; Haws, David; Parida, Laxmi

    2016-01-01

    Whole genome prediction of complex phenotypic traits using high-density genotyping arrays has attracted a lot of attention, as it is relevant to the fields of plant and animal breeding and genetic epidemiology. Since the number of genotypes is generally much bigger than the number of samples, predictive models suffer from the curse of dimensionality. The curse of dimensionality problem not only affects the computational efficiency of a particular genomic selection method, but can also lead to a poor performance, mainly due to possible overfitting, or un-informative features. In this work, we propose a novel transductive feature selection method, called MINT, which is based on the MRMR (Max-Relevance and Min-Redundancy) criterion. We apply MINT on genetic trait prediction problems and show that, in general, MINT is a better feature selection method than the state-of-the-art inductive method MRMR. PMID:27295642
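
    For orientation, the inductive MRMR baseline named in this abstract can be sketched as a greedy search that, at each step, picks the feature with the largest relevance (mutual information with the trait) minus mean redundancy (mutual information with the features already chosen). This is a sketch of plain MRMR on discrete data only; it does not reproduce the transductive MINT method, and the feature names and toy genotypes are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log((c / n) / ((pa[x] / n) * (pb[v] / n)))
               for (x, v), c in pab.items())


def mrmr(features, target, k):
    """Greedy MRMR: maximise relevance minus mean redundancy at each step.

    features: dict mapping feature name -> list of discrete values.
    """
    relevance = {f: mutual_information(vals, target) for f, vals in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(f):
            if not selected:
                return relevance[f]
            redundancy = sum(mutual_information(features[f], features[s])
                             for s in selected) / len(selected)
            return relevance[f] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy genotypes (8 samples) and a binary trait.
features = {
    "snp_a":     [0, 0, 0, 0, 1, 1, 1, 1],
    "snp_a_dup": [0, 0, 0, 0, 0, 1, 1, 1],  # nearly a copy of snp_a
    "snp_b":     [0, 0, 1, 1, 0, 0, 1, 1],  # independent of snp_a
}
target = [0, 0, 0, 1, 1, 1, 1, 1]

chosen = mrmr(features, target, 2)
print(chosen)
```

    In this toy case the near-duplicate marker is individually more relevant than `snp_b`, but its redundancy with the first pick pushes it below the weaker, independent marker, which is the behaviour the criterion is designed to produce.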

  15. Adaptive reliance on the most stable sensory predictions enhances perceptual feature extraction of moving stimuli.

    PubMed

    Kumar, Neeraj; Mutha, Pratik K

    2016-03-01

    The prediction of the sensory outcomes of action is thought to be useful for distinguishing self- vs. externally generated sensations, correcting movements when sensory feedback is delayed, and learning predictive models for motor behavior. Here, we show that aspects of another fundamental function, perception, are enhanced when they entail the contribution of predicted sensory outcomes and that this enhancement relies on the adaptive use of the most stable predictions available. We combined a motor-learning paradigm that imposes new sensory predictions with a dynamic visual search task to first show that perceptual feature extraction of a moving stimulus is poorer when it is based on sensory feedback that is misaligned with those predictions. This was possible because our novel experimental design allowed us to override the "natural" sensory predictions present when any action is performed and separately examine the influence of these two sources on perceptual feature extraction. We then show that if the new predictions induced via motor learning are unreliable, rather than just relying on sensory information for perceptual judgments, as is conventionally thought, then subjects adaptively transition to using other stable sensory predictions to maintain greater accuracy in their perceptual judgments. Finally, we show that when sensory predictions are not modified at all, these judgments are sharper when subjects combine their natural predictions with sensory feedback. Collectively, our results highlight the crucial contribution of sensory predictions to perception and also suggest that the brain intelligently integrates the most stable predictions available with sensory information to maintain high fidelity in perceptual decisions. PMID:26823516

  16. Adaptive reliance on the most stable sensory predictions enhances perceptual feature extraction of moving stimuli

    PubMed Central

    Kumar, Neeraj

    2016-01-01

    The prediction of the sensory outcomes of action is thought to be useful for distinguishing self- vs. externally generated sensations, correcting movements when sensory feedback is delayed, and learning predictive models for motor behavior. Here, we show that aspects of another fundamental function—perception—are enhanced when they entail the contribution of predicted sensory outcomes and that this enhancement relies on the adaptive use of the most stable predictions available. We combined a motor-learning paradigm that imposes new sensory predictions with a dynamic visual search task to first show that perceptual feature extraction of a moving stimulus is poorer when it is based on sensory feedback that is misaligned with those predictions. This was possible because our novel experimental design allowed us to override the “natural” sensory predictions present when any action is performed and separately examine the influence of these two sources on perceptual feature extraction. We then show that if the new predictions induced via motor learning are unreliable, rather than just relying on sensory information for perceptual judgments, as is conventionally thought, then subjects adaptively transition to using other stable sensory predictions to maintain greater accuracy in their perceptual judgments. Finally, we show that when sensory predictions are not modified at all, these judgments are sharper when subjects combine their natural predictions with sensory feedback. Collectively, our results highlight the crucial contribution of sensory predictions to perception and also suggest that the brain intelligently integrates the most stable predictions available with sensory information to maintain high fidelity in perceptual decisions. PMID:26823516

  17. Scoring multiple features to predict drug disease associations using information fusion and aggregation.

    PubMed

    Moghadam, H; Rahgozar, M; Gharaghani, S

    2016-08-01

    Prediction of drug-disease associations is one of the current fields in drug repositioning that has turned into a challenging topic in pharmaceutical science. Several available computational methods use network-based and machine learning approaches to reposition old drugs for new indications. However, they often ignore features of drugs and diseases as well as the priority and importance of each feature, relation, or interactions between features and the degree of uncertainty. When predicting unknown drug-disease interactions there are diverse data sources and multiple features available that can provide more accurate and reliable results. This information can be collectively mined using data fusion methods and aggregation operators. Therefore, we can use the feature fusion method to make high-level features. We have proposed a computational method named scored mean kernel fusion (SMKF), which uses a new method to score the average aggregation operator called scored mean. To predict novel drug indications, this method systematically combines multiple features related to drugs or diseases at two levels: the drug-drug level and the drug-disease level. The purpose of this study was to investigate the effect of drug and disease features as well as data fusion to predict drug-disease interactions. The method was validated against a well-established drug-disease gold-standard dataset. When compared with the available methods, our proposed method outperformed them, with an area under the curve (AUC) of 0.91, an F-measure of 84.9% and a Matthews correlation coefficient of 70.31%. PMID:27455069
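
    As a reminder of how two of the metrics reported above are defined, the following sketch computes the F-measure (taken here to be the balanced F1 score, which is an assumption about the abstract's usage) and the Matthews correlation coefficient from a binary confusion matrix. The counts are hypothetical and are not taken from the study.

```python
from math import sqrt


def f_measure(tp, fp, fn):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts for a drug-disease classifier.
tp, tn, fp, fn = 40, 45, 5, 10

f1 = f_measure(tp, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"F-measure = {f1:.4f}, MCC = {mcc:.4f}")
```

    Unlike the F-measure, the Matthews correlation coefficient also uses the true-negative count, so the two can diverge noticeably on imbalanced datasets.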

  18. Identifying ultrasound and clinical features of breast cancer molecular subtypes by ensemble decision.

    PubMed

    Zhang, Lei; Li, Jing; Xiao, Yun; Cui, Hao; Du, Guoqing; Wang, Ying; Li, Ziyao; Wu, Tong; Li, Xia; Tian, Jiawei

    2015-01-01

    Breast cancer is molecularly heterogeneous and categorized into four molecular subtypes: Luminal-A, Luminal-B, HER2-amplified and Triple-negative. In this study, we aimed to apply an ensemble decision approach to identify the ultrasound and clinical features related to the molecular subtypes. We collected ultrasound and clinical features from 1,000 breast cancer patients and performed immunohistochemistry on these samples. We used the ensemble decision approach to select unique features and to construct decision models. The decision model for Luminal-A subtype was constructed based on the presence of an echogenic halo and post-acoustic shadowing or indifference. The decision model for Luminal-B subtype was constructed based on the absence of an echogenic halo and vascularity. The decision model for HER2-amplified subtype was constructed based on the presence of post-acoustic enhancement, calcification, vascularity and advanced age. The model for Triple-negative subtype followed two rules. One was based on irregular shape, lobulate margin contour, the absence of calcification and hypovascularity, whereas the other was based on oval shape, hypovascularity and micro-lobulate margin contour. The accuracies of the models were 83.8%, 77.4%, 87.9% and 92.7%, respectively. We identified specific features of each molecular subtype and expanded the scope of ultrasound for making diagnoses using these decision models. PMID:26046791

  19. Mem-PHybrid: hybrid features-based prediction system for classifying membrane protein types.

    PubMed

    Hayat, Maqsood; Khan, Asifullah

    2012-05-01

    Membrane proteins are a major class of proteins and encoded by approximately 20% to 30% of genes in most organisms. In this work, a two-layer novel membrane protein prediction system, called Mem-PHybrid, is proposed. It is able to first identify the protein query as a membrane or nonmembrane protein. In the second level, it further identifies the type of membrane protein. The proposed Mem-PHybrid prediction system is based on hybrid features, whereby a fusion of both the physicochemical and split amino acid composition-based features is performed. This enables the proposed Mem-PHybrid to exploit the discrimination capabilities of both types of feature extraction strategy. In addition, minimum redundancy and maximum relevance has also been applied to reduce the dimensionality of a feature vector. We employ random forest, evidence-theoretic K-nearest neighbor, and support vector machine (SVM) as classifiers and analyze their performance on two datasets. SVM using hybrid features yields the highest accuracy of 89.6% and 97.3% on dataset1 and 91.5% and 95.5% on dataset2 for jackknife and independent dataset tests, respectively. The enhanced prediction performance of Mem-PHybrid is largely attributed to the exploitation of the discrimination power of the hybrid features and of the learning capability of SVM. Mem-PHybrid is accessible at http://www.111.68.99.218/Mem-PHybrid. PMID:22342883

  20. MRI signal and texture features for the prediction of MCI to Alzheimer's disease progression

    NASA Astrophysics Data System (ADS)

    Martínez-Torteya, Antonio; Rodríguez-Rojas, Juan; Celaya-Padilla, José M.; Galván-Tejada, Jorge I.; Treviño, Victor; Tamez-Peña, José G.

    2014-03-01

    An early diagnosis of Alzheimer's disease (AD) confers many benefits. Several biomarkers from different information modalities have been proposed for the prediction of MCI to AD progression, where features extracted from MRI have played an important role. However, studies have focused almost exclusively on the morphological characteristics of the images. This study aims to determine whether features relating to the signal and texture of the image could add predictive power. Baseline clinical, biological and PET information, and MP-RAGE images for 62 subjects from the Alzheimer's Disease Neuroimaging Initiative were used in this study. Images were divided into 83 regions and 50 features were extracted from each one of these. A multimodal database was constructed, and a feature selection algorithm was used to obtain an accurate and small logistic regression model, which achieved a cross-validation accuracy of 0.96. This model included six features, five of them obtained from the MP-RAGE image and one obtained from genotyping. A risk analysis divided the subjects into low-risk and high-risk groups according to a prognostic index, showing that both groups are statistically different (p-value of 2.04e-11). The results demonstrate that MRI features related to both signal and texture add predictive power for MCI to AD progression, and support the idea that multimodal biomarkers outperform single-modality biomarkers.

  1. A novel feature extraction scheme with ensemble coding for protein-protein interaction prediction.

    PubMed

    Du, Xiuquan; Cheng, Jiaxing; Zheng, Tingting; Duan, Zheng; Qian, Fulan

    2014-01-01

    Protein-protein interactions (PPIs) play key roles in most cellular processes, such as cell metabolism, immune response, endocrine function, DNA replication, and transcription regulation. PPI prediction is one of the most challenging problems in functional genomics. Although PPI data have been increasing because of the development of high-throughput technologies and computational methods, many problems are still far from being solved. In this study, a novel predictor was designed by using the Random Forest (RF) algorithm with the ensemble coding (EC) method. To reduce computational time, a feature selection method (DX) was adopted to rank the features and search the optimal feature combination. The DXEC method integrates many features and physicochemical/biochemical properties to predict PPIs. On the Gold Yeast dataset, the DXEC method achieves 67.2% overall precision, 80.74% recall, and 70.67% accuracy. On the Silver Yeast dataset, the DXEC method achieves 76.93% precision, 77.98% recall, and 77.27% accuracy. On the human dataset, the prediction accuracy reaches 80% for the DXEC-RF method. We extended the experiment to a bigger and more realistic dataset that maintains 50% recall on the Yeast All dataset and 80% recall on the Human All dataset. These results show that the DXEC method is suitable for performing PPI prediction. The prediction service of the DXEC-RF classifier is available at http://ailab.ahu.edu.cn:8087/DXECPPI/index.jsp. PMID:25046746

  2. Accurate and predictive antibody repertoire profiling by molecular amplification fingerprinting

    PubMed Central

    Khan, Tarik A.; Friedensohn, Simon; de Vries, Arthur R. Gorter; Straszewski, Jakub; Ruscheweyh, Hans-Joachim; Reddy, Sai T.

    2016-01-01

    High-throughput antibody repertoire sequencing (Ig-seq) provides quantitative molecular information on humoral immunity. However, Ig-seq is compromised by biases and errors introduced during library preparation and sequencing. By using synthetic antibody spike-in genes, we determined that primer bias from multiplex polymerase chain reaction (PCR) library preparation resulted in antibody frequencies with only 42 to 62% accuracy. Additionally, Ig-seq errors resulted in antibody diversity measurements being overestimated by up to 5000-fold. To rectify this, we developed molecular amplification fingerprinting (MAF), which uses unique molecular identifier (UID) tagging before and during multiplex PCR amplification, which enabled tagging of transcripts while accounting for PCR efficiency. Combined with a bioinformatic pipeline, MAF bias correction led to measurements of antibody frequencies with up to 99% accuracy. We also used MAF to correct PCR and sequencing errors, resulting in enhanced accuracy of full-length antibody diversity measurements, achieving 98 to 100% error correction. Using murine MAF-corrected data, we established a quantitative metric of recent clonal expansion—the intraclonal diversity index—which measures the number of unique transcripts associated with an antibody clone. We used this intraclonal diversity index along with antibody frequencies and somatic hypermutation to build a logistic regression model for prediction of the immunological status of clones. The model was able to predict clonal status with high confidence but only when using MAF error and bias corrected Ig-seq data. Improved accuracy by MAF provides the potential to greatly advance Ig-seq and its utility in immunology and biotechnology. PMID:26998518

  3. Accurate and predictive antibody repertoire profiling by molecular amplification fingerprinting.

    PubMed

    Khan, Tarik A; Friedensohn, Simon; Gorter de Vries, Arthur R; Straszewski, Jakub; Ruscheweyh, Hans-Joachim; Reddy, Sai T

    2016-03-01

    High-throughput antibody repertoire sequencing (Ig-seq) provides quantitative molecular information on humoral immunity. However, Ig-seq is compromised by biases and errors introduced during library preparation and sequencing. By using synthetic antibody spike-in genes, we determined that primer bias from multiplex polymerase chain reaction (PCR) library preparation resulted in antibody frequencies with only 42 to 62% accuracy. Additionally, Ig-seq errors resulted in antibody diversity measurements being overestimated by up to 5000-fold. To rectify this, we developed molecular amplification fingerprinting (MAF), which uses unique molecular identifier (UID) tagging before and during multiplex PCR amplification, which enabled tagging of transcripts while accounting for PCR efficiency. Combined with a bioinformatic pipeline, MAF bias correction led to measurements of antibody frequencies with up to 99% accuracy. We also used MAF to correct PCR and sequencing errors, resulting in enhanced accuracy of full-length antibody diversity measurements, achieving 98 to 100% error correction. Using murine MAF-corrected data, we established a quantitative metric of recent clonal expansion, the intraclonal diversity index, which measures the number of unique transcripts associated with an antibody clone. We used this intraclonal diversity index along with antibody frequencies and somatic hypermutation to build a logistic regression model for prediction of the immunological status of clones. The model was able to predict clonal status with high confidence but only when using MAF error and bias corrected Ig-seq data. Improved accuracy by MAF provides the potential to greatly advance Ig-seq and its utility in immunology and biotechnology. PMID:26998518

  4. Molecular structures of carotenoids as predicted by MNDO-AM1 molecular orbital calculations

    NASA Astrophysics Data System (ADS)

    Hashimoto, Hideki; Yoda, Takeshi; Kobayashi, Takayoshi; Young, Andrew J.

    2002-02-01

    Semi-empirical molecular orbital calculations using the AM1 Hamiltonian (MNDO-AM1 method) were performed for a number of biologically important carotenoid molecules, namely all-trans-β-carotene, all-trans-zeaxanthin, and all-trans-violaxanthin (found in higher plants and algae) together with all-trans-canthaxanthin, all-trans-astaxanthin, and all-trans-tunaxanthin in order to predict their stable structures. The molecular structures of all-trans-β-carotene, all-trans-canthaxanthin, and all-trans-astaxanthin predicted based on molecular orbital calculations were compared with those determined by X-ray crystallography. Predicted bond lengths, bond angles, and dihedral angles showed an excellent agreement with those determined experimentally, a fact that validated the present theoretical calculations. Comparison of the bond lengths, bond angles and dihedral angles of the most stable conformer among all the carotenoid molecules showed that the displacements are localized around the substituent groups and hence around the cyclohexene rings. The most stable conformers of all-trans-zeaxanthin and all-trans-violaxanthin gave rise to torsion angles around the C6-C7 bond of ±48.7° and -84.8°, respectively. This difference is a key factor in relation to the biological function of these two carotenoids in plants and algae (the xanthophyll cycle). Further analyses by calculating the atomic charges and using enpartment calculations (division of bond energies between component atoms) were performed to ascribe the cause of the different observed torsion angles.

  5. Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression

    PubMed Central

    Laimighofer, Michael; Krumsiek, Jan; Theis, Fabian J.

    2016-01-01

    With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN. PMID:26894327

  6. Unbiased Prediction and Feature Selection in High-Dimensional Survival Regression.

    PubMed

    Laimighofer, Michael; Krumsiek, Jan; Buettner, Florian; Theis, Fabian J

    2016-04-01

    With widespread availability of omics profiling techniques, the analysis and interpretation of high-dimensional omics data, for example, for biomarkers, is becoming an increasingly important part of clinical medicine because such datasets constitute a promising resource for predicting survival outcomes. However, early experience has shown that biomarkers often generalize poorly. Thus, it is crucial that models are not overfitted and give accurate results with new data. In addition, reliable detection of multivariate biomarkers with high predictive power (feature selection) is of particular interest in clinical settings. We present an approach that addresses both aspects in high-dimensional survival models. Within a nested cross-validation (CV), we fit a survival model, evaluate a dataset in an unbiased fashion, and select features with the best predictive power by applying a weighted combination of CV runs. We evaluate our approach using simulated toy data, as well as three breast cancer datasets, to predict the survival of breast cancer patients after treatment. In all datasets, we achieve more reliable estimation of predictive power for unseen cases and better predictive performance compared to the standard CoxLasso model. Taken together, we present a comprehensive and flexible framework for survival models, including performance estimation, final feature selection, and final model construction. The proposed algorithm is implemented in an open source R package (SurvRank) available on CRAN. PMID:26894327
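
    The nested cross-validation structure described in this abstract can be illustrated with a deliberately simple stand-in model. The sketch below is not the SurvRank algorithm and does not fit a survival model: it uses a hypothetical "shrunken mean" predictor with one tuning parameter purely to show how the inner loop selects the parameter using training data only, while the outer loop estimates error on data that played no part in that selection.

```python
def k_folds(n, k):
    """Split indices 0..n-1 into k contiguous (train, test) index pairs."""
    idx = list(range(n))
    size, rem = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        stop = start + size + (1 if f < rem else 0)
        folds.append((idx[:start] + idx[stop:], idx[start:stop]))
        start = stop
    return folds


def fit(values, shrink):
    """Stand-in model: predict a constant equal to shrink * mean(values)."""
    return shrink * sum(values) / len(values)


def mse(pred, values):
    """Mean squared error of a constant prediction."""
    return sum((v - pred) ** 2 for v in values) / len(values)


def nested_cv(y, grid, outer_k=3, inner_k=2):
    """Outer loop estimates error; inner loop picks the tuning parameter."""
    outer_errors = []
    for train, test in k_folds(len(y), outer_k):
        y_train = [y[i] for i in train]

        def inner_error(shrink):
            # Evaluated on splits of the outer training set only.
            errs = []
            for in_tr, in_te in k_folds(len(y_train), inner_k):
                pred = fit([y_train[i] for i in in_tr], shrink)
                errs.append(mse(pred, [y_train[i] for i in in_te]))
            return sum(errs) / len(errs)

        best = min(grid, key=inner_error)
        pred = fit(y_train, best)  # refit on the whole outer training set
        outer_errors.append(mse(pred, [y[i] for i in test]))
    return sum(outer_errors) / len(outer_errors)


# Hypothetical toy responses and tuning grid.
y = [2.1, 1.9, 2.4, 2.0, 1.8, 2.2]
print(nested_cv(y, grid=[0.5, 0.8, 1.0]))
```

    The key design point is that the outer test fold never influences the choice of tuning parameter, which is what keeps the outer error estimate from being optimistically biased.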

  7. Multivariate Feature Selection for Predicting Scour-Related Bridge Damage using a Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Anderson, I.

    2015-12-01

    Scour and hydraulic damage are the most common causes of bridge failure, reported to be responsible for over 60% of bridge failures nationwide. Scour is a complex process, and is likely an epistatic function of both bridge and stream conditions that are both stationary and in dynamic flux. Bridge inspections, conducted regularly on bridges nationwide, rate bridge health assuming a static stream condition, and typically do not include dynamically changing geomorphological adjustments. The Vermont Agency of Natural Resources stream geomorphic assessment data could add value to the current bridge inspection and scour design. The 2011 bridge damage from Tropical Storm Irene served as a case study for feature selection to improve bridge scour damage prediction in extreme events. The bridge inspection (with over 200 features on more than 300 damaged and 2,000 non-damaged bridges), and the stream geomorphic assessment (with over 300 features on more than 5,000 stream reaches) constitute "Big Data", and together have the potential to generate large numbers of combined features ("epistatic relationships") that might better predict scour-related bridge damage. The potential combined features pose significant computational challenges for traditional statistical techniques (e.g., multivariate logistic regression). This study uses a genetic algorithm to perform a search of the multivariate feature space to identify epistatic relationships that are indicative of bridge scour damage. The combined features identified could be used to improve bridge scour design, and to better monitor and rate bridge scour vulnerability.

  8. Patient feature based dosimetric Pareto front prediction in esophageal cancer radiotherapy

    SciTech Connect

    Wang, Jiazhou; Zhao, Kuaike; Peng, Jiayuan; Xie, Jiang; Chen, Junchao; Zhang, Zhen; Hu, Weigang; Jin, Xiance; Studenski, Matthew

    2015-02-15

    Purpose: To investigate the feasibility of the dosimetric Pareto front (PF) prediction based on patient’s anatomic and dosimetric parameters for esophageal cancer patients. Methods: Eighty esophagus patients in the authors’ institution were enrolled in this study. A total of 2928 intensity-modulated radiotherapy plans were obtained and used to generate PF for each patient. On average, each patient had 36.6 plans. The anatomic and dosimetric features were extracted from these plans. The mean lung dose (MLD), mean heart dose (MHD), spinal cord max dose, and PTV homogeneity index were recorded for each plan. Principal component analysis was used to extract overlap volume histogram (OVH) features between PTV and other organs at risk. The full dataset was separated into two parts: a training dataset and a validation dataset. The prediction outcomes were the MHD and MLD. Spearman’s rank correlation coefficient was used to evaluate the correlation between the anatomical features and dosimetric features. The stepwise multiple regression method was used to fit the PF. The cross validation method was used to evaluate the model. Results: With 1000 repetitions, the mean prediction error of the MHD was 469 cGy. The most correlated factors were the first principal component of the OVH between heart and PTV and the overlap between heart and PTV in Z-axis. The mean prediction error of the MLD was 284 cGy. The most correlated factors were the first principal component of the OVH between heart and PTV and the overlap between lung and PTV in Z-axis. Conclusions: It is feasible to use patients’ anatomic and dosimetric features to generate a predicted Pareto front. Additional samples and further studies are required to improve the prediction model.
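
    The correlation measure named in the Methods can be sketched as the Pearson correlation of the ranks, with tied values given their average rank. The anatomic and dosimetric numbers below are hypothetical illustrations, not data from the study.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman(x, y):
    """Spearman's rank correlation coefficient."""
    return pearson(ranks(x), ranks(y))


# Hypothetical values: heart-PTV overlap along Z (cm) and mean heart dose (cGy).
overlap_z = [0.0, 1.5, 2.0, 3.5, 5.0]
mean_heart_dose = [900, 1400, 1300, 2100, 2600]

print(spearman(overlap_z, mean_heart_dose))
```

    Because only ranks are used, the coefficient captures monotone association and is insensitive to the units of either variable.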

  9. PROSPER: an integrated feature-based tool for predicting protease substrate cleavage sites.

    PubMed

    Song, Jiangning; Tan, Hao; Perry, Andrew J; Akutsu, Tatsuya; Webb, Geoffrey I; Whisstock, James C; Pike, Robert N

    2012-01-01

    The ability to catalytically cleave protein substrates after synthesis is fundamental for all forms of life. Accordingly, site-specific proteolysis is one of the most important post-translational modifications. The key to understanding the physiological role of a protease is to identify its natural substrate(s). Knowledge of the substrate specificity of a protease can dramatically improve our ability to predict its target protein substrates, but this information must be utilized in an effective manner in order to efficiently identify protein substrates by in silico approaches. To address this problem, we present PROSPER, an integrated feature-based server for in silico identification of protease substrates and their cleavage sites for twenty-four different proteases. PROSPER utilizes established specificity information for these proteases (derived from the MEROPS database) with a machine learning approach to predict protease cleavage sites by using different, but complementary sequence and structure characteristics. Features used by PROSPER include local amino acid sequence profile, predicted secondary structure, solvent accessibility and predicted native disorder. Thus, for proteases with known amino acid specificity, PROSPER provides a convenient, pre-prepared tool for use in identifying protein substrates for the enzymes. Systematic prediction analysis for the twenty-four proteases thus far included in the database revealed that the features we have included in the tool strongly improve performance in terms of cleavage site prediction, as evidenced by their contribution to performance improvement in terms of identifying known cleavage sites in substrates for these enzymes. In comparison with two state-of-the-art prediction tools, PoPS and SitePrediction, PROSPER achieves greater accuracy and coverage. To our knowledge, PROSPER is the first comprehensive server capable of predicting cleavage sites of multiple proteases within a single substrate sequence using

  10. Genomic Signal Processing: Predicting Basic Molecular Biological Principles

    NASA Astrophysics Data System (ADS)

    Alter, Orly

    2005-03-01

    Advances in high-throughput technologies enable acquisition of different types of molecular biological data, monitoring the flow of biological information as DNA is transcribed to RNA, and RNA is translated to proteins, on a genomic scale. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment and drug development. Recently we described data-driven models for genome-scale molecular biological data, which use singular value decomposition (SVD) and the comparative generalized SVD (GSVD). Now we describe an integrative data-driven model, which uses pseudoinverse projection (1). We also demonstrate the predictive power of these matrix algebra models (2). The integrative pseudoinverse projection model formulates any number of genome-scale molecular biological data sets in terms of one chosen set of data samples, or of profiles extracted mathematically from data samples, designated the "basis" set. The mathematical variables of this integrative model, the pseudoinverse correlation patterns that are uncovered in the data, represent independent processes and corresponding cellular states (such as observed genome-wide effects of known regulators or transcription factors, the biological components of the cellular machinery that generate the genomic signals, and measured samples in which these regulators or transcription factors are over- or underactive). Reconstruction of the data in the basis simulates experimental observation of only the cellular states manifest in the data that correspond to those of the basis. Classification of the data samples according to their reconstruction in the basis, rather than their overall measured profiles, maps the cellular states of the data onto those of the basis, and gives a global picture of the correlations and possibly also causal coordination of

  11. Critical Features Predicting Sustained Implementation of School-Wide Positive Behavioral Interventions and Supports

    ERIC Educational Resources Information Center

    Mathews, Susanna; McIntosh, Kent; Frank, Jennifer L.; May, Seth L.

    2014-01-01

    The current study explored the extent to which a common measure of perceived implementation of critical features of Positive Behavioral Interventions and Supports (PBIS) predicted fidelity of implementation 3 years later. Respondents included school personnel from 261 schools across the United States implementing PBIS. School teams completed the…

  12. Critical Features Predicting Sustained Implementation of School-Wide Positive Behavior Support

    ERIC Educational Resources Information Center

    Mathews, Susanna; McIntosh, Kent; Frank, Jennifer; May, Seth

    2014-01-01

    The current study explored the extent to which a common measure of perceived implementation of critical features of School-wide Positive Behavior Support (SWPBS) predicted fidelity of implementation 3 years later. Respondents included school personnel from 261 schools across the United States implementing SWPBS. School teams completed the…

  13. Improved Species-Specific Lysine Acetylation Site Prediction Based on a Large Variety of Features Set

    PubMed Central

    Wuyun, Qiqige; Zheng, Wei; Zhang, Yanping; Ruan, Jishou; Hu, Gang

    2016-01-01

    Lysine acetylation is a major post-translational modification. It plays a vital role in numerous essential biological processes, such as gene expression and metabolism, and is related to some human diseases. To fully understand the regulatory mechanism of acetylation, identifying acetylation sites is the first and most important step. However, experimental identification of protein acetylation sites is often time consuming and expensive, so alternative computational methods are necessary. Here, we developed a novel tool, KA-predictor, to predict species-specific lysine acetylation sites based on a support vector machine (SVM) classifier. We incorporated different types of features and applied efficient feature selection to each type to form the final optimal feature set for model learning. Our predictor was highly competitive for the majority of species when compared with other methods. Feature contribution analysis indicated that HSE features, which were introduced here for the first time for lysine acetylation prediction, significantly improved the predictive performance. In particular, we constructed a highly accurate structure dataset of H. sapiens from PDB to analyze the structural properties around lysine acetylation sites. Our datasets and a user-friendly local tool of KA-predictor are freely available at http://sourceforge.net/p/ka-predictor. PMID:27183223

  14. Improved Species-Specific Lysine Acetylation Site Prediction Based on a Large Variety of Features Set.

    PubMed

    Wuyun, Qiqige; Zheng, Wei; Zhang, Yanping; Ruan, Jishou; Hu, Gang

    2016-01-01

    Lysine acetylation is a major post-translational modification. It plays a vital role in numerous essential biological processes, such as gene expression and metabolism, and is related to some human diseases. To fully understand the regulatory mechanism of acetylation, identifying acetylation sites is the first and most important step. However, experimental identification of protein acetylation sites is often time consuming and expensive, so alternative computational methods are necessary. Here, we developed a novel tool, KA-predictor, to predict species-specific lysine acetylation sites based on a support vector machine (SVM) classifier. We incorporated different types of features and applied efficient feature selection to each type to form the final optimal feature set for model learning. Our predictor was highly competitive for the majority of species when compared with other methods. Feature contribution analysis indicated that HSE features, which were introduced here for the first time for lysine acetylation prediction, significantly improved the predictive performance. In particular, we constructed a highly accurate structure dataset of H. sapiens from PDB to analyze the structural properties around lysine acetylation sites. Our datasets and a user-friendly local tool of KA-predictor are freely available at http://sourceforge.net/p/ka-predictor. PMID:27183223

  15. Comparison of the predictive power of beef surface wavelet texture features at high and low magnification.

    PubMed

    Jackman, Patrick; Sun, Da-Wen; Allen, Paul

    2009-07-01

    Beef longissimus dorsi surface texture is an indicator used by expert graders to predict beef palatability. Computer vision systems have previously used imaging at normal view to develop surface texture features with some success. Good models of beef overall acceptability using imaging at high magnification have been recently developed. As a comparison, the same surface texture features were computed from the corresponding images at normal view and used to model overall acceptability. Both sets of texture features were also combined with muscle colour and marbling features and used to model overall acceptability. Models using texture features alone were more successful at normal modality. However, colour and marbling features combined much better with texture features at high modality to yield the most accurate model of overall acceptability (r² = 0.93). Accurate Partial Least Squares Regression (PLSR) models were computed at both modalities with and without inclusion of colour and marbling features. Addition of squared terms to the models failed to improve accuracy. PMID:20416713

  16. Non-linear feature extraction from HRV signal for mortality prediction of ICU cardiovascular patient.

    PubMed

    Karimi Moridani, Mohammad; Setarehdan, Seyed Kamaledin; Motie Nasrabadi, Ali; Hajinasrollah, Esmaeil

    2016-04-01

    Intensive care unit (ICU) patients are at risk of in-ICU morbidities and mortality, making specific systems for identifying at-risk patients a necessity for improving clinical care. This study presents a new method for predicting in-hospital mortality using heart rate variability (HRV) collected during a patient's ICU stay. An HRV time-series processing method is proposed for mortality prediction of ICU cardiovascular patients. HRV signals were obtained by measuring R-R time intervals. A novel method, named return map, is then developed that reveals useful information from the HRV time series. This study also proposes several features that can be extracted from the return map, including the angle between two vectors, the area of triangles formed by successive points, the shortest distance to the 45° line, and their various combinations. Finally, a thresholding technique is proposed to extract the risk period and to predict mortality. The data used to evaluate the proposed algorithm were obtained from 80 cardiovascular ICU patients (40 males and 40 females), from the first 48 h of the first ICU stay. This study showed that the angle feature has on average a sensitivity of 87.5% (with 12 false alarms), the area feature has on average a sensitivity of 89.58% (with 10 false alarms), the shortest distance feature has on average a sensitivity of 85.42% (with 14 false alarms) and, finally, the combined feature has on average a sensitivity of 92.71% (with seven false alarms). The results showed that the last half an hour before the patient's death is very informative for diagnosing the patient's condition and saving his/her life. These results confirm that it is possible to predict mortality based on the features introduced in this paper, relying on the variations of the HRV dynamic characteristics. PMID:27028609
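
    The return-map features listed in this abstract can be sketched as follows. The abstract does not give exact definitions, so this sketch assumes the return map pairs each R-R interval with its successor, and that the angle is taken between the vectors joining three successive points. The sample intervals and all function names are hypothetical.

```python
import math

def return_map(rr):
    """Pair each R-R interval with its successor: (RR[n], RR[n+1])."""
    return list(zip(rr[:-1], rr[1:]))

def distance_to_identity(point):
    """Shortest distance from a point to the 45-degree line y = x."""
    x, y = point
    return abs(y - x) / math.sqrt(2)

def triangle_area(p, q, r):
    """Area of the triangle formed by three successive points."""
    return 0.5 * abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1]))

def angle_between(p, q, r):
    """Angle in degrees between the vectors p->q and q->r (assumed definition)."""
    v = (q[0] - p[0], q[1] - p[1])
    w = (r[0] - q[0], r[1] - q[1])
    norm = math.hypot(*v) * math.hypot(*w)
    if norm == 0:
        return 0.0
    cosine = (v[0] * w[0] + v[1] * w[1]) / norm
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

rr_seconds = [0.80, 0.82, 0.78, 0.85, 0.80]   # made-up R-R intervals in seconds
points = return_map(rr_seconds)
distances = [distance_to_identity(p) for p in points]
areas = [triangle_area(*points[i:i + 3]) for i in range(len(points) - 2)]
angles = [angle_between(*points[i:i + 3]) for i in range(len(points) - 2)]
```

    Each feature yields one value per point or per triple of points, so a threshold could then be applied to the resulting series; the threshold itself is not described in the abstract and is left out here.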

  17. Predicting and explaining the movement of mesoscale oceanographic features using CLIPS

    NASA Technical Reports Server (NTRS)

    Bridges, Susan; Chen, Liang-Chun; Lybanon, Matthew

    1994-01-01

    The Naval Research Laboratory has developed an oceanographic expert system that describes the evolution of mesoscale features in the Gulf Stream region of the northwest Atlantic Ocean. These features include the Gulf Stream current and the warm and cold core eddies associated with the Gulf Stream. An explanation capability was added to the eddy prediction component of the expert system in order to allow the system to justify the reasoning process it uses to make predictions. The eddy prediction and explanation components of the system have recently been redesigned and translated from OPS83 to C and CLIPS and the new system is called WATE (Where Are Those Eddies). The new design has improved the system's readability, understandability and maintainability and will also allow the system to be incorporated into the Semi-Automated Mesoscale Analysis System which will eventually be embedded into the Navy's Tactical Environmental Support System, Third Generation, TESS(3).

  18. The prognostic impact of clinical and molecular features in hairy cell leukaemia variant and splenic marginal zone lymphoma.

    PubMed

    Hockley, Sarah L; Else, Monica; Morilla, Alison; Wotherspoon, Andrew; Dearden, Claire; Catovsky, Daniel; Gonzalez, David; Matutes, Estella

    2012-08-01

    Hairy cell leukaemia variant (HCL-variant) and splenic marginal zone lymphoma (SMZL) are disorders with overlapping features. We investigated the prognostic impact in these disorders of clinical and molecular features including IGH VDJ rearrangements, IGHV gene usage and TP53 mutations. Clinical and laboratory data were collected before therapy from 35 HCL-variant and 68 SMZL cases. End-points were the need for treatment and overall survival. 97% of HCL-variant and 77% of SMZL cases required treatment (P = 0·009). Survival at 5 years was significantly worse in HCL-variant [57% (95% confidence interval 38-73%)] compared with SMZL [84% (71-91%); Hazard Ratio 2·25 (1·20-4·25), P = 0·01]. In HCL-variant, adverse prognostic factors for survival were older age (P = 0·04), anaemia (P = 0·01) and TP53 mutations (P = 0·02). In SMZL, splenomegaly, anaemia and IGHV genes with >98% homology to the germline predicted the need for treatment; older age, anaemia and IGHV unmutated genes (100% homology) predicted shorter survival. IGHV gene usage had no impact on clinical outcome in either disease. The combination of unfavourable factors allowed patients to be stratified into risk groups with significant differences in survival. Although HCL-variant and SMZL share some features, they have different outcomes, influenced by clinical and biological factors. PMID:22594855

  19. Use of the molecular connectivity index to predict chemical biotransfer

    SciTech Connect

    Dowdy, D.L.; McKone, T.E.; Hsieh, D.P.H.

    1994-12-31

    Chemicals released into the environment can pose a danger to organisms if exposure occurs. In order to assess the level of risk, it is necessary to first determine if a chemical is capable of biotransfer from a given environmental medium into a particular biological system. Experimental determination of biotransfer factors (BTF), defined as the ratio of the concentration of a chemical in an organism or tissue to that in the exposure medium, is usually difficult, expensive, and time consuming. Since an accurate measurement of BTF is crucial to exposure and risk assessment, it would be advantageous if BTF could be estimated from a chemical property that is quantifiable with high precision. The molecular connectivity index (MCI) is such a chemical property, which in theory encodes information about molecular size, branching, cyclization, saturation, and heteroatom content. MCIs are readily obtainable from chemical structure and the periodic table, requiring no experimental measurement. The results indicate a strong correlation between the MCI and BTF values for animal tissue, milk, and vegetation. Using MCI to estimate BTF could provide a faster, more cost effective, and more accurate method for predicting chemical biotransfer.
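
    To show how a connectivity index follows from structure alone, the sketch below computes the simple first-order index, the sum over bonds of 1/sqrt(δi·δj), where δ is the number of non-hydrogen neighbours of an atom. This is an assumed variant for illustration: the abstract does not say which order or valence correction the authors used, and the bond lists and function name are hypothetical.

```python
import math

def first_order_connectivity(bonds):
    """First-order connectivity index of a hydrogen-suppressed molecular graph.

    bonds: list of (atom_i, atom_j) pairs between non-hydrogen atoms.
    """
    degree = {}
    for i, j in bonds:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return sum(1.0 / math.sqrt(degree[i] * degree[j]) for i, j in bonds)

# Carbon skeletons only; atoms are numbered arbitrarily.
n_butane = [(0, 1), (1, 2), (2, 3)]     # straight chain C-C-C-C
isobutane = [(0, 1), (0, 2), (0, 3)]    # central carbon with three neighbours

chi_n_butane = first_order_connectivity(n_butane)
chi_isobutane = first_order_connectivity(isobutane)
```

    The branched isomer gets the lower value, which is how the index encodes branching for molecules of the same size.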

  20. Computer-aided breast MR image feature analysis for prediction of tumor response to chemotherapy

    SciTech Connect

    Aghaei, Faranak; Tan, Maxine; Liu, Hong; Zheng, Bin; Hollingsworth, Alan B.; Qian, Wei

    2015-11-15

    Purpose: To identify a new clinical marker based on quantitative kinetic image features analysis and assess its feasibility to predict tumor response to neoadjuvant chemotherapy. Methods: The authors assembled a dataset involving breast MR images acquired from 68 cancer patients before undergoing neoadjuvant chemotherapy. Among them, 25 patients had complete response (CR) and 43 had partial and nonresponse (NR) to chemotherapy based on the response evaluation criteria in solid tumors. The authors developed a computer-aided detection scheme to segment breast areas and tumors depicted on the breast MR images and computed a total of 39 kinetic image features from both tumor and background parenchymal enhancement regions. The authors then applied and tested two approaches to classify between CR and NR cases. The first one analyzed each individual feature and applied a simple feature fusion method that combines classification results from multiple features. The second approach tested an attribute selected classifier that integrates an artificial neural network (ANN) with a wrapper subset evaluator, which was optimized using a leave-one-case-out validation method. Results: In the pool of 39 features, 10 yielded relatively higher classification performance with the areas under receiver operating characteristic curves (AUCs) ranging from 0.61 to 0.78 to classify between CR and NR cases. Using a feature fusion method, the maximum AUC = 0.85 ± 0.05. Using the ANN-based classifier, AUC value significantly increased to 0.96 ± 0.03 (p < 0.01). Conclusions: This study demonstrated that quantitative analysis of kinetic image features computed from breast MR images acquired prechemotherapy has potential to generate a useful clinical marker in predicting tumor response to chemotherapy.
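
    The AUC figures reported in this abstract summarize how well a score separates two groups of cases. A minimal sketch, assuming the pairwise (Mann-Whitney) formulation and using made-up classifier scores rather than the study's data:

```python
def auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical scores for complete-response and nonresponse cases.
cr_scores = [0.9, 0.4]
nr_scores = [0.6, 0.2]
example_auc = auc(cr_scores, nr_scores)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation.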

  1. Relationship of carbohydrate molecular spectroscopic features in combined feeds to carbohydrate utilization and availability in ruminants

    NASA Astrophysics Data System (ADS)

    Zhang, Xuewei; Yu, Peiqiang

    To date, there is no study on the relationship between carbohydrate (CHO) molecular structures and nutrient availability of combined feeds in ruminants. The objective of this study was to use molecular spectroscopy to reveal the relationship between CHO molecular spectral profiles (in terms of functional groups (biomolecular, biopolymer) spectral peak area and height intensity) and CHO chemical profiles, CHO subfractions, energy values, and CHO rumen degradation kinetics of combined feeds of hulless barley with pure wheat dried distillers grains with solubles (DDGS) at five different combination ratios (hulless barley to pure wheat DDGS: 100:0, 75:25, 50:50, 25:75, 0:100). The molecular spectroscopic parameters assessed included: lignin biopolymer molecular spectra profile (peak area and height, region and baseline: ca. 1539-1504 cm⁻¹); structural carbohydrate (STCHO, peaks area region and baseline: ca. 1485-1186 cm⁻¹) mainly associated with hemi- and cellulosic compounds; cellulosic materials peak area (centered at ca. 1240 cm⁻¹ with region and baseline: ca. 1272-1186 cm⁻¹); total carbohydrate (CHO, peaks area region and baseline: ca. 1186-946 cm⁻¹). The results showed that the functional groups (biomolecular, biopolymer) in the combined feeds are sensitive to the changes of carbohydrate chemical and nutrient profiles. The changes of the CHO molecular spectroscopic features in the combined feeds were highly correlated with CHO chemical profiles, CHO subfractions, in situ CHO rumen degradation kinetics and fermentable organic matter supply. Further study is needed to investigate the possibility of using CHO molecular spectral features as a predictor to estimate nutrient availability in combined feeds for animals and to quantify their relationship.

  2. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features

    PubMed Central

    Stiglic, Gregor; Povalej Brzan, Petra; Fijacko, Nino; Wang, Fei; Delibasic, Boris; Kalousis, Alexandros; Obradovic, Zoran

    2015-01-01

    Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755–0.771) to 0.769 (95% CI: 0.761–0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression. PMID:26645087
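
    The way a lasso penalty reduces a logistic regression model to a compact set of non-zero coefficients can be sketched with proximal gradient descent, where each update is followed by soft-thresholding. This is a generic illustration under stated assumptions, not the authors' pipeline: the group lasso step is omitted, and the toy data, step size and penalty values are hypothetical.

```python
import math

def soft_threshold(value, amount):
    """Shrink value toward zero by amount; the proximal step for an L1 penalty."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0

def fit_lasso_logistic(rows, labels, penalty, step=0.5, iterations=500):
    """Proximal gradient descent for L1-penalized logistic regression."""
    n, d = len(rows), len(rows[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = 1.0 / (1.0 + math.exp(-z)) - y
            grad_b += error / n
            for j in range(d):
                grad_w[j] += error * x[j] / n
        bias -= step * grad_b   # the intercept is not penalized
        weights = [soft_threshold(w - step * g, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias

# Toy binary features: column 0 equals the label, column 1 is uninformative.
rows = [[1, 1], [1, 0], [0, 1], [0, 0]]
labels = [1, 1, 0, 0]

unpenalized, _ = fit_lasso_logistic(rows, labels, penalty=0.0)
heavily_penalized, _ = fit_lasso_logistic(rows, labels, penalty=10.0)
```

    Intermediate penalty values trade predictive fit against the number of coefficients that survive, which is the trade-off the abstract reports as a 72% reduction in features.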

  3. General morphological and biological features of neoplasms: integration of molecular findings.

    PubMed

    Diaz-Cano, S J

    2008-07-01

    This review highlights the importance of morphology-molecular correlations for a proper implementation of new markers. It covers both general aspects of tumorigenesis (which are normally omitted in papers analysing molecular pathways) and the general mechanisms for the acquired capabilities of neoplasms. The mechanisms are also supported by appropriate diagrams for each acquired capability that include overlooked features such as mobilization of cellular resources and changes in chromatin, transcription and epigenetics; fully accepted oncogenes and tumour suppressor genes are highlighted, while the pathways are also presented as activating or inactivating with appropriate colour coding. Finally, the concepts and mechanisms presented enable us to understand the basic requirements for the appropriate implementation of molecular tests in clinical practice. In summary, the basic findings are presented to serve as a bridge to clinical applications. The current definition of neoplasm is descriptive and difficult to apply routinely. Biologically, neoplasms develop through acquisition of capabilities that involve tumour cell aspects and modified microenvironment interactions, resulting in unrestricted growth due to a stepwise accumulation of cooperative genetic alterations that affect key molecular pathways. The correlation of these molecular aspects with morphological changes is essential for better understanding of essential concepts as early neoplasms/precancerous lesions, progression/dedifferentiation, and intratumour heterogeneity. The acquired capabilities include self-maintained replication (cell cycle dysregulation), extended cell survival (cell cycle arrest, apoptosis dysregulation, and replicative lifespan), genetic instability (chromosomal and microsatellite), changes of chromatin, transcription and epigenetics, mobilization of cellular resources, and modified microenvironment interactions (tumour cells, stromal cells, extracellular, endothelium). 

  4. Feature genes predicting the FLT3/ITD mutation in acute myeloid leukemia.

    PubMed

    Li, Chenglong; Zhu, Biao; Chen, Jiao; Huang, Xiaobing

    2016-07-01

    In the present study, gene expression profiles of acute myeloid leukemia (AML) samples were analyzed to identify feature genes with the capacity to predict the mutation status of FLT3/ITD. Two machine learning models, namely the support vector machine (SVM) and random forest (RF) methods, were used for classification. Four datasets were downloaded from the European Bioinformatics Institute, two of which (containing 371 samples, including 281 FLT3/ITD mutation-negative and 90 mutation-positive samples) were randomly defined as the training group, while the other two datasets (containing 488 samples, including 350 FLT3/ITD mutation-negative and 138 mutation-positive samples) were defined as the test group. Differentially expressed genes (DEGs) were identified by significance analysis of the microarray data by using the training samples. The classification efficiency of the SVM and RF methods was evaluated using the following parameters: Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and the area under the receiver operating characteristic curve. Functional enrichment analysis was performed for the feature genes with DAVID. A total of 585 DEGs were identified in the training group, of which 580 were upregulated and five were downregulated. The classification accuracy rates of the two methods for the training group, the test group and the combined group using the 585 feature genes were >90%. For the SVM and RF methods, the rates of correct determination, specificity and PPV were >90%, while the sensitivity and NPV were >80%. The SVM method produced a slightly better classification effect than the RF method. A total of 13 biological pathways were overrepresented by the feature genes, mainly involving energy metabolism, chromatin organization and translation. The feature genes identified in the present study may be used to predict the mutation status of FLT3/ITD in patients with AML. PMID:27177049
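
    The evaluation parameters named in this abstract all derive from the four cells of a confusion matrix. A minimal sketch, with hypothetical counts that are not taken from the study:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification rates from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # mutation-positive samples detected
        "specificity": tn / (tn + fp),   # mutation-negative samples cleared
        "ppv": tp / (tp + fp),           # predicted positives that are correct
        "npv": tn / (tn + fn),           # predicted negatives that are correct
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
```

    With imbalanced classes, as in the training and test groups described above, PPV and NPV depend on the class proportions, whereas sensitivity and specificity do not.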

  5. Feature genes predicting the FLT3/ITD mutation in acute myeloid leukemia

    PubMed Central

    LI, CHENGLONG; ZHU, BIAO; CHEN, JIAO; HUANG, XIAOBING

    2016-01-01

    In the present study, gene expression profiles of acute myeloid leukemia (AML) samples were analyzed to identify feature genes with the capacity to predict the mutation status of FLT3/ITD. Two machine learning models, namely the support vector machine (SVM) and random forest (RF) methods, were used for classification. Four datasets were downloaded from the European Bioinformatics Institute, two of which (containing 371 samples, including 281 FLT3/ITD mutation-negative and 90 mutation-positive samples) were randomly defined as the training group, while the other two datasets (containing 488 samples, including 350 FLT3/ITD mutation-negative and 138 mutation-positive samples) were defined as the test group. Differentially expressed genes (DEGs) were identified by significance analysis of the microarray data by using the training samples. The classification efficiency of the SVM and RF methods was evaluated using the following parameters: Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and the area under the receiver operating characteristic curve. Functional enrichment analysis was performed for the feature genes with DAVID. A total of 585 DEGs were identified in the training group, of which 580 were upregulated and five were downregulated. The classification accuracy rates of the two methods for the training group, the test group and the combined group using the 585 feature genes were >90%. For the SVM and RF methods, the rates of correct determination, specificity and PPV were >90%, while the sensitivity and NPV were >80%. The SVM method produced a slightly better classification effect than the RF method. A total of 13 biological pathways were overrepresented by the feature genes, mainly involving energy metabolism, chromatin organization and translation. The feature genes identified in the present study may be used to predict the mutation status of FLT3/ITD in patients with AML. PMID:27177049

  6. Predicting the Occurrence of Cave-Inhabiting Fauna Based on Features of the Earth Surface Environment

    PubMed Central

    Doctor, Daniel H.; Niemiller, Matthew L.; Weary, David J.; Young, John A.; Zigler, Kirk S.

    2016-01-01

    One of the most challenging fauna to study in situ is the obligate cave fauna because of the difficulty of sampling. Cave-limited species display patchy and restricted distributions, but it is often unclear whether the observed distribution is a sampling artifact or a true restriction in range. Further, the drivers of the distribution could be local environmental conditions, such as cave humidity, or they could be associated with surface features that are surrogates for cave conditions. If surface features can be used to predict the distribution of important cave taxa, then conservation management is more easily obtained. We examined the hypothesis that the presence of major faunal groups of cave obligate species could be predicted based on features of the earth surface. Georeferenced records of cave obligate amphipods, crayfish, fish, isopods, beetles, millipedes, pseudoscorpions, spiders, and springtails within the area of Appalachian Landscape Conservation Cooperative in the eastern United States (Illinois to Virginia and New York to Alabama) were assigned to 20 x 20 km grid cells. Habitat suitability for these faunal groups was modeled using logistic regression with twenty predictor variables within each grid cell, such as percent karst, soil features, temperature, precipitation, and elevation. Models successfully predicted the presence of a group greater than 65% of the time (mean = 88%) for the presence of single grid cell endemics, and for all faunal groups except pseudoscorpions. The most common predictor variables were latitude, percent karst, and the standard deviation of the Topographic Position Index (TPI), a measure of landscape rugosity within each grid cell. The overall success of these models points to a number of important connections between the surface and cave environments, and some of these, especially soil features and topographic variability, suggest new research directions. These models should prove to be useful tools in predicting the

  7. Predicting the Occurrence of Cave-Inhabiting Fauna Based on Features of the Earth Surface Environment.

    PubMed

    Christman, Mary C; Doctor, Daniel H; Niemiller, Matthew L; Weary, David J; Young, John A; Zigler, Kirk S; Culver, David C

    2016-01-01

    One of the most challenging fauna to study in situ is the obligate cave fauna because of the difficulty of sampling. Cave-limited species display patchy and restricted distributions, but it is often unclear whether the observed distribution is a sampling artifact or a true restriction in range. Further, the drivers of the distribution could be local environmental conditions, such as cave humidity, or they could be associated with surface features that are surrogates for cave conditions. If surface features can be used to predict the distribution of important cave taxa, then conservation management is more easily obtained. We examined the hypothesis that the presence of major faunal groups of cave obligate species could be predicted based on features of the earth surface. Georeferenced records of cave obligate amphipods, crayfish, fish, isopods, beetles, millipedes, pseudoscorpions, spiders, and springtails within the area of Appalachian Landscape Conservation Cooperative in the eastern United States (Illinois to Virginia and New York to Alabama) were assigned to 20 x 20 km grid cells. Habitat suitability for these faunal groups was modeled using logistic regression with twenty predictor variables within each grid cell, such as percent karst, soil features, temperature, precipitation, and elevation. Models successfully predicted the presence of a group greater than 65% of the time (mean = 88%) for the presence of single grid cell endemics, and for all faunal groups except pseudoscorpions. The most common predictor variables were latitude, percent karst, and the standard deviation of the Topographic Position Index (TPI), a measure of landscape rugosity within each grid cell. The overall success of these models points to a number of important connections between the surface and cave environments, and some of these, especially soil features and topographic variability, suggest new research directions. These models should prove to be useful tools in predicting the

  8. Biased ART: a neural architecture that shifts attention toward previously disregarded features following an incorrect prediction.

    PubMed

    Carpenter, Gail A; Gaddam, Sai Chaitanya

    2010-04-01

    Memories in Adaptive Resonance Theory (ART) networks are based on matched patterns that focus attention on those portions of bottom-up inputs that match active top-down expectations. While this learning strategy has proved successful for both brain models and applications, computational examples show that attention to early critical features may later distort memory representations during online fast learning. For supervised learning, biased ARTMAP (bARTMAP) solves the problem of over-emphasis on early critical features by directing attention away from previously attended features after the system makes a predictive error. Small-scale, hand-computed analog and binary examples illustrate key model dynamics. Two-dimensional simulation examples demonstrate the evolution of bARTMAP memories as they are learned online. Benchmark simulations show that featural biasing also improves performance on large-scale examples. One example, which predicts movie genres and is based, in part, on the Netflix Prize database, was developed for this project. Both first principles and consistent performance improvements on all simulation studies suggest that featural biasing should be incorporated by default in all ARTMAP systems. Benchmark datasets and bARTMAP code are available from the CNS Technology Lab Website: http://techlab.bu.edu/bART/. PMID:19811892

  9. MRI texture features as biomarkers to predict MGMT methylation status in glioblastomas

    PubMed Central

    Korfiatis, Panagiotis; Kline, Timothy L.; Coufalova, Lucie; Lachance, Daniel H.; Parney, Ian F.; Carter, Rickey E.; Buckner, Jan C.; Erickson, Bradley J.

    2016-01-01

    Purpose: Imaging biomarker research focuses on discovering relationships between radiological features and histological findings. In glioblastoma patients, methylation of the O6-methylguanine methyltransferase (MGMT) gene promoter is positively correlated with an increased effectiveness of current standard of care. In this paper, the authors investigate texture features as potential imaging biomarkers for capturing the MGMT methylation status of glioblastoma multiforme (GBM) tumors when combined with supervised classification schemes. Methods: A retrospective study of 155 GBM patients with known MGMT methylation status was conducted. Co-occurrence and run length texture features were calculated, and both support vector machines (SVMs) and random forest classifiers were used to predict MGMT methylation status. Results: The best classification system (an SVM-based classifier) had a maximum area under the receiver-operating characteristic (ROC) curve of 0.85 (95% CI: 0.78–0.91) using four texture features (correlation, energy, entropy, and local intensity) originating from the T2-weighted images, yielding at the optimal threshold of the ROC curve, a sensitivity of 0.803 and a specificity of 0.813. Conclusions: Results show that supervised machine learning of MRI texture features can predict MGMT methylation status in preoperative GBM tumors, thus providing a new noninvasive imaging biomarker. PMID:27277032
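
    Co-occurrence texture features such as the energy and entropy mentioned in this abstract are computed from a gray-level co-occurrence matrix. The sketch below is a simplified illustration: it assumes a single horizontal pixel offset, defines energy as the sum of squared probabilities (some toolkits take the square root), and uses a tiny made-up image. It does not reproduce the authors' feature extraction.

```python
import math
from collections import Counter

def cooccurrence(image, dx=1, dy=0):
    """Normalized gray-level co-occurrence probabilities for one pixel offset."""
    counts = Counter()
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            rr, cc = r + dy, c + dx
            if 0 <= rr < len(image) and 0 <= cc < len(row):
                counts[(value, image[rr][cc])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def energy(p):
    """Sum of squared co-occurrence probabilities; high for uniform texture."""
    return sum(v * v for v in p.values())

def entropy(p):
    """Shannon entropy in bits; high for disordered texture."""
    return -sum(v * math.log2(v) for v in p.values())

# Hypothetical 2 x 2 images with two gray levels.
flat = cooccurrence([[0, 0], [0, 0]])
checker = cooccurrence([[0, 1], [1, 0]])
```

    The flat image concentrates all probability in one gray-level pair, while the checkerboard splits it between two, so energy falls and entropy rises.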

  10. Prediction of hot spots in protein interfaces using a random forest model with hybrid features.

    PubMed

    Wang, Lin; Liu, Zhi-Ping; Zhang, Xiang-Sun; Chen, Luonan

    2012-03-01

    Prediction of hot spots in protein interfaces provides crucial information for the research on protein-protein interaction and drug design. Existing machine learning methods generally judge whether a given residue is likely to be a hot spot by extracting features only from the target residue. However, hot spots usually form a small cluster of residues which are tightly packed together at the center of protein interface. With this in mind, we present a novel method to extract hybrid features which incorporate a wide range of information of the target residue and its spatially neighboring residues, i.e. the nearest contact residue in the other face (mirror-contact residue) and the nearest contact residue in the same face (intra-contact residue). We provide a novel random forest (RF) model to effectively integrate these hybrid features for predicting hot spots in protein interfaces. Our method can achieve accuracy (ACC) of 82.4% and Matthew's correlation coefficient (MCC) of 0.482 in Alanine Scanning Energetics Database, and ACC of 77.6% and MCC of 0.429 in Binding Interface Database. In a comparison study, performance of our RF model exceeds other existing methods, such as Robetta, FOLDEF, KFC, KFC2, MINERVA and HotPoint. Of our hybrid features, three physicochemical features of target residues (mass, polarizability and isoelectric point), the relative side-chain accessible surface area and the average depth index of mirror-contact residues are found to be the main discriminative features in hot spots prediction. We also confirm that hot spots tend to form large contact surface areas between two interacting proteins. Source data and code are available at: http://www.aporc.org/doc/wiki/HotSpot. PMID:22258275
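
    The two evaluation measures reported above, accuracy (ACC) and Matthews correlation coefficient (MCC), can be computed from a binary confusion matrix as sketched below. The confusion-matrix counts in the example are hypothetical and are not taken from the paper; they only show why MCC is informative when hot spots are the minority class.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of residues classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts: 60 hot spots and 140 non-hot spots.
tp, fn, fp, tn = 30, 30, 20, 120
print(f"ACC = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

    With imbalanced classes, a classifier can reach a high ACC while its MCC stays modest, which is one reason both figures are commonly reported together.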

  11. Automatic feature template generation for maximum entropy based intonational phrase break prediction

    NASA Astrophysics Data System (ADS)

    Zhou, You

    2013-03-01

    The prediction of intonational phrase (IP) breaks is important for both the naturalness and intelligibility of Text-to-Speech (TTS) systems. In this paper, we propose a maximum entropy (ME) model to predict IP breaks from unrestricted text, and evaluate various keyword selection approaches in different domains. Furthermore, we design a hierarchical clustering algorithm for automatic generation of feature templates, which minimizes the need for human supervision during ME model training. Comparative experiments show that, for the task of IP break prediction, (1) the ME model clearly outperforms classification and regression trees (CART); (2) the log-likelihood ratio is the best scoring measure for keyword selection; and (3) compared with manual templates, templates generated automatically by our approach greatly improve the F-score of ME-based IP break prediction and significantly reduce the size of the ME model.

  12. Feature Selection Methods for Early Predictive Biomarker Discovery Using Untargeted Metabolomic Data

    PubMed Central

    Grissa, Dhouha; Pétéra, Mélanie; Brandolini, Marion; Napoli, Amedeo; Comte, Blandine; Pujos-Guillot, Estelle

    2016-01-01

    Untargeted metabolomics is a powerful phenotyping tool for better understanding biological mechanisms involved in human pathology development and identifying early predictive biomarkers. This approach, based on multiple analytical platforms, such as mass spectrometry (MS), chemometrics and bioinformatics, generates massive and complex data that need appropriate analyses to extract the biologically meaningful information. Despite the various tools available, it is still a challenge to handle such large and noisy datasets with a limited number of individuals without risking overfitting. Moreover, when the objective is the identification of early predictive markers of clinical outcome, a few years before occurrence, it becomes essential to use appropriate algorithms and workflows to discover subtle effects among this large amount of data. In this context, this work studies a workflow describing the general feature selection process, using knowledge discovery and data mining methodologies to propose advanced solutions for predictive biomarker discovery. The strategy focused on evaluating a combination of numeric-symbolic approaches for feature selection, with the objective of obtaining the best combination of metabolites producing an effective and accurate predictive model. Relying first on numerical approaches, especially machine learning methods (SVM-RFE, RF, RF-RFE) and univariate statistical analyses (ANOVA), a comparative study was performed on an original metabolomic dataset and reduced subsets. As a resampling method, LOOCV was applied to minimize the risk of overfitting. The best k features obtained with different importance scores from the combination of these approaches were compared and used to determine variable stability with Formal Concept Analysis. The results revealed the interest of RF-Gini combined with ANOVA for feature selection as these two complementary methods allowed selecting the 48

  13. Feature Selection Methods for Early Predictive Biomarker Discovery Using Untargeted Metabolomic Data.

    PubMed

    Grissa, Dhouha; Pétéra, Mélanie; Brandolini, Marion; Napoli, Amedeo; Comte, Blandine; Pujos-Guillot, Estelle

    2016-01-01

    Untargeted metabolomics is a powerful phenotyping tool for better understanding biological mechanisms involved in human pathology development and identifying early predictive biomarkers. This approach, based on multiple analytical platforms, such as mass spectrometry (MS), chemometrics and bioinformatics, generates massive and complex data that need appropriate analyses to extract the biologically meaningful information. Despite the various tools available, it is still a challenge to handle such large and noisy datasets with a limited number of individuals without risking overfitting. Moreover, when the objective is the identification of early predictive markers of clinical outcome, a few years before occurrence, it becomes essential to use appropriate algorithms and workflows to discover subtle effects among this large amount of data. In this context, this work studies a workflow describing the general feature selection process, using knowledge discovery and data mining methodologies to propose advanced solutions for predictive biomarker discovery. The strategy focused on evaluating a combination of numeric-symbolic approaches for feature selection, with the objective of obtaining the best combination of metabolites producing an effective and accurate predictive model. Relying first on numerical approaches, especially machine learning methods (SVM-RFE, RF, RF-RFE) and univariate statistical analyses (ANOVA), a comparative study was performed on an original metabolomic dataset and reduced subsets. As a resampling method, LOOCV was applied to minimize the risk of overfitting. The best k features obtained with different importance scores from the combination of these approaches were compared and used to determine variable stability with Formal Concept Analysis. The results revealed the interest of RF-Gini combined with ANOVA for feature selection as these two complementary methods allowed selecting the 48

  14. Stable feature selection for clinical prediction: exploiting ICD tree structure using Tree-Lasso.

    PubMed

    Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha

    2015-02-01

    Modern healthcare is being reshaped by the growth of Electronic Medical Records (EMR). Recently, these records have been shown to be of great value for building clinical prediction models. In EMR data, patients' diseases and hospital interventions are captured through a set of diagnosis and procedure codes. These codes are usually represented in a tree form (e.g. ICD-10 tree) and the codes within a tree branch may be highly correlated. These codes can be used as features to build a prediction model, and appropriate feature selection can inform a clinician about important risk factors for a disease. Traditional feature selection methods (e.g. Information Gain, T-test, etc.) consider each variable independently and usually end up with a long feature list. Recently, Lasso and related l1-penalty based feature selection methods have become popular due to their joint feature selection property. However, Lasso is known to select one of many correlated features at random. This prevents clinicians from arriving at a stable feature set, which is crucial for the clinical decision-making process. In this paper, we solve this problem by using a recently proposed Tree-Lasso model. Since the stability behavior of Tree-Lasso is not well understood, we study the stability behavior of Tree-Lasso and compare it with other feature selection methods. Using a synthetic and two real-world datasets (Cancer and Acute Myocardial Infarction), we show that Tree-Lasso based feature selection is significantly more stable than Lasso and comparable to other methods e.g. Information Gain, ReliefF and T-test. We further show that, using different types of classifiers such as logistic regression, naive Bayes, support vector machines, decision trees and Random Forest, the classification performance of Tree-Lasso is comparable to Lasso and better than other methods. Our result has implications in identifying stable risk factors for many healthcare problems and therefore can

  15. Prediction of Golgi-resident protein types using general form of Chou's pseudo-amino acid compositions: Approaches with minimal redundancy maximal relevance feature selection.

    PubMed

    Jiao, Ya-Sen; Du, Pu-Feng

    2016-08-01

    Recently, several efforts have been made in predicting Golgi-resident proteins. However, it is still a challenging task to identify the type of a Golgi-resident protein. Precise prediction of the type of a Golgi-resident protein plays a key role in understanding its molecular functions in various biological processes. In this paper, we propose using a mutual information based feature selection scheme with the general form of Chou's pseudo-amino acid compositions to predict Golgi-resident protein types. Position-specific physicochemical properties were applied in Chou's pseudo-amino acid compositions. We achieved 91.24% prediction accuracy in a jackknife test with 49 selected features. This is the best performance among all current predictors. This result indicates that our computational model can be useful in identifying Golgi-resident protein types. PMID:27155042
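
    The minimal redundancy maximal relevance (mRMR) idea named in the title can be sketched as a greedy loop: at each step, pick the feature whose mutual information with the class label is highest after subtracting its average mutual information with the features already chosen. The sketch below assumes discrete feature values, the difference form of the criterion, and invented toy data; the abstract does not specify which mRMR variant or discretization the authors used.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr(features, labels, k):
    """Greedy mRMR: `features` maps a name to a list of discrete values."""
    relevance = {
        name: mutual_information(col, labels) for name, col in features.items()
    }
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: "a_copy" duplicates "a", so it adds no new information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "a": [0, 0, 0, 1, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 1, 1, 1, 1, 1],
    "b": [0, 0, 1, 0, 1, 1, 0, 1],
}
print(mrmr(features, labels, k=2))
```

    In the toy data the duplicate feature is as relevant as the original but fully redundant with it, so the second pick goes to the weaker but complementary feature.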

  16. miRNAfe: A comprehensive tool for feature extraction in microRNA prediction.

    PubMed

    Yones, Cristian A; Stegmayer, Georgina; Kamenetzky, Laura; Milone, Diego H

    2015-12-01

    miRNAfe is a comprehensive tool to extract features from RNA sequences. It is freely available as a web service, allowing a single access point to almost all state-of-the-art feature extraction methods used today in a variety of works from different authors. It has a very simple user interface, where the user only needs to load a file containing the input sequences and select the features to extract. As a result, the user obtains a text file with the features extracted, which can be used to analyze the sequences or as input to a miRNA prediction software. The tool can calculate up to 80 features where many of them are multidimensional arrays. In order to simplify the web interface, the features have been divided into six pre-defined groups, each one providing information about: primary sequence, secondary structure, thermodynamic stability, statistical stability, conservation between genomes of different species and substrings analysis of the sequences. Additionally, pre-trained classifiers are provided for prediction in different species. All algorithms to extract the features have been validated, comparing the results with the ones obtained from software of the original authors. The source code is freely available for academic use under GPL license at http://sourceforge.net/projects/sourcesinc/files/mirnafe/0.90/. A user-friendly access is provided as web interface at http://fich.unl.edu.ar/sinc/web-demo/mirnafe/. A more configurable web interface can be accessed at http://fich.unl.edu.ar/sinc/web-demo/mirnafe-full/. PMID:26499212

  17. Systems Medicine: from molecular features and models to the clinic in COPD

    PubMed Central

    2014-01-01

    Background and hypothesis: Chronic Obstructive Pulmonary Disease (COPD) patients are characterized by heterogeneous clinical manifestations and patterns of disease progression. Two major factors that can be used to identify COPD subtypes are muscle dysfunction/wasting and co-morbidity patterns. We hypothesized that COPD heterogeneity is in part the result of complex interactions between several genes and pathways. We explored the possibility of using a Systems Medicine approach to identify such pathways, as well as to generate predictive computational models that may be used in clinical practice. Objective and method: Our overarching goal is to generate clinically applicable predictive models that characterize COPD heterogeneity through a Systems Medicine approach. To this end we have developed a general framework, consisting of three steps/objectives: (1) feature identification, (2) model generation and statistical validation, and (3) application and validation of the predictive models in the clinical scenario. We used muscle dysfunction and co-morbidity as test cases for this framework. Results: In the study of muscle wasting we identified relevant features (genes) by a network analysis and generated predictive models that integrate mechanistic and probabilistic models. This allowed us to characterize muscle wasting as a general de-regulation of pathway interactions. In the co-morbidity analysis we identified relevant features (genes/pathways) by the integration of gene-disease and disease-disease associations. We further present a detailed characterization of co-morbidities in COPD patients that was implemented into a predictive model. In both use cases we were able to achieve predictive modeling but we also identified several key challenges, the most pressing being the validation and implementation into actual clinical practice. Conclusions: The results confirm the potential of the Systems Medicine approach to study complex diseases and generate clinically relevant

  18. Dimensionality reduced cortical features and their use in predicting longitudinal changes in Alzheimer's disease.

    PubMed

    Park, Hyunjin; Yang, Jin-ju; Seo, Jongbum; Lee, Jong-min

    2013-08-29

    Neuroimaging features derived from the cortical surface provide important information in detecting changes related to the progression of Alzheimer's disease (AD). Recent widespread adoption of neuroimaging has allowed researchers to study longitudinal data in AD. We adopted cortical thickness and sulcal depth, parameterized by three-dimensional meshes, from magnetic resonance imaging as the surface features. The cortical feature is high-dimensional, and it is difficult to use directly with a classifier because of the "small sample size" problem. We applied manifold learning to reduce the dimensionality of the feature and then tested the usage of the dimensionality reduced feature with a support vector machine classifier. Principal component analysis (PCA) was chosen as the method of manifold learning. PCA was applied to a region of interest within the cortical surface. We used 30 normal, 30 mild cognitive impairment (MCI) and 12 conversion cases taken from the ADNI database. The classifier was trained using the cortical features extracted from normal and MCI patients. The classifier was tested for the 12 conversion patients only using the imaging data before the actual conversion. The conversion was predicted early with an accuracy of 83%. PMID:23827219

  19. Prediction of banana quality indices from color features using support vector regression.

    PubMed

    Sanaeifar, Alireza; Bakhshipour, Adel; de la Guardia, Miguel

    2016-02-01

    Bananas undergo significant changes in quality indices and color during shelf life, which in turn affect chemical and physical characteristics important to their organoleptic quality. A computer vision system was implemented to evaluate banana color in the RGB, L*a*b* and HSV color spaces, and changes in the color features during shelf life were used for the quantitative prediction of quality indices. The radial basis function (RBF) was used as the kernel function of support vector regression (SVR); the color features in the different color spaces were selected as the model inputs, and total soluble solids, pH, titratable acidity and firmness were the outputs. Experimental results showed an improvement in predictive accuracy compared with results obtained using an artificial neural network (ANN). PMID:26653423

  20. Hyper-Echoic Rim in Thyroid Nodules: A New Ultrasonographic Feature for Malignancy Prediction.

    PubMed

    Dong, YiJie; Zhan, WeiWei; Zhou, JianQiao; Song, LinLin; Ni, XiaoFeng; Zhang, BenYan

    2016-09-01

    The goal of this study was to verify the ultrasound features of hyper-echoic rims in thyroid nodules and to evaluate their diagnostic value in predicting thyroid malignancies. We retrospectively analyzed 228 pathologically proven thyroid nodules (137 malignant and 91 benign nodules). Forty-eight thyroid nodules had a hyper-echoic rim. All malignant nodules (137) were papillary carcinomas, which were studied to identify the correlation between the hyper-echoic rim (detected by ultrasound) and other histologic features. Presence of a hyper-echoic rim had high specificity (94.51%), but low sensitivity (31.39%) in predicting malignancy (p < 0.05). Thirty-seven of 43 malignant nodules had boundary zones of mixed structure (apparent fibrous stroma bands or dense collagenous border with a mixed population of cancerous cells) under microscopic examination. In conclusion, the hyper-echoic rim could be one additional ultrasound parameter in the diagnosis of thyroid lesions. PMID:27339761
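
    The reported sensitivity and specificity can be reproduced from the counts in the abstract. Of 48 nodules with a hyper-echoic rim, 43 were malignant (the abstract refers to "43 malignant nodules" with a rim), which leaves 5 benign nodules with a rim. The 2x2 table below is reconstructed on that reading; the positive and negative predictive values are derived here for illustration and are not reported in the abstract.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard 2x2 diagnostic measures for a binary sign."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


malignant, benign, with_rim = 137, 91, 48
tp = 43                # malignant nodules with a hyper-echoic rim
fp = with_rim - tp     # benign nodules with a rim
fn = malignant - tp    # malignant nodules without a rim
tn = benign - fp       # benign nodules without a rim

for name, value in diagnostic_metrics(tp, fn, fp, tn).items():
    print(f"{name}: {100 * value:.2f}%")
```

    The first two printed values match the 31.39% sensitivity and 94.51% specificity quoted above, which supports the reconstructed table.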

  1. Machine learning methods enable predictive modeling of antibody feature:function relationships in RV144 vaccinees.

    PubMed

    Choi, Ickwon; Chung, Amy W; Suscovich, Todd J; Rerks-Ngarm, Supachai; Pitisuttithum, Punnee; Nitayaphan, Sorachai; Kaewkungwal, Jaranit; O'Connell, Robert J; Francis, Donald; Robb, Merlin L; Michael, Nelson L; Kim, Jerome H; Alter, Galit; Ackerman, Margaret E; Bailey-Kellogg, Chris

    2015-04-01

    The adaptive immune response to vaccination or infection can lead to the production of specific antibodies to neutralize the pathogen or recruit innate immune effector cells for help. The non-neutralizing role of antibodies in stimulating effector cell responses may have been a key mechanism of the protection observed in the RV144 HIV vaccine trial. In an extensive investigation of a rich set of data collected from RV144 vaccine recipients, we here employ machine learning methods to identify and model associations between antibody features (IgG subclass and antigen specificity) and effector function activities (antibody dependent cellular phagocytosis, cellular cytotoxicity, and cytokine release). We demonstrate via cross-validation that classification and regression approaches can effectively use the antibody features to robustly predict qualitative and quantitative functional outcomes. This integration of antibody feature and function data within a machine learning framework provides a new, objective approach to discovering and assessing multivariate immune correlates. PMID:25874406

  2. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features

    PubMed Central

    Yu, Kun-Hsing; Zhang, Ce; Berry, Gerald J.; Altman, Russ B.; Ré, Christopher; Rubin, Daniel L.; Snyder, Michael

    2016-01-01

    Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs. PMID:27527408

  3. Prediction of near-term risk of developing breast cancer using computerized features from bilateral mammograms.

    PubMed

    Sun, Wenqing; Zheng, Bin; Lure, Fleming; Wu, Teresa; Zhang, Jianying; Wang, Benjamin Y; Saltzstein, Edward C; Qian, Wei

    2014-07-01

    Asymmetry of bilateral mammographic tissue density and patterns is a potentially strong indicator of having or developing breast abnormalities or early cancers. The purpose of this study is to design and test the global asymmetry features from bilateral mammograms to predict the near-term risk of women developing detectable high-risk breast lesions or cancer in the next sequential screening mammography examination. The image dataset includes mammograms acquired from 90 women who underwent routine screening examinations, all interpreted as negative and not recalled by the radiologists during the original screening procedures. A computerized breast cancer risk analysis scheme using four image processing modules, including image preprocessing, suspicious region segmentation, image feature extraction, and classification was designed to detect and compute image feature asymmetry between the left and right breasts imaged on the mammograms. The highest computed area under the curve (AUC) is 0.754±0.024 when applying the new computer-aided diagnosis (CAD) scheme to our testing dataset. The positive predictive value and the negative predictive value were 0.58 and 0.80, respectively. PMID:24725671
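
    The area under the ROC curve quoted above has a simple rank interpretation: it is the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way. The scores and labels are invented for illustration and do not come from the study, and the abstract does not say how the authors computed their AUC or its uncertainty.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores for eight cases (1 = later developed a lesion).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1, 0]
print(roc_auc(scores, labels))
```

    The pairwise form is quadratic in the number of cases, which is acceptable for a dataset of the size described; a sort-based rank formula would be the usual choice for larger ones.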

  4. Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features.

    PubMed

    Yu, Kun-Hsing; Zhang, Ce; Berry, Gerald J; Altman, Russ B; Ré, Christopher; Rubin, Daniel L; Snyder, Michael

    2016-01-01

    Lung cancer is the most prevalent cancer worldwide, and histopathological assessment is indispensable for its diagnosis. However, human evaluation of pathology slides cannot accurately predict patients' prognoses. In this study, we obtain 2,186 haematoxylin and eosin stained histopathology whole-slide images of lung adenocarcinoma and squamous cell carcinoma patients from The Cancer Genome Atlas (TCGA), and 294 additional images from Stanford Tissue Microarray (TMA) Database. We extract 9,879 quantitative image features and use regularized machine-learning methods to select the top features and to distinguish shorter-term survivors from longer-term survivors with stage I adenocarcinoma (P<0.003) or squamous cell carcinoma (P=0.023) in the TCGA data set. We validate the survival prediction framework with the TMA cohort (P<0.036 for both tumour types). Our results suggest that automatically derived image features can predict the prognosis of lung cancer patients and thereby contribute to precision oncology. Our methods are extensible to histopathology images of other organs. PMID:27527408

  5. Protein subcellular localization prediction based on compartment-specific features and structure conservation

    PubMed Central

    Su, Emily Chia-Yu; Chiu, Hua-Sheng; Lo, Allan; Hwang, Jenn-Kang; Sung, Ting-Yi; Hsu, Wen-Lian

    2007-01-01

    Background: Protein subcellular localization is crucial for genome annotation, protein function prediction, and drug discovery. Determination of subcellular localization using experimental approaches is time-consuming; thus, computational approaches become highly desirable. Extensive studies of localization prediction have led to the development of several methods including composition-based and homology-based methods. However, their performance might be significantly degraded if homologous sequences are not detected. Moreover, methods that integrate various features could suffer from the problem of low coverage in high-throughput proteomic analyses due to the lack of information to characterize unknown proteins. Results: We propose a hybrid prediction method for Gram-negative bacteria that combines a one-versus-one support vector machines (SVM) model and a structural homology approach. The SVM model comprises a number of binary classifiers, in which biological features derived from Gram-negative bacteria translocation pathways are incorporated. In the structural homology approach, we employ secondary structure alignment for structural similarity comparison and assign the known localization of the top-ranked protein as the predicted localization of a query protein. The hybrid method achieves overall accuracy of 93.7% and 93.2% using ten-fold cross-validation on the benchmark data sets. In the assessment of the evaluation data sets, our method also attains a prediction accuracy of 84.0%, especially when testing on sequences with a low level of homology to the training data. A three-way data split procedure is also incorporated to prevent overestimation of the predictive performance. In addition, we show that the prediction accuracy should be approximately 85% for non-redundant data sets of sequence identity less than 30%. Conclusion: Our results demonstrate that biological features derived from Gram-negative bacteria translocation pathways yield a significant

  6. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks

    NASA Astrophysics Data System (ADS)

    Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng

    2016-01-01

    The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/.

  7. Self-Adaptive MOEA Feature Selection for Classification of Bankruptcy Prediction Data

    PubMed Central

    Gaspar-Cunha, A.; Recio, G.; Costa, L.; Estébanez, C.

    2014-01-01

    Bankruptcy prediction is a vast area of finance and accounting whose importance lies in its relevance for creditors and investors in evaluating the likelihood of bankruptcy. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, estimating the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have been shown to be an excellent tool for dealing with complex problems in finance and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimises the number of features and maximises the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier. PMID:24707201
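
    The two objectives described above (fewer features, higher classifier quality) lead to a set of trade-off solutions instead of a single best one. A minimal sketch of the underlying Pareto-dominance test follows. The candidate tuples are invented for illustration, and the sketch covers only the non-dominated filtering step, not the evolutionary search or the self-adaptation of classifier parameters described in the abstract.

```python
def dominates(a, b):
    """True if candidate `a` is at least as good as `b` on both objectives
    and strictly better on one. Candidates are (n_features, accuracy) pairs:
    n_features is minimised, accuracy is maximised."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Candidates that no other candidate dominates, in input order."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Hypothetical (number of selected features, cross-validated accuracy) pairs.
candidates = [(3, 0.80), (5, 0.85), (5, 0.82), (8, 0.85), (10, 0.90)]
print(pareto_front(candidates))
```

    Here (5, 0.82) and (8, 0.85) drop out because (5, 0.85) matches or beats each of them on both objectives; the remaining three candidates are mutually non-dominated.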

  8. Self-adaptive MOEA feature selection for classification of bankruptcy prediction data.

    PubMed

    Gaspar-Cunha, A; Recio, G; Costa, L; Estébanez, C

    2014-01-01

    Bankruptcy prediction is a vast area of finance and accounting whose importance lies in its relevance for creditors and investors in evaluating the likelihood of bankruptcy. As companies become complex, they develop sophisticated schemes to hide their real situation. In turn, estimating the credit risks associated with counterparts or predicting bankruptcy becomes harder. Evolutionary algorithms have been shown to be an excellent tool for dealing with complex problems in finance and economics where a large number of irrelevant features are involved. This paper provides a methodology for feature selection in classification of bankruptcy data sets using an evolutionary multiobjective approach that simultaneously minimises the number of features and maximises the classifier quality measure (e.g., accuracy). The proposed methodology makes use of self-adaptation by applying the feature selection algorithm while simultaneously optimising the parameters of the classifier used. The methodology was applied to four different sets of data. The obtained results showed the utility of using the self-adaptation of the classifier. PMID:24707201

  9. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks.

    PubMed

    Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng

    2016-01-01

    The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named "DeepMethyl" to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/. PMID:26797014

  10. Predicting DNA Methylation State of CpG Dinucleotide Using Genome Topological Features and Deep Networks

    PubMed Central

    Wang, Yiheng; Liu, Tong; Xu, Dong; Shi, Huidong; Zhang, Chaoyang; Mo, Yin-Yuan; Wang, Zheng

    2016-01-01

    The hypo- or hyper-methylation of the human genome is one of the epigenetic features of leukemia. However, experimental approaches have only determined the methylation state of a small portion of the human genome. We developed deep learning based (stacked denoising autoencoders, or SdAs) software named “DeepMethyl” to predict the methylation state of DNA CpG dinucleotides using features inferred from three-dimensional genome topology (based on Hi-C) and DNA sequence patterns. We used the experimental data from immortalised myelogenous leukemia (K562) and healthy lymphoblastoid (GM12878) cell lines to train the learning models and assess prediction performance. We have tested various SdA architectures with different configurations of hidden layer(s) and amount of pre-training data and compared the performance of deep networks relative to support vector machines (SVMs). Using the methylation states of sequentially neighboring regions as one of the learning features, an SdA achieved a blind test accuracy of 89.7% for GM12878 and 88.6% for K562. When the methylation states of sequentially neighboring regions are unknown, the accuracies are 84.82% for GM12878 and 72.01% for K562. We also analyzed the contribution of genome topological features inferred from Hi-C. DeepMethyl can be accessed at http://dna.cs.usm.edu/deepmethyl/. PMID:26797014

  11. Habitat features and predictive habitat modeling for the Colorado chipmunk in southern New Mexico

    USGS Publications Warehouse

    Rivieccio, M.; Thompson, B.C.; Gould, W.R.; Boykin, K.G.

    2003-01-01

    Two subspecies of Colorado chipmunk (state threatened and federal species of concern) occur in southern New Mexico: Tamias quadrivittatus australis in the Organ Mountains and T. q. oscuraensis in the Oscura Mountains. We developed a GIS model of potentially suitable habitat based on vegetation and elevation features, evaluated site classifications of the GIS model, and determined vegetation and terrain features associated with chipmunk occurrence. We compared GIS model classifications with actual vegetation and elevation features measured at 37 sites. At 60 sites we measured 18 habitat variables regarding slope, aspect, tree species, shrub species, and ground cover. We used logistic regression to analyze habitat variables associated with chipmunk presence/absence. All (100%) 37 sample sites (28 predicted suitable, 9 predicted unsuitable) were classified correctly by the GIS model regarding elevation and vegetation. For 28 sites predicted suitable by the GIS model, 18 sites (64%) appeared visually suitable based on habitat variables selected from logistic regression analyses, of which 10 sites (36%) were specifically predicted as suitable habitat via logistic regression. We detected chipmunks at 70% of sites deemed suitable via the logistic regression models. Shrub cover, tree density, plant proximity, presence of logs, and presence of rock outcrop were retained in the logistic model for the Oscura Mountains; litter, shrub cover, and grass cover were retained in the logistic model for the Organ Mountains. Evaluation of predictive models illustrates the need for multi-stage analyses to best judge performance. Microhabitat analyses indicate prospective needs for different management strategies between the subspecies. Sensitivities of each population of the Colorado chipmunk to natural and prescribed fire suggest that partial burnings of areas inhabited by Colorado chipmunks in southern New Mexico may be beneficial. These partial burnings may later help avoid a fire

  12. Accurate Prediction of Transposon-Derived piRNAs by Integrating Various Sequential and Physicochemical Features

    PubMed Central

    Luo, Longqiang; Li, Dingfang; Zhang, Wen; Tu, Shikui; Zhu, Xiaopeng; Tian, Gang

    2016-01-01

    Background: Piwi-interacting RNA (piRNA) is the largest class of small non-coding RNA molecules. Transposon-derived piRNA prediction can enrich research on small ncRNAs and help to further understand the mechanism of gamete generation. Methods: In this paper, we attempt to differentiate transposon-derived piRNAs from non-piRNAs based on their sequential and physicochemical features by using machine learning methods. We explore six sequence-derived features, i.e. spectrum profile, mismatch profile, subsequence profile, position-specific scoring matrix, pseudo dinucleotide composition and local structure-sequence triplet elements, and systematically evaluate their performances for transposon-derived piRNA prediction. Finally, we consider two approaches, direct combination and ensemble learning, to integrate useful features and achieve high-accuracy prediction models. Results: We construct three datasets, covering three species: Human, Mouse and Drosophila, and evaluate the performances of prediction models by 10-fold cross validation. In the computational experiments, direct combination models achieve AUC of 0.917, 0.922 and 0.992 on Human, Mouse and Drosophila, respectively; ensemble learning models achieve AUC of 0.922, 0.926 and 0.994 on the three datasets. Conclusions: Compared with other state-of-the-art methods, our methods can lead to better performances. In conclusion, the proposed methods are promising for transposon-derived piRNA prediction. The source codes and datasets are available in S1 File. PMID:27074043

  13. Music-induced emotions can be predicted from a combination of brain activity and acoustic features.

    PubMed

    Daly, Ian; Williams, Duncan; Hallowell, James; Hwang, Faustina; Kirke, Alexis; Malik, Asad; Weaver, James; Miranda, Eduardo; Nasuto, Slawomir J

    2015-12-01

    It is widely acknowledged that music can communicate and induce a wide range of emotions in the listener. However, music is a highly complex audio signal composed of a wide range of complex time- and frequency-varying components. Additionally, music-induced emotions are known to differ greatly between listeners. Therefore, it is not immediately clear what emotions will be induced in a given individual by a piece of music. We attempt to predict the music-induced emotional response in a listener by measuring the activity in the listener's electroencephalogram (EEG). We combine these measures with acoustic descriptors of the music, an approach that allows us to consider music as a complex set of time-varying acoustic features, independently of any specific music theory. Regression models are found which allow us to predict the music-induced emotions of our participants with a correlation between the actual and predicted responses of up to r=0.234, p<0.001. This regression fit suggests that over 20% of the variance of the participants' music-induced emotions can be predicted by their neural activity and the properties of the music. Given the large amount of noise, non-stationarity, and non-linearity in both EEG and music, this is an encouraging result. Additionally, the combination of measures of brain activity and acoustic features describing the music played to our participants allows us to predict music-induced emotions with significantly higher accuracies than either feature type alone (p<0.01). PMID:26544602

  14. Long Hydrocarbon Chains Serve as Unique Molecular Features Recognized by Ventral Glomeruli of the Rat Olfactory Bulb

    PubMed Central

    Ho, Sabrina L.; Johnson, Brett A.; Leon, Michael

    2008-01-01

    In an effort to understand mammalian olfactory processing, we have been describing the responses to systematically different odorants in the glomerular layer of the main olfactory bulb of rats. To understand the processing of pure hydrocarbon structures in this system, we used the [14C]2-deoxyglucose method to determine glomerular responses to a homologous series of alkanes (from six to sixteen carbons) that are straight-chained hydrocarbons without functional groups. We found two rostral regions of activity evoked by these odorants, one lateral and one medial, that were observed to shift ventrally with increasing alkane carbon chain length. Furthermore, we successfully predicted that the longest alkanes with carbon chain length greater than our previous odorant selections would stimulate extremely ventral glomerular regions where no activation had been observed with the hundreds of odorants that we had previously studied. Overlaps in response profiles were observed in the patterns evoked by alkanes and by other aliphatic odorants of corresponding carbon chain length despite possessing different oxygen-containing functional groups, which demonstrated that hydrocarbon chains could serve as molecular features in the combinatorial coding of odorant information. We found a close and predictable relationship among the molecular properties of odorants, their induced neural activity, and their perceptual similarities. PMID:16856178

  15. SuSPect: enhanced prediction of single amino acid variant (SAV) phenotype using network features.

    PubMed

    Yates, Christopher M; Filippis, Ioannis; Kelley, Lawrence A; Sternberg, Michael J E

    2014-07-15

    Whole-genome and exome sequencing studies reveal many genetic variants between individuals, some of which are linked to disease. Many of these variants lead to single amino acid variants (SAVs), and accurate prediction of their phenotypic impact is important. Incorporating sequence conservation and network-level features, we have developed a method, SuSPect (Disease-Susceptibility-based SAV Phenotype Prediction), for predicting how likely SAVs are to be associated with disease. SuSPect performs significantly better than other available batch methods on the VariBench benchmarking dataset, with a balanced accuracy of 82%. SuSPect is available at www.sbg.bio.ic.ac.uk/suspect. The Web site has been implemented in Perl and SQLite and is compatible with modern browsers. An SQLite database of possible missense variants in the human proteome is available to download at www.sbg.bio.ic.ac.uk/suspect/download.html. PMID:24810707

  16. Prediction models for solitary pulmonary nodules based on curvelet textural features and clinical parameters.

    PubMed

    Wang, Jing-Jing; Wu, Hai-Feng; Sun, Tao; Li, Xia; Wang, Wei; Tao, Li-Xin; Huo, Da; Lv, Ping-Xin; He, Wen; Guo, Xiu-Hua

    2013-01-01

    Lung cancer, one of the leading causes of cancer-related deaths, usually appears as solitary pulmonary nodules (SPNs) which are hard to diagnose using the naked eye. In this paper, curvelet-based textural features and clinical parameters are used with three prediction models [a multilevel model, a least absolute shrinkage and selection operator (LASSO) regression method, and a support vector machine (SVM)] to improve the diagnosis of benign and malignant SPNs. Dimensionality reduction of the original curvelet-based textural features was achieved using principal component analysis. In addition, non-conditional logistical regression was used to find clinical predictors among demographic parameters and morphological features. The results showed that, combined with 11 clinical predictors, the accuracy rates using 12 principal components were higher than those using the original curvelet-based textural features. To evaluate the models, 10-fold cross validation and back substitution were applied. The results obtained, respectively, were 0.8549 and 0.9221 for the LASSO method, 0.9443 and 0.9831 for SVM, and 0.8722 and 0.9722 for the multilevel model. All in all, it was found that using curvelet-based textural features after dimensionality reduction and using clinical predictors, the highest accuracy rate was achieved with SVM. The method may be used as an auxiliary tool to differentiate between benign and malignant SPNs in CT images. PMID:24289618
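
    The 10-fold cross validation mentioned in this abstract can be sketched as a partition of sample indices into k held-out folds. This is only an illustrative sketch under stated assumptions: the function name is hypothetical, samples are assigned to folds in a fixed interleaved order without shuffling or stratification, and nothing here reflects how the authors actually split their data.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Sample i is assigned to fold i % k, so fold sizes differ by at most one.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            idx
            for fold_id, fold in enumerate(folds)
            if fold_id != held_out
            for idx in fold
        ]
        yield train, test


if __name__ == "__main__":
    for train, test in k_fold_indices(n_samples=10, k=5):
        print("train:", train, "test:", test)
```

    Back substitution, by contrast, evaluates a model on the same samples used to fit it, which is one reason its accuracy figures tend to be higher than the cross-validated ones.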

  17. Role of Side-Chain Molecular Features in Tuning Lower Critical Solution Temperatures (LCSTs) of Oligoethylene Glycol Modified Polypeptides.

    PubMed

    Gharakhanian, Eric G; Deming, Timothy J

    2016-07-01

    A series of thermoresponsive polypeptides has been synthesized using a methodology that allowed facile adjustment of side-chain functional groups. The lower critical solution temperature (LCST) properties of these polymers in water were then evaluated relative to systematic molecular modifications in their side-chains. It was found that in addition to the number of ethylene glycol repeats in the side-chains, terminal and linker groups also have substantial and predictable effects on cloud point temperatures (Tcp). In particular, we found that the structure of these polypeptides allowed for inclusion of polar hydroxyl groups, which significantly increased their hydrophilicity and decreased the need to use long oligoethylene glycol repeats to obtain LCSTs. The thioether linkages in these polypeptides were found to provide an additional structural feature for reversible switching of both polypeptide conformation and thermoresponsive properties. PMID:27102972

  18. Molecular Features of Subtype-Specific Progression from Ductal Carcinoma In Situ to Invasive Breast Cancer.

    PubMed

    Lesurf, Robert; Aure, Miriam Ragle; Mørk, Hanne Håberg; Vitelli, Valeria; Lundgren, Steinar; Børresen-Dale, Anne-Lise; Kristensen, Vessela; Wärnberg, Fredrik; Hallett, Michael; Sørlie, Therese

    2016-07-26

    Breast cancer consists of at least five main molecular "intrinsic" subtypes that are reflected in both pre-invasive and invasive disease. Although previous studies have suggested that many of the molecular features of invasive breast cancer are established early, it is unclear what mechanisms drive progression and whether the mechanisms of progression are dependent or independent of subtype. We have generated mRNA, miRNA, and DNA copy-number profiles from a total of 59 in situ lesions and 85 invasive tumors in order to comprehensively identify those genes, signaling pathways, processes, and cell types that are involved in breast cancer progression. Our work provides evidence that there are molecular features associated with disease progression that are unique to the intrinsic subtypes. We additionally establish subtype-specific signatures that are able to identify a small proportion of pre-invasive tumors with expression profiles that resemble invasive carcinoma, indicating a higher likelihood of future disease progression. PMID:27396337

  19. Quantitative Description of a Protein Fitness Landscape Based on Molecular Features.

    PubMed

    Meini, María-Rocío; Tomatis, Pablo E; Weinreich, Daniel M; Vila, Alejandro J

    2015-07-01

    Understanding the driving forces behind protein evolution requires the ability to correlate the molecular impact of mutations with organismal fitness. To address this issue, we employ here metallo-β-lactamases as a model system, which are Zn(II) dependent enzymes that mediate antibiotic resistance. We present a study of all the possible evolutionary pathways leading to a metallo-β-lactamase variant optimized by directed evolution. By studying the activity, stability and Zn(II) binding capabilities of all mutants in the preferred evolutionary pathways, we show that this local fitness landscape is strongly conditioned by epistatic interactions arising from the pleiotropic effect of mutations in the different molecular features of the enzyme. Activity and stability assays in purified enzymes do not provide explanatory power. Instead, measurement of these molecular features in an environment resembling the native one provides an accurate description of the observed antibiotic resistance profile. We report that optimization of Zn(II) binding abilities of metallo-β-lactamases during evolution is more critical than stabilization of the protein to enhance fitness. A global analysis of these parameters allows us to connect genotype with fitness based on quantitative biochemical and biophysical parameters. PMID:25767204

  20. Molecular Size and Separability Features of Pea Cell Wall Polysaccharides

    PubMed Central

    Talbott, Lawrence D.; Ray, Peter M.

    1992-01-01

    Relative molecular size distributions of pectic and hemicellulosic polysaccharides of pea (Pisum sativum cv Alaska) third internode primary walls were determined by gel filtration chromatography. Pectic polyuronides have a peak molecular mass of about 1100 kilodaltons, relative to dextran standards. This peak may be partly an aggregate of smaller molecular units, because demonstrable aggregation occurred when samples were concentrated by evaporation. About 86% of the neutral sugars (mostly arabinose and galactose) in the pectin cofractionate with polyuronide in gel filtration chromatography and diethylaminoethyl-cellulose chromatography and appear to be attached covalently to polyuronide chains, probably as constituents of rhamnogalacturonans. However, at least 60% of the wall's arabinan/galactan is not linked covalently to the bulk of its rhamnogalacturonan, either glycosidically or by ester links, but occurs in the hemicellulose fraction, accompanied by negligible uronic acid, and has a peak molecular mass of about 1000 kilodaltons. Xyloglucan, the other principal hemicellulosic polymer, has a peak molecular mass of about 30 kilodaltons (with a secondary, usually minor, peak of approximately 300 kilodaltons) and is mostly not linked glycosidically either to pectic polyuronides or to arabinogalactan. The relatively narrow molecular mass distributions of these polymers suggest mechanisms of co- or postsynthetic control of hemicellulose chain length by the cell. Although the macromolecular features of the mentioned polymers individually agree generally with those shown in the widely disseminated sycamore cell primary wall model, the matrix polymers seem to be associated mostly noncovalently rather than in the covalently interlinked meshwork postulated by that model. Xyloglucan and arabinan/galactan may form tightly and more loosely bound layers, respectively, around the cellulose microfibrils, the outer layer interacting with pectic rhamnogalacturonans that occupy

  1. Remote health monitoring: predicting outcome success based on contextual features for cardiovascular disease.

    PubMed

    Alshurafa, Nabil; Eastwood, Jo-Ann; Pourhomayoun, Mohammad; Liu, Jason J; Sarrafzadeh, Majid

    2014-01-01

    Current studies have produced a plethora of remote health monitoring (RHM) systems designed to enhance the care of patients with chronic diseases. Many RHM systems are designed to improve patient risk factors for cardiovascular disease, including physiological parameters such as body mass index (BMI) and waist circumference, and lipid profiles such as low density lipoprotein (LDL) and high density lipoprotein (HDL). There are several patient characteristics that could be determining factors for a patient's RHM outcome success, but these characteristics have been largely unidentified. In this paper, we analyze results from an RHM system deployed in a six month Women's Heart Health study of 90 patients, and apply advanced feature selection and machine learning algorithms to identify patients' key baseline contextual features and build effective prediction models that help determine RHM outcome success. We introduce Wanda-CVD, a smartphone-based RHM system designed to help participants with cardiovascular disease risk factors by motivating participants through wireless coaching using feedback and prompts as social support. We analyze key contextual features that secure positive patient outcomes in both physiological parameters and lipid profiles. Results from the Women's Heart Health study show that health threat of heart disease, quality of life, family history, stress factors, social support, and anxiety at baseline all help predict patient RHM outcome success. PMID:25570321

  2. BioCAST/IFCT-1002: epidemiological and molecular features of lung cancer in never-smokers.

    PubMed

    Couraud, Sébastien; Souquet, Pierre-Jean; Paris, Christophe; Dô, Pascal; Doubre, Hélène; Pichon, Eric; Dixmier, Adrien; Monnet, Isabelle; Etienne-Mastroianni, Bénédicte; Vincent, Michel; Trédaniel, Jean; Perrichon, Marielle; Foucher, Pascal; Coudert, Bruno; Moro-Sibilot, Denis; Dansin, Eric; Labonne, Stéphanie; Missy, Pascale; Morin, Franck; Blanché, Hélène; Zalcman, Gérard

    2015-05-01

    Lung cancer in never-smokers (LCINS) (fewer than 100 cigarettes in lifetime) is considered a distinct entity and harbours an original molecular profile. However, the epidemiological and molecular features of LCINS in Europe remain poorly understood. All consecutive newly diagnosed LCINS patients were included in this prospective observational study by 75 participating centres during a 14-month period. Each patient completed a detailed questionnaire about risk factor exposure. Biomarker and pathological analyses were also collected. We report the main descriptive overall results with a focus on sex differences. 384 patients were included: 65 men and 319 women. 66% had been exposed to passive smoking (significantly higher among women). Definite exposure to main occupational carcinogens was significantly higher in men (35% versus 8% in women). A targetable molecular alteration was found in 73% of patients (without any significant sex difference): EGFR in 51%, ALK in 8%, KRAS in 6%, HER2 in 3%, BRAF in 3%, PIK3CA in less than 1%, and multiple in 2%. We present the largest and most comprehensive LCINS analysis in a European population. Physicians should track occupational exposure in men (35%), and a somatic molecular alteration in both sexes (73%). PMID:25657019

  3. Energy Minimization of Molecular Features Observed on the (110) Face of Lysozyme Crystals

    NASA Technical Reports Server (NTRS)

    Perozzo, Mary A.; Konnert, John H.; Li, Huayu; Nadarajah, Arunan; Pusey, Marc

    1999-01-01

    Molecular dynamics and energy minimization have been carried out using the program XPLOR to check the plausibility of a model lysozyme crystal surface. The molecular features of the (110) face of lysozyme were observed using atomic force microscopy (AFM). A model of the crystal surface was constructed using the PDB file 193L, and was used to simulate an AFM image. Molecule translations, van der Waals radii, and assumed AFM tip shape were adjusted to maximize the correlation coefficient between the experimental and simulated images. The highest degree of correlation (0.92) was obtained with the molecules displaced over 6 Å from their positions within the bulk of the crystal. The quality of this starting model, the extent of energy minimization, and the correlation coefficient between the final model and the experimental data will be discussed.

  4. Sub-resolution assist feature (SRAF) printing prediction using logistic regression

    NASA Astrophysics Data System (ADS)

    Tan, Chin Boon; Koh, Kar Kit; Zhang, Dongqing; Foong, Yee Mei

    2015-03-01

    In optical proximity correction (OPC), the sub-resolution assist feature (SRAF) has been used to enhance the process window of main structures. However, printing of the SRAF on the wafer is undesirable, as it may degrade the overall process yield if it is transferred into the final pattern. A reasonably accurate prediction model is needed during OPC to ensure that the SRAF placement and size carry no risk of SRAF printing. Current common practice in OPC is to use either the main OPC model or a model threshold adjustment (MTA) solution to predict SRAF printing. This paper studies the feasibility of SRAF printing prediction using logistic regression (LR). Logistic regression is a probabilistic classification model that gives discrete binary outputs given sufficient input variables from the SRAF printing conditions. In the application of SRAF printing prediction, the binary outputs can be treated as 1 for SRAF-Printing and 0 for No-SRAF-Printing. The experimental work was performed using a 20 nm line/space process layer. The results demonstrate that the accuracy of SRAF printing prediction using the LR approach exceeds that of the MTA solution. Overall error rates as low as 2% in calibration and 5% in verification were achieved by the LR approach, compared to 6% in calibration and 15% in verification for the MTA solution. In addition, the performance of the LR approach was found to be relatively independent and consistent across different resist image planes compared to the MTA solution.
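
    The binary classification described in this abstract (1 for SRAF-Printing, 0 for No-SRAF-Printing) can be sketched with a small logistic regression fitted by batch gradient descent. This is a minimal sketch under stated assumptions, not the authors' model: the single input feature (a normalised SRAF size), the toy data, the learning rate and the function names are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(
    xs: Sequence[Sequence[float]],
    ys: Sequence[int],
    lr: float = 0.5,
    epochs: int = 5000,
) -> Tuple[List[float], float]:
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples, n_features = len(xs), len(xs[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            error = p - y  # derivative of the log-loss w.r.t. the score
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(weights: Sequence[float], bias: float, x: Sequence[float],
            threshold: float = 0.5) -> int:
    """Return 1 (SRAF prints) if the predicted probability reaches the threshold."""
    p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
    return 1 if p >= threshold else 0


if __name__ == "__main__":
    # Hypothetical toy data: one normalised SRAF size per sample.
    sizes = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
    printed = [0, 0, 0, 1, 1, 1]
    w, b = fit_logistic(sizes, printed)
    print(predict(w, b, [0.15]), predict(w, b, [0.85]))
```

    The model itself outputs a probability; the discrete 0/1 label comes from the threshold, which could be lowered if avoiding any SRAF printing matters more than avoiding false alarms.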

  5. Prediction of bacterial type IV secreted effectors by C-terminal features

    PubMed Central

    2014-01-01

    Background: Many bacteria can deliver pathogenic proteins (effectors) through type IV secretion systems (T4SSs) to the eukaryotic cytoplasm, causing host diseases. Inherent properties of the effectors, such as sequence diversity and global scattering throughout the whole genome, make it a big challenge to effectively identify the full set of T4SS effectors. Therefore, an effective inter-species T4SS effector prediction tool is urgently needed to help discover new effectors in a variety of bacterial species, especially those with few known effectors, e.g., Helicobacter pylori. Results: In this research, we first manually annotated a full list of validated T4SS effectors from different bacteria and then carefully compared their C-terminal sequential and position-specific amino acid compositions, possible motifs and structural features. Based on the observed features, we set up several models to automatically recognize T4SS effectors. Three of the models performed strikingly better than the others and T4SEpre_Joint had the best performance, distinguishing T4SS effectors from non-effectors with a 5-fold cross-validation sensitivity of 89% at a specificity of 97%, based on the training datasets. An inter-species cross prediction showed that T4SEpre_Joint could recall most known effectors from a variety of species. The inter-species prediction tool package, T4SEpre, was further used to predict new T4SS effectors from H. pylori, an important human pathogen associated with gastritis, ulcer and cancer. In total, 24 new highly possible H. pylori T4S effector genes were computationally identified. Conclusions: We conclude that T4SEpre, as an effective inter-species T4SS effector prediction software package, will help find new pathogenic T4SS effectors efficiently in a variety of pathogenic bacteria. PMID:24447430
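
    Sensitivity and specificity, the two figures quoted for T4SEpre_Joint, are ratios taken from a confusion matrix. The sketch below shows the arithmetic only; the counts are hypothetical numbers chosen to reproduce 89% and 97% and are not the study's data, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real effectors that the classifier recovers."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-effectors that the classifier correctly rejects."""
    return true_neg / (true_neg + false_pos)


if __name__ == "__main__":
    # Hypothetical counts: 100 effectors and 100 non-effectors.
    print(sensitivity(true_pos=89, false_neg=11))
    print(specificity(true_neg=97, false_pos=3))
```

    Reporting sensitivity "at" a given specificity reflects that both values move together as the decision threshold changes.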

  6. Predicting spectral features in galaxy spectra from broad-band photometry

    NASA Astrophysics Data System (ADS)

    Abdalla, F. B.; Mateus, A.; Santos, W. A.; Sodrè, L., Jr.; Ferreras, I.; Lahav, O.

    2008-07-01

    We explore the prospects of predicting emission-line features present in galaxy spectra given broad-band photometry alone. There is a general consensus that colours and spectral features, most notably the 4000 Å break, can predict many properties of galaxies, including star formation rates, and hence could be used to infer some of the line properties. We argue that these techniques have great prospects in helping us understand line emission in extragalactic objects and might speed up future galaxy redshift surveys if they are to target emission-line objects only. We use two independent methods, Artificial Neural Networks (based on the ANNz code) and Locally Weighted Regression (LWR), to retrieve correlations present in the colour N-dimensional space and to predict the equivalent widths present in the corresponding spectra. We also investigate how well it is possible to separate galaxies with and without lines from broad-band photometry only. We find, unsurprisingly, that recombination lines can be well predicted by galaxy colours. However, among collisional lines some can and some cannot be predicted well from galaxy colours alone, without any further redshift information. We also use our techniques to estimate how much information contained in spectral diagnostic diagrams can be recovered from broad-band photometry alone. We find that it is possible to classify active galactic nuclei and star formation objects relatively well using colours only. We suggest that this technique could be used to considerably improve redshift surveys such as the upcoming Fibre Multi Object Spectrograph (FMOS) survey and the planned Wide Field Multi Object Spectrograph (WFMOS) survey.

  7. Prognostic Significance and Molecular Features of Signet-Ring Cell and Mucinous Components in Colorectal Carcinoma

    PubMed Central

    Mima, Kosuke; Sukawa, Yasutaka; Li, Tingting; Yasunari, Mika; Zhang, Xuehong; Wu, Kana; Meyerhardt, Jeffrey A.; Fuchs, Charles S.

    2014-01-01

    Background: Colorectal carcinoma (CRC) represents a group of histopathologically and molecularly heterogeneous diseases, which may contain signet-ring cell component and/or mucinous component to a varying extent under pathology assessment. However, little is known about the prognostic significance of those components, independent of various tumor molecular features. Methods: Utilizing a molecular pathological epidemiology database of 1,336 rectal and colon cancers in the Nurses’ Health Study and the Health Professionals Follow-up Study, we examined patient survival according to the proportion of signet-ring cell and mucinous components in CRCs. Cox proportional hazards models were used to compute hazard ratio (HR) for mortality, adjusting for potential confounders including stage, microsatellite instability, CpG island methylator phenotype, LINE-1 methylation, and KRAS, BRAF, and PIK3CA mutations. Results: Compared to CRC without signet-ring cell component, 1–50 % signet-ring cell component was associated with multivariate CRC-specific mortality HR of 1.40 [95 % confidence interval (CI) 1.02–1.93], and >50 % signet-ring cell component was associated with multivariate CRC-specific mortality HR of 4.53 (95 % CI 2.53–8.12) (Ptrend < 0.0001). Compared to CRC without mucinous component, neither 1–50 % mucinous component (multivariate HR 1.04; 95 % CI 0.81–1.33) nor >50 % mucinous component (multivariate HR 0.82; 95 % CI 0.54–1.23) was significantly associated with CRC-specific mortality (Ptrend > 0.57). Conclusions: Even a minor (50 % or less) signet-ring cell component in CRC was associated with higher patient mortality, independent of various tumor molecular and other clinicopathological features. In contrast, mucinous component was not associated with mortality in CRC patients. PMID:25326395
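
    A reported hazard ratio and its 95 % confidence interval are related on the log scale, so the standard error and an approximate Wald statistic can be recovered from the published numbers. The sketch below is illustrative arithmetic only, assuming a symmetric interval on the log scale; because the reported limits are rounded, the results are approximate, and the function name is hypothetical. It does not reproduce the study's Cox models or its trend tests.

```python
import math
from typing import Tuple

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def wald_from_ci(hr: float, lower: float, upper: float) -> Tuple[float, float, float]:
    """Recover (standard error of log HR, z statistic, two-sided p) from HR and 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p_value


if __name__ == "__main__":
    for label, hr, lo, hi in [
        ("signet-ring 1-50 %", 1.40, 1.02, 1.93),
        ("signet-ring >50 %", 4.53, 2.53, 8.12),
        ("mucinous 1-50 %", 1.04, 0.81, 1.33),
    ]:
        se, z, p = wald_from_ci(hr, lo, hi)
        print(f"{label}: SE={se:.3f}, z={z:.2f}, p={p:.4f}")
```

    An interval that excludes 1 corresponds to a z statistic beyond about 1.96 in absolute value, which is why the signet-ring estimates are read as significant and the mucinous ones are not.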

  8. kmer-SVM: a web server for identifying predictive regulatory sequence features in genomic data sets

    PubMed Central

    Fletez-Brant, Christopher; Lee, Dongwon; McCallion, Andrew S.; Beer, Michael A.

    2013-01-01

    Massively parallel sequencing technologies have made the generation of genomic data sets a routine component of many biological investigations. For example, chromatin immunoprecipitation followed by sequencing assays detect genomic regions bound (directly or indirectly) by specific factors, and DNase-seq identifies regions of open chromatin. A major bottleneck in the interpretation of these data is the identification of the underlying DNA sequence code that defines, and ultimately facilitates prediction of, these transcription factor (TF) bound or open chromatin regions. We have recently developed a novel computational methodology, which uses a support vector machine (SVM) with kmer sequence features (kmer-SVM) to identify predictive combinations of short transcription factor-binding sites, which determine the tissue specificity of these genomic assays (Lee, Karchin and Beer, Discriminative prediction of mammalian enhancers from DNA sequence. Genome Res. 2011; 21:2167–80). This regulatory information can (i) give confidence in genomic experiments by recovering previously known binding sites, and (ii) reveal novel sequence features for subsequent experimental testing of cooperative mechanisms. Here, we describe the development and implementation of a web server to allow the broader research community to independently apply our kmer-SVM to analyze and interpret their genomic datasets. We analyze five recently published data sets and demonstrate how this tool identifies accessory factors and repressive sequence elements. kmer-SVM is available at http://kmersvm.beerlab.org. PMID:23771147
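
    The kmer sequence features mentioned in this abstract can be pictured as a fixed-length vector of k-mer counts per sequence, which a classifier such as an SVM can then take as input. The following is a minimal sketch of that idea only: the function name is hypothetical, and details of the published tool (for example its treatment of reverse complements, normalisation, or kernel) are not reproduced here.

```python
from collections import Counter
from itertools import product
from typing import List


def kmer_counts(sequence: str, k: int) -> List[int]:
    """Count every overlapping k-mer and return counts in lexicographic ACGT order.

    k-mers containing characters other than A, C, G, T are ignored.
    """
    sequence = sequence.upper()
    observed = Counter(
        sequence[i:i + k] for i in range(len(sequence) - k + 1)
    )
    vocabulary = ["".join(letters) for letters in product("ACGT", repeat=k)]
    return [observed.get(kmer, 0) for kmer in vocabulary]


if __name__ == "__main__":
    # 2-mers of ACGTAC are AC, CG, GT, TA, AC.
    print(kmer_counts("ACGTAC", 2))
```

    The vector has 4^k entries, so the choice of k trades specificity of the features against sparsity of the counts.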

  9. Docking Studies and Molecular Dynamic Simulations Reveal Different Features of IDO1 Structure.

    PubMed

    Greco, Francesco Antonio; Bournique, Answald; Coletti, Alice; Custodi, Chiara; Dolciami, Daniela; Carotti, Andrea; Macchiarulo, Antonio

    2016-09-01

    In the last decade, indoleamine 2,3-dioxygenase 1 (IDO1) has attracted a great deal of attention being recognized as key regulator of immunosuppressive pathways in the tumor immuno-editing process. Several classes of inhibitors have been developed as potential anticancer agents, but only few of them have advanced in clinical trials. Hence, the quest of novel potent and selective inhibitors of the enzyme is still active and mostly pursued by structure-based drug design strategies based on early and more recent crystal structures of IDO1. Combining docking studies and molecular dynamic simulations, in this work we have comparatively investigated the structural features of each crystal structure of IDO1. The results pinpoint different features in specific crystal structures of the enzyme that may benefit the medicinal chemistry arena aiding the design of novel potent and selective inhibitors of IDO1. PMID:27546049

  10. Observations of the interstellar ice grain feature in the Taurus molecular clouds

    SciTech Connect

    Whittet, D.C.B.; Bode, H.F.; Longmore, A.J.; Baines, D.W.T.; Evans, A.

    1983-01-01

    Although water ice was originally proposed as a major constituent of the interstellar grain population (e.g. Oort and van de Hulst, 1946), the advent of infrared astronomy has shown that the expected absorption due to O-H stretching vibrations at 3 μm is elusive. Observations have in fact revealed that the carrier of this feature is apparently restricted to regions deep within dense molecular clouds (Merrill et al., 1976; Willner et al., 1982). However, the exact carrier of this feature is still controversial, and many questions remain as to the conditions required for its appearance. It is also uncertain whether it is restricted to circumstellar shells, rather than the general cloud medium. Detailed discussion of the 3 μm band properties is given elsewhere in this volume. 15 references, 4 figures.

  11. Toll-like receptor 7 agonists: chemical feature based pharmacophore identification and molecular docking studies.

    PubMed

    Yu, Hui; Jin, Hongwei; Sun, Lidan; Zhang, Liangren; Sun, Gang; Wang, Zhanli; Yu, Yongchun

    2013-01-01

    Chemical feature based pharmacophore models were generated for Toll-like receptor 7 (TLR7) agonists using the HypoGen algorithm, which is implemented in the Discovery Studio software. Several methods and tools used in validation of the pharmacophore model were presented. The first hypothesis, Hypo1, was considered to be the best pharmacophore model; it consists of four features: one hydrogen bond acceptor, one hydrogen bond donor, and two hydrophobic features. In addition, homology modeling and molecular docking studies were employed to probe the intermolecular interactions between TLR7 and its agonists. The results further confirmed the reliability of the pharmacophore model. The obtained pharmacophore model (Hypo1) was then employed as a query to screen the Traditional Chinese Medicine Database (TCMD) for other potential lead compounds. One hit was identified as a potent TLR7 agonist, which has antiviral activity against hepatitis virus in vitro. Therefore, our current work provides confidence for the utility of the selected chemical feature based pharmacophore model to design novel TLR7 agonists with desired biological activity. PMID:23526932

  12. ERα-Negative and Triple Negative Breast Cancer: Molecular Features and Potential Therapeutic Approaches

    PubMed Central

    Chen, Jin-Qiang; Russo, Jose

    2010-01-01

    Triple negative breast cancer (TNBC) is a type of aggressive breast cancer lacking the expression of estrogen receptors (ER), progesterone receptors (PR) and human epidermal growth factor receptor-2 (HER-2). TNBC patients account for approximately 15% of total breast cancer patients and are more prevalent among young African, African-American and Latino women patients. The currently available ER-targeted and Her-2-based therapies are not effective for treating TNBC. Recent studies have revealed a number of novel features of TNBC. In the present work, we comprehensively addressed these features and discussed potential therapeutic approaches based on these features for TNBC, with particular focus on: 1) the pathological features of TNBC/basal-like breast cancer; 2) E2/ERβ – mediated signaling pathways; 3) G-protein coupling receptor-30/epithelial growth factor receptor (GPCR-30/EGFR) signaling pathway; 4) interactions of ERβ with breast cancer 1/2 (BRCA1/2); 5) chemokine CXCL8 and related chemokines; 6) altered microRNA signatures and suppression of ERα expression/ERα-signaling by micro-RNAs; 7) altered expression of several pro-oncogenic and tumor suppressor proteins; and 8) genotoxic effects caused by oxidative estrogen metabolites. Gaining better insights into these molecular pathways in TNBC may lead to identification of novel biomarkers and targets for development of diagnostic and therapeutic approaches for prevention and treatment of TNBC. PMID:19527773

  13. A switchable bis-branched [1]rotaxane featuring dual-mode molecular motions and tunable molecular aggregation.

    PubMed

    Li, Hong; Li, Xin; Cao, Zhan-Qi; Qu, Da-Hui; Ågren, Hans; Tian, He

    2014-01-01

    A multifunctional bis-branched [1]rotaxane containing a perylene bisimide (PBI) core and two identical bistable[1]rotaxane arms terminated with ferrocene units was prepared and characterized by (1)H NMR, (13)C NMR, and 2D ROESY NMR spectroscopies and by HR-ESI spectrometry. The system is shown to possess several key features: (1) In acetone solution, external acid-base stimuli can result in relative mechanical movements of its ring and thread, which can induce extension and contraction movements of the whole system accompanied by a rotational movement of the ferrocene units, thus realizing dual-mode molecular motions, and the optimized conformations at different states are obtained through molecular dynamics simulations employing the general Amber force field. (2) The introduction of PBI enables the system fluorescence encoding through distance-dependent photoinduced electron transfer process from the ferrocene units to the PBI fluorophore. (3) The addition of Zn(2+) can increase the degree of aggregation of the system, while adding base hinders aggregation because of the movement of the macrocycle. The tunable aggregated nanostructural morphologies of [1]rotaxane were examined by scanning electron microscopy. These results can pave the way to achieve precise control of integrated and coupling nanomechanical motions at a single-molecule level and provide more insight into controlling the aggregate behavior of switchable mechanically interlocked molecules. PMID:25302680

  14. Symbolic features and classification via support vector machine for predicting death in patients with Chagas disease.

    PubMed

    Sady, Cristina C R; Ribeiro, Antonio Luiz P

    2016-03-01

    This paper introduces a technique for predicting death in patients with Chagas disease using features extracted from symbolic series and time-frequency indices of heart rate variability (HRV). The study included 150 patients: 15 patients who died and 135 who did not. The HRV series were obtained from 24-h Holter monitoring. Sequences of symbols from 5-min epochs from series of RR intervals were generated using symbolic dynamics and ordinal pattern statistics. Fourteen features were extracted from symbolic series and four derived from clinical aspects of patients. For classification, the 18 features from each epoch were used as inputs in a support vector machine (SVM) with a radial basis function (RBF) kernel. The results showed that it is possible to distinguish between the two classes, patients with Chagas disease who did or did not die, with a 95% accuracy rate. Therefore, we suggest that the use of new features based on symbolic series, coupled with classic time-frequency and clinical indices, proves to be a good predictor of death in patients with Chagas disease. PMID:26851730
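
    The radial basis function (RBF) kernel named above has a standard closed form, K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates it for two 18-dimensional feature vectors, matching the 18 features per epoch described in the abstract; the vectors and the value of gamma are made up for illustration, since the abstract reports neither.

```python
import math


def rbf_kernel(x, y, gamma: float) -> float:
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical 18-feature epochs (14 symbolic + 4 clinical features).
epoch_a = [0.1] * 18
epoch_b = [0.2] * 18
print(rbf_kernel(epoch_a, epoch_b, gamma=0.5))
```

    The kernel equals 1 for identical inputs and decays toward 0 as the feature vectors move apart, which is what lets the SVM draw a nonlinear boundary between the two patient classes.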

  15. Systems Biological Approach of Molecular Descriptors Connectivity: Optimal Descriptors for Oral Bioavailability Prediction

    PubMed Central

    Ahmed, Shiek S. S. J.; Ramakrishnan, V.

    2012-01-01

    Background Poor oral bioavailability is an important parameter accounting for the failure of drug candidates. Approximately 50% of developing drugs fail because of unfavorable oral bioavailability. In silico prediction of oral bioavailability (%F) based on physicochemical properties is highly needed. Although many computational models have been developed to predict oral bioavailability, their accuracy remains low with a significant number of false positives. In this study, we present an oral bioavailability model based on a systems biological approach, using a machine learning algorithm coupled with an optimal discriminative set of physicochemical properties. Results The models were developed based on 247 computationally derived physicochemical descriptors from 2279 molecules, among which 969, 605 and 705 molecules corresponded to the oral bioavailability, intestinal absorption (HIA) and caco-2 permeability data sets, respectively. The partial least squares discriminant analysis showed that 49 descriptors of HIA and 50 descriptors of caco-2 are the major contributing descriptors in classifying into groups. Of these descriptors, 47 were commonly associated with HIA and caco-2, which suggests that they play a vital role in classifying oral bioavailability. To determine the best machine learning algorithm, 21 classifiers were compared using a bioavailability data set of 969 molecules with 47 descriptors. Each molecule in the data set was represented by a set of 47 physicochemical properties with the functional relevance labeled as (+bioavailability/−bioavailability) to indicate good-bioavailability/poor-bioavailability molecules. The best-performing algorithm was the logistic algorithm. The correlation based feature selection (CFS) algorithm was implemented, which confirms that these 47 descriptors are the fundamental descriptors for oral bioavailability prediction. Conclusion The logistic algorithm with 47 selected descriptors correctly predicted the oral

  16. Molecular features of interaction between VEGFA and anti-angiogenic drugs used in retinal diseases: a computational approach

    PubMed Central

    Platania, Chiara B. M.; Di Paola, Luisa; Leggio, Gian M.; Romano, Giovanni L.; Drago, Filippo; Salomone, Salvatore; Bucolo, Claudio

    2015-01-01

    Anti-angiogenic agents are biological drugs used for treatment of retinal neovascular degenerative diseases. In this study, we aimed at in silico analysis of interaction of vascular endothelial growth factor A (VEGFA), the main mediator of angiogenesis, with binding domains of anti-angiogenic agents used for treatment of retinal diseases, such as ranibizumab, bevacizumab and aflibercept. The analysis of anti-VEGF/VEGFA complexes was carried out by means of protein-protein docking and molecular dynamics (MD) coupled to molecular mechanics-Poisson Boltzmann Surface Area (MM-PBSA) calculation. Molecular dynamics simulation was further analyzed by protein contact networks. Rough energetic evaluation with protein-protein docking scores revealed that aflibercept/VEGFA complex was characterized by electrostatic stabilization, whereas ranibizumab and bevacizumab complexes were stabilized by Van der Waals (VdW) energy term; these results were confirmed by MM-PBSA. Comparison of MM-PBSA predicted energy terms with experimental binding parameters reported in literature indicated that the high association rate (Kon) of aflibercept to VEGFA was consistent with high stabilizing electrostatic energy. On the other hand, the relatively low experimental dissociation rate (Koff) of ranibizumab may be attributed to lower conformational fluctuations of the ranibizumab/VEGFA complex, higher number of contacts and hydrogen bonds in comparison to bevacizumab and aflibercept. Thus, the anti-angiogenic agents have been found to be considerably different both in terms of molecular interactions and stabilizing energy. Characterization of such features can improve the design of novel biological drugs potentially useful in clinical practice. PMID:26578958

  17. Molecular features of interaction between VEGFA and anti-angiogenic drugs used in retinal diseases: a computational approach.

    PubMed

    Platania, Chiara B M; Di Paola, Luisa; Leggio, Gian M; Romano, Giovanni L; Drago, Filippo; Salomone, Salvatore; Bucolo, Claudio

    2015-01-01

    Anti-angiogenic agents are biological drugs used for treatment of retinal neovascular degenerative diseases. In this study, we aimed at in silico analysis of interaction of vascular endothelial growth factor A (VEGFA), the main mediator of angiogenesis, with binding domains of anti-angiogenic agents used for treatment of retinal diseases, such as ranibizumab, bevacizumab and aflibercept. The analysis of anti-VEGF/VEGFA complexes was carried out by means of protein-protein docking and molecular dynamics (MD) coupled to molecular mechanics-Poisson Boltzmann Surface Area (MM-PBSA) calculation. Molecular dynamics simulation was further analyzed by protein contact networks. Rough energetic evaluation with protein-protein docking scores revealed that aflibercept/VEGFA complex was characterized by electrostatic stabilization, whereas ranibizumab and bevacizumab complexes were stabilized by Van der Waals (VdW) energy term; these results were confirmed by MM-PBSA. Comparison of MM-PBSA predicted energy terms with experimental binding parameters reported in literature indicated that the high association rate (Kon) of aflibercept to VEGFA was consistent with high stabilizing electrostatic energy. On the other hand, the relatively low experimental dissociation rate (Koff) of ranibizumab may be attributed to lower conformational fluctuations of the ranibizumab/VEGFA complex, higher number of contacts and hydrogen bonds in comparison to bevacizumab and aflibercept. Thus, the anti-angiogenic agents have been found to be considerably different both in terms of molecular interactions and stabilizing energy. Characterization of such features can improve the design of novel biological drugs potentially useful in clinical practice. PMID:26578958

  18. Respiratory trace feature analysis for the prediction of respiratory-gated PET quantification

    NASA Astrophysics Data System (ADS)

    Wang, Shouyi; Bowen, Stephen R.; Chaovalitwongse, W. Art; Sandison, George A.; Grabowski, Thomas J.; Kinahan, Paul E.

    2014-02-01

    The benefits of respiratory gating in quantitative PET/CT vary tremendously between individual patients. Respiratory pattern is among many patient-specific characteristics that are thought to play an important role in gating-induced imaging improvements. However, the quantitative relationship between patient-specific characteristics of respiratory pattern and improvements in quantitative accuracy from respiratory-gated PET/CT has not been well established. If such a relationship could be estimated, then patient-specific respiratory patterns could be used to prospectively select appropriate motion compensation during image acquisition on a per-patient basis. This study was undertaken to develop a novel statistical model that predicts quantitative changes in PET/CT imaging due to respiratory gating. Free-breathing static FDG-PET images without gating and respiratory-gated FDG-PET images were collected from 22 lung and liver cancer patients on a PET/CT scanner. PET imaging quality was quantified with peak standardized uptake value (SUVpeak) over lesions of interest. Relative differences in SUVpeak between static and gated PET images were calculated to indicate quantitative imaging changes due to gating. A comprehensive multidimensional extraction of the morphological and statistical characteristics of respiratory patterns was conducted, resulting in 16 features that characterize representative patterns of a single respiratory trace. The six most informative features were subsequently extracted using a stepwise feature selection approach. The multiple-regression model was trained and tested based on a leave-one-subject-out cross-validation. The predicted quantitative improvements in PET imaging achieved an accuracy higher than 90% using a criterion with a dynamic error-tolerance range for SUVpeak values. The results of this study suggest that our prediction framework could be applied to determine which patients would likely benefit from respiratory motion compensation
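
    The leave-one-subject-out cross-validation mentioned above can be sketched as a split generator: every sample belonging to one patient is held out in turn, so no patient contributes to both training and testing. The subject labels below are hypothetical, and this shows only the splitting logic, not the authors' regression model or feature selection.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject.

    subject_ids[i] is the subject that sample i belongs to; a subject may
    contribute several samples (for example, several lesions).
    """
    unique_subjects = list(dict.fromkeys(subject_ids))  # keeps first-seen order
    for held_out in unique_subjects:
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical example: patient "p1" has two lesions, the others have one each.
for subject, train_idx, test_idx in leave_one_subject_out(["p1", "p1", "p2", "p3"]):
    print(subject, train_idx, test_idx)
```

    Splitting by subject rather than by sample avoids optimistic error estimates when lesions from the same patient are correlated.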

  19. Respiratory trace feature analysis for the prediction of respiratory-gated PET quantification.

    PubMed

    Wang, Shouyi; Bowen, Stephen R; Chaovalitwongse, W Art; Sandison, George A; Grabowski, Thomas J; Kinahan, Paul E

    2014-02-21

    The benefits of respiratory gating in quantitative PET/CT vary tremendously between individual patients. Respiratory pattern is among many patient-specific characteristics that are thought to play an important role in gating-induced imaging improvements. However, the quantitative relationship between patient-specific characteristics of respiratory pattern and improvements in quantitative accuracy from respiratory-gated PET/CT has not been well established. If such a relationship could be estimated, then patient-specific respiratory patterns could be used to prospectively select appropriate motion compensation during image acquisition on a per-patient basis. This study was undertaken to develop a novel statistical model that predicts quantitative changes in PET/CT imaging due to respiratory gating. Free-breathing static FDG-PET images without gating and respiratory-gated FDG-PET images were collected from 22 lung and liver cancer patients on a PET/CT scanner. PET imaging quality was quantified with peak standardized uptake value (SUV(peak)) over lesions of interest. Relative differences in SUV(peak) between static and gated PET images were calculated to indicate quantitative imaging changes due to gating. A comprehensive multidimensional extraction of the morphological and statistical characteristics of respiratory patterns was conducted, resulting in 16 features that characterize representative patterns of a single respiratory trace. The six most informative features were subsequently extracted using a stepwise feature selection approach. The multiple-regression model was trained and tested based on a leave-one-subject-out cross-validation. The predicted quantitative improvements in PET imaging achieved an accuracy higher than 90% using a criterion with a dynamic error-tolerance range for SUV(peak) values. The results of this study suggest that our prediction framework could be applied to determine which patients would likely benefit from respiratory motion

  20. Prediction of core cancer genes using a hybrid of feature selection and machine learning methods.

    PubMed

    Liu, Y X; Zhang, N N; He, Y; Lun, L J

    2015-01-01

    Machine learning techniques are of great importance in the analysis of microarray expression data, and provide a systematic and promising way to predict core cancer genes. In this study, a hybrid strategy based on machine learning techniques was introduced to select a small set of informative genes, which leads to improved classification accuracy. First, feature filtering algorithms were applied to select a set of top-ranked genes, and then hierarchical clustering and collapsing of dense clusters were used to select core cancer genes. An empirical study shows that our approach is capable of selecting relatively few core cancer genes while making high-accuracy predictions. The biological significance of these genes was evaluated using systems biology analysis. Extensive functional pathway and network analyses have confirmed findings in previous studies and can bring new insights into common cancer mechanisms. PMID:26345818

  1. Spiking neurons can discover predictive features by aggregate-label learning.

    PubMed

    Gütig, Robert

    2016-03-01

    The brain routinely discovers sensory clues that predict opportunities or dangers. However, it is unclear how neural learning processes can bridge the typically long delays between sensory clues and behavioral outcomes. Here, I introduce a learning concept, aggregate-label learning, that enables biologically plausible model neurons to solve this temporal credit assignment problem. Aggregate-label learning matches a neuron's number of output spikes to a feedback signal that is proportional to the number of clues but carries no information about their timing. Aggregate-label learning outperforms stochastic reinforcement learning at identifying predictive clues and is able to solve unsegmented speech-recognition tasks. Furthermore, it allows unsupervised neural networks to discover reoccurring constellations of sensory features even when they are widely dispersed across space and time. PMID:26941324

  2. Application of Molecular Dynamics Simulations in Molecular Property Prediction I: Density and Heat of Vaporization

    PubMed Central

    Wang, Junmei; Tingjun, Hou

    2011-01-01

    Molecular mechanical force field (FF) methods are useful in studying condensed phase properties. They are complementary to experiment and can often go beyond experiment in atomic details. Even if a FF is specific to studying structures, dynamics and functions of biomolecules, it is still important for the FF to accurately reproduce the experimental liquid properties of small molecules that represent the chemical moieties of biomolecules. Otherwise, the force field may not describe the structures and energies of macromolecules in aqueous solutions properly. In this work, we have carried out a systematic study to evaluate the General AMBER Force Field (GAFF) in studying densities and heats of vaporization for a large set of organic molecules that covers the most common chemical functional groups. The latest techniques, such as the particle mesh Ewald (PME) for calculating electrostatic energies, and Langevin dynamics for scaling temperatures, have been applied in the molecular dynamics (MD) simulations. For density, the average percent error (APE) of 71 organic compounds is 4.43% when compared to the experimental values. More encouragingly, the APE drops to 3.43% after the exclusion of two outliers and four other compounds for which the experimental densities have been measured with pressures higher than 1.0 atm. For heat of vaporization, several protocols have been investigated and the best one, P4/ntt0, achieves an average unsigned error (AUE) and a root-mean-square error (RMSE) of 0.93 and 1.20 kcal/mol, respectively. How to reduce the prediction errors through proper van der Waals (vdW) parameterization has been discussed. An encouraging finding in vdW parameterization is that both densities and heats of vaporization approach their “ideal” values in a synchronous fashion when vdW parameters are tuned. The following hydration free energy calculation using thermodynamic integration further justifies the vdW refinement. We conclude that simple vdW parameterization
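
    The three error measures quoted above (APE, AUE and RMSE) can be written out as a short sketch. It assumes that APE is the mean of unsigned percent errors relative to experiment, which the abstract does not spell out, and the numbers in the example are invented rather than taken from the study.

```python
import math


def average_percent_error(predicted, experimental) -> float:
    """Mean of |pred - expt| / |expt|, expressed in percent (assumed definition)."""
    terms = [abs(p - e) / abs(e) for p, e in zip(predicted, experimental)]
    return 100.0 * sum(terms) / len(terms)


def average_unsigned_error(predicted, experimental) -> float:
    """Mean absolute deviation from experiment, in the units of the inputs."""
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def root_mean_square_error(predicted, experimental) -> float:
    """Square root of the mean squared deviation from experiment."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


# Invented heats of vaporization (kcal/mol) for two compounds.
predicted, experimental = [10.0, 12.0], [11.0, 10.0]
print(average_unsigned_error(predicted, experimental))
print(root_mean_square_error(predicted, experimental))
```

    RMSE is never smaller than AUE and grows faster when a few compounds deviate strongly, consistent with the abstract reporting 1.20 versus 0.93 kcal/mol for the same protocol.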

  3. Infrared images of reflection nebulae and Orion's bar: Fluorescent molecular hydrogen and the 3.3 micron feature

    NASA Technical Reports Server (NTRS)

    Burton, Michael G.; Moorhouse, Alan; Brand, P. W. J. L.; Roche, Patrick F.; Geballe, T. R.

    1989-01-01

    Images were obtained of the (fluorescent) molecular hydrogen 1-0 S(1) line, and of the 3.3 micron emission feature, in Orion's Bar and three reflection nebulae. The emission from these species appears to come from the same spatial locations in all sources observed. This suggests that the 3.3 micron feature is excited by the same energetic UV-photons which cause the molecular hydrogen to fluoresce.

  4. MERRF: Clinical features, muscle biopsy and molecular genetics in Brazilian patients.

    PubMed

    Lorenzoni, Paulo José; Scola, Rosana H; Kay, Cláudia S Kamoi; Arndt, Raquel C; Silvado, Carlos E; Werneck, Lineu C

    2011-05-01

    Myoclonic epilepsy with ragged red fibers (MERRF) is a mitochondrial disease that is characterized by myoclonic epilepsy with ragged red fibers (RRF) in muscle biopsies. The aim of this study was to analyze Brazilian patients with MERRF. Six patients with MERRF were studied and correlations between clinical findings, laboratory data, electrophysiology, histology and molecular features were examined. We found that blood lactate was increased in four patients. Electroencephalogram studies revealed generalized epileptiform discharges in five patients and generalized photoparoxysmal responses during intermittent photic stimulation in two patients. Muscle biopsies showed RRF in all patients using modified Gomori-trichrome and succinate dehydrogenase stains. Cytochrome c oxidase (COX) stain analysis indicated deficient activity in five patients and subsarcolemmal accumulation in one patient. Molecular analysis of the tRNA(Lys) gene with PCR/RFLP and direct sequencing showed the A8344G mutation of mtDNA in five patients. The presence of RRFs and COX deficiencies in muscle biopsies often confirmed the MERRF diagnosis. We conclude that molecular analysis of the tRNA(Lys) gene is an important criterion to help confirm the MERRF diagnosis. Furthermore, based on the findings of this study, we suggest a revision of the main characteristics of this disease. PMID:21303704

  5. Unsupervised Feature Learning Improves Prediction of Human Brain Activity in Response to Natural Images

    PubMed Central

    Güçlü, Umut; van Gerven, Marcel A. J.

    2014-01-01

    Encoding and decoding in functional magnetic resonance imaging has recently emerged as an area of research to noninvasively characterize the relationship between stimulus features and human brain activity. To overcome the challenge of formalizing what stimulus features should modulate single voxel responses, we introduce a general approach for making directly testable predictions of single voxel responses to statistically adapted representations of ecologically valid stimuli. These representations are learned from unlabeled data without supervision. Our approach is validated using a parsimonious computational model of (i) how early visual cortical representations are adapted to statistical regularities in natural images and (ii) how populations of these representations are pooled by single voxels. This computational model is used to predict single voxel responses to natural images and identify natural images from stimulus-evoked multiple voxel responses. We show that statistically adapted low-level sparse and invariant representations of natural images better span the space of early visual cortical representations and can be more effectively exploited in stimulus identification than hand-designed Gabor wavelets. Our results demonstrate the potential of our approach to better probe unknown cortical representations. PMID:25101625

  6. FFPred 3: feature-based function prediction for all Gene Ontology domains

    PubMed Central

    Cozzetto, Domenico; Minneci, Federico; Currant, Hannah; Jones, David T.

    2016-01-01

    Predicting protein function has been a major goal of bioinformatics for several decades, and it has gained fresh momentum thanks to recent community-wide blind tests aimed at benchmarking available tools on a genomic scale. Sequence-based predictors, especially those performing homology-based transfers, remain the most popular but increasing understanding of their limitations has stimulated the development of complementary approaches, which mostly exploit machine learning. Here we present FFPred 3, which is intended for assigning Gene Ontology terms to human protein chains, when homology with characterized proteins can provide little aid. Predictions are made by scanning the input sequences against an array of Support Vector Machines (SVMs), each examining the relationship between protein function and biophysical attributes describing secondary structure, transmembrane helices, intrinsically disordered regions, signal peptides and other motifs. This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation. The effectiveness of this approach is demonstrated through benchmarking experiments, and its usefulness is illustrated by analysing the potential functional consequences of alternative splicing in human and their relationship to patterns of biological features. PMID:27561554

  7. FFPred 3: feature-based function prediction for all Gene Ontology domains.

    PubMed

    Cozzetto, Domenico; Minneci, Federico; Currant, Hannah; Jones, David T

    2016-01-01

    Predicting protein function has been a major goal of bioinformatics for several decades, and it has gained fresh momentum thanks to recent community-wide blind tests aimed at benchmarking available tools on a genomic scale. Sequence-based predictors, especially those performing homology-based transfers, remain the most popular but increasing understanding of their limitations has stimulated the development of complementary approaches, which mostly exploit machine learning. Here we present FFPred 3, which is intended for assigning Gene Ontology terms to human protein chains, when homology with characterized proteins can provide little aid. Predictions are made by scanning the input sequences against an array of Support Vector Machines (SVMs), each examining the relationship between protein function and biophysical attributes describing secondary structure, transmembrane helices, intrinsically disordered regions, signal peptides and other motifs. This update features a larger SVM library that extends its coverage to the cellular component sub-ontology for the first time, prompted by the establishment of a dedicated evaluation category within the Critical Assessment of Functional Annotation. The effectiveness of this approach is demonstrated through benchmarking experiments, and its usefulness is illustrated by analysing the potential functional consequences of alternative splicing in human and their relationship to patterns of biological features. PMID:27561554

  8. Prediction of bacterial protein subcellular localization by incorporating various features into Chou's PseAAC and a backward feature selection approach.

    PubMed

    Li, Liqi; Yu, Sanjiu; Xiao, Weidong; Li, Yongsheng; Li, Maolin; Huang, Lan; Zheng, Xiaoqi; Zhou, Shiwen; Yang, Hua

    2014-09-01

    Information on the subcellular localization of bacterial proteins is essential for protein function prediction, genome annotation and drug design. Here we proposed a novel approach to predict the subcellular localization of bacterial proteins by fusing features from position-specific score matrix (PSSM), Gene Ontology (GO) and PROFEAT. A backward feature selection approach using the linear kernel of SVM was then used to rank the integrated feature vectors and extract optimal features. Finally, SVM was applied for predicting protein subcellular locations based on these optimal features. To validate the performance of our method, we employed jackknife cross-validation tests on three low similarity datasets, i.e., M638, Gneg1456 and Gpos523. The overall accuracies of 94.98%, 93.21%, and 94.57% were achieved for these three datasets, which are higher (from 1.8% to 10.9%) than those by state-of-the-art tools. Comparison results suggest that our method could serve as a very useful vehicle for expediting the prediction of bacterial protein subcellular localization. PMID:24929100

  9. Pathology Features in Bethesda Guidelines Predict Colorectal Cancer Microsatellite Instability: A Population-Based Study

    PubMed Central

    Jenkins, Mark A.; Hayashi, Shinichi; O’shea, Anne-Marie; Burgart, Lawrence J.; Smyrk, Tom C.; Shimizu, David; Waring, Paul M.; Ruszkiewicz, Andrew R.; Pollett, Aaron F.; Redston, Mark; Barker, Melissa A.; Baron, John A.; Casey, Graham R.; Dowty, James G.; Giles, Graham G.; Limburg, Paul; Newcomb, Polly; Young, Joanne P.; Walsh, Michael D.; Thibodeau, Stephen N.; Lindor, Noralane M.; Lemarchand, Loïc; Gallinger, Steven; Haile, Robert W.; Potter, John D.; Hopper, John L.; Jass, Jeremy R.

    2010-01-01

    Background & Aims The revised Bethesda guidelines for Lynch syndrome recommend microsatellite instability (MSI) testing of all colorectal cancers in patients diagnosed before age 50 years and of colorectal cancers diagnosed in patients between ages 50 and 59 years with particular pathology features. Our aim was to identify pathology and other features that independently predict high MSI (MSI-H). Methods Archival tissue from 1098 population-based colorectal cancers diagnosed before age 60 years was tested for MSI. Pathology features, site, and age at diagnosis were obtained. Multiple logistic regression was performed to determine the predictive value of each feature, as measured by an odds ratio (OR), from which a scoring system (MsPath) was developed to estimate the probability that a colorectal cancer is MSI-H. Results Fifteen percent of tumors (162) were MSI-H. Independent predictors were tumor-infiltrating lymphocytes (OR, 9.1; 95% confidence interval [CI], 5.9–14.1), proximal subsite (OR, 4.7; 95% CI, 3.1–7.3), mucinous histology (OR, 2.8; 95% CI, 1.7–4.8), poor differentiation (OR, 1.9; 95% CI, 1.2–3.1), Crohn’s-like reaction (OR, 1.9; 95% CI, 1.2–2.9), and diagnosis before age 50 years (OR, 1.9; 95% CI, 1.3–2.9). MsPath score ≥ 1.0 had a sensitivity of 93% and a specificity of 55% for MSI-H. Conclusions The probability that an individual colorectal cancer is MSI-H is predicted well by the MsPath score. There is little value in testing for DNA mismatch repair loss in tumors, or for germline mismatch repair mutations, for colorectal cancers diagnosed in patients before age 60 years with an MsPath score <1 (approximately 50%). Pathology can identify almost all MSI-H colorectal cancers diagnosed before age 60 years. PMID:17631130

  10. In silico predictive studies of mAHR congener binding using homology modelling and molecular docking.

    PubMed

    Panda, Roshni; Cleave, A Suneetha Susan; Suresh, P K

    2014-09-01

    The aryl hydrocarbon receptor (AHR) is one of the principal xenobiotic nuclear receptors and is responsible for the early events involved in the transcription of a complex set of genes comprising the CYP450 gene family. In the present computational study, homology modelling and molecular docking were carried out with the objective of predicting the relationship between the binding efficiency and the lipophilicity of different polychlorinated biphenyl (PCB) congeners and the AHR in silico. A homology model of the murine AHR was constructed by several automated servers and assessed by PROCHECK, ERRAT, VERIFY3D and WHAT IF. The resulting model of the AHR by MODWEB was used to carry out molecular docking of 36 PCB congeners using the PatchDock server. The lipophilicity of the congeners was predicted using the XLOGP3 tool. The results suggest that lipophilicity influences binding energy scores and is positively correlated with them. Score and Log P were correlated with r = +0.506 at the p = 0.01 level. In addition, the number of chlorine (Cl) atoms and Log P were highly correlated with r = +0.900 at the p = 0.01 level. The number of Cl atoms and scores also showed a moderate positive correlation of r = +0.481 at the p = 0.01 level. To the best of our knowledge, this is the first study employing PatchDock in the docking of AHR to the environmentally deleterious congeners and attempting to correlate structural features of the AHR with its biochemical properties with regard to PCBs. The results of this study are consistent with those of other computational studies reported in the previous literature, which suggest that a combination of docking, scoring and ranking of organic pollutants could be a possible predictive tool for investigating ligand-mediated toxicity, for their subsequent validation using wet lab-based studies. PMID:23081860
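
    The r values quoted in this record are correlation coefficients between paired per-congener quantities (docking score, Log P, number of Cl atoms). As a minimal sketch, assuming they are Pearson coefficients (the abstract does not name the statistic) and using made-up numbers rather than data from the study, the calculation looks like this:

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for five congeners (NOT data from the study).
n_chlorine = [2, 3, 4, 5, 6]
log_p = [5.1, 5.6, 6.3, 6.6, 7.2]

print(round(pearson_r(n_chlorine, log_p), 3))
```

    The variable names `n_chlorine` and `log_p` are illustrative only; the same function would apply to any of the three pairings reported above.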

  11. MELANCHOLIC DEPRESSION PREDICTION BY IDENTIFYING REPRESENTATIVE FEATURES IN METABOLIC AND MICROARRAY PROFILES WITH MISSING VALUES

    PubMed Central

    Nie, Zhi; Yang, Tao; Liu, Yashu; Lin, Binbin; Li, Qingyang; Narayan, Vaibhav A; Wittenberg, Gayle; Ye, Jieping

    2014-01-01

    Recent studies have revealed that melancholic depression, one major subtype of depression, is closely associated with the concentration of some metabolites and biological functions of certain genes and pathways. Meanwhile, recent advances in biotechnologies have allowed us to collect a large amount of genomic data, e.g., metabolites and microarray gene expression. With such a huge amount of information available, one approach that can give us new insights into the understanding of the fundamental biology underlying melancholic depression is to build disease status prediction models using classification or regression methods. However, the existence of strong empirical correlations, e.g., those exhibited by genes sharing the same biological pathway in microarray profiles, tremendously limits the performance of these methods. Furthermore, the occurrence of missing values which are ubiquitous in biomedical applications further complicates the problem. In this paper, we hypothesize that the problem of missing values might in some way benefit from the correlation between the variables and propose a method to learn a compressed set of representative features through an adapted version of sparse coding which is capable of identifying correlated variables and addressing the issue of missing values simultaneously. An efficient algorithm is also developed to solve the proposed formulation. We apply the proposed method on metabolic and microarray profiles collected from a group of subjects consisting of both patients with melancholic depression and healthy controls. Results show that the proposed method can not only produce meaningful clusters of variables but also generate a set of representative features that achieve superior classification performance over those generated by traditional clustering and data imputation techniques. In particular, on both datasets, we found that in comparison with the competing algorithms, the representative features learned by the proposed

  12. Melancholic depression prediction by identifying representative features in metabolic and microarray profiles with missing values.

    PubMed

    Nie, Zhi; Yang, Tao; Liu, Yashu; Li, Qingyang; Narayan, Vaibhav A; Wittenberg, Gayle; Ye, Jieping

    2015-01-01

    Recent studies have revealed that melancholic depression, one major subtype of depression, is closely associated with the concentration of some metabolites and biological functions of certain genes and pathways. Meanwhile, recent advances in biotechnologies have allowed us to collect a large amount of genomic data, e.g., metabolites and microarray gene expression. With such a huge amount of information available, one approach that can give us new insights into the understanding of the fundamental biology underlying melancholic depression is to build disease status prediction models using classification or regression methods. However, the existence of strong empirical correlations, e.g., those exhibited by genes sharing the same biological pathway in microarray profiles, tremendously limits the performance of these methods. Furthermore, the occurrence of missing values which are ubiquitous in biomedical applications further complicates the problem. In this paper, we hypothesize that the problem of missing values might in some way benefit from the correlation between the variables and propose a method to learn a compressed set of representative features through an adapted version of sparse coding which is capable of identifying correlated variables and addressing the issue of missing values simultaneously. An efficient algorithm is also developed to solve the proposed formulation. We apply the proposed method on metabolic and microarray profiles collected from a group of subjects consisting of both patients with melancholic depression and healthy controls. Results show that the proposed method can not only produce meaningful clusters of variables but also generate a set of representative features that achieve superior classification performance over those generated by traditional clustering and data imputation techniques. In particular, on both datasets, we found that in comparison with the competing algorithms, the representative features learned by the proposed

  13. Molecular features assisting in diagnosis, surgery, and treatment decision making in low-grade gliomas.

    PubMed

    Chen, Ricky; Ravindra, Vijay M; Cohen, Adam L; Jensen, Randy L; Salzman, Karen L; Prescot, Andrew P; Colman, Howard

    2015-03-01

    The preferred management of suspected low-grade gliomas (LGGs) has been disputed, and the implications of molecular changes for medical and surgical management of LGGs are important to consider. Current strategies that make use of molecular markers and imaging techniques and therapeutic considerations offer additional options for management of LGGs. Mutations in the isocitrate dehydrogenase 1 and 2 (IDH1 and IDH2) genes suggest a role for this abnormal metabolic pathway in the pathogenesis and progression of these primary brain tumors. Use of magnetic resonance spectroscopy can provide preoperative detection of IDH-mutated gliomas and affect surgical planning. In addition, IDH1 and IDH2 mutation status may have an effect on surgical resectability of gliomas. The IDH-mutated tumors exhibit better prognosis throughout every grade of glioma, and mutation may be an early genetic event, preceding lineage-specific secondary and tertiary alterations that transform LGGs into secondary glioblastomas. The O6-methylguanine-DNAmethyltransferase (MGMT) promoter methylation and 1p19q codeletion status can predict sensitivity to chemotherapy and radiation in low- and intermediate-grade gliomas. Thus, these recent advances, which have led to a better understanding of how molecular, genetic, and epigenetic alterations influence the pathogenicity of the different histological grades of gliomas, can lead to better prognostication and may lead to specific targeted surgical interventions and medical therapies. PMID:25727224

  14. Larval description of Drusus bosnicus Klapálek 1899 (Trichoptera: Limnephilidae), with distributional, molecular and ecological features.

    PubMed

    Kučinić, Mladen; Previšić, Ana; Graf, Wolfram; Mihoci, Iva; Šoufek, Marin; Stanić-Koštroman, Svjetlana; Lelo, Suvad; Vitecek, Simon; Waringer, Johann

    2015-01-01

    In this study we present morphological, molecular and ecological features of the last instar larvae of Drusus bosnicus with data about distribution of this species in Bosnia and Herzegovina. We also included the most important diagnostic features enabling separation of larvae of D. bosnicus from larvae of the other European Drusinae and Trichoptera species. PMID:26249056

  15. Larval description of Drusus bosnicus Klapálek 1899 (Trichoptera: Limnephilidae), with distributional, molecular and ecological features

    PubMed Central

    KUČINIĆ, MLADEN; PREVIŠIĆ, ANA; GRAF, WOLFRAM; MIHOCI, IVA; ŠOUFEK, MARIN; STANIĆ-KOŠTROMAN, SVJETLANA; LELO, SUVAD; VITECEK, SIMON; WARINGER, JOHANN

    2016-01-01

    In this study we present morphological, molecular and ecological features of the last instar larvae of Drusus bosnicus with data about distribution of this species in Bosnia and Herzegovina. We also included the most important diagnostic features enabling separation of larvae of D. bosnicus from larvae of the other European Drusinae and Trichoptera species. PMID:26249056

  16. PGlcS: Prediction of protein O-GlcNAcylation sites with multiple features and analysis.

    PubMed

    Zhao, Xiaowei; Ning, Qiao; Chai, Haiting; Ai, Meiyue; Ma, Zhiqiang

    2015-09-01

    As a widespread type of protein post-translational modification, O-GlcNAcylation plays crucial regulatory roles in almost all cellular processes and is related to some diseases. To deeply understand O-GlcNAcylation mechanisms, identification of substrates and specific O-GlcNAcylated sites is crucial. Experimental identification is expensive and time-consuming, so computational prediction of O-GlcNAcylated sites has considerable value. In this work, we developed a novel O-GlcNAcylated sites predictor called PGlcS (Prediction of O-GlcNAcylated Sites) by using k-means clustering to obtain informative and reliable negative samples, and a support vector machine classifier combined with a two-step feature selection. The performance of PGlcS was evaluated using an independent testing dataset, resulting in a sensitivity of 64.62%, a specificity of 68.4%, an accuracy of 68.37%, and a Matthews correlation coefficient of 0.0697, which demonstrated PGlcS was very promising for predicting O-GlcNAcylated sites. The datasets and source code were available in Supplementary information. PMID:26116363
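
    The four figures reported in this record are all derived from one confusion matrix. The sketch below shows the standard definitions, and how an accuracy near 68% can coexist with a Matthews correlation coefficient below 0.1 when positives are rare. The counts are hypothetical and chosen only for illustration; the abstract does not give the test-set composition, so class imbalance is an assumption here, not a statement about the PGlcS data.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical, heavily imbalanced test set: 10 positives, 990 negatives.
imbalanced = binary_metrics(tp=6, fn=4, tn=680, fp=310)
for name, value in imbalanced.items():
    print(f"{name}: {value:.4f}")
```

    With these made-up counts, sensitivity and specificity are both moderate, yet most predicted positives are false positives, which is what pulls the MCC towards zero.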

  17. Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses

    PubMed Central

    Sakowski, Eric G.; Munsell, Erik V.; Hyatt, Mara; Kress, William; Williamson, Shannon J.; Nasko, Daniel J.; Polson, Shawn W.; Wommack, K. Eric

    2014-01-01

    Virioplankton play a crucial role in aquatic ecosystems as top-down regulators of bacterial populations and agents of horizontal gene transfer and nutrient cycling. However, the biology and ecology of virioplankton populations in the environment remain poorly understood. Ribonucleotide reductases (RNRs) are ancient enzymes that reduce ribonucleotides to deoxyribonucleotides and thus prime DNA synthesis. Composed of three classes according to O2 reactivity, RNRs can be predictive of the physiological conditions surrounding DNA synthesis. RNRs are universal among cellular life, common within viral genomes and virioplankton shotgun metagenomes (viromes), and estimated to occur within >90% of the dsDNA virioplankton sampled in this study. RNRs occur across diverse viral groups, including all three morphological families of tailed phages, making these genes attractive for studies of viral diversity. Differing patterns in virioplankton diversity were clear from RNRs sampled across a broad oceanic transect. The most abundant RNRs belonged to novel lineages of podoviruses infecting α-proteobacteria, a bacterial class critical to oceanic carbon cycling. RNR class was predictive of phage morphology among cyanophages and RNR distribution frequencies among cyanophages were largely consistent with the predictions of the “kill the winner–cost of resistance” model. RNRs were also identified for the first time to our knowledge within ssDNA viromes. These data indicate that RNR polymorphism provides a means of connecting the biological and ecological features of virioplankton populations. PMID:25313075

  18. Time Score: A New Feature for Link Prediction in Social Networks

    NASA Astrophysics Data System (ADS)

    Munasinghe, Lankeshwara; Ichise, Ryutaro

    Link prediction in social networks, such as friendship networks and coauthorship networks, has recently attracted a great deal of attention. There have been numerous attempts to address the problem of link prediction through diverse approaches. In the present paper, we focus on the temporal behavior of the link strength, particularly the relationship between the time stamps of interactions or links and the temporal behavior of link strength and how link strength affects future link evolution. Most previous studies have not sufficiently discussed either the impact of time stamps of the interactions or time stamps of the links on link evolution. The gap between the current time and the time stamps of the interactions or links is also important to link evolution. In the present paper, we introduce a new time-aware feature, referred to as time score, that captures the important aspects of time stamps of interactions and the temporality of the link strengths. We also analyze the effectiveness of time score with different parameter settings for different network data sets. The results of the analysis revealed that the time score was sensitive to different networks and different time measures. We applied time score to two social network data sets, namely, Facebook friendship network data set and a coauthorship network data set. The results revealed a significant improvement in predicting future links.

  19. Accurate single-sequence prediction of solvent accessible surface area using local and global features

    PubMed Central

    Faraggi, Eshel; Zhou, Yaoqi; Kloczkowski, Andrzej

    2014-01-01

    We present a new approach for predicting the Accessible Surface Area (ASA) using a General Neural Network (GENN). The novelty of the new approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Instead we use solely sequential window information and global features such as single-residue and two-residue compositions of the chain. The resulting predictor is both much more efficient than sequence alignment based predictors and of comparable accuracy to them. Introduction of the global inputs significantly helps achieve this comparable accuracy. The predictor, termed ASAquick, is tested on predicting the ASA of globular proteins and found to perform similarly well for so-called easy and hard cases, indicating generalizability and possible usability for de-novo protein structure prediction. The source code and Linux executables for GENN and ASAquick are available from Research and Information Systems at http://mamiris.com, from the SPARKS Lab at http://sparks-lab.org, and from the Battelle Center for Mathematical Medicine at http://mathmed.org. PMID:25204636
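
    The inputs described in this record (a sequence window around each residue plus chain-level compositions) can be computed from the sequence alone. The sketch below is illustrative only and is not the ASAquick encoding: it assumes that "two-residue composition" means frequencies of adjacent residue pairs, and the padding symbol `X` and the function names are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_residue_composition(seq):
    """Fraction of the chain made up by each standard residue (20 values)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def two_residue_composition(seq):
    """Fraction of adjacent residue pairs of each type (400 values).

    Assumption: pairs are adjacent positions (i, i+1).
    """
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    n_pairs = len(seq) - 1
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard symbols
            counts[pair] += 1
    return {pair: c / n_pairs for pair, c in counts.items()}


def residue_window(seq, center, half_width, pad="X"):
    """Residues from center - half_width to center + half_width, padded at the ends."""
    padded = pad * half_width + seq + pad * half_width
    return padded[center:center + 2 * half_width + 1]


chain = "ACDEFGHIKACD"  # toy sequence, not a real protein
print(residue_window(chain, center=0, half_width=2))
print(single_residue_composition(chain)["A"])
print(two_residue_composition(chain)["AC"])
```

    Because no alignment is needed, the cost of these features grows only with chain length, which is consistent with the efficiency claim in the abstract.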

  20. Accurate single-sequence prediction of solvent accessible surface area using local and global features.

    PubMed

    Faraggi, Eshel; Zhou, Yaoqi; Kloczkowski, Andrzej

    2014-11-01

    We present a new approach for predicting the Accessible Surface Area (ASA) using a General Neural Network (GENN). The novelty of the new approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Instead we use solely sequential window information and global features such as single-residue and two-residue compositions of the chain. The resulting predictor is both much more efficient than sequence alignment-based predictors and of comparable accuracy to them. Introduction of the global inputs significantly helps achieve this comparable accuracy. The predictor, termed ASAquick, is tested on predicting the ASA of globular proteins and found to perform similarly well for so-called easy and hard cases, indicating generalizability and possible usability for de-novo protein structure prediction. The source code and Linux executables for GENN and ASAquick are available from Research and Information Systems at http://mamiris.com, from the SPARKS Lab at http://sparks-lab.org, and from the Battelle Center for Mathematical Medicine at http://mathmed.org. PMID:25204636

  1. Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses.

    PubMed

    Sakowski, Eric G; Munsell, Erik V; Hyatt, Mara; Kress, William; Williamson, Shannon J; Nasko, Daniel J; Polson, Shawn W; Wommack, K Eric

    2014-11-01

    Virioplankton play a crucial role in aquatic ecosystems as top-down regulators of bacterial populations and agents of horizontal gene transfer and nutrient cycling. However, the biology and ecology of virioplankton populations in the environment remain poorly understood. Ribonucleotide reductases (RNRs) are ancient enzymes that reduce ribonucleotides to deoxyribonucleotides and thus prime DNA synthesis. Composed of three classes according to O2 reactivity, RNRs can be predictive of the physiological conditions surrounding DNA synthesis. RNRs are universal among cellular life, common within viral genomes and virioplankton shotgun metagenomes (viromes), and estimated to occur within >90% of the dsDNA virioplankton sampled in this study. RNRs occur across diverse viral groups, including all three morphological families of tailed phages, making these genes attractive for studies of viral diversity. Differing patterns in virioplankton diversity were clear from RNRs sampled across a broad oceanic transect. The most abundant RNRs belonged to novel lineages of podoviruses infecting α-proteobacteria, a bacterial class critical to oceanic carbon cycling. RNR class was predictive of phage morphology among cyanophages and RNR distribution frequencies among cyanophages were largely consistent with the predictions of the "kill the winner-cost of resistance" model. RNRs were also identified for the first time to our knowledge within ssDNA viromes. These data indicate that RNR polymorphism provides a means of connecting the biological and ecological features of virioplankton populations. PMID:25313075

  2. Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards

    PubMed Central

    Plitt, Mark; Barnes, Kelly Anne; Martin, Alex

    2014-01-01

    Objectives Autism spectrum disorders (ASD) are diagnosed based on early-manifesting clinical symptoms, including markedly impaired social communication. We assessed the viability of resting-state functional MRI (rs-fMRI) connectivity measures as diagnostic biomarkers for ASD and investigated which connectivity features are predictive of a diagnosis. Methods Rs-fMRI scans from 59 high functioning males with ASD and 59 age- and IQ-matched typically developing (TD) males were used to build a series of machine learning classifiers. Classification features were obtained using 3 sets of brain regions. Another set of classifiers was built from participants' scores on behavioral metrics. An additional age and IQ-matched cohort of 178 individuals (89 ASD; 89 TD) from the Autism Brain Imaging Data Exchange (ABIDE) open-access dataset (http://fcon_1000.projects.nitrc.org/indi/abide/) were included for replication. Results High classification accuracy was achieved through several rs-fMRI methods (peak accuracy 76.67%). However, classification via behavioral measures consistently surpassed rs-fMRI classifiers (peak accuracy 95.19%). The class probability estimates, P(ASD|fMRI data), from brain-based classifiers significantly correlated with scores on a measure of social functioning, the Social Responsiveness Scale (SRS), as did the most informative features from 2 of the 3 sets of brain-based features. The most informative connections predominantly originated from regions strongly associated with social functioning. Conclusions While individuals can be classified as having ASD with statistically significant accuracy from their rs-fMRI scans alone, this method falls short of biomarker standards. Classification methods provided further evidence that ASD functional connectivity is characterized by dysfunction of large-scale functional networks, particularly those involved in social information processing. PMID:25685703

  3. An approach to predict Sudden Cardiac Death (SCD) using time domain and bispectrum features from HRV signal.

    PubMed

    Houshyarifar, Vahid; Chehel Amirani, Mehdi

    2016-08-12

    In this paper we present a method to predict Sudden Cardiac Arrest (SCA) with higher order spectral (HOS) and linear (time-domain) features extracted from the heart rate variability (HRV) signal. Predicting the occurrence of SCA is important in order to reduce the probability of Sudden Cardiac Death (SCD). This work aims to predict SCA five minutes before its onset. The method consists of four steps: pre-processing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the HRV signal is extracted. In the second step, bispectrum features of the HRV signal and time-domain features are obtained. Six features are extracted from the bispectrum and two features from the time domain. In the next step, these features are reduced to one feature by the linear discriminant analysis (LDA) technique. Finally, KNN and support vector machine-based classifiers are used to classify the HRV signals. We used two databases, namely the MIT/BIH Sudden Cardiac Death (SCD) Database and the Physiobank Normal Sinus Rhythm (NSR) database. In this work we achieved prediction of SCD occurrence six minutes before the SCA with an accuracy over 91%. PMID:27567781

  4. Unveiling atomic-scale features of inherent heterogeneity in metallic glass by molecular dynamics simulations

    NASA Astrophysics Data System (ADS)

    Hu, Y. C.; Guan, P. F.; Li, M. Z.; Liu, C. T.; Yang, Y.; Bai, H. Y.; Wang, W. H.

    2016-06-01

    Heterogeneity is commonly believed to be intrinsic to metallic glasses (MGs). Nevertheless, how to distinguish and characterize the heterogeneity at the atomic level is still debated. Based on the extensive molecular dynamics simulations that combine isoconfigurational ensemble and atomic pinning methods, we directly reveal that MG contains flow units and the elastic matrix which can be well distinguished by their distinctive atomic-level responsiveness and mechanical performance. The microscopic features of the flow units, such as the shape, spatial distribution dimensionality, and correlation length, are characterized from atomic position analyses. Furthermore, the correlation between the flow units and the landscape of energy state, free volume, atomic-level stress, and especially the local bond orientational order parameter is discussed.

  5. A data-driven feature extraction framework for predicting the severity of condition of congestive heart failure patients.

    PubMed

    Sideris, Costas; Alshurafa, Nabil; Pourhomayoun, Mohammad; Shahmohammadi, Farhad; Samy, Lauren; Sarrafzadeh, Majid

    2015-08-01

    In this paper, we propose a novel methodology for utilizing disease diagnostic information to predict severity of condition for Congestive Heart Failure (CHF) patients. Our methodology relies on a novel, clustering-based, feature extraction framework using disease diagnostic information. To reduce the dimensionality we identify disease clusters using co-occurrence frequencies. We then utilize these clusters as features to predict patient severity of condition. We build our clustering and feature extraction algorithm using the 2012 National Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), which contains 7 million discharge records and ICD-9-CM codes. The proposed framework is tested on Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 patients. We compare our cluster-based feature set with another that incorporates the Charlson comorbidity score as a feature and demonstrate an accuracy improvement of up to 14% in the predictability of the severity of condition. PMID:26736808

  6. Predicting cytotoxicity of PAMAM dendrimers using molecular descriptors

    PubMed Central

    Jones, David E; Ghandehari, Hamidreza

    2015-01-01

    Summary The use of data mining techniques in the field of nanomedicine has been very limited. In this paper we demonstrate that data mining techniques can be used for the development of predictive models of the cytotoxicity of poly(amido amine) (PAMAM) dendrimers using their chemical and structural properties. We present predictive models developed using 103 PAMAM dendrimer cytotoxicity values that were extracted from twelve cancer nanomedicine journal articles. The results indicate that data mining and machine learning can be effectively used to predict the cytotoxicity of PAMAM dendrimers on Caco-2 cells. PMID:26665059

  7. Sequence features accurately predict genome-wide MeCP2 binding in vivo.

    PubMed

    Rube, H Tomas; Lee, Wooje; Hejna, Miroslav; Chen, Huaiyang; Yasui, Dag H; Hess, John F; LaSalle, Janine M; Song, Jun S; Gong, Qizhi

    2016-01-01

    Methyl-CpG binding protein 2 (MeCP2) is critical for proper brain development and expressed at near-histone levels in neurons, but the mechanism of its genomic localization remains poorly understood. Using high-resolution MeCP2-binding data, we show that DNA sequence features alone can predict binding with 88% accuracy. Integrating MeCP2 binding and DNA methylation in a probabilistic graphical model, we demonstrate that previously reported genome-wide association with methylation is in part due to MeCP2's affinity to GC-rich chromatin, a result replicated using published data. Furthermore, MeCP2 co-localizes with nucleosomes. Finally, MeCP2 binding downstream of promoters correlates with increased expression in Mecp2-deficient neurons. PMID:27008915

  8. Sequence features accurately predict genome-wide MeCP2 binding in vivo

    PubMed Central

    Rube, H. Tomas; Lee, Wooje; Hejna, Miroslav; Chen, Huaiyang; Yasui, Dag H.; Hess, John F.; LaSalle, Janine M.; Song, Jun S.; Gong, Qizhi

    2016-01-01

    Methyl-CpG binding protein 2 (MeCP2) is critical for proper brain development and expressed at near-histone levels in neurons, but the mechanism of its genomic localization remains poorly understood. Using high-resolution MeCP2-binding data, we show that DNA sequence features alone can predict binding with 88% accuracy. Integrating MeCP2 binding and DNA methylation in a probabilistic graphical model, we demonstrate that previously reported genome-wide association with methylation is in part due to MeCP2's affinity to GC-rich chromatin, a result replicated using published data. Furthermore, MeCP2 co-localizes with nucleosomes. Finally, MeCP2 binding downstream of promoters correlates with increased expression in Mecp2-deficient neurons. PMID:27008915

  9. Molecular effective coverage surface area of optical clearing agents for predicting optical clearing potential

    NASA Astrophysics Data System (ADS)

    Feng, Wei; Ma, Ning; Zhu, Dan

    2015-03-01

    Improving methods for predicting optical clearing agents has an important impact on the tissue optical clearing technique. Molecular dynamics simulation is one of the most convincing and simplest approaches to predicting the optical clearing potential (OCP) of agents, by analyzing the hydrogen bonds, hydrogen bridges and hydrogen bridge types formed between agents and collagen. However, these analysis methods still suffer from some problems, such as the analysis of cyclic molecules, because of molecular conformation. In this study, a molecular effective coverage surface area based on molecular dynamics simulation is proposed to predict the potential of optical clearing agents. Several typical cyclic molecules (fructose, glucose) and chain molecules (sorbitol, xylitol) were analyzed by calculating their molecular effective coverage surface area, hydrogen bonds, hydrogen bridges and hydrogen bridge types, respectively. To verify this analysis method, the optical clearing efficacy of in vitro skin samples was measured using a 1951 USAF resolution test target after 25 min of immersion in solutions of fructose, glucose, sorbitol and xylitol at a concentration of 3.5 M. The experimental results agree with the predictions of the molecular effective coverage surface area. Further comparison of the molecular effective coverage surface area with the other parameters shows that it performs better in predicting the OCP of agents.

  10. Assessment of two mammographic density related features in predicting near-term breast cancer risk

    NASA Astrophysics Data System (ADS)

    Zheng, Bin; Sumkin, Jules H.; Zuley, Margarita L.; Wang, Xingwei; Klym, Amy H.; Gur, David

    2012-02-01

    In order to establish a personalized breast cancer screening program, it is important to develop risk models that have high discriminatory power in predicting the likelihood of a woman developing an imaging detectable breast cancer in near-term (e.g., <3 years after a negative examination in question). In epidemiology-based breast cancer risk models, mammographic density is considered the second highest breast cancer risk factor (second to woman's age). In this study we explored a new feature, namely bilateral mammographic density asymmetry, and investigated the feasibility of predicting near-term screening outcome. The database consisted of 343 negative examinations, of which 187 depicted cancers that were detected during the subsequent screening examination and 155 that remained negative. We computed the average pixel value of the segmented breast areas depicted on each cranio-caudal view of the initial negative examinations. We then computed the mean and difference mammographic density for paired bilateral images. Using woman's age, subjectively rated density (BIRADS), and computed mammographic density related features we compared classification performance in estimating the likelihood of detecting cancer during the subsequent examination using areas under the ROC curves (AUC). The AUCs were 0.63+/-0.03, 0.54+/-0.04, 0.57+/-0.03, 0.68+/-0.03 when using woman's age, BIRADS rating, computed mean density and difference in computed bilateral mammographic density, respectively. Performance increased to 0.62+/-0.03 and 0.72+/-0.03 when we fused mean and difference in density with woman's age. The results suggest that, in this study, bilateral mammographic tissue density is a significantly stronger (p<0.01) risk indicator than both woman's age and mean breast density.
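
    The abstract above compares features by their area under the ROC curve (AUC). As a minimal sketch of what that number measures, the AUC can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half. The function name and the score values below are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical feature values (e.g. a bilateral density difference);
# these numbers are invented for illustration only.
cancer_cases = [0.9, 0.7, 0.6, 0.4]
negative_cases = [0.5, 0.3, 0.2, 0.4]

auc = roc_auc(cancer_cases, negative_cases)
print(f"AUC = {auc:.3f}")
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.54 to 0.72 should be read.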

  11. Body Composition Features Predict Overall Survival in Patients With Hepatocellular Carcinoma

    PubMed Central

    Singal, Amit G; Zhang, Peng; Waljee, Akbar K; Ananthakrishnan, Lakshmi; Parikh, Neehar D; Sharma, Pratima; Barman, Pranab; Krishnamurthy, Venkataramu; Wang, Lu; Wang, Stewart C; Su, Grace L

    2016-01-01

    Objectives: Existing prognostic models for patients with hepatocellular carcinoma (HCC) have limitations. Analytic morphomics, a novel process to measure body composition using computational image-processing algorithms, may offer further prognostic information. The aim of this study was to develop and validate a prognostic model for HCC patients using body composition features and objective clinical information. Methods: Using computed tomography scans from a cohort of HCC patients at the VA Ann Arbor Healthcare System between January 2006 and December 2013, we developed a prognostic model using analytic morphomics and routine clinical data based on multivariate Cox regression and regularization methods. We assessed model performance using C-statistics and validated predicted survival probabilities. We validated model performance in an external cohort of HCC patients from Parkland Hospital, a safety-net health system in Dallas County. Results: The derivation cohort consisted of 204 HCC patients (20.1% Barcelona Clinic Liver Cancer classification (BCLC) 0/A), and the validation cohort had 225 patients (22.2% BCLC 0/A). The analytic morphomics model had good prognostic accuracy in the derivation cohort (C-statistic 0.80, 95% confidence interval (CI) 0.71–0.89) and external validation cohort (C-statistic 0.75, 95% CI 0.68–0.82). The accuracy of the analytic morphomics model was significantly higher than that of TNM and BCLC staging systems in derivation (P<0.001 for both) and validation (P<0.001 for both) cohorts. For calibration, mean absolute errors in predicted 1-year survival probabilities were 5.3% (90% quantile of 7.5%) and 7.6% (90% quantile of 12.5%) in the derivation and validation cohorts, respectively. Conclusion: Body composition features, combined with readily available clinical data, can provide valuable prognostic information for patients with newly diagnosed HCC. PMID:27228403

  12. Spatial Habitat Features Derived from Multiparametric Magnetic Resonance Imaging Data Are Associated with Molecular Subtype and 12-Month Survival Status in Glioblastoma Multiforme

    PubMed Central

    Lee, Joonsang; Narang, Shivali; Martinez, Juan; Rao, Ganesh; Rao, Arvind

    2015-01-01

    Glioblastoma multiforme is one of the most common and aggressive malignant brain tumors. Despite multimodality treatment such as radiation therapy and chemotherapy (temozolomide: TMZ), the median survival of glioblastoma patients is less than 15 months. In this study, we investigated the association of measures of spatial diversity, derived from spatial point pattern analysis of multiparametric magnetic resonance imaging (MRI) data, with molecular status as well as 12-month survival in glioblastoma. We obtained 27 measures of spatial proximity (diversity) via spatial point pattern analysis of multiparametric T1 post-contrast and T2 fluid-attenuated inversion recovery MRI data. These measures were used to predict 12-month survival status (≤12 or >12 months) in 74 glioblastoma patients. Kaplan-Meier and receiver operating characteristic analyses were used to assess the relationship between derived spatial features and 12-month survival status as well as molecular subtype status in patients with glioblastoma. Kaplan-Meier survival analysis revealed that 14 spatial features were capable of stratifying overall survival in a statistically significant manner. For prediction of 12-month survival status based on these diversity indices, sensitivity and specificity were 0.86 and 0.64, respectively. The area under the receiver operating characteristic curve and the accuracy were 0.76 and 0.75, respectively. For prediction of molecular subtype status, the proneural subtype showed the highest accuracy (0.93) among all molecular subtypes based on receiver operating characteristic analysis. We find that measures of spatial diversity from point pattern analysis of intensity habitats from T1 post-contrast and T2 fluid-attenuated inversion recovery images are associated with both tumor subtype status and 12-month survival status and may therefore be useful indicators of patient prognosis, in addition to providing potential guidance for molecularly-targeted therapies in
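
    The sensitivity, specificity and accuracy quoted in the abstract above all derive from the four cells of a binary confusion matrix. The sketch below shows those definitions; the function name and the counts are hypothetical, chosen only to make the arithmetic easy to follow, and are not the study's patient counts.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of true positives detected
    specificity = tn / (tn + fp)          # fraction of true negatives detected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not taken from the abstract).
sens, spec, acc = binary_metrics(tp=8, fn=2, tn=6, fp=4)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```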

  13. Clinical, Pathological, and Molecular Features of Lung Adenocarcinomas with AXL Expression.

    PubMed

    Sato, Katsuaki; Suda, Kenichi; Shimizu, Shigeki; Sakai, Kazuko; Mizuuchi, Hiroshi; Tomizawa, Kenji; Takemoto, Toshiki; Nishio, Kazuto; Mitsudomi, Tetsuya

    2016-01-01

    The receptor tyrosine kinase AXL is a member of the Tyro3-Axl-Mer receptor tyrosine kinase subfamily. AXL affects several cellular functions, including growth and migration. AXL aberration is reportedly a marker for poor prognosis and treatment resistance in various cancers. In this study, we analyzed clinical, pathological, and molecular features of AXL expression in lung adenocarcinomas (LADs). We examined 161 LAD specimens from patients who underwent pulmonary resections. When AXL protein expression was quantified (0, 1+, 2+, 3+) according to immunohistochemical staining intensity, results were 0: 35%; 1+: 20%; 2+: 37%; and 3+: 7% for the 161 samples. AXL expression status did not correlate with clinical features, including smoking status and pathological stage. However, patients whose specimens showed strong AXL expression (3+) had markedly poorer prognoses than other groups (P = 0.0033). Strong AXL expression was also significantly associated with downregulation of E-cadherin (P = 0.025) and CD44 (P = 0.0010). In addition, 9 of 12 specimens with strong AXL expression had driver gene mutations (6 with EGFR, 2 with KRAS, 1 with ALK). In conclusion, we found that strong AXL expression in surgically resected LADs was a predictor of poor prognosis. LADs with strong AXL expression were characterized by mesenchymal status, higher expression of stem-cell-like markers, and frequent driver gene mutations. PMID:27100677

  14. Clinical, Pathological, and Molecular Features of Lung Adenocarcinomas with AXL Expression

    PubMed Central

    Suda, Kenichi; Shimizu, Shigeki; Sakai, Kazuko; Mizuuchi, Hiroshi; Tomizawa, Kenji; Takemoto, Toshiki; Nishio, Kazuto; Mitsudomi, Tetsuya

    2016-01-01

    The receptor tyrosine kinase AXL is a member of the Tyro3-Axl-Mer receptor tyrosine kinase subfamily. AXL affects several cellular functions, including growth and migration. AXL aberration is reportedly a marker for poor prognosis and treatment resistance in various cancers. In this study, we analyzed clinical, pathological, and molecular features of AXL expression in lung adenocarcinomas (LADs). We examined 161 LAD specimens from patients who underwent pulmonary resections. When AXL protein expression was quantified (0, 1+, 2+, 3+) according to immunohistochemical staining intensity, results were 0: 35%; 1+: 20%; 2+: 37%; and 3+: 7% for the 161 samples. AXL expression status did not correlate with clinical features, including smoking status and pathological stage. However, patients whose specimens showed strong AXL expression (3+) had markedly poorer prognoses than other groups (P = 0.0033). Strong AXL expression was also significantly associated with downregulation of E-cadherin (P = 0.025) and CD44 (P = 0.0010). In addition, 9 of 12 specimens with strong AXL expression had driver gene mutations (6 with EGFR, 2 with KRAS, 1 with ALK). In conclusion, we found that strong AXL expression in surgically resected LADs was a predictor of poor prognosis. LADs with strong AXL expression were characterized by mesenchymal status, higher expression of stem-cell-like markers, and frequent driver gene mutations. PMID:27100677

  15. What catches a radiologist's eye? A comprehensive comparison of feature types for saliency prediction

    NASA Astrophysics Data System (ADS)

    Alzubaidi, Mohammad; Balasubramanian, Vineeth; Patel, Ameet; Panchanathan, Sethuraman; Black, John A., Jr.

    2010-03-01

    Experienced radiologists are in short supply, and are sometimes called upon to read many images in a short amount of time. This leaves them with limited time per image, and can lead to fatigue and stress, which can be sources of error, as they overlook subtle abnormalities that they otherwise might not miss. Another factor in error rates is called satisfaction of search, where a radiologist misses a second (typically subtle) abnormality after finding the first. These types of errors are due primarily to a lack of attention to an important region of the image during the search. In this paper we discuss the use of eye tracker technology, in combination with image analysis and machine learning techniques, to learn what types of features catch the eye of experienced radiologists when reading chest x-rays for diagnostic purposes, and to then use that information to produce saliency maps that predict what regions of each image might be most interesting to radiologists. We found that, out of 13 popular feature types that are widely extracted to characterize images, 4 are particularly useful for this task: (1) Localized Edge Orientation Histograms, (2) Haar Wavelets, (3) Gabor Filters, and (4) Steerable Filters.

  16. Predicting DNA binding proteins using support vector machine with hybrid fractal features.

    PubMed

    Niu, Xiao-Hui; Hu, Xue-Hai; Shi, Feng; Xia, Jing-Bo

    2014-02-21

    DNA-binding proteins play a vitally important role in many biological processes. Prediction of DNA-binding proteins from amino acid sequence is a significant but not yet well-resolved scientific problem. Chaos game representation (CGR) investigates the patterns hidden in protein sequences, and visually reveals previously unknown structure. Fractal dimensions (FD) are good tools to measure sizes of complex, highly irregular geometric objects. In order to extract the intrinsic correlation with DNA-binding property from protein sequences, the CGR algorithm, fractal dimension and amino acid composition are applied to formulate the numerical features of protein samples in this paper. Seven groups of features are extracted, which can be computed directly from the primary sequence, and each group is evaluated by the 10-fold cross-validation test and the Jackknife test. Comparing the results of numerical experiments, the group combining amino acid composition and fractal dimension (21-dimension vector) achieves the best result, with an average accuracy of 81.82% and an average Matthews correlation coefficient (MCC) of 0.6017. The resulting predictor is also compared with the existing method DNA-Prot and shows better performance. PMID:24189096
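
    The abstract above reports a Matthews correlation coefficient (MCC) alongside accuracy. As a brief sketch of the standard definition, MCC can be computed from confusion-matrix counts as shown below. The function name and the example counts are hypothetical, and returning 0.0 when the denominator is zero is a common convention assumed here rather than something stated in the source.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention (assumed): report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for illustration only (not the paper's results).
mcc = matthews_corrcoef(tp=40, tn=45, fp=5, fn=10)
print(f"MCC = {mcc:.4f}")
```

    Unlike accuracy, MCC ranges from -1 to +1 and stays informative when the two classes are of unequal size.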

  17. The extent of whole-genome copy number alterations predicts aggressive features in primary melanomas.

    PubMed

    Gandolfi, Greta; Longo, Caterina; Moscarella, Elvira; Zalaudek, Iris; Sancisi, Valentina; Raucci, Margherita; Manzotti, Gloria; Gugnoni, Mila; Piana, Simonetta; Argenziano, Giuseppe; Ciarrocchi, Alessia

    2016-03-01

    Recent evidence indicates that melanoma comprises distinct types of tumors and suggests that specific morphological features may help predict its clinical behavior. Using a SNP-array approach, we quantified chromosomal copy number alterations (CNA) across the whole genome in 41 primary melanomas and found a high degree of heterogeneity in their genomic asset. Association analysis correlating the number and relative length of CNA with clinical, morphological, and dermoscopic attributes of melanoma revealed that features of aggressiveness were strongly linked to the overall amount of genomic damage. Furthermore, we observed that melanoma progression and survival were mainly affected by a low number of large chromosome losses and a high number of small gains. We identified the alterations most frequently associated with aggressive melanoma, and by integrating our data with publicly available gene expression profiles, we identified five genes whose expression was found to be necessary for melanoma cell proliferation. In conclusion, this work provides new evidence that the phenotypic heterogeneity of melanoma reflects a parallel genetic diversity and lays the basis to define novel strategies for a more precise prognostic stratification of patients. PMID:26575206

  18. Ceruloplasmin/Hephaestin Knockout Mice Model Morphologic and Molecular Features of AMD

    PubMed Central

    Hadziahmetovic, Majda; Dentchev, Tzvete; Song, Ying; Haddad, Nadine; He, Xining; Hahn, Paul; Pratico, Domenico; Wen, Rong; Harris, Z. Leah; Lambris, John D.; Beard, John; Dunaief, Joshua L.

    2008-01-01

    Purpose: Iron is an essential element in human metabolism but also is a potent generator of oxidative damage with levels that increase with age. Several studies suggest that iron accumulation may be a factor in age-related macular degeneration (AMD). In prior studies, both iron overload and features of AMD were identified in mice deficient in the ferroxidase ceruloplasmin (Cp) and its homologue hephaestin (Heph) (double knockout, DKO). In this study, the location and timing of iron accumulation, the rate and reproducibility of retinal degeneration, and the roles of oxidative stress and complement activation were determined. Methods: Morphologic analysis and histochemical iron detection by Perls' staining were performed on retina sections from DKO and control mice. Immunofluorescence and immunohistochemistry were performed with antibodies detecting activated complement factor C3, transferrin receptor, L-ferritin, and macrophages. Tissue iron levels were measured by atomic absorption spectrophotometry. Isoprostane F2α-VI, a specific marker of oxidative stress, was quantified in the tissue by gas chromatography/mass spectrometry. Results: DKOs exhibited highly reproducible age-dependent iron overload, which plateaued at 6 months of age, with subsequent progressive retinal degeneration continuing to at least 12 months. The degeneration shared some features of AMD, including RPE hypertrophy and hyperplasia, photoreceptor degeneration, subretinal neovascularization, RPE lipofuscin accumulation, oxidative stress, and complement activation. Conclusions: DKOs have age-dependent iron accumulation followed by retinal degeneration modeling some of the morphologic and molecular features of AMD. Therefore, these mice are a good platform on which to test therapeutic agents for AMD, such as antioxidants, iron chelators, and antiangiogenic agents. PMID:18326691

  19. Using molecular structure for reliable predicting enthalpy of melting of nitroaromatic energetic compounds.

    PubMed

    Semnani, Abolfazl; Keshavarz, Mohammad Hossein

    2010-06-15

    In this work, a reliable simple method has been introduced for predicting enthalpy of melting of nitroaromatic energetic compounds through their molecular structures. This method can be used for a wide range of nitroaromatics including halogenated nitroaromatic compounds. The contribution of hydrogen bonding and polar groups as well as structural parameters can be used to improve the predicted values on the basis of the number of carbon, nitrogen and oxygen atoms. The predicted results show that this method gives reliable prediction of standard enthalpy of melting with respect to the best available methods for different nitroaromatic compounds including high explosives with complex molecular structures. PMID:20117881

  20. Beyond intensity: Spectral features effectively predict music-induced subjective arousal.

    PubMed

    Gingras, Bruno; Marin, Manuela M; Fitch, W Tecumseh

    2014-01-01

    Emotions in music are conveyed by a variety of acoustic cues. Notably, the positive association between sound intensity and arousal has particular biological relevance. However, although amplitude normalization is a common procedure used to control for intensity in music psychology research, direct comparisons between emotional ratings of original and amplitude-normalized musical excerpts are lacking. In this study, 30 nonmusicians retrospectively rated the subjective arousal and pleasantness induced by 84 six-second classical music excerpts, and an additional 30 nonmusicians rated the same excerpts normalized for amplitude. Following the cue-redundancy and Brunswik lens models of acoustic communication, we hypothesized that arousal and pleasantness ratings would be similar for both versions of the excerpts, and that arousal could be predicted effectively by other acoustic cues besides intensity. Although the difference in mean arousal and pleasantness ratings between original and amplitude-normalized excerpts correlated significantly with the amplitude adjustment, ratings for both sets of excerpts were highly correlated and shared a similar range of values, thus validating the use of amplitude normalization in music emotion research. Two acoustic parameters, spectral flux and spectral entropy, accounted for 65% of the variance in arousal ratings for both sets, indicating that spectral features can effectively predict arousal. Additionally, we confirmed that amplitude-normalized excerpts were adequately matched for loudness. Overall, the results corroborate our hypotheses and support the cue-redundancy and Brunswik lens models. PMID:24215647

  1. When can we expect statistical mechanics to help predict large scale atmospheric and oceanic features?

    NASA Astrophysics Data System (ADS)

    Nadiga, B. T.; Bouchet, F.

    2010-12-01

    While theoretical prediction of the large scales of turbulent geophysical flows is difficult, statistical mechanics has succeeded in describing various tropospheric features of Jupiter, the polar vortex, and oceanic jets and vortices. Nevertheless, the applicability of such statistical mechanical theories to non-equilibrium situations is unclear. Based on numerical studies of non-equilibrium, two-dimensional and geostrophic turbulence and some recent experiments, we propose a criterion based on the relative importance of the forcing-dissipation time scale on the one hand, and an inertial relaxation time scale on the other. Across these studies, we find that when the inertial relaxation time scale is much smaller than the forcing-dissipation time scale, statistical mechanics gives good predictions of the large scale mean flow, including those of possible transitions. We further elaborate on the extension of such a criterion to other situations. F. BOUCHET and B.T. NADIGA, Criteria for the applicability of statistical mechanics for the statistics of the largest scales of turbulent flows, to be submitted to Journal of Fluid Mechanics; F. BOUCHET and A. VENAILLE, Statistical mechanics of two-dimensional and geophysical flows, submitted to Physics Reports; F. BOUCHET and J. SOMMERIA, Emergence of intense jets and Jupiter's Great Red Spot as maximum-entropy structures, Journal of Fluid Mechanics 464 (2002), 165-207.

  2. Search performance is better predicted by tileability than presence of a unique basic feature

    PubMed Central

    Chang, Honghua; Rosenholtz, Ruth

    2016-01-01

    Traditional models of visual search such as feature integration theory (FIT; Treisman & Gelade, 1980), have suggested that a key factor determining task difficulty consists of whether or not the search target contains a “basic feature” not found in the other display items (distractors). Here we discriminate between such traditional models and our recent texture tiling model (TTM) of search (Rosenholtz, Huang, Raj, Balas, & Ilie, 2012b), by designing new experiments that directly pit these models against each other. Doing so is nontrivial, for two reasons. First, the visual representation in TTM is fully specified, and makes clear testable predictions, but its complexity makes getting intuitions difficult. Here we elucidate a rule of thumb for TTM, which enables us to easily design new and interesting search experiments. FIT, on the other hand, is somewhat ill-defined and hard to pin down. To get around this, rather than designing totally new search experiments, we start with five classic experiments that FIT already claims to explain: T among Ls, 2 among 5s, Q among Os, O among Qs, and an orientation/luminance-contrast conjunction search. We find that fairly subtle changes in these search tasks lead to significant changes in performance, in a direction predicted by TTM, providing definitive evidence in favor of the texture tiling model as opposed to traditional views of search. PMID:27548090

  3. High Resolution Prediction of Calcium-Binding Sites in 3D Protein Structures Using FEATURE

    PubMed Central

    2015-01-01

    Metal-binding proteins are ubiquitous in biological systems ranging from enzymes to cell surface receptors. Among the various biologically active metal ions, calcium plays a large role in regulating cellular and physiological changes. With the increasing number of high-quality crystal structures of proteins associated with their metal ion ligands, many groups have built models to identify Ca2+ sites in proteins, utilizing information such as structure, geometry, or homology to do the inference. We present a FEATURE-based approach in building such a model and show that our model is able to discriminate between nonsites and calcium-binding sites with a very high precision of more than 98%. We demonstrate the high specificity of our model by applying it to test sets constructed from other ions. We also introduce an algorithm to convert high scoring regions into specific site predictions and demonstrate the usage by scanning a test set of 91 calcium-binding protein structures (190 calcium sites). The algorithm has a recall of more than 93% on the test set with predictions found within 3 Å of the actual sites. PMID:26226489
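
    The recall figure in the abstract above counts an actual calcium site as recovered when a prediction lies within 3 Å of it. The sketch below illustrates that kind of distance-based matching. It is not the FEATURE algorithm itself: the function name and the coordinates are hypothetical, and treating a distance of exactly 3 Å as a hit is an assumption.

```python
import math


def recall_within(actual_sites, predicted_sites, cutoff=3.0):
    """Fraction of actual sites with at least one prediction within `cutoff` angstroms."""
    hits = sum(
        1
        for site in actual_sites
        if any(math.dist(site, pred) <= cutoff for pred in predicted_sites)
    )
    return hits / len(actual_sites)


# Hypothetical (x, y, z) coordinates in angstroms, for illustration only.
actual = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 20.0, 0.0)]
predicted = [(1.0, 2.0, 1.0), (10.0, 0.5, 0.0), (30.0, 30.0, 30.0)]

recall = recall_within(actual, predicted)
print(f"recall = {recall:.2f}")
```

    Here two of the three hypothetical sites have a prediction within the cutoff, so the recall is 2/3. `math.dist` requires Python 3.8 or later.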

  4. Neuroendocrine Tumors of the Large Intestine: Clinicopathological Features and Predictive Factors of Lymph Node Metastasis

    PubMed Central

    Kojima, Motohiro; Ikeda, Koji; Saito, Norio; Sakuyama, Naoki; Koushi, Kenichi; Kawano, Shingo; Watanabe, Toshiaki; Sugihara, Kenichi; Ito, Masaaki; Ochiai, Atsushi

    2016-01-01

    A new histological classification of neuroendocrine tumors (NETs) was established in WHO 2010. ENET and NCCN proposed treatment algorithms for colorectal NET. Retrospective study of NET of the large intestine (colorectal and appendiceal NET) was performed among institutions allied with the Japanese Society for Cancer of the Colon and Rectum, and 760 neuroendocrine tumors from 2001 to 2011 were re-assessed using WHO 2010 criteria to elucidate the clinicopathological features of NET in the large intestine. Next, the clinicopathological relationship with lymph node metastasis was analyzed to predict lymph node metastasis in locally resected rectal NET. The primary site was rectum in 718/760 cases (94.5%), colon in 30/760 cases (3.9%), and appendix in 12/760 cases (1.6%). Patients were predominantly men (61.6%) with a mean age of 58.7 years. Tumor size was <10 mm in 65.4% of cases. Proportions of NET G1, G2, G3, and mixed adeno-neuroendocrine carcinoma (MANEC) were 88.4, 6.3, 3.9, and 1.3%, respectively. Of the 760 tumors, 468 were locally resected, and 292 were surgically resected with lymph node dissection. Rectal NET showed a higher proportion of NET G1, and colonic and appendiceal NET was more commonly G3 and MANEC. Of the 292 surgically resected cases, 233 NET G1 and G2 located in the rectum were used for the prediction of lymph node metastasis. Lymphatic and blood vessel invasion were independent predictive factors of lymph node metastasis. NET G2 cases showed more frequent lymph node metastasis than that seen in NET G1 cases, but this was not an independent predictor of lymph node metastasis. Of the 98 surgically resected cases <10 mm in size, we found 9 cases with lymph node metastasis (9.2%). All cases were NET G1, and eight of the nine cases were positive either for lymphatic invasion or blood vessel invasion. Using the WHO classification, we found NET in the large intestine showed a tumor-site-dependent variety of histological and clinicopathological

  5. Neuroendocrine Tumors of the Large Intestine: Clinicopathological Features and Predictive Factors of Lymph Node Metastasis.

    PubMed

    Kojima, Motohiro; Ikeda, Koji; Saito, Norio; Sakuyama, Naoki; Koushi, Kenichi; Kawano, Shingo; Watanabe, Toshiaki; Sugihara, Kenichi; Ito, Masaaki; Ochiai, Atsushi

    2016-01-01

    A new histological classification of neuroendocrine tumors (NETs) was established in WHO 2010. ENET and NCCN proposed treatment algorithms for colorectal NET. Retrospective study of NET of the large intestine (colorectal and appendiceal NET) was performed among institutions allied with the Japanese Society for Cancer of the Colon and Rectum, and 760 neuroendocrine tumors from 2001 to 2011 were re-assessed using WHO 2010 criteria to elucidate the clinicopathological features of NET in the large intestine. Next, the clinicopathological relationship with lymph node metastasis was analyzed to predict lymph node metastasis in locally resected rectal NET. The primary site was rectum in 718/760 cases (94.5%), colon in 30/760 cases (3.9%), and appendix in 12/760 cases (1.6%). Patients were predominantly men (61.6%) with a mean age of 58.7 years. Tumor size was <10 mm in 65.4% of cases. Proportions of NET G1, G2, G3, and mixed adeno-neuroendocrine carcinoma (MANEC) were 88.4, 6.3, 3.9, and 1.3%, respectively. Of the 760 tumors, 468 were locally resected, and 292 were surgically resected with lymph node dissection. Rectal NET showed a higher proportion of NET G1, and colonic and appendiceal NET was more commonly G3 and MANEC. Of the 292 surgically resected cases, 233 NET G1 and G2 located in the rectum were used for the prediction of lymph node metastasis. Lymphatic and blood vessel invasion were independent predictive factors of lymph node metastasis. NET G2 cases showed more frequent lymph node metastasis than that seen in NET G1 cases, but this was not an independent predictor of lymph node metastasis. Of the 98 surgically resected cases <10 mm in size, we found 9 cases with lymph node metastasis (9.2%). All cases were NET G1, and eight of the nine cases were positive either for lymphatic invasion or blood vessel invasion. Using the WHO classification, we found NET in the large intestine showed a tumor-site-dependent variety of histological and clinicopathological

  6. Computer extracted texture features on T2w MRI to predict biochemical recurrence following radiation therapy for prostate cancer

    NASA Astrophysics Data System (ADS)

    Ginsburg, Shoshana B.; Rusu, Mirabela; Kurhanewicz, John; Madabhushi, Anant

    2014-03-01

    In this study we explore the ability of a novel machine learning approach, in conjunction with computer-extracted features describing prostate cancer morphology on pre-treatment MRI, to predict whether a patient will develop biochemical recurrence within ten years of radiation therapy. Biochemical recurrence, which is characterized by a rise in serum prostate-specific antigen (PSA) of at least 2 ng/mL above the nadir PSA, is associated with increased risk of metastasis and prostate cancer-related mortality. Currently, risk of biochemical recurrence is predicted by the Kattan nomogram, which incorporates several clinical factors to predict the probability of recurrence-free survival following radiation therapy (but has limited prediction accuracy). Semantic attributes on T2w MRI, such as the presence of extracapsular extension and seminal vesicle invasion and surrogate measurements of tumor size, have also been shown to be predictive of biochemical recurrence risk. While the correlation between biochemical recurrence and factors like tumor stage, Gleason grade, and extracapsular spread are well-documented, it is less clear how to predict biochemical recurrence in the absence of extracapsular spread and for small tumors fully contained in the capsule. Computer-extracted texture features, which quantitatively describe tumor micro-architecture and morphology on MRI, have been shown to provide clues about a tumor's aggressiveness. However, while computer-extracted features have been employed for predicting cancer presence and grade, they have not been evaluated in the context of predicting risk of biochemical recurrence. This work seeks to evaluate the role of computer-extracted texture features in predicting risk of biochemical recurrence on a cohort of sixteen patients who underwent pre-treatment 1.5 Tesla (T) T2w MRI. We extract a combination of first-order statistical, gradient, co-occurrence, and Gabor wavelet features from T2w MRI. To identify which of these

  7. Anion pairs in room temperature ionic liquids predicted by molecular dynamics simulation, verified by spectroscopic characterization

    SciTech Connect

    Schwenzer, Birgit; Kerisit, Sebastien N.; Vijayakumar, M.

    2014-01-01

    Molecular-level spectroscopic analyses of an aprotic and a protic room-temperature ionic liquid, BMIM OTf and BMIM HSO4, respectively, have been carried out with the aim of verifying molecular dynamics simulations that predict anion pair formation in these fluid structures. Fourier-transform infrared spectroscopy, Raman spectroscopy and nuclear magnetic resonance spectroscopy of various nuclei support the theoretically-determined average molecular arrangements.

  8. Predicting hot spots in protein interfaces based on protrusion index, pseudo hydrophobicity and electron-ion interaction pseudopotential features

    PubMed Central

    Xia, Junfeng; Yue, Zhenyu; Di, Yunqiang; Zhu, Xiaolei; Zheng, Chun-Hou

    2016-01-01

    The identification of hot spots, a small subset of protein interfaces that accounts for the majority of binding free energy, is becoming more important for research on drug design and cancer development. Building on our previous methods (APIS and KFC2), we propose a novel hot spot prediction method. For each hot spot residue, we first constructed a wide variety of 108 sequence, structural, and neighborhood features to characterize potential hot spot residues, including conventional ones and a new one (pseudo hydrophobicity) exploited in this study. We then selected the 3 top-ranking features that contribute the most to the classification by a two-step feature selection process consisting of the minimal-redundancy-maximal-relevance algorithm and an exhaustive search method. We used support vector machines to build our final prediction model. When tested on an independent test set, our method showed the highest F1-score (0.70) and MCC (0.46) compared with existing state-of-the-art hot spot prediction methods. Our results indicate that these features are more effective than the conventional features considered previously, and that the combination of our and traditional features may support the creation of a discriminative feature set for efficient prediction of hot spots in protein interfaces. PMID:26934646
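    The two evaluation metrics reported in this abstract, F1-score and Matthews correlation coefficient (MCC), are both computed from the confusion matrix of a binary classifier. The sketch below shows the standard formulas; the function names and the example counts are hypothetical and are not taken from the paper.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrix for a hot spot / non-hot spot classifier
print(f1_score(tp=30, fp=12, fn=10), mcc(tp=30, tn=80, fp=12, fn=10))
```

    MCC uses all four cells of the confusion matrix, which is why it is often reported alongside F1 when the classes are imbalanced, as hot spot residues typically are.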

  9. Prediction of biomechanical properties of trabecular bone in MR images with geometric features and support vector regression.

    PubMed

    Huber, Markus B; Lancianese, Sarah L; Nagarajan, Mahesh B; Ikpot, Imoh Z; Lerner, Amy L; Wismuller, Axel

    2011-06-01

    Whole knee joint MR image datasets were used to compare the performance of geometric trabecular bone features and advanced machine learning techniques in predicting biomechanical strength properties measured on the corresponding ex vivo specimens. Changes of trabecular bone structure throughout the proximal tibia are indicative of several musculoskeletal disorders involving changes in the bone quality and the surrounding soft tissue. Recent studies have shown that MR imaging also allows non-invasive 3-D characterization of bone microstructure. Sophisticated features like the scaling index method (SIM) can estimate local structural and geometric properties of the trabecular bone and may improve the ability of MR imaging to determine local bone quality in vivo. A set of 67 bone cubes was extracted from knee specimens, and their biomechanical strength, estimated by the yield stress (YS, in MPa), was determined through mechanical testing. The regional apparent bone volume fraction (BVF) and SIM derived features were calculated for each bone cube. A linear multiregression analysis (MultiReg) and an optimized support vector regression (SVR) algorithm were used to predict the YS from the image features. The prediction accuracy was measured by the root mean square error (RMSE) for each image feature on independent test sets. The best prediction result with the lowest prediction error of RMSE = 1.021 MPa was obtained with a combination of BVF and SIM features and by using SVR. The prediction accuracy with only SIM features and SVR (RMSE = 1.023 MPa) was still significantly better than BVF alone and MultiReg (RMSE = 1.073 MPa). The current study demonstrates that the combination of sophisticated bone structure features and supervised learning techniques can improve MR-based determination of trabecular bone quality. PMID:21356612
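    The root mean square error (RMSE) used here to compare predictors can be written in a few lines. This is a generic sketch of the metric only; the function name and the sample yield-stress values are hypothetical, and the regression models themselves (MultiReg, SVR) are not reproduced.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean square error between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical measured vs. predicted yield stress values in MPa
measured = [4.2, 6.1, 3.8, 5.5]
predicted = [4.0, 6.9, 3.1, 5.6]
print(rmse(measured, predicted))
```

    Because RMSE carries the units of the target variable, the values in the abstract can be read directly as typical prediction errors in MPa.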

  10. Predicting the biomechanical strength of proximal femur specimens with bone mineral density features and support vector regression

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Yang, Chien-Chun; Carballido-Gamio, Julio; Bauer, Jan S.; Baum, Thomas; Nagarajan, Mahesh B.; Eckstein, Felix; Lochmüller, Eva; Majumdar, Sharmila; Link, Thomas M.; Wismüller, Axel

    2012-03-01

    To improve the clinical assessment of osteoporotic hip fracture risk, recent computer-aided diagnosis systems explore new approaches to estimate the local trabecular bone quality beyond bone density alone to predict femoral bone strength. In this context, statistical bone mineral density (BMD) features extracted from multi-detector computed tomography (MDCT) images of proximal femur specimens and different function approximation methods were compared in their ability to predict the biomechanical strength. MDCT scans were acquired in 146 proximal femur specimens harvested from human cadavers. The femurs' failure load (FL) was determined through biomechanical testing. An automated volume of interest (VOI)-fitting algorithm was used to define a consistent volume in the femoral head of each specimen. In these VOIs, the trabecular bone was represented by statistical moments of the BMD distribution and by pairwise spatial occurrence of BMD values using the gray-level co-occurrence (GLCM) approach. A linear multi-regression analysis (MultiReg) and a support vector regression algorithm with a linear kernel (SVRlin) were used to predict the FL from the image feature sets. The prediction performance was measured by the root mean square error (RMSE) for each image feature on independent test sets; in addition the coefficient of determination R2 was calculated. The best prediction result was obtained with a GLCM feature set using SVRlin, which had the lowest prediction error (RMSE = 1.040+/-0.143, R2 = 0.544) and which was significantly lower than the standard approach of using BMD.mean and MultiReg (RMSE = 1.093+/-0.133, R2 = 0.490, p<0.0001). The combined sets including BMD.mean and GLCM features had a similar or slightly lower performance than using only GLCM features. The results indicate that the performance of high-dimensional BMD features extracted from MDCT images in predicting the biomechanical strength of proximal femur specimens can be significantly improved by
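    The gray-level co-occurrence idea mentioned in this abstract counts how often pairs of intensity values occur at a fixed spatial offset. The sketch below is a simplified 2-D version on a tiny quantized image, with one derived texture feature (contrast). It is an illustration under stated assumptions: the study works on BMD values in 3-D volumes of interest, and the function names, offset, and toy image here are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def glcm(image: List[List[int]], dx: int = 1, dy: int = 0,
         symmetric: bool = True) -> Dict[Tuple[int, int], int]:
    """Count co-occurrences of gray levels (a, b) at pixel offset (dx, dy)."""
    counts: Counter = Counter()
    rows = len(image)
    cols = len(image[0]) if rows else 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[(a, b)] += 1
                if symmetric:  # also count the pair in the reverse direction
                    counts[(b, a)] += 1
    return counts


def glcm_contrast(counts: Dict[Tuple[int, int], int]) -> float:
    """Contrast: co-occurrence-weighted mean of squared gray-level differences."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(n * (a - b) ** 2 for (a, b), n in counts.items()) / total


# Toy 2x3 image with two gray levels
toy = [[0, 0, 1],
       [1, 1, 1]]
matrix = glcm(toy)
print(dict(matrix), glcm_contrast(matrix))
```

    Features such as contrast, computed over several offsets, would form the kind of feature set that is then passed to a regression model.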

  11. Identification of critical chemical features for Aurora kinase-B inhibitors using Hip-Hop, virtual screening and molecular docking

    NASA Astrophysics Data System (ADS)

    Sakkiah, Sugunadevi; Thangapandian, Sundarapandian; John, Shalini; Lee, Keun Woo

    2011-01-01

    This study was performed to find the selective chemical features of Aurora kinase-B inhibitors using methods such as Hip-Hop, virtual screening, homology modeling, molecular dynamics and docking. The best hypothesis, Hypo1, was validated against a wide-ranging test set containing selective inhibitors of Aurora kinase-B. Homology modeling and molecular dynamics studies were carried out to enable the molecular docking studies. Hypo1 was then used as a 3D query to screen the chemical databases. The screened molecules from the databases were sorted based on ADME and drug-like properties. The selective hit compounds were docked, and their hydrogen bond interactions with the critical amino acids present in Aurora kinase-B were compared with the chemical features present in Hypo1. Finally, we suggest that the chemical features present in Hypo1 are vital for a molecule to inhibit Aurora kinase-B activity.

  12. BCR-ABL-positive acute myeloid leukemia: a new entity? Analysis of clinical and molecular features.

    PubMed

    Neuendorff, Nina Rosa; Burmeister, Thomas; Dörken, Bernd; Westermann, Jörg

    2016-08-01

    BCR-ABL-positive acute myeloid leukemia (AML) is a rare subtype of AML that is now included as a provisional entity in the 2016 revised WHO classification of myeloid malignancies. Since a clear distinction between de novo BCR-ABL+ AML and chronic myeloid leukemia (CML) blast crisis is challenging in many cases, the existence of de novo BCR-ABL+ AML has been a matter of debate for a long time. However, there is increasing evidence suggesting that BCR-ABL+ AML is in fact a distinct subgroup of AML. In this study, we analyzed all published cases since 1975 as well as cases from our institution in order to present common clinical and molecular features of this rare disease. Our analysis shows that BCR-ABL predominantly occurs in AML-NOS, CBF leukemia, and AML with myelodysplasia-related changes. The most common BCR-ABL transcripts (p190 and p210) are nearly equally distributed. Based on the analysis of published data, we provide a clinical algorithm for the initial differential diagnosis of BCR-ABL+ AML. The prognosis of BCR-ABL+ AML seems to depend on the cytogenetic and/or molecular background rather than on BCR-ABL itself. A therapy with tyrosine kinase inhibitors (TKIs) such as imatinib, dasatinib, or nilotinib is reasonable but, owing to a lack of systematic clinical data, their use cannot be routinely recommended in first-line therapy. Beyond first-line treatment of AML, the use of TKIs remains an individual decision, both in combination with intensive chemotherapy and/or as a bridge to allogeneic stem cell transplantation. In each single case, potential benefits have to be weighed against potential risks. PMID:27297971

  13. Features of exciton dynamics in molecular nanoclusters (J-aggregates): Exciton self-trapping (Review Article)

    NASA Astrophysics Data System (ADS)

    Malyukin, Yu. V.; Sorokin, A. V.; Semynozhenko, V. P.

    2016-06-01

    We present thoroughly analyzed experimental results that demonstrate the anomalous manifestation of the exciton self-trapping effect, which is already well known in bulk crystals, in ordered molecular nanoclusters called J-aggregates. Weakly-coupled one-dimensional (1D) molecular chains are the main structural feature of J-aggregates, wherein the electron excitations are manifested as 1D Frenkel excitons. According to the continuum theory of Rashba-Toyozawa, J-aggregates can have only self-trapped excitons, because 1D excitons must adhere to barrier-free self-trapping at any exciton-phonon coupling constant g = ɛLR/2β, wherein ɛLR is the lattice relaxation energy, and 2β is the half-width of the exciton band. In contrast, very often only the luminescence of free, mobile excitons would manifest in experiments involving J-aggregates. Using the Urbach rule to analyze the low-frequency region of the low-temperature exciton absorption spectra has shown that J-aggregates can have both a weak (g < 1) and a strong (g > 1) exciton-phonon coupling. Moreover, it is experimentally demonstrated that under certain conditions, the J-aggregate excited state can have both free and self-trapped excitons, i.e., we establish the existence of a self-trapping barrier for 1D Frenkel excitons. We demonstrate and analyze the reasons behind the anomalous existence of both free and self-trapped excitons in J-aggregates, and demonstrate how exciton self-trapping efficiency can be managed in J-aggregates by varying the values of g, which is fundamentally impossible in bulk crystals. We discuss how the exciton self-trapping phenomenon can be used as an alternate interpretation of the wide band emission of some J-aggregates, which has thus far been explained by the strongly localized exciton model.
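    The coupling constant defined in this abstract, g = ɛLR/2β, and the weak/strong distinction (g < 1 versus g > 1) amount to a simple ratio test. The sketch below only restates that arithmetic; the function names and the example energies are hypothetical, and both inputs are assumed to be given in the same energy unit.

```python
def coupling_constant(lattice_relaxation_energy: float,
                      exciton_band_half_width: float) -> float:
    """g = e_LR / (2*beta), where the second argument is the half-width 2*beta."""
    if exciton_band_half_width <= 0:
        raise ValueError("exciton band half-width must be positive")
    return lattice_relaxation_energy / exciton_band_half_width


def coupling_regime(g: float) -> str:
    """Classify exciton-phonon coupling as in the abstract: weak if g < 1, strong if g > 1."""
    if g < 1:
        return "weak"
    if g > 1:
        return "strong"
    return "boundary"


# Hypothetical energies in the same unit (for example cm^-1)
g = coupling_constant(lattice_relaxation_energy=300.0, exciton_band_half_width=600.0)
print(g, coupling_regime(g))
```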

  14. CD30-positive peripheral T-cell lymphomas share molecular and phenotypic features

    PubMed Central

    Bisig, Bettina; de Reyniès, Aurélien; Bonnet, Christophe; Sujobert, Pierre; Rickman, David S.; Marafioti, Teresa; Delsol, Georges; Lamant, Laurence; Gaulard, Philippe; de Leval, Laurence

    2013-01-01

    Peripheral T-cell lymphoma, not otherwise specified is a heterogeneous group of aggressive neoplasms with indistinct borders. By gene expression profiling we previously reported unsupervised clusters of peripheral T-cell lymphomas, not otherwise specified correlating with CD30 expression. In this work we extended the analysis of peripheral T-cell lymphoma molecular profiles to prototypical CD30+ peripheral T-cell lymphomas (anaplastic large cell lymphomas), and validated mRNA expression profiles at the protein level. Existing transcriptomic datasets from peripheral T-cell lymphomas, not otherwise specified and anaplastic large cell lymphomas were reanalyzed. Twenty-one markers were selected for immunohistochemical validation on 80 peripheral T-cell lymphoma samples (not otherwise specified, CD30+ and CD30−; anaplastic large cell lymphomas, ALK+ and ALK−), and differences between subgroups were assessed. Clinical follow-up was recorded. Compared to CD30− tumors, CD30+ peripheral T-cell lymphomas, not otherwise specified were significantly enriched in ALK− anaplastic large cell lymphoma-related genes. By immunohistochemistry, CD30+ peripheral T-cell lymphomas, not otherwise specified differed significantly from CD30− samples [down-regulated expression of T-cell receptor-associated proximal tyrosine kinases (Lck, Fyn, Itk) and of proteins involved in T-cell differentiation/activation (CD69, ICOS, CD52, NFATc2); upregulation of JunB and MUM1], while overlapping with anaplastic large cell lymphomas. CD30− peripheral T-cell lymphomas, not otherwise specified tended to have an inferior clinical outcome compared to the CD30+ subgroups. In conclusion, we show molecular and phenotypic features common to CD30+ peripheral T-cell lymphomas, and significant differences between CD30− and CD30+ peripheral T-cell lymphomas, not otherwise specified, suggesting that CD30 expression might delineate two biologically distinct subgroups. PMID:23716562

  15. STAT3 Expression, Molecular Features, Inflammation Patterns and Prognosis in a Database of 724 Colorectal Cancers

    PubMed Central

    Morikawa, Teppei; Baba, Yoshifumi; Yamauchi, Mai; Kuchiba, Aya; Nosho, Katsuhiko; Shima, Kaori; Tanaka, Noriko; Huttenhower, Curtis; Frank, David A.; Fuchs, Charles S.; Ogino, Shuji

    2010-01-01

    Purpose: STAT3 (signal transducer and activator of transcription 3) is a transcription factor that is constitutively activated in some cancers. STAT3 appears to play crucial roles in cell proliferation and survival, angiogenesis, tumor-promoting inflammation and suppression of anti-tumor host immune response in the tumor microenvironment. Although the STAT3 signaling pathway is a potential drug target, clinical, pathologic, molecular or prognostic features of STAT3-activated colorectal cancer remain uncertain. Experimental Design: Utilizing a database of 724 colon and rectal cancer cases, we evaluated phosphorylated STAT3 (p-STAT3) expression by immunohistochemistry. Cox proportional hazards model was used to compute mortality hazard ratio (HR), adjusting for clinical, pathologic and molecular features, including microsatellite instability (MSI), the CpG island methylator phenotype (CIMP), LINE-1 methylation, 18q loss of heterozygosity, TP53 (p53), CTNNB1 (β-catenin), JC virus T-antigen, and KRAS, BRAF, and PIK3CA mutations. Results: Among the 724 tumors, 131 (18%) showed high-level p-STAT3 expression (p-STAT3-high), 244 (34%) showed low-level expression (p-STAT3-low), and the remaining 349 (48%) were negative for p-STAT3. p-STAT3 overexpression was associated with significantly higher colorectal cancer-specific mortality [log-rank p = 0.0020; univariate HR (p-STAT3-high vs. p-STAT3-negative) 1.85, 95% confidence interval (CI) 1.30–2.63, Ptrend = 0.0005; multivariate HR, 1.61, 95% CI 1.11–2.34, Ptrend = 0.015]. p-STAT3 expression was positively associated with peritumoral lymphocytic reaction (multivariate odds ratio 3.23; 95% CI, 1.89–5.53; p<0.0001). p-STAT3 expression was not associated with MSI, CIMP, or LINE-1 hypomethylation. Conclusions: STAT3 activation in colorectal cancer is associated with adverse clinical outcome, supporting its potential roles as a prognostic biomarker and a chemoprevention and/or therapeutic target. PMID:21310826
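    A hazard ratio reported with a 95% confidence interval, as in this abstract, can be converted back to an approximate standard error and Wald z statistic on the log scale. The sketch below assumes the interval is a Wald interval that is symmetric in log(HR), which the abstract does not state; the function names are hypothetical.

```python
import math


def log_hr_standard_error(ci_lower: float, ci_upper: float,
                          z_crit: float = 1.96) -> float:
    """Approximate SE of log(HR) from a CI assumed symmetric on the log scale."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)


def wald_z(hr: float, ci_lower: float, ci_upper: float,
           z_crit: float = 1.96) -> float:
    """Wald z statistic for the null hypothesis HR = 1."""
    return math.log(hr) / log_hr_standard_error(ci_lower, ci_upper, z_crit)


# Univariate HR from the abstract: 1.85 (95% CI 1.30-2.63)
print(wald_z(1.85, 1.30, 2.63))
```

    This back-calculation is only a rough consistency check on published summary numbers; it does not reproduce the Cox model or the trend test reported in the abstract.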

  16. pT0 Prostate Cancer: Predictive Clinicopathologic Features in an American Population

    PubMed Central

    Bream, Matthew J.; Dahmoush, Laila; Brown, James A.

    2013-01-01

    Introduction: The pT0 stage of prostate cancer describes the radical prostatectomy (RP) specimen where no cancer can be identified. Given known racial and geographic differences in prostate cancer incidence and survival, we reviewed our experience with pT0 disease to determine applicability of these predictive features in an American population. Materials and Methods: A retrospective chart review was conducted for all RPs at one state tertiary care institution during a 20-year period (1991-2011). Clinicopathologic features of pT0 patients were collected and their relevant pathologic material re-reviewed. Results: Of a total of 1,635 RPs performed, 4 (0.2%) not receiving neoadjuvant therapy or other prior prostate surgeries were stage pT0. Biopsies from 3 of 4 patients were re-evaluated and confirmed a small focus, <1% of tissue, of Gleason score 3+3 adenocarcinoma; a fourth was not available for re-review. Our re-review of the RP slides identified small foci of cancer in two of the four, thus yielding a final true pT0 incidence of 0.1%. Preoperative prostate specific antigen ranged from 4.4 to 7.4 ng/ml, clinical stages were all T1c, and there was no evidence of recurrence at 3 months to 10 years of follow-up. Conclusions: Stage pT0 prostate cancer is very uncommon, occurring with an incidence of 0.1%, and in our experience occurs only in clinical T1c patients with pre-biopsy prostate specific antigen < 7.5 ng/ml, with Gleason score 3 + 3 adenocarcinoma comprising < 1%, 1 mm of a single core biopsy, a stricter threshold than that seen in non-American populations. PMID:24917750

  17. Clinical Features, Outcomes, and Molecular Characteristics of Community- and Health Care-Associated Staphylococcus lugdunensis Infections.

    PubMed

    Yeh, Chun-Fu; Chang, Shih-Cheng; Cheng, Chun-Wen; Lin, Jung-Fu; Liu, Tsui-Ping; Lu, Jang-Jih

    2016-08-01

    Staphylococcus lugdunensis is a major cause of aggressive endocarditis, but it is also responsible for a broad spectrum of infections. The differences in clinical and molecular characteristics between community-associated (CA) and health care-associated (HA) S. lugdunensis infections have remained unclear. We performed a retrospective study of S. lugdunensis infections between 2003 and 2014 to compare the clinical and molecular characteristics of CA and HA isolates. We collected 129 S. lugdunensis isolates in total: 81 (62.8%) HA isolates and 48 (37.2%) CA isolates. HA infections were more frequent than CA infections in children (16.0% versus 4.2%, respectively; P = 0.041) and the elderly (38.3% versus 14.6%, respectively; P = 0.004). The CA isolates were more likely to cause skin and soft tissue infections (85.4% versus 19.8%, respectively; P < 0.001). HA isolates were more frequently responsible for bacteremia of unknown origin (34.6% versus 4.2%, respectively; P < 0.001) and for catheter-related bacteremia (12.3% versus 0%, respectively; P = 0.011) than CA isolates. Fourteen-day mortality was higher for HA infections than for CA infections (11.1% versus 0%, respectively). A higher proportion of the HA isolates than of the CA isolates were resistant to penicillin (76.5% versus 52.1%, respectively; P = 0.004) and oxacillin (32.1% versus 2.1%, respectively; P < 0.001). Two major clonal complexes (CC1 and CC3) were identified. Sequence type 41 (ST41) was the most common sequence type identified (29.5%). The proportion of ST38 isolates was higher for HA than for CA infections (33.3% versus 12.5%, respectively; P = 0.009). These isolates were of staphylococcal cassette chromosome mec element (SCCmec) type IV, V, or Vt. HA and CA S. lugdunensis infections differ in terms of their clinical features, outcome, antibiotic susceptibilities, and molecular characteristics. PMID:27225402
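    Comparisons of proportions between two groups, such as the HA versus CA resistance rates in this abstract, are commonly tested with a 2x2 contingency test. The sketch below implements a two-sided Fisher exact test with the standard library. Two things are assumptions: the abstract does not say which test the authors used, and the counts in the example (26 of 81 HA and 1 of 48 CA isolates resistant to oxacillin) are reconstructed here from the reported percentages rather than quoted from the paper.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


# Reconstructed (assumed) counts: oxacillin-resistant vs. susceptible isolates
# HA: 26 resistant, 55 susceptible; CA: 1 resistant, 47 susceptible
print(fisher_exact_two_sided(26, 55, 1, 47))
```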

  18. Molecular Features and Survival Outcomes of the Intrinsic Subtypes Within HER2-Positive Breast Cancer

    PubMed Central

    Carey, Lisa A.; Adamo, Barbara; Vidal, Maria; Tabernero, Josep; Cortés, Javier; Parker, Joel S.; Perou, Charles M.; Baselga, José

    2014-01-01

    Background: The clinical impact of the biological heterogeneity within HER2-positive (HER2+) breast cancer is not fully understood. Here, we evaluated the molecular features and survival outcomes of the intrinsic subtypes within HER2+ breast cancer. Methods: We interrogated The Cancer Genome Atlas (n = 495) and Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) datasets (n = 1730) of primary breast cancers for molecular data derived from DNA, RNA and protein, and determined intrinsic subtype. Clinical HER2 status was defined according to American Society of Clinical Oncology (ASCO)/College of American Pathologists (CAP) guidelines or DNA copy-number aberration by single nucleotide polymorphism arrays. Cox models tested the prognostic significance of each variable in patients not treated with trastuzumab (n = 1711). Results: Compared with clinically HER2 (cHER2)-negative breast cancer, cHER2+ breast cancer had a higher frequency of the HER2-enriched (HER2E) subtype (47.0% vs 7.1%) and a lower frequency of Luminal A (10.7% vs 39.0%) and Basal-like (14.1% vs 23.4%) subtypes. The likelihood of cHER2-positivity in HER2E, Luminal B, Basal-like and Luminal A subtypes was 64.6%, 20.0%, 14.4% and 7.3%, respectively. Within each subtype, only 0.3% to 3.9% of genes were found differentially expressed between cHER2+ and cHER2-negative tumors. Within cHER2+ tumors, HER2 gene and protein expression was statistically significantly higher in the HER2E and Basal-like subtypes than either luminal subtype. Neither cHER2 status nor the new 10-subtype copy number-based classification system (IntClust) added independent prognostic value to intrinsic subtype. Conclusions: When the intrinsic subtypes are taken into account, cHER2-positivity does not translate into large changes in the expression of downstream signaling pathways, nor does it affect patient survival in the absence of HER2 targeting. PMID:25139534

  19. Semiempirical Predictions of Chemical Degradation Reaction Mechanisms of CL-20 as Related to Molecular Structure

    SciTech Connect

    Qasim, Mohammad M.; Furey, John; Fredrickson, Herbert L.; Szecsody, Jim E.; Mcgrath, Chris J.; Bajpai, Rakesh

    2004-10-01

    Quantum mechanical methods and force field molecular mechanics were used to characterize cage cyclic nitramines and to predict environmental degradation mechanisms. Due to structural similarities it is predicted that, under homologous circumstances, the major environmental RDX degradation pathways should also be effective for CL-20 and similar cyclic nitramines.

  20. Molecular insight into the viral biology and clinical features of trichodysplasia spinulosa.

    PubMed

    Wu, J H; Nguyen, H P; Rady, P L; Tyring, S K

    2016-03-01

    Trichodysplasia spinulosa (TS) is a disfiguring skin disease that occurs most frequently in patients receiving immunosuppressive therapies, and is thus frequently associated with organ transplantation. TS is characterized clinically by folliculocentric papular eruption, keratin spine formation and development of leonine face; and histologically by expansion of the inner root sheath epithelium and high expression of the proliferative marker Ki-67. Recent discovery of the TS-associated polyomavirus (TSPyV) and emerging studies demonstrating the role of TSPyV tumour antigens in cell proliferation pathways have opened a new corridor for research on TS. In this brief review, we summarize the clinical and histological features of TS and evaluate the current options for therapy. Furthermore, we address the viral aetiology of the disease and explore the mechanisms by which TSPyV may influence TS development and progression. As reports of TS continue to rise, clinician recognition of TS, as well as accompanying research on its underlying pathogenesis and therapeutic options, is becoming increasingly important. It is our hope that heightened clinical suspicion for TS will increase rates of diagnosis and will galvanize both molecular and clinical interest in this disease. PMID:26479880

  1. Molecular features in complex environment: Cooperative team players during excited state bond cleavage.

    PubMed

    Thallmair, Sebastian; Roos, Matthias K; de Vivie-Riedle, Regina

    2016-07-01

    Photoinduced bond cleavage is often employed for the generation of highly reactive carbocations in solution and to study their reactivity. Diphenylmethyl derivatives are prominent precursors in polar and moderately polar solvents like acetonitrile or dichloromethane. Depending on the leaving group, the photoinduced bond cleavage occurs on a femtosecond to picosecond time scale and typically leads to two distinguishable products, the desired diphenylmethyl cations (Ph2CH(+)) and as competing by-product the diphenylmethyl radicals ([Formula: see text]). Conical intersections are the chief suspects for such ultrafast branching processes. We show for two typical examples, the neutral diphenylmethylchloride (Ph2CH-Cl) and the charged diphenylmethyltriphenylphosphonium ions ([Formula: see text]) that the role of the conical intersections depends not only on the molecular features but also on the interplay with the environment. It turns out to differ significantly for both precursors. Our analysis is based on quantum chemical and quantum dynamical calculations. For comparison, we use ultrafast transient absorption measurements. In case of Ph2CH-Cl, we can directly connect the observed signals to two early three-state and two-state conical intersections, both close to the Franck-Condon region. In case of the [Formula: see text], dynamic solvent effects are needed to activate a two-state conical intersection at larger distances along the reaction coordinate. PMID:26958588

  2. Association of Fusobacterium species in pancreatic cancer tissues with molecular features and prognosis

    PubMed Central

    Mitsuhashi, Kei; Nosho, Katsuhiko; Sukawa, Yasutaka; Matsunaga, Yasutaka; Ito, Miki; Kurihara, Hiroyoshi; Kanno, Shinichi; Igarashi, Hisayoshi; Naito, Takafumi; Adachi, Yasushi; Tachibana, Mami; Tanuma, Tokuma; Maguchi, Hiroyuki; Shinohara, Toshiya; Hasegawa, Tadashi; Imamura, Masafumi; Kimura, Yasutoshi; Hirata, Koichi; Maruyama, Reo; Suzuki, Hiromu; Imai, Kohzoh

    2015-01-01

    Recently, bacterial infection causing periodontal disease has attracted considerable attention as a risk factor for pancreatic cancer. Fusobacterium species is an oral bacterial group of the human microbiome. Some evidence suggests that Fusobacterium species promote colorectal cancer development; however, no previous studies have reported the association between Fusobacterium species and pancreatic cancer. Therefore, we examined whether Fusobacterium species exist in pancreatic cancer tissue. Using a database of 283 patients with pancreatic ductal adenocarcinoma (PDAC), we tested cancer tissue specimens for Fusobacterium species. We also tested the specimens for KRAS, NRAS, BRAF and PIK3CA mutations and measured microRNA-21 and microRNA-31. In addition, we assessed epigenetic alterations, including CpG island methylator phenotype (CIMP). Our data showed an 8.8% detection rate of Fusobacterium species in pancreatic cancers; however, tumor Fusobacterium status was not associated with any clinical and molecular features. In contrast, in multivariate Cox regression analysis, compared with the Fusobacterium species-negative group, we observed significantly higher cancer-specific mortality rates in the positive group (p = 0.023). In conclusion, Fusobacterium species were detected in pancreatic cancer tissue. Tumor Fusobacterium species status is independently associated with a worse prognosis of pancreatic cancer, suggesting that Fusobacterium species may be a prognostic biomarker of pancreatic cancer. PMID:25797243

  3. Molecular features of heterogeneous vancomycin-intermediate Staphylococcus aureus strains isolated from bacteremic patients

    PubMed Central

    2009-01-01

    Background: Heterogeneous vancomycin-intermediate Staphylococcus aureus (hVISA) bacteremia is an emerging infection. Our objective was to determine the molecular features of hVISA strains isolated from bacteremic patients and to compare them to methicillin resistant S. aureus (MRSA) and methicillin sensitive S. aureus (MSSA) blood isolates. Results: We assessed phenotypic and genomic changes of hVISA (n = 24), MRSA (n = 16) and MSSA (n = 17) isolates by PCR to determine staphylococcal chromosomal cassette (SCCmec) types, Panton-Valentine leukocidin (PVL) and the accessory gene regulator (agr) loci. Biofilm formation was quantified. Genetic relatedness was assessed by PFGE. The PFGE patterns of the isolates were diverse, suggesting multiple sources of infection. Among hVISA isolates, 50% carried SCCmec type I, 21% type II, and 25% type V; in 4% the SCCmec type could not be identified. Among MRSA isolates, 44% were SCCmec type I, 12.5% type II, 25% type V, 12.5% were non-typable, and 6% were SCCmec type IVd. Only one hVISA isolate and two MSSA isolates carried the PVL. Biofilm formation and agr patterns were diverse. Conclusion: hVISA isolates were diverse in all parameters tested. A considerable number of hVISA and MRSA strains carried the SCCmec type V cassette, which was not related to community acquisition. PMID:19732456

  4. Clinical and molecular features and therapeutic perspectives of spinal muscular atrophy with respiratory distress type 1

    PubMed Central

    Vanoli, Fiammetta; Rinchetti, Paola; Porro, Francesca; Parente, Valeria; Corti, Stefania

    2015-01-01

    Spinal muscular atrophy with respiratory distress type 1 (SMARD1) is an autosomal recessive neuromuscular disease caused by mutations in the IGHMBP2 gene, which encodes the immunoglobulin μ-binding protein 2, leading to motor neuron degeneration. It is a rare and fatal disease with an early onset in infancy in the majority of cases. The main clinical features are muscular atrophy and diaphragmatic palsy, which requires prompt and permanent supportive ventilation. The human disease is recapitulated in the neuromuscular degeneration (nmd) mouse. No effective treatment is available yet, but novel therapeutic approaches tested on the nmd mouse, such as the use of neurotrophic factors and stem cell therapy, have shown positive effects. Gene therapy has demonstrated effectiveness in SMA and is now at the stage of clinical trials in patients, and therefore represents a possible treatment for SMARD1 as well. The significant advances in understanding of both the SMARD1 clinical spectrum and its molecular mechanisms lay the groundwork for a rapid translation of pre-clinical therapeutic strategies to humans. PMID:26095024

  5. Cryptosporidiosis in HIV/AIDS Patients in Kenya: Clinical Features, Epidemiology, Molecular Characterization and Antibody Responses

    PubMed Central

    Wanyiri, Jane W.; Kanyi, Henry; Maina, Samuel; Wang, David E.; Steen, Aaron; Ngugi, Paul; Kamau, Timothy; Waithera, Tabitha; O'Connor, Roberta; Gachuhi, Kimani; Wamae, Claire N.; Mwamburi, Mkaya; Ward, Honorine D.

    2014-01-01

    We investigated the epidemiological and clinical features of cryptosporidiosis, the molecular characteristics of infecting species and serum antibody responses to three Cryptosporidium-specific antigens in human immunodeficiency virus (HIV)/acquired immunodeficiency syndrome (AIDS) patients in Kenya. Cryptosporidium was the most prevalent enteric pathogen and was identified in 56 of 164 (34%) HIV/AIDS patients, including 25 of 70 (36%) with diarrhea and 31 of 94 (33%) without diarrhea. Diarrhea in patients exclusively infected with Cryptosporidium was significantly associated with the number of children per household, contact with animals, and water treatment. Cryptosporidium hominis was the most prevalent species and the most prevalent subtype family was Ib. Patients without diarrhea had significantly higher serum IgG levels to Chgp15, Chgp40 and Cp23, and higher fecal IgA levels to Chgp15 and Chgp40, than those with diarrhea, suggesting that antibody responses to these antigens may be associated with protection from diarrhea and supporting further investigation of these antigens as vaccine candidates. PMID:24865675

  6. Molecular features and toxicological properties of four common pesticides, acetamiprid, deltamethrin, chlorpyriphos and fipronil.

    PubMed

    Taillebois, Emiliane; Alamiddine, Zakaria; Brazier, Christine; Graton, Jérôme; Laurent, Adèle D; Thany, Steeve H; Le Questel, Jean-Yves

    2015-04-01

    Structural features and selected physicochemical properties of four common pesticides: acetamiprid (neonicotinoid), chlorpyriphos (organophosphate insecticide), deltamethrin (pyrethroid) and fipronil (phenylpyrazole) have been investigated by Density Functional Theory quantum chemical calculations. The highly flexible character of these insecticides is revealed by the numerous conformers obtained, located within a 20 kJ mol(-1) range in the gas phase. In line with this trend, a redistribution of the energetic minima is observed in water medium. Molecular electrostatic potential calculations provide a ranking of the potential interaction sites of the four insecticides. The theoretical studies reported in the present work are completed by comparative toxicological assays against three aphid strains. Thus, the same toxicity order for the two susceptible strains Myzus persicae 4106A and Acyrthosiphon pisum LSR1: acetamiprid>fipronil>deltamethrin>chlorpyriphos is revealed. In the resistant strain M. persicae 1300145, the toxicity order is modified: acetamiprid>fipronil>chlorpyriphos>deltamethrin. Interestingly, the strain 1300145, which is known to be resistant to neonicotinoids, is also less sensitive to deltamethrin, chlorpyriphos and fipronil. PMID:25716006

  7. Complete mitochondrial DNA sequences of six snakes: phylogenetic relationships and molecular evolution of genomic features.

    PubMed

    Dong, Songyu; Kumazawa, Yoshinori

    2005-07-01

    Complete mitochondrial DNA (mtDNA) sequences were determined for representative species from six snake families: the acrochordid little file snake, the boid boa constrictor, the cylindrophiid red pipe snake, the viperid himehabu, the pythonid ball python, and the xenopeltid sunbeam snake. Thirteen protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 2 control regions were identified in these mtDNAs. Duplication of the control region and translocation of the tRNALeu gene were two notable features of the snake mtDNAs. The duplicate control regions had nearly identical nucleotide sequences within species but they were divergent among species, suggesting concerted sequence evolution of the two control regions. In addition, the duplicate control regions appear to have facilitated an interchange of some flanking tRNA genes in the viperid lineage. Phylogenetic analyses were conducted using a large number of sites (9570 sites in total) derived from the complete mtDNA sequences. Our data strongly suggested a new phylogenetic relationship among the major families of snakes: ((((Viperidae, Colubridae), Acrochordidae), (((Pythonidae, Xenopeltidae), Cylindrophiidae), Boidae)), Leptotyphlopidae). This conclusion was distinct from a widely accepted view based on morphological characters in denying the sister-group relationship of boids and pythonids, as well as the basal divergence of nonmacrostomatan cylindrophiids. These results underline the importance of reconstructing the snake phylogeny with ample molecular data, such as those from complete mtDNA sequences. PMID:16007493
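
    The parenthesised topology quoted in this abstract can be read mechanically. The Python sketch below parses that string into nested tuples and looks up the sister clade of a family; it is a minimal illustration, the helper names (parse, leaves, sister_group) are hypothetical, and it is not the phylogenetic software used by the authors.

```python
# Topology string copied from the abstract.
TREE = ("((((Viperidae, Colubridae), Acrochordidae), "
        "(((Pythonidae, Xenopeltidae), Cylindrophiidae), Boidae)), "
        "Leptotyphlopidae)")


def parse(text):
    """Parse a parenthesised tree into nested tuples of family names."""
    pos = 0

    def node():
        nonlocal pos
        while text[pos] == " ":
            pos += 1
        if text[pos] == "(":
            pos += 1  # consume "("
            children = [node()]
            while text[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
            return tuple(children)
        start = pos
        while text[pos] not in ",()":
            pos += 1
        return text[start:pos].strip()

    return node()


def leaves(tree):
    """Return the set of family names below a node."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def sister_group(tree, taxon):
    """Return the families in the sister clade of `taxon`, or None if absent."""
    if isinstance(tree, str):
        return None
    for i, child in enumerate(tree):
        if child == taxon:
            siblings = [c for j, c in enumerate(tree) if j != i]
            return set().union(*(leaves(s) for s in siblings))
    for child in tree:
        found = sister_group(child, taxon)
        if found is not None:
            return found
    return None


tree = parse(TREE)
print(sorted(sister_group(tree, "Pythonidae")))
print(sorted(sister_group(tree, "Boidae")))
```

    In this topology the sister of Pythonidae is Xenopeltidae, while Boidae is sister to the whole clade of Pythonidae, Xenopeltidae and Cylindrophiidae, which is how the tree denies a direct boid-pythonid sister-group relationship.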

  8. Extension and Application of Feature Prediction Model for Synthesis of Hydrologic Records

    NASA Astrophysics Data System (ADS)

    Panu, Umed Singh; Unny, T. E.

    1980-02-01

    The method described in this paper for the synthesis of streamflows differs from the traditional approaches in synthetic hydrology in the sense that it utilizes the information contained in or among the groups of data in a streamflow record. The existence of such groups in geophysical records, including hydrologic records, is well emphasized by Hurst (1951). Further, in the proposed method, based on concepts of pattern recognition, neither a basic structure nor any preconceived model is imposed on the data; rather the data are allowed to speak for themselves in a most `democratic' way. The preliminary details of the method were provided in an earlier paper by Panu et al. (1978). The intent of this paper is to describe a procedure whereby it is possible to specify explicitly a multivariate probability distribution for the intrapattern structure and a first-order Markovian dependence for the interpattern structure in the feature prediction model (Panu et al., 1978). The various steps involved in the construction and operation of the model for streamflow synthesis are presented. The application of the model for synthesizing monthly streamflow records of three Canadian rivers exhibiting biannual cycles is explained. Statistical and hydrological tests show that these synthetic realizations possess relevant properties that are comparable with the corresponding properties contained in the historical record. This article should be read in conjunction with the previous publication by Panu et al. (1978).

  9. Can Structural Features of Kinase Receptors Provide Clues on Selectivity and Inhibition?: A Molecular Modeling Study

    PubMed Central

    Ravichandran, Sarangan; Luke, Brian T.; Collins, Jack R.

    2015-01-01

    Cancer is a complex disease resulting from the uncontrolled proliferation of cell signaling events. Protein kinases have been identified as central molecules that participate overwhelmingly in oncogenic events, thus becoming key targets for anticancer drugs. A majority of studies converged on the idea that ligand-binding pockets of kinases retain clues to the inhibiting abilities and cross-reacting tendencies of inhibitor drugs. Even though these ideas are critical for drug discovery, validating them using experiments is not only difficult, but in some cases infeasible. To overcome these limitations and to test these ideas at the molecular level, we present here the results of receptor-focused in-silico docking of nine marketed drugs to 19 different wild-type and mutated kinases chosen from a wide range of families. This investigation highlights the need for using relevant models to explain the correct inhibition trends and the results are used to make predictions that might be able to influence future experiments. Our simulation studies are able to correctly predict the primary targets for each drug studied in the majority of cases and our results agree with the existing findings. Our study shows that the conformations a given receptor acquires during kinase activation, and their micro-environment, define the ligand partners. Type II drugs display high compatibility and selectivity for DFG-out kinase conformations. On the other hand, Type I drugs are less selective and show binding preferences for both the open and closed forms of selected kinases. Using this receptor-focused approach, it is possible to capture the observed fold change in binding affinities between the wild-type and disease-centric mutations in ABL kinase for Imatinib and the second-generation ABL drugs. The effects of mutation are also investigated for two other systems, EGFR and B-Raf. Finally, by including pathway information in the design it is possible to model kinase inhibitors with potentially

  10. Accurate prediction of lattice energies and structures of molecular crystals with molecular quantum chemistry methods.

    PubMed

    Fang, Tao; Li, Wei; Gu, Fangwei; Li, Shuhua

    2015-01-13

    We extend the generalized energy-based fragmentation (GEBF) approach to molecular crystals under periodic boundary conditions (PBC), and we demonstrate the performance of the method for a variety of molecular crystals. With this approach, the lattice energy of a molecular crystal can be obtained from the energies of a series of embedded subsystems, which can be computed with existing advanced molecular quantum chemistry methods. The use of the field compensation method allows the method to take long-range electrostatic interaction of the infinite crystal environment into account and make the method almost translationally invariant. The computational cost of the present method scales linearly with the number of molecules in the unit cell. Illustrative applications demonstrate that the PBC-GEBF method with explicitly correlated quantum chemistry methods is capable of providing accurate descriptions on the lattice energies and structures for various types of molecular crystals. In addition, this approach can be employed to quantify the contributions of various intermolecular interactions to the theoretical lattice energy. Such qualitative understanding is very useful for rational design of molecular crystals. PMID:26574207

  11. Molecular Mechanism and Prediction of Sorafenib Chemoresistance in Human Hepatocellular Carcinoma.

    PubMed

    Nishida, Naoshi; Kitano, Masayuki; Sakurai, Toshiharu; Kudo, Masatoshi

    2015-10-01

    Hepatocellular carcinoma (HCC) is the second leading cause of cancer death worldwide, and prognosis remains unsatisfactory when the disease is diagnosed at an advanced stage. Many molecular targeted agents are being developed for the treatment of advanced HCC; however, the only promising drug to have been developed is sorafenib, which acts as a multi-kinase inhibitor. Unfortunately, a subgroup of HCC is resistant to sorafenib, and the majority of these HCC patients show disease progression even after an initial satisfactory response. To date, a number of studies have examined the underlying mechanisms involved in the response to sorafenib, and trials have been performed to overcome the acquisition of drug resistance. The anti-tumor activity of sorafenib is largely attributed to the blockade of the signals from growth factors, such as vascular endothelial growth factor receptor and platelet-derived growth factor receptor, and the downstream RAF/mitogen-activated protein/extracellular signal-regulated kinase (ERK) kinase (MEK)/ERK cascade. The activation of an escape pathway from RAF/MEK/ERK possibly results in chemoresistance. In addition, there are several features of HCCs indicating sorafenib resistance, such as epithelial-mesenchymal transition and positive stem cell markers. Here, we review the recent reports and focus on the mechanism and prediction of chemoresistance to sorafenib in HCC. PMID:26488287

  12. Ensembles of extremely randomized trees and feature ranking for streamflow prediction

    NASA Astrophysics Data System (ADS)

    Castelletti, Andrea; Galelli, Stefano

    2010-05-01

    Accurate and reliable stream-flow predictions are an important input to water resources planning and management processes, which heavily depend upon the availability of water (e.g. river basin planning, optimal reservoir operation, irrigation system management). Hydrological processes are extremely complex, combining high non-linearity and spatial-temporal variability. The prediction of hydrological variables is therefore a challenging task, very often complicated by lack of data and/or the presence of outliers. Usually, data-driven modelling provides a good balance between model accuracy and complexity, which are ultimately critical to the adoption of optimization-based approaches. While neural networks have been widely used in hydrological modelling (e.g. Govindaraju and Rao, 2000), tree-based modelling is a relatively unexplored methodology (Solomatine and Dulal, 2003; Solomatine and Xue, 2004; Iorgulescu and Beven, 2004; Stravs and Brilly, 2007). In this paper a new data-driven modelling approach based on Ensembles of Extremely Randomized Trees (ETs; Geurts et al., 2006) is proposed for stream-flow prediction using different hydro-meteorological predictors. By randomizing the tree construction process and merging a forest of diversified trees to predict the output, ETs alleviate the well-known poor generalization property of traditional standalone decision trees (e.g. CART) and thus avoid overfitting the training data. Inputs to the model are selected using a tree-based feature ranking algorithm, which ranks the candidate predictors (e.g. precipitation and evaporation at different stations, linear combinations thereof) according to their contribution to explaining the variance of an underlying ETs-based model of the stream-flow process. The approach is applied in the Red river basin (Vietnam), a sub-tropical catchment characterized by extremely variable weather conditions, where strong precipitations significantly contribute to the high flow. Results shown that

  13. Predicted Molecular Effects of Sequence Variants Link to System Level of Disease.

    PubMed

    Reeb, Jonas; Hecht, Maximilian; Mahlich, Yannick; Bromberg, Yana; Rost, Burkhard

    2016-08-01

    Developments in experimental and computational biology are advancing our understanding of how protein sequence variation impacts molecular protein function. However, the leap from the micro level of molecular function to the macro level of the whole organism, e.g. disease, remains barred. Here, we present new results emphasizing earlier work that suggested some links from molecular function to disease. We focused on non-synonymous single nucleotide variants, also referred to as single amino acid variants (SAVs). Building upon OMIA (Online Mendelian Inheritance in Animals), we introduced a curated set of 117 disease-causing SAVs in animals. Methods optimized to capture effects upon molecular function often correctly predict human (OMIM) and animal (OMIA) Mendelian disease-causing variants. We also predicted effects of human disease-causing variants in the mouse model, i.e. we put OMIM SAVs into mouse orthologs. Overall, fewer variants were predicted with effect in the model organism than in the original organism. Our results, along with other recent studies, demonstrate that predictions of molecular effects capture some important aspects of disease. Thus, in silico methods focusing on the micro level of molecular function can help to understand the macro system level of disease. PMID:27536940

  14. Occurrence of thrombotic events in acute promyelocytic leukemia correlates with consistent immunophenotypic and molecular features.

    PubMed

    Breccia, M; Avvisati, G; Latagliata, R; Carmosino, I; Guarini, A; De Propris, M S; Gentilini, F; Petti, M C; Cimino, G; Mandelli, F; Lo-Coco, F

    2007-01-01

    Although the occurrence of thrombosis in acute promyelocytic leukemia (APL) has been reported during retinoic acid treatment, no studies carried out in large clinical cohorts have specifically addressed this issue. We analyzed 124 APL patients treated with the all-trans retinoic acid and idarubicin protocol and compared clinico-biologic characteristics of 11 patients who developed thrombosis with those of 113 patients who had no thrombosis. In seven patients, the events were recorded during induction, whereas in four patients deep vein thrombosis occurred in the post-induction phase. Comparison of clinico-biological characteristics of patients with and without thrombosis revealed in the former group higher median white blood cell (WBC) count (17 x 10(9)/l, range 1.2-56, P=0.002), prevalence of the bcr3 transcript type (72 vs 48%, P=0.01), of FLT3-ITD (64 vs 28%, P=0.02), CD2 (P=0.0001) and CD15 (P=0.01) expression. No correlation was found with sex, age, French-American-British subtype, all-trans-retinoic acid syndrome or with thrombophilic state that was investigated in 5/11 patients. Our findings suggest that, in APL patients consistent biologic features of leukemia cells may predict increased risk of developing thrombosis. PMID:16932337

  15. Identifying molecular genetic features and oncogenic pathways of clear cell renal cell carcinoma through the anatomical (PADUA) scoring system

    PubMed Central

    Lin, Zhiqian; Shi, Guohai; Lin, Xiaozhu; Wu, Zhiyuan; Zhang, Xia; Zhang, Xi

    2016-01-01

    Although the preoperative aspects and dimensions used for the PADUA scoring system were successfully applied in macroscopic clinical practice for renal tumor, the relevant molecular genetic basis remained unclear. To uncover meaningful correlations between the genetic aberrations and radiological features, we enrolled 112 patients with clear cell renal cell carcinoma (ccRCC) whose clinicopathological data, genomics data and CT data were obtained from The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA). Overall PADUA score and several radiological features included in the PADUA system were assigned for each ccRCC. Despite having observed no significant association between the gene mutation frequency and the overall PADUA score, correlations between gene mutations and a few radiological features (tumor rim location and tumor size) were identified. A significant association between rim location and miRNA molecular subtypes was also observed. Survival analysis revealed that tumor size > 7 cm was significantly associated with poor survival. In addition, Gene Set Enrichment Analysis (GSEA) on mRNA expression revealed that the high PADUA score was related to numerous cancer-related networks, especially epithelial to mesenchymal transition (EMT) related pathways. This preliminary analysis of ccRCC revealed meaningful correlations between PADUA anatomical features and molecular basis including genomic aberrations and molecular subtypes. PMID:26848523

  16. Experimental indication of a naphthalene-base molecular aggregate for the carrier of the 2175 angstroms interstellar extinction feature

    NASA Technical Reports Server (NTRS)

    Beegle, L. W.; Wdowiak, T. J.; Robinson, M. S.; Cronin, J. R.; McGehee, M. D.; Clemett, S. J.; Gillette, S.

    1997-01-01

    Experiments where the simple polycyclic aromatic hydrocarbon (PAH) naphthalene (C10H8) is subjected to the energetic environment of a plasma have resulted in the synthesis of a molecular aggregate that has ultraviolet spectral characteristics that suggest it provides insight into the nature of the carrier of the 2175 angstroms interstellar extinction feature and may be a laboratory analog. Ultraviolet, visible, infrared, and mass spectroscopy, along with gas chromatography, indicate that it is a molecular aggregate in which an aromatic double ring ("naphthalene") structural base serves as the electron "box" chromophore that gives rise to the envelope of the 2175 angstroms feature. This chromophore can also provide the peak of the feature or function as a mantle in concert with another peak provider such as graphite. The molecular base/chromophore manifests itself both as a structural component of an alkyl-aromatic polymer and as a substructure of hydrogenated PAH species. Its spectral and molecular characteristics are consistent with what is generally expected for a complex molecular aggregate that has a role as an interstellar constituent.

  17. Prediction of troponin-T degradation using color image texture features in 10d aged beef longissimus steaks.

    PubMed

    Sun, X; Chen, K J; Berg, E P; Newman, D J; Schwartz, C A; Keller, W L; Maddock Carlin, K R

    2014-02-01

    The objective was to use digital color image texture features to predict troponin-T degradation in beef. Image texture features, including 88 gray level co-occurrence texture features, 81 two-dimension fast Fourier transformation texture features, and 48 Gabor wavelet filter texture features, were extracted from color images of beef strip steaks (longissimus dorsi, n = 102) aged for 10d obtained using a digital camera and additional lighting. Steaks were designated degraded or not-degraded based on troponin-T degradation determined on d 3 and d 10 postmortem by immunoblotting. Statistical analysis (STEPWISE regression model) and artificial neural network (support vector machine model, SVM) methods were designed to classify protein degradation. The d 3 and d 10 STEPWISE models were 94% and 86% accurate, respectively, while the d 3 and d 10 SVM models were 63% and 71%, respectively, in predicting protein degradation in aged meat. STEPWISE and SVM models based on image texture features show potential to predict troponin-T degradation in meat. PMID:24200578
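
    To make the gray level co-occurrence texture features mentioned above more concrete, the Python sketch below builds a co-occurrence matrix for a tiny made-up image and derives two common statistics from it. The abstract does not state which offsets, gray-level quantisation or feature definitions were used, so the offset, the toy image and the two statistics here are assumptions for illustration only.

```python
def glcm(image, levels, dx=1, dy=0):
    """Co-occurrence probabilities of gray-level pairs at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Normalise counts so the matrix sums to 1.
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Weighted sum of squared gray-level differences between paired pixels."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def energy(p):
    """Sum of squared probabilities; larger for more uniform textures."""
    return sum(v * v for row in p for v in row)


# Hypothetical 4 x 4 image quantised to 4 gray levels (not data from the study).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(toy_image, levels=4)  # horizontal neighbour, one pixel to the right
print(contrast(p), energy(p))
```

    In a pipeline like the one the abstract describes, statistics of this kind would be computed per image and passed to a classifier as predictors of degradation status.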

  18. Molecular imaging of apoptosis for early prediction of therapy efficiency.

    PubMed

    De Saint-Hubert, Marijke; Bauwens, Matthias; Mottaghy, Felix M

    2014-01-01

    Evasion of apoptosis is one of the hallmarks of cancer and any effective therapy primarily attempts to induce apoptosis. The evaluation of the degree of success of cancer therapy is currently mainly based on clinical and laboratory parameters and in a later stage on tumor shrinkage. However, none of these parameters provide an objective and early analysis of a therapeutic effect. Molecular imaging may provide a tool for this purpose by using not only pathophysiological but also biochemical effects of the therapy. First in the field, FDG-PET has been explored and demonstrated to offer insight in the amount of viable cells, even though false positives are commonly due to the lack of specificity of this particular radiopharmaceutical. More specific markers target the dying cells instead of those remaining alive. Specific apoptosis markers have been developed of which the radiolabeled Annexin A5 is the most intensely studied probe. Site-specific labeling strategies have improved this imaging probe with good results both in pre-clinical studies and in clinical trials, with promises for clinical applications. Caspase sensitive probes, such as the isatines, can also effectively image apoptosis but are limited due to the high background activities. More recent discoveries of small apoptosis sensitive probes, such as (18)F-ML10, are currently being explored. In this review, the most important apoptosis sensitive probes are described from both a pre-clinical and a clinical perspective, highlighting their potential but also their limitations as an early marker for therapeutic success. It seems that apoptosis imaging can help to guide therapy, not by replacing the current methodology but by providing additional and useful information. PMID:24025102

  19. Identification of Molecular Targets for Predicting Colon Adenocarcinoma.

    PubMed

    Wang, Yansheng; Zhang, Jun; Li, Li; Xu, Xin; Zhang, Yong; Teng, Zhaowei; Wu, Feihu

    2016-01-01

    BACKGROUND Colon adenocarcinoma mostly happens at the junction of the rectum and is a common gastrointestinal malignancy. Accumulated evidence has indicated that colon adenocarcinoma develops by genetic alterations and is a complicated disease. The aim of this study was to screen differentially expressed miRNAs (DEMs) and genes with diagnostic and prognostic potential in colon adenocarcinoma. MATERIAL AND METHODS In this study we screened DEMs and their target genes (DEGs) between 100 colon adenocarcinoma and normal samples in The Cancer Genome Atlas (TCGA) database by using the DESeq toolkit in Bioconductor. Then GO enrichment and KEGG pathway analysis were performed on the selected differential genes by use of the DAVID online tool. A regulation network of miRNA-gene was constructed and analyzed by Cytoscape. Finally, we performed ROC analysis of 8 miRNAs and ROC curves were drawn. RESULTS A total of 159 DEMs and 1921 DEGs were screened, and 1881 pairs of miRNA-target genes with significant negative correlations were also obtained. A regulatory network of miRNA-gene, including 60 cancer-related genes and 47 miRNAs, was successfully constructed. In addition, 5 clusters with several miRNAs regulating a set of target genes simultaneously were identified through cluster analysis. There were 8 miRNAs involved in these 5 clusters, and these miRNAs could serve as molecular biomarkers to distinguish colon adenocarcinoma from normal samples, as indicated by ROC analysis. CONCLUSIONS The 8 identified miRNAs were closely associated with colon adenocarcinoma, and may have great clinical value as diagnostic and prognostic biomarkers and provide new ideas for targeted therapy. PMID:26868022
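
    The ROC analysis mentioned in this abstract summarises how well a single marker separates tumor from normal samples. The Python sketch below computes the area under the ROC curve by pairwise comparison, which is equivalent to the Mann-Whitney statistic; the scores and labels are invented for illustration and the function name roc_auc is hypothetical, not a tool named by the authors.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p_score in pos:
        for n_score in neg:
            if p_score > n_score:
                wins += 1.0
            elif p_score == n_score:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up expression scores: label 1 = tumor sample, label 0 = normal sample.
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]

auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

    An AUC near 1 means the marker ranks almost every tumor sample above every normal sample, while 0.5 corresponds to no discrimination. For a marker that is lower in tumors, the scores would be negated before the comparison.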

  20. A Molecular Signature Predictive of Indolent Prostate Cancer

    PubMed Central

    Irshad, Shazia; Bansal, Mukesh; Castillo-Martin, Mireia; Zheng, Tian; Aytes, Alvaro; Wenske, Sven; Le Magnen, Clémentine; Guarnieri, Paolo; Sumazin, Pavel; Benson, Mitchell C.; Shen, Michael M.; Califano, Andrea; Abate-Shen, Cory

    2014-01-01

    Many newly diagnosed prostate cancers present as low Gleason score tumors that require no treatment intervention. Distinguishing the many indolent tumors from the minority of lethal ones remains a major clinical challenge. We now show that low Gleason score prostate tumors can be distinguished as indolent and aggressive subgroups on the basis of their expression of genes associated with aging and senescence. Using gene set enrichment analysis, we identified a 19-gene signature enriched in indolent prostate tumors. We then further classified this signature with a decision tree learning model to identify three genes—FGFR1, PMP22, and CDKN1A—that together accurately predicted outcome of low Gleason score tumors. Validation of this three-gene panel on independent cohorts confirmed its independent prognostic value as well as its ability to improve prognosis with currently used clinical nomograms. Furthermore, protein expression of this three-gene panel in biopsy samples distinguished Gleason 6 patients who failed surveillance over a 10-year period. We propose that this signature may be incorporated into prognostic assays for monitoring patients on active surveillance to facilitate appropriate courses of treatment. PMID:24027026

  1. Molecular recognition features (MoRFs) in three domains of life.

    PubMed

    Yan, Jing; Dunker, A Keith; Uversky, Vladimir N; Kurgan, Lukasz

    2016-03-01

    Intrinsically disordered proteins and protein regions offer numerous advantages in the context of protein-protein interactions when compared to the structured proteins and domains. These advantages include ability to interact with multiple partners, to fold into different conformations when bound to different partners, and to undergo disorder-to-order transitions concomitant with their functional activity. Molecular recognition features (MoRFs) are widespread elements located in disordered regions that undergo disorder-to-order transition upon binding to their protein partners. We characterize abundance, composition, and functions of MoRFs and their association with the disordered regions across 868 species spread across Eukaryota, Bacteria and Archaea. We found that although disorder is substantially elevated in Eukaryota, MoRFs have similar abundance and amino acid composition across the three domains of life. The abundance of MoRFs is highly correlated with the amount of intrinsic disorder in Bacteria and Archaea but only modestly correlated in Eukaryota. Proteins with MoRFs have significantly more disorder and MoRFs are present in many disordered regions, with Eukaryota having more MoRF-free disordered regions. MoRF-containing proteins are enriched in the ribosome, nucleus, nucleolus and microtubule and are involved in translation, protein transport, protein folding, and interactions with DNAs. Our insights into the nature and function of MoRFs enhance our understanding of the mechanisms underlying the disorder-to-order transition and protein-protein recognition and interactions. The fMoRFpred method that we used to annotate MoRFs is available at http://biomine.ece.ualberta.ca/fMoRFpred/. PMID:26651072

  2. Molecular Features of Triple Negative Breast Cancer: Microarray Evidence and Further Integrated Analysis

    PubMed Central

    Chen, Weicai; Wu, Huisheng; Yuan, Zishan; Wang, Kun; Li, Guojin; Sun, Jie; Yu, Limin

    2015-01-01

    Purpose Breast cancer is a heterogeneous disease usually including four molecular subtypes such as luminal A, luminal B, HER2-enriched, and triple-negative breast cancer (TNBC). TNBC is more aggressive than other breast cancer subtypes. Despite major advances in ER-positive or HER2-amplified breast cancer, there is no targeted agent currently available for TNBC, so it is urgent to identify new potential therapeutic targets for TNBC. Methods We first used microarray analysis to compare gene expression profiling between TNBC and non-TNBC. Furthermore an integrated analysis was conducted based on our own and published data, leading to more robust, reproducible and accurate predictions. Additionally, we performed qRT-PCR in breast cancer cell lines to verify the findings in integrated analysis. Results After searching Gene Expression Omnibus database (GEO), two microarray studies were obtained according to the inclusion criteria. The integrated analysis was conducted, including 30 samples of TNBC and 77 samples of non-TNBC. 556 genes were found to be consistently differentially expressed (344 up-regulated genes and 212 down-regulated genes in TNBC). Functional annotation for these differentially expressed genes (DEGs) showed that the most significantly enriched Gene Ontology (GO) term for molecular functions was protein binding (GO: 0005515, P = 6.09E-21), while that for biological processes was signal transduction (GO: 0007165, P = 9.46E-08), and that for cellular component was cytoplasm (GO: 0005737, P = 2.09E-21). The most significant pathway was Pathways in cancer (P = 6.54E-05) based on Kyoto Encyclopedia of Genes and Genomes (KEGG). DUSP1 (Degree = 21), MYEOV2 (Degree = 15) and UQCRQ (Degree = 14) were identified as the significant hub proteins in the protein-protein interaction (PPI) network. Five genes were selected to perform qRT-PCR in seven breast cancer cell lines, and qRT-PCR results showed that the expression pattern of selected genes in TNBC lines and

  3. New QSAR prediction models derived from GPCR CB2-antagonistic triaryl bis-sulfone analogues by a combined molecular morphological and pharmacophoric approach.

    PubMed

    Chen, J-Z; Myint, K-Z; Xie, X-Q

    2011-01-01

    In order to build quantitative structure-activity relationship (QSAR) models for virtual screening of novel cannabinoid CB2 ligands and hit ranking selections, a new QSAR algorithm has been developed for the cannabinoid ligands, triaryl bis-sulfones, using a combined molecular morphological and pharmacophoric search approach. Both pharmacophore features and shape complementarity were considered using a number of molecular descriptors, including Surflex-Sim similarity and Unity Query fit, in addition to other molecular properties such as molecular weight, ClogP, molecular volume, molecular area, molecular polar volume, molecular polar surface area and dipole moment. Subsequently, partial least squares regression analyses were carried out to derive QSAR models linking bioactivity and the descriptors mentioned, using a training set of 25 triaryl bis-sulfones. Good prediction capability was confirmed for the best QSAR model by evaluation against a test set of a further 20 triaryl bis-sulfones. The pharmacophore and molecular shape-based QSAR scoring function now established can be used to predict the biological properties of virtual hits or untested compounds obtained from ligand-based virtual screenings. PMID:21714749

  4. Predicted Rupture Force of a Single Molecular Bond Becomes Rate Independent at Ultralow Loading Rates

    NASA Astrophysics Data System (ADS)

    Li, Dechang; Ji, Baohua

    2014-02-01

    We present for the first time a theoretical model for studying the saturation of the rupture force of a single molecular bond, which causes the rupture force to become rate independent under an ultralow loading rate. This saturation brings challenges to understanding the rupture behavior of the molecular bond using conventional methods. This intriguing feature implies that the molecular bond has a nonzero strength at a vanishing loading rate. We find that the saturation behavior is caused by bond rebinding when the loading rate is lower than a limiting value that depends on the loading stiffness.

  5. Expanding the clinical spectrum of the 'HDAC8-phenotype' - implications for molecular diagnostics, counseling and risk prediction.

    PubMed

    Parenti, I; Gervasini, C; Pozojevic, J; Wendt, K S; Watrin, E; Azzollini, J; Braunholz, D; Buiting, K; Cereda, A; Engels, H; Garavelli, L; Glazar, R; Graffmann, B; Larizza, L; Lüdecke, H J; Mariani, M; Masciadri, M; Pié, J; Ramos, F J; Russo, S; Selicorni, A; Stefanova, M; Strom, T M; Werner, R; Wierzba, J; Zampino, G; Gillessen-Kaesbach, G; Wieczorek, D; Kaiser, F J

    2016-05-01

    Cornelia de Lange syndrome (CdLS) is a clinically heterogeneous disorder characterized by typical facial dysmorphism, cognitive impairment and multiple congenital anomalies. Approximately 75% of patients carry a variant in one of the five cohesin-related genes NIPBL, SMC1A, SMC3, RAD21 and HDAC8. Herein we report on the clinical and molecular characterization of 11 patients carrying 10 distinct variants in HDAC8. Given the high number of variants identified so far, we advise sequencing of HDAC8 as an indispensable part of routine molecular diagnostics for patients with CdLS or CdLS-overlapping features. The phenotype of our patients is very broad, although males tend to be more severely affected than females, who instead often present with less canonical CdLS features. The extensive clinical variability observed in the heterozygous females might be at least partially associated with a completely skewed X-inactivation, observed in seven out of eight female patients. Our cohort also includes two affected siblings whose unaffected mother was found to be mosaic for the causative mutation transmitted to both affected children. This further supports the urgent need for an integration of highly sensitive sequencing technology to allow appropriate molecular diagnostics, genetic counseling and risk prediction. PMID:26671848

  6. Computing Molecular Signatures as Optima of a Bi-Objective Function: Method and Application to Prediction in Oncogenomics

    PubMed Central

    Gardeux, Vincent; Chelouah, Rachid; Wanderley, Maria F Barbosa; Siarry, Patrick; Braga, Antônio P; Reyal, Fabien; Rouzier, Roman; Pusztai, Lajos; Natowicz, René

    2015-01-01

    BACKGROUND Filter feature selection methods compute molecular signatures by selecting subsets of genes in the ranking of a valuation function. The motivations for the choice of valuation function are almost always clearly stated, but those for selecting the genes according to their ranking are hardly ever explicit. METHOD We addressed the computation of molecular signatures by searching the optima of a bi-objective function whose solution space was the set of all possible molecular signatures, i.e., the set of subsets of genes. The two objectives were the size of the signature (to be minimized) and the interclass distance induced by the signature (to be maximized). RESULTS We showed that: 1) the convex combination of the two objectives had exactly n optimal nonempty signatures, where n was the number of genes, 2) the n optimal signatures were nested, and 3) the optimal signature of size k was the subset of k top ranked genes that contributed the most to the interclass distance. We applied our feature selection method on five public datasets in oncology, and assessed the prediction performances of the optimal signatures as input to the diagonal linear discriminant analysis (DLDA) classifier. They were at the same level or better than the best-reported ones. The predictions were robust, and the signatures were almost always significantly smaller. We studied in more detail the performances of our predictive modeling on two breast cancer datasets to predict the response to a preoperative chemotherapy: the performances were higher than the previously reported ones, the signatures were three times smaller (11 versus 30 gene signatures), and the member genes of the signature were known to be involved in the response to chemotherapy. CONCLUSIONS Defining molecular signatures as the optima of a bi-objective function that combined the signature size and the interclass distance was well founded and efficient for prediction in oncogenomics. The complexity of the computation
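
    The nested-signature result in this record can be illustrated with a minimal sketch. It assumes that the interclass distance of a signature is simply the sum of per-gene contributions; the abstract does not state this. The gene names, contribution values, weight `lam`, and all function names are hypothetical, and this is not the authors' implementation.

```python
from typing import Dict, List, Tuple


def rank_genes(contrib: Dict[str, float]) -> List[str]:
    """Rank genes by decreasing contribution to the interclass distance."""
    return sorted(contrib, key=lambda g: (-contrib[g], g))


def nested_signatures(contrib: Dict[str, float]) -> List[List[str]]:
    """Return the n nested candidates: top-1, top-2, ..., top-n genes."""
    ranked = rank_genes(contrib)
    return [ranked[:k] for k in range(1, len(ranked) + 1)]


def best_signature(contrib: Dict[str, float], lam: float) -> Tuple[List[str], float]:
    """Maximize lam * distance(S) - (1 - lam) * |S| over the nested candidates.

    Assumption: distance(S) is the sum of the per-gene contributions in S.
    """
    best: List[str] = []
    best_score = float("-inf")
    for sig in nested_signatures(contrib):
        score = lam * sum(contrib[g] for g in sig) - (1.0 - lam) * len(sig)
        if score > best_score:
            best, best_score = sig, score
    return best, best_score


# Hypothetical per-gene contributions (not data from the study).
contrib = {"geneA": 4.0, "geneB": 2.5, "geneC": 0.4, "geneD": 1.2}
signature, score = best_signature(contrib, lam=0.5)
print(signature, round(score, 3))
```

    Under the additive assumption, a gene enters the optimal signature only if its weighted contribution exceeds the weighted cost of one extra gene, which is why only the n nested top-k subsets need to be scored rather than all subsets.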

  7. The role of ultrasonographic findings to predict molecular subtype, histologic grade, and hormone receptor status of breast cancer

    PubMed Central

    Çelebi, Filiz; Pilancı, Kezban Nur; Ordu, Çetin; Ağacayak, Filiz; Alço, Gül; İlgün, Serkan; Sarsenov, Dauren; Erdoğan, Zeynep; Özmen, Vahit

    2015-01-01

    PURPOSE The correlation between imaging findings and pathologic characteristics of tumors may provide information for diagnosis and treatment of cancer. The aim of this study is to determine whether ultrasound features of breast cancer are associated with molecular subtype, histologic grade, and hormone receptor status, as well as to assess the predictive value of these features. METHODS A total of 201 consecutive invasive breast cancer patients were reviewed from the database according to the Breast Imaging and Reporting Data System (BI-RADS). Tumor margins were classified as circumscribed and noncircumscribed. The noncircumscribed group was divided into indistinct, spiculated, angular, and microlobulated. The posterior acoustic features were divided into four categories: shadowing, enhancement, no change, and mixed pattern. RESULTS Tumors with posterior shadowing were more likely to be of non-triple-negative subtype (odds ratio [OR], 7.42; 95% CI, 2.10–24.99; P = 0.002), to be of low histologic grade (grade 1 or 2 vs. grade 3: OR, 2.42; 95% CI, 1.34–4.35; P = 0.003) and to have at least one positive receptor (OR, 3.36; 95% CI, 1.55–7.26; P = 0.002). Tumors with circumscribed margins were more often triple-negative subtype (OR, 6.72; 95% CI, 2.56–17.65; P < 0.001), high grade (grade 3 vs. grade 1 or 2: OR, 5.42; 95% CI, 2.66–11.00; P < 0.001) and hormone receptor negative (OR, 4.87; 95% CI, 2.37–9.99; P < 0.001). CONCLUSION Sonographic features are strongly associated with molecular subtype, histologic grade, and hormone receptor status of the tumor. These findings may separate triple-negative breast cancer from other molecular subtypes. PMID:26359880
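
    For readers unfamiliar with the statistic reported here, the sketch below shows one common way to obtain an odds ratio and a 95% confidence interval from a 2x2 table (the Woolf, or log-Wald, interval). The abstract does not say how its odds ratios were estimated, so this is an assumption for illustration only, and the cell counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    a: feature present, outcome present    b: feature present, outcome absent
    c: feature absent,  outcome present    d: feature absent,  outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
odds_ratio, lower, upper = odds_ratio_ci(20, 10, 30, 60)
print(odds_ratio, lower, upper)
```

    The interval is symmetric on the log scale, so the product of its bounds equals the squared odds ratio; this is a quick sanity check when reading intervals such as those quoted above.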

  8. TU-C-17A-10: Patient Features Based Dosimetric Pareto Front Prediction In Esophagus Cancer Radiotherapy

    SciTech Connect

    Wang, J; Zhao, K; Peng, J; Hu, W; Jin, X

    2014-06-15

    Purpose: The purpose of this study is to assess the feasibility of dosimetric Pareto front (PF) prediction based on patient anatomic and dosimetric parameters for esophagus cancer patients. Methods: Sixty esophagus patients in our institution were enrolled in this study. A total of 2920 IMRT plans were created to generate the PF for each patient. On average, each patient had 48 plans. The anatomic and dosimetric features were extracted from those plans. The mean lung dose (MLD), mean heart dose (MHD), spinal cord max dose and PTV homogeneity index (PTVHI) were recorded for each plan. Principal component analysis (PCA) was used to extract overlap volume histogram (OVH) features between the PTV and other critical organs. The full dataset was separated into two parts: the training dataset and the validation dataset. The prediction outcomes were the MHD and MLD for the current study. The Spearman rank correlation coefficient was used to evaluate the correlation between the anatomical features and dosimetric features. The PF was fit by the stepwise multiple regression method. The cross-validation method was used to evaluate the model. Results: The mean prediction error of the MHD was 465 cGy with 100 repetitions. The most correlated factors were the first principal components of the OVH between heart and PTV, and the overlap between heart and PTV in Z-axis. The mean prediction error of the MLD was 195 cGy. The most correlated factors were the first principal components of the OVH between lung and PTV, and the overlap between lung and PTV in Z-axis. Conclusion: It is feasible to use patients' anatomic and dosimetric features to generate a predicted PF. Additional samples and further studies are required to get a better prediction model.
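
    The validation step described in this record (fit on a training set, measure prediction error on a validation set, repeat) can be sketched as follows. This is a simplified stand-in, not the study's method: it uses ordinary least squares with a single hypothetical predictor instead of stepwise multiple regression on PCA-derived OVH features, and the split fraction, seed, and data are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares fit y = a + b * x for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def repeated_holdout_error(
    x: Sequence[float],
    y: Sequence[float],
    repetitions: int = 100,
    test_fraction: float = 0.25,
    seed: int = 0,
) -> float:
    """Mean absolute prediction error over repeated random train/validation splits."""
    rng = random.Random(seed)
    idx = list(range(len(x)))
    n_test = max(1, int(len(x) * test_fraction))
    errors: List[float] = []
    for _ in range(repetitions):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        errors.extend(abs(y[i] - (a + b * x[i])) for i in test)
    return sum(errors) / len(errors)


# Hypothetical, exactly linear data: the validation error should be near zero.
x_demo = [float(i) for i in range(20)]
y_demo = [3.0 + 2.0 * xi for xi in x_demo]
print(repeated_holdout_error(x_demo, y_demo))
```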

  9. Performance comparison of the Prophecy (forecasting) Algorithm in FFT form for unseen feature and time-series prediction

    NASA Astrophysics Data System (ADS)

    Jaenisch, Holger; Handley, James

    2013-06-01

    We introduce a generalized numerical prediction and forecasting algorithm. We have previously published it for malware byte sequence feature prediction and generalized distribution modeling for disparate test article analysis. We show how non-trivial non-periodic extrapolation of a numerical sequence (forecast and backcast) from the starting data is possible. Our ancestor-progeny prediction can yield new options for evolutionary programming. Our equations enable analytical integrals and derivatives to any order. Interpolation is controllable from smooth continuous to fractal structure estimation. We show how our generalized trigonometric polynomial can be derived using a Fourier transform.

  10. Investigating the molecular structural features of hulless barley (Hordeum vulgare L.) in relation to metabolic characteristics using synchrotron-based fourier transform infrared microspectroscopy.

    PubMed

    Yang, Ling; Christensen, David A; McKinnon, John J; Beattie, Aaron D; Xin, Hangshu; Yu, Peiqiang

    2013-11-27

    The synchrotron-based Fourier transform infrared microspectroscopy (SR-FTIRM) technique was used to quantify molecular structural features of the four hulless barley lines with altered carbohydrate traits [amylose, 1-40% of dry matter (DM); β-glucan, 5-10% of DM] in relation to rumen degradation kinetics, intestinal nutrient digestion, and predicted protein supply. Spectral features of β-glucan (both area and heights) in hulless barley lines showed a negative correlation with protein availability in the small intestine, including truly digested protein in the small intestine (DVE) (r = -0.76, P < 0.01; r = -0.84, P < 0.01) and total metabolizable protein (MP) (r = -0.71, P < 0.05; r = -0.84, P < 0.01). Variation in absorption intensities of total carbohydrate (CHO) was observed with negative effects on protein degradation, digestion, and potential protein supply (P < 0.05). Molecular structural features of CHO in hulless barley have negative effects on the supply of true protein to ruminants. The results clearly indicated the impact of the carbohydrate-protein structure and matrix. PMID:24156528

  11. Prediction of the Fate of Organic Compounds in the Environment From Their Molecular Properties: A Review

    PubMed Central

    Mamy, Laure; Patureau, Dominique; Barriuso, Enrique; Bedos, Carole; Bessac, Fabienne; Louchart, Xavier; Martin-laurent, Fabrice; Miege, Cecile; Benoit, Pierre

    2015-01-01

    A comprehensive review of quantitative structure-activity relationships (QSAR) allowing the prediction of the fate of organic compounds in the environment from their molecular properties was conducted. The considered processes were water dissolution, dissociation, volatilization, retention on soils and sediments (mainly adsorption and desorption), degradation (biotic and abiotic), and absorption by plants. A total of 790 equations involving 686 structural molecular descriptors are reported to estimate 90 environmental parameters related to these processes. A significant number of equations was found for the dissociation process (pKa), water dissolution or hydrophobic behavior (especially through the KOW parameter), adsorption to soils and biodegradation. A lack of QSAR was observed for estimating desorption or potential of transfer to water. Among the 686 molecular descriptors, five were found to be dominant in the 790 collected equations and the most generic ones: four quantum-chemical descriptors, the energy of the highest occupied molecular orbital (EHOMO) and the energy of the lowest unoccupied molecular orbital (ELUMO), polarizability (α) and dipole moment (μ), and one constitutional descriptor, the molecular weight. Keeping in mind that the combination of descriptors belonging to different categories (constitutional, topological, quantum-chemical) improved QSAR performance, these descriptors should be considered for the development of new QSAR, for further predictions of environmental parameters. This review also helps identify the relevant QSAR equations to predict the fate of a wide diversity of compounds in the environment. PMID:25866458

  12. Characterizing structural features of cuticle-degrading proteases from fungi by molecular modeling

    PubMed Central

    Liu, Shu-Qun; Meng, Zhao-Hui; Yang, Jin-Kui; Fu, Yun-Xin; Zhang, Ke-Qin

    2007-01-01

    Background Serine proteases secreted by nematode and insect pathogenic fungi are bio-control agents which have commercial potential for developing into effective bio-pesticides. A thorough understanding of the structural and functional features of these proteases would significantly assist with targeting the design of efficient bio-control agents. Results Structural models of serine proteases PR1 from entomophagous fungus, Ver112 and VCP1 from nematophagous fungi, have been built using the homology modeling technique based on the crystal coordinates of proteinase K. In combination with multiple sequence alignment, these models suggest one similar calcium-binding site and two common disulfide bridges in the three cuticle-degrading enzymes. In addition, the predicted models of the three cuticle-degrading enzymes present an essentially identical backbone topology and similar geometric properties with the exception of a limited number of sites exhibiting relatively large local conformational differences only in some surface loops and the N- and C-termini. However, they differ from each other in the electrostatic surface potential, in hydrophobicity and size of the S4 substrate-binding pocket, and in the number and distribution of hydrogen bonds and salt bridges within regions that are part of or in close proximity to the S2-loop. Conclusion These differences likely lead to variations in substrate specificity and catalytic efficiency among the three enzymes. Amino acid polymorphisms in cuticle-degrading enzymes were discussed with respect to functional effects and host preference. It is hoped that these structural models would provide a further basis for exploitation of these serine proteases from pathogenic fungi as effective bio-control agents. PMID:17511867

  13. Structural features of interfacial tyrosine residue in ROBO1 fibronectin domain-antibody complex: Crystallographic, thermodynamic, and molecular dynamic analyses

    PubMed Central

    Nakayama, Taisuke; Mizohata, Eiichi; Yamashita, Takefumi; Nagatoishi, Satoru; Nakakido, Makoto; Iwanari, Hiroko; Mochizuki, Yasuhiro; Kado, Yuji; Yokota, Yuki; Satoh, Reiko; Tsumoto, Kouhei; Fujitani, Hideaki; Kodama, Tatsuhiko; Hamakubo, Takao; Inoue, Tsuyoshi

    2015-01-01

    ROBO1, a fibronectin type III domain (Fn)-containing protein, is a novel immunotherapeutic target for hepatocellular carcinoma in humans. The crystal structure of the antigen-binding fragment (Fab) of B2212A, the monoclonal antibody against the third Fn domain (Fn3) of ROBO1, was determined in pursuit of an antibody drug for hepatocellular carcinoma. This effort was conducted in the presence or absence of the antigen, with the chemical features being investigated by determining the affinity of the antibody using molecular dynamics (MD) and thermodynamics. The structural comparison of B2212A Fab between the complex and the free form revealed that the interfacial TyrL50 (superscripts L, H, and F stand for the residues in the light chain, heavy chain, and Fn3, respectively) played important roles in Fn3 recognition. That is, the aromatic ring of TyrL50 pivoted toward PheF68, forming a CH/π interaction and a new hydrogen bond with the carbonyl O atom of PheF68. MD simulations predicted that the TyrL50-PheF68 interaction almost entirely dominated Fab-Fn3 binding, and Ala-substitution of TyrL50 led to a reduced binding of the resultant complex. On the contrary, isothermal titration calorimetry experiments underscored that Ala-substitution of TyrL50 caused an increase of the binding enthalpy between B2212A and Fn3, but importantly, it induced an increase of the binding entropy, resulting in a suppression of loss in the Gibbs free energy in total. These results suggest that mutation analysis considering the binding entropy as well as the binding enthalpy will aid in the development of novel antibody drugs for hepatocellular carcinoma. PMID:25492858

  14. Prediction of transport properties by molecular simulation: methanol and ethanol and their mixture.

    PubMed

    Guevara-Carrion, Gabriela; Nieto-Draghi, Carlos; Vrabec, Jadran; Hasse, Hans

    2008-12-25

    Transport properties of liquid methanol and ethanol are predicted by molecular dynamics simulation. The molecular models for the alcohols are rigid, nonpolarizable, and of united-atom type. They were developed in preceding work using experimental vapor-liquid equilibrium data only. Self- and Maxwell-Stefan diffusion coefficients as well as the shear viscosity of methanol, ethanol, and their binary mixture are determined using equilibrium molecular dynamics and the Green-Kubo formalism. Nonequilibrium molecular dynamics is used for predicting the thermal conductivity of the two pure substances. The transport properties of the fluids are calculated over a wide temperature range at ambient pressure and compared with experimental and simulation data from the literature. Overall, a very good agreement with the experiment is found. For instance, the self-diffusion coefficient and the shear viscosity are predicted with average deviations of less than 8% for the pure alcohols and 12% for the mixture. The predicted thermal conductivity agrees on average within 5% with the experimental data. Additionally, some velocity and shear viscosity autocorrelation functions are presented and discussed. Radial distribution functions for ethanol are also presented. The predicted excess volume, excess enthalpy, and the vapor-liquid equilibrium of the binary mixture methanol + ethanol are assessed and agree well with experimental data. PMID:19367909
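
    The Green-Kubo route to the self-diffusion coefficient mentioned in this record integrates the velocity autocorrelation function (VACF): D = (1/3) ∫ <v(0)·v(t)> dt. The sketch below is a minimal illustration of that integral, assuming a VACF that has already been sampled at a fixed time step. It is checked against a synthetic exponential VACF in arbitrary units, not against methanol or ethanol data, and the function name is hypothetical.

```python
import math
from typing import Sequence


def green_kubo_diffusion(vacf: Sequence[float], dt: float) -> float:
    """Self-diffusion coefficient D = (1/3) * integral of <v(0).v(t)> dt.

    vacf holds the velocity autocorrelation function sampled every dt;
    the integral is approximated with the trapezoidal rule.
    """
    if len(vacf) < 2:
        raise ValueError("need at least two VACF samples")
    integral = dt * (sum(vacf) - 0.5 * (vacf[0] + vacf[-1]))
    return integral / 3.0


# Synthetic check: c0 * exp(-t / tau) integrates analytically to c0 * tau,
# so the estimate should approach c0 * tau / 3.
c0, tau, dt = 3.0, 0.5, 0.001
vacf = [c0 * math.exp(-i * dt / tau) for i in range(20001)]  # t from 0 to 20
d_est = green_kubo_diffusion(vacf, dt)
print(d_est)
```

    In practice the upper integration limit has to be long enough for the VACF to decay; the synthetic example integrates out to 40 decay times for that reason.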

  15. Apocalypse...now? Molecular epidemiology, predictive genetic tests, and social communication of genetic contents.

    PubMed

    Castiel, L D

    1999-01-01

    The author analyzes the underlying theoretical aspects in the construction of the molecular watershed of epidemiology and the concept of genetic risk, focusing on issues raised by contemporary reality: new technologies, globalization, proliferation of communications strategies, and the dilution of identity matrices. He discusses problems pertaining to the establishment of such new interdisciplinary fields as molecular epidemiology and molecular genetics. Finally, he analyzes the repercussions of the social communication of genetic content, especially as related to predictive genetic tests and cloning of animals, based on triumphal, deterministic metaphors sustaining beliefs relating to the existence and supremacy of concepts such as 'purity', 'essence', and 'unification' of rational, integrated 'I's/egos'. PMID:10089550

  16. Andic soil features and debris flows in Italy. New perspective towards prediction

    NASA Astrophysics Data System (ADS)

    Scognamiglio, Solange; Calcaterra, Domenico; Iamarino, Michela; Langella, Giuliano; Orefice, Nadia; Vingiani, Simona; Terribile, Fabio

    2016-04-01

    Debris flows are dangerous hazards causing fatalities and damage. Previous works have demonstrated that the materials involved in debris flows in Campania (southern Italy) are soils classified as Andosols. These soils have peculiar chemical and physical properties which make them fertile but also vulnerable to landslides. In Italy, andic soil properties are found both in volcanic and non-volcanic mountain ecosystems (VME and NVME). Here, we focused on the assessment of the main chemical and physical properties of the soils in the detachment areas of eight debris flows that occurred in NVME of Italy in the last 70 years. Such landslides were selected by consulting the official Italian geodatabase (IFFI Project). Andic properties (by means of ammonium oxalate extractable Fe, Si and Al forms for the calculation of Alo+1/2Feo) were also evaluated and a comparison with soils of VME was performed to assess possible common features. Landslide source areas were characterised by slope gradient ranging from 25° to 50° and lithological heterogeneity of the bedrock. The soils showed similar features, i.e. all were very deep and had a moderately thick topsoil with a high organic carbon (OC) content decreasing regularly with depth. The cation exchange capacity trend was generally consistent with the OC and the pH varied from extremely to slightly acid, but increased with depth. Furthermore, the soils had high water retention values both at saturation (0.63 to 0.78 cm3 cm‑3) and in the dryer part of the water retention curve, and displayed a prevalent loamy texture. Such properties denote the chemical and physical fertility of the investigated ecosystems. The values of Alo+1/2Feo indicated that the soils had vitric or andic features and can be classified as Andosols. The comparison between NVME soils and those of VME showed similar depth, thickness of soil horizons, and family texture, whereas soil pH, degree of development of andic properties and allophane content were higher for VME soils

  17. Predicting molecular scale skin-effect in electrochemical impedance due to anomalous subdiffusion mediated adsorption phenomenon

    NASA Astrophysics Data System (ADS)

    Kushagra, Arindam

    2016-02-01

    Anomalous subdiffusion governs the processes which are not energetically driven, on a molecular scale. This paper proposes a model to predict the response of electrochemical impedance due to such a diffusion process. Previous works considered the use of fractional calculus to predict the impedance behaviour in response to the anomalous diffusion. Here, we have developed an expression which predicts the skin-effect, marked by an increase in the impedance with increasing frequency, in this regime. Negative inductances have also been predicted as a consequence of the inertial response of adsorbed species upon application of frequency-mediated perturbations. It might help researchers in the field of impedimetric sensors to choose the working frequency and those working in the field of batteries to choose the parameters, likewise. This work would shed some light on the molecular mechanisms governing the impedance when exposed to frequency-based perturbations like electromagnetic waves (microwaves to ionizing radiations) and in charge storage devices like batteries etc.

  18. Prediction of Selected Physical and Mechanical Properties of a Telechelic Polybenzoxazine by Molecular Simulation

    PubMed Central

    Wan Hassan, Wan Aminah; Hamerton, Ian; Howlin, Brendan J.

    2013-01-01

    Molecular simulation is becoming an important tool for both understanding polymeric structures and predicting their physical and mechanical properties. In this study, temperature ramped molecular dynamics simulations are used to predict two physical properties (i.e., glass transition temperature and thermal degradation temperature) of a previously synthesised and published telechelic benzoxazine. Plots of simulated density versus temperature show decreases in density within the same temperature range as experimental values for the thermal degradation. The predicted value for the thermal degradation temperature for the cured polybenzoxazine based on the telechelic polyetherketone (PEK) monomer was ca. 400°C, in line with the experimental thermal degradation temperature range of 450°C to 500°C. Mechanical properties of both the unmodified PEK and the telechelic benzoxazines are simulated and compared to experimental values (where available). The introduction of the benzoxazine moieties is predicted to increase the elastic moduli in line with the increase of crosslinking in the system. PMID:23577206

  19. Tracking the Correlation Between CpG Island Methylator Phenotype and Other Molecular Features and Clinicopathological Features in Human Colorectal Cancers: A Systematic Review and Meta-Analysis

    PubMed Central

    Zong, Liang; Abe, Masanobu; Ji, Jiafu; Zhu, Wei-Guo; Yu, Duonan

    2016-01-01

    Objectives: The controversy of CpG island methylator phenotype (CIMP) in colorectal cancers (CRCs) persists, despite many studies that have been conducted on its correlation with molecular and clinicopathological features. To derive a more precise estimate of the strength of this postulated relationship, a meta-analysis was performed. Methods: A comprehensive search for studies reporting molecular and clinicopathological features of CRCs stratified by CIMP was performed within PubMed, EMBASE, and the Cochrane Library. CIMP was defined by either one of the three panels of gene-specific CIMP markers (Weisenberger panel, classic panel, or a mixture panel of the previous two) or the genome-wide DNA methylation profile. The associations of CIMP with outcome parameters were estimated using odds ratio (OR) or weighted mean difference (WMD) or hazard ratios (HRs) with 95% confidence interval (CI) for each study using a fixed effects or random effects model. Results: A total of 29 studies involving 9,393 CRC patients were included for analysis. We observed more BRAF mutations (OR 34.87; 95% CI, 22.49–54.06) and microsatellite instability (MSI) (OR 12.85; 95% CI, 8.84–18.68) in CIMP-positive vs. -negative CRCs, whereas KRAS mutations were less frequent (OR 0.47; 95% CI, 0.30–0.75). Subgroup analysis showed that only the genome-wide methylation profile-defined CIMP subset encompassed all BRAF-mutated CRCs. As expected, CIMP-positive CRCs displayed significant associations with female sex (OR 0.64; 95% CI, 0.56–0.72), older age at diagnosis (WMD 2.77; 95% CI, 1.15–4.38), proximal location (OR 6.91; 95% CI, 5.17–9.23), mucinous histology (OR 3.81; 95% CI, 2.93–4.95), and poor differentiation (OR 4.22; 95% CI, 2.52–7.08). Although CIMP did not show a correlation with tumor stage (OR 1.10; 95% CI, 0.82–1.46), it was associated with shorter overall survival (HR 1.73; 95% CI, 1.27–2.37). Conclusions: The meta-analysis highlights that CIMP-positive CRCs take their own
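
    The pooling of per-study odds ratios mentioned in the Methods of this record can be sketched with inverse-variance weighting on the log scale. The sketch covers only the fixed-effect case (the abstract also mentions random-effects models), recovers each standard error from the reported 95% interval, and uses hypothetical study values rather than data from the meta-analysis.

```python
import math
from typing import List, Tuple


def pool_fixed_effect(
    studies: List[Tuple[float, float, float]], z: float = 1.96
) -> Tuple[float, float, float]:
    """Inverse-variance fixed-effect pooling of odds ratios.

    Each study is (OR, CI lower, CI upper) with a 95% interval; the standard
    error of log(OR) is recovered from the width of that interval.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for odds_ratio, lower, upper in studies:
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(odds_ratio)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    se_pooled = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * se_pooled),
        math.exp(pooled_log + z * se_pooled),
    )


# Hypothetical per-study results, not taken from the meta-analysis.
studies = [(2.0, 1.0, 4.0), (3.0, 1.5, 6.0), (2.5, 1.25, 5.0)]
pooled_or, pooled_lo, pooled_hi = pool_fixed_effect(studies)
print(pooled_or, pooled_lo, pooled_hi)
```

    Because the three hypothetical studies have equally wide intervals on the log scale, they receive equal weight and the pooled estimate is their geometric mean; the pooled interval is narrower than any single study's.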

  20. T-RECS: STABLE SELECTION OF DYNAMICALLY FORMED GROUPS OF FEATURES WITH APPLICATION TO PREDICTION OF CLINICAL OUTCOMES

    PubMed Central

    Huang, Grace T.; Tsamardinos, Ioannis; Raghu, Vineet; Kaminski, Naftali; Benos, Panayiotis V.

    2014-01-01

    Feature selection is used extensively in biomedical research for biomarker identification and patient classification, both of which are essential steps in developing personalized medicine strategies. However, the structured nature of the biological datasets and high correlation of variables frequently yield multiple equally optimal signatures, thus making traditional feature selection methods unstable. Features selected based on one cohort of patients may not work as well in another cohort. In addition, biologically important features may be missed due to selection of other co-clustered features. We propose a new method, Tree-guided Recursive Cluster Selection (T-ReCS), for efficient selection of grouped features. T-ReCS significantly improves predictive stability while maintaining the same level of accuracy. T-ReCS does not require a priori knowledge of the clusters, as group-lasso does, and can also handle “orphan” features (not belonging to a cluster). T-ReCS can be used with categorical or survival target variables. Tested on simulated and real expression data from breast cancer and lung diseases and survival data, T-ReCS selected stable cluster features without significant loss in classification accuracy. PMID:25592602

  1. From molecular signatures to predictive biomarkers: modeling disease pathophysiology and drug mechanism of action

    PubMed Central

    Heinzel, Andreas; Perco, Paul; Mayer, Gert; Oberbauer, Rainer; Lukas, Arno; Mayer, Bernd

    2014-01-01

    Omics profiling significantly expanded the molecular landscape describing clinical phenotypes. Association analysis resulted in first diagnostic and prognostic biomarker signatures entering clinical utility. However, utilizing Omics for deepening our understanding of disease pathophysiology, and further including specific interference with drug mechanism of action on a molecular process level still sees limited added value in the clinical setting. We exemplify a computational workflow for expanding from statistics-based association analysis toward deriving molecular pathway and process models for characterizing phenotypes and drug mechanism of action. Interference analysis on the molecular model level allows identification of predictive biomarker candidates for testing drug response. We discuss this strategy on diabetic nephropathy (DN), a complex clinical phenotype triggered by diabetes and presenting with renal as well as cardiovascular endpoints. A molecular pathway map indicates involvement of multiple molecular mechanisms, and selected biomarker candidates reported as associated with disease progression are identified for specific molecular processes. Selective interference of drug mechanism of action and disease-associated processes is identified for drug classes in clinical use, in turn providing precision medicine hypotheses utilizing predictive biomarkers. PMID:25364744

  2. Application of computer-extracted breast tissue texture features in predicting false-positive recalls from screening mammography

    NASA Astrophysics Data System (ADS)

    Ray, Shonket; Choi, Jae Y.; Keller, Brad M.; Chen, Jinbo; Conant, Emily F.; Kontos, Despina

    2014-03-01

    Mammographic texture features have been shown to have value in breast cancer risk assessment. Previous models have also been developed that use computer-extracted mammographic features of breast tissue complexity to predict the risk of false-positive (FP) recall from breast cancer screening with digital mammography. This work details a novel locally adaptive parenchymal texture analysis algorithm that identifies and extracts mammographic features of local parenchymal tissue complexity potentially relevant for false-positive biopsy prediction. This algorithm has two important aspects: (1) the adaptive nature of automatically determining an optimal number of regions of interest (ROIs) in the image and each ROI's corresponding size based on the parenchymal tissue distribution over the whole breast region and (2) characterizing both the local and global mammographic appearances of the parenchymal tissue that could provide more discriminative information for FP biopsy risk prediction. Preliminary results show that this locally adaptive texture analysis algorithm, in conjunction with logistic regression, can predict the likelihood of false-positive biopsy with an ROC performance value of AUC=0.92 (p<0.001) with a 95% confidence interval [0.77, 0.94]. Significant texture feature predictors (p<0.05) included contrast, sum variance and difference average. Sensitivity for false-positives was 51% at the 100% cancer detection operating point. Although preliminary, clinical implications of using prediction models incorporating these texture features may include the future development of better tools and guidelines regarding personalized breast cancer screening recommendations. Further studies are warranted to prospectively validate our findings in larger screening populations and evaluate their clinical utility.
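
    The AUC reported in this record summarizes how well predicted probabilities separate the two classes. As a minimal sketch, the AUC can be computed as the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half (the Mann-Whitney formulation). The scores and labels below are hypothetical and are not outputs of the texture model described above.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as one half (the Mann-Whitney U formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities of a false-positive recall (label 1).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

    The pairwise loop is quadratic in the number of cases, which is acceptable for an illustration; a rank-based computation would be preferable for large screening cohorts.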

  3. Criminal recidivism among juvenile offenders: testing the incremental and predictive validity of three measures of psychopathic features.

    PubMed

    Douglas, Kevin S; Epstein, Monica E; Poythress, Norman G

    2008-10-01

    We studied the predictive, comparative, and incremental validity of three measures of psychopathic features (Psychopathy Checklist: Youth Version [PCL:YV]; Antisocial Process Screening Device [APSD]; Childhood Psychopathy Scale [CPS]) vis-à-vis criminal recidivism among 83 delinquent youth within a truly prospective design. Bivariate and multivariate analyses (Cox proportional hazard analyses) showed that of the three measures, the CPS was most consistently related to most types of recidivism in comparison to the other measures. However, incremental validity analyses demonstrated that all of the predictive effects for the measures of psychopathic features disappeared after conceptually relevant covariates (i.e., substance use, conduct disorder, young age, past property crime) were included in multivariate predictive models. Implications for the limits of these measures in applied juvenile justice assessment are discussed. PMID:18064548

  4. Structural and molecular features of intestinal strictures in rats with Crohn's-like disease

    PubMed Central

    Talapka, Petra; Berkó, Anikó; Nagy, Lajos István; Chandrakumar, Lalitha; Bagyánszki, Mária; Puskás, László Géza; Fekete, Éva; Bódi, Nikolett

    2016-01-01

    AIM: To develop a new rat model in order to gain a better understanding of stricture formation in Crohn’s disease (CD). METHODS: Chronic colitis was induced locally by the administration of 2,4,6-trinitrobenzenesulfonic acid (TNBS). The relapsing inflammation characteristic of CD was mimicked by repeated TNBS treatments. Animals were randomly divided into control, once, twice and three times TNBS-treated groups. Control animals received an enema of saline. Tissue samples were taken from the strictured colonic segments and also from the segments adjacent proximally and distally to them 60, 90 or 120 d after the last TNBS or saline administrations. The frequency and macroscopic extent of the strictures were measured on digital photographs. The structural features of the strictured gut wall were studied by light and electron microscopy. Inflammation-related alterations in TGF-beta 2 and 3, matrix metalloproteinase 9 (MMP9) and TIMP1 mRNA and protein expression were determined by quantitative real-time PCR and western blot analysis. The quantitative distribution of caspase 9 was determined by post-embedding immunohistochemistry. RESULTS: Intestinal strictures first appeared 60 d after TNBS treatments and their frequency increased up to day 120. From day 90 an intact lamina epithelialis, reversible thickening of the lamina muscularis mucosae and irreversible thickening of the muscularis externa were demonstrated in the strictured colonic segments. Nevertheless, the morphological signs of apoptosis were frequently seen and excess extracellular matrix deposition was recorded between smooth muscle cells (SMCs). Enhanced caspase 9 expression on day 90 in the SMCs and on day 120 also in myenteric neurons indicated the induction of apoptosis. The mRNA expression profile of TGF-betas after repeated TNBS doses was characteristic of CD: TGF-beta 2, but not TGF-beta 3, was up-regulated. Overexpression of MMP9 and down-regulation of TIMP1 were demonstrated. The progressive increase in the amount of…

  5. Features of Knowledge Building in Biology: Understanding Undergraduate Students' Ideas about Molecular Mechanisms

    ERIC Educational Resources Information Center

    Southard, Katelyn; Wince, Tyler; Meddleton, Shanice; Bolger, Molly S.

    2016-01-01

    Research has suggested that teaching and learning in molecular and cellular biology (MCB) is difficult. We used a new lens to understand undergraduate reasoning about molecular mechanisms: the knowledge-integration approach to conceptual change. Knowledge integration is the dynamic process by which learners acquire new ideas, develop connections…

  6. Machine learning for molecular scattering dynamics: Gaussian Process models for improved predictions of molecular collision observables

    NASA Astrophysics Data System (ADS)

    Krems, Roman; Cui, Jie; Li, Zhiying

    2016-05-01

    We show how statistical learning techniques based on kriging (Gaussian Process regression) can be used for improving the predictions of classical and/or quantum scattering theory. In particular, we show how Gaussian Process models can be used for: (i) efficient non-parametric fitting of multi-dimensional potential energy surfaces without the need to fit ab initio data with analytical functions; (ii) obtaining scattering observables as functions of individual PES parameters; (iii) using classical trajectories to interpolate quantum results; (iv) extrapolation of scattering observables from one molecule to another; (v) obtaining scattering observables with error bars reflecting the inherent inaccuracy of the underlying potential energy surfaces. We argue that the application of Gaussian Process models to quantum scattering calculations may potentially elevate the theoretical predictions to the same level of certainty as the experimental measurements and can be used to identify the role of individual atoms in determining the outcome of collisions of complex molecules. We will show examples and discuss the applications of Gaussian Process models to improving the predictions of scattering theory relevant for the cold molecules research field. Work supported by NSERC of Canada.

  7. Collision cross section prediction of deprotonated phenolics in a travelling-wave ion mobility spectrometer using molecular descriptors and chemometrics.

    PubMed

    Gonzales, Gerard Bryan; Smagghe, Guy; Coelus, Sofie; Adriaenssens, Dieter; De Winter, Karel; Desmet, Tom; Raes, Katleen; Van Camp, John

    2016-06-14

    The combination of ion mobility and mass spectrometry (MS) affords significant improvements over conventional MS/MS, especially in the characterization of isomeric metabolites due to the differences in their collision cross sections (CCS). Experimentally obtained CCS values are typically matched with theoretical CCS values from Trajectory Method (TM) and/or Projection Approximation (PA) calculations. In this paper, predictive models for CCS of deprotonated phenolics were developed using molecular descriptors and chemometric tools, stepwise multiple linear regression (SMLR), principal components regression (PCR), and partial least squares regression (PLS). A total of 102 molecular descriptors were generated and reduced to 28 after employing a feature selection tool, composed of mass, topological descriptors, Jurs descriptors and shadow indices. Therefore, the generated models considered the effects of mass, 3D conformation and partial charge distribution on CCS, which are the main parameters for either TM or PA (only 3D conformation) calculations. All three techniques yielded highly predictive models for both the training (R(2)SMLR = 0.9911; R(2)PCR = 0.9917; R(2)PLS = 0.9918) and validation datasets (R(2)SMLR = 0.9489; R(2)PCR = 0.9761; R(2)PLS = 0.9760). Also, the high cross validated R(2) values indicate that the generated models are robust and highly predictive (Q(2)SMLR = 0.9859; Q(2)PCR = 0.9748; Q(2)PLS = 0.9760). The predictions were also very comparable to the results from TM calculations using modified mobcal (N2). Most importantly, this method offered a rapid (<10 min) alternative to TM calculations without compromising predictive ability. These methods could therefore be used in routine analysis and could be easily integrated to metabolite identification platforms. PMID:27181646

  8. A highly accurate protein structural class prediction approach using auto cross covariance transformation and recursive feature elimination.

    PubMed

    Li, Xiaowei; Liu, Taigang; Tao, Peiying; Wang, Chunhua; Chen, Lanming

    2015-12-01

    Structural class characterizes the overall folding type of a protein or its domain. Many methods have been proposed to improve the prediction accuracy of protein structural class in recent years, but it is still a challenge for the low-similarity sequences. In this study, we introduce a feature extraction technique based on auto cross covariance (ACC) transformation of position-specific score matrix (PSSM) to represent a protein sequence. Then support vector machine-recursive feature elimination (SVM-RFE) is adopted to select top K features according to their importance and these features are input to a support vector machine (SVM) to conduct the prediction. Performance evaluation of the proposed method is performed using the jackknife test on three low-similarity datasets, i.e., D640, 1189 and 25PDB. By means of this method, the overall accuracies of 97.2%, 96.2%, and 93.3% are achieved on these three datasets, which are higher than those of most existing methods. This suggests that the proposed method could serve as a very cost-effective tool for predicting protein structural class especially for low-similarity datasets. PMID:26460680
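
    The auto cross covariance (ACC) transformation mentioned above can be sketched as follows. The abstract does not give the formula, so this sketch assumes the commonly used definition: for each lag, the covariance between a PSSM column and the same column (auto covariance) or a different column (cross covariance) shifted by that lag along the sequence. The function name, the toy matrix, and the choice of lag are illustrative assumptions, not the authors' implementation.

```python
def auto_cross_covariance(pssm, max_lag):
    """Turn an L x C score matrix into a fixed-length feature vector.

    pssm    : list of L rows (one per residue), each with C scores
    max_lag : largest sequence separation considered (must be < L)
    Returns max_lag * C * C values; entries with c1 == c2 are auto
    covariances, the rest are cross covariances (assumed definition).
    """
    length, n_cols = len(pssm), len(pssm[0])
    # Column means over the whole sequence
    means = [sum(row[c] for row in pssm) / length for c in range(n_cols)]
    features = []
    for lag in range(1, max_lag + 1):
        for c1 in range(n_cols):
            for c2 in range(n_cols):
                cov = sum(
                    (pssm[j][c1] - means[c1]) * (pssm[j + lag][c2] - means[c2])
                    for j in range(length - lag)
                ) / (length - lag)
                features.append(cov)
    return features


# Toy 4-residue, 2-column matrix (a real PSSM has 20 columns)
toy_pssm = [[1, 0], [2, 0], [3, 0], [4, 0]]
features = auto_cross_covariance(toy_pssm, max_lag=1)
```

    Because the output length depends only on the number of columns and the maximum lag, sequences of different lengths map to vectors of the same size, which is what a downstream classifier such as an SVM needs.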

  9. Assessment of Genetic and Molecular Approaches for the Prediction of Wheat Quality

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Assessment of genetic and molecular approaches for the prediction of wheat quality. R.A. Graybosch, USDA-ARS, Lincoln, NE, U.S.A. Over the past four decades, the field of plant breeding and genetics has been revolutionized by technological advances in the areas of DNA manipulation and evaluation. Fo...

  10. PREDICTION OF MOLECULAR PROPERTIES WITH MID-INFRARED SPECTRA AND INTERFEROGRAMS

    EPA Science Inventory

    We have built infrared spectroscopy-based partial least squares (PLS) models for molecular polarizabilities using a 97 member training set and a 59 member independent prediction set. These 156 compounds span a very wide range of chemical structure. Our goal was to use this well...

  11. Molecular modeling as a predictive tool for the development of solid dispersions.

    PubMed

    Maniruzzaman, Mohammed; Pang, Jiayun; Morgan, David J; Douroumis, Dennis

    2015-04-01

    In this study molecular modeling is introduced as a novel approach for the development of pharmaceutical solid dispersions. A computational model based on quantum mechanical (QM) calculations was used to predict the miscibility of various drugs in various polymers by predicting the binding strength between the drug and dimeric form of the polymer. The drug/polymer miscibility was also estimated by using traditional approaches such as Van Krevelen/Hoftyzer and Bagley solubility parameters or Flory-Huggins interaction parameter in comparison to the molecular modeling approach. The molecular modeling studies predicted successfully the drug-polymer binding energies and the preferable site of interaction between the functional groups. The drug-polymer miscibility and the physical state of bulk materials, physical mixtures, and solid dispersions were determined by thermal analysis (DSC/MTDSC) and X-ray diffraction. The produced solid dispersions were analyzed by X-ray photoelectron spectroscopy (XPS), which confirmed not only the exact type of the intermolecular interactions between the drug-polymer functional groups but also the binding strength by estimating the N coefficient values. The findings demonstrate that QM-based molecular modeling is a powerful tool to predict the strength and type of intermolecular interactions in a range of drug/polymeric systems for the development of solid dispersions. PMID:25734898

  12. Using the SMOTE technique and hybrid features to predict the types of ion channel-targeted conotoxins.

    PubMed

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    2016-08-21

    Conotoxins targeting different ion channels play distinct physiological functions and therapeutic potentials in organisms. Accurate identification of types of ion channel-targeted conotoxins will provide significant clues to reveal the physiological mechanism and pharmacological therapeutic potential of conotoxins. In this study, a random forest based predictor called ICTCPred for the types of ion channel-targeted conotoxin prediction is proposed with hybrid features incorporating CTD (Composition, Transition, and Distribution), g-Gap DC (g-Gap Dipeptide Composition), PP (Physicochemical Properties), and SSI (Secondary Structure Information). To deal with the imbalanced benchmark dataset, the SMOTE Technique (Synthetic Minority Over-sampling Technique) is applied. Based on the above-mentioned individual feature spaces, the average accuracy of ICTCPred lies in the range of 0.729-0.886, indicating the discriminative power of these features. In addition, ICTCPred yields the highest average accuracy of 0.895 using the hybrid feature space of CTD, g-Gap DC, PP and SSI. The Relief-IFS (Incremental Feature Selection) method is adopted to further improve the prediction performance of ICTCPred. Based on the training dataset, ICTCPred achieves satisfactory performance with an average accuracy of 0.910. To evaluate the prediction performance objectively, ICTCPred is compared with previous studies on the same independent testing dataset. Encouragingly, our proposed method performs better than previous studies to identify types of ion channel-targeted conotoxins, with the highest sensitivity of 0.919 for Na(+)-targeted conotoxins, the highest sensitivity of 1 for K(+)-targeted conotoxins, and the highest sensitivity of 1 for Ca(2+)-targeted conotoxins. It is anticipated that ICTCPred can be a potential candidate for the ion channel-targeted conotoxin prediction. PMID:27142776
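
    The oversampling step named above (SMOTE) can be illustrated with a minimal sketch: each synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbours. The function name, parameters, and toy data below are hypothetical and are not taken from ICTCPred.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Three hypothetical minority-class feature vectors
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote(minority, n_new=5)
```

    Every synthetic point lies between two existing minority samples, so the minority class is enlarged without simply duplicating records.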

  13. KFC2: a knowledge-based hot spot prediction method based on interface solvation, atomic density, and plasticity features.

    PubMed

    Zhu, Xiaolei; Mitchell, Julie C

    2011-09-01

    Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods. PMID:21735484
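
    The true positive rate and false positive rate quoted above follow the usual definitions, TPR = TP / (TP + FN) and FPR = FP / (FP + TN). A small sketch with made-up labels (1 = hot spot, 0 = non-hot spot; the data are illustrative only and unrelated to the KFC2 test set):

```python
def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical labels: two hot spots, four non-hot spots
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
tpr, fpr = tpr_fpr(y_true, y_pred)  # 0.5 and 0.25
```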

  14. Predicting the Pro-Longevity or Anti-Longevity Effect of Model Organism Genes with New Hierarchical Feature Selection Methods.

    PubMed

    Wan, Cen; Freitas, Alex A; de Magalhães, João Pedro

    2015-01-01

    Ageing is a highly complex biological process that is still poorly understood. With the growing amount of ageing-related data available on the web, in particular concerning the genetics of ageing, it is timely to apply data mining methods to that data, in order to try to discover novel patterns that may assist ageing research. In this work, we introduce new hierarchical feature selection methods for the classification task of data mining and apply them to ageing-related data from four model organisms: Caenorhabditis elegans (worm), Saccharomyces cerevisiae (yeast), Drosophila melanogaster (fly), and Mus musculus (mouse). The main novel aspect of the proposed feature selection methods is that they exploit hierarchical relationships in the set of features (Gene Ontology terms) in order to improve the predictive accuracy of the Naïve Bayes and 1-Nearest Neighbour (1-NN) classifiers, which are used to classify model organisms' genes into pro-longevity or anti-longevity genes. The results show that our hierarchical feature selection methods, when used together with Naïve Bayes and 1-NN classifiers, obtain higher predictive accuracy than the standard (without feature selection) Naïve Bayes and 1-NN classifiers, respectively. We also discuss the biological relevance of a number of Gene Ontology terms very frequently selected by our algorithms in our datasets. PMID:26357215

  15. Investigation on the isoform selectivity of novel kinesin-like protein 1 (KIF11) inhibitor using chemical feature based pharmacophore, molecular docking, and quantum mechanical studies.

    PubMed

    Karunagaran, Subramanian; Subhashchandrabose, Subramaniyan; Lee, Keun Woo; Meganathan, Chandrasekaran

    2016-04-01

    Kinesin-like protein (KIF11) is a molecular motor protein that is essential in mitosis. Removal of KIF11 prevents centrosome migration and causes cell arrest in mitosis. KIF11 defects are linked to the disease of microcephaly, lymph edema or mental retardation. The human KIF11 protein has been actively studied for its role in mitosis and its potential as a therapeutic target for cancer treatment. Pharmacophore modeling, molecular docking and density functional theory approaches were employed to reveal the structural, chemical and electronic features essential for the development of small molecule inhibitors of KIF11. Hence we have developed chemical feature based pharmacophore models using Discovery Studio v 2.5 (DS). The best hypothesis (Hypo1), consisting of four chemical features (two hydrogen bond acceptors, one hydrophobic and one ring aromatic), exhibited a high correlation coefficient of 0.9521, a cost difference of 70.63 and a low RMS value of 0.9475. Hypo1 was cross-validated by the Cat Scramble method, a test set and a decoy set to prove its robustness, statistical significance and predictability, respectively. The well validated Hypo1 was used as a 3D query to perform virtual screening. The hits obtained from the virtual screening were subjected to various scrupulous drug-like filters such as Lipinski's rule of five and ADMET properties. Finally, six hit compounds were identified based on the molecular interaction and its electronic properties. Our final lead compound could serve as a powerful tool for the discovery of potent inhibitor as KIF11 agonists. PMID:26815769

  16. Predicting Subcellular Localization of Apoptosis Proteins Combining GO Features of Homologous Proteins and Distance Weighted KNN Classifier

    PubMed Central

    Wang, Xiao; Li, Hui; Zhang, Qiuwen; Wang, Rong

    2016-01-01

    Apoptosis proteins play a key role in maintaining the stability of organism; the functions of apoptosis proteins are related to their subcellular locations which are used to understand the mechanism of programmed cell death. In this paper, we utilize GO annotation information of apoptosis proteins and their homologous proteins retrieved from GOA database to formulate feature vectors and then combine the distance weighted KNN classification algorithm with them to solve the data imbalance problem existing in CL317 data set to predict subcellular locations of apoptosis proteins. It is found that the number of homologous proteins can affect the overall prediction accuracy. Under the optimal number of homologous proteins, the overall prediction accuracy of our method on CL317 data set reaches 96.8% by Jackknife test. Compared with other existing methods, it shows that our proposed method is very effective and better than others for predicting subcellular localization of apoptosis proteins. PMID:27213149
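
    To show why a distance-weighted KNN classifier can help with imbalanced classes, here is a minimal sketch in which each of the k nearest neighbours votes with weight 1/distance, so a single close neighbour from a rare class can outvote several distant neighbours from a common class. The feature vectors, class labels, and the 1/distance weighting are illustrative assumptions; the abstract does not specify the weighting scheme or the GO-based feature encoding.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """train: list of (feature_vector, label); vote weight is 1/distance."""
    dists = sorted((math.dist(x, query), label) for x, label in train)
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + eps)  # eps avoids division by zero
    return max(votes, key=votes.get)


# Hypothetical 2-D features: one nearby "mito" sample, two distant "cyto" samples
train = [((0.0, 0.0), "mito"), ((3.0, 0.0), "cyto"), ((3.0, 1.0), "cyto")]
prediction = weighted_knn_predict(train, (0.5, 0.0), k=3)
```

    With plain majority voting and k=3 the query would be labelled "cyto"; with distance weighting the much closer "mito" sample dominates.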

  17. Using Matched Molecular Series as a Predictive Tool To Optimize Biological Activity

    PubMed Central

    2014-01-01

    A matched molecular series is the general form of a matched molecular pair and refers to a set of two or more molecules with the same scaffold but different R groups at the same position. We describe Matsy, a knowledge-based method that uses matched series to predict R groups likely to improve activity given an observed activity order for some R groups. We compare the Matsy predictions based on activity data from ChEMBLdb to the recommendations of the Topliss tree and carry out a large scale retrospective test to measure performance. We show that the basis for predictive success is preferred orders in matched series and that this preference is stronger for longer series. The Matsy algorithm allows medicinal chemists to integrate activity trends from diverse medicinal chemistry programs and apply them to problems of interest as a Topliss-like recommendation or as a hypothesis generator to aid compound design. PMID:24601597

  18. Comparison of Algorithms for Prediction of Protein Structural Features from Evolutionary Data

    PubMed Central

    Bywater, Robert P.

    2016-01-01

    Proteins have many functions and predicting these is still one of the major challenges in theoretical biophysics and bioinformatics. Foremost amongst these functions is the need to fold correctly thereby allowing the other genetically dictated tasks that the protein has to carry out to proceed efficiently. In this work, some earlier algorithms for predicting protein domain folds are revisited and they are compared with more recently developed methods. In dealing with intractable problems such as fold prediction, when different algorithms show convergence onto the same result there is every reason to take all algorithms into account such that a consensus result can be arrived at. In this work it is shown that the application of different algorithms in protein structure prediction leads to results that do not converge as such but rather they collude in a striking and useful way that has never been considered before. PMID:26963911

  19. Prediction of clathrate structure type and guest position by molecular mechanics.

    PubMed

    Fleischer, Everly B; Janda, Kenneth C

    2013-05-16

    The clathrate hydrates occur in various types in which the number, size, and shape of the various cages differ. Usually the clathrate type of a specific guest is predicted by the size and shape of the molecular guest. We have developed a methodology to determine the clathrate type employing molecular mechanics with the MMFF force field employing a strategy to calculate the energy of formation of the clathrate from the sum of the guest/cage energies. The clathrate type with the most negative (most stable) energy of formation would be the type predicted (we mainly focused on type I, type II, or bromine type). This strategy allows for a calculation to predict the clathrate type for any cage guest in a few minutes on a laptop computer. It proved successful in predicting the clathrate structure for 46 out of 47 guest molecules. The molecular mechanics calculations also provide a prediction of the guest position within the cage and clathrate structure. These predictions are generally consistent with the X-ray and neutron diffraction studies. By supplementing the diffraction study with molecular mechanics, we gain a more detailed insight regarding the details of the structure. We have also compared MM calculations to studies of the multiple occupancy of the cages. Finally, we present a density functional calculation that demonstrates that the inside of the clathrates cages have a relatively uniform and low electrostatic potential in comparison with the outside oxygen and hydrogen atoms. This implies that van der Waals forces will usually be dominant in the guest-cage interactions. PMID:23600658

  20. Downstream Antisense Transcription Predicts Genomic Features That Define the Specific Chromatin Environment at Mammalian Promoters

    PubMed Central

    Lavender, Christopher A.; Hoffman, Jackson A.; Trotter, Kevin W.; Gilchrist, Daniel A.; Bennett, Brian D.; Burkholder, Adam B.; Fargo, David C.; Archer, Trevor K.

    2016-01-01

    Antisense transcription is a prevalent feature at mammalian promoters. Previous studies have primarily focused on antisense transcription initiating upstream of genes. Here, we characterize promoter-proximal antisense transcription downstream of gene transcription starts sites in human breast cancer cells, investigating the genomic context of downstream antisense transcription. We find extensive correlations between antisense transcription and features associated with the chromatin environment at gene promoters. Antisense transcription downstream of promoters is widespread, with antisense transcription initiation observed within 2 kb of 28% of gene transcription start sites. Antisense transcription initiates between nucleosomes regularly positioned downstream of these promoters. The nucleosomes between gene and downstream antisense transcription start sites carry histone modifications associated with active promoters, such as H3K4me3 and H3K27ac. This region is bound by chromatin remodeling and histone modifying complexes including SWI/SNF subunits and HDACs, suggesting that antisense transcription or resulting RNA transcripts contribute to the creation and maintenance of a promoter-associated chromatin environment. Downstream antisense transcription overlays additional regulatory features, such as transcription factor binding, DNA accessibility, and the downstream edge of promoter-associated CpG islands. These features suggest an important role for antisense transcription in the regulation of gene expression and the maintenance of a promoter-associated chromatin environment. PMID:27487356

  1. Downstream Antisense Transcription Predicts Genomic Features That Define the Specific Chromatin Environment at Mammalian Promoters.

    PubMed

    Lavender, Christopher A; Cannady, Kimberly R; Hoffman, Jackson A; Trotter, Kevin W; Gilchrist, Daniel A; Bennett, Brian D; Burkholder, Adam B; Burd, Craig J; Fargo, David C; Archer, Trevor K

    2016-08-01

    Antisense transcription is a prevalent feature at mammalian promoters. Previous studies have primarily focused on antisense transcription initiating upstream of genes. Here, we characterize promoter-proximal antisense transcription downstream of gene transcription starts sites in human breast cancer cells, investigating the genomic context of downstream antisense transcription. We find extensive correlations between antisense transcription and features associated with the chromatin environment at gene promoters. Antisense transcription downstream of promoters is widespread, with antisense transcription initiation observed within 2 kb of 28% of gene transcription start sites. Antisense transcription initiates between nucleosomes regularly positioned downstream of these promoters. The nucleosomes between gene and downstream antisense transcription start sites carry histone modifications associated with active promoters, such as H3K4me3 and H3K27ac. This region is bound by chromatin remodeling and histone modifying complexes including SWI/SNF subunits and HDACs, suggesting that antisense transcription or resulting RNA transcripts contribute to the creation and maintenance of a promoter-associated chromatin environment. Downstream antisense transcription overlays additional regulatory features, such as transcription factor binding, DNA accessibility, and the downstream edge of promoter-associated CpG islands. These features suggest an important role for antisense transcription in the regulation of gene expression and the maintenance of a promoter-associated chromatin environment. PMID:27487356

  2. Conformational features of an actuator containing calix[4]arene and thiophene: a molecular dynamics study.

    PubMed

    Zanuy, David; Casanovas, Jordi; Aleman, Carlos

    2006-05-25

    Molecular dynamics simulations have been performed for poly(calix[4]arene bis(bithiophene)) in dichloromethane solution. This material responds to its electronic structure variations with significant conformational changes, producing contraction-expansion movements. Simulations have been performed for the three states of this molecular actuator (reduced, oxidized-nondeprotonated, and oxidized-deprotonated), a specific force-field being developed for each case. Results, which are fully consistent with previous ab initio quantum mechanical calculations on an isolated actuating unit, have revealed important findings about the dynamics of the system. Analyses of the flexibility/rigidity of the molecular chain with the state, the interaction of the polymer with the solvent molecules and the influence of environmental factors (as the viscosity of solvent, the counterions and the thermal agitation) on the dynamics have provided important insights to the actuation mechanism. PMID:16706442

  3. Predicting the auto-ignition temperatures of organic compounds from molecular structure using support vector machine.

    PubMed

    Pan, Yong; Jiang, Juncheng; Wang, Rui; Cao, Hongyin; Cui, Yi

    2009-05-30

    A quantitative structure-property relationship (QSPR) study is suggested for the prediction of auto-ignition temperatures (AIT) of organic compounds. Various kinds of molecular descriptors were calculated to represent the molecular structures of compounds, such as topological, charge, and geometric descriptors. The variable selection method of genetic algorithm (GA) was employed to select the optimal subset of descriptors that contribute significantly to the overall AIT property from the large pool of calculated descriptors. The novel modeling method of support vector machine (SVM) was then employed to model the possible quantitative relationship between these selected descriptors and the AIT property. The resulting model showed high prediction ability, with an average absolute error of 28.88 degrees C and a root mean square error of 36.86 for the prediction set, which are within the range of the experimental error of AIT measurements. The proposed method can be successfully used to predict the auto-ignition temperatures of organic compounds with only nine pre-selected theoretical descriptors, which can be calculated directly from molecular structure alone. PMID:18952371

  4. A Consistency-Based Feature Selection Method Allied with Linear SVMs for HIV-1 Protease Cleavage Site Prediction

    PubMed Central

    Öztürk, Orkun; Aksaç, Alper; Elsheikh, Abdallah; Özyer, Tansel; Alhajj, Reda

    2013-01-01

    Background Predicting type-1 Human Immunodeficiency Virus (HIV-1) protease cleavage site in protein molecules and determining its specificity is an important task which has attracted considerable attention in the research community. Achievements in this area are expected to result in effective drug design (especially for HIV-1 protease inhibitors) against this life-threatening virus. However, some drawbacks (like the shortage of the available training data and the high dimensionality of the feature space) turn this task into a difficult classification problem. Thus, various machine learning techniques, and specifically several classification methods have been proposed in order to increase the accuracy of the classification model. In addition, for several classification problems, which are characterized by having few samples and many features, selecting the most relevant features is a major factor for increasing classification accuracy. Results We propose for HIV-1 data a consistency-based feature selection approach in conjunction with recursive feature elimination of support vector machines (SVMs). We used various classifiers for evaluating the results obtained from the feature selection process. We further demonstrated the effectiveness of our proposed method by comparing it with a state-of-the-art feature selection method applied on HIV-1 data, and we evaluated the reported results based on attributes which have been selected from different combinations. Conclusion Applying feature selection on training data before realizing the classification task seems to be a reasonable data-mining process when working with types of data similar to HIV-1. On HIV-1 data, some feature selection or extraction operations in conjunction with different classifiers have been tested and noteworthy outcomes have been reported. These facts motivate for the work presented in this paper. Software availability The software is available at http://ozyer.etu.edu.tr/c-fs-svm.rar. 
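
    The consistency-based selection mentioned in this abstract can be illustrated with a small sketch. It assumes the widely used "inconsistency rate" criterion (instances that share the same values on the selected features but carry different class labels); the abstract does not say which consistency measure the authors used, and the SVM recursive feature elimination step is not shown. The function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def inconsistency_rate(rows, labels, subset):
    """Fraction of instances that disagree with the majority label of
    their group, where a group is all rows sharing the same values on
    the feature indices in `subset`. (Assumed criterion, see lead-in.)"""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[i] for i in subset)
        groups[pattern][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def smallest_consistent_subset(rows, labels):
    """Exhaustively find the first smallest subset with zero inconsistency.
    Only feasible for a handful of features; real data needs a heuristic search."""
    n_features = len(rows[0])
    for size in range(1, n_features + 1):
        for subset in combinations(range(n_features), size):
            if inconsistency_rate(rows, labels, subset) == 0.0:
                return subset
    return None  # the full feature set is itself inconsistent


# Toy categorical data (hypothetical, not real cleavage-site data):
# three positions per instance, label 1 = cleaved, 0 = not cleaved.
rows = [("A", "L", "F"), ("A", "L", "P"), ("G", "L", "F"), ("G", "V", "P")]
labels = [1, 0, 1, 0]

for subset in [(0,), (1,), (2,)]:
    print(subset, inconsistency_rate(rows, labels, subset))
print("smallest consistent subset:", smallest_consistent_subset(rows, labels))
```

    In this toy data only the third position separates the classes on its own, so it is the subset a consistency filter would keep before any classifier is trained.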

  5. A simple feature construction method for predicting upstream/downstream signal flow in human protein-protein interaction networks

    PubMed Central

    Mei, Suyu; Zhu, Hao

    2015-01-01

    Signaling pathways play important roles in understanding the underlying mechanism of cell growth, cell apoptosis, organismal development and pathways-aberrant diseases. Protein-protein interaction (PPI) networks are commonly-used infrastructure to infer signaling pathways. However, PPI networks generally carry no information of upstream/downstream relationship between interacting proteins, which retards our inferring the signal flow of signaling pathways. In this work, we propose a simple feature construction method to train a SVM (support vector machine) classifier to predict PPI upstream/downstream relations. The domain based asymmetric feature representation naturally embodies domain-domain upstream/downstream relations, providing an unconventional avenue to predict the directionality between two objects. Moreover, we propose a semantically interpretable decision function and a macro bag-level performance metric to satisfy the need of two-instance depiction of an interacting protein pair. Experimental results show that the proposed method achieves satisfactory cross validation performance and independent test performance. Lastly, we use the trained model to predict the PPIs in HPRD, Reactome and IntAct. Some predictions have been validated against recent literature. PMID:26648121

  6. Texture feature analysis for prediction of postoperative liver failure prior to surgery

    NASA Astrophysics Data System (ADS)

    Simpson, Amber L.; Do, Richard K.; Parada, E. Patricia; Miga, Michael I.; Jarnagin, William R.

    2014-03-01

    Texture analysis of preoperative CT images of the liver is undertaken in this study. Standard texture features were extracted from portal-venous phase contrast-enhanced CT scans of 36 patients prior to major hepatic resection and correlated to postoperative liver failure. Differences between patients with and without postoperative liver failure were statistically significant for contrast (measure of local variation), correlation (linear dependency of gray levels on neighboring pixels), cluster prominence (asymmetry), and normalized inverse difference moment (local homogeneity). Though texture features have been used to diagnose and characterize lesions, to our knowledge, parenchymal statistical variation has not been quantified and studied. We demonstrate that texture analysis is a valuable tool for quantifying liver function prior to surgery, which may help to identify and change the preoperative management of patients at higher risk for overall morbidity.

  7. Machine learning identification of EEG features predicting working memory performance in schizophrenia and healthy adults

    PubMed Central

    Johannesen, Jason K.; Bi, Jinbo; Jiang, Ruhua; Kenney, Joshua G.; Chen, Chi-Ming A.

    2016-01-01

    Background With millisecond-level resolution, electroencephalographic (EEG) recording provides a sensitive tool to assay neural dynamics of human cognition. However, selection of EEG features used to answer experimental questions is typically determined a priori. The utility of machine learning was investigated as a computational framework for extracting the most relevant features from EEG data empirically. Methods Schizophrenia (SZ; n = 40) and healthy community (HC; n = 12) subjects completed a Sternberg Working Memory Task (SWMT) during EEG recording. EEG was analyzed to extract 5 frequency components (theta1, theta2, alpha, beta, gamma) at 4 processing stages (baseline, encoding, retention, retrieval) and 3 scalp sites (frontal-Fz, central-Cz, occipital-Oz) separately for correctly and incorrectly answered trials. The 1-norm support vector machine (SVM) method was used to build EEG classifiers of SWMT trial accuracy (correct vs. incorrect; Model 1) and diagnosis (HC vs. SZ; Model 2). External validity of SVM models was examined in relation to neuropsychological test performance and diagnostic classification using conventional regression-based analyses. Results SWMT performance was significantly reduced in SZ (p < .001). Model 1 correctly classified trial accuracy at 84 % in HC, and at 74 % when cross-validated in SZ data. Frontal gamma at encoding and central theta at retention provided highest weightings, accounting for 76 % of variance in SWMT scores and 42 % variance in neuropsychological test performance across samples. Model 2 identified frontal theta at baseline and frontal alpha during retrieval as primary classifiers of diagnosis, providing 87 % classification accuracy as a discriminant function. Conclusions EEG features derived by SVM are consistent with literature reports of gamma’s role in memory encoding, engagement of theta during memory retention, and elevated resting low-frequency activity in schizophrenia. Tests of model performance and cross

  8. Predicting Protein-Protein Interactions from the Molecular to the Proteome Level.

    PubMed

    Keskin, Ozlem; Tuncbag, Nurcan; Gursoy, Attila

    2016-04-27

    Identification of protein-protein interactions (PPIs) is at the center of molecular biology considering the unquestionable role of proteins in cells. Combinatorial interactions result in a repertoire of multiple functions; hence, knowledge of PPI and binding regions naturally serve to functional proteomics and drug discovery. Given experimental limitations to find all interactions in a proteome, computational prediction/modeling of protein interactions is a prerequisite to proceed on the way to complete interactions at the proteome level. This review aims to provide a background on PPIs and their types. Computational methods for PPI predictions can use a variety of biological data including sequence-, evolution-, expression-, and structure-based data. Physical and statistical modeling are commonly used to integrate these data and infer PPI predictions. We review and list the state-of-the-art methods, servers, databases, and tools for protein-protein interaction prediction. PMID:27074302

  9. Ab initio NMR Confirmed Evolutionary Structure Prediction for Organic Molecular Crystals

    NASA Astrophysics Data System (ADS)

    Pham, Cong-Huy; Kucukbenli, Emine; de Gironcoli, Stefano

    2015-03-01

    Ab initio crystal structure prediction of even small organic compounds is extremely challenging due to polymorphism, molecular flexibility and difficulties in addressing the dispersion interaction from first principles. We recently implemented vdW-aware density functionals and demonstrated their success in energy ordering of amino acid crystals. In this work we combine this development with the evolutionary structure prediction method to study cholesterol polymorphs. Cholesterol crystals have paramount importance in various diseases, from cancer to atherosclerosis. The structures of some polymorphs (e.g. ChM, ChAl, ChAh) have already been resolved, while others, which display distinct NMR spectra and are involved in disease formation, are yet to be determined. Here we thoroughly assess the applicability of evolutionary structure prediction to address such real-world problems. We validate the newly predicted structures with ab initio NMR chemical shift data using secondary referencing for an improved comparison with experiments.

  10. Measuring the successes and deficiencies of constant pH molecular dynamics: a blind prediction study.

    PubMed

    Williams, Sarah L; Blachly, Patrick G; McCammon, J Andrew

    2011-12-01

    A constant pH molecular dynamics method has been used in the blind prediction of pK(a) values of titratable residues in wild type and mutated structures of the Staphylococcal nuclease (SNase) protein. The predicted values have been subsequently compared to experimental values provided by the laboratory of García-Moreno. CpHMD performs well in predicting the pK(a) of solvent-exposed residues. For residues in the protein interior, the CpHMD method encounters some difficulties in reaching convergence and predicting the pK(a) values for residues having strong interactions with neighboring residues. These results show the need to accurately and sufficiently sample conformational space in order to obtain pK(a) values consistent with experimental results. PMID:22072520

  11. Dysembryoplastic Neuroepithelial Tumor of the Septum Pellucidum and the Supratentorial Midline: Histopathologic, Neuroradiologic, and Molecular Features of 7 Cases.

    PubMed

    Gessi, Marco; Hattingen, Elke; Dörner, Evelyn; Goschzik, Tobias; Dreschmann, Verena; Waha, Andreas; Pietsch, Torsten

    2016-06-01

    Dysembryoplastic neuroepithelial tumors (DNTs) are one of the most common epilepsy-associated low-grade glioneuronal tumors of the central nervous system. Although most DNTs occur in the cerebral cortex, DNT-like tumors with unusual intraventricular or periventricular localizations have been reported. Most of them involve the septum pellucidum and the foramen of Monro. In this study, we have described the neuroradiologic, histopathologic, and molecular features of 7 cases (4 female and 3 male; patient age range, 3 to 34 y; mean age, 16.7 y). The tumors, all localized near the supratentorial midline structures in proximity to the foramen of Monro and septum pellucidum, appeared in magnetic resonance imaging as well-delimited cystic lesions with cerebrospinal fluid-like signal on T1-weighted and T2-weighted images, some of them with typical fluid-attenuated inversion recovery ring sign. Histologically, they shared features with classic cortical DNTs but did not display aspects of multinodularity. From a molecular point of view the cases investigated did not show KIAA1549-BRAF fusions or FGFR1 mutations, alterations otherwise observed in pilocytic astrocytomas, or MYB and MYBL1 alterations that have been identified in a large group of pediatric low-grade gliomas. Moreover, BRAF mutations, which so far represent the most common molecular alteration found in cortical DNTs, were absent in this group of rare periventricular tumors. PMID:26796505

  12. Borderline Personality Features in Students: the Predicting Role of Schema, Emotion Regulation, Dissociative Experience and Suicidal Ideation

    PubMed Central

    Sajadi, Seyede Fateme; Arshadi, Nasrin; Zargar, Yadolla; Mehrabizade Honarmand, Mahnaz; Hajjari, Zahra

    2015-01-01

    Background: Numerous studies suggest that early maladaptive schemas and emotional dysregulation form the defining core of borderline personality disorder. Many studies have also found a strong association between the diagnosis of borderline personality and the occurrence of suicide ideation and dissociative symptoms. Objectives: The present study was designed to investigate the relationship between borderline personality features and schema, emotion regulation, dissociative experiences and suicidal ideation among high school students in Shiraz City, Iran. Patients and Methods: In this descriptive correlational study, 300 students (150 boys and 150 girls) were selected from the high schools in Shiraz, Iran, using multi-stage random sampling. Data were collected using instruments including the Borderline Personality Features Scale for Children, Young Schema Questionnaire-Short Form, Difficulties in Emotion Regulation Scale (DERS), Dissociative Experiences Scale and Beck Suicide Ideation Scale. Data were analyzed using the Pearson correlation coefficient and multivariate regression analysis. Results: The results showed a significant positive correlation between schema, emotion regulation, dissociative experiences and suicide ideation with borderline personality features. Moreover, the results of multivariate regression analysis suggested that among the studied variables, schema was the most effective predicting variable of borderline features (P < 0.001). Conclusions: The findings of this study are in accordance with findings from previous studies, and generally show a meaningful association between schema, emotion regulation, dissociative experiences, and suicide ideation with borderline personality features. PMID:26401490

  13. Reactive oxygen species–associated molecular signature predicts survival in patients with sepsis

    PubMed Central

    Zhou, Tong; Wang, Ting; Slepian, Marvin J.; Garcia, Joe G. N.; Hecker, Louise

    2016-01-01

    Abstract Sepsis-related multiple organ dysfunction syndrome is a leading cause of death in intensive care units. There is overwhelming evidence that oxidative stress plays a significant role in the pathogenesis of sepsis-associated multiple organ failure; however, reactive oxygen species (ROS)–associated biomarkers and/or diagnostics that define mortality or predict survival in sepsis are lacking. Lung or peripheral blood gene expression analysis has gained increasing recognition as a potential prognostic and/or diagnostic tool. The objective of this study was to identify ROS-associated biomarkers predictive of survival in patients with sepsis. In-silico analyses of expression profiles allowed the identification of a 21-gene ROS-associated molecular signature that predicts survival in sepsis patients. Importantly, this signature performed well in a validation cohort consisting of sepsis patients aggregated from distinct patient populations recruited from different sites. Our signature outperforms randomly generated signatures of the same signature gene size. Our findings further validate the critical role of ROSs in the pathogenesis of sepsis and provide a novel gene signature that predicts survival in sepsis patients. These results also highlight the utility of peripheral blood molecular signatures as biomarkers for predicting mortality risk in patients with sepsis, which could facilitate the development of personalized therapies. PMID:27252846

  14. WeGET: predicting new genes for molecular systems by weighted co-expression.

    PubMed

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928
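
    The weighting idea described in this abstract can be sketched as follows. This is an illustration only, under stated assumptions: a dataset's weight is taken to be the mean pairwise Pearson correlation among the known genes (floored at zero), and a candidate's score is the weight-averaged mean correlation with those genes. WeGET's actual weighting formula, its rank-order statistic and its p-values are not reproduced here, and all function names and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def dataset_weight(dataset, known):
    """Assumed weight: mean pairwise correlation among known genes, floored at 0."""
    present = [g for g in known if g in dataset]
    pairs = list(combinations(present, 2))
    if not pairs:
        return 0.0
    mean_r = sum(pearson(dataset[a], dataset[b]) for a, b in pairs) / len(pairs)
    return max(mean_r, 0.0)


def weighted_coexpression(datasets, known, candidate):
    """Weight-averaged mean correlation of `candidate` with the known genes."""
    num = den = 0.0
    for ds in datasets:
        present = [g for g in known if g in ds]
        w = dataset_weight(ds, known)
        if candidate not in ds or not present or w == 0.0:
            continue  # dataset is uninformative for this system
        r = sum(pearson(ds[candidate], ds[g]) for g in present) / len(present)
        num += w * r
        den += w
    return num / den if den else 0.0


def rank_candidates(datasets, known, candidates):
    """Candidates sorted from most to least co-expressed with the system."""
    return sorted(candidates,
                  key=lambda c: weighted_coexpression(datasets, known, c),
                  reverse=True)


# Toy expression data (hypothetical): gene -> expression across four samples.
known = ["g1", "g2"]
dataset_a = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],   # known genes move together
             "c1": [1, 2, 3, 5], "c2": [4, 3, 2, 1]}
dataset_b = {"g1": [1, 2, 3, 4], "g2": [4, 1, 3, 2],   # known genes do not agree
             "c1": [3, 1, 4, 1], "c2": [1, 2, 3, 4]}
datasets = [dataset_a, dataset_b]

print(rank_candidates(datasets, known, ["c2", "c1"]))
```

    In the toy data the second dataset receives zero weight because the known genes are not co-regulated in it, so only the first dataset contributes to the candidate scores.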

  15. WeGET: predicting new genes for molecular systems by weighted co-expression

    PubMed Central

    Szklarczyk, Radek; Megchelenbrink, Wout; Cizek, Pavel; Ledent, Marie; Velemans, Gonny; Szklarczyk, Damian; Huynen, Martijn A.

    2016-01-01

    We have developed the Weighted Gene Expression Tool and database (WeGET, http://weget.cmbi.umcn.nl) for the prediction of new genes of a molecular system by correlated gene expression. WeGET utilizes a compendium of 465 human and 560 murine gene expression datasets that have been collected from multiple tissues under a wide range of experimental conditions. It exploits this abundance of expression data by assigning a high weight to datasets in which the known genes of a molecular system are harmoniously up- and down-regulated. WeGET ranks new candidate genes by calculating their weighted co-expression with that system. A weighted rank is calculated for human genes and their mouse orthologs. Then, an integrated gene rank and p-value is computed using a rank-order statistic. We applied our method to predict novel genes that have a high degree of co-expression with Gene Ontology terms and pathways from KEGG and Reactome. For each query set we provide a list of predicted novel genes, computed weights for transcription datasets used and cell and tissue types that contributed to the final predictions. The performance for each query set is assessed by 10-fold cross-validation. Finally, users can use the WeGET to predict novel genes that co-express with a custom query set. PMID:26582928

  16. Activity Prediction and Molecular Mechanism of Bovine Blood Derived Angiotensin I-Converting Enzyme Inhibitory Peptides

    PubMed Central

    Zhang, Ting; Nie, Shaoping; Liu, Boqun; Yu, Yiding; Zhang, Yan; Liu, Jingbo

    2015-01-01

    Development of angiotensin I-converting enzyme (ACE, EC 3.4.15.1) inhibitory peptides from food protein is under extensive research as an alternative for the prevention of hypertension. However, it is difficult to identify peptides released from food sources. To accelerate the progress of peptide identification, a three-layer back-propagation neural network model was established to predict the ACE-inhibitory activity of pentapeptides derived from bovine hemoglobin by simulated enzyme digestion. The pentapeptide WTQRF has the best predicted value, with an experimental IC50 of 23.93 μM. The potential molecular mechanism of the WTQRF / ACE interaction was investigated by flexible docking. PMID:25768442

  17. Individualized treatment of gastric cancer: Impact of molecular biology and pathohistological features

    PubMed Central

    Dittmar, Yves; Settmacher, Utz

    2015-01-01

    Gastric cancer is one of the most common malignancies worldwide. The overall prognosis has remained poor over the last decades even though improvements in surgical outcomes have been achieved. A better understanding of the molecular biology of gastric cancer and detection of eligible molecular targets might be of central interest to further improve clinical outcome. With this intention, first steps have been made in the research of growth factor signaling. Regarding morphogens, cell cycle and nuclear factor-κB signaling, a remarkable number of target-specific agents have been developed; nevertheless, the transfer into clinical routine is still at the beginning. Epigenetic targets and the further evaluation of microRNA signaling seem to have potential for the development of novel treatment strategies in the future. PMID:26600929

  18. Molecular features distinguish ten neuronal types in the mouse superficial superior colliculus.

    PubMed

    Byun, Haewon; Kwon, Soohyun; Ahn, Hee-Jeong; Liu, Hong; Forrest, Douglas; Demb, Jonathan B; Kim, In-Jung

    2016-08-01

    The superior colliculus (SC) is a midbrain center involved in controlling head and eye movements in response to inputs from multiple sensory modalities. Visual inputs arise from both the retina and visual cortex and converge onto the superficial layer of the SC (sSC). Neurons in the sSC send information to deeper layers of the SC and to thalamic nuclei that modulate visually guided behaviors. Presently, our understanding of sSC neurons is impeded by a lack of molecular markers that define specific cell types. To better understand the identity and organization of sSC neurons, we took a systematic approach to investigate gene expression within four molecular families: transcription factors, cell adhesion molecules, neuropeptides, and calcium binding proteins. Our analysis revealed 12 molecules with distinct expression patterns in mouse sSC: cadherin 7, contactin 3, netrin G2, cadherin 6, protocadherin 20, retinoid-related orphan receptor β, brain-specific homeobox/POU domain protein 3b, Ets variant gene 1, substance P, somatostatin, vasoactive intestinal polypeptide, and parvalbumin. Double labeling experiments, by either in situ hybridization or immunostaining, demonstrated that the 12 molecular markers collectively define 10 different sSC neuronal types. The characteristic positions of these cell types divide the sSC into four distinct layers. The 12 markers identified here will serve as valuable tools to examine molecular mechanisms that regulate development of sSC neuronal types. These markers could also be used to examine the connections between specific cell types that form retinocollicular, corticocollicular, or colliculothalamic pathways. J. Comp. Neurol. 524:2300-2321, 2016. © 2016 Wiley Periodicals, Inc. PMID:26713509

  19. Clinical and Laboratory Features of the Nocardia spp. Based on Current Molecular Taxonomy

    PubMed Central

    Brown-Elliott, Barbara A.; Brown, June M.; Conville, Patricia S.; Wallace, Richard J.

    2006-01-01

    The recent explosion of newly described species of Nocardia results from the impact in the last decade of newer molecular technology, including PCR restriction enzyme analysis and 16S rRNA sequencing. These molecular techniques have revolutionized the identification of the nocardiae by providing rapid and accurate identification of recognized nocardiae and, at the same time, revealing new species and a number of yet-to-be-described species. There are currently more than 30 species of nocardiae of human clinical significance, with the majority of isolates being N. nova complex, N. abscessus, N. transvalensis complex, N. farcinica, N. asteroides type VI (N. cyriacigeorgica), and N. brasiliensis. These species cause a wide variety of diseases and have variable drug susceptibilities. Accurate identification often requires referral to a reference laboratory with molecular capabilities, as many newer species are genetically distinct from established species yet have few or no distinguishing phenotypic characteristics. Correct identification is important in deciding the clinical relevance of a species and in the clinical management and treatment of patients with nocardial disease. This review characterizes the currently known pathogenic species of Nocardia, including clinical disease, drug susceptibility, and methods of identification. PMID:16614249

  20. Some Dynamical Features of Molecular Fragmentation by Electrons and Swift Ions

    NASA Astrophysics Data System (ADS)

    Montenegro, E. C.; Sigaud, L.; Wolff, W.; Luna, H.; Natalia, Ferreira

    To date, the large majority of studies on molecular fragmentation by swift charged particles have been carried out using simple molecules, for which reliable Potential Energy Curves are available to interpret the measured fragmentation yields. For complex molecules the scenario is quite different and such guidance is not available, obscuring even a simple organization of the data which are currently obtained for a large variety of molecules of biological or technological interest. In this work we show that a general and relatively simple methodology can be used to obtain a broader picture of the fragmentation pattern of an arbitrary molecule. The electronic ionization or excitation cross section of a given molecular orbital, which is the first part of the fragmentation process, can be well scaled by a simple and general procedure at high projectile velocities. The fragmentation fractions arising from each molecular orbital can then be achieved by matching the calculated ionization with the measured fragmentation cross sections. Examples for Oxygen, Chlorodifluoromethane and Pyrimidine molecules are presented.

  1. Features of Knowledge Building in Biology: Understanding Undergraduate Students’ Ideas about Molecular Mechanisms

    PubMed Central

    Southard, Katelyn; Wince, Tyler; Meddleton, Shanice; Bolger, Molly S.

    2016-01-01

    Research has suggested that teaching and learning in molecular and cellular biology (MCB) is difficult. We used a new lens to understand undergraduate reasoning about molecular mechanisms: the knowledge-integration approach to conceptual change. Knowledge integration is the dynamic process by which learners acquire new ideas, develop connections between ideas, and reorganize and restructure prior knowledge. Semistructured, clinical think-aloud interviews were conducted with introductory and upper-division MCB students. Interviews included a written conceptual assessment, a concept-mapping activity, and an opportunity to explain the biomechanisms of DNA replication, transcription, and translation. Student reasoning patterns were explored through mixed-method analyses. Results suggested that students must sort mechanistic entities into appropriate mental categories that reflect the nature of MCB mechanisms and that conflation between these categories is common. We also showed how connections between molecular mechanisms and their biological roles are part of building an integrated knowledge network as students develop expertise. We observed differences in the nature of connections between ideas related to different forms of reasoning. Finally, we provide a tentative model for MCB knowledge integration and suggest its implications for undergraduate learning. PMID:26931398

  2. Neural and Molecular Features on Charcot-Marie-Tooth Disease Plasticity and Therapy

    PubMed Central

    Juárez, Paula; Palau, Francesc

    2012-01-01

    In peripheral nervous system disorders, plasticity is related to changes in axon and Schwann cell biology and in synaptic formations and connections, which could also be a focus for therapeutic research. Charcot-Marie-Tooth disease (CMT) represents a large group of inherited peripheral neuropathies that mainly involve both motor and sensory nerves and induce muscular atrophy and weakness. Genetic analysis has identified several pathways and molecular mechanisms involving myelin structure and proper nerve myelination, transcriptional regulation, protein turnover, vesicle trafficking, axonal transport and mitochondrial dynamics. These pathogenic mechanisms affect the continuous signaling and dialogue between the Schwann cell and the axon, with the final result being the loss of myelin and nerve maintenance; however, some late-onset axonal CMT neuropathies are a consequence of Schwann cell-specific changes not affecting myelin. Comprehension of the molecular pathways involved in Schwann cell-axonal interactions is likely not only to increase the understanding of nerve biology but also to identify the molecular targets and cell pathways needed to design novel therapeutic approaches, both for inherited neuropathies and for the most common peripheral neuropathies. These approaches should improve the plasticity of the synaptic connections at the neuromuscular junction and regenerate cell viability based on improving myelin and axon interaction. PMID:22745917

  3. Correlation Spectroscopy and Molecular Dynamics Simulations to Study the Structural Features of Proteins

    PubMed Central

    Varriale, Antonio; Marabotti, Anna; Mei, Giampiero; Staiano, Maria; D’Auria, Sabato

    2013-01-01

    In this work, we used a combination of fluorescence correlation spectroscopy (FCS) and molecular dynamics (MD) simulation methodologies to acquire structural information on pH-induced unfolding of the maltotriose-binding protein from Thermus thermophilus (MalE2). FCS has emerged as a powerful technique for characterizing the dynamics of molecules and is used to study molecular diffusion on timescales of microseconds and longer. Our results showed that, at constant temperature, the protein diffusion coefficient decreased from 84±4 µm²/s to 44±3 µm²/s when pH was changed from 7.0 to 4.0. An even more marked decrease of the MalE2 diffusion coefficient (31±3 µm²/s) was registered when pH was raised from 7.0 to 10.0. Given the size of MalE2 (a monomeric protein with a molecular weight of 43 kDa) and its globular native shape, the values of 44 µm²/s and 31 µm²/s could be ascribed to deformations of the protein structure, which enhance its propensity to form aggregates at extreme pH values. The obtained fluorescence correlation data, corroborated by circular dichroism, fluorescence emission and light-scattering experiments, are discussed together with the MD simulation results. PMID:23750215
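
    To give a feel for what the reported diffusion coefficients mean in terms of size, the sketch below converts them to an apparent hydrodynamic radius with the Stokes-Einstein relation, r = k_B·T / (6·π·η·D). This is an illustration, not the authors' analysis: it assumes a spherical particle in water at 25 °C (viscosity about 0.89 mPa·s), neither of which is stated in the abstract, and the function name is hypothetical.

```python
from math import pi

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def hydrodynamic_radius_nm(d_um2_per_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Apparent Stokes radius in nm for a diffusion coefficient given in µm²/s.

    Temperature and viscosity defaults are assumptions (water at 25 °C),
    not values taken from the abstract.
    """
    d_m2_per_s = d_um2_per_s * 1e-12  # µm²/s -> m²/s
    radius_m = K_B * temperature_k / (6 * pi * viscosity_pa_s * d_m2_per_s)
    return radius_m * 1e9  # m -> nm


# Diffusion coefficients reported in the abstract at pH 7.0, 4.0 and 10.0.
for ph, d in [(7.0, 84.0), (4.0, 44.0), (10.0, 31.0)]:
    print(f"pH {ph:>4}: D = {d} µm²/s -> r ≈ {hydrodynamic_radius_nm(d):.1f} nm")
```

    Because the radius is inversely proportional to D under these assumptions, the drop from 84 to 44 and 31 µm²/s corresponds to an apparent radius roughly two to three times larger, which is in line with the abstract's interpretation of structural deformation and aggregation.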

  4. Physical re-examination of parameters on a molecular collisions-based diffusion model for diffusivity prediction in polymers.

    PubMed

    Ohashi, Hidenori; Tamaki, Takanori; Yamaguchi, Takeo

    2011-12-29

    Molecular collisions, which are the microscopic origin of molecular diffusive motion, are affected by both the molecular surface area and the distance between molecules. Their product can be regarded as the free space around a penetrant molecule, defined as the "shell-like free volume", and can be taken as a characteristic of molecular collisions. On the basis of this notion, a new diffusion theory has been developed. The model can predict molecular diffusivity in polymeric systems using only well-defined single-component parameters of molecular volume, molecular surface area, free volume, and pre-exponential factors. According to the physical description of the model, the body that actually moves and the surface that collides with neighboring molecules correspond to the volume and the surface area of the penetrant molecular core. In the present study, a semiempirical quantum chemical calculation was used to calculate both of these parameters. The model and the newly developed parameters offer fairly good predictive ability. PMID:22082054

  5. A comparison of molecular dynamics and diffuse interface model predictions of Lennard-Jones fluid evaporation

    SciTech Connect

    Barbante, Paolo; Frezzotti, Aldo; Gibelli, Livio

    2014-12-09

    The unsteady evaporation of a thin planar liquid film is studied by molecular dynamics simulations of a Lennard-Jones fluid. The obtained results are compared with the predictions of a diffuse interface model in which capillary Korteweg contributions are added to hydrodynamic equations, in order to obtain a unified description of the liquid bulk, liquid-vapor interface and vapor region. Particular care has been taken in constructing a diffuse interface model matching the thermodynamic and transport properties of the Lennard-Jones fluid. The comparison of diffuse interface model and molecular dynamics results shows that, although good agreement is obtained in equilibrium conditions, remarkable deviations of diffuse interface model predictions from the reference molecular dynamics results are observed in the simulation of liquid film evaporation. It is also observed that molecular dynamics results are in good agreement with preliminary results obtained from a composite model which describes the liquid film by a standard hydrodynamic model and the vapor by the Boltzmann equation. The two mathematical models are connected by kinetic boundary conditions assuming unit evaporation coefficient.

  6. Molecular-Scale Features that Govern the Effects of O-Glycosylation on a Carbohydrate-Binding Module

    DOE PAGESBeta

    Guan, Xiaoyang; Chaffey, Patrick K.; Zeng, Chen; Greene, Eric R.; Chen, Liqun; Drake, Matthew R.; Chen, Claire; Groobman, Ari; Resch, Michael G.; Himmel, Michael E.; et al

    2015-09-21

    The protein glycosylation is a ubiquitous post-translational modification in all kingdoms of life. Despite its importance in molecular and cellular biology, the molecular-level ramifications of O-glycosylation on biomolecular structure and function remain elusive. Here, we took a small model glycoprotein and changed the glycan structure and size, amino acid residues near the glycosylation site, and glycosidic linkage while monitoring any corresponding changes to physical stability and cellulose binding affinity. The results of this study reveal the collective importance of all the studied features in controlling the most pronounced effects of O-glycosylation in this system. This study suggests the possibility of designing proteins with multiple improved properties by simultaneously varying the structures of O-glycans and amino acids local to the glycosylation site.

  7. Molecular-Scale Features that Govern the Effects of O-Glycosylation on a Carbohydrate-Binding Module

    SciTech Connect

    Guan, Xiaoyang; Chaffey, Patrick K.; Zeng, Chen; Greene, Eric R.; Chen, Liqun; Drake, Matthew R.; Chen, Claire; Groobman, Ari; Resch, Michael G.; Himmel, Michael E.; Beckham, Gregg T.; Tan, Zhongping

    2015-09-21

    Protein glycosylation is a ubiquitous post-translational modification in all kingdoms of life. Despite its importance in molecular and cellular biology, the molecular-level ramifications of O-glycosylation on biomolecular structure and function remain elusive. Here, we took a small model glycoprotein and changed the glycan structure and size, amino acid residues near the glycosylation site, and glycosidic linkage while monitoring any corresponding changes to physical stability and cellulose binding affinity. The results of this study reveal the collective importance of all the studied features in controlling the most pronounced effects of O-glycosylation in this system. This study suggests the possibility of designing proteins with multiple improved properties by simultaneously varying the structures of O-glycans and amino acids local to the glycosylation site.

  8. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    DOE PAGES Beta

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; Pronobis, Wiktor; von Lilienfeld, O. Anatole; Müller, Klaus-Robert; Tkatchenko, Alexandre

    2015-06-04

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.

  9. Machine learning predictions of molecular properties: Accurate many-body potentials and nonlocality in chemical space

    SciTech Connect

    Hansen, Katja; Biegler, Franziska; Ramakrishnan, Raghunathan; Pronobis, Wiktor; von Lilienfeld, O. Anatole; Müller, Klaus-Robert; Tkatchenko, Alexandre

    2015-06-04

    Simultaneously accurate and efficient prediction of molecular properties throughout chemical compound space is a critical ingredient toward rational compound design in chemical and pharmaceutical industries. Aiming toward this goal, we develop and apply a systematic hierarchy of efficient empirical methods to estimate atomization and total energies of molecules. These methods range from a simple sum over atoms, to addition of bond energies, to pairwise interatomic force fields, reaching to the more sophisticated machine learning approaches that are capable of describing collective interactions between many atoms or bonds. In the case of equilibrium molecular geometries, even simple pairwise force fields demonstrate prediction accuracy comparable to benchmark energies calculated using density functional theory with hybrid exchange-correlation functionals; however, accounting for the collective many-body interactions proves to be essential for approaching the “holy grail” of chemical accuracy of 1 kcal/mol for both equilibrium and out-of-equilibrium geometries. This remarkable accuracy is achieved by a vectorized representation of molecules (so-called Bag of Bonds model) that exhibits strong nonlocality in chemical space. The same representation allows us to predict accurate electronic properties of molecules, such as their polarizability and molecular frontier orbital energies.
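
    The "Bag of Bonds" representation named in this abstract can be illustrated with a minimal sketch. It assumes the commonly described construction: each atom pair contributes a Coulomb-like term Z_i * Z_j / r_ij, the terms are grouped into bags by element pair, each bag is sorted in descending order and zero-padded to a fixed length, and the bags are concatenated. The function name, the input layout, and the bag sizes are hypothetical choices for illustration, not the authors' code, and units are left to the caller.

```python
from collections import defaultdict
from itertools import combinations
from math import dist  # Python 3.8+


def bag_of_bonds(atoms, bag_sizes):
    """Build a fixed-length Bag-of-Bonds style vector (illustrative sketch).

    atoms:     list of (symbol, nuclear_charge, (x, y, z)) tuples
    bag_sizes: dict mapping a sorted element pair, e.g. ("H", "O"),
               to the fixed length of that bag (hypothetical layout)
    """
    bags = defaultdict(list)
    for (s1, z1, r1), (s2, z2, r2) in combinations(atoms, 2):
        key = tuple(sorted((s1, s2)))
        # Coulomb-like pair term Z_i * Z_j / r_ij
        bags[key].append(z1 * z2 / dist(r1, r2))

    vector = []
    for key in sorted(bag_sizes):
        entries = sorted(bags.get(key, []), reverse=True)
        if len(entries) > bag_sizes[key]:
            raise ValueError(f"bag {key} is too small for this molecule")
        # Zero-pad so every molecule yields a vector of the same length
        entries += [0.0] * (bag_sizes[key] - len(entries))
        vector.extend(entries)
    return vector


# Toy three-atom geometry (made-up coordinates, for illustration only)
toy = [("O", 8, (0.0, 0.0, 0.0)),
       ("H", 1, (1.0, 0.0, 0.0)),
       ("H", 1, (0.0, 2.0, 0.0))]
vec = bag_of_bonds(toy, {("H", "H"): 1, ("H", "O"): 2})
```

    Sorting within each bag makes the vector independent of atom ordering, and the fixed bag lengths let molecules of different sizes share one feature space for a downstream regression model.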

  10. An improved approach for predicting drug-target interaction: proteochemometrics to molecular docking.

    PubMed

    Shaikh, Naeem; Sharma, Mahesh; Garg, Prabha

    2016-02-23

    Proteochemometric (PCM) methods, which use descriptors of both the interacting species, i.e. drug and the target, are being successfully employed for the prediction of drug-target interactions (DTI). However, the unavailability of non-interacting datasets and the determination of the applicability domain (AD) of a model are the main concerns in PCM modeling. In the present study, traditional PCM modeling was improved by devising novel methodologies for reliable negative dataset generation and fingerprint based AD analysis. In addition, various types of descriptors and classifiers were evaluated for their performance. The Random Forest and Support Vector Machine models outperformed the other classifiers (accuracies >98% and >89% for 10-fold cross validation and external validation, respectively). The type of protein descriptors had negligible effect on the developed models, encouraging the use of sequence-based descriptors over the structure-based descriptors. To establish the practical utility of built models, targets were predicted for approved anticancer drugs of natural origin. The molecular recognition interactions between the predicted drug-target pairs were quantified with the help of a reverse molecular docking approach. The majority of predicted targets are known for anticancer therapy. These results thus correlate well with anticancer potential of the selected drugs. Interestingly, out of all predicted DTIs, thirty were found to be reported in the ChEMBL database, further validating the adopted methodology. The outcome of this study suggests that the proposed approach, involving use of the improved PCM methodology and molecular docking, can be successfully employed to elucidate the intricate mode of action for drug molecules as well as repositioning them for new therapeutic applications. PMID:26822863
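
    A fingerprint-based applicability domain check of the kind mentioned in this abstract can be sketched as follows. The abstract does not state which similarity measure or cutoff was used, so the Tanimoto coefficient, the 0.5 threshold, the function names, and the representation of a fingerprint as a set of "on" bit indices are all assumptions made for illustration.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # two empty fingerprints: treat as dissimilar
    return len(a & b) / len(union)


def in_applicability_domain(query_fp, training_fps, threshold=0.5):
    """Return True if the query is similar enough to at least one training compound.

    threshold is a hypothetical cutoff, not a value taken from the study.
    """
    best = max((tanimoto(query_fp, fp) for fp in training_fps), default=0.0)
    return best >= threshold


# Made-up fingerprints, for illustration only
training = [{2, 3, 4}, {7, 8}]
inside = in_applicability_domain({1, 2, 3}, training)
outside = in_applicability_domain({1, 2, 3}, [{7, 8}])
```

    A prediction for a query compound outside the domain would then be flagged as unreliable rather than reported alongside in-domain predictions.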

  11. Integrating In Silico Prediction Methods, Molecular Docking, and Molecular Dynamics Simulation to Predict the Impact of ALK Missense Mutations in Structural Perspective

    PubMed Central

    Priya Doss, C. George; Chen, Luonan

    2014-01-01

    Over the past decade, advancements in next generation sequencing technology have placed personalized genomic medicine on the horizon. Classifying disease-causing mutations in complex diseases as pathogenic or neutral remains a major task, and one that is often impractical in the structural context because the experiments are time-consuming and expensive. Among the various disease-causing mutations, single nucleotide polymorphisms (SNPs) play a vital role in defining an individual's susceptibility to disease and drug response. Understanding the genotype-phenotype relationship through SNPs is the first and most important step in drug research and development. Detailed understanding of the effect of SNPs on patient drug response is a key factor in the establishment of personalized medicine. In this paper, we present a computational pipeline for the SNP-centred study of anaplastic lymphoma kinase (ALK) that applies in silico prediction methods, molecular docking, and molecular dynamics simulation approaches. Combining computational methods provides a way to understand how deleterious mutations alter protein drug targets and eventually lead to variable patient drug response. We hope this rapid and cost-effective pipeline will also serve as a bridge to connect clinicians and in silico resources in tailoring treatments to the patients' specific genotype. PMID:25054154

  12. Support vector machines using EEG features of cross-frequency coupling can predict treatment outcome in Mecp2-deficient mice.

    PubMed

    Colic, Sinisa; Wither, Robert G; Min Lang; Zhang Liang; Eubanks, James H; Bardakjian, Berj L

    2015-08-01

    Anti-convulsive drug treatments of epilepsy typically produce varied outcomes from one patient to the next, often requiring patients to go through several anticonvulsive drug trials until an appropriate treatment is found. The focus of this study is to predict treatment outcome using a priori electroencephalogram (EEG) features for a rare genetic model of epilepsy seen in patients with Rett syndrome. Previous work on Mecp2-deficient mice, which exhibit the symptoms of Rett syndrome, has revealed EEG-based biomarkers that track the pathology well. Specifically, the presence of cross-frequency coupling of the delta-like (3-6 Hz) frequency range phase with the fast ripple (400-600 Hz) frequency range amplitude in long duration discharges was found to track seizure pathology. Support Vector Machines (SVM) were trained with features generated from phase-amplitude comodulograms and tested on (n=6) Mecp2-deficient mice to predict treatment outcome to Midazolam, a commonly used anti-convulsive drug. Using SVMs it was shown that it is possible to generate a likelihood score to predict treatment outcomes on all of the animal subjects. Identifying the most appropriate treatment a priori would potentially lead to improved treatment outcomes. PMID:26737563

  13. An NMR and molecular dynamics investigation of the avian prion hexarepeat conformational features in solution

    NASA Astrophysics Data System (ADS)

    Pietropaolo, Adriana; Raiola, Luca; Muccioli, Luca; Tiberio, Giustiniano; Zannoni, Claudio; Fattorusso, Roberto; Isernia, Carla; Mendola, Diego La; Pappalardo, Giuseppe; Rizzarelli, Enrico

    2007-07-01

    The prion protein is a copper binding glycoprotein that in mammals can misfold into a pathogenic isoform leading to prion diseases; surprisingly, this does not occur in avians. The avian prion N-terminal tandem repeat is richer in prolines than the mammalian one, and understanding their effect on conformation is of great biological importance. Here we succeeded in investigating the conformations of a single avian hexarepeat by means of NMR and molecular dynamics techniques. We found a high flexibility and a strong conformational dependence on pH: local turns are present at acidic and neutral pH, while unordered regions dominate at basic conditions.

  14. Predictive features of ligand-specific signaling through the estrogen receptor.

    PubMed

    Nwachukwu, Jerome C; Srinivasan, Sathish; Zheng, Yangfan; Wang, Song; Min, Jian; Dong, Chune; Liao, Zongquan; Nowak, Jason; Wright, Nicholas J; Houtman, René; Carlson, Kathryn E; Josan, Jatinder S; Elemento, Olivier; Katzenellenbogen, John A; Zhou, Hai-Bing; Nettles, Kendall W

    2016-01-01

    Some estrogen receptor-α (ERα)-targeted breast cancer therapies such as tamoxifen have tissue-selective or cell-specific activities, while others have similar activities in different cell types. To identify biophysical determinants of cell-specific signaling and breast cancer cell proliferation, we synthesized 241 ERα ligands based on 19 chemical scaffolds, and compared ligand response using quantitative bioassays for canonical ERα activities and X-ray crystallography. Ligands that regulate the dynamics and stability of the coactivator-binding site in the C-terminal ligand-binding domain, called activation function-2 (AF-2), showed similar activity profiles in different cell types. Such ligands induced breast cancer cell proliferation in a manner that was predicted by the canonical recruitment of the coactivators NCOA1/2/3 and induction of the GREB1 proliferative gene. For some ligand series, a single inter-atomic distance in the ligand-binding domain predicted their proliferative effects. In contrast, the N-terminal coactivator-binding site, activation function-1 (AF-1), determined cell-specific signaling induced by ligands that used alternate mechanisms to control cell proliferation. Thus, incorporating systems structural analyses with quantitative chemical biology reveals how ligands can achieve distinct allosteric signaling outcomes through ERα. PMID:27107013

  15. Predicting fish community properties within estuaries: Influence of habitat type and other environmental features

    NASA Astrophysics Data System (ADS)

    França, Susana; Vasconcelos, Rita P.; Fonseca, Vanessa F.; Tanner, Susanne E.; Reis-Santos, Patrick; Costa, Maria José; Cabral, Henrique N.

    2012-07-01

    Statistical models predicting species distributions are essential not only to increase knowledge on species but for their application in conservation and ecologically-based management. The variation of fish species richness and abundance in the most representative habitats (saltmarsh, mudflat and subtidal) in five estuaries along the Portuguese coast was analysed through seasonal sampling surveys in 2009. Generalized additive models (GAM) were developed to describe the variation of species richness and abundances with a set of geomorphologic, hydrologic and environmental characteristics from the sampled estuaries and habitats. GAM were chosen as the complex interactions dominating these ecosystems and species distribution are non-linear. Final models built for each estuary and for all estuaries together performed well during the calibration phase and also during the validation phase, where an unused data sub-set from each estuary was used. There was not a similar combination of variables retained by the models for the studied estuaries but factors such as the area of the habitat, the distance to estuary mouth, percentage of mud in the sediment and depth were commonly retained. The partial effect of these predictor variables on the variation of species richness and abundance in the estuaries varied markedly and the importance of preserving the heterogeneity of habitats within estuaries was highlighted. Models for each individual estuary performed better than models for estuaries combined. Predictive models could be useful as a preliminary tool to prepare long-term conservation plans at different scales.

  16. Mitral valve apparatus: echocardiographic features predicting the outcome of percutaneous mitral balloon valvotomy

    PubMed Central

    Du Toit, R; Brice, EAW; Van Niekerk, JD; Doubell, AF

    2007-01-01

    Summary. Objectives: To evaluate the significance of involvement of subvalvular apparatus in the outcome of percutaneous mitral balloon valvotomy (PMBV) in patients with mitral stenosis (MS) and to determine the predictive value of chordal length compared with current echocardiographic scores. Methods: Patients with significant MS were selected according to the Massachusetts General Hospital score (MGHS). Chordal lengths were assessed as additional markers of disease. Standard percutaneous valvotomies were performed. Valve area was assessed post-procedure with follow-up over one year. Results: Thirty-nine patients were prospectively studied. Valve area increased from a mean (SD) 0.97 (0.26) cm² to 1.52 (0.38) cm² with procedural success in 31 (79.5%) patients. There was no correlation (r = 0.09) between the MGHS and final valve area (FVA). There was a positive correlation between anterior chordal length and FVA (r = 0.66; p = 0.01). An FVA ≥ 1.5 cm² was associated with higher mean chordal lengths (p = 0.01). A positive correlation was seen between valve area pre-procedure and FVA (r = 0.61; p < 0.01). Conclusions: The MGHS is valuable in the selection of patients for PMBV, but fails to separate selected patients into prognostic groups. Assessment of chordal length provides useful additional information, predicting the outcome of PMBV more accurately. Our data may support the earlier use of PMBV (asymptomatic patients). PMID:17612747

  17. Predicting Essential Genes and Proteins Based on Machine Learning and Network Topological Features: A Comprehensive Review

    PubMed Central

    Zhang, Xue; Acencio, Marcio Luis; Lemke, Ney

    2016-01-01

    Essential proteins/genes are indispensable to the survival or reproduction of an organism, and the deletion of such essential proteins will result in lethality or infertility. The identification of essential genes is very important not only for understanding the minimal requirements for survival of an organism, but also for finding human disease genes and new drug targets. Experimental methods for identifying essential genes are costly, time-consuming, and laborious. With the accumulation of sequenced genome data and high-throughput experimental data, many computational methods for identifying essential proteins have been proposed, which are useful complements to experimental methods. In this review, we present the state-of-the-art methods for identifying essential genes and proteins based on machine learning and network topological features, point out the progress and limitations of current methods, and discuss the challenges and directions for further research. PMID:27014079

  18. MRI features of Binswanger’s disease predict prognosis and associated pathology

    PubMed Central

    Akiguchi, Ichiro; Budka, Herbert; Shirakashi, Yoshitomo; Woehrer, Adelheid; Watanabe, Toshiyuki; Shiino, Akihiko; Yamamoto, Yasumasa; Kawamoto, Yasuhiro; Krampla, Wolfgang; Jungwirth, Susanne; Fischer, Peter

    2014-01-01

    Objective: To identify the prevalence of MRI features of Binswanger’s disease (BD), specifically MRI with diffuse white matter lesions and scattered multiple lacunes (BD-MRI), and to describe neurological features and pathological outcomes of a community-based cohort study. Methods: Of 697 participants (all 75 years old), 503 completed neurological examinations at baseline and were followed-up every 30 months thereafter with MRIs, the mini-mental state examination (MMSE) and the Unified Parkinson Disease Rating Scale-Motor Section (UPDRSM). Data from participants with BD-MRI were compared with those from participants with predominant white matter lesions (WML-MRI), scattered multiple lacunes (ML-MRI), or normal MRIs. Results: Fourteen BD-MRI patients (2.8%) were detected at baseline. The mean MMSE scores in the BD-MRI, WML-MRI, ML-MRI, and normal MRIs groups were 26.4, 28.2, 28.4, and 28.5, respectively, and the mean UPDRSM scores were 9.1, 1.3, 3.1, and 1.7, respectively. At the 30-month follow-up, mortality rates in the normal MRIs, WML-MRI and ML-MRI were 4%, 9.1%, and 22.2%, respectively, and follow-up MRIs were available for 80%, 82%, and 61% of the participants, respectively. In the BD-MRI, however, five patients were deceased, and only five follow-up individual MRIs were available (33.3%). Autopsies were performed on six of eight BD-MRI brains, and these brains fulfilled the pathological criteria for BD independent of Alzheimer disease pathology. All these six individuals also showed systemic atherosclerosis and renal arterio-arteriolosclerosis. Interpretation: The BD-MRI participants had poor prognoses and showed pure BD pathology with advanced systemic vascular disease. BD-MRI appears to be a predictor of vascular neurocognitive impairment. PMID:25493272

  19. Discovery of characteristic molecular signatures for the simultaneous prediction and detection of environmental pollutants.

    PubMed

    Song, Mi-Kyung; Choi, Han-Seam; Park, Yong-Keun; Ryu, Jae-Chun

    2014-02-01

    Gene expression data may be very promising for the classification of toxicant types, but the development and application of transcriptomic-based gene classifiers for environmental toxicological applications are lacking compared to the biomedical sciences. Also, simultaneous classification across a set of toxicant types has not been investigated extensively. In the present study, we determined the transcriptomic response to exposure to three types of ubiquitous toxicants in two types of human cell lines (HepG2 and HL-60), which are useful in vitro human models for evaluation of toxic substances that may affect human hepatotoxicity (e.g., polycyclic aromatic hydrocarbon [PAH] and persistent organic pollutant [POP]) and human leukemic myelopoietic proliferation (e.g., volatile organic compound [VOC]). The findings demonstrate characteristic molecular signatures that facilitated discrimination and prediction of the toxicant type. To evaluate changes in gene expression levels after exposure to environmental toxicants, we utilized 18 chemical substances: nine PAH toxicants, six VOC toxicants, and three POP toxicants. Unsupervised gene expression analysis resulted in a characteristic molecular signature for each toxicant group, and combination analysis of two separate multi-classifications indicated 265 genes as surrogate markers for predicting each group of toxicants with 100% accuracy. Our results suggest that these expression signatures can be used as predictable and discernible surrogate markers for detection and prediction of environmental toxicant exposure. Furthermore, this approach could easily be extended to screening for other types of environmental toxicants. PMID:24197968

  20. Role of electrostatic potential in the in silico prediction of molecular bioactivation and mutagenesis.

    PubMed

    Ford, Kevin A

    2013-04-01

    Electrostatic potential (ESP) is a useful physicochemical property of a molecule that provides insights into inter- and intramolecular associations, as well as prediction of likely sites of electrophilic and nucleophilic metabolic attack. Knowledge of sites of metabolic attack is of paramount importance in DMPK research since drugs frequently fail in clinical trials due to the formation of bioactivated metabolites which are often difficult to measure experimentally due to their reactive nature and relatively short half-lives. Computational chemistry methods have proven invaluable in recent years as a means to predict and study bioactivated metabolites without the need for chemical syntheses, or testing on experimental animals. Additional molecular properties (heat of formation, heat of solvation and E(LUMO) - E(HOMO)) are discussed in this paper as complementary indicators of the behavior of metabolites in vivo. Five diverse examples are presented (acetaminophen, aniline/phenylamines, imidacloprid, nefazodone and vinyl chloride) which illustrate the utility of this multidimensional approach in predicting bioactivation, and in each case the predicted data agreed with experimental data described in the scientific literature. A further example of the usefulness of calculating ESP, in combination with the molecular properties mentioned above, is provided by an examination of the use of these parameters in providing an explanation for the sites of nucleophilic attack of the nucleic acid cytosine. Exploration of sites of nucleophilic attack of nucleic acids is important as adducts of DNA have the potential to result in mutagenesis. PMID:23323940

  1. Modeling Far-UV Fluorescent Emission Features of Warm Molecular Hydrogen in the Inner Regions of Protoplanetary Disks

    NASA Astrophysics Data System (ADS)

    Hoadley, Keri; France, Kevin

    2015-01-01

    Probing the surviving molecular gas within the inner regions of protoplanetary disks (PPDs) around T Tauri stars (1 - 10 Myr) provides insight into the conditions in which planet formation and migration occur while the gas disk is still present. We model far ultraviolet (FUV) molecular hydrogen (H₂) fluorescent emission lines that originate within the inner regions (< 10 AU) of 9 well-studied Classical T Tauri stars, observed with the Hubble Space Telescope Cosmic Origins Spectrograph (COS), to explore the physical structure of the molecular disk at different PPD dust evolutionary stages. We created a 2D radiative transfer model that estimates the density and temperature distributions of warm, inner radial H₂ (T > 1500 K) with a set of 6 free parameters and produces a data cube of expected emission line profiles that describe the physical structure of the inner molecular disk atmosphere. By comparing the modeled emission lines with COS H₂ fluorescence emission features, we estimate the physical structure of the molecular disk atmosphere for each target with the set of free parameters that best replicate the observed lines. First results suggest that, for all dust evolutionary stages of disks considered, ground-state H₂ populations are described by a roughly constant temperature T(H₂) = 2500 ± 1000 K. Possible evolution of the density structure of the H₂ atmosphere between intact and depleting dust disks may be distinguishable, but large errors in the inferred best-fit parameter sets prevent us from making this conclusion. Further improvements to the modeling framework and statistical comparison in determining the best-fit model-to-data parameter sets are ongoing, beginning with improvements to the radiative transfer model and use of up-to-date HI Lyman α absorption optical depths (see McJunkin in posters) to better estimate disk structural parameters. Once improvements are implemented, we will investigate the possible presence of a molecular wind

  2. Crypto-rhombomeres of the mouse medulla oblongata, defined by molecular and morphological features.

    PubMed

    Tomás-Roca, Laura; Corral-San-Miguel, Rubén; Aroca, Pilar; Puelles, Luis; Marín, Faustino

    2016-03-01

    The medulla oblongata is the caudal portion of the vertebrate hindbrain. It contains major ascending and descending fiber tracts as well as several motor and interneuron populations, including neural centers that regulate the visceral functions and the maintenance of bodily homeostasis. In the avian embryo, it has been proposed that the primordium of this region is subdivided into five segments or crypto-rhombomeres (r7-r11), which were defined according to either their parameric position relative to intersomitic boundaries (Cambronero and Puelles, in J Comp Neurol 427:522-545, 2000) or a stepped expression of Hox genes (Marín et al., in Dev Biol 323:230-247, 2008). In the present work, we examine the implied similar segmental organization of the mouse medulla oblongata. To this end, we analyze the expression pattern of Hox genes from groups 3 to 8, comparing them to the expression of given cytoarchitectonic and molecular markers, from mid-gestational to perinatal stages. As a result of this approach, we conclude that the mouse medulla oblongata is segmentally organized, similarly as in avian embryos. Longitudinal structures such as the nucleus of the solitary tract, the dorsal vagal motor nucleus, the hypoglossal motor nucleus, the descending trigeminal and vestibular columns, or the reticular formation appear subdivided into discrete segmental units. Additionally, our analysis identified an internal molecular organization of the migrated pontine nuclei that reflects a differential segmental origin of their neurons as assessed by Hox gene expression. PMID:25381007

  3. Molecular features of hepatosplenic T-cell lymphoma unravels potential novel therapeutic targets

    PubMed Central

    Travert, Marion; Huang, Yenlin; De Leval, Laurence; Martin-Garcia, Nadine; Delfau-Larue, Marie-Helene; Berger, Françoise; Bosq, Jacques; Brière, Josette; Soulier, Jean; Macintyre, Elizabeth; Marafioti, Teresa; de Reyniès, Aurélien; Gaulard, Philippe

    2012-01-01

    Hepatosplenic T-cell lymphoma (HSTL) is a rare entity mostly derived from γδ T cells that shows a fatal outcome. Its pathogenesis remains largely unknown. HSTL samples (7 γδ, 2 αβ) and the DERL2 HSTL-cell line were subject to combined gene expression profiling and array-based comparative genomic hybridization. Compared to other T-cell lymphomas, HSTL disclosed a distinct molecular signature irrespective of TCR cell lineage. Compared to PTCL, NOS and normal γδ cells, HSTL overexpressed genes encoding NK-cell associated molecules, oncogenes (FOS, VAV3), the sphingosine-1-phosphate receptor 5 involved in cell trafficking and the tyrosine kinase SYK, whereas the tumor suppressor gene AIM1 was among the most downexpressed. Methylation analysis of DERL2 cells demonstrated highly methylated CpG islands of AIM1 and decitabine treatment induced significant increase in AIM1 transcripts. Notably, Syk was demonstrated in HSTL cells with its phosphorylated form present in DERL2 cells by Western blot, and in vitro DERL2 cells were sensitive to a Syk inhibitor. Genomic profiles confirmed recurrent isochromosome 7q (n=6/9) without alterations at 9q22 and 6q21 containing SYK and AIM1 genes, respectively. The current study identifies a distinct molecular signature for HSTL and highlights oncogenic pathways which offer rationale for exploring new therapeutic options such as Syk inhibitors and demethylating agents. PMID:22510872

  4. Molecular features of a human rhabdomyosarcoma cell line with spontaneous metastatic progression

    PubMed Central

    Scholl, F A; Betts, D R; Niggli, F K; Schäfer, B W

    2000-01-01

    A novel human cell line was established from a primary botryoid rhabdomyosarcoma. Reverse transcription polymerase chain reaction investigations of this cell line, called RUCH-2, demonstrated expression of the regulatory factors PAX3, Myf3 and Myf5. After 3.5 months in culture, cells underwent a crisis after which Myf3 and Myf5 could no longer be detected, whereas PAX3 expression remained constant over the entire period. Karyotype analysis revealed breakpoints in regions similar to previously described alterations in primary rhabdomyosarcoma tumour samples. Interestingly, cells progressed to a metastatic phenotype, as observed by enhanced invasiveness in vitro and tumour growth in nude mice in vivo. On the molecular level, microarray analysis before and after progression identified extensive changes in the composition of the extracellular matrix. As expected, down-regulation of tissue inhibitors of metalloproteinases and up-regulation of matrix metalloproteinases were observed. Extensive down-regulation of several death receptors of the tumour necrosis factor family suggests that these cells might have an altered response to appropriate apoptotic stimuli. The RUCH-2 cell line represents a cellular model to study multistep tumorigenesis in human rhabdomyosarcoma, allowing molecular comparison of tumorigenic versus metastatic cancer cells. © 2000 Cancer Research Campaign PMID:10735512

  5. Molecular features of triple negative breast cancer cells by genome-wide gene expression profiling analysis.

    PubMed

    Komatsu, Masato; Yoshimaru, Tetsuro; Matsuo, Taisuke; Kiyotani, Kazuma; Miyoshi, Yasuo; Tanahashi, Toshihito; Rokutan, Kazuhito; Yamaguchi, Rui; Saito, Ayumu; Imoto, Seiya; Miyano, Satoru; Nakamura, Yusuke; Sasa, Mitsunori; Shimada, Mitsuo; Katagiri, Toyomasa

    2013-02-01

    Triple negative breast cancer (TNBC) has a poor outcome due to the lack of beneficial therapeutic targets. To clarify the molecular mechanisms involved in the carcinogenesis of TNBC and to identify target molecules for novel anticancer drugs, we analyzed the gene expression profiles of 30 TNBCs as well as 13 normal epithelial ductal cells that were purified by laser-microbeam microdissection. We identified 301 and 321 transcripts that were significantly upregulated and downregulated in TNBC, respectively. In particular, gene expression profile analyses of normal human vital organs allowed us to identify 104 cancer-specific genes, including those involved in breast carcinogenesis such as NEK2, PBK and MELK. Moreover, gene annotation enrichment analysis revealed prominent gene subsets involved in the cell cycle, especially mitosis. Therefore, we focused on cell cycle regulators, asp (abnormal spindle) homolog, microcephaly-associated (Drosophila) (ASPM) and centromere protein K (CENPK) as novel therapeutic targets for TNBC. Small-interfering RNA-mediated knockdown of their expression significantly attenuated TNBC cell viability due to G1 and G2/M cell cycle arrest. Our data will provide a better understanding of the carcinogenesis of TNBC and could contribute to the development of molecular targets as a treatment for TNBC patients. PMID:23254957

  6. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation

    PubMed Central

    Paluch, Andrew S.; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L.

    2015-01-01

    We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes. PMID:25637996

  7. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation.

    PubMed

    Paluch, Andrew S; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L

    2015-01-28

    We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes. PMID:25637996

  8. Predicting the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol mixtures via molecular simulation

    NASA Astrophysics Data System (ADS)

    Paluch, Andrew S.; Parameswaran, Sreeja; Liu, Shuai; Kolavennu, Anasuya; Mobley, David L.

    2015-01-01

    We present a general framework to predict the excess solubility of small molecular solids (such as pharmaceutical solids) in binary solvents via molecular simulation free energy calculations at infinite dilution with conventional molecular models. The present study used molecular dynamics with the General AMBER Force Field to predict the excess solubility of acetanilide, acetaminophen, phenacetin, benzocaine, and caffeine in binary water/ethanol solvents. The simulations are able to predict the existence of solubility enhancement and the results are in good agreement with available experimental data. The accuracy of the predictions in addition to the generality of the method suggests that molecular simulations may be a valuable design tool for solvent selection in drug development processes.

  9. Prevailing Features of X-Ray-Induced Molecular Electron Spectra Revealed with Fullerenes

    NASA Astrophysics Data System (ADS)

    Camacho Garibay, Abraham; Saalmann, Ulf; Rost, Jan M.

    2014-08-01

    X-ray photoabsorption from intense short pulses by a molecule triggers complicated electron and subsequently ion dynamics, leading to photoelectron spectra, which are difficult to interpret. Illuminating fullerenes offers a way to separate out the electron dynamics since the cage structure confines spatially the origin of photo- and Auger electrons. Together with the sequential nature of the photoprocesses at intensities available at x-ray free-electron lasers, this allows for a remarkably detailed interpretation of the photoelectron spectra, as we will demonstrate. The general features derived can serve as a paradigm for less well-defined situations in other large molecules or clusters.

  10. Labyrinthine water flow across multilayer graphene-based membranes: Molecular dynamics versus continuum predictions

    NASA Astrophysics Data System (ADS)

    Yoshida, Hiroaki; Bocquet, Lydéric

    2016-06-01

    In this paper, we investigate the hydrodynamic permeance of water through graphene-based membranes, inspired by recent experimental findings on graphene-oxide membranes. We consider the flow across multiple graphene layers having nanoslits in a staggered alignment, with an inter-layer distance ranging from sub-nanometer to a few nanometers. We compare results for the permeability obtained by means of molecular dynamics simulations to continuum predictions obtained by using the lattice Boltzmann calculations and hydrodynamic modelization. This highlights that, in spite of extreme confinement, the permeability across the graphene-based membrane is quantitatively predicted on the basis of a continuum expression, taking properly into account entrance and slippage effects of the confined water flow. Our predictions refute the breakdown of hydrodynamics at small scales in these membrane systems. They constitute a benchmark to which we compare published experimental data.

  11. Labyrinthine water flow across multilayer graphene-based membranes: Molecular dynamics versus continuum predictions.

    PubMed

    Yoshida, Hiroaki; Bocquet, Lydéric

    2016-06-21

    In this paper, we investigate the hydrodynamic permeance of water through graphene-based membranes, inspired by recent experimental findings on graphene-oxide membranes. We consider the flow across multiple graphene layers having nanoslits in a staggered alignment, with an inter-layer distance ranging from sub-nanometer to a few nanometers. We compare results for the permeability obtained by means of molecular dynamics simulations to continuum predictions obtained by using the lattice Boltzmann calculations and hydrodynamic modelization. This highlights that, in spite of extreme confinement, the permeability across the graphene-based membrane is quantitatively predicted on the basis of a continuum expression, taking properly into account entrance and slippage effects of the confined water flow. Our predictions refute the breakdown of hydrodynamics at small scales in these membrane systems. They constitute a benchmark to which we compare published experimental data. PMID:27334184

  12. HybridGO-Loc: mining hybrid features on gene ontology for predicting subcellular localization of multi-location proteins.

    PubMed

    Wan, Shibiao; Mak, Man-Wai; Kung, Sun-Yuan

    2014-01-01

    Protein subcellular localization prediction, as an essential step to elucidate the in vivo functions of proteins and identify drug targets, has been extensively studied in previous decades. Instead of only determining subcellular localization of single-label proteins, recent studies have focused on predicting both single- and multi-location proteins. Computational methods based on Gene Ontology (GO) have been demonstrated to be superior to methods based on other features. However, existing GO-based methods focus on the occurrences of GO terms and disregard their relationships. This paper proposes a multi-label subcellular-localization predictor, namely HybridGO-Loc, that leverages not only the GO term occurrences but also the inter-term relationships. This is achieved by hybridizing the GO frequencies of occurrences and the semantic similarity between GO terms. Given a protein, a set of GO terms is retrieved by searching against the gene ontology database, using the accession numbers of homologous proteins obtained via BLAST search as the keys. The frequency of GO occurrences and semantic similarity (SS) between GO terms are used to formulate frequency vectors and semantic similarity vectors, respectively, which are subsequently hybridized to construct fusion vectors. An adaptive-decision based multi-label support vector machine (SVM) classifier is proposed to classify the fusion vectors. Experimental results based on recent benchmark datasets and a new dataset containing novel proteins show that the proposed hybrid-feature predictor significantly outperforms predictors based on individual GO features as well as other state-of-the-art predictors. For readers' convenience, the HybridGO-Loc server, which is for predicting virus or plant proteins, is available online at http://bioinfo.eie.polyu.edu.hk/HybridGoServer/. PMID:24647341
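
    The feature construction described in this abstract (a frequency vector, a semantic similarity vector, and a fusion of the two) can be sketched roughly as follows. This is a minimal illustration, not the HybridGO-Loc implementation: the toy vocabulary, the ancestor sets, the Jaccard-based similarity, and the use of plain concatenation for the "fusion" step are all assumptions made here for illustration.

```python
from collections import Counter

# Toy GO vocabulary and ancestor sets (hypothetical, for illustration only).
VOCAB = ["GO:A", "GO:B", "GO:C"]
ANCESTORS = {
    "GO:A": {"GO:A", "GO:root"},
    "GO:B": {"GO:B", "GO:A", "GO:root"},
    "GO:C": {"GO:C", "GO:root"},
}


def similarity(t1, t2):
    """Jaccard overlap of ancestor sets; a stand-in for a real semantic similarity measure."""
    a, b = ANCESTORS[t1], ANCESTORS[t2]
    return len(a & b) / len(a | b)


def frequency_vector(terms):
    """Count how often each vocabulary term occurs among the retrieved terms."""
    counts = Counter(terms)
    return [counts[t] for t in VOCAB]


def similarity_vector(terms):
    """For each vocabulary term, the best similarity to any retrieved term."""
    return [max((similarity(v, t) for t in terms), default=0.0) for v in VOCAB]


def fusion_vector(terms):
    """Assumed fusion: concatenate the frequency and similarity vectors."""
    return frequency_vector(terms) + similarity_vector(terms)


# GO terms retrieved for one hypothetical protein (duplicates are allowed).
protein_terms = ["GO:A", "GO:B", "GO:B"]
print(fusion_vector(protein_terms))
```

    The resulting vector would then be passed to a multi-label classifier; the abstract names an adaptive-decision multi-label SVM, which is not reproduced here.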

  13. Molecular Dynamics Simulations and Kinetic Measurements to Estimate and Predict Protein-Ligand Residence Times.

    PubMed

    Mollica, Luca; Theret, Isabelle; Antoine, Mathias; Perron-Sierra, Françoise; Charton, Yves; Fourquez, Jean-Marie; Wierzbicki, Michel; Boutin, Jean A; Ferry, Gilles; Decherchi, Sergio; Bottegoni, Giovanni; Ducrot, Pierre; Cavalli, Andrea

    2016-08-11

    Ligand-target residence time is emerging as a key drug discovery parameter because it can reliably predict drug efficacy in vivo. Experimental approaches to binding and unbinding kinetics are nowadays available, but we still lack reliable computational tools for predicting kinetics and residence time. Most attempts have been based on brute-force molecular dynamics (MD) simulations, which are CPU-demanding and not yet particularly accurate. We recently reported a new scaled-MD-based protocol, which showed potential for residence time prediction in drug discovery. Here, we further challenged our procedure's predictive ability by applying our methodology to a series of glucokinase activators that could be useful for treating type 2 diabetes mellitus. We combined scaled MD with experimental kinetics measurements and X-ray crystallography, promptly checking the protocol's reliability by directly comparing computational predictions and experimental measures. The good agreement highlights the potential of our scaled-MD-based approach as an innovative method for computationally estimating and predicting drug residence times. PMID:27391254

  14. Iodine atoms: a new molecular feature for the design of potent transthyretin fibrillogenesis inhibitors.

    PubMed

    Mairal, Teresa; Nieto, Joan; Pinto, Marta; Almeida, Maria Rosário; Gales, Luis; Ballesteros, Alfredo; Barluenga, José; Pérez, Juan J; Vázquez, Jesús T; Centeno, Nuria B; Saraiva, Maria Joao; Damas, Ana M; Planas, Antoni; Arsequell, Gemma; Valencia, Gregorio

    2009-01-01

    The thyroid hormone and retinol transporter protein known as transthyretin (TTR) is at the origin of one of the 20 or so known amyloid diseases. TTR self-assembles as a homotetramer leaving a central hydrophobic channel with two symmetrical binding sites. The aggregation pathway of TTR into amyloid fibrils is not yet well characterized but in vitro binding of thyroid hormones and other small organic molecules to the TTR binding channel results in tetramer stabilization which prevents amyloid formation to an extent that is proportional to the binding constant. Up to now, TTR aggregation inhibitors have been designed looking at various structural features of this binding channel other than its ability to host iodine atoms. In the present work, greatly improved inhibitors have been designed and tested by taking into account that thyroid hormones are unique in human biochemistry owing to the presence of multiple iodine atoms in their molecules which are probed to interact with specific halogen binding domains sitting at the TTR binding channel. The new TTR fibrillogenesis inhibitors are based on the diflunisal core structure because diflunisal is a registered salicylate drug with NSAID activity now undergoing clinical trials for TTR amyloid diseases. Biochemical and biophysical evidence confirms that iodine atoms can be an important design feature in the search for candidate drugs for TTR-related amyloidosis. PMID:19125186

  15. Iodine Atoms: A New Molecular Feature for the Design of Potent Transthyretin Fibrillogenesis Inhibitors

    PubMed Central

    Pinto, Marta; Almeida, Maria Rosário; Gales, Luis; Ballesteros, Alfredo; Barluenga, José; Pérez, Juan J.; Vázquez, Jesús T.; Centeno, Nuria B.; Saraiva, Maria Joao; Damas, Ana M.; Planas, Antoni; Arsequell, Gemma; Valencia, Gregorio

    2009-01-01

    The thyroid hormone and retinol transporter protein known as transthyretin (TTR) is at the origin of one of the 20 or so known amyloid diseases. TTR self-assembles as a homotetramer leaving a central hydrophobic channel with two symmetrical binding sites. The aggregation pathway of TTR into amyloid fibrils is not yet well characterized but in vitro binding of thyroid hormones and other small organic molecules to the TTR binding channel results in tetramer stabilization which prevents amyloid formation to an extent that is proportional to the binding constant. Up to now, TTR aggregation inhibitors have been designed looking at various structural features of this binding channel other than its ability to host iodine atoms. In the present work, greatly improved inhibitors have been designed and tested by taking into account that thyroid hormones are unique in human biochemistry owing to the presence of multiple iodine atoms in their molecules which are probed to interact with specific halogen binding domains sitting at the TTR binding channel. The new TTR fibrillogenesis inhibitors are based on the diflunisal core structure because diflunisal is a registered salicylate drug with NSAID activity now undergoing clinical trials for TTR amyloid diseases. Biochemical and biophysical evidence confirms that iodine atoms can be an important design feature in the search for candidate drugs for TTR-related amyloidosis. PMID:19125186

  16. Clinicopathologic and Molecular Features of Colorectal Adenocarcinoma with Signet-Ring Cell Component

    PubMed Central

    Gao, Jing; Li, Jian; Li, Jie; Qi, Changsong; Li, Yanyan; Li, Zhongwu; Shen, Lin

    2016-01-01

    Background We performed a retrospective study to assess the clinicopathological characteristics, molecular alterations and multigene mutation profiles in colorectal cancer patients with signet-ring cell component. Methods Between November 2008 and January 2015, 61 consecutive primary colorectal carcinomas with signet-ring cell component were available for pathological confirmation. RAS/BRAF status was determined by direct sequencing. 14 genes associated with hereditary cancer syndromes were analyzed by targeted gene sequencing. Results A slight male predominance was detected in these patients (59.0%). Colorectal carcinomas with signet-ring cell component were well distributed along the large intestine. A frequently higher TNM stage at the time of diagnosis was observed, compared with the conventional adenocarcinoma. Family history of malignant tumor was remarkable with 49.2% in 61 cases. The median OS time of stage IV patients in our study was 14 months. RAS mutations were detected in 22.2% (12/54) cases with KRAS mutations in 16.7% (9/54) cases and NRAS mutations in 5.4% (3/54) cases. BRAF V600E mutation was detected in 3.7% (2/54) cases. As an exploration, we analyzed 14 genes by targeted gene sequencing. These genes were selected based on their biological role in association with hereditary cancer syndromes. 79.6% cases carried at least one pathogenic mutation. Finally, the patients were classified by the percentage of signet-ring cell. 39 (63.9%) cases were composed of ≥50% signet-ring cells; 22 (36.1%) cases were composed of <50% signet-ring cells. We compared clinical parameters, molecular and genetic alterations between the two groups and found no significant differences. Conclusions Colorectal adenocarcinoma with signet-ring cell component is characterized by advanced stage at diagnosis with remarkable family history of malignant tumor. It is likely a negative prognostic factor and tends to affect male patients with low rates of RAS/BRAF mutation. Colorectal

  17. Structural features of low-dimensional molecular conductors-Representatives of new hybrid polyfunctional materials: Review

    SciTech Connect

    Shibaeva, R. P.; Khasanov, S. S.; Zorina, L. V.; Simonov, S. V.

    2006-12-15

    The crystal structures of the family of low-dimensional molecular conductors based on radical cation salts of different organic π donors with photochromic and magnetic metal complexes as anions have been considered. This class of supramolecular systems demonstrates a large variety of structural types and a wide range of transport properties. The specificity of the structure and properties of such hybrid materials is illustrated by several examples. The crystallochemical analysis of the conductors considered indicates the possibility of purposeful control of their transport properties via changing of the charge, sizes, shape, and symmetry of the anionic block components. The specificity of the crystal structure and properties of some organic conductors shows that such systems can be used as model systems in the study of new physical phenomena related to electron correlation and effects of charge ordering.

  18. Structural model of human GAD65: prediction and interpretation of biochemical and immunogenic features.

    PubMed

    Capitani, Guido; De Biase, Daniela; Gut, Heinz; Ahmed, Shaheen; Grütter, Markus G

    2005-04-01

    The 65 kDa human isoform of glutamate decarboxylase, GAD65, plays a central role in neurotransmission in higher vertebrates and is a typical autoantigen in several human autoimmune diseases, such as insulin-dependent diabetes mellitus (IDDM), Stiff-man syndrome and autoimmune polyendocrine syndrome type I. In autoimmune diabetes, an attack of inflammatory cells to endocrine pancreatic beta-cells leads to their complete destruction, eventually resulting in the inability to produce sufficient insulin for the body's requirements. Even though the etiology of beta-cell destruction is still a matter of debate, the role and antigenic potency of GAD65 are widely recognized. Herein a model of GAD65 is presented, which is based on the recently solved crystal structures of mammalian DOPA decarboxylase and of bacterial glutamate decarboxylase. The model provides for the first time a detailed and accurate structure of the GAD65 subunit (all three domains) and of its dimeric quaternary assembly. It reveals the structural basis for specific antibody recognition to GAD65 as opposed to GAD67, the other human isoform, which shares 81% sequence similarity with GAD65 and is much less antigenic. Literature data on monoclonal antibody binding are perfectly consistent with the detailed features of the model, which allows explanation of several findings on GAD65 immunogenicity. Importantly, by analyzing the active site, we identified the residues most likely involved in catalysis and substrate recognition, paving the way for rational mutagenesis studies of the GAD65 reaction mechanism, specificity and inhibition. PMID:15690345

  19. Molecular characteristics and prognostic features of breast cancer in Nigerian compared with UK women.

    PubMed

    Agboola, A J; Musa, A A; Wanangwa, N; Abdel-Fatah, T; Nolan, C C; Ayoade, B A; Oyebadejo, T Y; Banjo, A A; Deji-Agboola, A M; Rakha, E A; Green, A R; Ellis, I O

    2012-09-01

    Although breast cancer (BC) incidence is lower in African-American women compared with White-American, in African countries such as Nigeria, BC is a common disease. Nigerian women have a higher risk for early-onset, with a high mortality rate from BC, prompting speculation that risk factors could be genetic and the molecular portrait of these tumours is different from that of western women. In this study, 308 BC samples from Nigerian women with complete clinical history and tumour characteristics were included and compared with a large series of BC from the UK as a control group. Immunoprofile of these tumours was characterised using a panel of 11 biomarkers of known relevance to BC. The immunoprofile and patients' outcome were compared with tumour grade-matched UK control group. Nigerian women presenting with BC were more frequently premenopausal, and their tumours were characterised by large primary tumour size, high tumour grade, advanced lymph node stage, and a higher rate of vascular invasion compared with UK women. In the grade-matched groups, Nigerian BC showed over-representation of triple-negative and basal phenotypes and BRCA1 deficiency BC compared with UK women, but no difference was found regarding HER2 expression between the two series. Nigerian women showed significantly poorer outcome after development of BC compared with UK women. This study demonstrates that there are possible genetic and molecular differences between an indigenous Black population and a UK-based series. The basal-like, triple negative and BRCA1 dysfunction groups of tumours identified in this study may have implications in the development of screening programs and therapies for African patients and families that are likely to have a BRCA1 dysfunction, basal like and triple negative. PMID:22842985

  20. Molecular Profiling in Unknown Primary Cancer: Accuracy of Tissue of Origin Prediction

    PubMed Central

    Spigel, David R.; Yardley, Denise A.; Erlander, Mark G.; Ma, Xiao-Jun; Hainsworth, John D.

    2010-01-01

    Introduction. This retrospective, multi-institutional study evaluated the accuracy of tissue-of-origin prediction by molecular profiling in patients with carcinoma of unknown primary site (CUP). Methods. Thirty-eight of 501 patients (7.6%) with CUP, seen in 2000–2008, had their latent primary site tumor subsequently identified during life. Twenty-eight of these patients (73.7%) had adequate initial tissue biopsies available for molecular profiling with a reverse transcriptase-polymerase chain reaction (RT-PCR) assay (Cancer Type ID; bioTheranostics, Inc., San Diego, CA). The assay was performed on formalin-fixed paraffin-embedded biopsy specimens in a blinded fashion, and the assay results were compared with clinicopathologic data and the actual latent primary sites. Results. Twenty of the 28 (71.4%) RT-PCR assays were successfully completed (eight biopsies had either insufficient tumor or poorly preserved RNA). Fifteen of the 20 assay predictions (75%) were correct (95% confidence interval, 60%–85%), corresponding to the actual latent primary sites identified after the initial diagnosis of CUP. Primary sites correctly identified included breast (four patients), ovary/primary peritoneal (four patients), non-small cell lung (three patients), colorectal (two patients), gastric (one patient), and melanoma (one patient). Three predictions were incorrect (intestinal, testicular, sarcoma) in patients with gastroesophageal, pancreatic, and non-small cell lung cancer, respectively, and two were unclassifiable in patients with non-small cell lung cancer. Clinicopathologic findings were helpful in suggesting the correct primary site in some patients and appear to complement the molecular assay findings. Conclusions. These data validate the reliability of this assay in predicting the primary site in CUP patients and may form the basis for more successful site-directed therapy, when used in concert with clinicopathologic data. PMID:20427384
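
    The percentages quoted in this abstract follow directly from the reported counts. A short check, using only the numbers stated above (the labels and the helper name `pct` are chosen here for illustration), might look like this:

```python
def pct(numerator, denominator):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100 * numerator / denominator, 1)


# Each step of the patient funnel: (label, numerator, denominator).
funnel = [
    ("latent primary site later identified", 38, 501),
    ("adequate biopsy available for profiling", 28, 38),
    ("RT-PCR assay successfully completed", 20, 28),
    ("assay prediction correct", 15, 20),
]
for label, n, d in funnel:
    print(f"{label}: {n}/{d} = {pct(n, d)}%")

# Breakdown of the 20 completed assays, as listed in the Results.
correct_by_site = {
    "breast": 4,
    "ovary/primary peritoneal": 4,
    "non-small cell lung": 3,
    "colorectal": 2,
    "gastric": 1,
    "melanoma": 1,
}
incorrect, unclassifiable = 3, 2
total = sum(correct_by_site.values()) + incorrect + unclassifiable
print("completed assays accounted for:", total)
```

    The site-level counts sum to the 15 correct predictions, and together with the incorrect and unclassifiable results they account for all 20 completed assays.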

  1. Influence of Feature Encoding and Choice of Classifier on Disease Risk Prediction in Genome-Wide Association Studies

    PubMed Central

    Mittag, Florian; Römer, Michael; Zell, Andreas

    2015-01-01

    Various attempts have been made to predict the individual disease risk based on genotype data from genome-wide association studies (GWAS). However, most studies only investigated one or two classification algorithms and feature encoding schemes. In this study, we applied seven different classification algorithms on GWAS case-control data sets for seven different diseases to create models for disease risk prediction. Further, we used three different encoding schemes for the genotypes of single nucleotide polymorphisms (SNPs) and investigated their influence on the predictive performance of these models. Our study suggests that an additive encoding of the SNP data should be the preferred encoding scheme, as it proved to yield the best predictive performances for all algorithms and data sets. Furthermore, our results showed that the differences between most state-of-the-art classification algorithms are not statistically significant. Consequently, we recommend preferring algorithms with simple models like the linear support vector machine (SVM) as they allow for better subsequent interpretation without significant loss of accuracy. PMID:26285210
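
    As a rough illustration of what an additive SNP encoding usually means, each biallelic genotype can be mapped to the number of copies of the minor allele it carries (0, 1 or 2). The sketch below assumes genotypes are given as two-letter strings and that the minor allele of each SNP is known in advance; the function names and the toy data are hypothetical, and the abstract does not specify the exact encoding details used in the study.

```python
def additive_encode(genotype, minor_allele):
    """Number of copies of the minor allele (0, 1 or 2) in a biallelic genotype."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
    return sum(1 for allele in genotype if allele == minor_allele)


def encode_sample(genotypes, minor_alleles):
    """Encode one individual's genotypes, one value per SNP."""
    return [additive_encode(g, m) for g, m in zip(genotypes, minor_alleles)]


# Toy data: two individuals, three SNPs (illustrative values only).
minor_alleles = ["G", "T", "A"]
samples = [
    ["AA", "CT", "GG"],
    ["AG", "TT", "GG"],
]
features = [encode_sample(s, minor_alleles) for s in samples]
print(features)
```

    Each row of `features` could then serve as the input vector for a classifier such as a linear SVM.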

  2. Clinical and Molecular Features of Laron Syndrome, A Genetic Disorder Protecting from Cancer.

    PubMed

    Janecka, Anna; Kołodziej-Rzepa, Marta; Biesaga, Beata

    2016-01-01

    Laron syndrome (LS) is a rare, genetic disorder inherited in an autosomal recessive manner. The disease is caused by mutations of the growth hormone receptor (GHR) gene, leading to GH/insulin-like growth factor type 1 (IGF1) signalling pathway defect. Patients with LS have characteristic biochemical features, such as a high serum level of GH and low IGF1 concentration. Laron syndrome was first described by the Israeli physician Zvi Laron in 1966. Globally, around 350 people are affected by this syndrome and there are two large groups living in separate geographic regions: Israel (69 individuals) and Ecuador (90 individuals). They are all characterized by typical appearance such as dwarfism, facial phenotype, obesity and hypogenitalism. Additionally, they suffer from hypoglycemia, hypercholesterolemia and sleep disorders, but surprisingly have a very low cancer risk. Therefore, studies on LS offer a unique opportunity to better understand carcinogenesis and develop new strategies of cancer treatment. PMID:27381597

  3. Foreign exchange market data analysis reveals statistical features that predict price movement acceleration

    NASA Astrophysics Data System (ADS)

    Nacher, Jose C.; Ochiai, Tomoshiro

    2012-05-01

    Increasingly accessible financial data allow researchers to infer market-dynamics-based laws and to propose models that are able to reproduce them. In recent years, several stylized facts have been uncovered. Here we perform an extensive analysis of foreign exchange data that leads to the unveiling of a statistical financial law. First, our findings show that, on average, volatility increases more when the price exceeds the highest (or lowest) value, i.e., breaks the resistance line. We call this the breaking-acceleration effect. Second, our results show that the probability P(T) to break the resistance line in the past time T follows a power law in both real data and theoretically simulated data. However, the probability calculated using real data is rather lower than the one obtained using a traditional Black-Scholes (BS) model. Taken together, the present analysis characterizes a different stylized fact of financial markets and shows that the market exceeds a past (historical) extreme price fewer times than expected by the BS model (the resistance effect). However, when the market does, we predict that the average volatility at that time point will be much higher. These findings indicate that any Markovian model does not faithfully capture the market dynamics.
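
    One way to make the notion of "breaking the resistance line" concrete is to treat the highest and lowest prices over the previous T observations as the resistance levels and to count how often the next price falls outside that range. The sketch below is an assumed reading of the abstract, not the authors' procedure: the window definition, the function names, and the toy price series are illustrative only.

```python
def breaks_resistance(prices, t, window):
    """True if prices[t] exceeds the max or falls below the min of the previous `window` prices."""
    past = prices[t - window:t]
    return prices[t] > max(past) or prices[t] < min(past)


def break_probability(prices, window):
    """Empirical fraction of time points at which the past-`window` extreme is broken."""
    if len(prices) <= window:
        raise ValueError("price series must be longer than the look-back window")
    indices = range(window, len(prices))
    hits = sum(breaks_resistance(prices, t, window) for t in indices)
    return hits / len(indices)


# Toy price series (illustrative values only, not market data).
prices = [1.00, 1.02, 1.01, 1.03, 1.02, 1.00, 0.99, 1.04]
for window in (2, 3, 4):
    print(window, break_probability(prices, window))
```

    Repeating this estimate over a range of window lengths on real and on simulated series would give the kind of P(T) curves that the abstract compares.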

  4. Foreign exchange market data analysis reveals statistical features that predict price movement acceleration.

    PubMed

    Nacher, Jose C; Ochiai, Tomoshiro

    2012-05-01

    Increasingly accessible financial data allow researchers to infer market-dynamics-based laws and to propose models that are able to reproduce them. In recent years, several stylized facts have been uncovered. Here we perform an extensive analysis of foreign exchange data that leads to the unveiling of a statistical financial law. First, our findings show that, on average, volatility increases more when the price exceeds the highest (or lowest) value, i.e., breaks the resistance line. We call this the breaking-acceleration effect. Second, our results show that the probability P(T) to break the resistance line in the past time T follows a power law in both real data and theoretically simulated data. However, the probability calculated using real data is rather lower than the one obtained using a traditional Black-Scholes (BS) model. Taken together, the present analysis characterizes a different stylized fact of financial markets and shows that the market exceeds a past (historical) extreme price fewer times than expected by the BS model (the resistance effect). However, when the market does, we predict that the average volatility at that time point will be much higher. These findings indicate that any Markovian model does not faithfully capture the market dynamics. PMID:23004832

  5. Retrospective analysis of molecular scores for the prediction of distant recurrence according to baseline risk factors.

    PubMed

    Sestak, Ivana; Dowsett, Mitch; Ferree, Sean; Baehner, Frederick L; Cuzick, Jack

    2016-08-01

    Clinical variables and several gene signature profiles have been investigated for the prediction of (distant) recurrence in several trials. These molecular markers are significantly correlated with overall and late distant recurrences. Here, we retrospectively explore whether age and body mass index (BMI) affect the prediction of these molecular scores for distant recurrence in postmenopausal women with hormone receptor-positive breast cancer in the transATAC trial. 940 postmenopausal women for whom the Clinical Treatment Score (CTS), immunohistochemical markers (IHC4), Oncotype Recurrence Score (RS), and the Prosigna Risk of Recurrence Score (ROR) were available were included in this retrospective analysis. Conventional BMI groups were used (N = 865), and age was split into equal tertiles (N = 940). Cox proportional hazard models were used to determine the effect of a molecular score for the prediction of distant recurrence according to BMI and age groups. In both the univariate and bivariate analyses, the effect size of the IHC4 and RS was strongest in women aged 59.8 years or younger. Trend tests for age were significant for the IHC4 and RS, but not for the CTS and ROR, for which most prognostic information was added in women aged 60 years or older. The CTS and ROR scores added significant prognostic information in all three BMI groups. In both the univariate and bivariate analyses, the IHC4 provided the most prognostic information in women with a BMI lower than 25 kg/m², whereas the RS did not add prognostic information for distant recurrence in women with a BMI of 30 kg/m² or above. Molecular scores are increasingly used in women with breast cancer to assess recurrence risk. We have shown that the effect size of the molecular scores is significantly different across age groups, but not across BMI groups. The results from this retrospective analysis may be incorporated in the identification of women who may benefit most from the use of these

  6. LMO2 expression reflects the different stages of blast maturation and genetic features in B-cell acute lymphoblastic leukemia and predicts clinical outcome

    PubMed Central

    Malumbres, Raquel; Fresquet, Vicente; Roman-Gomez, Jose; Bobadilla, Miriam; Robles, Eloy F.; Altobelli, Giovanna G.; Calasanz, M.ª José; Smeland, Erlend B.; Aznar, Maria Angela; Agirre, Xabier; Martin-Palanco, Vanesa; Prosper, Felipe; Lossos, Izidore S.; Martinez-Climent, Jose A.

    2011-01-01

    Background LMO2 is highly expressed at the most immature stages of lymphopoiesis. In T-lymphocytes, aberrant LMO2 expression beyond those stages leads to T-cell acute lymphoblastic leukemia, while in B cells LMO2 is also expressed in germinal center lymphocytes and diffuse large B-cell lymphomas, where it predicts better clinical outcome. The implication of LMO2 in B-cell acute lymphoblastic leukemia must still be explored. Design and Methods We measured LMO2 expression by real time RT-PCR in 247 acute lymphoblastic leukemia patient samples with cytogenetic data (144 of them also with survival and immunophenotypical data) and in normal hematopoietic and lymphoid cells. Results B-cell acute lymphoblastic leukemia cases expressed variable levels of LMO2 depending on immunophenotypical and cytogenetic features. Thus, the most immature subtype, pro-B cells, displayed three-fold higher LMO2 expression than pre-B cells, common-CD10+ or mature subtypes. Additionally, cases with TEL-AML1 or MLL rearrangements exhibited two-fold higher LMO2 expression compared to cases with BCR-ABL rearrangements or hyperdiploid karyotype. Clinically, high LMO2 expression correlated with better overall survival in adult patients (5-year survival rate 64.8% (42.5%–87.1%) vs. 25.8% (10.9%–40.7%), P = 0.001) and constituted a favorable independent prognostic factor in B-ALL with normal karyotype: 5-year survival rate 80.3% (66.4%–94.2%) vs. 63.0% (46.1%–79.9%) (P = 0.043). Conclusions Our data indicate that LMO2 expression depends on the molecular features and the differentiation stage of B-cell acute lymphoblastic leukemia cells. Furthermore, assessment of LMO2 expression in adult patients with a normal karyotype, a group which lacks molecular prognostic factors, could be of clinical relevance. PMID:21459790

  7. Risk of malignancy in nonpalpable thyroid nodules: predictive value of ultrasound and color-Doppler features.

    PubMed

    Papini, Enrico; Guglielmi, Rinaldo; Bianchini, Antonio; Crescenzi, Anna; Taccogna, Silvia; Nardi, Francesco; Panunzi, Claudio; Rinaldi, Roberta; Toscano, Vincenzo; Pacella, Claudio M

    2002-05-01

    The aim of the study was to correlate the sonographic [ultrasound (US)] and color-Doppler (CFD) findings with the results of US-guided fine needle aspiration biopsy (FNA) and of pathologic staging of resected carcinomas to establish: 1) the relative importance of US features as risk factors of malignancy; and 2) a cost-effective management of nonpalpable thyroid nodules. Four hundred ninety-four consecutive patients with nonpalpable thyroid nodules (8-15 mm) were evaluated by US, CFD, and US-FNA. Ninety-two patients with inadequate cytology were excluded from the study. All patients with suspicious or malignant cytology underwent surgery, whereas subjects with benign cytology had clinical and US control 6 months later. Thyroid malignancies were observed in 18 of 195 (9.2%) solitary thyroid nodules and in 13 of 207 (6.3%) multinodular goiters. Cancer prevalence was similar in nodules greater or smaller than 10 mm (9.1 vs. 7.0%). Extracapsular growth (pT(4)) was present in 35.5%, and nodal involvement in 19.4% of neoplastic lesions, with no significant differences between tumors greater or smaller than 10 mm. At US, cancers presented a solid hypoechoic appearance in 87% of cases, irregular or blurred margins in 77.4%, an intranodular vascular pattern in 74.2%, and microcalcifications in 29.0%. Irregular margins (RR 16.83), intranodular vascular spots (RR 14.29), and microcalcifications (RR 4.97) were independent risk factors of malignancy. FNA performed on hypoechoic nodules with at least one risk factor was able to identify 87% of the cancers at the expense of cytological evaluation of 38.4% of nonpalpable lesions. The majority of nonpalpable thyroid tumors can be identified by cytological evaluation of lesions presenting hypoechoic appearance in conjunction with one independent risk factor. Due to the nonnegligible prevalence of extracapsular growth and nodal metastasis, US-FNA should be performed on all 8-15 mm hypoechoic nodules with irregular margins
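
    The proportions quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the enrolment and cancer counts come from the abstract, while the per-feature counts (11, 6, 27, 24, 23, 9 of 31) are assumptions back-calculated from the reported percentages, not figures stated by the authors.

```python
# Counts stated in the abstract.
enrolled = 494
inadequate_cytology = 92
evaluable = enrolled - inadequate_cytology       # patients retained in the study

solitary_total, solitary_cancers = 195, 18
goiter_total, goiter_cancers = 207, 13
cancers = solitary_cancers + goiter_cancers      # all malignancies observed


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100.0 * part / whole, 1)


# Inferred (assumed) numerators out of 31 cancers that reproduce the
# reported percentages; they are not given in the source.
inferred = {
    "extracapsular growth": 11,   # reported 35.5%
    "nodal involvement": 6,       # reported 19.4%
    "solid hypoechoic": 27,       # reported 87%
    "irregular margins": 24,      # reported 77.4%
    "intranodular vascular": 23,  # reported 74.2%
    "microcalcifications": 9,     # reported 29.0%
}

if __name__ == "__main__":
    print("evaluable patients:", evaluable)
    print("solitary nodules:", pct(solitary_cancers, solitary_total), "%")
    print("multinodular goiters:", pct(goiter_cancers, goiter_total), "%")
    for feature, count in inferred.items():
        print(feature, pct(count, cancers), "%")
```

    The evaluable cohort (494 minus 92) equals the sum of the two nodule groups (195 plus 207), which suggests the reported denominators are internally consistent.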

  8. Predicting Molecular Targets for Small-Molecule Drugs with a Ligand-Based Interaction Fingerprint Approach.

    PubMed

    Cao, Ran; Wang, Yanli

    2016-06-20

    The computational prediction of molecular targets for small-molecule drugs remains a great challenge. Herein we describe a ligand-based interaction fingerprint (LIFt) approach for target prediction. Together with physics-based docking and sampling methods, we assessed the performance systematically by modeling the polypharmacology of 12 kinase inhibitors in three stages. First, we examined the capacity of this approach to differentiate true targets from false targets with the promiscuous binder staurosporine, based on native complex structures. Second, we performed large-scale profiling of kinase selectivity on the clinical drug sunitinib by means of computational simulation. Third, we extended the study beyond kinases by modeling the cross-inhibition of bromodomain-containing protein 4 (BRD4) for 10 well-established kinase inhibitors. On this basis, we made prospective predictions by exploring new kinase targets for the anticancer drug candidate TN-16, originally known as a colchicine site binder and microtubule disruptor. As a result, p38α was highlighted from a panel of 187 different kinases. Encouragingly, our prediction was validated by an in vitro kinase assay, which showed TN-16 as a low-micromolar p38α inhibitor. Collectively, our results suggest the promise of the LIFt approach in predicting potential targets for small-molecule drugs. PMID:26222196

  9. COBRA: A Computational Brewing Application for Predicting the Molecular Composition of Organic Aerosols

    SciTech Connect

    Fooshee, David R.; Nguyen, Tran B.; Nizkorodov, Sergey A.; Laskin, Julia; Laskin, Alexander; Baldi, Pierre

    2012-05-08

    Atmospheric organic aerosols (OA) represent a significant fraction of airborne particulate matter and can impact climate, visibility, and human health. These mixtures are difficult to characterize experimentally due to the enormous complexity and dynamic nature of their chemical composition. We introduce a novel Computational Brewing Application (COBRA) and apply it to modeling oligomerization chemistry stemming from condensation and addition reactions of monomers pertinent to secondary organic aerosol (SOA) formed by photooxidation of isoprene. COBRA uses two lists as input: a list of chemical structures comprising the molecular starting pool, and a list of rules defining potential reactions between molecules. Reactions are performed iteratively, with products of all previous iterations serving as reactants for the next one. The simulation generated thousands of molecular structures in the mass range of 120-500 Da, and correctly predicted ~70% of the individual SOA constituents observed by high-resolution mass spectrometry (HR-MS). Selected predicted structures were confirmed with tandem mass spectrometry. Esterification and hemiacetal formation reactions were shown to play the most significant role in oligomer formation, whereas aldol condensation was shown to be insignificant. COBRA is not limited to atmospheric aerosol chemistry, but is broadly applicable to the prediction of reaction products in other complex mixtures for which reasonable reaction mechanisms and seed molecules can be supplied by experimental or theoretical methods.
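
    The iterative scheme described above (a starting pool, a list of reaction rules, and products fed back in as reactants) can be sketched in a few lines. This is a toy illustration, not the COBRA implementation: species are reduced to nominal integer masses instead of chemical structures, the two rules are simplified stand-ins for condensation and addition, and the seed masses, the function name `brew`, and its parameters are all hypothetical.

```python
from itertools import combinations_with_replacement

WATER = 18  # nominal mass of the H2O lost in a condensation step


def condensation(a, b):
    """Toy rule: join two species and lose one water (nominal masses)."""
    return a + b - WATER


def addition(a, b):
    """Toy rule: join two species with no mass loss."""
    return a + b


def brew(seed_masses, rules, iterations, max_mass):
    """Apply every rule to every pair in the pool, repeatedly.

    Products of all previous iterations stay in the pool and act as
    reactants in the next one. Products heavier than max_mass are dropped.
    """
    pool = set(seed_masses)
    for _ in range(iterations):
        new_products = set()
        for a, b in combinations_with_replacement(sorted(pool), 2):
            for rule in rules:
                product = rule(a, b)
                if product <= max_mass:
                    new_products.add(product)
        if new_products <= pool:   # nothing new was formed; stop early
            break
        pool |= new_products
    return pool


if __name__ == "__main__":
    # Hypothetical monomer masses, chosen only for illustration.
    pool = brew({76, 118}, [condensation, addition], iterations=3, max_mass=500)
    print(len(pool), "distinct nominal masses up to 500 Da")
```

    Tracking real structures instead of masses would require a cheminformatics representation and substructure-matching rules, which the abstract mentions but does not specify.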

  10. Prediction of Transport Properties of Liquid Ammonia and Its Binary Mixture with Methanol by Molecular Simulation

    NASA Astrophysics Data System (ADS)

    Guevara-Carrion, Gabriela; Vrabec, Jadran; Hasse, Hans

    2012-03-01

    Transport properties of ammonia and of the binary mixture ammonia + methanol are predicted for a broad range of liquid states by molecular dynamics (MD) simulation on the basis of rigid, non-polarizable molecular models of the united-atom type. These models were parameterized in preceding work using only experimental vapor-liquid equilibrium data. The self- and the Maxwell-Stefan (MS) diffusion coefficients as well as the shear viscosity are obtained by equilibrium MD and the Green-Kubo formalism. Non-equilibrium MD is used for the thermal conductivity. The transport properties of liquid ammonia are predicted for temperatures between 223 K and 473 K up to pressures of 200 MPa and are compared to experimental data and correlations thereof. Generally, good agreement is achieved. The predicted self-diffusion coefficient as well as the shear viscosity deviates on average by less than 15 % from the experiment and the thermal conductivity by less than 6 %. Furthermore, the self- and the MS transport diffusion coefficients as well as the shear viscosity of the liquid mixture ammonia + methanol are studied at different compositions and compared to the available experimental data.
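
    For reference, the Green-Kubo relations used in equilibrium MD take the following standard textbook forms for the self-diffusion coefficient and the shear viscosity. The abstract does not give the exact expressions or integration cutoffs the authors used, so the notation here is an assumption: v_i is the velocity of molecule i, P_xy an off-diagonal element of the pressure tensor, V the volume, T the temperature, and k_B the Boltzmann constant.

```latex
D_i = \frac{1}{3} \int_0^{\infty} \left\langle \mathbf{v}_i(t) \cdot \mathbf{v}_i(0) \right\rangle \, dt ,
\qquad
\eta = \frac{V}{k_B T} \int_0^{\infty} \left\langle P_{xy}(t) \, P_{xy}(0) \right\rangle \, dt
```

    In practice the time integrals are truncated once the correlation functions have decayed into statistical noise.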

  11. Predicting Low Energy Dopant Implant Profiles in Semiconductors using Molecular Dynamics

    SciTech Connect

    Beardmore, K.M.; Gronbech-Jensen, N.

    1999-05-02

    The authors present a highly efficient molecular dynamics scheme for calculating dopant density profiles in group-IV alloy, and III-V zinc blende structure materials. Their scheme incorporates several necessary methods for reducing computational overhead, plus a rare event algorithm to give statistical accuracy over several orders of magnitude change in the dopant concentration. The code uses a molecular dynamics (MD) model to describe ion-target interactions. Atomic interactions are described by a combination of 'many-body' and pair specific screened Coulomb potentials. Accumulative damage is accounted for using a Kinchin-Pease type model, inelastic energy loss is represented by a Firsov expression, and electronic stopping is described by a modified Brandt-Kitagawa model which contains a single adjustable ion-target dependent parameter. Thus, the program is easily extensible beyond a given validation range, and is therefore truly predictive over a wide range of implant energies and angles. The scheme is especially suited for calculating profiles due to low energy and to situations where a predictive capability is required with the minimum of experimental validation. They give examples of using the code to calculate concentration profiles and 2D 'point response' profiles of dopants in crystalline silicon and gallium-arsenide. Here they can predict the experimental profile over five orders of magnitude for <100> and <110> channeling and for non-channeling implants at energies up to hundreds of keV.

  12. An Efficient Molecular Dynamics Scheme for Predicting Dopant Implant Profiles in Semiconductors

    SciTech Connect

    Beardmore, K.M.; Gronbech-Jensen, N.

    1998-09-15

    The authors present a highly efficient molecular dynamics scheme for calculating the concentration profile of dopants implanted in group-IV alloy, and III-V zinc blende structure materials. The program incorporates methods for reducing computational overhead, plus a rare event algorithm to give statistical accuracy over several orders of magnitude change in the dopant concentration. The code uses a molecular dynamics (MD) model, instead of the binary collision approximation (BCA) used in implant simulators such as TRIM and Marlowe, to describe ion-target interactions. Atomic interactions are described by a combination of 'many-body' and screened Coulomb potentials. Inelastic energy loss is accounted for using a Firsov model, and electronic stopping is described by a Brandt-Kitagawa model which contains the single adjustable parameter for the entire scheme. Thus, the program is easily extensible to new ion-target combinations with the minimum of tuning, and is predictive over a wide range of implant energies and angles. The scheme is especially suited for calculating profiles due to low energy, large angle implants, and for situations where a predictive capability is required with the minimum of experimental validation. They give examples of using their code to calculate concentration profiles and 2D 'point response' profiles of dopants in crystalline silicon, silicon-germanium blends, and gallium-arsenide. They can predict the experimental profiles over five orders of magnitude for <100> and <110> channeling and for non-channeling implants at energies up to hundreds of keV.

  13. Clinical, hematological, and molecular features in Sicilians with sickle cell disease.

    PubMed

    Schilirò, G; Samperi, P; Consalvo, C; Gangarossa, S; Testa, R; Miraglia, V; Lo Nigro, L

    1992-01-01

    We report the clinical, hematological, and molecular findings observed in 32 Sicilian patients with sickle cell disease. None of our patients received regular blood transfusions and careful infectious disease prophylaxis was carried out for all. Haplotyping of beta S chromosomes was performed in all patients; all were homozygous for haplotype #19 (Benin). Gene mapping excluded the presence of an alpha-thalassemia in 13 of our patients; none of the relatives showed any evidence of the presence of alpha-thalassemia. Hb F levels were 11.8 +/- 5.9% with G gamma representing 39.6 +/- 3.6% of total gamma chain. Hb F levels were higher in females than in males (12.5 +/- 5.9% versus 9.7 +/- 6.5%) but the difference was not statistically significant. All patients, regardless of age and sex, were anemic with normal mean corpuscular hemoglobin concentration, high mean corpuscular volume and mean corpuscular hemoglobin, and mild reticulocytosis. Analysis of clinical manifestations suggests that our patients have a disease of moderate severity. PMID:1487418

  14. Molecular and genetic features of zinc transporters in physiology and pathogenesis.

    PubMed

    Fukada, Toshiyuki; Kambe, Taiho

    2011-07-01

    Zinc (Zn) is a vital element. It plays indispensable roles in multifarious cellular processes, affecting the expression and activity of a variety of molecules, including transcription factors, enzymes, adapters, channels, growth factors, and their receptors. A disturbance in Zn homeostasis due to Zn deficiency or an excess of Zn absorption can therefore impair the cellular machinery and exert various influences on physiological programs, such as systemic growth, morphogenetic processes, and immune responses, as well as neuro-sensory and endocrine functions. Thus, Zn imbalance becomes pathogenic in humans. Zn homeostasis is controlled by the coordinated actions of Zn transporters, which are responsible for Zn influx and efflux, and intricately regulate the intracellular and extracellular Zn concentration and distribution. In this review, we describe crucial roles of Zn transporters in biological phenomena, focusing in particular on how Zn transporters contribute to cellular events at the molecular, biochemical, and genetic level, with recent progress uncovering the roles of Zn transporters in physiology and pathogenesis. PMID:21566827

  15. Genetic features and molecular epidemiology of Enterococcus faecium isolated in two university hospitals in Brazil.

    PubMed

    da Silva, Leila Priscilla Pinheiro; Pitondo-Silva, André; Martinez, Roberto; da Costa Darini, Ana Lúcia

    2012-11-01

    The global emergence of vancomycin-resistant Enterococcus faecium (VREfm) has been characterized by a clonal spread of strains belonging to clonal complex 17 (CC17). Genetic features and clonal relationships of 53 VREfm isolated from patients in 2 hospitals in Ribeirao Preto, São Paulo, Brazil, during 2005-2010 were determined as a contribution to the Brazilian evolutionary history of these nosocomial pathogens. All isolates were daptomycin susceptible, vancomycin-resistant, and had the vanA gene. The predominant virulence genes were acm and esp. Only 5 VREfm isolated in 2005-2006 had intact Tn1546, while 81% showed Tn1546 with deleted left extremity and insertion of IS1251 between the vanS and vanH genes. Multilocus sequence typing analysis permitted the identification of 9 different sequence types (STs), with 5 being new ones (656, 657, 658, 659, and 660). Predominant STs were ST412 and ST478, all belonging to CC17, except ST658. This is the first report of the ST78 in Brazil. PMID:22959818

  16. Emerinopathy and Laminopathy Clinical, pathological and molecular features of muscular dystrophy with nuclear envelopathy in Japan

    PubMed Central

    Astejada, MN; Goto, K; Nagano, A; Ura, S; Noguchi, S; Nonaka, I; Nishino, I; Hayashi, YK

    2007-01-01

    Summary Mutations in the genes for nuclear envelope proteins of emerin (EMD) and lamin A/C (LMNA) are known to cause Emery-Dreifuss muscular dystrophy (EDMD) and limb girdle muscular dystrophy (LGMD). We compared clinical features of the muscular dystrophy patients associated with mutations in EMD (emerinopathy) and LMNA (laminopathy) in our series. The incidence of laminopathy was slightly higher than that of emerinopathy. The age at onset of the disease in emerinopathy was variable and significantly older than in laminopathy. The initial symptom of emerinopathy was also variable, whereas nearly all laminopathy patients presented initially with muscle weakness. Calf hypertrophy was often seen in laminopathy, underscoring the importance of mutation screening for LMNA in childhood muscular dystrophy with calf hypertrophy. The clinical spectrum of emerinopathy is actually wider than previously known including EDMD, LGMD, conduction defects with minimal muscle/joint involvement, and their intermittent forms. Pathologically, no marked difference was observed between emerinopathy and laminopathy. Increased number and variation in size of myonuclei were detected. More precise observations using electron microscopy are warranted to characterize the detailed nuclear changes in nuclear envelopathy. PMID:18646565

  17. Papulonecrotic tuberculid—clinicopathologic and molecular features of 12 Indian patients

    PubMed Central

    Tirumalae, Rajalakshmi; Yeliur, Inchara K.; Antony, Meryl; George, Geojith; Kenneth, John

    2014-01-01

    Background: Papulonecrotic tuberculid (PNT) is said to be a hypersensitivity reaction to M. tuberculosis. Some reports indicate that organisms are demonstrable by polymerase chain reaction (PCR). Methods: We describe 12 patients with PNT over 6 years. We reviewed the histopathologic features, clinical data and follow-up. PCR for M. tuberculosis DNA was done in all cases. Results: There were 7 men and 5 women. The ages ranged from 3–58 years. Upper limbs were commonly involved (8 cases). All patients had multiple papulonodular lesions, 5 showed ulceration and scarring. Mantoux test was strongly positive in all. Seven patients had systemic tuberculosis. On microscopy, necrosis was seen in 11 cases, varying from minimal to extensive. Epithelioid granulomas were common, except for 1 case with palisading and interstitial patterns. The infiltrate showed mostly lymphocytes, while 3 cases showed eosinophils. Vasculitis was seen in 8 cases. Two cases had dermal mucin, one also with interface dermatitis. This patient had concurrent LE. Mycobacterial DNA was detectable by PCR in 3 cases. Seven patients showed improvement/resolution of lesions on treatment. Conclusions: PNT is a rare disease. A positive PCR reiterates the question whether these are “tuberculids”. PNT may be better classified as true cutaneous tuberculosis and patients screened for systemic disease. PMID:24855568

  18. Multiscale Reactive Molecular Dynamics for Absolute pKa Predictions and Amino Acid Deprotonation.

    PubMed

    Nelson, J Gard; Peng, Yuxing; Silverstein, Daniel W; Swanson, Jessica M J

    2014-07-01

    Accurately calculating a weak acid's pKa from simulations remains a challenging task. We report a multiscale theoretical approach to calculate the free energy profile for acid ionization, resulting in accurate absolute pKa values in addition to insights into the underlying mechanism. Importantly, our approach minimizes empiricism by mapping electronic structure data (QM/MM forces) into a reactive molecular dynamics model capable of extensive sampling. Consequently, the bulk property of interest (the absolute pKa) is the natural consequence of the model, not a parameter used to fit it. This approach is applied to create reactive models of aspartic and glutamic acids. We show that these models predict the correct pKa values and provide ample statistics to probe the molecular mechanism of dissociation. This analysis shows changes in the solvation structure and Zundel-dominated transitions between the protonated acid, contact ion pair, and bulk solvated excess proton. PMID:25061442
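
    The link between a deprotonation free energy and a pKa can be illustrated with the standard thermodynamic relation pKa = ΔG° / (RT ln 10). The sketch below only shows that final conversion; it does not reproduce the authors' free energy profile or any standard-state corrections, and the function name and the example free energy are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pka_from_free_energy(delta_g_kcal, temperature_k=298.15):
    """Convert a standard deprotonation free energy (kcal/mol) to a pKa."""
    return delta_g_kcal / (R_KCAL * temperature_k * math.log(10))


# Hypothetical free energy, chosen only to illustrate the conversion.
example_pka = pka_from_free_energy(5.3)

if __name__ == "__main__":
    print(f"pKa for 5.3 kcal/mol at 298.15 K: {example_pka:.2f}")
```

    At room temperature one pKa unit corresponds to roughly 1.36 kcal/mol, which is why errors of a few tenths of a kcal/mol in the free energy profile already matter.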

  19. Resistance to sunitinib in renal cell carcinoma: From molecular mechanisms to predictive markers and future perspectives.

    PubMed

    Joosten, S C; Hamming, L; Soetekouw, P M; Aarts, M J; Veeck, J; van Engeland, M; Tjan-Heijnen, V C

    2015-01-01

    The introduction of agents that inhibit tumor angiogenesis by targeting vascular endothelial growth factor (VEGF) signaling has made a significant impact on the survival of patients with metastasized renal cell carcinoma (RCC). Sunitinib, a tyrosine kinase inhibitor of the VEGF receptor, has become the mainstay of treatment for these patients. Although treatment with sunitinib substantially improved patient outcome, the initial success is overshadowed by the occurrence of resistance. The mechanisms of resistance are poorly understood. Insight into the molecular mechanisms of resistance will help to better understand the biology of RCC and can ultimately aid the development of more effective therapies for patients with this infaust disease. In this review we comprehensively discuss molecular mechanisms of resistance to sunitinib and the involved biological processes, summarize potential biomarkers that predict response and resistance to treatment with sunitinib, and elaborate on future perspectives in the treatment of metastasized RCC. PMID:25446042

  20. The prediction of novel multiple lipid-binding regions in protein translocation motor proteins: a possible general feature.

    PubMed

    Keller, Rob C A

    2011-03-01

    Protein translocation is an important cellular process. SecA is an essential protein component in the Sec system, as it contains the molecular motor that facilitates protein translocation. In this study, a bioinformatics approach was applied in the search for possible lipid-binding helix regions in protein translocation motor proteins. Novel lipid-binding regions in Escherichia coli SecA were identified. Remarkably, multiple lipid-binding sites were also identified in other motor proteins such as BiP, which is involved in ER protein translocation. The prokaryotic signal recognition particle receptor FtsY, though not a motor protein, is in many ways related to SecA, and was therefore included in this study. The results demonstrate a possible general feature for motor proteins involved in protein translocation. PMID:20957445

  1. Squamousness: Next-generation sequencing reveals shared molecular features across squamous tumor types

    PubMed Central

    Schwaederle, Maria; Elkin, Sheryl K; Tomson, Brett N; Carter, Jennifer Levin; Kurzrock, Razelle

    2015-01-01

    In order to gain a better understanding of the underlying biology of squamous cell carcinoma (SCC), we tested the hypothesis that SCC originating from different organs may possess common molecular alterations. SCC samples (N = 361) were examined using clinical-grade targeted next-generation sequencing (NGS). The most frequent SCC tumor types were head and neck, lung, cutaneous, gastrointestinal and gynecologic cancers. The most common gene alterations were TP53 (64.5% of patients), PIK3CA (28.5%), CDKN2A (24.4%), SOX2 (17.7%), and CCND1 (15.8%). By comparing NGS results of our SCC cohort to a non-SCC cohort (N = 277), we found that CDKN2A, SOX2, NOTCH1, TP53, PIK3CA, CCND1, and FBXW7 were significantly more frequently altered, unlike KRAS, which was less frequently altered in SCC specimens (all P < 0.05; multivariable analysis). Therefore, we identified “squamousness” gene signatures (TP53, PIK3CA, CCND1, CDKN2A, SOX2, NOTCH1, and FBXW7 aberrations, and absence of KRAS alterations) that were significantly more frequent in SCC versus non-SCC histologies. A multivariable co-alteration analysis established 2 SCC subgroups: (i) patients in whom TP53 and cyclin pathway (CDKN2A and CCND1) alterations strongly correlated but in whom PIK3CA aberrations were less frequent; and (ii) patients with PIK3CA alterations in whom TP53 mutations were less frequent (all P ≤ 0.001, multivariable analysis). In conclusion, we identified a set of 8 genes altered with significantly different frequencies when SCC and non-SCC were compared, suggesting the existence of patterns for “squamousness.” Targeting the PI3K-AKT-mTOR and/or cyclin pathway components in SCC may be warranted. PMID:26030731

  2. Cellular and Molecular Features of Developmentally Programmed Genome Rearrangement in a Vertebrate (Sea Lamprey: Petromyzon marinus)

    PubMed Central

    Timoshevskiy, Vladimir A.; Herdy, Joseph R.; Keinath, Melissa C.; Smith, Jeramiah J.

    2016-01-01

    The sea lamprey (Petromyzon marinus) represents one of the few vertebrate species known to undergo large-scale programmatic elimination of genomic DNA over the course of its normal development. Programmed genome rearrangements (PGRs) result in the reproducible loss of ~20% of the genome from somatic cell lineages during early embryogenesis. Studies of PGR hold the potential to provide novel insights related to the maintenance of genome stability during the cell cycle and coordination between mechanisms responsible for the accurate distribution of chromosomes into daughter cells, yet little is known regarding the mechanistic basis or cellular context of PGR in this or any other vertebrate lineage. Here we identify epigenetic silencing events that are associated with the programmed elimination of DNA and describe the spatiotemporal dynamics of PGR during lamprey embryogenesis. In situ analyses reveal that the earliest DNA methylation (and to some extent H3K9 trimethylation) events are limited to specific extranuclear structures (micronuclei) containing eliminated DNA. During early embryogenesis a majority of micronuclei (~60%) show strong enrichment for repressive chromatin modifications (H3K9me3 and 5meC). These analyses also led to the discovery that eliminated DNA is packaged into chromatin that does not migrate with somatically retained chromosomes during anaphase, a condition that is superficially similar to lagging chromosomes observed in some cancer subtypes. Closer examination of “lagging” chromatin revealed distributions of repetitive elements, cytoskeletal contacts and chromatin contacts that provide new insights into the cellular mechanisms underlying the programmed loss of these segments. Our analyses provide additional perspective on the cellular and molecular context of PGR, identify new structures associated with elimination of DNA and reveal that PGR is completed over the course of several successive cell divisions. PMID:27341395

  3. Molecular Features of an Alcohol Binding Site in a Neuronal Potassium Channel

    PubMed Central

    Shahidullah, Mohammad; Harris, Thanawath; Germann, Markus W.; Covarrubias, Manuel

    2008-01-01

    Aliphatic alcohols (1-alkanols) selectively inhibit the neuronal Shaw2 K+ channel at an internal binding site. This inhibition is conferred by a sequence of 13 residues that constitutes the S4–S5 loop in the pore-forming subunit. Here, we combined functional and structural approaches to gain insights into the molecular basis of this interaction. To infer the forces that are involved, we employed a fast concentration-clamp method (10–90% exchange time = 800 μs) to examine the kinetics of the interaction of three members of the homologous series of 1-alkanols (ethanol, 1-butanol, and 1-hexanol) with Shaw2 K+ channels in Xenopus oocyte inside-out patches. As expected for a second-order mechanism involving a receptor site, only the observed association rate constants were linearly dependent on the 1-alkanol concentration. While the alkyl chain length modestly influenced the dissociation rate constants (decreasing only ∼2-fold between ethanol and 1-hexanol), the second-order association rate constants increased e-fold per carbon atom. Thus, hydrophobic interactions govern the probability of productive collisions at the 1-alkanol binding site, and short-range polar interactions help to stabilize the complex. We also examined the relationship between the energetics of 1-alkanol binding and the structural properties of the S4–S5 loop. Circular dichroism spectroscopy applied to peptides corresponding to the S4–S5 loop of various K+ channels revealed a correlation between the apparent binding affinity of the 1-alkanol binding site and the α-helical propensity of the S4–S5 loop. The data suggest that amphiphilic interactions at the Shaw2 1-alkanol binding site depend on specific structural constraints in the pore-forming subunit of the channel. PMID:14503874
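
    The kinetic statements above can be written out for a simple bimolecular binding scheme. This is a sketch under assumptions not stated in the abstract: that the e-fold increase per carbon holds uniformly across the series, and that n counts the carbon atoms in the alkyl chain (2 for ethanol, 6 for 1-hexanol). The symbols [A], k_on, k_off and the time constants τ are notation introduced here.

```latex
\frac{1}{\tau_{\mathrm{on}}} = k_{\mathrm{on}}\,[A],
\qquad
\frac{1}{\tau_{\mathrm{off}}} = k_{\mathrm{off}},
\qquad
\frac{k_{\mathrm{on}}(n+1)}{k_{\mathrm{on}}(n)} \approx e
\;\Longrightarrow\;
\frac{k_{\mathrm{on}}(n=6)}{k_{\mathrm{on}}(n=2)} \approx e^{4} \approx 54.6
```

    Under these assumptions the association rate constant changes by more than 50-fold between ethanol and 1-hexanol, whereas the dissociation rate constant changes only about 2-fold, consistent with the conclusion that hydrophobic interactions dominate the association step.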

  4. The Complete Chloroplast Genome of the Hare’s Ear Root, Bupleurum falcatum: Its Molecular Features

    PubMed Central

    Shin, Dong-Ho; Lee, Jeong-Hoon; Kang, Sang-Ho; Ahn, Byung-Ohg; Kim, Chang-Kug

    2016-01-01

    Bupleurum falcatum, which belongs to the family Apiaceae, has long been applied for curative treatments, especially as a liver tonic, in herbal medicine. The chloroplast (cp) genome has been an ideal model to perform the evolutionary and comparative studies because of its highly conserved features and simple structure. The Apiaceae family is taxonomically close to the Araliaceae family and there have been numerous complete chloroplast genome sequences reported in the Araliaceae family, while little is known about the Apiaceae family. In this study, the complete sequence of the B. falcatum chloroplast genome was obtained. The full-length of the cp genome is 155,989 nucleotides with a 37.66% overall guanine-cytosine (GC) content and shows a quadripartite structure composed of three nomenclatural regions: a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeat (IR) regions. The genome occupancy is 85,912-bp, 17,517-bp, and 26,280-bp for LSC, SSC, and IR, respectively. B. falcatum was shown to contain 111 unique genes (78 for protein-coding, 29 for tRNAs, and four for rRNAs, respectively) on its chloroplast genome. Genic comparison found that B. falcatum has no pseudogenes and has two gene losses, accD in the LSC and ycf15 in the IRs. A total of 55 unique tandem repeat sequences were detected in the B. falcatum cp genome. This report is the first to describe the complete chloroplast genome sequence in B. falcatum and will open up further avenues of research to understand the evolutionary panorama and the chloroplast genome conformation in related plant species. PMID:27187480
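
    The region sizes and gene counts quoted above can be checked against the reported totals, remembering that the quadripartite structure contains two copies of the inverted repeat. The short sketch below does that arithmetic and adds a generic GC-content helper; the helper and its name are illustrative and are not taken from the paper's methods.

```python
# Region lengths (bp) and totals as reported in the abstract.
LSC, SSC, IR = 85_912, 17_517, 26_280
REPORTED_TOTAL = 155_989

# The inverted repeat occurs twice in a quadripartite chloroplast genome.
assembled_total = LSC + SSC + 2 * IR

genes = {"protein_coding": 78, "tRNA": 29, "rRNA": 4}
unique_genes = sum(genes.values())


def gc_content(sequence):
    """Percentage of G and C bases in a nucleotide sequence."""
    sequence = sequence.upper()
    gc = sum(sequence.count(base) for base in "GC")
    return 100.0 * gc / len(sequence)


if __name__ == "__main__":
    print("sum of regions:", assembled_total, "reported:", REPORTED_TOTAL)
    print("unique genes:", unique_genes)
    print("GC content of a toy sequence:", gc_content("ATGCGC"))
```

    Both checks agree with the abstract: the regions sum to 155,989 bp and the gene categories sum to 111.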

  5. The Complete Chloroplast Genome of the Hare's Ear Root, Bupleurum falcatum: Its Molecular Features.

    PubMed

    Shin, Dong-Ho; Lee, Jeong-Hoon; Kang, Sang-Ho; Ahn, Byung-Ohg; Kim, Chang-Kug

    2016-01-01

    Bupleurum falcatum, which belongs to the family Apiaceae, has long been applied for curative treatments, especially as a liver tonic, in herbal medicine. The chloroplast (cp) genome has been an ideal model to perform the evolutionary and comparative studies because of its highly conserved features and simple structure. The Apiaceae family is taxonomically close to the Araliaceae family and there have been numerous complete chloroplast genome sequences reported in the Araliaceae family, while little is known about the Apiaceae family. In this study, the complete sequence of the B. falcatum chloroplast genome was obtained. The full-length of the cp genome is 155,989 nucleotides with a 37.66% overall guanine-cytosine (GC) content and shows a quadripartite structure composed of three nomenclatural regions: a large single-copy (LSC) region, a small single-copy (SSC) region, and a pair of inverted repeat (IR) regions. The genome occupancy is 85,912-bp, 17,517-bp, and 26,280-bp for LSC, SSC, and IR, respectively. B. falcatum was shown to contain 111 unique genes (78 for protein-coding, 29 for tRNAs, and four for rRNAs, respectively) on its chloroplast genome. Genic comparison found that B. falcatum has no pseudogenes and has two gene losses, accD in the LSC and ycf15 in the IRs. A total of 55 unique tandem repeat sequences were detected in the B. falcatum cp genome. This report is the first to describe the complete chloroplast genome sequence in B. falcatum and will open up further avenues of research to understand the evolutionary panorama and the chloroplast genome conformation in related plant species. PMID:27187480

  6. Molecular Epidemiology and Clinical Features of Human T Cell Lymphotropic Virus Type 1 Infection in Spain

    PubMed Central

    Alcantara, Luiz Carlos; Benito, Rafael; Caballero, Estrella; Aguilera, Antonio; Ramos, José Manuel; de Mendoza, Carmen; Rodríguez, Carmen; García, Juan; Rodríguez-Iglesias, Manuel; Ortiz de Lejarazu, Raúl; Roc, Lourdes; Parra, Patricia; Eiros, José; del Romero, Jorge; Soriano, Vincent

    2014-01-01

    Abstract Human T cell lymphotropic virus type 1 (HTLV-1) infection in Spain is rare and mainly affects immigrants from endemic regions and native Spaniards with a prior history of sexual intercourse with persons from endemic countries. Herein, we report the main clinical and virological features of cases reported in Spain. All individuals with HTLV-1 infection recorded at the national registry since 1989 were examined. Phylogenetic analysis was performed based on the long terminal repeat (LTR) region. A total of 229 HTLV-1 cases had been reported up to December 2012. The mean age was 41 years old and 61% were female. Their country of origin was Latin America in 59%, Africa in 15%, and Spain in 20%. Transmission had occurred following sexual contact in 41%, parenteral exposure in 12%, and vertically in 9%. HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) was diagnosed in 27 cases and adult T cell leukemia/lymphoma (ATLL) in 17 subjects. HTLV-1 subtype could be obtained for 45 patients; all but one belonged to the Cosmopolitan subtype a. One Nigerian pregnant woman harbored HTLV-1 subtype b. Within the Cosmopolitan subtype a, two individuals (from Bolivia and Peru, respectively) belonged to the Japanese subgroup B, another two (from Senegal and Mauritania) to the North African subgroup D, and 39 to the Transcontinental subgroup A. Of note, one divergent HTLV-1 strain from an Ethiopian branched off from all five known Cosmopolitan subtype 1a subgroups. Divergent HTLV-1 strains have been introduced and currently circulate in Spain. The relatively large proportion of symptomatic cases (19%) suggests that HTLV-1 infection is underdiagnosed in Spain. PMID:24924996

  7. Analytic Methods for Predicting Significant Multi-Quanta Effects in Collisional Molecular Energy Transfer

    NASA Technical Reports Server (NTRS)

    Bieniek, Ronald J.

    1996-01-01

    Collision-induced transitions can significantly affect molecular vibrational-rotational populations and energy transfer in atmospheres and gaseous systems. This, in turn, can strongly influence convective heat transfer through dissociation and recombination of diatomics, and radiative heat transfer due to strong vibrational coupling. It is necessary to know state-to-state rates to predict engine performance and aerothermodynamic behavior of hypersonic flows, to analyze diagnostic radiative data obtained from experimental test facilities, and to design heat shields and other thermal protective systems. Furthermore, transfer rates between vibrational and translational modes can strongly influence energy flow in various 'disturbed' environments, particularly where the vibrational and translational temperatures are not equilibrated.

  8. Predictions of Quantum Molecular Dynamical Model between incident energy 50 and 1000 MeV/Nucleon

    NASA Astrophysics Data System (ADS)

    Kumar, Sanjeev

    2015-01-01

    In the present work, the Quantum Molecular Dynamical (QMD) model is summarized as a useful tool for the incident energy range of 50 to 1000 MeV/nucleon in heavy-ion collisions. The model has reproduced the experimental results of various collaborations such as ALADIN, INDRA, PLASTIC BALL and FOPI up to a high level of accuracy for phenomena such as multifragmentation, collective flow, and elliptical flow in the above energy range. Further efforts are directed at predicting the symmetry energy over this wide incident energy range.

  9. Prediction of gasoline octane numbers from near-infrared spectral features in the range 660-1215 nm

    SciTech Connect

    Kelly, J.J.; Barlow, C.H.; Jinguji, T.M.; Callis, J.B.

    1989-02-15

    The feasibility of predicting the quality parameters of gasoline from its absorption spectrum in the wavelength range 660-1215 nm was investigated. In this spectral region, vibrational overtones and combination bands of CH groups of methyl, methylene, aromatic, and olefinic functions were observed. With the aid of multivariate statistics, the spectral features could be correlated to various quality parameters of gasoline such as octane number. As an example, multivariate analysis of the spectra of 43 unleaded gasoline samples yielded a three-wavelength prediction equation for pump octane that gave excellent correlations (R² = 0.95; standard error of estimate, 0.3-0.4 octane number; standard error of prediction, 0.4-0.5 octane number) with the ASTM motor determined octane numbers. Independent multivariate analysis using partial least-squares (PLS) regression yielded similar results. An additional set of nine samples from the Pacific Coast Exchange Group of the ASTM were examined for ten different quality parameters (research and motor octane numbers, Reid vapor pressure, API gravity, bromine number, lead, sulfur, aromatic, olefinic, and saturate contents). Regression analysis of the spectra results in correlation of nine of the ten properties with R² values ranging from 0.94 to 0.99 and standard errors near the independent reference test values.
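
    A three-wavelength prediction equation of the kind described above is an ordinary multiple linear regression of octane number on absorbance at three wavelengths. The sketch below is only an illustration on synthetic data: the absorbance values, coefficients, and noise level are invented, and it does not reproduce the authors' wavelengths or results.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT from the paper): 43 samples, absorbance at
# three hypothetical wavelengths, and an octane number built from made-up
# coefficients plus noise.
n_samples = 43
absorbance = rng.uniform(0.1, 1.0, size=(n_samples, 3))
true_coefs = np.array([4.0, -2.5, 6.0])
true_intercept = 85.0
octane = true_intercept + absorbance @ true_coefs + rng.normal(0.0, 0.3, n_samples)

# Least-squares fit of: octane = b0 + b1*A1 + b2*A2 + b3*A3
X = np.column_stack([np.ones(n_samples), absorbance])
coefs, *_ = np.linalg.lstsq(X, octane, rcond=None)

predicted = X @ coefs
ss_res = np.sum((octane - predicted) ** 2)
ss_tot = np.sum((octane - octane.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Standard error of estimate: residual scatter corrected for fitted parameters.
see = np.sqrt(ss_res / (n_samples - X.shape[1]))

print("coefficients:", np.round(coefs, 2))
print(f"R^2 = {r_squared:.3f}, standard error of estimate = {see:.2f}")
```

    The standard error of prediction quoted in the abstract would additionally need samples held out from the fit, which this sketch does not include.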

  10. Model predictions of features in microsaccade-related neural responses in a feedforward network with short-term synaptic depression

    PubMed Central

    Zhou, Jian-Fang; Yuan, Wu-Jie; Zhou, Zhao; Zhou, Changsong

    2016-01-01

    Recently, significant microsaccade-induced neural responses have been extensively observed in experiments. To explore the underlying mechanisms of the observed neural responses, a feedforward network model with short-term synaptic depression has been proposed [Yuan, W.-J., Dimigen, O., Sommer, W. and Zhou, C. Front. Comput. Neurosci. 7, 47 (2013)]. The depression model not only gave an explanation for microsaccades in counteracting visual fading, but also successfully reproduced several microsaccade-related features in experimental findings. These results strongly suggest that the depression model is very useful to investigate microsaccade-related neural responses. In this paper, by using the model, we extensively study and predict the dependence of microsaccade-related neural responses on several key parameters, which could be tuned in experiments. Particularly, we provide a significant prediction that microsaccade-related neural response also complies with the property “sharper is better” observed in many contexts in neuroscience. Importantly, the property exhibits a power-law relationship between the width of input signal and the responsive effectiveness, which is robust against many parameters in the model. By using mean field theory, we analytically investigate the robust power-law property. Our predictions would give theoretical guidance for further experimental investigations of the functional role of microsaccades in visual information processing. PMID:26853547

  11. Model predictions of features in microsaccade-related neural responses in a feedforward network with short-term synaptic depression

    NASA Astrophysics Data System (ADS)

    Zhou, Jian-Fang; Yuan, Wu-Jie; Zhou, Zhao; Zhou, Changsong

    2016-02-01

    Recently, significant microsaccade-induced neural responses have been extensively observed in experiments. To explore the underlying mechanisms of the observed neural responses, a feedforward network model with short-term synaptic depression has been proposed [Yuan, W.-J., Dimigen, O., Sommer, W. and Zhou, C. Front. Comput. Neurosci. 7, 47 (2013)]. The depression model not only gave an explanation for microsaccades in counteracting visual fading, but also successfully reproduced several microsaccade-related features in experimental findings. These results strongly suggest that the depression model is very useful to investigate microsaccade-related neural responses. In this paper, by using the model, we extensively study and predict the dependence of microsaccade-related neural responses on several key parameters, which could be tuned in experiments. Particularly, we provide a significant prediction that microsaccade-related neural response also complies with the property “sharper is better” observed in many contexts in neuroscience. Importantly, the property exhibits a power-law relationship between the width of input signal and the responsive effectiveness, which is robust against many parameters in the model. By using mean field theory, we analytically investigate the robust power-law property. Our predictions would give theoretical guidance for further experimental investigations of the functional role of microsaccades in visual information processing.

  12. Initiating a regenerative response; cellular and molecular features of wound healing in the cnidarian Nematostella vectensis

    PubMed Central

    2014-01-01

    Background Wound healing is the first stage of a series of cellular events that are necessary to initiate a regenerative response. Defective wound healing can block regeneration even in animals with a high regenerative capacity. Understanding how signals generated during wound healing promote regeneration of lost structures is highly important, considering that virtually all animals have the ability to heal but many lack the ability to regenerate missing structures. Cnidarians are the phylogenetic sister taxa to bilaterians and are highly regenerative animals. To gain a greater understanding of how early animals generate a regenerative response, we examined the cellular and molecular components involved during wound healing in the anthozoan cnidarian Nematostella vectensis. Results Pharmacological inhibition of extracellular signal-regulated kinases (ERK) signaling blocks regeneration and wound healing in Nematostella. We characterized early and late wound healing events through genome-wide microarray analysis, quantitative PCR, and in situ hybridization to identify potential wound healing targets. We identified a number of genes directly related to the wound healing response in other animals (metalloproteinases, growth factors, transcription factors) and suggest that glycoproteins (mucins and uromodulin) play a key role in early wound healing events. This study also identified a novel cnidarian-specific gene, for a thiamine biosynthesis enzyme (vitamin B synthesis), that may have been incorporated into the genome by lateral gene transfer from bacteria and now functions during wound healing. Lastly, we suggest that ERK signaling is a shared element of the early wound response for animals with a high regenerative capacity. Conclusions This research describes the temporal events involved during Nematostella wound healing, and provides a foundation for comparative analysis with other regenerative and non-regenerative species. We have shown that the same genes that

  13. Prediction of physical properties of water under extremely supercritical conditions: a molecular dynamics study.

    PubMed

    Sakuma, Hiroshi; Ichiki, Masahiro; Kawamura, Katsuyuki; Fuji-ta, Kiyoshi

    2013-04-01

    The physical properties of water under a wide range of pressure and temperature conditions are important in fundamental physics, chemistry, and geoscience. Molecular simulations are useful for predicting and understanding the physical properties of water at phases extremely different from ambient conditions. In this study, we developed a new five-site flexible induced point charge model to predict the density, static dielectric constant, and transport properties of water in the extremely supercritical phase at high temperatures and pressures of up to 2000 K and 2000 MPa. The model satisfactorily reproduced the density, radial distribution function, static dielectric constant, reorientation time, and self-diffusion coefficients of water above the critical points. We also developed a database of the static dielectric constant, which is useful for discussing the electrical conductivity of aqueous fluids in the earth's crust and mantle. PMID:23574243

  14. Predicting hydration free energies of amphetamine-type stimulants with a customized molecular model

    NASA Astrophysics Data System (ADS)

    Li, Jipeng; Fu, Jia; Huang, Xing; Lu, Diannan; Wu, Jianzhong

    2016-09-01

    Amphetamine-type stimulants (ATS) are a group of incitation and psychedelic drugs affecting the central nervous system. Physicochemical data for these compounds are essential for understanding the stimulating mechanism, for assessing their environmental impacts, and for developing new drug detection methods. However, experimental data are scarce due to tight regulation of such illicit drugs, yet conventional methods to estimate their properties are often unreliable. Here we introduce a tailor-made multiscale procedure for predicting the hydration free energies and the solvation structures of ATS molecules by a combination of first principles calculations and the classical density functional theory. We demonstrate that the multiscale procedure performs well for a training set with similar molecular characteristics and yields good agreement with a testing set not used in the training. The theoretical predictions serve as a benchmark for the missing experimental data and, importantly, provide microscopic insights into manipulating the hydrophobicity of ATS compounds by chemical modifications.

  15. Predicting hydration free energies of amphetamine-type stimulants with a customized molecular model.

    PubMed

    Li, Jipeng; Fu, Jia; Huang, Xing; Lu, Diannan; Wu, Jianzhong

    2016-09-01

    Amphetamine-type stimulants (ATS) are a group of incitation and psychedelic drugs affecting the central nervous system. Physicochemical data for these compounds are essential for understanding the stimulating mechanism, for assessing their environmental impacts, and for developing new drug detection methods. However, experimental data are scarce due to tight regulation of such illicit drugs, yet conventional methods to estimate their properties are often unreliable. Here we introduce a tailor-made multiscale procedure for predicting the hydration free energies and the solvation structures of ATS molecules by a combination of first principles calculations and the classical density functional theory. We demonstrate that the multiscale procedure performs well for a training set with similar molecular characteristics and yields good agreement with a testing set not used in the training. The theoretical predictions serve as a benchmark for the missing experimental data and, importantly, provide microscopic insights into manipulating the hydrophobicity of ATS compounds by chemical modifications. PMID:27367616

  16. Structural and molecular features of the endomyometrium in endometriosis and adenomyosis.

    PubMed

    Benagiano, Giuseppe; Brosens, Ivo; Habiba, Marwan

    2014-01-01

    BACKGROUND Adenomyosis and endometriosis were initially described as 'adenomyoma'. When the retrograde menstruation theory became widely accepted to explain the pathogenesis of endometriosis, since it does not explain adenomyosis, the two conditions came to be seen as distinct entities. However, emerging evidence suggests that both diseases may be linked to changes in the inner portion of the myometrium. In addition, similar anomalies were found in the eutopic endometrium of the two conditions and the debate has re-opened. A common origin for both adenomyosis and endometriosis would have relevance not only for understanding uterine function and pathophysiology, but also for clinical management and treatment. METHODS The Scopus and Medline databases were searched for all original articles published in English up to the end of 2012. Search terms included 'adenomyosis'; 'endometriosis'; 'endometrium'; 'eutopic endometrium'; 'inner myometrium'; 'junctional zone'. Special attention was paid to articles comparing features of eutopic endometrium in the two conditions. RESULTS A number of similarities exist between adenomyosis and endometriosis and, by using magnetic resonance and laparoscopy, it was found that, at least in some subgroups, the two conditions often coexist. In both situations the inner myometrium (or junctional zone) is altered, although alterations are much more marked in adenomyosis where a thickness >12 mm is today considered sufficient for diagnosis. Research has shown differences between the eutopic endometrium of women with both diseases when compared with controls. There is an immune dysfunction and there are alterations of adhesion molecules, cell proliferation and apoptosis. An increase in cytokines and inflammatory mediators has also been observed. Finally, the presence of oxidative stress and anomalies in free-radical metabolism may alter uterine receptivity. When the two conditions were compared, dissimilarities were also observed in the extent

  17. Pharmacophore modeling and molecular dynamics simulation to identify the critical chemical features against human sirtuin 2 inhibitors

    NASA Astrophysics Data System (ADS)

    Sakkiah, Sugunadevi; Baek, Ayoung; Lee, Keun Woo

    2012-03-01

    Sirtuin 2 (SIRT2) is one of the emerging targets in the chemotherapy field and is associated with many diseases such as cancer and Parkinson's. Hence, a quantitative hypothesis was developed using Discovery Studio v2.5. The top ten resultant hypotheses were generated; among them, Hypo1 was selected as the best hypothesis based on statistical parameters such as high cost difference (52), lowest RMS (0.71), and good correlation coefficient (0.96). Hypo1 has been validated using well-known methodologies such as Fischer's randomization method (95% confidence level) and a test set, which showed a correlation coefficient of 0.93 as well as a goodness of hit (0.65) and enrichment factor (8.80). All the above statistical validations confirm that the chemical features in Hypo1 (1 hydrogen bond acceptor, 1 hydrophobic, and 2 ring aromatic features) were able to inhibit the function of SIRT2. Hence, Hypo1 was used as a query in virtual screening to find novel scaffolds by screening various chemical databases. The screened molecules from the databases were checked for ADMET as well as drug-like properties. Due to the lack of a SIRT2-ligand complex structure in the PDB, molecular docking and molecular dynamics (MD) simulation were carried out to find the suitable orientation of the ligand in the active site. The representative structure from MD simulations was used as a receptor to dock the molecules which passed the drug-like property filters from the virtual screening. Finally, 29 compounds were selected as potent candidate leads based on their interactions with the active site residues of SIRT2. Thus, the resultant pharmacophore can be used to discover and design SIRT2 inhibitors with desired biological activity.

  18. Molecular mobility in the cytoplasm: An approach to describe and predict lifespan of dry germplasm

    PubMed Central

    Buitink, Julia; Leprince, Olivier; Hemminga, Marcus A.; Hoekstra, Folkert A.

    2000-01-01

    Molecular mobility is increasingly considered a key factor influencing storage stability of biomolecular substances, because it is thought to control the rate of detrimental reactions responsible for reducing the shelf life of, for instance, pharmaceuticals, food, and germplasm. We investigated the relationship between aging rates of germplasm and the rotational motion of a polar spin probe in the cytoplasm under different storage conditions using saturation transfer electron paramagnetic resonance spectroscopy. Rotational motion of the spin probe in the cytoplasm of seed and pollen of various plant species changed as a function of moisture content and temperature in a manner similar to aging rates or longevity. A linear relationship was established between the logarithms of rotational motion and aging rates or longevity. This linearity suggests that detrimental aging rates are associated with molecular mobility in the cytoplasm. By measuring the rotational correlation times at low temperatures at which experimental determination of longevity is practically impossible, this linearity enabled us to predict vigor loss or longevity. At subzero temperatures, moisture contents for maximum life span were predicted to be higher than those hitherto used in genebanks, urging for a reexamination of seed storage protocols. PMID:10681458
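
    The prediction step described above amounts to fitting a straight line in log-log space and extrapolating it to conditions where longevity cannot be measured directly. A minimal sketch follows; the rotational correlation times and longevities are hypothetical numbers chosen only to illustrate the procedure, not data from the study.

```python
import numpy as np

# Hypothetical paired measurements (NOT from the paper): rotational
# correlation time of the spin probe (s) and observed longevity (days).
tau = np.array([1e-6, 3e-6, 1e-5, 3e-5, 1e-4])
longevity_days = np.array([12.0, 30.0, 95.0, 260.0, 800.0])

# Straight-line fit between the logarithms of the two quantities.
slope, intercept = np.polyfit(np.log10(tau), np.log10(longevity_days), 1)

def predict_longevity(tau_value):
    """Extrapolate longevity (days) from a rotational correlation time (s)."""
    return 10 ** (intercept + slope * np.log10(tau_value))

# Extrapolate to a slower motion (longer correlation time) than any measured.
predicted = predict_longevity(1e-3)
print(f"slope = {slope:.2f}, predicted longevity = {predicted:.0f} days")
```

    Extrapolating beyond the measured range assumes the log-log linearity continues to hold there, which is the assumption the abstract relies on for low temperatures.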

  19. Prediction of chromatographic relative retention time of polychlorinated biphenyls from the molecular electronegativity distance vector.

    PubMed

    Liu, Shu-Shen; Liu, Yan; Yin, Da-Qian; Wang, Xiao-Dong; Wang, Lian-Sheng

    2006-02-01

    Using the molecular electronegativity distance vector (MEDV) descriptors derived directly from the molecular topological structures, the gas chromatographic relative retention times (RRTs) of 209 polychlorinated biphenyls (PCBs) on the SE-54 stationary phase were predicted. A five-variable regression equation with a correlation coefficient of 0.9964 and a root mean square error of 0.0152 was developed. The descriptors included in the equation represent degree of chlorination (nCl), nonortho index (Ino), and interactions between three pairs of atom types, i.e., atom groups -C= and -C=, -C= and >C=, -C= and -Cl. It has been shown that the retention times of all 209 PCB congeners can be accurately predicted as long as there are more than 50 calibration compounds. In the same way, the MEDV descriptors were also used to develop five- or six-variable models of the RRTs of PCBs on 18 other stationary phases; the correlation coefficients in both the modeling stage and the LOO cross-validation step are not lower than 0.99 for all but two models. PMID:16524106
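
    Leave-one-out (LOO) cross-validation of a five-variable linear model, as mentioned above, can be sketched as follows. The descriptors and retention times are random synthetic values, not MEDV descriptors, and the helper names (`fit_linear`, `predict_linear`, `loo_q2`) are illustrative only.

```python
import numpy as np

def fit_linear(X, y):
    """Least-squares fit of y = b0 + X @ b; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coefs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coefs

def predict_linear(coefs, X):
    return coefs[0] + X @ coefs[1:]

def loo_q2(X, y):
    """Cross-validated Q^2: refit with each sample left out, then predict it."""
    n = len(y)
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        keep = np.arange(n) != i
        coefs = fit_linear(X[keep], y[keep])
        press += (y[i] - predict_linear(coefs, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in for 60 calibration compounds with five descriptors each.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = 0.5 + X @ np.array([0.30, -0.10, 0.20, 0.05, 0.15]) + rng.normal(0, 0.02, 60)

q2 = loo_q2(X, y)
print(f"LOO Q^2 = {q2:.4f}")
```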

  20. [Molecular characteristics and prediction of the reactive properties of the N-chlorotaurine analogs].

    PubMed

    Roshchupkin, D I; Kondrashova, K V; Murina, M A

    2014-01-01

    A number of molecular characteristics for the N-chlorotaurine structural analogs, amino acid chloramines and related compounds have been computed by the ab initio method B3LYP/6-31G. In particular, the characteristics were the Mulliken atomic charges for the chloramine part and its adjacent atoms. A quantitative measure of the capabilities of the chloramines to react with the methionine sulfide group or sulfhydryl group of reduced glutathione was their reaction rate constants. The constants available in the literature and determined in our own experiments have been described with an exponential equation of multiple correlation. In the case of a reaction with methionine, a high determination coefficient (R²) was obtained with five independent variables. They were the charges of active chlorine, nitrogen, and the carbon bonded with nitrogen, the bond length between the nitrogen and carbon atoms, and also molecular mass. The equation has been used to predict the rate constant values for the reaction between compounds that contain active chlorine and methionine. The prediction has shown that structural analogs of N-chlorotaurine bearing two methyl groups at the beta-carbon of taurine are remarkable for the low value of the discussed rate constant. PMID:25715608

  1. UManSysProp: an online facility for molecular property prediction and atmospheric aerosol calculations

    NASA Astrophysics Data System (ADS)

    Topping, D.; Barley, M. H.; Bane, M.; Higham, N.; Aumont, B.; McFiggans, G.

    2015-11-01

    In this paper we describe the development and application of a new web based facility, UManSysProp (http://umansysprop.seaes.manchester.ac.uk), for automating predictions of molecular and atmospheric aerosol properties. Current facilities include: pure component vapour pressures, critical properties and sub-cooled densities of organic molecules; activity coefficient predictions for mixed inorganic-organic liquid systems; hygroscopic growth factors and CCN activation potential of mixed inorganic/organic aerosol particles; absorptive partitioning calculations with/without a treatment of non-ideality. The aim of this new facility is to provide a single point of reference for all properties relevant to atmospheric aerosol that have been checked for applicability to atmospheric compounds where possible. The group contribution approach allows users to upload molecular information in the form of SMILES strings and UManSysProp will automatically extract the relevant information for calculations. Built using open source chemical informatics, and hosted at the University of Manchester, the facilities are provided via a browser and device-friendly web-interface, or can be accessed using the user's own code via a JSON API. In this paper we demonstrate its use with specific examples that can be simulated using the web-browser interface.

  2. How important is thermal expansion for predicting molecular crystal structures and thermochemistry at finite temperatures?

    PubMed

    Heit, Yonaton N; Beran, Gregory J O

    2016-08-01

    Molecular crystals expand appreciably upon heating due to both zero-point and thermal vibrational motion, yet this expansion is often neglected in molecular crystal modeling studies. Here, a quasi-harmonic approximation is coupled with fragment-based hybrid many-body interaction calculations to predict thermal expansion and finite-temperature thermochemical properties in crystalline carbon dioxide, ice Ih, acetic acid and imidazole. Fragment-based second-order Møller-Plesset perturbation theory (MP2) and coupled cluster theory with singles, doubles and perturbative triples [CCSD(T)] predict the thermal expansion and the temperature dependence of the enthalpies, entropies and Gibbs free energies of sublimation in good agreement with experiment. The errors introduced by neglecting thermal expansion in the enthalpy and entropy cancel somewhat in the Gibbs free energy. The resulting ~1-2 kJ mol⁻¹ errors in the free energy near room temperature are comparable to or smaller than the errors expected from the electronic structure treatment, but they may be sufficiently large to affect free-energy rankings among energetically close polymorphs. PMID:27484373
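
    The partial cancellation mentioned above follows from the standard definition of the Gibbs free energy. Writing δ for the error introduced into each quantity by neglecting thermal expansion (notation introduced here, not taken from the paper):

```latex
\Delta G_{\mathrm{sub}}(T) = \Delta H_{\mathrm{sub}}(T) - T\,\Delta S_{\mathrm{sub}}(T)
\quad\Longrightarrow\quad
\delta(\Delta G_{\mathrm{sub}}) = \delta(\Delta H_{\mathrm{sub}}) - T\,\delta(\Delta S_{\mathrm{sub}})
```

    When the enthalpy and entropy errors have the same sign, the two terms on the right offset each other, so the free-energy error is smaller in magnitude than the larger of the two contributions.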

  3. Molecular Features Underlying Neurodegeneration Identified through In Vitro Modeling of Genetically Diverse Parkinson's Disease Patients.

    PubMed

    Lin, Lin; Göke, Jonathan; Cukuroglu, Engin; Dranias, Mark R; VanDongen, Antonius M J; Stanton, Lawrence W

    2016-06-14

    The fact that Parkinson's disease (PD) can arise from numerous genetic mutations suggests a unifying molecular pathology underlying the various genetic backgrounds. To address this hypothesis, we took an integrated approach utilizing in vitro disease modeling and comprehensive transcriptome profiling to advance our understanding of PD progression and the concordant downstream signaling pathways across divergent genetic predispositions. To model PD in vitro, we generated neurons harboring disease-causing mutations from patient-specific, induced pluripotent stem cells (iPSCs). We observed signs of degeneration in midbrain dopaminergic neurons, reflecting the cardinal feature of PD. Gene expression signatures of PD neurons provided molecular insights into disease phenotypes observed in vitro, including oxidative stress vulnerability and altered neuronal activity. Notably, PD neurons show that elevated RBFOX1, a gene previously linked to neurodevelopmental diseases, underlies a pattern of alternative RNA-processing associated with PD-specific phenotypes. PMID:27264186

  4. Exploring QSTR modeling and toxicophore mapping for identification of important molecular features contributing to the chemical toxicity in Escherichia coli.

    PubMed

    Pramanik, Subrata; Roy, Kunal

    2014-03-01

    Biodiversity deprivation can affect functions and services of the ecosystem. Changes in biodiversity alter ecosystem processes and change the resilience of ecosystems to ecological changes. Bacterial communities are the main form of biomass in the ecosystem and one of the largest populations on the planet. Bacterial communities provide important services to biodiversity. They break down pollutants, municipal waste and ingested food, and they are the primary route for recycling of organic matter to plants and other autotrophs, conversion of inorganic matter into new biological tissue using sunlight, and management of the energy crisis through the use of biofuel. In the present study, computational chemistry and statistical modeling have been used to develop mathematical equations which can be applied to calculate toxicity of new/unknown chemicals/biofuels/metabolites in Escherichia coli. 2D and 3D descriptors were generated from the molecular structure of compounds and mathematical models have been developed using the genetic function approximation followed by multiple linear regression (GFA-MLR) method. Model validity was checked through defined internal (R² = 0.751 and Q² = 0.711) and external (R²pred = 0.773) statistical parameters. Molecular features responsible for toxicity were also assessed through a 3D toxicophore study. The toxicophore-based model was validated (R = 0.785) using qualitative statistical metrics and a randomization test (Fischer validation). PMID:24246193

  5. NACE: A web-based tool for prediction of intercompartmental efficiency of human molecular genetic networks.

    PubMed

    Popik, Olga V; Ivanisenko, Timofey V; Saik, Olga V; Petrovskiy, Evgeny D; Lavrik, Inna N; Ivanisenko, Vladimir A

    2016-06-15

    Molecular genetic processes generally involve proteins from distinct intracellular localisations. Reactions that follow the same process are distributed among various compartments within the cell. In this regard, the reaction rate and the efficiency of biological processes can depend on the subcellular localisation of proteins. Previously, the authors proposed a method of evaluating the efficiency of biological processes based on the analysis of the distribution of protein subcellular localisation (Popik et al., 2014). Here, NACE is presented, which is an open access web-oriented program that implements this method and allows the user to evaluate the intercompartmental efficiency of human molecular genetic networks. The method has been extended by a new feature that provides the evaluation of the tissue-specific efficiency of networks for more than 2800 anatomical structures. Such assessments are important in cases when molecular genetic pathways in different tissues proceed with the participation of various proteins with a number of intracellular localisations. For example, an analysis of KEGG pathways, conducted using the developed program, showed that the efficiencies of many KEGG pathways are tissue-specific. Analysis of efficiencies of regulatory pathways in the liver, linking proteins of the hepatitis C virus with human proteins involved in the KEGG apoptosis pathway, showed that intercompartmental efficiency might play an important role in host-pathogen interactions. Thus, the developed tool can be useful in the study of the effectiveness of functioning of various molecular genetic networks, including metabolic, regulatory, host-pathogen interactions and others taking into account tissue-specific gene expression. The tool is available via the following link: http://www-bionet.sscc.ru/nace/. PMID:27109913

  6. Molecular subclasses of hepatocellular carcinoma predict sensitivity to fibroblast growth factor receptor inhibition.

    PubMed

    Schmidt, Benjamin; Wei, Lan; DePeralta, Danielle K; Hoshida, Yujin; Tan, Poh Seng; Sun, Xiaochen; Sventek, Janelle P; Lanuti, Michael; Tanabe, Kenneth K; Fuchs, Bryan C

    2016-03-15

    A recent gene expression classification of hepatocellular carcinoma (HCC) includes a poor survival subclass termed S2 representing about one-third of all HCC in clinical series. S2 cells express E-cadherin and c-myc and secrete AFP. As the expression of fibroblast growth factor receptors (FGFRs) differs between S2 and non-S2 HCC, this study investigated whether molecular subclasses of HCC predict sensitivity to FGFR inhibition. S2 cell lines were significantly more sensitive (p < 0.001) to the FGFR inhibitors BGJ398 and AZD4547. BGJ398 decreased MAPK signaling in S2 but not in non-S2 cell lines. All cell lines expressed FGFR1 and FGFR2, but only S2 cell lines expressed FGFR3 and FGFR4. FGFR4 siRNA decreased proliferation by 44% or more in all five S2 cell lines (p < 0.05 for each cell line), a significantly greater decrease than seen with knockdown of FGFR1-3 with siRNA transfection. FGFR4 knockdown decreased MAPK signaling in S2 cell lines, but little effect was seen with knockdown of FGFR1-3. In conclusion, the S2 molecular subclass of HCC is sensitive to FGFR inhibition. FGFR4-MAPK signaling plays an important role in driving proliferation of a molecular subclass of HCC. This classification system may help to identify those patients who are most likely to benefit from inhibition of this pathway. PMID:26481559

  7. Assessing value of innovative molecular diagnostic tests in the concept of predictive, preventive, and personalized medicine.

    PubMed

    Akhmetov, Ildar; Bubnov, Rostyslav V

    2015-01-01

    Molecular diagnostic tests drive the scientific and technological uplift in the field of predictive, preventive, and personalized medicine offering invaluable clinical and socioeconomic benefits to the key stakeholders. Although the results of diagnostic tests are immensely influential, molecular diagnostic tests (MDx) are still grudgingly reimbursed by payers and account for less than 5% of the overall healthcare costs. This paper aims at defining the value of molecular diagnostic tests and outlining the most important components of "value" from miscellaneous assessment frameworks, which go beyond accuracy and feasibility and impact the clinical adoption, informing healthcare resource allocation decisions. The authors suggest that the industry should facilitate discussions with various stakeholders throughout the entire assessment process in order to arrive at a consensus about the depth of evidence required for positive marketing authorization or reimbursement decisions. In light of the evolving "value-based healthcare" delivery practices, it is also recommended to account for social and ethical parameters of value, since these are anticipated to become as critical for reimbursement decisions and test acceptance as economic and clinical criteria. PMID:26425215

  8. Clinical features of Clostridium difficile infection and molecular characterization of the isolated strains in a cohort of Danish hospitalized patients.

    PubMed

    Søes, L M; Brock, I; Persson, S; Simonsen, J; Pribil Olsen, K E; Kemp, M

    2012-02-01

    The purpose of this study was to compare clinical features of Clostridium difficile infection (CDI) to toxin gene profiles of the strains isolated from Danish hospitalized patients. C. difficile isolates were characterized by PCR-based molecular typing methods including toxin gene profiling and analysis of deletions and truncating mutations in the toxin-regulating gene tcdC. Clinical features were obtained by questionnaire. Thirty percent of the CDI cases were classified as community-acquired. Infection by C. difficile with genes encoding toxin A, toxin B and the binary toxin was significantly associated with hospital-acquired/healthcare-associated CDI compared to community-acquired CDI. Significantly higher leukocyte counts and more severe clinical manifestations were observed in patients infected by C. difficile containing genes encoding the binary toxin together with toxin A and B than in patients infected by C. difficile harbouring only toxin A and B. In conclusion, infection by C. difficile harbouring genes encoding toxin A, toxin B and the binary toxin was associated with hospital acquisition, higher leukocyte counts and severe clinical disease. PMID:21744281

  9. Structure prediction and molecular simulation of gases diffusion pathways in hydrogenase.

    PubMed

    Sundaram, Shanthy; Tripathi, Ashutosh; Gupta, Vipul

    2010-01-01

    Although hydrogen is considered one of the most promising future energy sources and the technical aspects of using it have advanced considerably, the future supply of hydrogen from renewable sources remains unsolved. The [Fe]-hydrogenase enzymes are highly efficient H(2) catalysts found in ecologically and phylogenetically diverse microorganisms, including the photosynthetic green alga Chlamydomonas reinhardtii. While these enzymes can occur in several forms, H(2) catalysis takes place at a unique [FeS] prosthetic group, or H-cluster, located at the active site. The 3D structure of the hydA1 hydrogenase from Chlamydomonas reinhardtii was predicted using the MODELLER 8v2 software. The conserved region was identified with the NCBI CDD Search. Templates were selected on the basis of NCBI BLAST results: 1FEH was used for single-template modeling, and 1FEH and 1HFE were used for multiple-template modeling. The homology models were verified by uploading them to the SAVS server. On the basis of the SAVS results, the 3D structure predicted using the single template was chosen for molecular simulation. Three simulation strategies were used. First, the protein was simulated in a solvated box containing bulk water. Then 100 H(2) molecules were randomly inserted in the solvated box and two simulations of 50 and 100 ps were performed. Similarly, 100 O(2) molecules were randomly placed in the solvated box and simulations of 50 and 100 ps were again performed. Energy minimization was performed before each simulation, and conformations were saved after each simulation. Gas diffusion was analyzed on the basis of RMSD, radius of gyration, and number of gas molecules/ps plots. PMID:21364783
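    The two structural measures named in this abstract, RMSD and radius of gyration, can be illustrated with a minimal Python sketch. It is not the authors' analysis pipeline: the function names are hypothetical, coordinates are assumed to be plain (x, y, z) tuples in matching atom order, and no structural superposition is performed before the RMSD is taken, which a real trajectory analysis would normally do first.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two conformations.

    Assumes both lists hold (x, y, z) tuples for the same atoms in the
    same order and that the structures are already superimposed.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("conformations must be non-empty and equal in length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))

def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration about the centre of mass."""
    if masses is None:
        masses = [1.0] * len(coords)  # assumption: equal masses if none given
    total = sum(masses)
    centre = [
        sum(m * p[i] for m, p in zip(masses, coords)) / total
        for i in range(3)
    ]
    squared = sum(
        m * sum((p[i] - centre[i]) ** 2 for i in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(squared / total)
```

    Calling `rmsd` on each saved conformation against the starting structure gives the kind of per-frame series that an RMSD plot is built from.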

  10. Molecular Markers Predict Distant Metastases After Adjuvant Chemoradiation for Rectal Cancer

    SciTech Connect

    Kim, Jun Won; Kim, Yong Bae; Choi, Jun Jeong; Koom, Woong Sub; Kim, Hoguen; Kim, Nam-Kyu; Ahn, Joong Bae; Lee, Ikjae; Cho, Jae Ho; Keum, Ki Chang

    2012-12-01

    Purpose: The outcomes of adjuvant chemoradiation for locally advanced rectal cancer are nonuniform among patients with matching prognostic factors. We explored the role of molecular markers for predicting the outcome of adjuvant chemoradiation for rectal cancer patients. Methods and Materials: The study included 68 patients with stages II to III rectal adenocarcinoma who were treated with total mesorectal excision and adjuvant chemoradiation. Chemotherapy based on 5-fluorouracil and leucovorin was intravenously administered each month for 6-12 cycles. Radiation therapy consisted of 54 Gy delivered in 30 fractions. Immunostaining of surgical specimens for COX-2, EGFR, VEGF, thymidylate synthase (TS), and Raf kinase inhibitor protein (RKIP) was performed. Results: The median follow-up was 65 months. Eight locoregional (11.8%) and 13 distant (19.1%) recurrences occurred. Five-year locoregional failure-free survival (LRFFS), distant metastasis-free survival (DMFS), disease-free survival (DFS), and overall survival (OS) rates for all patients were 83.9%, 78.7%, 66.7%, and 73.8%, respectively. LRFFS was not correlated with TNM stage, surgical margin, or any of the molecular markers. VEGF overexpression was significantly correlated with decreased DMFS (P=.045), while RKIP-positive results were correlated with increased DMFS (P=.025). In multivariate analyses, positive findings for COX-2 (COX-2+) and VEGF (VEGF+) and negative findings for RKIP (RKIP-) were independent prognostic factors for DMFS, DFS, and OS (P=.035, .014, and .007 for DMFS; .021, .010, and <.0001 for DFS; and .004, .012, and .001 for OS). The combination of both COX-2+ and VEGF+ (COX-2+/VEGF+) showed a strong correlation with decreased DFS (P=.007), and the combinations of RKIP+/COX-2- and RKIP+/VEGF- showed strong correlations with improved DFS compared with the rest of the patients (P=.001 and <.0001, respectively). Conclusions: Molecular markers can be valuable in predicting treatment outcome of adjuvant

  11. Outcome Prediction for Patients with Traumatic Brain Injury with Dynamic Features from Intracranial Pressure and Arterial Blood Pressure Signals: A Gaussian Process Approach.

    PubMed

    Pimentel, Marco A F; Brennan, Thomas; Lehman, Li-Wei; King, Nicolas Kon Kam; Ang, Beng-Ti; Feng, Mengling

    2016-01-01

    Previous work has demonstrated that tracking features that describe the dynamic and time-varying patterns in brain monitoring signals provides additional predictive information beyond that derived from static features based on snapshot measurements. To achieve more accurate predictions of outcomes of patients with traumatic brain injury (TBI), we proposed a statistical framework to extract dynamic features from brain monitoring signals based on Gaussian processes (GPs). GPs provide an explicit probabilistic, nonparametric Bayesian approach to metric regression problems. This not only provides probabilistic predictions, but also gives the ability to cope with missing data and infer model parameters such as those that control the function's shape, noise level and dynamics of the signal. Through experimental evaluation, we have demonstrated that dynamic features extracted from GPs provide predictive information in addition to the features based on the pressure reactivity index (PRx). Significant improvements in patient outcome prediction were achieved by combining GP-based and PRx-based dynamic features. In particular, compared with a baseline PRx-based model, the combined model achieved over 30% improvement in prediction accuracy and sensitivity and over 20% improvement in specificity and the area under the receiver operating characteristic curve. PMID:27165883

  12. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features.

    PubMed

    Zhou, Yuan; Zeng, Pan; Li, Yan-Hui; Zhang, Ziding; Cui, Qinghua

    2016-06-01

    N6-methyladenosine (m6A) is a prevalent RNA methylation modification involved in the regulation of degradation, subcellular localization, splicing and local conformation changes of RNA transcripts. High-throughput experiments have demonstrated that only a small fraction of the m6A consensus motifs in mammalian transcriptomes are modified. Therefore, accurate identification of RNA m6A sites has become urgently important. To this end, a computational predictor of mammalian m6A sites, named SRAMP, is established here. To depict the sequence context around m6A sites, SRAMP combines three random forest classifiers that exploit the positional nucleotide sequence pattern, the K-nearest neighbor information and the position-independent nucleotide pair spectrum features, respectively. SRAMP uses either genomic sequences or cDNA sequences as its input. With either kind of input sequence, SRAMP achieves competitive performance in both cross-validation tests and rigorous independent benchmarking tests. Analyses of the informative features and overrepresented rules extracted from the random forest classifiers demonstrate that nucleotide usage preferences at the distal positions, in addition to those at the proximal positions, contribute to the classification. As a public prediction server, SRAMP is freely available at http://www.cuilab.cn/sramp/. PMID:26896799
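    As an illustration of what a position-independent nucleotide pair spectrum might look like, the Python sketch below counts how often each ordered nucleotide pair occurs at a given spacing within a sequence window. The abstract does not define the encoding, so the details here (the `pair_spectrum` name, the `max_gap` parameter, frequency normalisation, and mapping T to U) are assumptions rather than SRAMP's actual implementation.

```python
from itertools import product

ALPHABET = "ACGU"

def pair_spectrum(seq, max_gap=3):
    """Frequencies of ordered nucleotide pairs separated by 0..max_gap bases.

    Returns a dict keyed by (gap, pair), e.g. (1, "GA") for G-x-A.
    Pairs containing characters outside ACGU are ignored.
    """
    seq = seq.upper().replace("T", "U")  # accept genomic/cDNA input as well
    features = {}
    for gap in range(max_gap + 1):
        counts = {a + b: 0 for a, b in product(ALPHABET, repeat=2)}
        total = 0
        for i in range(len(seq) - gap - 1):
            pair = seq[i] + seq[i + gap + 1]
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair, count in counts.items():
            features[(gap, pair)] = count / total if total else 0.0
    return features
```

    Each window yields 16 features per gap value, so `max_gap=3` produces a fixed-length vector of 64 values that a classifier such as a random forest could consume.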

  13. SRAMP: prediction of mammalian N6-methyladenosine (m6A) sites based on sequence-derived features

    PubMed Central

    Zhou, Yuan; Zeng, Pan; Li, Yan-Hui; Zhang, Ziding; Cui, Qinghua

    2016-01-01

    N6-methyladenosine (m6A) is a prevalent RNA methylation modification involved in the regulation of degradation, subcellular localization, splicing and local conformation changes of RNA transcripts. High-throughput experiments have demonstrated that only a small fraction of the m6A consensus motifs in mammalian transcriptomes are modified. Therefore, accurate identification of RNA m6A sites has become urgently important. To this end, a computational predictor of mammalian m6A sites, named SRAMP, is established here. To depict the sequence context around m6A sites, SRAMP combines three random forest classifiers that exploit the positional nucleotide sequence pattern, the K-nearest neighbor information and the position-independent nucleotide pair spectrum features, respectively. SRAMP uses either genomic sequences or cDNA sequences as its input. With either kind of input sequence, SRAMP achieves competitive performance in both cross-validation tests and rigorous independent benchmarking tests. Analyses of the informative features and overrepresented rules extracted from the random forest classifiers demonstrate that nucleotide usage preferences at the distal positions, in addition to those at the proximal positions, contribute to the classification. As a public prediction server, SRAMP is freely available at http://www.cuilab.cn/sramp/. PMID:26896799

  14. In Vitro Drug Sensitivity Tests to Predict Molecular Target Drug Responses in Surgically Resected Lung Cancer

    PubMed Central

    Miyazaki, Ryohei; Anayama, Takashi; Hirohashi, Kentaro; Okada, Hironobu; Kume, Motohiko; Orihashi, Kazumasa

    2016-01-01

    Background: Epidermal growth factor receptor-tyrosine kinase inhibitors (EGFR-TKIs) and anaplastic lymphoma kinase (ALK) inhibitors have dramatically changed the strategy of medical treatment of lung cancer. Patients should be screened for the presence of the EGFR mutation or echinoderm microtubule-associated protein-like 4 (EML4)-ALK fusion gene prior to chemotherapy to predict their clinical response. The succinate dehydrogenase inhibition (SDI) test and collagen gel droplet embedded culture drug sensitivity test (CD-DST) are established in vitro drug sensitivity tests, which may predict the sensitivity of patients to cytotoxic anticancer drugs. We applied in vitro drug sensitivity tests for comprehensive prediction of clinical responses to different molecular targeting drugs. Methods: The growth inhibitory effects of erlotinib and crizotinib were confirmed for lung cancer cell lines using SDI and CD-DST. The sensitivity of 35 cases of surgically resected lung cancer to erlotinib was examined using SDI or CD-DST, and compared with EGFR mutation status. Results: HCC827 (Exon19: E746-A750 del) and H3122 (EML4-ALK) cells were inhibited by lower concentrations of erlotinib and crizotinib, respectively, than A549, H460, and H1975 (L858R+T790M) cells were. In the SDI test, the viability of the surgically resected lung cancer was 60.0 ± 9.8% in EGFR mutants vs. 86.8 ± 13.9% in wild types (p = 0.0003). In CD-DST, the cell viability was 33.5 ± 21.2% in EGFR mutants vs. 79.0 ± 18.6% in wild-type cases (p = 0.026). Conclusions: In vitro drug sensitivity evaluated by either SDI or CD-DST correlated with EGFR gene status. Therefore, SDI and CD-DST may be useful for comprehensively predicting potential clinical responses to molecular anticancer drugs. PMID:27070423

  15. Determinants of molecular reactivity as criteria for predicting toxicity: problems and approaches.

    PubMed Central

    Weinstein, H; Rabinowitz, J; Liebman, M N; Osman, R

    1985-01-01

    We discuss the physicochemical basis for mechanisms of action of toxic chemicals and theoretical methods that can be used to understand the relation to the structure of these chemicals. Molecular properties that determine the chemical reactivity of the compounds are proposed as parameters in the analysis of such structure-activity relationships and as criteria for predicting potential toxicity. The theoretical approaches include quantitative methods for structural superposition of molecules and for superposition of their reactivity characteristics. Applications to polychlorinated hydrocarbons are used to illustrate both rigid superposition methods, and methods that take advantage of structural flexibility. These approaches and their results are discussed and compared with methods that afford quantitative structural comparisons without direct superposition, with special emphasis on the need for efficient automated methods suitable for rapid scans of large structural data bases. Quantum mechanical methods for the calculation of molecular properties that can serve as reactivity criteria are presented and illustrated. Special attention is given to the electrostatic properties of the molecules such as the molecular electrostatic potential, the electric fields, and the polarizability terms calculated from perturbation expansions. The practical considerations related to the rapid calculation of these properties on relevant molecular surfaces (e.g., solvent- or reagent-accessible surfaces) are discussed and exemplified, stressing the special problems posed by the structural variety of toxic substances and the paucity of information on their mechanisms of action. The discussion leads to a rationale for the use of the combination of theoretical methods to reveal discriminant criteria for toxicity and to analyze the initial steps in the metabolic processes that could yield toxic products. PMID:3905371

  16. Prediction of protein structure classes using hybrid space of multi-profile Bayes and bi-gram probability feature spaces.

    PubMed

    Hayat, Maqsood; Tahir, Muhammad; Khan, Sher Afzal

    2014-04-01

    Proteins are the executants of biological functions in living organisms. Comprehension of protein structure is a challenging problem in the era of proteomics, computational biology, and bioinformatics because of its pivotal role in protein folding patterns. Owing to the large exploration of protein sequences in protein databanks and intricacy of protein structures, experimental and theoretical methods are insufficient for prediction of protein structure classes. Therefore, it is highly desirable to develop an accurate, reliable, and high throughput computational model to predict protein structure classes correctly from polygenetic sequences. In this regard, we propose a promising model employing hybrid descriptor space in conjunction with optimized evidence-theoretic K-nearest neighbor algorithm. Hybrid space is the composition of two descriptor spaces including Multi-profile Bayes and bi-gram probability. In order to enhance the generalization power of the classifier, we have selected high discriminative descriptors from the hybrid space using particle swarm optimization, a well-known evolutionary feature selection technique. Performance evaluation of the proposed model is performed using the jackknife test on three low similarity benchmark datasets including 25PDB, 1189, and 640. The success rates of the proposed model are 87.0%, 86.6%, and 88.4%, respectively on the three benchmark datasets. The comparative analysis exhibits that our proposed model has yielded promising results compared to the existing methods in the literature. In addition, our proposed prediction system might be helpful in future research particularly in cases where the major focus of research is on low similarity datasets. PMID:24384128
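    The jackknife (leave-one-out) test mentioned above can be sketched in a few lines of Python. This is only an illustration of the evaluation protocol: it uses a plain majority-vote K-nearest neighbor classifier as a stand-in, not the optimized evidence-theoretic variant the authors describe, and the function names and the (feature vector, label) sample layout are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training samples closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def jackknife_accuracy(samples, k=3):
    """Hold out each sample in turn, train on the rest, and score the hit rate."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(samples)
```

    Because every sample is held out exactly once, the result is deterministic for a given dataset, which is one reason this test is commonly reported on small benchmark sets.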

  17. COBRA: a computational brewing application for predicting the molecular composition of organic aerosols.

    PubMed

    Fooshee, David R; Nguyen, Tran B; Nizkorodov, Sergey A; Laskin, Julia; Laskin, Alexander; Baldi, Pierre

    2012-06-01

    Atmospheric organic aerosols (OA) represent a significant fraction of airborne particulate matter and can impact climate, visibility, and human health. These mixtures are difficult to characterize experimentally due to their complex and dynamic chemical composition. We introduce a novel Computational Brewing Application (COBRA) and apply it to modeling oligomerization chemistry stemming from condensation and addition reactions in OA formed by photooxidation of isoprene. COBRA uses two lists as input: a list of chemical structures comprising the molecular starting pool and a list of rules defining potential reactions between molecules. Reactions are performed iteratively, with products of all previous iterations serving as reactants for the next. The simulation generated thousands of structures in the mass range of 120-500 Da and correctly predicted ∼70% of the individual OA constituents observed by high-resolution mass spectrometry. Select predicted structures were confirmed with tandem mass spectrometry. Esterification was shown to play the most significant role in oligomer formation, with hemiacetal formation less important, and aldol condensation insignificant. COBRA is not limited to atmospheric aerosol chemistry; it should be applicable to the prediction of reaction products in other complex mixtures for which reasonable reaction mechanisms and seed molecules can be supplied by experimental or theoretical methods. PMID:22568707
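    The iterative scheme described here (a starting pool, a list of reaction rules, and products fed back in as reactants) can be sketched as a toy Python loop. Everything concrete in it is hypothetical: molecules are represented as sorted tuples of monomer labels, the monomer masses are made-up nominal values, and the single `condensation` rule simply joins two molecules with loss of one water. COBRA itself operates on real chemical structures and reaction rules.

```python
MONOMER_MASS = {"A": 90, "B": 120}  # hypothetical nominal masses in Da
WATER = 18                          # nominal mass lost per condensation

def mass(molecule):
    """Nominal mass of an oligomer built from condensed monomers."""
    return sum(MONOMER_MASS[m] for m in molecule) - WATER * (len(molecule) - 1)

def condensation(a, b, max_mass=500):
    """Toy rule: join two molecules, or return None if the product is too heavy."""
    product = tuple(sorted(a + b))
    return product if mass(product) <= max_mass else None

def brew(seed_pool, rules, iterations):
    """Apply every rule to every ordered pair, feeding products back into the pool."""
    pool = set(seed_pool)
    for _ in range(iterations):
        new_products = set()
        for a in pool:
            for b in pool:
                for rule in rules:
                    product = rule(a, b)
                    if product is not None and product not in pool:
                        new_products.add(product)
        if not new_products:  # nothing new can be formed, so stop early
            break
        pool |= new_products
    return pool
```

    For example, `brew([("A",), ("B",)], [condensation], iterations=1)` returns the two seeds plus the three possible dimers; further iterations grow the pool until the mass cap stops new products from forming.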

  18. COBRA: A Computational Brewing Application for Predicting the Molecular Composition of Organic Aerosols

    PubMed Central

    Fooshee, David R.; Nguyen, Tran B.; Nizkorodov, Sergey A.; Laskin, Julia; Laskin, Alexander; Baldi, Pierre

    2012-01-01

    Atmospheric organic aerosols (OA) represent a significant fraction of airborne particulate matter and can impact climate, visibility, and human health. These mixtures are difficult to characterize experimentally due to their complex and dynamic chemical composition. We introduce a novel Computational Brewing Application (COBRA) and apply it to modeling oligomerization chemistry stemming from condensation and addition reactions in OA formed by photooxidation of isoprene. COBRA uses two lists as input: a list of chemical structures comprising the molecular starting pool, and a list of rules defining potential reactions between molecules. Reactions are performed iteratively, with products of all previous iterations serving as reactants for the next. The simulation generated thousands of structures in the mass range of 120–500 Da, and correctly predicted ~70% of the individual OA constituents observed by high-resolution mass spectrometry. Select predicted structures were confirmed with tandem mass spectrometry. Esterification was shown to play the most significant role in oligomer formation, with hemiacetal formation less important, and aldol condensation insignificant. COBRA is not limited to atmospheric aerosol chemistry; it should be applicable to the prediction of reaction products in other complex mixtures for which reasonable reaction mechanisms and seed molecules can be supplied by experimental or theoretical methods. PMID:22568707

  19. Quantification in Gas Chromatography: Prediction of Flame Ionization Detector Response Factors from Combustion Enthalpies and Molecular Structures.

    PubMed

    de Saint Laumer, Jean-Yves; Cicchetti, Esmeralda; Merle, Philippe; Egger, Jonathan; Chaintreau, Alain

    2010-08-01

    In a previous report, we validated the use of a database that compiled the relative response factors of flavor and fragrance compounds under standard GC conditions for a flame ionization detector. Here we investigate the prediction of unknown response factors from the molecular structure by using combustion enthalpies. In a first step, this enthalpy was well-predicted with either ab initio calculation or multiple linear regression based on the molecular formula. In a second step, good correlation was observed between these combustion enthalpies and experimental relative response factors, and so the response factors were predictable from only the molecular formula. With a database of 351 compounds, about 60% of them exhibited a difference of less than 5% between the predicted and experimental relative response factors and about 80% exhibited a difference of less than 10%. PMID:20698579

  20. Quantification in gas chromatography: prediction of flame ionization detector response factors from combustion enthalpies and molecular structures.

    PubMed

    de Saint Laumer, Jean-Yves; Cicchetti, Esmeralda; Merle, Philippe; Egger, Jonathan; Chaintreau, Alain

    2010-08-01

    In a previous report, we validated the use of a database that compiled the relative response factors of flavor and fragrance compounds under standard GC conditions for a flame ionization detector. Here we investigate the prediction of unknown response factors from the molecular structure by using combustion enthalpies. In a first step, this enthalpy was well-predicted with either ab initio calculation or multiple linear regression based on the molecular formula. In a second step, good correlation was observed between these combustion enthalpies and experimental relative response factors, and so the response factors were predictable from only the molecular formula. With a database of 351 compounds, about 60% of them exhibited a difference of less than 5% between the predicted and experimental relative response factors and about 80% exhibited a difference of less than 10%. PMID:20700911

  1. Numerical simulations of mechanical properties of innovative pothole patching materials featuring high toughness, low viscosity nano-molecular resins

    NASA Astrophysics Data System (ADS)

    Yuan, K. Y.; Yuan, W.; Ju, J. W.; Yang, J. M.; Kao, W.; Carlson, L.

    2012-04-01

    As asphalt pavements age and deteriorate, recurring pothole repair failures and propagating alligator cracks have become a serious everyday problem and result in high repair costs for pavements and vehicles. To solve this urgent issue, pothole repair materials with superior durability and long service life are needed. In the present work, new pothole patching materials with high toughness and high fatigue resistance, reinforced with nano-molecular resins, have been developed to enhance the resistance to traffic loads and the service life of repaired potholes. In particular, DCPD resin (dicyclopentadiene, C10H12) with a ruthenium-based catalyst is employed to develop controlled properties that are compatible with aggregates and asphalt binders. In this paper, a multi-level numerical micromechanics-based model is developed to predict the mechanical properties of these innovative nanomolecular resin reinforced pothole patching materials. Coarse aggregates in the finite element analysis are modeled as irregular shapes through image processing techniques and randomly-dispersed coated particles. The overall properties of asphalt mastic, which consists of fine aggregates, asphalt binder, cured DCPD and air voids, are theoretically estimated by the homogenization technique of micromechanics. Numerical predictions are compared with suitably designed experimental laboratory results.

  2. miR-126-3p and miR-451a correlate with clinicopathological features of lung adenocarcinoma: The underlying molecular mechanisms.

    PubMed

    Chen, Qingyong; Hu, Huizhen; Jiao, Demin; Yan, Jie; Xu, Wei; Tang, Xiali; Chen, Jun; Wang, Jian

    2016-08-01

    Lung cancer is the most common malignancy worldwide. This study aimed to identify miRNA biomarkers of lung adenocarcinoma and to investigate their molecular mechanisms. miRNA expression profiles of tumor tissues and adjacent normal tissues from 10 patients were measured using microarrays. Differentially expressed miRNAs (DEMs) were identified and verified using quantitative reverse transcription-PCR. Thereafter, correlations between DEM expression and clinicopathological features were determined in 49 patients. Furthermore, Targetscan was utilized to predict target genes, among which transcription factors (TFs) were identified. The interactions among miRNAs, TFs and target genes were used to construct an miRNA-TF-target network. In total, 11 DEMs were identified, among which two downregulated miRNAs (miR-126-3p and miR-451a) were validated. Low levels of miR-126-3p and miR-451a were associated with poor pathological stage, large tumor diameter and lymph node metastasis (P<0.05). Receiver operating characteristic analysis showed that both miRNAs could predict pathological stage, tumor diameter and lymph node metastasis of lung adenocarcinoma (AUC >0.65, P<0.05). For miR-126-3p, 154 target genes were predicted (e.g., PLXNB2), which were enriched in 29 pathways mainly concerning apoptosis and cancer. For miR-451a, 397 target genes were predicted, which were enriched in 5 pathways including 'PPAR signaling pathway'. Ten genes were co-regulated by miR-126-3p and miR-451a, e.g., TSC1. Furthermore, an miRNA-TF-target network was constructed, and a sub-network was identified, including 2 miRNAs, 15 targets, and 7 TFs. In conclusion, miR-126-3p and miR-451a predicted the severity of lung adenocarcinoma. However, the possible mechanisms explored by bioinformatics need to be further validated. PMID:27277197
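    The AUC reported from the receiver operating characteristic analysis can be computed directly as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The Python sketch below uses that pairwise definition; the function name and input layout are assumptions, and because low miRNA levels mark the adverse group in this study, a real analysis would need to orient the score accordingly (for example by negating expression values).

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    `labels` are truthy for positive cases; ties between a positive and a
    negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

    An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which puts the reported threshold of 0.65 in context.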

  3. PSSP-RFE: accurate prediction of protein structural class by recursive feature extraction from PSI-BLAST profile, physical-chemical property and functional annotations.

    PubMed

    Li, Liqi; Cui, Xiang; Yu, Sanjiu; Zhang, Yuan; Luo, Zhong; Yang, Hua; Zhou, Yue; Zheng, Xiaoqi

    2014-01-01

    Protein structure prediction is critical to functional annotation of the massively accumulated biological sequences, which prompts an imperative need for the development of high-throughput technologies. As a first and key step in protein structure prediction, protein structural class prediction becomes an increasingly challenging task. Among most homology-based approaches, the accuracies of protein structural class prediction are sufficiently high for high similarity datasets, but still far from satisfactory for low similarity datasets, i.e., below 40% in pairwise sequence similarity. Therefore, we present a novel method for accurate and reliable protein structural class prediction for both high and low similarity datasets. This method is based on Support Vector Machine (SVM) in conjunction with integrated features from position-specific score matrix (PSSM), PROFEAT and Gene Ontology (GO). A feature selection approach, SVM-RFE, is also used to rank the integrated feature vectors by recursively removing the feature with the lowest ranking score. The definitive top features selected by SVM-RFE are input into the SVM engines to predict the structural class of a query protein. To validate our method, jackknife tests were applied to seven widely used benchmark datasets, reaching overall accuracies between 84.61% and 99.79%, which are significantly higher than those achieved by state-of-the-art tools. These results suggest that our method could serve as an accurate and cost-effective alternative to existing methods in protein structural classification, especially for low similarity datasets. PMID:24675610
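    The recursive removal step described above can be sketched as a generic elimination loop in Python. The sketch is deliberately model-agnostic: `rank_scores` is a hypothetical callback that returns a score per remaining feature, whereas in SVM-RFE those scores would come from retraining a linear SVM on the remaining features at each round. The function and parameter names are assumptions for illustration.

```python
def recursive_feature_elimination(features, rank_scores, n_keep):
    """Repeatedly drop the lowest-scoring feature until n_keep remain.

    `rank_scores(remaining)` must return a dict mapping each remaining
    feature name to its current ranking score (higher is better).
    Returns (kept_features, eliminated_in_order).
    """
    remaining = list(features)
    eliminated = []
    while len(remaining) > n_keep:
        scores = rank_scores(remaining)  # re-score on the reduced feature set
        worst = min(remaining, key=lambda name: scores[name])
        remaining.remove(worst)
        eliminated.append(worst)
    return remaining, eliminated
```

    Re-scoring after every removal is what distinguishes this from a one-shot ranking: a feature's importance can change once a correlated feature has been dropped.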

  4. Predicting drug resistance of the HIV-1 protease using molecular interaction energy components.

    PubMed

    Hou, Tingjun; Zhang, Wei; Wang, Jian; Wang, Wei

    2009-03-01

    Drug resistance significantly impairs the efficacy of AIDS therapy. Therefore, precise prediction of resistant viral mutants is particularly useful for developing effective drugs and designing therapeutic regimens. In this study, we applied a structure-based computational approach to predict mutants of the HIV-1 protease resistant to the seven FDA-approved drugs. We analyzed the energetic pattern of the protease-drug interaction by calculating the molecular interaction energy components (MIECs) between the drug and the protease residues. Support vector machines (SVMs) were trained on MIECs to classify protease mutants into resistant and nonresistant categories. The high prediction accuracies for the test sets of cross-validations suggested that the MIECs successfully characterized the interaction interface between drugs and the HIV-1 protease. We conducted a proof-of-concept study on a newly approved drug, darunavir (TMC114), on which no drug resistance data were available in the public domain. Our analysis suggested that darunavir might be more potent than amprenavir in combating drug resistance. To quantitatively estimate binding affinities of drugs and study the contributions of protease residues to causing resistance, linear regression models were trained on MIECs using partial least squares (PLS). The MIEC-PLS models also achieved satisfactory prediction accuracy. Analysis of the fitting coefficients of MIECs in the regression model revealed the important resistance mutations and shed light on the mechanisms by which these mutations cause resistance. Our study demonstrated the advantages of characterizing the protease-drug interaction using MIECs. We believe that MIEC-SVM and MIEC-PLS can help design new agents or combinations of therapeutic regimens to counter HIV-1 protease resistant strains. PMID:18704937

  5. Predictive Bioinformatic Assignment of Methyl-Bearing Stereocenters, Total Synthesis, and an Additional Molecular Target of Ajudazol B.

    PubMed

    Essig, Sebastian; Schmalzbauer, Björn; Bretzke, Sebastian; Scherer, Olga; Koeberle, Andreas; Werz, Oliver; Müller, Rolf; Menche, Dirk

    2016-02-19

    Full details on the evaluation and application of an easily feasible and generally useful method for configurational assignments of isolated methyl-bearing stereocenters are reported. The analytical tool relies on a bioinformatic gene cluster analysis and utilizes a predictive enoylreductase alignment, and its feasibility was demonstrated by the full stereochemical determination of the ajudazols, highly potent inhibitors of the mitochondrial respiratory chain. Furthermore, a full account of our strategies and tactics that culminated in the total synthesis of ajudazol B, the most potent and least abundant of this structurally unique class of myxobacterial natural products, is presented. Key features include an application of an asymmetric ortholithiation strategy for synthesis of the characteristic anti-configured hydroxyisochromanone core bearing three contiguous stereocenters, a modular oxazole formation, a flexible cross-metathesis approach for terminal allyl amide synthesis, and a late-stage Z,Z-selective Suzuki coupling. This total synthesis unambiguously proves the correct stereochemistry, which was further corroborated by comparison with reisolated natural material. Finally, 5-lipoxygenase was discovered as an additional molecular target of ajudazol B. Activities against this clinically validated key enzyme of the biosynthesis of proinflammatory leukotrienes were in the range of the approved drug zileuton, which further underlines the biological importance of this unique natural product. PMID:26796481

  6. Using molecular classification to predict gains in maximal aerobic capacity following endurance exercise training in humans

    PubMed Central

    Knudsen, Steen; Rankinen, Tuomo; Koch, Lauren G.; Sarzynski, Mark; Jensen, Thomas; Keller, Pernille; Scheele, Camilla; Vollaard, Niels B. J.; Nielsen, Søren; Åkerström, Thorbjörn; MacDougald, Ormond A.; Jansson, Eva; Greenhaff, Paul L.; Tarnopolsky, Mark A.; van Loon, Luc J. C.; Pedersen, Bente K.; Sundberg, Carl Johan; Wahlestedt, Claes; Britton, Steven L.; Bouchard, Claude

    2010-01-01

    A low maximal oxygen consumption (V̇o2max) is a strong risk factor for premature mortality. Supervised endurance exercise training increases V̇o2max with a very wide range of effectiveness in humans. Discovering the DNA variants that contribute to this heterogeneity typically requires substantial sample sizes. In the present study, we first use RNA expression profiling to produce a molecular classifier that predicts V̇o2max training response. We then hypothesized that the classifier genes would harbor DNA variants that contributed to the heterogeneous V̇o2max response. Two independent preintervention RNA expression data sets were generated (n = 41 gene chips) from subjects that underwent supervised endurance training: one identified and the second blindly validated an RNA expression signature that predicted change in V̇o2max (“predictor” genes). The HERITAGE Family Study (n = 473) was used for genotyping. We discovered a 29-RNA signature that predicted V̇o2max training response on a continuous scale; these genes contained ∼6 new single-nucleotide polymorphisms associated with gains in V̇o2max in the HERITAGE Family Study. Three of four novel candidate genes from the HERITAGE Family Study were confirmed as RNA predictor genes (i.e., “reciprocal” RNA validation of a quantitative trait locus genotype), enhancing the performance of the 29-RNA-based predictor. Notably, RNA abundance for the predictor genes was unchanged by exercise training, supporting the idea that expression was preset by genetic variation. Regression analysis yielded a model where 11 single-nucleotide polymorphisms explained 23% of the variance in gains in V̇o2max, corresponding to ∼50% of the estimated genetic variance for V̇o2max. In conclusion, combining RNA profiling with single-gene DNA marker association analysis yields a strongly validated molecular predictor with meaningful explanatory power. V̇o2max responses to endurance training can be predicted by measuring a ∼30

  7. A signal detection model predicts the effects of set size on visual search accuracy for feature, conjunction, triple conjunction, and disjunction displays

    NASA Technical Reports Server (NTRS)

    Eckstein, M. P.; Thomas, J. P.; Palmer, J.; Shimozaki, S. S.

    2000-01-01

    Recently, quantitative models based on signal detection theory have been successfully applied to the prediction of human accuracy in visual search for a target that differs from distractors along a single attribute (feature search). The present paper extends these models for visual search accuracy to multidimensional search displays in which the target differs from the distractors along more than one feature dimension (conjunction, disjunction, and triple conjunction displays). The model assumes that each element in the display elicits a noisy representation for each of the relevant feature dimensions. The observer combines the representations across feature dimensions to obtain a single decision variable, and the stimulus with the maximum value determines the response. The model accurately predicts human experimental data on visual search accuracy in conjunctions and disjunctions of contrast and orientation. The model accounts for performance degradation without resorting to a limited-capacity spatially localized and temporally serial mechanism by which to bind information across feature dimensions.
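    The decision rule described in this abstract (a noisy representation per element and feature dimension, combination across dimensions into one decision variable, and a response determined by the maximum) can be illustrated with a small Monte Carlo sketch. This is not the authors' implementation. The unit-variance Gaussian noise, the plain sum used to combine dimensions, the zero-mean distractors, and the names `target_wins` and `proportion_correct` are all assumptions made for illustration.

```python
import random


def target_wins(set_size, d_primes, rng):
    """Simulate one trial; return True if the target has the largest decision variable.

    d_primes holds the assumed target detectability (d') on each feature dimension.
    """
    n_dims = len(d_primes)
    # Target: Gaussian response with mean d' and unit variance on each dimension,
    # combined across dimensions by a plain sum (an assumed combination rule).
    target = sum(rng.gauss(d, 1.0) for d in d_primes)
    if set_size == 1:
        return True  # no distractors to compete with
    # Distractors: zero-mean responses on every dimension (a simplifying assumption;
    # real conjunction displays have distractors that share a feature with the target).
    best_distractor = max(
        sum(rng.gauss(0.0, 1.0) for _ in range(n_dims))
        for _ in range(set_size - 1)
    )
    return target > best_distractor


def proportion_correct(set_size, d_primes, n_trials=20000, seed=0):
    """Estimate accuracy of the max rule for a given set size by simulation."""
    rng = random.Random(seed)
    hits = sum(target_wins(set_size, d_primes, rng) for _ in range(n_trials))
    return hits / n_trials


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, round(proportion_correct(n, [1.5, 1.5]), 3))
```

    Under these assumptions accuracy falls as set size grows even though every element is processed in parallel, because each added distractor is one more chance for noise to exceed the target response.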

  8. Clinical features and molecular characteristics of invasive community-acquired methicillin-resistant Staphylococcus aureus infections in Taiwanese children.

    PubMed

    Chen, Chih-Jung; Su, Lin-Hui; Chiu, Cheng-Hsun; Lin, Tzou-Yien; Wong, Kin-Sun; Chen, Yi-Ywan M; Huang, Yhu-Chering

    2007-11-01

    Highly virulent community-acquired methicillin-resistant Staphylococcus aureus (CA-MRSA) has been associated with morbidity and mortality in various countries of the world. We characterized the clinical and molecular features of pediatric invasive CA-MRSA infections in Taiwan. Between July 2000 and June 2005, 31 previously healthy children with invasive CA-MRSA infections were identified from 423 children with community-onset methicillin-resistant S. aureus infections. The medical records were reviewed. The clinical isolates, if available, were collected for molecular characterization. Sixteen (51.6%) patients were male, and the mean age was 5.7 years. Adolescents accounted for 9 (29%) cases. Eighteen children had bone and/or joint infections, 14 had deep-seated soft tissue infections, 11 had pneumonia, and 2 had central nervous system infections. Multiorgan involvement was identified in 8 of 20 bacteremic cases. Twenty-two patients (71%) required surgical interventions. The mean hospital stay was 27.4 days. All of the 15 available isolates were classified as sequence type (ST) 59 or its single locus variant and belonged to 2 previously reported community-associated clones containing staphylococcal cassette chromosome mec (SCCmec) type IV or type V(T) in Taiwan. Most of the isolates were multiresistant to clindamycin (94%) and erythromycin (97%). Eleven (73.3%) isolates carried pvl genes, and the strains harboring pvl genes were significantly associated with lung involvement. In conclusion, invasive CA-MRSA infections in the pediatric population were not limited to young children. Surgical interventions were often required, and a prolonged course of antibiotic therapy was needed. A multiresistant CA-MRSA clone characterized as ST59 was identified from these children in Taiwan. PMID:17662565

  9. Molecular Features Contributing to Virus-Independent Intracellular Localization and Dynamic Behavior of the Herpesvirus Transport Protein US9

    PubMed Central

    Pedrazzi, Manuela; Nash, Bradley; Meucci, Olimpia; Brandimarti, Renato

    2014-01-01

    Reaching the right destination is of vital importance for molecules, proteins, organelles, and cargoes. Thus, intracellular traffic is continuously controlled and regulated by several proteins taking part in the process. Viruses exploit this machinery, and viral proteins regulating intracellular transport have been identified as they represent valuable tools to understand and possibly direct molecule targeting and delivery. Deciphering the molecular features of viral proteins contributing to (or determining) this dynamic phenotype can eventually lead to a virus-independent approach to control cellular transport and delivery. From this virus-independent perspective we looked at US9, a virion component of Herpes Simplex Virus involved in anterograde transport of the virus inside neurons of the infected host. As the natural cargo of US9-related vesicles is the virus (or its parts), defining its autonomous, virus-independent role in vesicle transport represents a prerequisite to make US9 a valuable molecular tool to study and possibly direct cellular transport. To assess the extent of this autonomous role in vesicle transport, we analyzed US9 behavior in the absence of viral infection. Based on our studies, US9 behavior appears similar in different cell types; however, as expected, the data we obtained in neurons best represent the virus-independent properties of US9. In these primary cells, transfected US9 mostly recapitulates the behavior of US9 expressed from the viral genome. Additionally, ablation of two major phosphorylation sites (i.e. Y32Y33 and S34ES36) has no effect on protein incorporation on vesicles or on its localization in both proximal and distal regions of the cells. These results support the idea that, while US9 post-translational modification may be important to regulate cargo loading and, consequently, virion export and delivery, no additional viral functions are required for the role of US9 in intracellular transport. PMID:25133647

  10. Molecular dynamics prediction and experimental evidence for density of normal and metastable liquid zirconium

    NASA Astrophysics Data System (ADS)

    Wang, H. P.; Yang, S. J.; Hu, L.; Wei, B.

    2016-06-01

    The density of normal and metastable undercooled liquid zirconium was predicted by molecular dynamics calculations on a system of 4000 atoms and measured by electrostatic levitation experiments. The results show that the density increases linearly with decreasing temperature, over a range that includes a maximum undercooling of 928 K. The calculated density is 6.00 g cm⁻³ at the melting temperature, which agrees well with the experimental result of 6.06 g cm⁻³. Furthermore, increasing the number of atoms from 4000 to 32,000 changes the result by only 0.02%. In addition, the pair distribution function was used to characterize the atomic structure, and it indicates that the liquid structure change occurs at the first-neighbor distance.

  11. Isotopic Soret effect in ternary mixtures: Theoretical predictions and molecular simulations

    SciTech Connect

    Artola, Pierre-Arnaud; Rousseau, Bernard

    2015-11-07

    In this paper, we study the Soret effect in ternary fluid mixtures of isotopic argon-like atoms. Soret coefficients have been computed using non-equilibrium molecular dynamics and a theoretical approach based on our extended Prigogine model (with mass effect) and generalized to mixtures with any number of components. As is well known from binary mixture studies, the heaviest component always accumulates on the cold side whereas the lightest species accumulates on the hot side. An interesting behavior is observed for the species with the intermediate mass: it can accumulate on either side, depending on composition and mass ratios. A simple picture can be given to understand this change of sign: the intermediate mass species can be seen as evolving in an equivalent fluid whose species mass varies with composition. An excellent prediction of all simulated data has been obtained using our model, including the change of sign of the Soret coefficient for species with intermediate mass.

  12. Isotopic Soret effect in ternary mixtures: Theoretical predictions and molecular simulations

    NASA Astrophysics Data System (ADS)

    Artola, Pierre-Arnaud; Rousseau, Bernard

    2015-11-01

    In this paper, we study the Soret effect in ternary fluid mixtures of isotopic argon-like atoms. Soret coefficients have been computed using non-equilibrium molecular dynamics and a theoretical approach based on our extended Prigogine model (with mass effect) and generalized to mixtures with any number of components. As is well known from binary mixture studies, the heaviest component always accumulates on the cold side whereas the lightest species accumulates on the hot side. An interesting behavior is observed for the species with the intermediate mass: it can accumulate on either side, depending on composition and mass ratios. A simple picture can be given to understand this change of sign: the intermediate mass species can be seen as evolving in an equivalent fluid whose species mass varies with composition. An excellent prediction of all simulated data has been obtained using our model, including the change of sign of the Soret coefficient for species with intermediate mass.

  13. Zsyntax: A Formal Language for Molecular Biology with Projected Applications in Text Mining and Biological Prediction

    PubMed Central

    Boniolo, Giovanni; D'Agostino, Marcello; Di Fiore, Pier Paolo

    2010-01-01

    We propose a formal language that allows for transposing biological information precisely and rigorously into machine-readable information. This language, which we call Zsyntax (where Z stands for the Greek word ζωή, life), is grounded on a particular type of non-classical logic, and it can be used to write algorithms and computer programs. We present it as a first step towards a comprehensive formal language for molecular biology in which any biological process can be written and analyzed as a sort of logical “deduction”. Moreover, we illustrate the potential value of this language, both in the field of text mining and in that of biological prediction. PMID:20209084

  14. Prediction of static contact angles on the basis of molecular forces and adsorption data

    NASA Astrophysics Data System (ADS)

    Diaz, M. Elena; Savage, Michael D.; Cerro, Ramon L.

    2016-08-01

    At a three-phase contact line, a liquid bulk phase is in contact with and coexists with a very thin layer of adsorbed molecules. This adsorbed film in the immediate vicinity of a liquid wedge modifies the balance of forces between the liquid and solid phases such that, when included in the balance of forces, a quantitative relationship emerges between the adsorbed film thickness and the static contact angle. This relationship permits the prediction of static contact angles from molecular forces and equilibrium adsorption data by means of quantities that are physically meaningful and measurable. For n-alkanes on polytetrafluoroethylene, for which there are experimental data available on adsorption and contact angles, our computations show remarkable agreement with the data. The results obtained are an improvement on previously published calculations—particularly for alkanes with a low number of carbon atoms, for which adsorption is significant.

  15. Deleterious Effects of Exact Exchange Functionals on Predictions of Molecular Conductance.

    PubMed

    Feng, Qingguo; Yamada, Atsushi; Baer, Roi; Dunietz, Barry D

    2016-08-01

    Kohn-Sham (KS) density functional theory (DFT) describes well the atomistic structure of molecular junctions and their coupling to the semi-infinite metallic electrodes but severely overestimates conductance due to the spuriously large density of charge-carrier states of the KS system. Previous works show that inclusion of appropriate amounts of nonlocal exchange in the functional can fix the problem and provide realistic conductance estimates. Here, however, we discover that nonlocal exchange can also lead to deleterious effects that artificially overestimate transmittance even beyond the KS-DFT prediction. The effect is a result of exchange coupling between nonoverlapping states of diradical character. We prescribe a practical recipe for eliminating such artifacts. PMID:27454778

  16. Prediction of novel and selective TNF-alpha converting enzyme (TACE) inhibitors and characterization of correlative molecular descriptors by machine learning approaches.

    PubMed

    Cong, Yong; Yang, Xue-Gang; Lv, Wei; Xue, Ying

    2009-10-01

    The inhibition of TNF-alpha converting enzyme (TACE) has been explored as a feasible therapy for the treatment of rheumatoid arthritis (RA) and Crohn's disease (CD). Recently, large numbers of novel and selective TACE inhibitors have been reported. It is desirable to develop machine learning (ML) models for identifying the inhibitors of TACE in the early drug design phase and test the prediction capabilities of these ML models. This work evaluated four ML methods, support vector machine (SVM), k-nearest neighbor (k-NN), back-propagation neural network (BPNN) and C4.5 decision tree (C4.5 DT), which were trained and tested by using a diverse set of 443 TACE inhibitors and 759 non-inhibitors. A well-established feature selection method, the recursive feature elimination (RFE) method, was used to select the most appropriate descriptors for classification from a large pool of descriptors, and two evaluation methods, 5-fold cross-validation and independent evaluation, were used to assess the performances of these developed models. In this study, all these ML models achieved promising prediction accuracies. By using the RFE method, the prediction accuracies are further improved. The k-NN model gives the best prediction for TACE inhibitors (98.32%), and the SVM model gives the best prediction for non-inhibitors (99.51%). Both the k-NN and SVM models give the best overall prediction accuracy (98.45%). To the best of our knowledge, the SVM model developed in this work is the first one for the classification prediction of TACE inhibitors with a broad applicability domain. Our study suggests that ML methods, particularly SVM, are potentially useful for facilitating the discovery of TACE inhibitors and for exhibiting the molecular descriptors associated with TACE inhibitors. PMID:19729328
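    The abstract reports three separate accuracies: one for inhibitors, one for non-inhibitors, and one overall. A minimal sketch of how these three quantities relate is given below. It is not taken from the paper; the function name `class_accuracies`, the 1/0 label encoding, and the toy labels in the usage lines are hypothetical.

```python
def class_accuracies(y_true, y_pred):
    """Return per-class and overall accuracy for binary labels.

    Labels are assumed to be 1 (inhibitor) and 0 (non-inhibitor).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    n_pos = sum(1 for t, _ in pairs if t == 1)
    n_neg = sum(1 for t, _ in pairs if t == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "inhibitors": true_pos / n_pos,          # fraction of inhibitors recovered
        "non_inhibitors": true_neg / n_neg,      # fraction of non-inhibitors recovered
        "overall": (true_pos + true_neg) / len(pairs),
    }


if __name__ == "__main__":
    # Toy labels for illustration only; these are not data from the study.
    y_true = [1, 1, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 1]
    print(class_accuracies(y_true, y_pred))
```

    The overall figure is a class-size-weighted average of the two per-class figures, so with unbalanced classes it is dominated by the larger class, which is one reason the per-class values are worth reporting alongside it.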

  17. Spatial-Temporal [18F]FDG-PET Features for Predicting Pathologic Response of Esophageal Cancer to Neoadjuvant Chemoradiation Therapy

    SciTech Connect

    Tan, Shan; Kligerman, Seth; Chen, Wengen; Lu, Minh; Kim, Grace; Feigenberg, Steven; D'Souza, Warren D.; Suntharalingam, Mohan; Lu, Wei

    2013-04-01

    Purpose: To extract and study comprehensive spatial-temporal 18F-labeled fluorodeoxyglucose ([18F]FDG) positron emission tomography (PET) features for the prediction of pathologic tumor response to neoadjuvant chemoradiation therapy (CRT) in esophageal cancer. Methods and Materials: Twenty patients with esophageal cancer were treated with trimodal therapy (CRT plus surgery) and underwent [18F]FDG-PET/CT scans both before (pre-CRT) and after (post-CRT) CRT. The 2 scans were rigidly registered. A tumor volume was semiautomatically delineated using a threshold standardized uptake value (SUV) of ≥2.5, followed by manual editing. Comprehensive features were extracted to characterize SUV intensity distribution, spatial patterns (texture), tumor geometry, and associated changes resulting from CRT. The usefulness of each feature in predicting pathologic tumor response to CRT was evaluated using the area under the receiver operating characteristic curve (AUC) value. Results: The best traditional response measure was decline in maximum SUV (SUVmax; AUC, 0.76). Two new intensity features, decline in mean SUV (SUVmean) and skewness, and 3 texture features (inertia, correlation, and cluster prominence) were found to be significant predictors with AUC values ≥0.76. According to these features, a tumor was more likely to be a responder when the SUVmean decline was larger, when there were relatively fewer voxels with higher SUV values pre-CRT, or when [18F]FDG uptake post-CRT was relatively homogeneous. All of the most accurate predictive features were extracted from the entire tumor rather than from the most active part of the tumor. For SUV intensity features and tumor size features, changes were more predictive than pre- or post-CRT assessment alone. Conclusion: Spatial-temporal [18F]FDG-PET features were found to be useful predictors of pathologic tumor response to neoadjuvant CRT in esophageal cancer.
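    Each feature in this study is ranked by its AUC. As a rough illustration of what that number measures, the sketch below computes AUC for a single feature as the probability that a randomly chosen responder has a higher feature value than a randomly chosen non-responder, with ties counted as one half. The function name `auc` and the feature values in the usage lines are hypothetical and are not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve for one feature, via pairwise comparison.

    scores_pos: feature values for responders (positive class).
    scores_neg: feature values for non-responders (negative class).
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # responder ranked above non-responder
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Made-up declines in mean SUV, for illustration only.
    responders = [3.1, 2.4, 4.0, 1.9]
    non_responders = [1.2, 2.0, 0.8, 2.6]
    print(round(auc(responders, non_responders), 3))
```

    An AUC of 0.5 means the feature separates the two groups no better than chance, and 1.0 means every responder scores above every non-responder. The pairwise loop is quadratic in group size, which is acceptable for a cohort of this size.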

  18. A new molecular signature method for prediction of driver cancer pathways from transcriptional data

    PubMed Central

    Rykunov, Dmitry; Beckmann, Noam D.; Li, Hui; Uzilov, Andrew; Schadt, Eric E.; Reva, Boris

    2016-01-01

    Assigning cancer patients to the most effective treatments requires an understanding of the molecular basis of their disease. While DNA-based molecular profiling approaches have flourished over the past several years to transform our understanding of driver pathways across a broad range of tumors, a systematic characterization of key driver pathways based on RNA data has not been undertaken. Here we introduce a new approach for predicting the status of driver cancer pathways based on signature functions derived from RNA sequencing data. To identify the driver cancer pathways of interest, we mined DNA variant data from TCGA and nominated driver alterations in seven major cancer pathways in breast, ovarian and colon cancer tumors. The activation status of these driver pathways was then characterized using RNA sequencing data by constructing classification signature functions in training datasets and then testing the accuracy of the signatures in test datasets. The signature functions effectively distinguish tumors with nominated pathway activation from tumors with no signs of activation: the average AUC is 0.83. Our results confirm that driver genomic alterations are distinctively displayed at the transcriptional level and that the transcriptional signatures can generally provide an alternative to DNA sequencing methods in detecting specific driver pathways. PMID:27098033

  19. Protein-protein structure prediction by scoring molecular dynamics trajectories of putative poses.

    PubMed

    Sarti, Edoardo; Gladich, Ivan; Zamuner, Stefano; Correia, Bruno E; Laio, Alessandro

    2016-09-01

    The prediction of protein-protein interactions and their structural configuration remains a largely unsolved problem. Most of the algorithms aimed at finding the native conformation of a protein complex starting from the structure of its monomers are based on searching the structure corresponding to the global minimum of a suitable scoring function. However, protein complexes are often highly flexible, with mobile side chains and transient contacts due to thermal fluctuations. Flexibility can be neglected if one aims at finding quickly the approximate structure of the native complex, but may play a role in structure refinement, and in discriminating solutions characterized by similar scores. We here benchmark the capability of some state-of-the-art scoring functions (BACH-SixthSense, PIE/PISA and Rosetta) in discriminating finite-temperature ensembles of structures corresponding to the native state and to non-native configurations. We produce the ensembles by running thousands of molecular dynamics simulations in explicit solvent starting from poses generated by rigid docking and optimized in vacuum. We find that while Rosetta outperformed the other two scoring functions in scoring the structures in vacuum, BACH-SixthSense and PIE/PISA perform better in distinguishing near-native ensembles of structures generated by molecular dynamics in explicit solvent. Proteins 2016; 84:1312-1320. © 2016 Wiley Periodicals, Inc. PMID:27253756

  20. Molecular and immunologic markers of kidney cancer-potential applications in predictive, preventive and personalized medicine.

    PubMed

    Mickley, Amanda; Kovaleva, Olga; Kzhyshkowska, Julia; Gratchev, Alexei

    2015-01-01

    Kidney cancer is one of the deadliest malignancies due to frequent late diagnosis (33% of renal cell carcinomas are metastatic at diagnosis) and poor treatment options. There are two major subtypes of kidney cancer: renal cell carcinoma (RCC) and renal pelvis carcinoma. The risk factors for RCC, which accounts for more than 90% of all kidney cancers, are smoking, obesity, hypertension, misuse of pain medication, and some genetic diseases. The most common molecular markers of kidney cancer include mutations and epigenetic inactivation of the von Hippel-Lindau (VHL) gene, genes of the vascular endothelial growth factor (VEGF) pathway, and carbonic anhydrase IX (CAIX). The role of epigenetic pathways, including DNA methylation and chromatin structure remodeling, was also demonstrated. Immunologic properties of RCC enable this type of tumor to escape immune response effectively. An important role in this process is played by tumor-associated macrophages that demonstrate a mixed M1/M2 phenotype. In this review, we discuss molecular and cellular aspects of RCC development and the current state of knowledge allowing personalized approaches for diagnostics and prognostic prediction of this disease. A set of macrophage markers is suggested for the analysis of the association of macrophage phenotype and disease prognosis. PMID:26500709