Sample records for vector regression approach

  1. Robust support vector regression networks for function approximation with outliers.

    PubMed

    Chuang, Chen-Chia; Su, Shun-Feng; Jeng, Jin-Tsong; Hsiao, Chih-Ching

    2002-01-01

    Support vector regression (SVR) employs the support vector machine (SVM) to tackle problems of function approximation and regression estimation. SVR has been shown to have good robustness against noise. However, when the parameters used in SVR are improperly selected, overfitting may still occur, and the selection of the various parameters is not straightforward. Moreover, in SVR, outliers may be taken as support vectors, and their inclusion can lead to serious overfitting. In this paper, a novel regression approach, termed the robust support vector regression (RSVR) network, is proposed to enhance the robustness of SVR. In the approach, traditional robust learning techniques are employed to improve learning performance for any selected parameters. In the simulation results, RSVR always improved the performance of the learned systems. Moreover, even when training lasted for a long period, the testing errors did not rise; in other words, the overfitting phenomenon is indeed suppressed.
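
    A minimal sketch of the robustness issue this abstract raises: epsilon-insensitive SVR bounds the influence of any single point through C, while ordinary least squares does not. The data, the linear kernel, and all parameters below are illustrative assumptions; the RSVR refinement itself is not implemented.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
y = 2.0 * X.ravel() + rng.normal(0.0, 0.2, 50)   # true slope: 2.0
y[-1] += 40.0                                    # one gross outlier at high leverage

svr = SVR(kernel="linear", C=1.0, epsilon=0.1).fit(X, y)
ols = LinearRegression().fit(X, y)

svr_slope = svr.coef_.ravel()[0]   # outlier influence bounded by C
ols_slope = ols.coef_[0]           # outlier influence unbounded
print(svr_slope, ols_slope)
```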

  2. TWSVR: Regression via Twin Support Vector Machine.

    PubMed

    Khemchandani, Reshma; Goyal, Keshav; Chandra, Suresh

    2016-02-01

    Taking motivation from the Twin Support Vector Machine (TWSVM) formulation, Peng (2010) attempted to propose Twin Support Vector Regression (TSVR), where the regressor is obtained via solving a pair of quadratic programming problems (QPPs). In this paper we argue that the TSVR formulation is not in the true spirit of TWSVM. Further, taking motivation from Bi and Bennett (2003), we propose an alternative approach to find a formulation for Twin Support Vector Regression (TWSVR) which is in the true spirit of TWSVM. We show that our proposed TWSVR can be derived from TWSVM for an appropriately constructed classification problem. To check the efficacy of our proposed TWSVR, we compare its performance with TSVR and classical Support Vector Regression (SVR) on various regression datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.

  3. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    PubMed

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
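
    The three-way comparison pattern the abstract describes can be sketched as below on synthetic classification data (a stand-in for the genotype and expression features of the workshop data; all parameters are illustrative).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Synthetic stand-in: many features, few of them informative.
X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           random_state=0)
models = {
    "linear SVM": SVC(kernel="linear"),
    "radial SVM": SVC(kernel="rbf", gamma="scale"),
    "logistic regression": LogisticRegression(max_iter=1000),
}
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
print(scores)
```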

  4. ℓ(p)-Norm multikernel learning approach for stock market price forecasting.

    PubMed

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ(1)-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ(p)-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ(1)-norm multiple support vector regression model.

  5. ℓ p-Norm Multikernel Learning Approach for Stock Market Price Forecasting

    PubMed Central

    Shao, Xigao; Wu, Kun; Liao, Bifeng

    2012-01-01

    Linear multiple kernel learning model has been used for predicting financial time series. However, ℓ 1-norm multiple support vector regression is rarely observed to outperform trivial baselines in practical applications. To allow for robust kernel mixtures that generalize well, we adopt ℓ p-norm multiple kernel support vector regression (1 ≤ p < ∞) as a stock price prediction model. The optimization problem is decomposed into smaller subproblems, and the interleaved optimization strategy is employed to solve the regression model. The model is evaluated on forecasting the daily stock closing prices of Shanghai Stock Index in China. Experimental results show that our proposed model performs better than ℓ 1-norm multiple support vector regression model. PMID:23365561
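
    The mixture-of-kernels idea can be sketched with fixed weights: several RBF base kernels are combined under weights normalized to unit ℓp-norm and passed to an SVR as a precomputed kernel. The paper learns the weights by interleaved optimization; here they are simply fixed, and the data are synthetic, so this shows the precomputed-kernel mechanics only.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVR

rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(120, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=120)

gammas = [0.1, 1.0, 10.0]                  # one base RBF kernel per bandwidth
p = 2.0
beta = np.ones(len(gammas))
beta /= np.sum(beta ** p) ** (1.0 / p)     # weights normalized to unit lp-norm

K = sum(b * rbf_kernel(X, X, gamma=g) for b, g in zip(beta, gammas))
model = SVR(kernel="precomputed", C=10.0).fit(K, y)

rmse = np.sqrt(np.mean((model.predict(K) - y) ** 2))
print(rmse)
```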

  6. Clifford support vector machines for classification, regression, and recurrence.

    PubMed

    Bayro-Corrochano, Eduardo Jose; Arana-Daniel, Nancy

    2010-11-01

    This paper introduces the Clifford support vector machines (CSVM) as a generalization of the real and complex-valued support vector machines using the Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors. This allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply CSVM for classification and regression and also to build a recurrent CSVM. The CSVM is an attractive approach for the multiple input multiple output processing of high-dimensional geometric entities. We carried out comparisons between CSVM and the current approaches to solve multiclass classification and regression. We also study the performance of the recurrent CSVM with experiments involving time series. The authors believe that this paper can be of great use for researchers and practitioners interested in multiclass hypercomplex computing, particularly for applications in complex and quaternion signal and image processing, satellite control, neurocomputation, pattern recognition, computer vision, augmented virtual reality, robotics, and humanoids.

  7. New analysis methods to push the boundaries of diagnostic techniques in the environmental sciences

    NASA Astrophysics Data System (ADS)

    Lungaroni, M.; Murari, A.; Peluso, E.; Gelfusa, M.; Malizia, A.; Vega, J.; Talebzadeh, S.; Gaudio, P.

    2016-04-01

    In recent years, new and more sophisticated measurements have underpinned major progress in various disciplines related to the environment, such as remote sensing and thermonuclear fusion. To maximize the effectiveness of the measurements, new data analysis techniques are required. First data processing tasks, such as filtering and fitting, are of primary importance, since they can have a strong influence on the rest of the analysis. Although Support Vector Regression is a method devised and refined at the end of the 1990s, a systematic comparison with more traditional non-parametric regression methods has never been reported. In this paper, a series of systematic tests is described, which indicates that SVR is a very competitive method of non-parametric regression that can usefully complement and often outperform more consolidated approaches. The performance of Support Vector Regression as a method of filtering is investigated first, comparing it with the most popular alternative techniques. Then Support Vector Regression is applied to the problem of non-parametric regression to analyse Lidar surveys for the environmental measurement of particulate matter due to wildfires. The proposed approach has given very positive results and provides new perspectives on the interpretation of the data.
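
    SVR-as-filter, the paper's first experiment type, can be sketched on a synthetic noisy signal, with a simple moving average as a traditional baseline. The signal, parameters, and window size are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
t = np.linspace(0.0, 4.0 * np.pi, 200)
clean = np.sin(t)
noisy = clean + 0.3 * rng.normal(size=t.size)

# SVR used as a non-parametric filter of the noisy signal.
svr = SVR(kernel="rbf", C=10.0, gamma=0.5, epsilon=0.1)
smoothed = svr.fit(t.reshape(-1, 1), noisy).predict(t.reshape(-1, 1))

# A 9-point moving average as a traditional baseline filter.
moving_avg = np.convolve(noisy, np.ones(9) / 9.0, mode="same")

rmse_svr = np.sqrt(np.mean((smoothed - clean) ** 2))
rmse_ma = np.sqrt(np.mean((moving_avg - clean) ** 2))
print(rmse_svr, rmse_ma)
```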

  8. Hybrid approach of selecting hyperparameters of support vector machine for regression.

    PubMed

    Jeng, Jin-Tsong

    2006-06-01

    To select the hyperparameters of the support vector machine for regression (SVR), a hybrid approach is proposed to determine the kernel parameter of the Gaussian kernel function and the epsilon value of Vapnik's epsilon-insensitive loss function. The proposed hybrid approach combines a competitive agglomeration (CA) clustering algorithm with a repeated SVR (RSVR) approach. Since the CA clustering algorithm finds a nearly "optimal" number of clusters and the cluster centers during clustering, it is applied to select the Gaussian kernel parameter. Additionally, an RSVR approach that relies on the standard deviation of the training error is proposed to obtain the epsilon of the loss function. Finally, two functions, one real data set (a time series of the quarterly unemployment rate for West Germany), and the identification of a nonlinear plant are used to verify the usefulness of the hybrid approach.
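
    The second ingredient of the hybrid approach, setting epsilon from the standard deviation of a training error, can be sketched as below. The CA clustering step for the kernel parameter is not reproduced; the fixed gamma and the data are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
X = rng.uniform(-2.0, 2.0, size=(150, 1))
y = X.ravel() ** 2 + 0.2 * rng.normal(size=150)

# Pilot fit with a zero-width tube to measure the training-error spread.
pilot = SVR(kernel="rbf", gamma=1.0, C=10.0, epsilon=0.0).fit(X, y)
residual_std = float(np.std(y - pilot.predict(X)))

# Refit with epsilon set from the spread of the training error;
# points inside the wider tube stop being support vectors.
refit = SVR(kernel="rbf", gamma=1.0, C=10.0, epsilon=residual_std).fit(X, y)
print(residual_std, len(pilot.support_), len(refit.support_))
```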

  9. Hybrid modelling based on support vector regression with genetic algorithms in forecasting the cyanotoxins presence in the Trasona reservoir (Northern Spain).

    PubMed

    García Nieto, P J; Alonso Fernández, J R; de Cos Juez, F J; Sánchez Lasheras, F; Díaz Muñiz, C

    2013-04-01

    Cyanotoxins, poisonous substances produced by cyanobacteria, are responsible for health risks in drinking and recreational waters; anticipating their presence is therefore important for preventing those risks. The aim of this study is to use a hybrid approach based on support vector regression (SVR) in combination with genetic algorithms (GAs), known as a genetic algorithm support vector regression (GA-SVR) model, to forecast the presence of cyanotoxins in the Trasona reservoir (Northern Spain). The GA-SVR approach is aimed at highly nonlinear biological problems with sharp peaks, and the tests carried out proved its high performance. Some physical-chemical parameters have been considered along with the biological ones. The results obtained are two-fold. First, the significance of each biological and physical-chemical variable for the presence of cyanotoxins in the reservoir is determined with success. Second, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. Copyright © 2013 Elsevier Inc. All rights reserved.
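
    The GA-SVR optimization pattern can be sketched with a minimal genetic algorithm tuning (C, gamma) of an SVR by cross-validated score. Population size, mutation scale, generation count, and the data are illustrative choices, not the study's settings.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(4)
X = rng.uniform(-3.0, 3.0, size=(120, 2))
y = np.sin(X[:, 0]) * np.cos(X[:, 1]) + 0.1 * rng.normal(size=120)

def fitness(genome):
    C, gamma = np.exp(genome)                    # genome lives in log-space
    return cross_val_score(SVR(kernel="rbf", C=C, gamma=gamma), X, y, cv=3).mean()

pop = rng.normal(0.0, 2.0, size=(12, 2))         # initial random population
for _ in range(5):                               # generations
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-6:]]       # truncation selection
    children = parents[rng.integers(0, 6, 12)] + rng.normal(0.0, 0.3, (12, 2))
    children[:6] = parents                       # elitism: carry parents over
    pop = children

best_score = max(fitness(g) for g in pop)
print(best_score)
```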

  10. Predicting ectotherm disease vector spread—benefits from multidisciplinary approaches and directions forward

    NASA Astrophysics Data System (ADS)

    Thomas, Stephanie Margarete; Beierkuhnlein, Carl

    2013-05-01

    The occurrence of ectotherm disease vectors outside of their previous distribution area and the emergence of vector-borne diseases can be increasingly observed at a global scale and are accompanied by a growing number of studies which investigate the vast range of determining factors and their causal links. Consequently, a broad span of scientific disciplines is involved in tackling these complex phenomena. First, we evaluate the citation behaviour of relevant scientific literature in order to clarify the question "do scientists consider results of other disciplines to extend their expertise?" We then highlight emerging tools and concepts useful for risk assessment. Correlative models (regression-based, machine-learning and profile techniques), mechanistic models (basic reproduction number R0) and methods of spatial regression, interaction and interpolation are described. We discuss further steps towards multidisciplinary approaches regarding new tools and emerging concepts to combine existing approaches such as Bayesian geostatistical modelling, mechanistic models which avoid the need for parameter fitting, joint correlative and mechanistic models, multi-criteria decision analysis and geographic profiling. We take into consideration the quality of occurrence data for vector, host and disease cases, as well as the quality of the predictor variables, as both determine the accuracy of risk-area identification. Finally, we underline the importance of multidisciplinary research approaches. Even if establishing communication networks between scientific disciplines and sharing specific methods is time-consuming, it promises new insights for the surveillance and control of vector-borne diseases worldwide.

  11. Remote Sensing as a Landscape Epidemiologic Tool to Identify Villages at High Risk for Malaria Transmission

    NASA Technical Reports Server (NTRS)

    Beck, Louisa R.; Rodriquez, Mario H.; Dister, Sheri W.; Rodriquez, Americo D.; Rejmankova, Eliska; Ulloa, Armando; Meza, Rosa A.; Roberts, Donald R.; Paris, Jack F.; Spanner, Michael A.

    1994-01-01

    A landscape approach using remote sensing and Geographic Information System (GIS) technologies was developed to discriminate between villages at high and low risk for malaria transmission, as defined by adult Anopheles albimanus abundance. Satellite data for an area in southern Chiapas, Mexico were digitally processed to generate a map of landscape elements. The GIS processes were used to determine the proportion of mapped landscape elements surrounding 40 villages where An. albimanus data had been collected. The relationships between vector abundance and landscape element proportions were investigated using stepwise discriminant analysis and stepwise linear regression. Both analyses indicated that the most important landscape elements in terms of explaining vector abundance were transitional swamp and unmanaged pasture. Discriminant functions generated for these two elements were able to correctly distinguish between villages with high and low vector abundance, with an overall accuracy of 90%. Regression results found both transitional swamp and unmanaged pasture proportions to be predictive of vector abundance during the mid-to-late wet season. This approach, which integrates remotely sensed data and GIS capabilities to identify villages with high vector-human contact risk, provides a promising tool for malaria surveillance programs that depend on labor-intensive field techniques. This is particularly relevant in areas where the lack of accurate surveillance capabilities may result in no malaria control action when, in fact, directed action is necessary. In general, this landscape approach could be applied to other vector-borne diseases in areas where (1) the landscape elements critical to vector survival are known and (2) these elements can be detected at remote sensing scales.

  12. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  13. Support vector methods for survival analysis: a comparison between ranking and regression approaches.

    PubMed

    Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K

    2011-10-01

    To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in the context of structural risk minimization and convex optimization techniques. In the second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategies, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparing the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare SVM-based survival models based on ranking constraints, based on regression constraints, and based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints over models based only on ranking constraints. This work gives empirical evidence that SVM-based models using regression constraints perform significantly better than SVM-based models based on ranking constraints. Our experiments show a comparable performance for methods including only regression or both regression and ranking constraints on clinical data. On high-dimensional data, the former model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included. Copyright © 2011 Elsevier B.V. All rights reserved.
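
    The paper's first evaluation measure, the concordance index for right-censored data, can be sketched directly: a pair is comparable only when the shorter observed time is an uncensored event. The toy data below are illustrative.

```python
import numpy as np

def concordance_index(time, event, risk):
    """Fraction of comparable pairs ordered correctly by the risk score."""
    n_pairs = 0
    n_concordant = 0.0
    for i in range(len(time)):
        if not event[i]:
            continue                      # censored: cannot anchor a pair
        for j in range(len(time)):
            if time[j] > time[i]:         # j survived longer than i
                n_pairs += 1
                if risk[i] > risk[j]:
                    n_concordant += 1.0
                elif risk[i] == risk[j]:
                    n_concordant += 0.5   # ties count half
    return n_concordant / n_pairs

time = np.array([2.0, 4.0, 6.0, 8.0])
event = np.array([1, 1, 0, 1])             # third observation is censored
perfect = np.array([4.0, 3.0, 2.0, 1.0])   # higher risk = earlier failure
print(concordance_index(time, event, perfect))
```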

  14. Prediction of biomechanical parameters of the proximal femur using statistical appearance models and support vector regression.

    PubMed

    Fritscher, Karl; Schuler, Benedikt; Link, Thomas; Eckstein, Felix; Suhm, Norbert; Hänni, Markus; Hengg, Clemens; Schubert, Rainer

    2008-01-01

    Fractures of the proximal femur are one of the principal causes of mortality among elderly persons. Traditional methods for determining femoral fracture risk rely on measurements of bone mineral density (BMD). However, BMD alone is not sufficient to predict bone failure load for an individual patient, and additional parameters have to be determined for this purpose. In this work, an approach is presented that uses statistical models of appearance to identify relevant regions and parameters for the prediction of biomechanical properties of the proximal femur. By using Support Vector Regression, the proposed model-based approach is capable of predicting two different biomechanical parameters accurately and fully automatically in two different testing scenarios.

  15. A Language-Independent Approach to Automatic Text Difficulty Assessment for Second-Language Learners

    DTIC Science & Technology

    2013-08-01

    best-suited for regression. Our baseline uses z-normalized shallow length features and TF-LOG weighted vectors on bag-of-words for Arabic, Dari, English and Pashto. We compare Support Vector Machines and the Margin...football, whereas they are much less common in documents about opera). We used TF-LOG weighted word frequencies on bag-of-words for each document
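
    One common reading of the TF-LOG weighting named in this snippet is log-scaled term frequency, 1 + log(tf) for nonzero counts; the report's exact formula is not given here, so that reading is an assumption, as are the toy documents.

```python
import numpy as np

docs = ["the ball the goal the crowd",
        "the aria the stage",
        "ball goal ball"]

# Build a bag-of-words count matrix over the corpus vocabulary.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

counts = np.zeros((len(docs), len(vocab)))
for row, doc in enumerate(docs):
    for w in doc.split():
        counts[row, index[w]] += 1

# Log-scale the nonzero term frequencies.
tf_log = np.zeros_like(counts)
nz = counts > 0
tf_log[nz] = 1.0 + np.log(counts[nz])
print(vocab)
print(tf_log)
```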

  16. Fast metabolite identification with Input Output Kernel Regression.

    PubMed

    Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho

    2016-06-15

    An important problem in metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output spaces and can handle structured output spaces such as the molecule space. We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated with the output kernel. The second phase is a preimage problem, consisting in mapping the predicted output feature vectors back to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Contact: celine.brouard@aalto.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  17. Fast metabolite identification with Input Output Kernel Regression

    PubMed Central

    Brouard, Céline; Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho

    2016-01-01

    Motivation: An important problem in metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output spaces and can handle structured output spaces such as the molecule space. Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated with the output kernel. The second phase is a preimage problem, consisting in mapping the predicted output feature vectors back to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods. Availability and implementation: Contact: celine.brouard@aalto.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27307628
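
    The two-phase IOKR idea can be sketched with a linear output kernel on binary fingerprints: phase one regresses from the input space into the output-kernel feature space (kernel ridge regression here), and phase two solves the preimage by scoring candidate structures against the prediction. Random vectors stand in for spectra and molecules; all parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(6)
n, d_in, d_fp = 60, 10, 16
X = rng.normal(size=(n, d_in))                     # stand-in "spectra"
W = rng.normal(size=(d_in, d_fp))
fingerprints = (X @ W > 0).astype(float)           # stand-in "fingerprints"

# Phase 1: regression into the output-kernel feature space; for a linear
# output kernel that space is the fingerprint space itself.
model = KernelRidge(kernel="rbf", gamma=0.1, alpha=1e-3).fit(X, fingerprints)

# Phase 2: preimage -- score every candidate structure against the predicted
# feature vector and keep the closest one.
pred = model.predict(X[:5])
candidates = fingerprints                          # candidate "database"
dists = ((pred[:, None, :] - candidates[None, :, :]) ** 2).sum(axis=2)
best = dists.argmin(axis=1)
print(best)
```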

  18. Sparse kernel methods for high-dimensional survival data.

    PubMed

    Evers, Ludger; Messow, Claudia-Martina

    2008-07-15

    Sparse kernel methods like support vector machines (SVM) have been applied with great success to classification and (standard) regression settings. Existing support vector classification and regression techniques however are not suitable for partly censored survival data, which are typically analysed using Cox's proportional hazards model. As the partial likelihood of the proportional hazards model only depends on the covariates through inner products, it can be 'kernelized'. The kernelized proportional hazards model however yields a solution that is dense, i.e. the solution depends on all observations. One of the key features of an SVM is that it yields a sparse solution, depending only on a small fraction of the training data. We propose two methods. One is based on a geometric idea, where-akin to support vector classification-the margin between the failed observation and the observations currently at risk is maximised. The other approach is based on obtaining a sparse model by adding observations one after another akin to the Import Vector Machine (IVM). Data examples studied suggest that both methods can outperform competing approaches. Software is available under the GNU Public License as an R package and can be obtained from the first author's website http://www.maths.bris.ac.uk/~maxle/software.html.

  19. T-wave end detection using neural networks and Support Vector Machines.

    PubMed

    Suárez-León, Alexander Alexeis; Varon, Carolina; Willems, Rik; Van Huffel, Sabine; Vázquez-Seisdedos, Carlos Román

    2018-05-01

    In this paper we propose a new approach for detecting the end of the T-wave in the electrocardiogram (ECG) using Neural Networks and Support Vector Machines. Both Multilayer Perceptron (MLP) neural networks and Fixed-Size Least-Squares Support Vector Machines (FS-LSSVM) were used as regression algorithms to determine the end of the T-wave. Different strategies for selecting the training set, such as random selection, k-means, robust clustering and maximum quadratic (Rényi) entropy, were evaluated. Individual parameters were tuned for each method during training and the results are given for the evaluation set. A comparison between MLP and FS-LSSVM approaches was performed. Finally, a fair comparison of the FS-LSSVM method with other state-of-the-art algorithms for detecting the end of the T-wave was included. The experimental results show that FS-LSSVM approaches are more suitable as regression algorithms than MLP neural networks. Despite the small training sets used, the FS-LSSVM methods outperformed the state-of-the-art techniques. FS-LSSVM can be successfully used as a T-wave end detection algorithm in ECG even with small training set sizes. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. Field applications of stand-off sensing using visible/NIR multivariate optical computing

    NASA Astrophysics Data System (ADS)

    Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.

    2001-02-01

    A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
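
    The prediction rule described here, an offset plus a scaled inner product of the measured spectrum with a fixed regression vector, can be sketched numerically; a least-squares fit on synthetic spectra stands in for the pattern-recognition step that designs the interference filter.

```python
import numpy as np

rng = np.random.default_rng(7)
n_wavelengths = 64
concentration = rng.uniform(0.0, 1.0, 40)          # property to predict
band = np.exp(-((np.arange(n_wavelengths) - 32) ** 2) / 50.0)
spectra = concentration[:, None] * band + 0.01 * rng.normal(size=(40, n_wavelengths))

# Least-squares regression vector, with an intercept column for the offset.
A = np.hstack([spectra, np.ones((40, 1))])
coef, *_ = np.linalg.lstsq(A, concentration, rcond=None)
reg_vector, offset = coef[:-1], coef[-1]

# Prediction = inner product + offset: the operation an interference
# filter implements optically in transmission or reflection.
pred = spectra @ reg_vector + offset
rmse = np.sqrt(np.mean((pred - concentration) ** 2))
print(rmse)
```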

  21. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    PubMed Central

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises grow more serious by the day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, a support vector machine, and a decision tree to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies in which fraudulent or nonfraudulent financial statements occurred between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree achieves the best classification accuracy, 85.71%. PMID:25302338

  22. A hybrid approach of stepwise regression, logistic regression, support vector machine, and decision tree for forecasting fraudulent financial statements.

    PubMed

    Chen, Suduan; Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements by enterprises grow more serious by the day, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, a support vector machine, and a decision tree to construct classification models for comparison. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies in which fraudulent or nonfraudulent financial statements occurred between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the C5.0 decision tree achieves the best classification accuracy, 85.71%.
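
    The study's pipeline shape, variable screening followed by a comparison of logistic regression, an SVM, and a decision tree, can be sketched as below. Forward feature selection stands in for stepwise regression, and the data are synthetic rather than the fraud dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=30, n_informative=4,
                           random_state=0)

# Screening step: forward selection keeps 5 of 30 variables.
selector = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                     n_features_to_select=5).fit(X, y)
X_sel = selector.transform(X)

# Comparison step on the retained variables.
models = {"logistic": LogisticRegression(max_iter=1000),
          "svm": SVC(),
          "tree": DecisionTreeClassifier(random_state=0)}
scores = {k: cross_val_score(m, X_sel, y, cv=5).mean() for k, m in models.items()}
print(scores)
```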

  3. Nonparametric methods for drought severity estimation at ungauged sites

    NASA Astrophysics Data System (ADS)

    Sadri, S.; Burn, D. H.

    2012-12-01

    The objective of frequency analysis is to estimate, for extreme events such as drought severity or duration, the relationship between the magnitude of the event and its return period at a catchment. Neural networks and other artificial intelligence approaches to function estimation and regression analysis are relatively new techniques in engineering, providing an attractive alternative to traditional statistical models. There are, however, few applications of neural networks and support vector machines in the area of severity quantile estimation for drought frequency analysis. In this paper, we compare three methods for this task: multiple linear regression, radial basis function neural networks, and least squares support vector regression (LS-SVR). The area selected for this study includes 32 catchments in the Canadian Prairies. From each catchment, drought severities are extracted and fitted to a Pearson type III distribution, which acts as the set of observed values. For each method-duration pair, we use a jackknife algorithm to produce estimated values at each site. The results from these three approaches are compared and analyzed, and it is found that LS-SVR provides the best quantile estimates and extrapolating capacity.
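
    The jackknife scheme described above can be sketched in a few lines of pure Python: each site is withheld in turn, the model is fit on the remaining sites, and an estimate is produced for the withheld site. A one-dimensional least-squares line stands in here for the regression model; the data are toy values, not from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def jackknife_predictions(xs, ys):
    """Withhold each site in turn, fit on the rest, predict the withheld site."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i+1:], ys[:i] + ys[i+1:])
        preds.append(a + b * xs[i])
    return preds

# Perfectly linear toy data: every leave-one-out prediction is exact.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
print(jackknife_predictions(xs, ys))   # → [3.0, 5.0, 7.0, 9.0, 11.0]
```

    In the paper the withheld quantity is a fitted severity quantile rather than a raw observation, but the leave-one-out bookkeeping is the same.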

  4. Estimation of diffusion coefficients from voltammetric signals by support vector and gaussian process regression

    PubMed Central

    2014-01-01

    Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
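
    The observation that plain downsampling of the high-dimensional voltammograms outperformed hand-picked peak features is easy to illustrate. The helper below is a generic sketch of uniform downsampling, not the authors' code; it assumes a target length of at least two samples.

```python
def downsample(signal, target_len):
    """Pick target_len evenly spaced samples from a long signal
    (assumes 2 <= target_len <= len(signal))."""
    n = len(signal)
    step = (n - 1) / (target_len - 1)
    return [signal[round(i * step)] for i in range(target_len)]

ramp = list(range(1000))          # stand-in for a 1000-point voltammogram
features = downsample(ramp, 5)
print(features)                   # → [0, 250, 500, 749, 999]
```

    The resulting fixed-length vector can then be fed to SVR or GPR in place of the full signal.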

  5. Support Vector Hazards Machine: A Counting Process Framework for Learning Risk Scores for Censored Outcomes.

    PubMed

    Wang, Yuanjia; Chen, Tianle; Zeng, Donglin

    2016-01-01

    Learning risk scores to predict dichotomous or continuous outcomes using machine learning approaches has been studied extensively. However, how to learn risk scores for time-to-event outcomes subject to right censoring has received little attention until recently. Existing approaches rely on inverse probability weighting or rank-based regression, which may be inefficient. In this paper, we develop a new support vector hazards machine (SVHM) approach to predict censored outcomes. Our method is based on predicting the counting process associated with the time-to-event outcomes among subjects at risk via a series of support vector machines. Introducing counting processes to represent time-to-event data leads to a connection between support vector machines in supervised learning and hazards regression in standard survival analysis. To account for different at risk populations at observed event times, a time-varying offset is used in estimating risk scores. The resulting optimization is a convex quadratic programming problem that can easily incorporate non-linearity using kernel trick. We demonstrate an interesting link from the profiled empirical risk function of SVHM to the Cox partial likelihood. We then formally show that SVHM is optimal in discriminating covariate-specific hazard function from population average hazard function, and establish the consistency and learning rate of the predicted risk using the estimated risk scores. Simulation studies show improved prediction accuracy of the event times using SVHM compared to existing machine learning methods and standard conventional approaches. Finally, we analyze two real world biomedical study data where we use clinical markers and neuroimaging biomarkers to predict age-at-onset of a disease, and demonstrate superiority of SVHM in distinguishing high risk versus low risk subjects.

  6. Mixed kernel function support vector regression for global sensitivity analysis

    NASA Astrophysics Data System (ADS)

    Cheng, Kai; Lu, Zhenzhou; Wei, Yuhao; Shi, Yan; Zhou, Yicheng

    2017-11-01

    Global sensitivity analysis (GSA) plays an important role in exploring the respective effects of input variables on an assigned output response. Among the many sensitivity measures in the literature, the Sobol indices have attracted much attention since they provide accurate information for most models. In this paper, a mixed kernel function (MKF) based support vector regression (SVR) model is employed to evaluate the Sobol indices at low computational cost. By the proposed derivation, the estimation of the Sobol indices can be obtained by post-processing the coefficients of the SVR meta-model. The MKF is constituted by the orthogonal polynomials kernel function and the Gaussian radial basis kernel function, so the MKF possesses both the global characteristic advantage of the polynomials kernel function and the local characteristic advantage of the Gaussian radial basis kernel function. The proposed approach is suitable for high-dimensional and non-linear problems. Performance of the proposed approach is validated on various analytical functions and compared with the popular polynomial chaos expansion (PCE). Results demonstrate that the proposed approach is an efficient method for global sensitivity analysis.
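
    A mixed kernel of this kind can be sketched as a convex combination of a polynomial kernel (capturing global trend) and a Gaussian RBF kernel (capturing local detail). The weight, degree and width below are illustrative choices, not the paper's settings, and the polynomial kernel here is the ordinary inhomogeneous one rather than the orthogonal-polynomials kernel the authors construct.

```python
import math

def poly_kernel(x, y, d=2):
    """Inhomogeneous polynomial kernel (1 + x.y)^d: global behaviour."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** d

def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: local behaviour."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)

def mixed_kernel(x, y, w=0.3, d=2, gamma=0.5):
    """Convex combination of the two kernels (still a valid kernel)."""
    return w * poly_kernel(x, y, d) + (1.0 - w) * rbf_kernel(x, y, gamma)

x = [1.0, 0.0]
print(mixed_kernel(x, x))   # w*(1 + x.x)^2 + (1-w)*1 = 0.3*4 + 0.7 = 1.9
```

    Because a non-negative weighted sum of kernels is itself a kernel, the combined function can be dropped into any kernel-based regressor unchanged.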

  7. Unified Heat Kernel Regression for Diffusion, Kernel Smoothing and Wavelets on Manifolds and Its Application to Mandible Growth Modeling in CT Images

    PubMed Central

    Chung, Moo K.; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K.

    2014-01-01

    We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel regression is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. Unlike many previous partial differential equation based approaches involving diffusion, our approach represents the solution of diffusion analytically, reducing numerical inaccuracy and slow convergence. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, we have applied the method in characterizing the localized growth pattern of mandible surfaces obtained in CT images from subjects between ages 0 and 20 years by regressing the length of displacement vectors with respect to the template surface. PMID:25791435

  8. Evaluation of laser cutting process with auxiliary gas pressure by soft computing approach

    NASA Astrophysics Data System (ADS)

    Lazov, Lyubomir; Nikolić, Vlastimir; Jovic, Srdjan; Milovančević, Miloš; Deneva, Heristina; Teirumenieka, Erika; Arsic, Nebojsa

    2018-06-01

    Evaluation of the optimal laser cutting parameters is very important for high cut quality. Laser cutting is a highly nonlinear process with many interacting parameters, which is the main challenge in the optimization process. Data mining is one of the most versatile methodologies that can be applied to laser cutting process optimization. The support vector regression (SVR) procedure was implemented since it is a versatile and robust technique for highly nonlinear data regression. The goal of this study was to determine the optimal laser cutting parameters that ensure robust conditions for minimizing average surface roughness. Three cutting parameters, the cutting speed, the laser power, and the assist gas pressure, were used in the investigation. A TruLaser 1030 technological system was used as the laser, with nitrogen as the assist gas in the cutting process. The prediction accuracy was very high according to the coefficient of determination (R2) and root mean square error (RMSE): R2 = 0.9975 and RMSE = 0.0337. Therefore the data mining approach can be used effectively to determine the optimal conditions of the laser cutting process.
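
    The two fit statistics quoted in the abstract, R2 and RMSE, are standard and can be computed directly. The sketch below uses invented numbers, not the study's data.

```python
import math

def r2_rmse(y_true, y_pred):
    """Coefficient of determination and root-mean-square error."""
    n = len(y_true)
    mean = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot        # 1.0 means a perfect fit
    rmse = math.sqrt(ss_res / n)      # same units as the response
    return r2, rmse

y_true = [1.0, 2.0, 3.0, 4.0]         # toy surface-roughness observations
y_pred = [1.1, 1.9, 3.2, 3.8]         # toy model predictions
r2, rmse = r2_rmse(y_true, y_pred)
print(round(r2, 4), round(rmse, 4))   # → 0.98 0.1581
```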

  9. River flow prediction using hybrid models of support vector regression with the wavelet transform, singular spectrum analysis and chaotic approach

    NASA Astrophysics Data System (ADS)

    Baydaroğlu, Özlem; Koçak, Kasım; Duran, Kemal

    2018-06-01

    Prediction of water amount that will enter the reservoirs in the following month is of vital importance especially for semi-arid countries like Turkey. Climate projections emphasize that water scarcity will be one of the serious problems in the future. This study presents a methodology for predicting river flow for the subsequent month based on the time series of observed monthly river flow with hybrid models of support vector regression (SVR). Monthly river flow over the period 1940-2012 observed for the Kızılırmak River in Turkey has been used for training the method, which then has been applied for predictions over a period of 3 years. SVR is a specific implementation of support vector machines (SVMs), which transforms the observed input data time series into a high-dimensional feature space (input matrix) by way of a kernel function and performs a linear regression in this space. SVR requires a special input matrix. The input matrix was produced by wavelet transforms (WT), singular spectrum analysis (SSA), and a chaotic approach (CA) applied to the input time series. WT convolutes the original time series into a series of wavelets, and SSA decomposes the time series into a trend, an oscillatory and a noise component by singular value decomposition. CA uses a phase space formed by trajectories, which represent the dynamics producing the time series. These three methods for producing the input matrix for the SVR proved successful, while the SVR-WT combination resulted in the highest coefficient of determination and the lowest mean absolute error.
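
    The chaotic-approach (CA) input matrix described above is a phase-space reconstruction: the scalar flow series is turned into rows of delay vectors. A minimal sketch, with an embedding dimension and delay chosen purely for illustration:

```python
def delay_embed(series, m=3, tau=1):
    """Build rows [x_t, x_{t-tau}, ..., x_{t-(m-1)*tau}] for each valid t."""
    rows = []
    start = (m - 1) * tau
    for t in range(start, len(series)):
        rows.append([series[t - k * tau] for k in range(m)])
    return rows

flow = [10, 12, 11, 13, 15, 14]     # toy monthly river-flow values
X = delay_embed(flow, m=3, tau=1)
print(X[0])                          # → [11, 12, 10]
print(len(X))                        # → 4
```

    Each row then serves as one input vector for the SVR, with the next month's flow as the target; the WT and SSA variants build the input matrix from wavelet and singular-spectrum components instead.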

  10. Least Square Regression Method for Estimating Gas Concentration in an Electronic Nose System

    PubMed Central

    Khalaf, Walaa; Pace, Calogero; Gaudioso, Manlio

    2009-01-01

    We describe an Electronic Nose (ENose) system which is able to identify the type of analyte and to estimate its concentration. The system consists of seven sensors, five of them being gas sensors (supplied with different heater voltage values), the remainder being a temperature and a humidity sensor, respectively. To identify a new analyte sample and then to estimate its concentration, we use both some machine learning techniques and the least square regression principle. In fact, we apply two different training models; the first one is based on the Support Vector Machine (SVM) approach and is aimed at teaching the system how to discriminate among different gases, while the second one uses the least squares regression approach to predict the concentration of each type of analyte. PMID:22573980
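
    The second-stage least squares step can be sketched in pure Python: fit the sensor response against known concentrations by ordinary least squares, then invert the fitted line to read off the concentration of a new sample. The numbers are toy values, not data from the ENose system.

```python
def lsq_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (intercept, slope)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

conc = [10.0, 20.0, 30.0, 40.0]          # known calibration concentrations
resp = [0.21, 0.39, 0.61, 0.79]          # measured sensor responses
a, b = lsq_line(conc, resp)

def estimate_conc(y):
    """Invert the calibration line to estimate concentration from a response."""
    return (y - a) / b

print(round(estimate_conc(0.50), 1))     # → 25.0
```

    In the actual system the SVM first identifies which analyte is present, so that the matching calibration line can be selected before inversion.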

  11. Spectroscopic Determination of Aboveground Biomass in Grasslands Using Spectral Transformations, Support Vector Machine and Partial Least Squares Regression

    PubMed Central

    Marabel, Miguel; Alvarez-Taboada, Flor

    2013-01-01

    Aboveground biomass (AGB) is one of the strategic biophysical variables of interest in vegetation studies. The main objective of this study was to evaluate the Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR) for estimating the AGB of grasslands from field spectrometer data and to find out which data pre-processing approach was the most suitable. The most accurate model to predict the total AGB involved PLSR and the Maximum Band Depth index derived from the continuum removed reflectance in the absorption features between 916–1,120 nm and 1,079–1,297 nm (R2 = 0.939, RMSE = 7.120 g/m2). Regarding the green fraction of the AGB, the Area Over the Minimum index derived from the continuum removed spectra provided the most accurate model overall (R2 = 0.939, RMSE = 3.172 g/m2). Identifying the appropriate absorption features was proved to be crucial to improve the performance of PLSR to estimate the total and green aboveground biomass, by using the indices derived from those spectral regions. Ordinary Least Square Regression could be used as a surrogate for the PLSR approach with the Area Over the Minimum index as the independent variable, although the resulting model would not be as accurate. PMID:23925082

  12. A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples.

    PubMed

    Li, Yankun; Shao, Xueguang; Cai, Wensheng

    2007-04-15

    Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on this principle, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; then, the consensus LS-SVR technique was used for building the calibration model. With an optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
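
    Consensus modeling can be shown in miniature: several member regressors are trained on random subsamples of the data and their predictions averaged, which damps the instability of any single model. A least-squares line stands in for each LS-SVR member, and the subsample fraction and member count are illustrative choices.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def consensus_predict(xs, ys, x_new, n_models=5, frac=0.8, seed=0):
    """Average the predictions of members fit on random subsamples."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    preds = []
    for _ in range(n_models):
        sub = rng.sample(idx, max(2, int(frac * len(xs))))
        a, b = fit_line([xs[i] for i in sub], [ys[i] for i in sub])
        preds.append(a + b * x_new)
    return sum(preds) / n_models

xs = [float(i) for i in range(10)]
ys = [3.0 * x + 2.0 for x in xs]          # noise-free toy calibration target
print(consensus_predict(xs, ys, 20.0))    # each member recovers y = 3x + 2
```

    With noisy data the members disagree and the averaged prediction is steadier than any single fit, which is the point of the consensus scheme.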

  13. A regression approach to the mapping of bio-physical characteristics of surface sediment using in situ and airborne hyperspectral acquisitions

    NASA Astrophysics Data System (ADS)

    Ibrahim, Elsy; Kim, Wonkook; Crawford, Melba; Monbaliu, Jaak

    2017-02-01

    Remote sensing has been successfully utilized to distinguish and quantify sediment properties in the intertidal environment. Classification approaches to imagery are popular and powerful, yet can lead to site- and case-specific results. Such specificity creates challenges for temporal studies. Thus, this paper investigates the use of regression models to quantify sediment properties instead of classifying them. Two regression approaches, namely multiple regression (MR) and support vector regression (SVR), are used in this study for the retrieval of bio-physical variables of intertidal surface sediment of the IJzermonding, a Belgian nature reserve. In the regression analysis, mud content, chlorophyll a concentration, organic matter content, and soil moisture are estimated from radiometric variables of two airborne sensors, namely the airborne hyperspectral sensor (AHS) and the airborne prism experiment (APEX), and from field hyperspectral acquisitions with an analytical spectral device (ASD). The performance of the two regression approaches is best for the estimation of moisture content. SVR attains the highest accuracy without feature reduction, while MR achieves good results when feature reduction is carried out. Sediment property maps are successfully obtained using the models and hyperspectral imagery, where SVR used with all bands achieves the best performance. The study also involves the extraction of weights identifying the contribution of each band of the images to the quantification of each sediment property when MR and principal component analysis are used.

  14. ATLS Hypovolemic Shock Classification by Prediction of Blood Loss in Rats Using Regression Models.

    PubMed

    Choi, Soo Beom; Choi, Joon Yul; Park, Jee Soo; Kim, Deok Won

    2016-07-01

    In our previous study, our input data set consisted of 78 rats, the blood loss in percent as a dependent variable, and 11 independent variables (heart rate, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, respiration rate, temperature, perfusion index, lactate concentration, shock index, and new index (lactate concentration/perfusion)). The machine learning methods for multicategory classification were applied to a rat model in acute hemorrhage to predict the four Advanced Trauma Life Support (ATLS) hypovolemic shock classes for triage in our previous study. However, multicategory classification is much more difficult and complicated than binary classification. We introduce a simple approach for classifying ATLS hypovolemic shock class by predicting blood loss in percent using support vector regression and multivariate linear regression (MLR). We also compared the performance of the classification models using absolute and relative vital signs. The accuracies of the support vector regression and MLR models with relative values by predicting blood loss in percent were 88.5% and 84.6%, respectively. These were better than the best accuracy of 80.8% of the direct multicategory classification using the support vector machine one-versus-one model in our previous study for the same validation data set. Moreover, the simple MLR models with both absolute and relative values could provide the possibility of a future clinical decision support system for ATLS classification. The perfusion index and new index were more appropriate with relative changes than absolute values.
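
    The two-stage idea in the abstract can be sketched directly: a regression model first predicts blood loss in percent, and the continuous prediction is then binned into the four ATLS classes. The thresholds below follow the commonly cited ATLS bands (under 15%, 15-30%, 30-40%, over 40%); the binning function is an illustration, not the authors' code.

```python
def atls_class(blood_loss_pct):
    """Map a predicted blood-loss percentage to an ATLS shock class (1-4),
    using the commonly cited ATLS bands."""
    if blood_loss_pct < 15:
        return 1
    if blood_loss_pct < 30:
        return 2
    if blood_loss_pct < 40:
        return 3
    return 4

# Toy regression outputs binned into classes:
print([atls_class(p) for p in (10, 20, 35, 50)])   # → [1, 2, 3, 4]
```

    Turning the multicategory problem into regression-plus-binning is what lets a single continuous model replace the harder one-versus-one multiclass scheme.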

  15. A Short-Term and High-Resolution System Load Forecasting Approach Using Support Vector Regression with Hybrid Parameters Optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Huaiguang

    This work proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system.
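
    The two-step tuning scheme can be illustrated on a toy objective: a coarse grid pass plays the role of the GTA and narrows the search to a promising cell, and a greedy local random search (a deliberate simplification standing in for PSO) refines within it. The objective surface and parameter ranges are invented for illustration.

```python
import random

def objective(c, g):
    """Toy validation-error surface with its minimum at (C, gamma) = (3, 0.5)."""
    return (c - 3.0) ** 2 + (g - 0.5) ** 2

def coarse_grid(lo_c, hi_c, lo_g, hi_g, steps=5):
    """Step 1 (GTA role): scan a coarse grid, return the best grid point."""
    best = None
    for i in range(steps):
        for j in range(steps):
            c = lo_c + (hi_c - lo_c) * i / (steps - 1)
            g = lo_g + (hi_g - lo_g) * j / (steps - 1)
            s = objective(c, g)
            if best is None or s < best[0]:
                best = (s, c, g)
    return best[1], best[2]

def local_refine(c0, g0, radius=1.0, iters=200, seed=1):
    """Step 2 (PSO stand-in): greedy random search around the grid winner."""
    rng = random.Random(seed)
    best_c, best_g, best_s = c0, g0, objective(c0, g0)
    for _ in range(iters):
        c = best_c + rng.uniform(-radius, radius)
        g = best_g + rng.uniform(-radius, radius)
        s = objective(c, g)
        if s < best_s:
            best_c, best_g, best_s = c, g, s
    return best_c, best_g

c0, g0 = coarse_grid(0.0, 10.0, 0.01, 2.0)   # global space -> local space
c, g = local_refine(c0, g0)                  # refine inside the local space
print((round(c, 1), round(g, 1)))
```

    In the real method the objective would be the SVR's cross-validated forecasting error at each candidate (C, gamma) pair rather than an analytic bowl.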

  16. Prediction of brain maturity in infants using machine-learning algorithms.

    PubMed

    Smyser, Christopher D; Dosenbach, Nico U F; Smyser, Tara A; Snyder, Abraham Z; Rogers, Cynthia E; Inder, Terrie E; Schlaggar, Bradley L; Neil, Jeffrey J

    2016-08-01

    Recent resting-state functional MRI investigations have demonstrated that much of the large-scale functional network architecture supporting motor, sensory and cognitive functions in older pediatric and adult populations is present in term- and prematurely-born infants. Application of new analytical approaches can help translate the improved understanding of early functional connectivity provided through these studies into predictive models of neurodevelopmental outcome. One approach to achieving this goal is multivariate pattern analysis, a machine-learning, pattern classification approach well-suited for high-dimensional neuroimaging data. It has previously been adapted to predict brain maturity in children and adolescents using structural and resting state-functional MRI data. In this study, we evaluated resting state-functional MRI data from 50 preterm-born infants (born at 23-29 weeks of gestation and without moderate-severe brain injury) scanned at term equivalent postmenstrual age compared with data from 50 term-born control infants studied within the first week of life. Using 214 regions of interest, binary support vector machines distinguished term from preterm infants with 84% accuracy (p<0.0001). Inter- and intra-hemispheric connections throughout the brain were important for group categorization, indicating that widespread changes in the brain's functional network architecture associated with preterm birth are detectable by term equivalent age. Support vector regression enabled quantitative estimation of birth gestational age in single subjects using only term equivalent resting state-functional MRI data, indicating that the present approach is sensitive to the degree of disruption of brain development associated with preterm birth (using gestational age as a surrogate for the extent of disruption). This suggests that support vector regression may provide a means for predicting neurodevelopmental outcome in individual infants. Copyright © 2016 Elsevier Inc. All rights reserved.

  18. RBF kernel based support vector regression to estimate the blood volume and heart rate responses during hemodialysis.

    PubMed

    Javed, Faizan; Chan, Gregory S H; Savkin, Andrey V; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-01-01

    This paper uses non-linear support vector regression (SVR) to model the blood volume and heart rate (HR) responses in 9 hemodynamically stable kidney failure patients during hemodialysis. Using radial basis function (RBF) kernels, non-parametric models of the relative blood volume (RBV) change with time, as well as of the percentage change in HR with respect to RBV, were obtained. The ε-insensitive loss function was used for SVR modeling. Selection of the design parameters, which include the capacity (C), the insensitivity region (ε) and the RBF kernel parameter (sigma), was made using a grid search approach, and the selected models were cross-validated using the average mean square error (AMSE) calculated from testing data based on a k-fold cross-validation technique. Linear regression was also applied to fit the curves and the AMSE was calculated for comparison with SVR. For the model of RBV with time, SVR gave a lower AMSE for both training (AMSE=1.5) and testing data (AMSE=1.4) compared to linear regression (AMSE=1.8 and 1.5). SVR also provided a better fit for HR with RBV for both training and testing data (AMSE=15.8 and 16.4) compared to linear regression (AMSE=25.2 and 20.1).
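
    The two ingredients named in the abstract are standard and worth spelling out: the Gaussian RBF kernel, and the ε-insensitive loss, which charges nothing for residuals smaller than ε and grows linearly beyond it. The parameter values below are illustrative, not the ones selected by the grid search in the paper.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian radial basis function kernel exp(-||x-y||^2 / (2*sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

print(eps_insensitive_loss(1.0, 1.05))   # → 0.0   (inside the tube)
print(eps_insensitive_loss(1.0, 1.30))   # ≈ 0.2   (0.3 residual minus 0.1)
```

    The tube width ε and the capacity C trade off flatness against training error, which is why the paper tunes (C, ε, sigma) jointly by grid search.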

  19. Soft computing techniques toward modeling the water supplies of Cyprus.

    PubMed

    Iliadis, L; Maris, F; Tachos, S

    2011-10-01

    This research effort aims at applying soft computing techniques to water resources management. More specifically, the target is the development of reliable soft computing models capable of estimating the water supply for the case of the "Germasogeia" mountainous watersheds in Cyprus. Initially, ε-regression support vector machine (ε-RSVM) and fuzzy weighted ε-RSVM models were developed that accept five input parameters. At the same time, reliable artificial neural networks were developed to perform the same job. The 5-fold cross-validation approach was employed in order to eliminate bad local behaviors and to produce a more representative training data set. Thus, fuzzy weighted support vector regression (SVR) combined with the fuzzy partition was employed in an effort to enhance the quality of the results. Several rational and reliable models have been produced that can enhance the efficiency of water policy designers. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Predicting complications of percutaneous coronary intervention using a novel support vector method.

    PubMed

    Lee, Gyemin; Gurm, Hitinder S; Syed, Zeeshan

    2013-01-01

    To explore the feasibility of a novel approach using an augmented one-class learning algorithm to model in-laboratory complications of percutaneous coronary intervention (PCI). Data from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2) multicenter registry for the years 2007 and 2008 (n=41 016) were used to train models to predict 13 different in-laboratory PCI complications using a novel one-plus-class support vector machine (OP-SVM) algorithm. The performance of these models in terms of discrimination and calibration was compared to the performance of models trained using the following classification algorithms on BMC2 data from 2009 (n=20 289): logistic regression (LR), one-class support vector machine classification (OC-SVM), and two-class support vector machine classification (TC-SVM). For the OP-SVM and TC-SVM approaches, variants of the algorithms with cost-sensitive weighting were also considered. The OP-SVM algorithm and its cost-sensitive variant achieved the highest area under the receiver operating characteristic curve for the majority of the PCI complications studied (eight cases). Similar improvements were observed for the Hosmer-Lemeshow χ² value (seven cases) and the mean cross-entropy error (eight cases). The OP-SVM algorithm based on an augmented one-class learning problem improved discrimination and calibration across different PCI complications relative to LR and traditional support vector machine classification. Such an approach may have value in a broader range of clinical domains.

  2. Combination Gene Therapy for Liver Metastasis of Colon Carcinoma in vivo

    NASA Astrophysics Data System (ADS)

    Chen, Shu-Hsai; Chen, X. H. Li; Wang, Yibin; Kosai, Ken-Ichiro; Finegold, Milton J.; Rich, Susan S.

    1995-03-01

    The efficacy of combination therapy with a "suicide gene" and a cytokine gene to treat metastatic colon carcinoma in the liver was investigated. Tumor in the liver was generated by intrahepatic injection of a colon carcinoma cell line (MCA-26) in syngeneic BALB/c mice. Recombinant adenoviral vectors containing various control and therapeutic genes were injected directly into the solid tumors, followed by treatment with ganciclovir. While the tumors continued to grow in all animals treated with a control vector or a mouse interleukin 2 vector, those treated with a herpes simplex virus thymidine kinase vector, with or without the coadministration of the mouse interleukin 2 vector, exhibited dramatic necrosis and regression. However, only animals treated with both vectors developed an effective systemic antitumoral immunity against challenges of tumorigenic doses of parental tumor cells inoculated at distant sites. The antitumoral immunity was associated with the presence of MCA-26 tumor-specific cytolytic CD8^+ T lymphocytes. The results suggest that combination suicide and cytokine gene therapy in vivo can be a powerful approach for treatment of metastatic colon carcinoma in the liver.

  3. Vector autoregressive models: A Gini approach

    NASA Astrophysics Data System (ADS)

    Mussard, Stéphane; Ndiaye, Oumar Hamady

    2018-02-01

    In this paper, it is proven that the usual VAR models may be performed in the Gini sense, that is, on an ℓ1 metric space. The Gini regression is robust to outliers. As a consequence, when data are contaminated by extreme values, we show that semi-parametric VAR-Gini regressions may be used to obtain robust estimators. The inference about the estimators is made with the ℓ1 norm. Also, impulse response functions and Gini decompositions for forecast errors are introduced. Finally, Granger's causality tests are properly derived based on U-statistics.

  4. Multivariate Models for Prediction of Human Skin Sensitization ...

    EPA Pesticide Factsheets

    One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays - the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT) and KeratinoSens™ assay - six physicochemical properties and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression and support vector machine, to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three logistic regression and three support vector machine) with the highest accuracy (92%) used: (1) DPRA, h-CLAT and read-across; (2) DPRA, h-CLAT, read-across and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens and log P. The models performed better at predicting human skin sensitization hazard than the murine

  5. A short-term and high-resolution distribution system load forecasting approach using support vector regression with hybrid parameters optimization

    DOE PAGES

    Jiang, Huaiguang; Zhang, Yingchen; Muljadi, Eduard; ...

    2016-01-01

    This paper proposes an approach for distribution system load forecasting, which aims to provide highly accurate short-term load forecasting with high resolution utilizing a support vector regression (SVR) based forecaster and a two-step hybrid parameters optimization method. Specifically, because the load profiles in distribution systems contain abrupt deviations, a data normalization is designed as the pretreatment for the collected historical load data. Then an SVR model is trained by the load data to forecast the future load. For better performance of SVR, a two-step hybrid optimization algorithm is proposed to determine the best parameters. In the first step of the hybrid optimization algorithm, a designed grid traverse algorithm (GTA) is used to narrow the parameters searching area from a global to local space. In the second step, based on the result of the GTA, particle swarm optimization (PSO) is used to determine the best parameters in the local parameter space. After the best parameters are determined, the SVR model is used to forecast the short-term load deviation in the distribution system. The performance of the proposed approach is compared to some classic methods in later sections of the paper.
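
    The two-step search described above (a coarse grid traverse to find a promising region, then particle swarm refinement of the SVR hyper-parameters within it) can be sketched as follows. The synthetic load series, grid values, and swarm settings are illustrative assumptions, not the paper's data or configuration:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Synthetic short-term load curve: daily cycle plus noise (a stand-in for
# the historical distribution-system load data, which is not public).
t = np.arange(600)
load = 10.0 + 3.0 * np.sin(2 * np.pi * t / 24) + rng.normal(0.0, 0.3, t.size)

LAGS = 24  # predict the next value from the previous 24 readings
X = np.column_stack([load[i:len(load) - LAGS + i] for i in range(LAGS)])
y = load[LAGS:]

def cv_score(log_C, log_gamma):
    model = SVR(C=10.0 ** log_C, gamma=10.0 ** log_gamma)
    return cross_val_score(model, X, y, cv=3, scoring="r2").mean()

# Step 1: grid traverse narrows the global search space to a local box.
grid = [(c, g) for c in (-1.0, 1.0, 3.0) for g in (-4.0, -2.0, 0.0)]
best_c, best_g = max(grid, key=lambda p: cv_score(*p))

# Step 2: particle swarm optimization inside the box around the grid winner.
n_particles, n_iters = 6, 8
lo = np.array([best_c - 1.0, best_g - 1.0])
hi = np.array([best_c + 1.0, best_g + 1.0])
pos = rng.uniform(lo, hi, size=(n_particles, 2))
vel = np.zeros_like(pos)
pbest = pos.copy()
pbest_val = np.array([cv_score(*p) for p in pos])
gbest = pbest[pbest_val.argmax()].copy()
for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, lo, hi)
    vals = np.array([cv_score(*p) for p in pos])
    improved = vals > pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmax()].copy()

final_score = cv_score(*gbest)  # cross-validated R^2 at the refined optimum
```

    The grid stage keeps the expensive swarm confined to a small log-space box, which is the efficiency argument the abstract makes.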

  6. Soft-sensing model of temperature for aluminum reduction cell on improved twin support vector regression

    NASA Astrophysics Data System (ADS)

    Li, Tao

    2018-06-01

    The complexity of the aluminum electrolysis process makes the temperature of aluminum reduction cells hard to measure directly, yet temperature is central to controlling aluminum production. To solve this problem, drawing on practice data from an aluminum plant, this paper presents a soft-sensing model of temperature for the aluminum electrolysis process based on Improved Twin Support Vector Regression (ITSVR). ITSVR avoids the slow learning speed of Support Vector Regression (SVR) and the overfitting risk of Twin Support Vector Regression (TSVR) by introducing a regularization term into the objective function of TSVR, which enforces the structural risk minimization principle at lower computational complexity. Finally, the model, with several other process parameters as auxiliary variables, predicts the temperature by ITSVR. The simulation results show that the ITSVR-based soft-sensing model is fast and generalizes well.

  7. Comparative Effects of Diet-Induced Lipid Lowering Versus Lipid Lowering Along With Apo A-I Milano Gene Therapy on Regression of Atherosclerosis.

    PubMed

    Wang, Lai; Tian, Fang; Arias, Ana; Yang, Mingjie; Sharifi, Behrooz G; Shah, Prediman K

    2016-05-01

    Apolipoprotein A-1 (Apo A-I) Milano, a naturally occurring Arg173 to Cys mutant of Apo A-1, has been shown to reduce atherosclerosis in animal models and in a small phase 2 human trial. We have shown the superior atheroprotective effects of the Apo A-I Milano (Apo A-IM) gene compared to the wild-type Apo A-I gene using transplantation of retrovirally transduced bone marrow in Apo A-I/Apo E null mice. In this study, we compared the effect of dietary lipid lowering versus lipid lowering plus Apo A-IM gene transfer using recombinant adeno-associated virus (rAAV) 8 as vectors on atherosclerosis regression in Apo A-I/Apo E null mice. All mice were fed a high-cholesterol diet from the age of 6 weeks until week 20, and at 20 weeks, 10 mice were euthanized to determine the extent of atherosclerosis. After 20 weeks, an additional 20 mice were placed on either a low-cholesterol diet plus empty rAAV (n = 10) to serve as controls or a low-cholesterol diet plus a single intravenous injection of 1.2 × 10^12 vector genomes of adeno-associated virus (AAV) 8 vectors expressing Apo A-IM (n = 10). At the 40-week time point, intravenous AAV8 Apo A-IM recipients showed a significant regression of atherosclerosis in the whole aorta (P < .01), aortic sinuses (P < .05), and brachiocephalic arteries (P < .05) compared to 20-week-old mice, whereas the low-cholesterol diet plus empty vector control group showed no significant regression in lesion size. Immunostaining showed that compared to the 20-week-old mice, there was a significantly reduced macrophage content in the brachiocephalic (P < .05) and aortic sinus plaques (P < .05) of AAV8 Apo A-IM recipients. These data show that although dietary-mediated cholesterol lowering halts progression of atherosclerosis, it does not induce regression, whereas the combination of a low-cholesterol diet and AAV8-mediated Apo A-I Milano gene therapy induces rapid and significant regression of atherosclerosis in mice. These data provide support for the potential feasibility of this approach for atherosclerosis regression. © The Author(s) 2015.

  8. A Feature-Free 30-Disease Pathological Brain Detection System by Linear Regression Classifier.

    PubMed

    Chen, Yi; Shao, Ying; Yan, Jie; Yuan, Ti-Fei; Qu, Yanwen; Lee, Elizabeth; Wang, Shuihua

    2017-01-01

    The number of Alzheimer's disease patients increases rapidly every year, and researchers increasingly use computer vision methods to develop automatic diagnosis systems. In 2015, Gorji et al. proposed a novel method using pseudo Zernike moments. They tested four classifiers: a learning vector quantization neural network and pattern recognition neural networks trained by Levenberg-Marquardt, by resilient backpropagation, and by scaled conjugate gradient. This study presents an improved method by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from the 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches, and can therefore be used to detect Alzheimer's disease. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
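
    The linear regression classification step the abstract refers to fits in a few lines: each class is represented by a matrix of training feature vectors, a test vector is regressed onto each class subspace, and the class with the smallest residual wins. The toy features below are random stand-ins for the paper's 256 pseudo Zernike features:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two classes, each a noisy cloud around its own prototype vector.
D, N_TRAIN, N_TEST = 64, 20, 30
prototypes = rng.normal(size=(2, D))

def sample(cls, n):
    return prototypes[cls] + 0.3 * rng.normal(size=(n, D))

train = [sample(c, N_TRAIN).T for c in (0, 1)]  # one D x N matrix per class

def lrc_predict(x):
    # Regress the test vector onto each class's training subspace and
    # pick the class with the smallest reconstruction residual.
    residuals = []
    for Xc in train:
        beta, *_ = np.linalg.lstsq(Xc, x, rcond=None)
        residuals.append(np.linalg.norm(x - Xc @ beta))
    return int(np.argmin(residuals))

X_test = np.vstack([sample(0, N_TEST), sample(1, N_TEST)])
y_true = np.array([0] * N_TEST + [1] * N_TEST)
y_pred = np.array([lrc_predict(x) for x in X_test])
accuracy = (y_pred == y_true).mean()
```

    No iterative training is involved; each prediction is a pair of least-squares solves, which is why the method is attractive as a drop-in classifier.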

  9. SEMIPARAMETRIC QUANTILE REGRESSION WITH HIGH-DIMENSIONAL COVARIATES

    PubMed Central

    Zhu, Liping; Huang, Mian; Li, Runze

    2012-01-01

    This paper is concerned with quantile regression for a semiparametric regression model, in which both the conditional mean and conditional variance function of the response given the covariates admit a single-index structure. This semiparametric regression model enables us to reduce the dimension of the covariates and simultaneously retains the flexibility of nonparametric regression. Under mild conditions, we show that the simple linear quantile regression offers a consistent estimate of the index parameter vector. This is a surprising and interesting result because the single-index model is possibly misspecified under the linear quantile regression. With a root-n consistent estimate of the index vector, one may employ a local polynomial regression technique to estimate the conditional quantile function. This procedure is computationally efficient, which is very appealing in high-dimensional data analysis. We show that the resulting estimator of the quantile function performs asymptotically as efficiently as if the true value of the index vector were known. The methodologies are demonstrated through comprehensive simulation studies and an application to a real dataset. PMID:24501536

  10. Rank-Optimized Logistic Matrix Regression toward Improved Matrix Data Classification.

    PubMed

    Zhang, Jianguang; Jiang, Jianmin

    2018-02-01

    While existing logistic regression suffers from overfitting and often fails in considering structural information, we propose a novel matrix-based logistic regression to overcome the weakness. In the proposed method, 2D matrices are directly used to learn two groups of parameter vectors along each dimension without vectorization, which allows the proposed method to fully exploit the underlying structural information embedded inside the 2D matrices. Further, we add a joint [Formula: see text]-norm on two parameter matrices, which are organized by aligning each group of parameter vectors in columns. This added co-regularization term has two roles: enhancing the effect of regularization and optimizing the rank during the learning process. With our proposed fast iterative solution, we carried out extensive experiments. The results show that in comparison to both the traditional tensor-based methods and the vector-based regression methods, our proposed solution achieves better performance for matrix data classifications.

  11. The effect of machine learning regression algorithms and sample size on individualized behavioral prediction with functional connectivity features.

    PubMed

    Cui, Zaixu; Gong, Gaolang

    2018-06-02

    Individualized behavioral/cognitive prediction using machine learning (ML) regression approaches is becoming increasingly applied. The specific ML regression algorithm and sample size are two key factors that non-trivially influence prediction accuracies. However, the effects of the ML regression algorithm and sample size on individualized behavioral/cognitive prediction performance have not been comprehensively assessed. To address this issue, the present study included six commonly used ML regression algorithms: ordinary least squares (OLS) regression, least absolute shrinkage and selection operator (LASSO) regression, ridge regression, elastic-net regression, linear support vector regression (LSVR), and relevance vector regression (RVR), to perform specific behavioral/cognitive predictions based on different sample sizes. Specifically, the publicly available resting-state functional MRI (rs-fMRI) dataset from the Human Connectome Project (HCP) was used, and whole-brain resting-state functional connectivity (rsFC) or rsFC strength (rsFCS) were extracted as prediction features. Twenty-five sample sizes (ranged from 20 to 700) were studied by sub-sampling from the entire HCP cohort. The analyses showed that rsFC-based LASSO regression performed remarkably worse than the other algorithms, and rsFCS-based OLS regression performed markedly worse than the other algorithms. Regardless of the algorithm and feature type, both the prediction accuracy and its stability exponentially increased with increasing sample size. The specific patterns of the observed algorithm and sample size effects were well replicated in the prediction using re-testing fMRI data, data processed by different imaging preprocessing schemes, and different behavioral/cognitive scores, thus indicating excellent robustness/generalization of the effects. 
The current findings provide critical insight into how the selected ML regression algorithm and sample size influence individualized predictions of behavior/cognition and offer important guidance for choosing the ML regression algorithm or sample size in relevant investigations. Copyright © 2018 Elsevier Inc. All rights reserved.
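
    A scaled-down version of the abstract's design, several ML regression algorithms scored at increasing sub-sample sizes, can be sketched with scikit-learn on synthetic data. Relevance vector regression has no scikit-learn implementation and is omitted; all sizes and hyper-parameters here are illustrative:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.svm import LinearSVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)

# Synthetic stand-in for the connectivity features: many predictors,
# a sparse true signal, evaluated at increasing sample sizes.
N_POOL, P = 700, 100
X_pool = rng.normal(size=(N_POOL, P))
w = np.zeros(P)
w[:10] = rng.normal(size=10)                  # only 10 informative features
y_pool = X_pool @ w + rng.normal(0.0, 1.0, N_POOL)

models = {
    "OLS": LinearRegression(),
    "LASSO": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
    "elastic-net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "LSVR": LinearSVR(C=1.0, max_iter=10000),
}

scores = {name: [] for name in models}
for n in (40, 150, 700):                      # sub-sample sizes
    idx = rng.choice(N_POOL, size=n, replace=False)
    for name, model in models.items():
        s = cross_val_score(model, X_pool[idx], y_pool[idx],
                            cv=5, scoring="r2").mean()
        scores[name].append(s)
```

    Plotting `scores` against sample size reproduces the qualitative pattern the study reports: accuracy rises steeply at first and then saturates, with algorithm rankings that depend on the feature type.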

  12. A Fast Vector Radiative Transfer Model for Atmospheric and Oceanic Remote Sensing

    NASA Astrophysics Data System (ADS)

    Ding, J.; Yang, P.; King, M. D.; Platnick, S. E.; Meyer, K.

    2017-12-01

    A fast vector radiative transfer model is developed in support of atmospheric and oceanic remote sensing. This model is capable of simulating the Stokes vector observed at the top of the atmosphere (TOA) and the terrestrial surface by considering absorption, scattering, and emission. The gas absorption is parameterized in terms of atmospheric gas concentrations, temperature, and pressure. The parameterization scheme combines a regression method and the correlated-K distribution method, and can easily integrate with multiple scattering computations. The approach is more than four orders of magnitude faster than a line-by-line radiative transfer model with errors less than 0.5% in terms of transmissivity. A two-component approach is utilized to solve the vector radiative transfer equation (VRTE). The VRTE solver separates the phase matrices of aerosol and cloud into forward and diffuse parts and thus the solution is also separated. The forward solution can be expressed by a semi-analytical equation based on the small-angle approximation, and serves as the source of the diffuse part. The diffuse part is solved by the adding-doubling method. The adding-doubling implementation is computationally efficient because the diffuse component needs much fewer spherical function expansion terms. The simulated Stokes vectors at both the TOA and the surface have accuracy comparable to counterparts based on numerically rigorous methods.

  13. The Geometry of Enhancement in Multiple Regression

    ERIC Educational Resources Information Center

    Waller, Niels G.

    2011-01-01

    In linear multiple regression, "enhancement" is said to occur when R² = b′r > r′r, where b is a p × 1 vector of standardized regression coefficients and r is a p × 1 vector of correlations between a criterion y and a set of standardized regressors, x. When p = 1 then b ≅ r and…
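
    Enhancement is easy to exhibit numerically with a classical suppressor configuration, where a predictor uncorrelated with the criterion still lifts R² = b′r above the sum of squared validities r′r:

```python
import numpy as np

# Suppressor setup: x2 is uncorrelated with the criterion y but strongly
# correlated with x1, which inflates R^2 beyond r'r.
Rxx = np.array([[1.0, 0.8],
                [0.8, 1.0]])    # predictor intercorrelation matrix
r = np.array([0.5, 0.0])        # correlations of each predictor with y

b = np.linalg.solve(Rxx, r)     # standardized regression weights
R2 = b @ r                      # R^2 = b'r
sum_sq = r @ r                  # r'r = sum of squared validities
# Here R^2 = 0.25 / 0.36 ≈ 0.694 while r'r = 0.25: enhancement.
```

    Exactly, R² = r′R⁻¹r = 0.25/(1 − 0.8²), so removing the "useless" second predictor would cut R² from about 0.69 to 0.25.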

  14. Predicting Error Bars for QSAR Models

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from recent months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and ensemble and distance based techniques for the other modelling approaches.

  15. Identification of environmental covariates of West Nile virus vector mosquito population abundance.

    PubMed

    Trawinski, Patricia R; Mackay, D Scott

    2010-06-01

    The rapid spread of West Nile virus (WNv) in North America is a major public health concern. Culex pipiens-restuans is the principal mosquito vector of WNv in the northeastern United States while Aedes vexans is an important bridge vector of the virus in this region. Vector mosquito abundance is directly dependent on physical environmental factors that provide mosquito habitats. The objective of this research is to determine landscape elements that explain the population abundance and distribution of WNv vector mosquitoes using stepwise linear regression. We developed a novel approach for examining a large set of landscape variables based on a land use and land cover classification by selecting variables in stages to minimize multicollinearity. We also investigated the distance at which landscape elements influence abundance of vector populations using buffer distances of 200, 400, and 1000 m. Results show landscape effects have a significant impact on Cx. pipiens-restuans population distribution while the effects of landscape features are less important for prediction of Ae. vexans population distributions. Cx. pipiens-restuans population abundance is positively correlated with human population density, housing unit density, and urban land use and land cover classes and negatively correlated with age of dwellings and amount of forested land.

  16. Stable Local Volatility Calibration Using Kernel Splines

    NASA Astrophysics Data System (ADS)

    Coleman, Thomas F.; Li, Yuying; Wang, Cheng

    2010-09-01

    We propose an optimization formulation using L1 norm to ensure accuracy and stability in calibrating a local volatility function for option pricing. Using a regularization parameter, the proposed objective function balances the calibration accuracy with the model complexity. Motivated by the support vector machine learning, the unknown local volatility function is represented by a kernel function generating splines and the model complexity is controlled by minimizing the 1-norm of the kernel coefficient vector. In the context of the support vector regression for function estimation based on a finite set of observations, this corresponds to minimizing the number of support vectors for predictability. We illustrate the ability of the proposed approach to reconstruct the local volatility function in a synthetic market. In addition, based on S&P 500 market index option data, we demonstrate that the calibrated local volatility surface is simple and resembles the observed implied volatility surface in shape. Stability is illustrated by calibrating local volatility functions using market option data from different dates.
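
    The 1-norm mechanism described above, penalizing kernel coefficients so that the fitted function rests on only a few "support" points, can be imitated with a plain Lasso over an RBF kernel basis. This is a hedged stand-in for the paper's kernel-spline calibration, using synthetic data rather than option quotes:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(3)

# A smooth 1D target observed with noise, mimicking a function recovered
# from a finite set of observations.
x = np.sort(rng.uniform(-3, 3, 80))[:, None]
y = np.sin(2 * x[:, 0]) + 0.05 * rng.normal(size=80)

# Kernel basis: one RBF centered at each data point.
K = rbf_kernel(x, x, gamma=1.0)

# L1 penalty on the kernel coefficient vector drives most coefficients to
# zero, so the fit is supported on few centers ("few support vectors").
lasso = Lasso(alpha=1e-3, max_iter=50000)
lasso.fit(K, y)

n_active = int(np.sum(np.abs(lasso.coef_) > 1e-8))  # surviving centers
fit = lasso.predict(K)
mse = float(np.mean((fit - y) ** 2))
```

    The regularization weight plays the role of the paper's accuracy-versus-complexity trade-off: raising `alpha` prunes more centers at the cost of fit accuracy.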

  17. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared computationally with CLR under the maximum likelihood framework solved by the expectation-maximization algorithm, as well as with multiple linear regression, artificial neural networks, and support vector machines for regression. The results demonstrate that the proposed algorithm outperforms the other methods in most locations.
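
    The alternation at the heart of clusterwise linear regression (fit one line per cluster, reassign each point to its best-fitting line, repeat) can be sketched as follows; the two synthetic regimes and the restart count are illustrative assumptions, not the paper's incremental algorithm:

```python
import numpy as np

rng = np.random.default_rng(5)

# Data generated from two latent linear regimes.
n = 200
x = rng.uniform(0, 10, n)
truth = rng.integers(0, 2, n)
y = np.where(truth == 0, 2.0 * x + 1.0, -1.5 * x + 12.0)
y = y + rng.normal(0, 0.2, n)

def fit_clr(x, y, k=2, n_iter=20):
    # Alternating scheme: refit a line per cluster, then reassign each
    # point to the line with the smallest absolute residual.
    labels = rng.integers(0, k, x.size)
    for _ in range(n_iter):
        coefs = [np.polyfit(x[labels == j], y[labels == j], 1)
                 if np.any(labels == j) else np.zeros(2) for j in range(k)]
        resid = np.stack([np.abs(y - np.polyval(c, x)) for c in coefs])
        labels = resid.argmin(axis=0)
    sse = float(np.sum(resid.min(axis=0) ** 2))
    return coefs, labels, sse

# Random restarts guard against poor local optima of the alternation.
coefs, labels, clr_sse = min((fit_clr(x, y) for _ in range(5)),
                             key=lambda r: r[2])

# Baseline: a single global regression line cannot represent both regimes.
global_sse = float(np.sum((y - np.polyval(np.polyfit(x, y, 1), x)) ** 2))
```

    The same alternation generalizes to multivariate regressors, which is the setting of the rainfall study.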

  18. An Analytical Investigation of the Robustness and Power of ANCOVA with the Presence of Heterogeneous Regression Slopes.

    ERIC Educational Resources Information Center

    Hollingsworth, Holly H.

    This study shows that the test statistic for Analysis of Covariance (ANCOVA) has a noncentral F-distribution with noncentrality parameter equal to zero if and only if the regression planes are homogeneous and/or the vector of overall covariate means is the null vector. The effect of heterogeneous regression slope parameters is to either increase…

  19. Blood glucose level prediction based on support vector regression using mobile platforms.

    PubMed

    Reymann, Maximilian P; Dorschky, Eva; Groh, Benjamin H; Martindale, Christine; Blank, Peter; Eskofier, Bjoern M

    2016-08-01

    The correct treatment of diabetes is vital to a patient's health: staying within defined blood glucose levels prevents dangerous short- and long-term effects on the body. Mobile devices informing patients about their future blood glucose levels could enable them to take counter-measures to prevent hypoglycemic or hyperglycemic periods. Previous work addressed this challenge by predicting blood glucose levels using regression models. However, these approaches required a physiological model, representing the human body's response to insulin and glucose intake, or are not directly applicable to mobile platforms (smart phones, tablets). In this paper, we propose an algorithm for mobile platforms to predict blood glucose levels without the need for a physiological model. Using an online software simulator program, we trained a Support Vector Regression (SVR) model and exported the parameter settings to our mobile platform. The prediction accuracy of our mobile platform was evaluated with pre-recorded data of a type 1 diabetes patient. The blood glucose level was predicted with an error of 19 % compared to the true value. Considering that commercially used devices are permitted an error of 15 %, our algorithm forms a basis for further development of mobile prediction algorithms.
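
    The deployment idea in the abstract, train an SVR offline and ship only its parameters to the mobile device, amounts to exporting the support vectors, dual coefficients, and intercept and re-implementing the RBF decision function by hand. A sketch on simulated glucose readings (all constants are assumptions, not the paper's settings):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(11)

# Simulated glucose trace: slow oscillation plus measurement noise.
t = np.arange(400)
glucose = 110 + 25 * np.sin(2 * np.pi * t / 90) + rng.normal(0, 2, t.size)

LAGS = 6  # the last six readings predict the next one
X = np.column_stack([glucose[i:len(glucose) - LAGS + i] for i in range(LAGS)])
y = glucose[LAGS:]

GAMMA = 0.001
svr = SVR(kernel="rbf", C=100.0, gamma=GAMMA, epsilon=1.0).fit(X, y)

# "Export": everything a mobile client needs to reproduce predictions.
sv = svr.support_vectors_
dual = svr.dual_coef_[0]
b = svr.intercept_[0]

def predict_exported(x):
    # RBF decision function computed from the exported parameters only.
    k = np.exp(-GAMMA * np.sum((sv - x) ** 2, axis=1))
    return float(dual @ k + b)

x_new = X[-1]
manual = predict_exported(x_new)
sklearn_pred = float(svr.predict(x_new[None, :])[0])
```

    Because the decision function is just a weighted kernel sum, the mobile side needs no ML library at all, only the exported arrays.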

  20. Building a computer program to support children, parents, and distraction during healthcare procedures.

    PubMed

    Hanrahan, Kirsten; McCarthy, Ann Marie; Kleiber, Charmaine; Ataman, Kaan; Street, W Nick; Zimmerman, M Bridget; Ersig, Anne L

    2012-10-01

    This secondary data analysis used data mining methods to develop predictive models of child risk for distress during a healthcare procedure. Data used came from a study that predicted factors associated with children's responses to an intravenous catheter insertion while parents provided distraction coaching. From the 255 items used in the primary study, 44 predictive items were identified through automatic feature selection and used to build support vector machine regression models. Models were validated using multiple cross-validation tests and by comparing variables identified as explanatory in the traditional versus support vector machine regression. Rule-based approaches were applied to the model outputs to identify overall risk for distress. A decision tree was then applied to evidence-based instructions for tailoring distraction to characteristics and preferences of the parent and child. The resulting decision support computer application, titled Children, Parents and Distraction, is being used in research. Future use will support practitioners in deciding the level and type of distraction intervention needed by a child undergoing a healthcare procedure.

  1. Efficient design of gain-flattened multi-pump Raman fiber amplifiers using least squares support vector regression

    NASA Astrophysics Data System (ADS)

    Chen, Jing; Qiu, Xiaojie; Yin, Cunyi; Jiang, Hao

    2018-02-01

    An efficient method to design the broadband gain-flattened Raman fiber amplifier with multiple pumps is proposed based on least squares support vector regression (LS-SVR). A multi-input multi-output LS-SVR model is introduced to replace the complicated solving process of the nonlinear coupled Raman amplification equation. The proposed approach contains two stages: offline training stage and online optimization stage. During the offline stage, the LS-SVR model is trained. Owing to the good generalization capability of LS-SVR, the net gain spectrum can be directly and accurately obtained when inputting any combination of the pump wavelength and power to the well-trained model. During the online stage, we incorporate the LS-SVR model into the particle swarm optimization algorithm to find the optimal pump configuration. The design results demonstrate that the proposed method greatly shortens the computation time and enhances the efficiency of the pump parameter optimization for Raman fiber amplifier design.
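
    Part of why LS-SVR is fast enough for the offline stage above: unlike standard SVR, it needs no quadratic program. In the Suykens-style formulation, training reduces to one linear system. A minimal sketch (the kernel width and regularization constant are assumed values, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(2)

def rbf(A, B, width=1.0):
    # Gaussian (RBF) kernel Gram matrix between row sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * width ** 2))

# Toy regression target standing in for a net-gain spectrum.
x = np.linspace(0, 2 * np.pi, 60)[:, None]
y = np.sin(x[:, 0]) + 0.02 * rng.normal(size=60)

# LS-SVR training: solve
#   [ 0    1^T          ] [ b     ]   [ 0 ]
#   [ 1    K + I/gamma  ] [ alpha ] = [ y ]
gamma = 100.0                    # regularization constant
K = rbf(x, x)
n = len(y)
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
b, alpha = sol[0], sol[1:]

def ls_svr_predict(X_new):
    # f(x) = sum_i alpha_i K(x, x_i) + b
    return rbf(X_new, x) @ alpha + b

mse = float(np.mean((ls_svr_predict(x) - y) ** 2))
```

    Once trained, evaluation is a single kernel-vector product, so embedding such a surrogate inside a particle swarm loop (as the paper does for pump optimization) is cheap.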

  2. Changes in Black-legged Tick Population in New England with Future Climate Change

    NASA Astrophysics Data System (ADS)

    Krishnan, S.; Huber, M.

    2015-12-01

    Lyme disease is one of the most frequently reported vector-borne diseases in the United States. In the Northeastern United States, vector transmission is maintained in a horizontal transmission cycle between the vector, the black-legged ticks, and the vertebrate reservoir hosts, which include white-tailed deer, rodents and other medium to large sized mammals. Predicting how vector populations change with future climate change is critical to understanding disease spread in the future, and for developing suitable regional adaptation strategies. For the United States, these predictions have mostly been made using regressions based on field and lab studies, or using spatial suitability studies. However, the relation between tick populations at various life-cycle stages and climate variables are complex, necessitating a mechanistic approach. In this study, we present a framework for driving a mechanistic tick population model with high-resolution regional climate modeling projections. The goal is to estimate changes in black-legged tick populations in New England for the 21st century. The tick population model used is based on the mechanistic approach of Ogden et al., (2005) developed for Canada. Dynamically downscaled climate projections at 3-km resolution using the Weather Research and Forecasting (WRF) model are used to drive the tick population model.

  3. Classifying machinery condition using oil samples and binary logistic regression

    NASA Astrophysics Data System (ADS)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time-consuming human elements involved in the classification of machine health. When working with industry it is important to build understanding of, and hence some trust in, the classification scheme among those who use the analysis to initiate maintenance tasks. Typical "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) offer little interpretability. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight into the drivers of the human classification process and into the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.
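
    The interpretability argument is concrete: exponentiated logistic regression coefficients are odds ratios, which an oil analyst can read directly, unlike ANN or SVM weights. A sketch with invented, standardized oil-analysis features (the feature names and effect sizes are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)

# Hypothetical standardized oil-analysis features.
n = 500
iron = rng.normal(0, 1, n)   # wear-metal (iron) concentration, standardized
visc = rng.normal(0, 1, n)   # viscosity deviation, standardized

# Simulated "not healthy" labels with a known logistic relationship.
logit = -0.5 + 1.8 * iron + 0.9 * visc
p_unhealthy = 1.0 / (1.0 + np.exp(-logit))
label = (rng.random(n) < p_unhealthy).astype(int)

X = np.column_stack([iron, visc])
clf = LogisticRegression().fit(X, label)

# Interpretation: each odds ratio is the multiplicative change in the odds
# of "not healthy" per one standard deviation of the feature.
odds_ratios = np.exp(clf.coef_[0])
accuracy = clf.score(X, label)
```

    A statement like "one standard deviation more iron multiplies the odds of an unhealthy engine by about six" is exactly the kind of transparent driver an analyst can sanity-check against experience.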

  4. Body Fat Percentage Prediction Using Intelligent Hybrid Approaches

    PubMed Central

    Shao, Yuehjen E.

    2014-01-01

    Excess of body fat often leads to obesity. Obesity is typically associated with serious medical diseases, such as cancer, heart disease, and diabetes. Accordingly, knowing the body fat is an extremely important issue since it affects everyone's health. Although there are several ways to measure the body fat percentage (BFP), the accurate methods are often associated with hassle and/or high costs. Traditional single-stage approaches may use certain body measurements or explanatory variables to predict the BFP. Diverging from existing approaches, this study proposes new intelligent hybrid approaches to obtain fewer explanatory variables, and the proposed forecasting models are able to effectively predict the BFP. The proposed hybrid models consist of multiple regression (MR), artificial neural network (ANN), multivariate adaptive regression splines (MARS), and support vector regression (SVR) techniques. The first stage of the modeling includes the use of MR and MARS to obtain fewer but more important sets of explanatory variables. In the second stage, the remaining important variables are served as inputs for the other forecasting methods. A real dataset was used to demonstrate the development of the proposed hybrid models. The prediction results revealed that the proposed hybrid schemes outperformed the typical, single-stage forecasting models. PMID:24723804
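
    The two-stage idea, screen explanatory variables with a linear model and then feed only the survivors to a nonlinear regressor, can be sketched with scikit-learn. LassoCV stands in here for the paper's MR/MARS screening stage purely to keep the sketch self-contained, and all data are synthetic:

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)

# Synthetic "body measurements": 15 candidates, only 3 truly informative.
n, p = 300, 15
X = rng.normal(size=(n, p))
y = (0.8 * X[:, 0] + 0.6 * X[:, 1] - 0.5 * X[:, 2]
     + 0.1 * X[:, 0] * X[:, 1] + rng.normal(0, 0.3, n))

# Stage 1: screen variables with a sparse linear model.
stage1 = LassoCV(cv=5).fit(X, y)
keep = np.flatnonzero(np.abs(stage1.coef_) > 1e-6)

# Stage 2: nonlinear SVR on the full vs. the reduced variable set.
score_full = cross_val_score(SVR(C=10.0), X, y,
                             cv=5, scoring="r2").mean()
score_reduced = cross_val_score(SVR(C=10.0), X[:, keep], y,
                                cv=5, scoring="r2").mean()
```

    Dropping the uninformative measurements typically helps the kernel method, which mirrors the abstract's finding that the hybrid scheme beats single-stage forecasting.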

  5. Multiple injections of electroporated autologous T cells expressing a chimeric antigen receptor mediate regression of human disseminated tumor.

    PubMed

    Zhao, Yangbing; Moon, Edmund; Carpenito, Carmine; Paulos, Chrystal M; Liu, Xiaojun; Brennan, Andrea L; Chew, Anne; Carroll, Richard G; Scholler, John; Levine, Bruce L; Albelda, Steven M; June, Carl H

    2010-11-15

    Redirecting T lymphocyte antigen specificity by gene transfer can provide large numbers of tumor-reactive T lymphocytes for adoptive immunotherapy. However, safety concerns associated with viral vector production have limited clinical application of T cells expressing chimeric antigen receptors (CAR). T lymphocytes can be gene modified by RNA electroporation without integration-associated safety concerns. To establish a safe platform for adoptive immunotherapy, we first optimized the vector backbone for RNA in vitro transcription to achieve high-level transgene expression. CAR expression and function of RNA-electroporated T cells could be detected up to a week after electroporation. Multiple injections of RNA CAR-electroporated T cells mediated regression of large vascularized flank mesothelioma tumors in NOD/scid/γc(-/-) mice. Dramatic tumor reduction also occurred when the preexisting intraperitoneal human-derived tumors, which had been growing in vivo for >50 days, were treated by multiple injections of autologous human T cells electroporated with anti-mesothelin CAR mRNA. This is the first report using matched patient tumor and lymphocytes showing that autologous T cells from cancer patients can be engineered to provide an effective therapy for a disseminated tumor in a robust preclinical model. Multiple injections of RNA-engineered T cells are a novel approach for adoptive cell transfer, providing a flexible platform for the treatment of cancer that may complement the use of retroviral and lentiviral engineered T cells. This approach may increase the therapeutic index of T cells engineered to express powerful activation domains without the associated safety concerns of integrating viral vectors. Copyright © 2010 AACR.

  6. Multiple injections of electroporated autologous T cells expressing a chimeric antigen receptor mediate regression of human disseminated tumor

    PubMed Central

    Zhao, Yangbing; Moon, Edmund; Carpenito, Carmine; Paulos, Chrystal M.; Liu, Xiaojun; Brennan, Andrea L; Chew, Anne; Carroll, Richard G.; Scholler, John; Levine, Bruce L.; Albelda, Steven M.; June, Carl H.

    2010-01-01

Redirecting T lymphocyte antigen specificity by gene transfer can provide large numbers of tumor-reactive T lymphocytes for adoptive immunotherapy. However, safety concerns associated with viral vector production have limited clinical application of T cells expressing chimeric antigen receptors (CARs). T lymphocytes can be gene modified by RNA electroporation without integration-associated safety concerns. To establish a safe platform for adoptive immunotherapy, we first optimized the vector backbone for RNA in vitro transcription to achieve high-level transgene expression. CAR expression and function of RNA-electroporated T cells could be detected up to a week post electroporation. Multiple injections of RNA CAR-electroporated T cells mediated regression of large vascularized flank mesothelioma tumors in NOD/scid/γc(−/−) mice. Dramatic tumor reduction also occurred when the pre-existing intraperitoneal human-derived tumors, which had been growing in vivo for over 50 days, were treated by multiple injections of autologous human T cells electroporated with anti-mesothelin CAR mRNA. This is the first report using matched patient tumor and lymphocytes demonstrating that autologous T cells from cancer patients can be engineered to provide an effective therapy for a disseminated tumor in a robust preclinical model. Multiple injections of RNA-engineered T cells are a novel approach for adoptive cell transfer, providing a flexible platform for the treatment of cancer that may complement the use of retroviral and lentiviral engineered T cells. This approach may increase the therapeutic index of T cells engineered to express powerful activation domains without the associated safety concerns of integrating viral vectors. PMID:20926399

  7. Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach

    NASA Astrophysics Data System (ADS)

    Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios

A software package, Unresolved Galaxy Classifier (UGC), is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide automated taxonomic classification and estimation of specific parameters by analyzing low-dispersion spectra of unresolved galaxies from the Gaia BP/RP instrument. The UGC algorithm is based on a supervised learning technique, Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for training SVM models. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of synthetic galaxy-model spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show very good classification performance for UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.

  8. Modeling Dengue vector population using remotely sensed data and machine learning.

    PubMed

    Scavuzzo, Juan M; Trucco, Francisco; Espinosa, Manuel; Tauro, Carolina B; Abril, Marcelo; Scavuzzo, Carlos M; Frery, Alejandro C

    2018-05-16

Mosquitoes are vectors of many human diseases. In particular, Aedes ægypti (Linnaeus) is the main vector for the Chikungunya, Dengue, and Zika viruses in Latin America, and it represents a global threat. Public health policies that aim at combating this vector require dependable and timely information, which is usually expensive to obtain with field campaigns. For this reason, several efforts have been made to use remote sensing due to its reduced cost. The present work includes the temporal modeling of the oviposition activity (measured weekly in 50 ovitraps in a northern Argentinean city) of Aedes ægypti (Linnaeus), based on time series of data extracted from operational earth observation satellite images. We use NDVI, NDWI, LST night, LST day, and TRMM-GPM rain from 2012 to 2016 as predictive variables. In contrast to previous works, which use linear models, we employ machine learning techniques using completely accessible open-source toolkits. These models have the advantages of being non-parametric and capable of describing nonlinear relationships between variables. Specifically, in addition to two linear approaches, we assess a support vector machine, an artificial neural network, a K-nearest neighbors regressor, and a decision tree regressor. Considerations are made on parameter tuning and the validation and training approach. The results are compared to linear models used in previous works with similar data sets for generating temporal predictive models. These new tools perform better than linear approaches; in particular, K-nearest neighbors regression (KNNR) performs best. These results provide better alternatives to be implemented operationally in the Argentine geospatial risk system that has been running since 2012. Copyright © 2018 Elsevier B.V. All rights reserved.
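K-nearest-neighbors regression, the method this study found best, can be sketched in a few lines of numpy. The features and target below are simulated stand-ins (illustrative names only), not the study's satellite series or ovitrap counts:

```python
import numpy as np

def knn_regress(X_train, y_train, x, k=3):
    # Average the targets of the k nearest training points (Euclidean distance).
    d = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(d)[:k]
    return y_train[nearest].mean()

# Simulated stand-ins for weekly environmental features (e.g. NDVI, LST, rain)
# and an oviposition signal; the real remote-sensing data are not reproduced.
rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 2]
pred = knn_regress(X, y, X[10], k=1)  # k=1 on a training point recovers its target
```

The non-parametric character the abstract highlights is visible here: the prediction is built directly from stored samples, with no fitted coefficients.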

  9. Predicting Error Bars for QSAR Models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schroeter, Timon; Technische Universitaet Berlin, Department of Computer Science, Franklinstrasse 28/29, 10587 Berlin; Schwaighofer, Anton

    2007-09-18

Unfavorable physicochemical properties often cause drug failures. It is therefore important to take lipophilicity and water solubility into account early on in lead discovery. This study presents log D7 models built using Gaussian Process regression, Support Vector Machines, decision trees and ridge regression algorithms based on 14556 drug discovery compounds of Bayer Schering Pharma. A blind test was conducted using 7013 new measurements from the last months. We also present independent evaluations using public data. Apart from accuracy, we discuss the quality of error bars that can be computed by Gaussian Process models, and ensemble- and distance-based techniques for the other modelling approaches.

  10. Face Hallucination with Linear Regression Model in Semi-Orthogonal Multilinear PCA Method

    NASA Astrophysics Data System (ADS)

    Asavaskulkiet, Krissada

    2018-04-01

In this paper, we propose a new face hallucination technique: face image reconstruction in HSV color space with a semi-orthogonal multilinear principal component analysis (SO-MPCA) method. This novel hallucination technique can operate directly on tensors via tensor-to-vector projection by imposing the orthogonality constraint in only one mode. In our experiments, we use facial images from the FERET database to test our hallucination approach, which is demonstrated by extensive experiments producing high-quality hallucinated color faces. The experimental results clearly demonstrate that we can generate photorealistic color face images by using the SO-MPCA subspace with a linear regression model.

  11. Experimental and computational prediction of glass transition temperature of drugs.

    PubMed

    Alzghoul, Ahmad; Alhalaweh, Amjad; Mahlin, Denny; Bergström, Christel A S

    2014-12-22

Glass transition temperature (Tg) is an important inherent property of an amorphous solid material which is usually determined experimentally. In this study, the relation between Tg and melting temperature (Tm) was evaluated using a data set of 71 structurally diverse druglike compounds. Further, in silico models for prediction of Tg were developed based on calculated molecular descriptors and linear (multilinear regression, partial least-squares, principal component regression) and nonlinear (neural network, support vector regression) modeling techniques. The models based on Tm predicted Tg with an RMSE of 19.5 K for the test set. Among the five computational models developed herein, support vector regression gave the best result, with an RMSE of 18.7 K for the test set using only four chemical descriptors. Hence, two different models that predict Tg of drug-like molecules with high accuracy were developed. If Tm is available, a simple linear regression can be used to predict Tg. However, the results also suggest that support vector regression and calculated molecular descriptors can predict Tg with equal accuracy, even before compound synthesis.
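The descriptor-based setup can be sketched with scikit-learn's SVR. The four "descriptors" and Tg values below are simulated placeholders (the paper's compounds, descriptors, and hyperparameters are not reproduced), so this only illustrates the modeling pattern:

```python
import numpy as np
from sklearn.svm import SVR

# Hypothetical: four calculated descriptors per compound and Tg in kelvin.
rng = np.random.default_rng(42)
descriptors = rng.uniform(size=(71, 4))
tg = 250.0 + 100.0 * descriptors[:, 0] + 30.0 * descriptors[:, 1]

model = SVR(kernel='rbf', C=1000.0, epsilon=0.5).fit(descriptors, tg)
rmse = float(np.sqrt(np.mean((model.predict(descriptors) - tg) ** 2)))
```

In practice the RMSE of interest is on a held-out test set, as reported in the abstract, not on the training compounds as in this toy fit.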

  12. Using support vector machines to identify literacy skills: Evidence from eye movements.

    PubMed

    Lou, Ya; Liu, Yanping; Kaakinen, Johanna K; Li, Xingshan

    2017-06-01

Is inferring readers' literacy skills possible by analyzing their eye movements during text reading? This study used Support Vector Machines (SVM) to analyze eye movement data from 61 undergraduate students who read a multiple-paragraph, multiple-topic expository text. Forward fixation time, first-pass rereading time, second-pass fixation time, and regression path reading time on different regions of the text were provided as features. The SVM classification algorithm assisted in distinguishing high-literacy-skilled readers from low-literacy-skilled readers with 80.3% accuracy. Results demonstrate the effectiveness of combining eye tracking and machine learning techniques to detect readers with low literacy skills, and suggest that such approaches can be potentially used in predicting other cognitive abilities.
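The classification step can be illustrated with scikit-learn's SVC. The two "reading-time" features and the skill groups below are simulated (high-skill readers drawn with shorter times), so this shows only the protocol, not the study's data or accuracy:

```python
import numpy as np
from sklearn.svm import SVC

# Illustrative only: two reading-time features (ms) per reader.
rng = np.random.default_rng(1)
fast = rng.normal([200.0, 300.0], 20.0, size=(30, 2))   # simulated high-skill readers
slow = rng.normal([350.0, 500.0], 20.0, size=(30, 2))   # simulated low-skill readers
X = np.vstack([fast, slow])
y = np.array([1] * 30 + [0] * 30)

clf = SVC(kernel='rbf', gamma='scale').fit(X, y)
accuracy = clf.score(X, y)
```

A real evaluation would report cross-validated accuracy rather than training accuracy as here.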

  13. Subpixel urban land cover estimation: comparing cubist, random forests, and support vector regression

    Treesearch

    Jeffrey T. Walton

    2008-01-01

    Three machine learning subpixel estimation methods (Cubist, Random Forests, and support vector regression) were applied to estimate urban cover. Urban forest canopy cover and impervious surface cover were estimated from Landsat-7 ETM+ imagery using a higher resolution cover map resampled to 30 m as training and reference data. Three different band combinations (...

  14. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets.

    PubMed

    Marchese Robinson, Richard L; Palczewska, Anna; Palczewski, Jan; Kidley, Nathan

    2017-08-28

    The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical programming language and the Python program HeatMapWrapper [ https://doi.org/10.5281/zenodo.495163 ] for heat map generation.

  15. Building a Computer Program to Support Children, Parents, and Distraction during Healthcare Procedures

    PubMed Central

    McCarthy, Ann Marie; Kleiber, Charmaine; Ataman, Kaan; Street, W. Nick; Zimmerman, M. Bridget; Ersig, Anne L.

    2012-01-01

    This secondary data analysis used data mining methods to develop predictive models of child risk for distress during a healthcare procedure. Data used came from a study that predicted factors associated with children’s responses to an intravenous catheter insertion while parents provided distraction coaching. From the 255 items used in the primary study, 44 predictive items were identified through automatic feature selection and used to build support vector machine regression models. Models were validated using multiple cross-validation tests and by comparing variables identified as explanatory in the traditional versus support vector machine regression. Rule-based approaches were applied to the model outputs to identify overall risk for distress. A decision tree was then applied to evidence-based instructions for tailoring distraction to characteristics and preferences of the parent and child. The resulting decision support computer application, the Children, Parents and Distraction (CPaD), is being used in research. Future use will support practitioners in deciding the level and type of distraction intervention needed by a child undergoing a healthcare procedure. PMID:22805121

  16. Evaluation of modulation transfer function of optical lens system by support vector regression methodologies - A comparative study

    NASA Astrophysics Data System (ADS)

    Petković, Dalibor; Shamshirband, Shahaboddin; Saboohi, Hadi; Ang, Tan Fong; Anuar, Nor Badrul; Rahman, Zulkanain Abdul; Pavlović, Nenad T.

    2014-07-01

The quantitative assessment of image quality is an important consideration in any type of imaging system. The modulation transfer function (MTF) is a graphical description of the sharpness and contrast of an imaging system or of its individual components; it is also known as the spatial frequency response. The MTF curve has different meanings according to the corresponding frequency. The MTF of an optical system specifies the contrast transmitted by the system as a function of image size, and is determined by the inherent optical properties of the system. In this study, the polynomial and radial basis function (RBF) kernels are applied as the kernel function of Support Vector Regression (SVR) to estimate and predict the MTF value of the actual optical system according to experimental tests. Instead of minimizing the observed training error, SVR_poly and SVR_rbf attempt to minimize the generalization error bound so as to achieve generalized performance. The experimental results show that an improvement in predictive accuracy and capability of generalization can be achieved by the SVR_rbf approach compared to the SVR_poly soft computing methodology.
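The kernel comparison can be sketched directly with scikit-learn. The MTF-like curve below is a toy exponential falloff, not the paper's lens measurements, and the hyperparameters are illustrative:

```python
import numpy as np
from sklearn.svm import SVR

# Toy MTF-like data: contrast decaying with normalized spatial frequency.
freq = np.linspace(0.0, 1.0, 80).reshape(-1, 1)
mtf = np.exp(-3.0 * freq.ravel())

svr_rbf = SVR(kernel='rbf', C=10.0, epsilon=0.01).fit(freq, mtf)
svr_poly = SVR(kernel='poly', degree=3, C=10.0, epsilon=0.01).fit(freq, mtf)

def rmse(model):
    # Root-mean-square error of the fit on the sampled curve.
    return float(np.sqrt(np.mean((model.predict(freq) - mtf) ** 2)))
```

Comparing `rmse(svr_rbf)` and `rmse(svr_poly)` mirrors the study's SVR_rbf versus SVR_poly evaluation, though on synthetic data the outcome need not match theirs.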

  17. Development of precursors recognition methods in vector signals

    NASA Astrophysics Data System (ADS)

    Kapralov, V. G.; Elagin, V. V.; Kaveeva, E. G.; Stankevich, L. A.; Dremin, M. M.; Krylov, S. V.; Borovov, A. E.; Harfush, H. A.; Sedov, K. S.

    2017-10-01

    Precursor recognition methods in vector signals of plasma diagnostics are presented. Their requirements and possible options for their development are considered. In particular, the variants of using symbolic regression for building a plasma disruption prediction system are discussed. The initial data preparation using correlation analysis and symbolic regression is discussed. Special attention is paid to the possibility of using algorithms in real time.

  18. Predicting the dissolution kinetics of silicate glasses using machine learning

    NASA Astrophysics Data System (ADS)

    Anoop Krishnan, N. M.; Mangalathu, Sujith; Smedskjaer, Morten M.; Tandia, Adama; Burton, Henry; Bauchy, Mathieu

    2018-05-01

    Predicting the dissolution rates of silicate glasses in aqueous conditions is a complex task as the underlying mechanism(s) remain poorly understood and the dissolution kinetics can depend on a large number of intrinsic and extrinsic factors. Here, we assess the potential of data-driven models based on machine learning to predict the dissolution rates of various aluminosilicate glasses exposed to a wide range of solution pH values, from acidic to caustic conditions. Four classes of machine learning methods are investigated, namely, linear regression, support vector machine regression, random forest, and artificial neural network. We observe that, although linear methods all fail to describe the dissolution kinetics, the artificial neural network approach offers excellent predictions, thanks to its inherent ability to handle non-linear data. Overall, we suggest that a more extensive use of machine learning approaches could significantly accelerate the design of novel glasses with tailored properties.
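The failure of linear models on non-linear kinetics, and the neural network's advantage, can be illustrated on a toy V-shaped rate-versus-pH curve (a simulated stand-in; the glass compositions, data, and network architecture here are not the paper's):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

# Simulated stand-in: a dissolution-rate proxy varying non-linearly with pH.
rng = np.random.default_rng(0)
ph = rng.uniform(1.0, 13.0, size=300)
rate = np.abs(ph - 7.0)                    # purely illustrative V shape
x = ((ph - 7.0) / 6.0).reshape(-1, 1)      # rescale input for the optimizer

lin = LinearRegression().fit(x, rate)
net = MLPRegressor(hidden_layer_sizes=(32,), solver='lbfgs',
                   max_iter=5000, random_state=0).fit(x, rate)
```

A straight line cannot capture the symmetric minimum (its R² stays near zero), while the small network fits it easily, which is the qualitative point the abstract makes.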

  19. Applications of Some Artificial Intelligence Methods to Satellite Soundings

    NASA Technical Reports Server (NTRS)

    Munteanu, M. J.; Jakubowicz, O.

    1985-01-01

    Hard clustering of temperature profiles and regression temperature retrievals were used to refine the method using the probabilities of membership of each pattern vector in each of the clusters derived with discriminant analysis. In hard clustering the maximum probability is taken and the corresponding cluster as the correct cluster are considered discarding the rest of the probabilities. In fuzzy partitioned clustering these probabilities are kept and the final regression retrieval is a weighted regression retrieval of several clusters. This method was used in the clustering of brightness temperatures where the purpose was to predict tropopause height. A further refinement is the division of temperature profiles into three major regions for classification purposes. The results are summarized in the tables total r.m.s. errors are displayed. An approach based on fuzzy logic which is intimately related to artificial intelligence methods is recommended.

  20. Predicting residue-wise contact orders in proteins by support vector regression.

    PubMed

    Song, Jiangning; Burrage, Kevin

    2006-10-03

The residue-wise contact order (RWCO) describes the sequence separations between a residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and a root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences.
Moreover, by incorporating global features such as molecular weight and amino acid composition, we could further improve the prediction performance, raising the CC to 0.57 and lowering the RMSE to 0.79. In addition, combining the secondary structure predicted by PSIPRED was found to significantly improve the prediction performance and yielded the best prediction accuracy, with a CC of 0.60 and an RMSE of 0.78, at least comparable with the other existing methods. The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating protein structural profiles from amino acid sequences.
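The two evaluation metrics quoted throughout this record (Pearson CC and RMSE) are easy to compute directly; the observed/predicted profiles below are simulated, not RWCO data:

```python
import numpy as np

def pearson_cc(y_true, y_pred):
    # Pearson correlation coefficient between observed and predicted values.
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def rmse(y_true, y_pred):
    # Root-mean-square error between observed and predicted values.
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Illustrative: a predictor tracking the observed profile with small noise.
rng = np.random.default_rng(7)
observed = rng.normal(size=200)
predicted = observed + rng.normal(scale=0.3, size=200)
```

Note the two metrics answer different questions: CC measures how well the predicted profile tracks the shape of the observed one, while RMSE measures the absolute deviation, which is why the paper reports both.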

  1. Forecasting Caspian Sea level changes using satellite altimetry data (June 1992-December 2013) based on evolutionary support vector regression algorithms and gene expression programming

    NASA Astrophysics Data System (ADS)

    Imani, Moslem; You, Rey-Jer; Kuo, Chung-Yen

    2014-10-01

Sea level forecasting at various time intervals is of great importance in water supply management. Evolutionary artificial intelligence (AI) approaches have been accepted as an appropriate tool for modeling complex nonlinear phenomena in water bodies. In this study, we investigated the ability of two AI techniques, the support vector machine (SVM), which is mathematically well-founded and provides new insights into function approximation, and gene expression programming (GEP), to forecast Caspian Sea level anomalies using satellite altimetry observations from June 1992 to December 2013. SVM demonstrates the best performance in predicting Caspian Sea level anomalies, given the minimum root mean square error (RMSE = 0.035) and maximum coefficient of determination (R2 = 0.96) during the prediction periods. A comparison between the proposed AI approaches and the cascade correlation neural network (CCNN) model also shows the superiority of the GEP and SVM models over the CCNN.
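One common way to cast such a series into a regression problem (an assumption here, not necessarily the paper's exact setup) is to predict the next value from a few lagged values. The series below is a synthetic seasonal signal, not the Caspian altimetry record:

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic seasonal toy signal standing in for sea level anomalies.
t = np.arange(300)
series = np.sin(2 * np.pi * t / 50.0)

# Build lag features: predict series[t] from the three previous values.
lags = 3
X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
y = series[lags:]

model = SVR(kernel='rbf', C=100.0, epsilon=0.001).fit(X[:250], y[:250])
rmse = float(np.sqrt(np.mean((model.predict(X[250:]) - y[250:]) ** 2)))
```

Holding out the final segment of the series for testing, as done here, respects the temporal ordering that a forecasting evaluation requires.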

  2. An automated ranking platform for machine learning regression models for meat spoilage prediction using multi-spectral imaging and metabolic profiling.

    PubMed

    Estelles-Lopez, Lucia; Ropodi, Athina; Pavlidis, Dimitris; Fotopoulou, Jenny; Gkousari, Christina; Peyrodie, Audrey; Panagou, Efstathios; Nychas, George-John; Mohareb, Fady

    2017-09-01

Over the past decade, analytical approaches based on vibrational spectroscopy, hyperspectral/multispectral imaging and biomimetic sensors started gaining popularity as rapid and efficient methods for assessing food quality, safety and authentication, as a sensible alternative to the expensive and time-consuming conventional microbiological techniques. Due to the multi-dimensional nature of the data generated from such analyses, the output needs to be coupled with a suitable statistical approach or machine-learning algorithms before the results can be interpreted. Choosing the optimum pattern recognition or machine learning approach for a given analytical platform is often challenging and involves a comparative analysis between various algorithms in order to achieve the best possible prediction accuracy. In this work, "MeatReg", a web-based application, is presented, able to automate the procedure of identifying the best machine learning method for comparing data from several analytical techniques, to predict the counts of microorganisms responsible for meat spoilage regardless of the packaging system applied. In particular, up to seven regression methods were applied: ordinary least squares regression, stepwise linear regression, partial least squares regression, principal component regression, support vector regression, random forest and k-nearest neighbours. "MeatReg" was tested with minced beef samples stored under aerobic and modified atmosphere packaging and analysed with electronic nose, HPLC, FT-IR, GC-MS and multispectral imaging instruments. Populations of total viable count, lactic acid bacteria, pseudomonads, Enterobacteriaceae and B. thermosphacta were predicted. As a result, recommendations were obtained of which analytical platforms are suitable to predict each type of bacteria and which machine learning methods to use in each case. The developed system is accessible via the link: www.sorfml.com. Copyright © 2017 Elsevier Ltd. 
All rights reserved.
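The ranking idea behind such a platform can be sketched generically: score several regressors by cross-validated R² on the same data and sort them. This is not MeatReg itself; the "spectral" features and target below are simulated, and the model list is a subset of the seven methods named above:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR

# Stand-in spectral features and a microbial-count-like target.
rng = np.random.default_rng(3)
X = rng.uniform(size=(120, 5))
y = 3.0 * X[:, 0] + X[:, 1] ** 2 + 0.05 * rng.normal(size=120)

models = {
    'ols': LinearRegression(),
    'knn': KNeighborsRegressor(n_neighbors=5),
    'rf': RandomForestRegressor(n_estimators=50, random_state=0),
    'svr': SVR(),
}
# Mean 5-fold cross-validated R^2 per model, then rank best-first.
scores = {name: cross_val_score(m, X, y, cv=5).mean() for name, m in models.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Automating this loop per analytical platform and per target organism is essentially what an automated ranking service does on top of the individual regressors.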

  3. Load Forecasting Based Distribution System Network Reconfiguration -- A Distributed Data-Driven Approach

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Huaiguang; Zhang, Yingchen; Muljadi, Eduard

In this paper, a short-term load forecasting based network reconfiguration approach is proposed in a parallel manner. Specifically, a support vector regression (SVR) based short-term load forecasting approach is designed to provide an accurate load prediction and benefit the network reconfiguration. Because of the nonconvexity of the three-phase balanced optimal power flow, a second-order cone program (SOCP) based approach is used to relax the optimal power flow problem. Then, the alternating direction method of multipliers (ADMM) is used to compute the optimal power flow in a distributed manner. Considering the limited number of switches and the increasing computation capability, the proposed network reconfiguration is solved in a parallel way. The numerical results demonstrate the feasibility and effectiveness of the proposed approach.

  4. TANGLE: Two-Level Support Vector Regression Approach for Protein Backbone Torsion Angle Prediction from Primary Sequences

    PubMed Central

    Song, Jiangning; Tan, Hao; Wang, Mingjun; Webb, Geoffrey I.; Akutsu, Tatsuya

    2012-01-01

Protein backbone torsion angles Phi and Psi are the rotation angles around the Cα-N bond (Phi) and the Cα-C bond (Psi). Due to the planarity of the linked rigid peptide bonds, these two angles can essentially determine the backbone geometry of proteins. Accordingly, accurate prediction of protein backbone torsion angles from sequence information can assist the prediction of protein structures. In this study, we develop a new approach called TANGLE (Torsion ANGLE predictor) to predict the protein backbone torsion angles from amino acid sequences. TANGLE uses a two-level support vector regression approach to perform real-value torsion angle prediction using a variety of features derived from amino acid sequences, including the evolutionary profiles in the form of position-specific scoring matrices, predicted secondary structure, solvent accessibility and natively disordered regions, as well as other global sequence features. When evaluated on a large benchmark dataset of 1,526 non-homologous proteins, the mean absolute errors (MAEs) of the Phi and Psi angle predictions are 27.8° and 44.6°, respectively, which are 1% and 3% lower than those of one of the state-of-the-art prediction tools, ANGLOR. Moreover, the prediction of TANGLE is significantly better than that of a random predictor built on an amino acid-specific basis, with p-values < 1.46e-147 and 7.97e-150, respectively, by the Wilcoxon signed rank test. As a complementary approach to current torsion angle prediction algorithms, TANGLE should prove useful in predicting protein structural properties and assisting protein fold recognition by applying the predicted torsion angles as useful restraints. TANGLE is freely accessible at http://sunflower.kuicr.kyoto-u.ac.jp/~sjn/TANGLE/. PMID:22319565
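A generic two-level regression idea, only loosely modeled on what "two-level SVR" suggests (TANGLE's actual architecture is not reproduced here), is to feed the first-level prediction back in as an extra feature for a second-level model. Features and "angles" below are simulated:

```python
import numpy as np
from sklearn.svm import SVR

# Stand-in sequence-derived features and a torsion-angle-like target.
rng = np.random.default_rng(4)
X = rng.uniform(size=(200, 6))
phi = 60.0 * X[:, 0] - 40.0 * X[:, 1] ** 2 + rng.normal(scale=2.0, size=200)

# Level 1: initial estimate from the raw features.
level1 = SVR(kernel='rbf', C=100.0).fit(X, phi)
# Level 2: refine using the level-1 output as an additional feature.
X2 = np.column_stack([X, level1.predict(X)])
level2 = SVR(kernel='rbf', C=100.0).fit(X2, phi)

mae = float(np.mean(np.abs(level2.predict(X2) - phi)))
```

The MAE computed here corresponds to the evaluation metric quoted in the abstract, although a real benchmark would measure it on held-out proteins.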

  5. Using Data Mining for Wine Quality Assessment

    NASA Astrophysics Data System (ADS)

    Cortez, Paulo; Teixeira, Juliana; Cerdeira, António; Almeida, Fernando; Matos, Telmo; Reis, José

Certification and quality assessment are crucial issues within the wine industry. Currently, wine quality is mostly assessed by physicochemical (e.g., alcohol levels) and sensory (e.g., human expert evaluation) tests. In this paper, we propose a data mining approach to predict wine preferences that is based on easily available analytical tests at the certification step. A large dataset is considered, with white vinho verde samples from the Minho region of Portugal. Wine quality is modeled under a regression approach, which preserves the order of the grades. Explanatory knowledge is given in terms of a sensitivity analysis, which measures the response changes when a given input variable is varied through its domain. Three regression techniques were applied, under a computationally efficient procedure that performs simultaneous variable and model selection and that is guided by the sensitivity analysis. The support vector machine achieved promising results, outperforming the multiple regression and neural network methods. Such a model is useful for understanding how physicochemical tests affect sensory preferences. Moreover, it can support wine expert evaluations and ultimately improve the production.
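The one-dimensional sensitivity analysis described above can be sketched as: vary one input across its observed range, hold the others at their medians, and record the spread of the model's response. The wine data below are simulated, not the vinho verde dataset:

```python
import numpy as np
from sklearn.svm import SVR

# Stand-in physicochemical tests and an illustrative taste score.
rng = np.random.default_rng(9)
X = rng.uniform(size=(150, 4))
quality = 5.0 + 2.0 * X[:, 0] - X[:, 2]

model = SVR(kernel='rbf', C=10.0).fit(X, quality)

def sensitivity(model, X, var, n=20):
    # Sweep one variable over its range with the others fixed at their medians.
    grid = np.linspace(X[:, var].min(), X[:, var].max(), n)
    probe = np.tile(np.median(X, axis=0), (n, 1))
    probe[:, var] = grid
    response = model.predict(probe)
    return float(response.max() - response.min())
```

An influential input (index 0 here, by construction) produces a wide response range, while an irrelevant one (index 3) produces a narrow one, which is the explanatory signal the paper extracts.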

  6. Discrimination and characterization of strawberry juice based on electronic nose and tongue: comparison of different juice processing approaches by LDA, PLSR, RF, and SVM.

    PubMed

    Qiu, Shanshan; Wang, Jun; Gao, Liping

    2014-07-09

An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juice based on processing approach (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solids, total acid, and sugar/acid ratio) were measured by traditional methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and machine learning methods (Random Forest (RF) and Support Vector Machines (SVM)) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose did, and their simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF showed outstanding and indisputable performance in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.

  7. Comparison of machine-learning methods for above-ground biomass estimation based on Landsat imagery

    NASA Astrophysics Data System (ADS)

    Wu, Chaofan; Shen, Huanhuan; Shen, Aihua; Deng, Jinsong; Gan, Muye; Zhu, Jinxia; Xu, Hongwei; Wang, Ke

    2016-07-01

    Biomass is one significant biophysical parameter of a forest ecosystem, and accurate biomass estimation on the regional scale provides important information for carbon-cycle investigation and sustainable forest management. In this study, Landsat satellite imagery data combined with field-based measurements were integrated through comparisons of five regression approaches [stepwise linear regression, K-nearest neighbor, support vector regression, random forest (RF), and stochastic gradient boosting] with two different candidate variable strategies to implement the optimal spatial above-ground biomass (AGB) estimation. The results suggested that RF algorithm exhibited the best performance by 10-fold cross-validation with respect to R2 (0.63) and root-mean-square error (26.44 ton/ha). Consequently, the map of estimated AGB was generated with a mean value of 89.34 ton/ha in northwestern Zhejiang Province, China, with a similar pattern to the distribution mode of local forest species. This research indicates that machine-learning approaches associated with Landsat imagery provide an economical way for biomass estimation. Moreover, ensemble methods using all candidate variables, especially for Landsat images, provide an alternative for regional biomass simulation.
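The winning setup, a random forest scored by 10-fold cross-validation, can be sketched as follows. The "band" predictors and biomass values are simulated stand-ins, not the Landsat imagery or the Zhejiang field plots:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

# Stand-in spectral bands and an above-ground-biomass-like target (ton/ha),
# including an interaction term a linear model would miss.
rng = np.random.default_rng(11)
bands = rng.uniform(size=(200, 6))
agb = 120.0 * bands[:, 0] * bands[:, 1] + 40.0 * bands[:, 2]

rf = RandomForestRegressor(n_estimators=100, random_state=0)
cv_r2 = float(cross_val_score(rf, bands, agb, cv=10).mean())
```

Reporting the mean cross-validated R² (and, in the paper, RMSE) over the folds is what makes the comparison across the five regression approaches fair.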

  8. A comparative study of machine learning models for ethnicity classification

    NASA Astrophysics Data System (ADS)

    Trivedi, Advait; Bessie Amali, D. Geraldine

    2017-11-01

    This paper endeavours to adopt a machine learning approach to solve the problem of ethnicity recognition. Ethnicity identification is an important vision problem whose use cases extend to various domains. Despite the complexity involved, ethnicity identification comes naturally to humans. This meta-information can be leveraged to make several decisions, be it in target marketing or security. With the recent development of intelligent systems, a sub-module that efficiently captures ethnicity would be useful in several use cases. Several attempts to identify an ideal learning model to represent a multi-ethnic dataset have been recorded. A comparative study of classifiers such as support vector machines and logistic regression is documented. Experimental results indicate that the logistic regression classifier provides more accurate classification than the support vector machine.

  9. fRMSDPred: Predicting Local RMSD Between Structural Fragments Using Sequence Information

    DTIC Science & Technology

    2007-04-04

    machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel

  10. Prediction of hourly PM2.5 using a space-time support vector regression model

    NASA Astrophysics Data System (ADS)

    Yang, Wentao; Deng, Min; Xu, Feng; Wang, Hang

    2018-05-01

    Real-time air quality prediction has been an active field of research in atmospheric environmental science. Existing machine learning methods are widely used to predict pollutant concentrations because of their ability to handle complex non-linear relationships. However, because pollutant concentration data, as typical geospatial data, also exhibit spatial heterogeneity and spatial dependence, they may violate the independent-and-identically-distributed assumption underlying most machine learning methods. As a result, a space-time support vector regression model is proposed to predict hourly PM2.5 concentrations. First, to address spatial heterogeneity, spatial clustering is executed to divide the study area into several homogeneous or quasi-homogeneous subareas. To handle spatial dependence, a Gauss vector weight function is then developed to determine spatial autocorrelation variables as part of the input features. Finally, a local support vector regression model with spatial autocorrelation variables is established for each subarea. Experimental data on PM2.5 concentrations in Beijing are used to verify whether the results of the proposed model are superior to those of other methods.
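    The two-stage idea (cluster into subareas, then fit one local SVR per subarea) can be sketched as below. The station coordinates and PM2.5-like targets are synthetic assumptions, and a plain Gaussian distance-weighted neighbor average stands in for the paper's Gauss vector weight function:

```python
# Space-time SVR sketch: spatial clustering + local SVR per subarea.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVR

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(300, 2))            # station locations
base = rng.uniform(20, 80, size=300)                   # local features
y = base + 0.1 * coords[:, 0] + rng.normal(0, 2, 300)  # synthetic PM2.5

# 1) spatial heterogeneity: partition the study area into subareas
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(coords)

# 2) spatial dependence: distance-weighted neighbor average as extra feature
d = np.linalg.norm(coords[:, None] - coords[None], axis=-1)
w = np.exp(-((d / 10.0) ** 2))
np.fill_diagonal(w, 0.0)
autocorr = w @ y / w.sum(axis=1)

X = np.column_stack([base, autocorr])

# 3) one local SVR per subarea
local_models = {k: SVR(C=10.0).fit(X[labels == k], y[labels == k])
                for k in np.unique(labels)}
preds = np.empty_like(y)
for k, m in local_models.items():
    preds[labels == k] = m.predict(X[labels == k])
```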

  11. Cross-modal face recognition using multi-matcher face scores

    NASA Astrophysics Data System (ADS)

    Zheng, Yufeng; Blasch, Erik

    2015-05-01

    The performance of face recognition can be improved using information fusion of multimodal images and/or multiple algorithms. When multimodal face images are available, cross-modal recognition is meaningful for security and surveillance applications. For example, a probe face may be a thermal image (especially at nighttime), while only visible face images are available in the gallery database. Matching a thermal probe face to the visible gallery faces requires cross-modal matching approaches. A few such studies have been implemented in facial feature space with medium recognition performance. In this paper, we propose a cross-modal recognition approach, where multimodal faces are cross-matched in feature space and the recognition performance is enhanced with stereo fusion at the image, feature and/or score level. In the proposed scenario, there are two cameras for stereo imaging, two face imagers (visible and thermal) in each camera, and three recognition algorithms (circular Gaussian filter, face pattern byte, linear discriminant analysis). A score vector is formed with the three cross-matched face scores from the aforementioned algorithms. A classifier (e.g., k-nearest neighbor, support vector machine, binomial logistic regression [BLR]) is trained and then tested with the score vectors using 10-fold cross-validation. The proposed approach was validated with a multispectral stereo face dataset from 105 subjects. Our experiments show very promising results: ACR (accuracy rate) = 97.84%, FAR (false accept rate) = 0.84% when cross-matching the fused thermal faces onto the fused visible faces using three face scores and the BLR classifier.

  12. Aeromagnetic gradient compensation method for helicopter based on ɛ-support vector regression algorithm

    NASA Astrophysics Data System (ADS)

    Wu, Peilin; Zhang, Qunying; Fei, Chunjiao; Fang, Guangyou

    2017-04-01

    Aeromagnetic gradients are typically measured by optically pumped magnetometers mounted on an aircraft. Any aircraft, particularly a helicopter, produces significant levels of magnetic interference. Therefore, aeromagnetic compensation is essential, and least squares (LS) is the conventional method used for reducing interference levels. However, the LS approach to solving the aeromagnetic interference model has a few difficulties, one of which is in handling multicollinearity. Therefore, we propose an aeromagnetic gradient compensation method, specifically targeted for helicopter use but applicable to any airborne platform, based on the ɛ-support vector regression algorithm. The structural risk minimization criterion intrinsic to the method avoids multicollinearity altogether. Local aeromagnetic anomalies can be retained, and platform-generated fields suppressed simultaneously, by constructing an appropriate loss function and kernel function. The method was tested using an unmanned helicopter and obtained improvement ratios of 12.7 and 3.5 in the vertical and horizontal gradient data, respectively; both values compare favorably with those reported for the conventional method. The validity of the proposed method is demonstrated by the experimental results.

  13. Applying machine-learning techniques to Twitter data for automatic hazard-event classification.

    NASA Astrophysics Data System (ADS)

    Filgueira, R.; Bee, E. J.; Diaz-Doce, D.; Poole, J., Sr.; Singh, A.

    2017-12-01

    The constant flow of information offered by tweets provides valuable information about all sorts of events at high temporal and spatial resolution. Over the past year we have been analyzing geological hazards/phenomena in real time, such as earthquakes, volcanic eruptions, landslides, floods or the aurora, as part of the GeoSocial project, by geo-locating tweets filtered by keywords in a web map. However, not all the filtered tweets are related to hazard/phenomenon events. This work explores two classification techniques for automatic hazard-event categorization based on tweets about the "Aurora". First, tweets were filtered using aurora-related keywords, removing stop words and selecting the ones written in English. For classifying the remainder into "aurora-event" or "no-aurora-event" categories, we compared two state-of-the-art techniques: Support Vector Machine (SVM) and Deep Convolutional Neural Network (CNN) algorithms. Both approaches belong to the family of supervised learning algorithms, which make predictions based on a labelled training dataset. Therefore, we created a training dataset by tagging 1200 tweets with the two categories. The general form of SVM separates two classes by a function (kernel). We compared the performance of four different algorithms (Linear Regression, Logistic Regression, Multinomial Naïve Bayes and Stochastic Gradient Descent) provided by the Scikit-Learn library, using our training dataset to build the classifier. The results showed that Logistic Regression (LR) achieved the best accuracy (87%), so we selected the SVM-LR classifier to categorise a large collection of tweets using the "dispel4py" framework. Later, we developed a CNN classifier, where the first layer embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors. Results from the convolutional layer are max-pooled into a long feature vector, which is classified using a softmax layer.
    The CNN's accuracy is lower (83%) than that of the SVM-LR, since the algorithm needs a bigger training dataset to increase its accuracy. We used the TensorFlow framework to apply the CNN classifier to the same collection of tweets. In future work we will modify both classifiers to work with other geo-hazards, use larger training datasets and apply them in real time.
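    A toy sketch of the supervised text-classification pipeline described above: TF-IDF features fed to a linear classifier from scikit-learn. The handful of "tweets" and their labels are invented for illustration, not the project's 1200-tweet training set:

```python
# Binary "aurora-event" vs "no-aurora-event" classifier on a tiny mock corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "amazing aurora borealis over the lake tonight",
    "green northern lights dancing in the sky",
    "aurora watch alert strong geomagnetic storm",
    "new aurora nail polish just arrived in store",
    "listening to the band aurora on repeat",
    "my cat is named aurora and she is asleep",
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = aurora-event, 0 = no-aurora-event

# TF-IDF vectorizer + logistic regression, mirroring the SVM-LR setup
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(tweets, labels)
pred = clf.predict(["huge geomagnetic storm aurora visible tonight"])[0]
```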

  14. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    NASA Astrophysics Data System (ADS)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    MANOVA and MANCOVA are the extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. The assumption of a Gaussian distribution is replaced with the multivariate Gaussian distribution for the vector data and residual terms in the statistical models of these techniques. The objective of MANCOVA is to determine whether there are statistically reliable mean differences between groups after adjusting for the covariates. When randomized assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and homogeneity of the regression coefficient vectors is also tested.

  15. Emotion-independent face recognition

    NASA Astrophysics Data System (ADS)

    De Silva, Liyanage C.; Esther, Kho G. P.

    2000-12-01

    Current face recognition techniques tend to work well when recognizing faces under small variations in lighting, facial expression and pose, but deteriorate under more extreme conditions. In this paper, a face recognition system to recognize faces of known individuals, despite variations in facial expression due to different emotions, is developed. The eigenface approach is used for feature extraction. Classification methods include Euclidean distance, a back-propagation neural network and a generalized regression neural network. These methods yield 100% recognition accuracy when the training database is representative, containing one image of the peak expression for each emotion of each person in addition to the neutral expression. All feature vectors of the training set must be used for comparison in the Euclidean distance method and for training the neural networks. These results are obtained for a face database consisting of only four persons.

  16. Determinants of Health Service Responsiveness in Community-Based Vector Surveillance for Chagas Disease in Guatemala, El Salvador, and Honduras.

    PubMed

    Hashimoto, Ken; Zúniga, Concepción; Romero, Eduardo; Morales, Zoraida; Maguire, James H

    2015-01-01

    Central American countries face a major challenge in the control of Triatoma dimidiata, a widespread vector of Chagas disease that cannot be eliminated. The key to maintaining the risk of transmission of Trypanosoma cruzi at lowest levels is to sustain surveillance throughout endemic areas. Guatemala, El Salvador, and Honduras integrated community-based vector surveillance into local health systems. Community participation was effective in detection of the vector, but some health services had difficulty sustaining their response to reports of vectors from the population. To date, no research has investigated how best to maintain and reinforce health service responsiveness, especially in resource-limited settings. We reviewed surveillance and response records of 12 health centers in Guatemala, El Salvador, and Honduras from 2008 to 2012 and analyzed the data in relation to the volume of reports of vector infestation, local geography, demography, human resources, managerial approach, and results of interviews with health workers. Health service responsiveness was defined as the percentage of households that reported vector infestation for which the local health service provided indoor residual spraying of insecticide or educational advice. Eight potential determinants of responsiveness were evaluated by linear and mixed-effects multi-linear regression. Health service responsiveness (overall 77.4%) was significantly associated with quarterly monitoring by departmental health offices. Other potential determinants of responsiveness were not found to be significant, partly because of short- and long-term strategies, such as temporary adjustments in manpower and redistribution of tasks among local participants in the effort. Consistent monitoring within the local health system contributes to sustainability of health service responsiveness in community-based vector surveillance of Chagas disease. 
Even with limited resources, countries can improve health service responsiveness with thoughtful strategies and management practices in the local health systems.

  17. Modeling and Predicting the Electrical Conductivity of Composite Cathode for Solid Oxide Fuel Cell by Using Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Tang, J. L.; Cai, C. Z.; Xiao, T. T.; Huang, S. J.

    2012-07-01

    The electrical conductivity of the solid oxide fuel cell (SOFC) cathode is one of the most important indices affecting the efficiency of an SOFC. To improve the performance of the fuel cell system, it is advantageous to have an accurate model with which the electrical conductivity can be predicted. In this paper, a model utilizing the support vector regression (SVR) approach combined with the particle swarm optimization (PSO) algorithm for parameter optimization was established to model and predict the electrical conductivity of the Ba0.5Sr0.5Co0.8Fe0.2O3-δ-xSm0.5Sr0.5CoO3-δ (BSCF-xSSC) composite cathode as a function of two factors: operating temperature (T) and SSC content (x) in the BSCF-xSSC composite cathode. The leave-one-out cross-validation (LOOCV) test strongly supports the generalization ability of the SVR model. The absolute percentage error (APE) of 27 samples does not exceed 0.05%, the mean absolute percentage error (MAPE) of all 30 samples is only 0.09%, and the correlation coefficient (R2) is as high as 0.999. This investigation suggests that the hybrid PSO-SVR approach is not only a promising and practical methodology for simulating the properties of a fuel cell system, but also a powerful tool for optimally designing or controlling the operating process of an SOFC system.
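    A sketch of SVR hyper-parameter tuning under leave-one-out validation. A simple random search stands in for the paper's PSO optimizer, and the conductivity data (temperature T and SSC content x as inputs) are synthetic:

```python
# Tune SVR (C, gamma) by LOOCV; random search replaces PSO for brevity.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(0)
T = rng.uniform(500, 900, 30)          # operating temperature (synthetic)
x = rng.uniform(0.0, 0.5, 30)          # SSC content (synthetic)
sigma = 50 + 0.05 * T + 40 * x + rng.normal(0, 1, 30)  # mock conductivity
X = np.column_stack([T, x])

best_score, best_params = -np.inf, None
for _ in range(30):                    # candidate points in parameter space
    params = {"C": 10 ** rng.uniform(-1, 3), "gamma": 10 ** rng.uniform(-4, 0)}
    score = cross_val_score(SVR(**params), X, sigma,
                            cv=LeaveOneOut(),
                            scoring="neg_mean_absolute_error").mean()
    if score > best_score:
        best_score, best_params = score, params
```

A real PSO run would move a swarm of such candidate points toward the best LOOCV score instead of sampling them independently.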

  18. Can Selforganizing Maps Accurately Predict Photometric Redshifts?

    NASA Technical Reports Server (NTRS)

    Way, Michael J.; Klose, Christian

    2012-01-01

    We present an unsupervised machine-learning approach that can be employed for estimating photometric redshifts. The proposed method is based on a vector-quantization technique known as the self-organizing map (SOM). A variety of photometrically derived input values were utilized from the Sloan Digital Sky Survey's main galaxy sample, luminous red galaxy, and quasar samples, along with the PHAT0 data set from the Photo-z Accuracy Testing project. Regression results obtained with this new approach were evaluated in terms of root-mean-square error (RMSE) to estimate the accuracy of the photometric redshift estimates. The results demonstrate competitive RMSE and outlier percentages when compared with several other popular approaches, such as artificial neural networks and Gaussian process regression. SOM RMSE results (using delta(z) = z(sub phot) - z(sub spec)) are 0.023 for the main galaxy sample, 0.027 for the luminous red galaxy sample, 0.418 for quasars, and 0.022 for PHAT0 synthetic data. The results demonstrate that there are nonunique solutions for estimating SOM RMSEs. Further research is needed in order to find more robust estimation techniques using SOMs, but the results herein are a positive indication of their capabilities when compared with other well-known methods.
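    A toy NumPy sketch of SOM-based redshift regression: train a small self-organizing map on mock photometric colors, then label each map node with the mean "spectroscopic" redshift of the training objects it captures. All data and SOM settings are illustrative assumptions, not the paper's survey samples:

```python
# Minimal 5x5 SOM used as a regressor via per-node mean redshifts.
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(500, 4))           # mock photometric colors
z = X @ np.array([0.5, 0.3, 0.1, 0.1])         # mock spectroscopic redshift

n_nodes = 25
W = rng.uniform(0, 1, size=(n_nodes, 4))       # codebook (5x5 grid, flattened)
grid = np.stack(np.meshgrid(np.arange(5), np.arange(5)), -1).reshape(-1, 2)

for epoch in range(20):                        # shrinking neighborhood updates
    lr = 0.5 * (1 - epoch / 20)
    radius = 2.5 * (1 - epoch / 20) + 0.5
    for xi in X:
        bmu = np.argmin(((W - xi) ** 2).sum(1))           # best-matching unit
        h = np.exp(-((grid - grid[bmu]) ** 2).sum(1) / (2 * radius ** 2))
        W += lr * h[:, None] * (xi - W)

# assign each node the mean redshift of the samples it wins
bmus = np.argmin(((X[:, None] - W[None]) ** 2).sum(-1), axis=1)
node_z = np.array([z[bmus == k].mean() if (bmus == k).any() else z.mean()
                   for k in range(n_nodes)])

z_pred = node_z[bmus]                          # "photometric" estimates
rmse = np.sqrt(((z_pred - z) ** 2).mean())
```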

  19. Predicting disulfide connectivity from protein sequence using multiple sequence feature vectors and secondary structure.

    PubMed

    Song, Jiangning; Yuan, Zheng; Tan, Hao; Huber, Thomas; Burrage, Kevin

    2007-12-01

    Disulfide bonds are primary covalent crosslinks between two cysteine residues in proteins that play critical roles in stabilizing protein structures and are commonly found in extracytoplasmic or secreted proteins. In protein folding prediction, localization of the disulfide bonds can greatly reduce the search of conformational space. Therefore, there is a great need to develop computational methods capable of accurately predicting disulfide connectivity patterns in proteins, which could have potentially important applications. We have developed a novel method to predict disulfide connectivity patterns from the protein primary sequence, using a support vector regression (SVR) approach based on multiple sequence feature vectors and secondary structure predicted by the PSIPRED program. The results indicate that our method achieves prediction accuracies of 74.4% and 77.9%, measured at the protein and cysteine-pair levels respectively, when averaged over proteins with two to five disulfide bridges using 4-fold cross-validation on a well-defined non-homologous dataset. We assessed the effects of different sequence encoding schemes on the prediction performance of disulfide connectivity. It has been shown that a sequence encoding scheme based on multiple sequence feature vectors coupled with predicted secondary structure can significantly improve the prediction accuracy, thus enabling our method to outperform most other currently available predictors. Our work provides a complementary approach to the current algorithms that should be useful in computationally assigning disulfide connectivity patterns and helps in the annotation of protein sequences generated by large-scale whole-genome projects. The prediction web server and Supplementary Material are accessible at http://foo.maths.uq.edu.au/~huber/disulfide

  20. Multi-fidelity Gaussian process regression for prediction of random fields

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Parussini, L.; Venturi, D., E-mail: venturi@ucsc.edu; Perdikaris, P.

    We propose a new multi-fidelity Gaussian process regression (GPR) approach for prediction of random fields based on observations of surrogate models or hierarchies of surrogate models. Our method builds upon recent work on recursive Bayesian techniques, in particular recursive co-kriging, and extends it to vector-valued fields and various types of covariances, including separable and non-separable ones. The framework we propose is general and can be used to perform uncertainty propagation and quantification in model-based simulations, multi-fidelity data fusion, and surrogate-based optimization. We demonstrate the effectiveness of the proposed recursive GPR techniques through various examples. Specifically, we study the stochastic Burgers equation and the stochastic Oberbeck–Boussinesq equations describing natural convection within a square enclosure. In both cases we find that the standard deviation of the Gaussian predictors as well as the absolute errors relative to benchmark stochastic solutions are very small, suggesting that the proposed multi-fidelity GPR approaches can yield highly accurate results.
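    A minimal two-level sketch of the recursive co-kriging idea: fit a GP to cheap low-fidelity samples, then a second GP to the discrepancy between the few high-fidelity samples and the low-fidelity predictor. The toy test functions are assumptions, not the paper's stochastic PDE examples:

```python
# Two-fidelity GPR: low-fidelity GP + GP on the high-fidelity residual.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

f_lo = lambda x: np.sin(8 * x)              # cheap surrogate model
f_hi = lambda x: np.sin(8 * x) + 0.3 * x    # expensive "truth"

X_lo = np.linspace(0, 1, 25)[:, None]       # many low-fidelity samples
X_hi = np.linspace(0, 1, 6)[:, None]        # few high-fidelity samples

gp_lo = GaussianProcessRegressor(RBF(0.2)).fit(X_lo, f_lo(X_lo.ravel()))
resid = f_hi(X_hi.ravel()) - gp_lo.predict(X_hi)
gp_d = GaussianProcessRegressor(RBF(0.5)).fit(X_hi, resid)

X_test = np.linspace(0, 1, 101)[:, None]
y_mf = gp_lo.predict(X_test) + gp_d.predict(X_test)  # multi-fidelity predictor
err = np.abs(y_mf - f_hi(X_test.ravel())).max()
```

The full recursive scheme also propagates the predictive variances between levels; this sketch keeps only the mean recursion.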

  1. Assessing Principal Component Regression Prediction of Neurochemicals Detected with Fast-Scan Cyclic Voltammetry

    PubMed Central

    2011-01-01

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook’s distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards. PMID:21966586

  2. Assessing principal component regression prediction of neurochemicals detected with fast-scan cyclic voltammetry.

    PubMed

    Keithley, Richard B; Wightman, R Mark

    2011-06-07

    Principal component regression is a multivariate data analysis approach routinely used to predict neurochemical concentrations from in vivo fast-scan cyclic voltammetry measurements. This mathematical procedure can rapidly be employed with present day computer programming languages. Here, we evaluate several methods that can be used to evaluate and improve multivariate concentration determination. The cyclic voltammetric representation of the calculated regression vector is shown to be a valuable tool in determining whether the calculated multivariate model is chemically appropriate. The use of Cook's distance successfully identified outliers contained within in vivo fast-scan cyclic voltammetry training sets. This work also presents the first direct interpretation of a residual color plot and demonstrated the effect of peak shifts on predicted dopamine concentrations. Finally, separate analyses of smaller increments of a single continuous measurement could not be concatenated without substantial error in the predicted neurochemical concentrations due to electrode drift. Taken together, these tools allow for the construction of more robust multivariate calibration models and provide the first approach to assess the predictive ability of a procedure that is inherently impossible to validate because of the lack of in vivo standards.
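    The calibration procedure described in these two records can be sketched as follows: project training voltammograms onto a few principal components, regress known concentrations on the scores, then apply the model to an unknown. The mock voltammograms and concentrations are synthetic stand-ins for real fast-scan cyclic voltammetry data:

```python
# Principal component regression on synthetic cyclic voltammograms.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
potentials = np.linspace(-0.4, 1.3, 100)
peak = np.exp(-((potentials - 0.6) / 0.1) ** 2)   # mock dopamine signature

conc = rng.uniform(0.1, 1.0, 40)                  # training concentrations (uM)
V = conc[:, None] * peak + rng.normal(0, 0.01, (40, 100))

pca = PCA(n_components=3).fit(V)
scores = pca.transform(V)
reg = LinearRegression().fit(scores, conc)        # regression in PC space

unknown = 0.7 * peak + rng.normal(0, 0.01, 100)   # "in vivo" measurement
estimate = reg.predict(pca.transform(unknown[None]))[0]
```

Back-projecting `reg.coef_` through `pca.components_` gives the cyclic voltammetric representation of the regression vector that the abstract recommends inspecting.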

  3. Exploring the capabilities of support vector machines in detecting silent data corruptions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo

    As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based on different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.

  4. Exploring the capabilities of support vector machines in detecting silent data corruptions

    DOE PAGES

    Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo; ...

    2018-02-01

    As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. Here in this paper, we explore a set of novel SDC detectors – by leveraging epsilon-insensitive support vector machine regression – to detect SDCs that occur in HPC applications. The key contributions are threefold. (1) Our exploration takes temporal, spatial, and spatiotemporal features into account and analyzes different detectors based on different features. (2) We provide an in-depth study on the detection ability and performance with different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that support-vector-machine-based detectors can achieve detection sensitivity (i.e., recall) up to 99% yet suffer a less than 1% false positive rate for most cases. Our detectors incur low performance overhead, 5% on average, for all benchmarks studied in this work.
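    A temporal-feature detector in the spirit of these two records can be sketched as: predict each value of an HPC variable from its recent window with epsilon-insensitive SVR, and flag samples whose residual exceeds a detection range. The smooth series, window length, threshold, and injected corruption are all synthetic assumptions:

```python
# Epsilon-SVR silent-data-corruption detector on a synthetic time series.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
t = np.arange(400)
series = np.sin(0.05 * t) + rng.normal(0, 0.01, t.size)  # smooth HPC variable

w = 4                                     # temporal window length
X = np.array([series[i - w:i] for i in range(w, t.size)])
y = series[w:]

svr = SVR(epsilon=0.02, C=10.0).fit(X, y)
threshold = 0.1                           # "detection range" (tuned in paper)

corrupted = series.copy()
corrupted[300] += 1.0                     # inject a bit-flip-like silent error
Xc = np.array([corrupted[i - w:i] for i in range(w, t.size)])
flags = np.abs(svr.predict(Xc) - corrupted[w:]) > threshold
detected = np.where(flags)[0] + w         # flagged time indices
```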

  5. Method for enhanced accuracy in predicting peptides using liquid separations or chromatography

    DOEpatents

    Kangas, Lars J.; Auberry, Kenneth J.; Anderson, Gordon A.; Smith, Richard D.

    2006-11-14

    A method for predicting the elution time of a peptide in chromatographic and electrophoretic separations by first providing a data set of known elution times of known peptides, then creating a plurality of vectors, each vector having a plurality of dimensions, and each dimension representing the elution time of amino acids present in each of these known peptides from the data set. The elution time of any peptide is then predicted by first creating a vector that assigns dimensional values for the elution time of the amino acids of at least one hypothetical peptide, and then calculating a predicted elution time for the vector by performing a multivariate regression of the dimensional values of the hypothetical peptide using the dimensional values of the known peptides. Preferably, the multivariate regression is accomplished by the use of an artificial neural network, and the elution times are first normalized using a transfer function.
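    The core step of the patented idea can be sketched as below: encode each peptide as a vector over the amino-acid dimensions and regress observed elution times on those dimensions. Plain multivariate linear regression stands in for the patent's neural network, and all peptide data are invented for illustration:

```python
# Amino-acid composition vectors + multivariate regression of elution times.
import numpy as np

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def composition(peptide):
    # 20-dimensional count vector over the standard amino acids
    return np.array([peptide.count(a) for a in AMINO], float)

rng = np.random.default_rng(0)
contrib = rng.uniform(0.5, 3.0, 20)               # per-residue retention (mock)
peptides = ["".join(rng.choice(list(AMINO), rng.integers(6, 15)))
            for _ in range(60)]
X = np.array([composition(p) for p in peptides])
times = X @ contrib + rng.normal(0, 0.1, 60)      # "known" elution times

coef, *_ = np.linalg.lstsq(X, times, rcond=None)  # multivariate regression
predicted = composition("ACDEFGHIK") @ coef       # hypothetical new peptide
```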

  6. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    PubMed Central

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-01-01

    We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effect of characteristic flavonoids on acrylamide formation in a low-moisture Maillard reaction system. Results demonstrated reduction/promotion effects by flavonoids at addition levels of 1–10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and were well predicted by triple ΔTEAC measurements via SVR models (R: 0.633–0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones, as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were also well predicted using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926–0.994). Compared to artificial neural network and multiple linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR-descriptor-dependent prediction. These observations demonstrate that SVR models are competent predictive tools and advance our understanding of the future use of natural antioxidants for decreasing acrylamide formation. PMID:27586851

  7. Support vector regression-guided unravelling: antioxidant capacity and quantitative structure-activity relationship predict reduction and promotion effects of flavonoids on acrylamide formation

    NASA Astrophysics Data System (ADS)

    Huang, Mengmeng; Wei, Yan; Wang, Jun; Zhang, Yu

    2016-09-01

    We used the support vector regression (SVR) approach to predict and unravel the reduction/promotion effect of characteristic flavonoids on acrylamide formation in a low-moisture Maillard reaction system. Results demonstrated reduction/promotion effects by flavonoids at addition levels of 1-10000 μmol/L. The maximal inhibition rates (51.7%, 68.8% and 26.1%) and promotion rates (57.7%, 178.8% and 27.5%) caused by flavones, flavonols and isoflavones were observed at addition levels of 100 μmol/L and 10000 μmol/L, respectively. The reduction/promotion effects were closely related to the change of trolox equivalent antioxidant capacity (ΔTEAC) and were well predicted by triple ΔTEAC measurements via SVR models (R: 0.633-0.900). Flavonols exhibit stronger effects on acrylamide formation than flavones and isoflavones, as well as their O-glycoside derivatives, which may be attributed to the number and position of phenolic and 3-enolic hydroxyls. The reduction/promotion effects were also well predicted using optimized quantitative structure-activity relationship (QSAR) descriptors and SVR models (R: 0.926-0.994). Compared to artificial neural network and multiple linear regression models, SVR models exhibited better fitting performance for both TEAC-dependent and QSAR-descriptor-dependent prediction. These observations demonstrate that SVR models are competent predictive tools and advance our understanding of the future use of natural antioxidants for decreasing acrylamide formation.

  8. Simulation of groundwater level variations using wavelet combined with neural network, linear regression and support vector machine

    NASA Astrophysics Data System (ADS)

    Ebrahimi, Hadi; Rajaee, Taher

    2017-01-01

    Simulation of groundwater level (GWL) fluctuations is an important task in the management of groundwater resources. In this study, the effect of wavelet analysis on the training of artificial neural network (ANN), multiple linear regression (MLR) and support vector regression (SVR) approaches was investigated, and the ANN, MLR and SVR along with the wavelet-ANN (WNN), wavelet-MLR (WLR) and wavelet-SVR (WSVR) models were compared in simulating GWL one month ahead. The only variable used to develop the models was the monthly GWL data recorded over a period of 11 years from two wells in the Qom plain, Iran. The results showed that decomposing the GWL time series into several sub-time series greatly improved the training of the models. For both wells 1 and 2, the Meyer and Db5 wavelets produced better results than the other wavelets, which indicates that wavelet types behave similarly in similar case studies. The optimal number of delays was 6 months, which seems to be due to natural phenomena. The best WNN model, using the Meyer mother wavelet with two decomposition levels, simulated one-month-ahead GWL with RMSE values of 0.069 m and 0.154 m for wells 1 and 2, respectively. The RMSE values for the WLR model were 0.058 m and 0.111 m, and for the WSVR model were 0.136 m and 0.060 m for wells 1 and 2, respectively.

  9. A low cost implementation of multi-parameter patient monitor using intersection kernel support vector machine classifier

    NASA Astrophysics Data System (ADS)

    Mohan, Dhanya; Kumar, C. Santhosh

    2016-03-01

    Predicting the physiological condition (normal/abnormal) of a patient is highly desirable for enhancing the quality of health care. Multi-parameter patient monitors (MPMs) using heart rate, arterial blood pressure, respiration rate and oxygen saturation (SpO2) as input parameters were developed to monitor the condition of patients with minimum human resource utilization. The support vector machine (SVM), an advanced machine learning approach popularly used for classification and regression, is used for the realization of MPMs. To make MPMs cost effective, we experimented with a hardware implementation of the MPM using an SVM classifier. The training of the system is done in the MATLAB environment, and the detection of the alarm/no-alarm condition is implemented in hardware. We used different kernels for SVM classification and note that the best performance was obtained using the intersection kernel SVM (IKSVM). The IKSVM-based MPM outperformed the best known MPM using a radial basis function kernel by an absolute improvement of 2.74% in accuracy, 1.86% in sensitivity and 3.01% in specificity. The hardware model was developed from the improved system using the Verilog hardware description language and was implemented on an Altera Cyclone II development board.
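
    The intersection kernel above can be plugged into a standard SVM library through a precomputed Gram matrix. A minimal sketch, assuming scikit-learn and synthetic stand-in features (not the authors' vital-sign data or hardware pipeline):

```python
# Histogram intersection kernel SVM via scikit-learn's precomputed-kernel
# interface; the two synthetic clusters stand in for normal/abnormal patients.
import numpy as np
from sklearn.svm import SVC

def intersection_kernel(A, B):
    # K(a, b) = sum_i min(a_i, b_i), defined for nonnegative feature vectors
    return np.array([[np.minimum(a, b).sum() for b in B] for a in A])

rng = np.random.default_rng(2)
X = np.vstack([rng.uniform(0.4, 0.6, (40, 4)),    # "normal" patients
               rng.uniform(0.7, 0.9, (40, 4))])   # "abnormal" patients
y = np.array([0] * 40 + [1] * 40)

clf = SVC(kernel="precomputed", C=1.0)
clf.fit(intersection_kernel(X, X), y)             # train on the Gram matrix

X_test = np.vstack([rng.uniform(0.4, 0.6, (10, 4)), rng.uniform(0.7, 0.9, (10, 4))])
y_test = np.array([0] * 10 + [1] * 10)
acc = (clf.predict(intersection_kernel(X_test, X)) == y_test).mean()
print(f"test accuracy: {acc:.2f}")
```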

  10. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    PubMed

    Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

    2014-01-01

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, were established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into a separate regression problem for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered putative evidence of a regulatory link between the two genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rank-wise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that it outperforms all individual methods in general, although a single method can perform better on a specific network. An implementation of NIMEFI has been made publicly available.
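
    The subsampling idea described above can be sketched as follows: any feature-ranking method (absolute ridge coefficients serve as a simple stand-in here, not one of the paper's rankers) is turned into an ensemble importance score by averaging ranks over random subsamples:

```python
# Minimal sketch of casting a feature ranker into an ensemble importance
# score by averaging its ranks over random subsamples of the data.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(3)
n, p = 200, 10
X = rng.normal(size=(n, p))
y = 3 * X[:, 0] + 2 * X[:, 4] + rng.normal(size=n)    # features 0 and 4 matter

rank_sum = np.zeros(p)
runs = 50
for _ in range(runs):
    idx = rng.choice(n, size=n // 2, replace=False)   # random subsample
    coef = np.abs(Ridge(alpha=1.0).fit(X[idx], y[idx]).coef_)
    ranks = np.argsort(np.argsort(-coef))             # rank 0 = most important
    rank_sum += ranks
avg_rank = rank_sum / runs                            # lower = more important
print("two best-ranked features:", np.argsort(avg_rank)[:2])
```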

  11. Risk Zone Modelling and Early Warning System for Visceral Leishmaniasis Kala-Azar Disease in Bihar, India Using Remote Sensing and GIS

    NASA Astrophysics Data System (ADS)

    Jeyaram, A.; Kesari, S.; Bajpai, A.; Bhunia, G. S.; Krishna Murthy, Y. V. N.

    2012-07-01

    Visceral leishmaniasis (VL), commonly known as Kala-azar, is one of the most neglected tropical diseases, affecting approximately 200 million of the poorest people at risk in 109 districts of three endemic countries, namely Bangladesh, India and Nepal, at different levels. The disease is caused by the protozoan parasite Leishmania donovani and transmitted by female Phlebotomus argentipes sand flies. Analysis of the disease dynamics indicates periodicity at seasonal and inter-annual temporal scales, which forms the basis for the development of an advanced early warning system. The highly endemic Vaishali district, Bihar, India was taken as the study area for model development. A systematic study of geo-environmental parameters derived from satellite data, in conjunction with ground intelligence, enabled modelling of the infectious disease and of risk villages. High-resolution Indian satellite data from IRS LISS IV (multi-spectral) and Cartosat-1 (Pan) have been used for studying environmental risk parameters, viz. peri-domestic vegetation, dwelling condition, wetland ecosystem, cropping pattern, Normalised Difference Vegetation Index (NDVI) and detailed land use, towards risk assessment. Univariate analysis of the relationship between vector density and various land cover categories and climatic variables suggested that all the variables are significantly correlated. Using the variables significantly correlated with vector density, a seasonal multivariate regression model was built incorporating geo-environmental parameters, climate variables and seasonal time-series disease parameters. Linear and non-linear models were applied at periodic and inter-annual temporal scales to predict man-hour density (MHD), and an 'out-of-fit' data set was used to validate the model with reasonable accuracy. To improve the MHD predictive approach, a fuzzy model has also been incorporated in the GIS environment, combining spatial geo-environmental and climatic variables using fuzzy membership logic. Based on the perceived importance of the geo-environmental parameters assigned by an epidemiology expert, a combined fuzzy membership has been calculated, which indicates the predicted vector density in each village. A γ factor has been introduced to amplify values on the higher side and suppress those on the lower side, which facilitated prioritisation of the villages. This approach not only predicts vector density but also prioritises the villages for effective control measures. A software package for modelling the risk villages, integrating the multivariate regression and fuzzy membership analysis models, has been developed to estimate MHD (vector density) as part of the early warning system.

  12. Uncertainty Management for Diagnostics and Prognostics of Batteries using Bayesian Techniques

    NASA Technical Reports Server (NTRS)

    Saha, Bhaskar; Goebel, Kai

    2007-01-01

    Uncertainty management has always been a key hurdle faced by diagnostics and prognostics algorithms. A Bayesian treatment of this problem provides an elegant and theoretically sound approach to the modern Condition-Based Maintenance (CBM)/Prognostic Health Management (PHM) paradigm. The application of Bayesian techniques to regression and classification in the form of the Relevance Vector Machine (RVM), and to state estimation as in Particle Filters (PF), provides a powerful tool to integrate the diagnosis and prognosis of battery health. The RVM, a Bayesian treatment of the Support Vector Machine (SVM), is used for model identification, while the PF framework uses the learnt model, statistical estimates of noise and anticipated operational conditions to provide estimates of remaining useful life (RUL) in the form of a probability density function (PDF). This type of prognostics adds significant value to the management of any operation involving electrical systems.

  13. Density-based penalty parameter optimization on C-SVM.

    PubMed

    Liu, Yun; Lian, Jie; Bartolacci, Michael R; Zeng, Qing-An

    2014-01-01

    The support vector machine (SVM) is one of the most widely used approaches for data classification and regression. SVM maximizes the margin between the positive and negative support vectors, which neglects instances remote from the decision interface. To avoid a position change of the SVM interface caused by an outlier, C-SVM was introduced to decrease the influence of such outliers. Traditional C-SVM holds a uniform penalty parameter C for both positive and negative instances; however, depending on the class proportions and the data distribution, positive and negative instances should be given different weights in the penalty parameter of the error terms. Therefore, in this paper, we propose density-based penalty parameter optimization of C-SVM. The experimental results indicate that our proposed algorithm achieves outstanding performance with respect to both precision and recall.
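
    A minimal sketch of per-class penalty parameters, assuming scikit-learn: its class_weight argument multiplies C per class, so the two classes are penalized differently. The paper derives these weights from data density; the fixed 9:1 weighting below is an illustrative assumption:

```python
# Per-class penalty parameters in C-SVM: class_weight scales C for each class,
# here boosting the minority (positive) class on an imbalanced toy problem.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(1.0, 1.0, (20, 2)),     # minority (positive) class
               rng.normal(-1.0, 1.0, (180, 2))])  # majority (negative) class
y = np.array([1] * 20 + [0] * 180)

uniform = SVC(kernel="rbf", C=1.0).fit(X, y)
weighted = SVC(kernel="rbf", C=1.0, class_weight={1: 9.0, 0: 1.0}).fit(X, y)

X_test = np.vstack([rng.normal(1.0, 1.0, (50, 2)), rng.normal(-1.0, 1.0, (50, 2))])
y_test = np.array([1] * 50 + [0] * 50)
rec = {}
for name, clf in [("uniform C", uniform), ("weighted C", weighted)]:
    pred = clf.predict(X_test)
    rec[name] = (pred[y_test == 1] == 1).mean()   # positive-class recall
    print(f"{name}: positive-class recall = {rec[name]:.2f}")
```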

  14. A new feature constituting approach to detection of vocal fold pathology

    NASA Astrophysics Data System (ADS)

    Hariharan, M.; Polat, Kemal; Yaacob, Sazali

    2014-08-01

    In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be excellent and reliable tools for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means-clustering-based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases are used: the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database. Four different supervised classifiers, namely k-nearest neighbour (k-NN), least-squares support vector machine, probabilistic neural network and general regression neural network, are employed to test the proposed features. The experimental results show that the proposed features achieve a very promising classification accuracy of 100% on both the MEEI database and the MAPACI speech pathology database.

  15. Distributed collaborative probabilistic design for turbine blade-tip radial running clearance using support vector machine of regression

    NASA Astrophysics Data System (ADS)

    Fei, Cheng-Wei; Bai, Guang-Chen

    2014-12-01

    To improve the computational precision and efficiency of probabilistic design for mechanical dynamic assemblies such as the blade-tip radial running clearance (BTRRC) of a gas turbine, a distributed collaborative probabilistic design method based on support vector regression (called DCSRM) is proposed by integrating the distributed collaborative response surface method with a support vector regression model. The mathematical model of DCSRM is established and its probabilistic design idea is introduced. The dynamic assembly probabilistic design of an aeroengine high-pressure turbine (HPT) BTRRC is carried out to verify the proposed DCSRM. The analysis results show that the optimal static blade-tip clearance of the HPT is obtained for designing the BTRRC, improving the performance and reliability of the aeroengine. A comparison of methods shows that DCSRM offers both high computational accuracy and high computational efficiency in BTRRC probabilistic analysis. The present research offers an effective way for the reliability design of mechanical dynamic assemblies and enriches mechanical reliability theory and methods.

  16. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.

    PubMed

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect residual serial correlation in least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of the Durbin-Watson statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then, by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 Chinese regions. The results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
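
    A small numeric sketch of the statistic described above, under illustrative assumptions: with a standardized residual vector z and a row-normalized spatial weight matrix W, an autocorrelation coefficient analogous to Moran's index can be computed as I = zᵀWz / zᵀz. The inverse-distance weights and synthetic residuals below are stand-ins, not the paper's exact construction:

```python
# Moran-style autocorrelation coefficient of regression residuals over a
# spatial sample, with inverse-distance, row-normalized spatial weights.
import numpy as np

rng = np.random.default_rng(5)
n = 29                                    # e.g. 29 regions, as in the case study
coords = rng.uniform(0, 10, size=(n, 2))

# inverse-distance spatial weights, zero diagonal, row-normalized
d = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)
W = np.where(d > 0, 1.0 / (d + np.eye(n)), 0.0)   # eye avoids divide-by-zero
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

resid = rng.normal(size=n)                # stand-in for regression residuals
z = (resid - resid.mean()) / resid.std()  # standardized residual vector
I = z @ W @ z / (z @ z)
print(f"spatial autocorrelation of residuals: I = {I:.3f}")
```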

  17. A Code Generation Approach for Auto-Vectorization in the Spade Compiler

    NASA Astrophysics Data System (ADS)

    Wang, Huayong; Andrade, Henrique; Gedik, Buğra; Wu, Kun-Lung

    We describe an auto-vectorization approach for the Spade stream processing programming language, comprising two ideas. First, we provide support for vectors as a primitive data type. Second, we provide a C++ library with architecture-specific implementations of a large number of pre-vectorized operations as the means to support language extensions. We evaluate our approach with several stream processing operators, contrasting Spade's auto-vectorization with the native auto-vectorization provided by the GNU gcc and Intel icc compilers.

  18. A modified temporal criterion to meta-optimize the extended Kalman filter for land cover classification of remotely sensed time series

    NASA Astrophysics Data System (ADS)

    Salmon, B. P.; Kleynhans, W.; Olivier, J. C.; van den Bergh, F.; Wessels, K. J.

    2018-05-01

    Humans are transforming land cover at an ever-increasing rate. Accurate geographical maps of land cover, especially of rural and urban settlements, are essential to planning sustainable development. Time series extracted from MODerate resolution Imaging Spectroradiometer (MODIS) land surface reflectance products have been used to differentiate land cover classes by analyzing the seasonal patterns in reflectance values. Properly fitting a parametric model to these time series usually requires several adjustments to the regression method; to reduce the workload, the parameters of the regression method are commonly set globally for an entire geographical area. In this work we modify a meta-optimization approach so that the parameters of the regression method are set on a per-time-series basis. The standard deviation of the model parameters and the magnitude of the residuals are used as the scoring function. We successfully fitted a triply modulated model to the seasonal patterns of our study area using a non-linear extended Kalman filter (EKF). The approach uses temporal information, which significantly reduces the processing time and storage requirements for each time series, and it derives reliability metrics for each time series individually. The features extracted using the proposed method are classified with a support vector machine, and the performance of the method is compared to the original approach on our ground truth data.

  19. On approaches to analyze the sensitivity of simulated hydrologic fluxes to model parameters in the community land model

    DOE PAGES

    Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...

    2015-12-04

    Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase, multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field-observed values. Four sensitivity analysis (SA) approaches are investigated: analysis of variance based on the generalized linear model, generalized cross-validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on the support vector machine. Results suggest that these approaches give consistent measurements of the impacts of the major hydrologic parameters on the response variables, but differ in the relative contributions, particularly for the secondary parameters. The convergence behavior of the SA with respect to the number of sampling points is also examined for different combinations of input parameter sets, output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for calibrating the Community Land Model parameters to improve its simulations of land surface fluxes, and approximates the magnitudes by which parameter values should be adjusted during parametric model optimization.
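
    One of the four SA approaches named above, standardized regression coefficients (SRC), can be sketched on synthetic data: inputs and output are standardized, and each linear-regression coefficient's magnitude then measures that parameter's influence (the data are assumptions, not CLM output):

```python
# Standardized regression coefficients (SRC) as a simple global sensitivity
# measure: fit a linear model on standardized inputs/output and compare |coef|.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(6)
n, p = 500, 5
X = rng.uniform(size=(n, p))              # e.g. sampled hydrologic parameters
y = 5 * X[:, 0] + 1 * X[:, 2] + 0.1 * rng.normal(size=n)   # synthetic response

Xs = (X - X.mean(0)) / X.std(0)           # standardize inputs and output
ys = (y - y.mean()) / y.std()
src = LinearRegression().fit(Xs, ys).coef_
for i, c in enumerate(src):
    print(f"parameter {i}: SRC = {c:+.3f}")
```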

  20. Multivariate models for prediction of human skin sensitization hazard.

    PubMed

    Strickland, Judy; Zang, Qingda; Paris, Michael; Lehmann, David M; Allen, David; Choksi, Neepa; Matheson, Joanna; Jacobs, Abigail; Casey, Warren; Kleinstreuer, Nicole

    2017-03-01

    One of the Interagency Coordinating Committee on the Validation of Alternative Methods' (ICCVAM) top priorities is the development and evaluation of non-animal approaches to identify potential skin sensitizers. The complexity of biological events necessary to produce skin sensitization suggests that no single alternative method will replace the currently accepted animal tests. ICCVAM is evaluating an integrated approach to testing and assessment based on the adverse outcome pathway for skin sensitization that uses machine learning approaches to predict human skin sensitization hazard. We combined data from three in chemico or in vitro assays - the direct peptide reactivity assay (DPRA), human cell line activation test (h-CLAT) and KeratinoSens™ assay - six physicochemical properties and an in silico read-across prediction of skin sensitization hazard into 12 variable groups. The variable groups were evaluated using two machine learning approaches, logistic regression and support vector machine, to predict human skin sensitization hazard. Models were trained on 72 substances and tested on an external set of 24 substances. The six models (three logistic regression and three support vector machine) with the highest accuracy (92%) used: (1) DPRA, h-CLAT and read-across; (2) DPRA, h-CLAT, read-across and KeratinoSens; or (3) DPRA, h-CLAT, read-across, KeratinoSens and log P. The models performed better at predicting human skin sensitization hazard than the murine local lymph node assay (accuracy 88%), any of the alternative methods alone (accuracy 63-79%) or test batteries combining data from the individual methods (accuracy 75%). These results suggest that computational methods are promising tools to effectively identify potential human skin sensitizers without animal testing. Published 2016. This article has been contributed to by US Government employees and their work is in the public domain in the USA.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tang, Kunkun, E-mail: ktg@illinois.edu; Inria Bordeaux – Sud-Ouest, Team Cardamom, 200 avenue de la Vieille Tour, 33405 Talence; Congedo, Pietro M.

    The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique, especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost of repeatedly solving the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than that of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
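
    The stepwise-regression level of adaptivity can be illustrated with a greedy forward selection over a small polynomial basis: at each step, the candidate term that most reduces the least-squares error is retained, and the procedure stops when the improvement falls below a tolerance. The basis, tolerance and data below are assumptions, not the paper's PDD construction:

```python
# Greedy forward stepwise regression over candidate polynomial terms,
# retaining only terms that meaningfully reduce the least-squares error.
import numpy as np

rng = np.random.default_rng(10)
x = rng.uniform(-1, 1, size=(300, 2))
y = 2 * x[:, 0] + 0.5 * x[:, 1] ** 2 + 0.01 * rng.normal(size=300)

# candidate terms up to degree 2 (a stand-in for PDD basis polynomials)
terms = {
    "1": np.ones(300), "x0": x[:, 0], "x1": x[:, 1],
    "x0^2": x[:, 0] ** 2, "x1^2": x[:, 1] ** 2, "x0*x1": x[:, 0] * x[:, 1],
}

selected, resid = [], y.copy()
while True:
    best, best_err = None, np.mean(resid ** 2)
    for name in terms:
        if name in selected:
            continue
        A = np.column_stack([terms[t] for t in selected + [name]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        err = np.mean((y - A @ coef) ** 2)
        if err < best_err - 1e-6:          # keep only meaningful improvements
            best, best_err = name, err
    if best is None:
        break
    selected.append(best)
    A = np.column_stack([terms[t] for t in selected])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
print("retained terms:", selected)
```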

  2. An Event-Triggered Machine Learning Approach for Accelerometer-Based Fall Detection.

    PubMed

    Putra, I Putu Edy Suardiyana; Brusey, James; Gaura, Elena; Vesilo, Rein

    2017-12-22

    The fixed-size non-overlapping sliding window (FNSW) and fixed-size overlapping sliding window (FOSW) approaches are the most commonly used data-segmentation techniques in machine-learning-based fall detection using accelerometer sensors. However, these techniques do not segment by fall stage (pre-impact, impact, and post-impact), so useful information is lost, which may reduce the detection rate of the classifier. Aligning the segment with the fall stage is difficult, as the segment size varies. We propose an event-triggered machine learning (EvenT-ML) approach that aligns each fall stage so that the characteristic features of the fall stages are more easily recognized. To evaluate our approach, two publicly accessible datasets were used. Classification and regression trees (CART), k-nearest neighbor (k-NN), logistic regression (LR), and the support vector machine (SVM) were used to train the classifiers. EvenT-ML gives classifier F-scores of 98% for a chest-worn sensor and 92% for a waist-worn sensor, and significantly reduces the computational cost compared with the FNSW- and FOSW-based approaches, with reductions of up to 8-fold and 78-fold, respectively. EvenT-ML also achieves a significantly better F-score than existing fall detection approaches. These results indicate that aligning feature segments with fall stages significantly increases the detection rate and reduces the computational cost.

  3. An Enhanced MEMS Error Modeling Approach Based on Nu-Support Vector Regression

    PubMed Central

    Bhatt, Deepak; Aggarwal, Priyanka; Bhattacharya, Prabir; Devabhaktuni, Vijay

    2012-01-01

    Micro Electro Mechanical System (MEMS)-based inertial sensors have made possible the development of civilian land vehicle navigation systems by offering a low-cost solution. However, accurate modeling of the MEMS sensor errors is one of the most challenging tasks in the design of low-cost navigation systems. These sensors exhibit significant errors such as biases, drift and noise, which are negligible for higher-grade units. Conventional techniques utilizing the Gauss-Markov model and neural network methods have previously been used to model these errors. However, the Gauss-Markov model works unsatisfactorily for MEMS units due to their high inherent sensor errors. On the other hand, modeling the random drift with a Neural Network (NN) is time consuming, which hinders real-time implementation. We overcome these drawbacks by developing an enhanced Support Vector Machine (SVM)-based error model. Unlike NNs, SVMs do not suffer from local minima or over-fitting problems and deliver a reliable global solution. Experimental results showed that the proposed SVM approach reduced the noise standard deviation by 10–35% for gyroscopes and 61–76% for accelerometers. Further, positional error drift under static conditions improved by 41% and 80% in comparison to the NN and GM approaches, respectively. PMID:23012552

  4. Validation of SplitVectors Encoding for Quantitative Visualization of Large-Magnitude-Range Vector Fields

    PubMed Central

    Zhao, Henan; Bryant, Garnett W.; Griffin, Wesley; Terrill, Judith E.; Chen, Jian

    2017-01-01

    We designed and evaluated SplitVectors, a new vector field display approach to help scientists perform new discrimination tasks on large-magnitude-range scientific data shown in three-dimensional (3D) visualization environments. SplitVectors uses scientific notation to display vector magnitude, thus improving legibility. We present an empirical study comparing the SplitVectors approach with three other approaches - direct linear representation, logarithmic, and text display commonly used in scientific visualizations. Twenty participants performed three domain analysis tasks: reading numerical values (a discrimination task), finding the ratio between values (a discrimination task), and finding the larger of two vectors (a pattern detection task). Participants used both mono and stereo conditions. Our results suggest the following: (1) SplitVectors improve accuracy by about 10 times compared to linear mapping and by four times to logarithmic in discrimination tasks; (2) SplitVectors have no significant differences from the textual display approach, but reduce cluttering in the scene; (3) SplitVectors and textual display are less sensitive to data scale than linear and logarithmic approaches; (4) using logarithmic can be problematic as participants' confidence was as high as directly reading from the textual display, but their accuracy was poor; and (5) Stereoscopy improved performance, especially in more challenging discrimination tasks. PMID:28113469

  6. Advanced signal processing based on support vector regression for lidar applications

    NASA Astrophysics Data System (ADS)

    Gelfusa, M.; Murari, A.; Malizia, A.; Lungaroni, M.; Peluso, E.; Parracino, S.; Talebzadeh, S.; Vega, J.; Gaudio, P.

    2015-10-01

    The LIDAR technique has recently found many applications in atmospheric physics and remote sensing. One of the main issues in the deployment of systems based on LIDAR is the filtering of the backscattered signal to alleviate the problems generated by noise. Improvement in the signal-to-noise ratio is typically achieved by averaging a fairly large number (of the order of hundreds) of successive laser pulses. This approach can be effective but presents significant limitations. First, it places great stress on the laser source, particularly in the case of systems for automatic monitoring of large areas over long periods. Second, this solution can become difficult to implement in applications characterised by rapid variations of the atmosphere, for example in the case of pollutant emissions, or by abrupt changes in the noise. In this contribution, a new method for the software filtering and denoising of LIDAR signals is presented. The technique is based on support vector regression. The proposed new method is insensitive to the statistics of the noise and is therefore fully general and quite robust. The developed numerical tool has been systematically compared with the most powerful techniques available, using both synthetic and experimental data. Its performance has been tested for various statistical distributions of the noise and also for other disturbances of the acquired signal, such as outliers. The competitive advantages of the proposed method are fully documented. The potential of the proposed approach to widen the capability of the LIDAR technique, particularly in the detection of widespread smoke, is discussed in detail.
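
    A hedged sketch of SVR-based denoising of a single backscatter-like trace, in the spirit of the method above; the decaying-exponential signal, noise model and hyperparameters are illustrative assumptions, not the authors' data:

```python
# SVR as a smoother for a single noisy return: fit signal vs. range and use
# the regression curve as the denoised trace.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(7)
r = np.linspace(0, 1, 400)                # normalized range
clean = np.exp(-3 * r) * (1 + 0.3 * np.exp(-((r - 0.5) ** 2) / 0.002))
noisy = clean + 0.05 * rng.normal(size=r.size)    # single-shot noisy return

svr = SVR(kernel="rbf", C=100.0, gamma=50.0, epsilon=0.01)
denoised = svr.fit(r[:, None], noisy).predict(r[:, None])

rmse_noisy = np.sqrt(np.mean((noisy - clean) ** 2))
rmse_svr = np.sqrt(np.mean((denoised - clean) ** 2))
print(f"RMSE raw: {rmse_noisy:.4f}, after SVR: {rmse_svr:.4f}")
```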

  7. Lithium-ion battery state of health monitoring and remaining useful life prediction based on support vector regression-particle filter

    NASA Astrophysics Data System (ADS)

    Dong, Hancheng; Jin, Xiaoning; Lou, Yangbing; Wang, Changhong

    2014-12-01

    Lithium-ion batteries are used as the main power source in many electronic and electrical devices. In particular, with the growth in battery-powered electric vehicle development, the lithium-ion battery plays a critical role in the reliability of vehicle systems. In order to provide timely maintenance and replacement of battery systems, it is necessary to develop a reliable and accurate battery health diagnostic that takes a prognostic approach. Therefore, this paper focuses on two main methods to determine a battery's health: (1) Battery State-of-Health (SOH) monitoring and (2) Remaining Useful Life (RUL) prediction. Both of these are calculated by using a filter algorithm known as the Support Vector Regression-Particle Filter (SVR-PF). Models for battery SOH monitoring based on SVR-PF are developed with novel capacity degradation parameters introduced to determine battery health in real time. Moreover, the RUL prediction model is proposed, which is able to provide the RUL value and update the RUL probability distribution to the End-of-Life cycle. Results for both methods are presented, showing that the proposed SOH monitoring and RUL prediction methods have good performance and that the SVR-PF has better monitoring and prediction capability than the standard particle filter (PF).
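
    A much-simplified sketch of the particle-filter half of SVR-PF: particles track an exponential capacity-fade rate, are reweighted against noisy capacity measurements, and are extrapolated to an end-of-life threshold to yield an RUL estimate. All model constants are assumptions, and the learnt SVR degradation model is replaced here by a fixed exponential form:

```python
# Toy particle filter for battery RUL: particles over an exponential fade
# rate, Gaussian measurement likelihood, resampling on effective sample size,
# then extrapolation of each particle to the end-of-life capacity threshold.
import numpy as np

rng = np.random.default_rng(8)
true_rate, c0, eol = 0.002, 1.0, 0.8      # fade rate, initial capacity, EOL
cycles = np.arange(80)
meas = c0 * np.exp(-true_rate * cycles) + 0.005 * rng.normal(size=cycles.size)

n_p = 1000
rate = rng.uniform(0.001, 0.01, n_p)      # particle cloud over the fade rate
w = np.ones(n_p) / n_p
for k, z in zip(cycles, meas):
    pred = c0 * np.exp(-rate * k)
    w *= np.exp(-0.5 * ((z - pred) / 0.005) ** 2)   # Gaussian likelihood
    w /= w.sum()
    if 1.0 / (w ** 2).sum() < n_p / 2:              # resample when degenerate
        idx = rng.choice(n_p, n_p, p=w)
        rate = rate[idx] + 1e-4 * rng.normal(size=n_p)  # jitter for diversity
        w = np.ones(n_p) / n_p

rul = np.log(c0 / eol) / rate - cycles[-1]   # cycles until capacity hits EOL
mean_rul = np.average(rul, weights=w)
print(f"estimated mean RUL: {mean_rul:.0f} cycles")
```

    The weighted particle cloud `rul` plays the role of the RUL probability distribution mentioned in the abstract; here only its mean is printed.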

  8. Detecting Dementia Through Interactive Computer Avatars

    PubMed Central

    Adachi, Hiroyoshi; Ukita, Norimichi; Ikeda, Manabu; Kazui, Hiroaki; Kudo, Takashi; Nakamura, Satoshi

    2017-01-01

    This paper proposes a new approach to automatically detect dementia. Even though some works have detected dementia from speech and language attributes, most have applied detection using picture descriptions, narratives, and cognitive tasks. In this paper, we propose a new computer avatar with spoken dialog functionalities that produces spoken queries based on the mini-mental state examination, the Wechsler memory scale-revised, and other related neuropsychological questions. We recorded the interactive data of spoken dialogues from 29 participants (14 dementia and 15 healthy controls) and extracted various audiovisual features. We tried to predict dementia using audiovisual features and two machine learning algorithms (support vector machines and logistic regression). Here, we show that the support vector machines outperformed logistic regression, and by using the extracted features they classified the participants into two groups with 0.93 detection performance, as measured by the area under the receiver operating characteristic curve. We also identified several contributing features, e.g., the gap before speaking, variations of the fundamental frequency, voice quality, and the ratio of smiling. We concluded that our system has the potential to detect dementia through spoken dialog systems and that the system can assist health care workers. In addition, these findings could help medical personnel detect signs of dementia. PMID:29018636

  9. Design optimization of tailor-rolled blank thin-walled structures based on ɛ-support vector regression technique and genetic algorithm

    NASA Astrophysics Data System (ADS)

    Duan, Libin; Xiao, Ning-cong; Li, Guangyao; Cheng, Aiguo; Chen, Tao

    2017-07-01

    Tailor-rolled blank thin-walled (TRB-TH) structures have become important vehicle components owing to their advantages of light weight and crashworthiness. The purpose of this article is to provide an efficient lightweight design for improving the energy-absorbing capability of TRB-TH structures under dynamic loading. A finite element (FE) model for TRB-TH structures is established and validated by performing a dynamic axial crash test. Different material properties for individual parts with different thicknesses are considered in the FE model. Then, a multi-objective crashworthiness design of the TRB-TH structure is constructed based on the ɛ-support vector regression (ɛ-SVR) technique and non-dominated sorting genetic algorithm-II. The key parameters (C, ɛ and σ) are optimized to further improve the predictive accuracy of ɛ-SVR under limited sample points. Finally, the technique for order preference by similarity to the ideal solution method is used to rank the solutions in Pareto-optimal frontiers and find the best compromise optima. The results demonstrate that the light weight and crashworthiness performance of the optimized TRB-TH structures are superior to their uniform thickness counterparts. The proposed approach provides useful guidance for designing TRB-TH energy absorbers for vehicle bodies.
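A sketch of the hyperparameter-tuning step under limited samples, using RBF kernel ridge regression as an LS-SVM-style stand-in for ɛ-SVR (the data, grid values, and fold count are hypothetical, and the genetic-algorithm and TOPSIS stages are omitted):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical surrogate-model data: few samples, nonlinear crash-type response.
X = rng.uniform(-1, 1, (40, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.standard_normal(40)
perm = rng.permutation(len(y))             # fixed fold assignment

def fit_predict(Xtr, ytr, Xte, gamma, lam):
    """RBF kernel ridge regression -- an LS-SVM-style stand-in for eps-SVR."""
    K = np.exp(-gamma * ((Xtr[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1))
    alpha = np.linalg.solve(K + lam * np.eye(len(ytr)), ytr)
    Kte = np.exp(-gamma * ((Xte[:, None, :] - Xtr[None, :, :]) ** 2).sum(-1))
    return Kte @ alpha

def cv_mse(gamma, lam, k=5):
    """k-fold cross-validated MSE for one hyperparameter pair."""
    errs = []
    for fold in np.array_split(perm, k):
        tr = np.setdiff1d(np.arange(len(y)), fold)
        pred = fit_predict(X[tr], y[tr], X[fold], gamma, lam)
        errs.append(np.mean((pred - y[fold]) ** 2))
    return float(np.mean(errs))

grid = [(g, l) for g in (0.1, 1.0, 10.0) for l in (1e-3, 1e-1, 1.0)]
best_gamma, best_lam = min(grid, key=lambda p: cv_mse(*p))
print(best_gamma, best_lam)
```

Cross-validated selection of the kernel width and regularization strength mirrors the article's tuning of (C, ɛ, σ) under limited sample points.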

  10. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    PubMed

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.

  11. Assessing the blood volume and heart rate responses during haemodialysis in fluid overloaded patients using support vector regression.

    PubMed

    Javed, Faizan; Savkin, Andrey V; Chan, Gregory S H; Middleton, Paul M; Malouf, Philip; Steel, Elizabeth; Mackie, James; Lovell, Nigel H

    2009-11-01

    This study aims to assess the blood volume and heart rate (HR) responses during haemodialysis in fluid overloaded patients by a nonparametric nonlinear regression approach based on a support vector machine (SVM). Relative blood volume (RBV) and electrocardiogram (ECG) were recorded from 23 haemodynamically stable renal failure patients during regular haemodialysis. Modelling was performed on 18 fluid overloaded patients (fluid removal of >2 L). SVM-based regression was used to obtain the models of RBV change with time as well as the percentage change in HR with respect to RBV. Mean squared error (MSE) and goodness of fit (R²) were used for comparison among different kernel functions. The design parameters were estimated using a grid search approach and the selected models were validated by a k-fold cross-validation technique. For the model of HR versus RBV change, a radial basis function (RBF) kernel (MSE = 17.37 and R² = 0.932) gave the least MSE compared to linear (MSE = 25.97 and R² = 0.898) and polynomial (MSE = 18.18 and R² = 0.929) kernels. The MSE was significantly lower for the training data set when using the RBF kernel compared to the other kernels (p < 0.01). The RBF kernel also provided a slightly better fit of RBV change with time (MSE = 1.12 and R² = 0.91) compared to a linear kernel (MSE = 1.46 and R² = 0.88). The modelled HR response was characterized by an initial drop and a subsequent rise during progressive reduction in RBV, which may be interpreted as the reflex response to a transition from central hypervolaemia to hypovolaemia. These modelled curves can be used as references for a controller that can be designed to regulate the haemodynamic variables to ensure the stability of patients undergoing haemodialysis.
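The kernel comparison can be mimicked in a few lines. The sketch below fits a hypothetical nonlinear response with linear and RBF kernels via kernel ridge regression (a simpler stand-in for SVM-based regression) and reports MSE and R² for each; the data and hyperparameters are invented:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical RBV-vs-HR-like data: a nonlinear response with noise.
x = np.linspace(0, 1, 120)
y = -5 * x + 8 * x**2 + 0.2 * rng.standard_normal(x.size)

def kernel_ridge(K, Ktest, y, lam=1e-2):
    """Kernel ridge predictions: Ktest @ (K + lam*I)^-1 @ y."""
    return Ktest @ np.linalg.solve(K + lam * np.eye(len(y)), y)

X = x[:, None]
lin_K = X @ X.T + 1.0                                   # linear kernel with bias
rbf_K = np.exp(-30 * (x[:, None] - x[None, :]) ** 2)    # RBF kernel

def scores(K):
    pred = kernel_ridge(K, K, y)
    mse = float(np.mean((pred - y) ** 2))
    r2 = 1 - mse / float(np.var(y))
    return mse, r2

mse_lin, r2_lin = scores(lin_K)
mse_rbf, r2_rbf = scores(rbf_K)
print(mse_rbf < mse_lin and r2_rbf > r2_lin)
```

Because the underlying response is curved, the RBF kernel should beat the linear one on both metrics, echoing the study's kernel ranking.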

  12. Intelligent Design of Metal Oxide Gas Sensor Arrays Using Reciprocal Kernel Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Dougherty, Andrew W.

    Metal oxides are a staple of the sensor industry. The combination of their sensitivity to a number of gases and the electrical nature of their sensing mechanism makes them particularly attractive in solid state devices. The high temperature stability of the ceramic material also makes them ideal for detecting combustion byproducts where exhaust temperatures can be high. However, problems do exist with metal oxide sensors. They are not very selective, as they all tend to be sensitive to a number of reduction and oxidation reactions on the oxide's surface. This makes arrays with large numbers of sensors interesting to study as a method for introducing orthogonality to the system. Also, the sensors tend to suffer from long-term drift for a number of reasons. In this thesis I will develop a system for intelligently modeling metal oxide sensors and determining their suitability for use in large arrays designed to analyze exhaust gas streams. It will introduce prior knowledge of the metal oxide sensors' response mechanisms in order to produce a response function for each sensor from sparse training data. The system will use the same technique to model and remove any long-term drift from the sensor response. It will also provide an efficient means for determining the orthogonality of the sensors, to determine whether they are useful in gas sensing arrays. The system is based on least squares support vector regression using the reciprocal kernel. The reciprocal kernel is introduced along with a method of optimizing the free parameters of the reciprocal kernel support vector machine. The reciprocal kernel is shown to be simpler and to perform better than an earlier kernel, the modified reciprocal kernel. Least squares support vector regression is chosen as it uses all of the training points, and an emphasis was placed throughout this research on extracting the maximum information from very sparse data.
The reciprocal kernel is shown to be effective in modeling the sensor responses in the time, gas and temperature domains, and the dual representation of the support vector regression solution is shown to provide insight into the sensor's sensitivity and potential orthogonality. Finally, the dual weights of the support vector regression solution to the sensor's response are suggested as a fitness function for a genetic algorithm, or some other method for efficiently searching large parameter spaces.
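A compact sketch of the least squares SVR machinery with a pluggable kernel. The "reciprocal-style" kernel below is an illustrative guess at the form, not the thesis' exact kernel, and the data are synthetic:

```python
import numpy as np

rng = np.random.default_rng(4)

# LS-SVM regression: solve the KKT linear system
#   [ 0   1^T         ] [b]     [0]
#   [ 1   K + I/gamma ] [a]  =  [y]
def lssvm_fit(K, y, gamma=100.0):
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]            # bias b, dual weights alpha

# Hypothetical "reciprocal-style" kernel (illustrative form only).
def recip_kernel(x1, x2, c=0.5):
    return 1.0 / (1.0 + np.abs(x1[:, None] - x2[None, :]) / c)

x = np.linspace(0, 2, 30)                     # sparse training data
y = np.exp(-x) + 0.01 * rng.standard_normal(x.size)

K = recip_kernel(x, x)
b, alpha = lssvm_fit(K, y)
pred = K @ alpha + b
mse = float(np.mean((pred - y) ** 2))
print(mse)
```

Because every training point receives a dual weight, the vector `alpha` plays the interpretive role the thesis assigns to the dual representation (sensitivity, orthogonality, fitness for a search algorithm).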

  13. Spacebased Estimation of Moisture Transport in Marine Atmosphere Using Support Vector Regression

    NASA Technical Reports Server (NTRS)

    Xie, Xiaosu; Liu, W. Timothy; Tang, Benyang

    2007-01-01

    An improved algorithm is developed based on support vector regression (SVR) to estimate horizontal water vapor transport integrated through the depth of the atmosphere (Θ) over the global ocean from observations of the surface wind-stress vector by QuikSCAT, cloud drift wind vectors derived from the Multi-angle Imaging SpectroRadiometer (MISR) and geostationary satellites, and precipitable water from the Special Sensor Microwave/Imager (SSM/I). The statistical relation is established between the input parameters (the surface wind stress, the 850 mb wind, the precipitable water, time and location) and the target data (Θ calculated from rawinsondes and reanalysis of a numerical weather prediction model). The results are validated with independent daily rawinsonde observations, monthly mean reanalysis data, and through regional water balance. This study clearly demonstrates the improvement of Θ derived from satellite data using SVR over previous data sets based on linear regression and neural networks. The SVR methodology reduces both mean bias and standard deviation compared with rawinsonde observations. It agrees better with observations from synoptic to seasonal time scales, and compares more favorably with the reanalysis data on seasonal variations. Only the SVR result can achieve the water balance over South America. The rationale for the advantage of the SVR method and the impact of adding the upper-level wind are also discussed.

  14. Investigating the Performance of Alternate Regression Weights by Studying All Possible Criteria in Regression Models with a Fixed Set of Predictors

    ERIC Educational Resources Information Center

    Waller, Niels; Jones, Jeff

    2011-01-01

    We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n x 1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a…

  15. PREDICTION OF SOLAR FLARE SIZE AND TIME-TO-FLARE USING SUPPORT VECTOR MACHINE REGRESSION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Boucheron, Laura E.; Al-Ghraibah, Amani; McAteer, R. T. James

    We study the prediction of solar flare size and time-to-flare using 38 features describing magnetic complexity of the photospheric magnetic field. This work uses support vector regression to formulate a mapping from the 38-dimensional feature space to a continuous-valued label vector representing flare size or time-to-flare. When we consider flaring regions only, we find an average error in estimating flare size of approximately half a geostationary operational environmental satellite (GOES) class. When we additionally consider non-flaring regions, we find an increased average error of approximately three-fourths of a GOES class. We also consider thresholding the regressed flare size for the experiment containing both flaring and non-flaring regions and find a true positive rate of 0.69 and a true negative rate of 0.86 for flare prediction. The results for both of these size regression experiments are consistent across a wide range of predictive time windows, indicating that the magnetic complexity features may be persistent in appearance long before flare activity. This is supported by our larger error rates of some 40 hr in the time-to-flare regression problem. The 38 magnetic complexity features considered here appear to have discriminative potential for flare size, but their persistence in time makes them less discriminative for the time-to-flare problem.
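The thresholding step that turns a regressed flare size into detection rates can be sketched directly; the scores and labels below are invented, not the paper's data:

```python
# Hypothetical regressed flare sizes (log-scale class) and binary flare labels.
regressed = [0.2, 1.4, 2.1, 0.4, 1.8, 0.1, 2.5, 0.3, 0.9, 0.7]
flared    = [0,   1,   1,   0,   1,   0,   1,   0,   1,   0]

def rates(scores, labels, threshold):
    """True positive and true negative rates for a thresholded regressor."""
    tp = sum(s >= threshold and l == 1 for s, l in zip(scores, labels))
    tn = sum(s < threshold and l == 0 for s, l in zip(scores, labels))
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, tn / neg

tpr, tnr = rates(regressed, flared, threshold=1.0)
print(tpr, tnr)  # -> 0.8 1.0
```

Sweeping the threshold and recording (TPR, 1 - TNR) pairs would trace out the ROC curve for the same regressor.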

  16. Propensity score estimation: neural networks, support vector machines, decision trees (CART), and meta-classifiers as alternatives to logistic regression.

    PubMed

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-08-01

    Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.

  17. Entropy-Based TOA Estimation and SVM-Based Ranging Error Mitigation in UWB Ranging Systems

    PubMed Central

    Yin, Zhendong; Cui, Kai; Wu, Zhilu; Yin, Liang

    2015-01-01

    The major challenges for Ultra-wide Band (UWB) indoor ranging systems are the dense multipath and non-line-of-sight (NLOS) problems of the indoor environment. To precisely estimate the time of arrival (TOA) of the first path (FP) in such a poor environment, a novel approach of entropy-based TOA estimation and support vector machine (SVM) regression-based ranging error mitigation is proposed in this paper. The proposed method can estimate the TOA precisely by measuring the randomness of the received signals, and can mitigate the ranging error without recognition of the channel conditions. Entropy is used to measure the randomness of the received signals, and the FP can be determined by identifying the sample that is followed by a sharp entropy decrease. SVM regression is employed to mitigate the ranging error by modeling the relationship between characteristics of the received signals and the ranging error. The presented numerical simulation results show that the proposed approach achieves significant performance improvements in the CM1 to CM4 channels of the IEEE 802.15.4a standard, as compared to conventional approaches. PMID:26007726
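A stdlib sketch of the entropy-based first-path idea: compute a sliding-window histogram entropy and flag the window where the randomness collapses. The signal is a deterministic pseudo-noise sequence followed by a repetitive (low-entropy) segment, all invented for illustration:

```python
import math

# Deterministic pseudo-noise (hypothetical received samples), then a
# structured low-randomness first-path segment starting at index 60.
noise = [((i * 37) % 101) / 10.0 for i in range(60)]
pulse = [5.0, 4.0, 3.0] * 3 + [5.0]          # repetitive -> low entropy
tail = [((i * 37) % 101) / 10.0 for i in range(70, 100)]
signal = noise + pulse + tail

def window_entropy(samples, bins=8):
    """Shannon entropy of an amplitude histogram: a randomness measure."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for s in samples:
        counts[min(int((s - lo) / width), bins - 1)] += 1
    probs = [c / len(samples) for c in counts if c]
    return -sum(p * math.log2(p) for p in probs)

win = 10
ent = [window_entropy(signal[i:i + win]) for i in range(len(signal) - win + 1)]
toa_idx = min(range(len(ent)), key=ent.__getitem__)   # randomness collapse
print(toa_idx)  # -> 60 (start of the structured first-path segment)
```

In a real receiver the decision would compare successive windows for a sharp entropy drop rather than take a global minimum, but the principle is the same.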

  18. Combining Relevance Vector Machines and exponential regression for bearing residual life estimation

    NASA Astrophysics Data System (ADS)

    Di Maio, Francesco; Tsui, Kwok Leung; Zio, Enrico

    2012-08-01

    In this paper we present a new procedure for estimating the bearing Residual Useful Life (RUL) by combining data-driven and model-based techniques. Respectively, we resort to (i) Relevance Vector Machines (RVMs) for selecting a low number of significant basis functions, called Relevant Vectors (RVs), and (ii) exponential regression to compute and continuously update residual life estimations. The combination of these techniques is developed with reference to partially degraded thrust ball bearings and tested on real-world vibration-based degradation data. On the case study considered, the proposed procedure outperforms other model-based methods, with the added value of an adequate representation of the uncertainty associated with the estimates and of a quantification of the credibility of the results by the Prognostic Horizon (PH) metric.
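The exponential-regression half of the procedure can be sketched with a log-linear least-squares fit and a threshold-crossing RUL estimate; the degradation model, noise level, and failure threshold below are hypothetical, and the RVM basis-selection stage is omitted:

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical vibration-amplitude degradation: exponential growth plus noise.
t = np.arange(0, 200, 5.0)
true_a, true_b = 0.05, 0.02
amp = true_a * np.exp(true_b * t) * (1 + 0.03 * rng.standard_normal(t.size))

# Exponential regression via log-linear least squares: log(amp) = log(a) + b*t.
A = np.vstack([np.ones_like(t), t]).T
coef, *_ = np.linalg.lstsq(A, np.log(amp), rcond=None)
a_hat, b_hat = float(np.exp(coef[0])), float(coef[1])

failure_level = 5.0                               # assumed failure threshold
t_fail = np.log(failure_level / a_hat) / b_hat    # predicted crossing time
rul = t_fail - t[-1]                              # residual life from last sample
print(b_hat, rul)
```

Refitting the curve as each new vibration sample arrives gives the continuously updated RUL estimate described in the abstract.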

  19. Modeling landslide susceptibility in data-scarce environments using optimized data mining and statistical methods

    NASA Astrophysics Data System (ADS)

    Lee, Jung-Hyun; Sameen, Maher Ibrahim; Pradhan, Biswajeet; Park, Hyuck-Jin

    2018-02-01

    This study evaluated the generalizability of five models to select a suitable approach for landslide susceptibility modeling in data-scarce environments. In total, 418 landslide inventories and 18 landslide conditioning factors were analyzed. Multicollinearity and factor optimization were investigated before data modeling, and two experiments were then conducted. In each experiment, five susceptibility maps were produced based on support vector machine (SVM), random forest (RF), weight-of-evidence (WoE), ridge regression (Rid_R), and robust regression (RR) models. The highest accuracy (AUC = 0.85) was achieved with the SVM model when either the full or limited landslide inventories were used. Furthermore, the RF and WoE models were severely affected when fewer landslide samples were used for training. The other models were affected only slightly when the training samples were limited.

  20. Predicting fundamental and realized distributions based on thermal niche: A case study of a freshwater turtle

    NASA Astrophysics Data System (ADS)

    Rodrigues, João Fabrício Mota; Coelho, Marco Túlio Pacheco; Ribeiro, Bruno R.

    2018-04-01

    Species distribution models (SDM) have been broadly used in ecology to address theoretical and practical problems. Currently, there are two main approaches to generate SDMs: (i) correlative models, which are based on species occurrences and environmental predictor layers, and (ii) process-based models, which are constructed from species' functional traits and physiological tolerances. The distributions estimated by each approach are based on different components of the species niche. Predictions of correlative models approach species' realized niches, while predictions of process-based models are more akin to the species' fundamental niche. Here, we integrated the predictions of fundamental and realized distributions of the freshwater turtle Trachemys dorbigni. The fundamental distribution was estimated using data on T. dorbigni's egg incubation temperature, and the realized distribution was estimated using species occurrence records. Both types of distribution were estimated using the same regression approaches (logistic regression and support vector machines), considering both macroclimatic and microclimatic temperatures. The realized distribution of T. dorbigni was generally nested in its fundamental distribution, reinforcing the theoretical assumption that a species' realized niche is a subset of its fundamental niche. Both modelling algorithms produced similar results, but microtemperature generated better results than macrotemperature for the incubation model. Finally, our results reinforce the conclusion that species' realized distributions are constrained by factors other than just thermal tolerances.

  1. Linear regression models for solvent accessibility prediction in proteins.

    PubMed

    Wagner, Michael; Adamczak, Rafał; Porollo, Aleksey; Meller, Jarosław

    2005-04-01

    The relative solvent accessibility (RSA) of an amino acid residue in a protein structure is a real number that represents the solvent exposed surface area of this residue in relative terms. The problem of predicting the RSA from the primary amino acid sequence can therefore be cast as a regression problem. Nevertheless, RSA prediction has so far typically been cast as a classification problem. Consequently, various machine learning techniques have been used within the classification framework to predict whether a given amino acid exceeds some (arbitrary) RSA threshold and would thus be predicted to be "exposed," as opposed to "buried." We have recently developed novel methods for RSA prediction using nonlinear regression techniques which provide accurate estimates of the real-valued RSA and outperform classification-based approaches with respect to commonly used two-class projections. However, while their performance seems to provide a significant improvement over previously published approaches, these Neural Network (NN) based methods are computationally expensive to train and involve several thousand parameters. In this work, we develop alternative regression models for RSA prediction which are computationally much less expensive, involve orders-of-magnitude fewer parameters, and are still competitive in terms of prediction quality. In particular, we investigate several regression models for RSA prediction using linear L1-support vector regression (SVR) approaches as well as standard linear least squares (LS) regression. Using rigorously derived validation sets of protein structures and extensive cross-validation analysis, we compare the performance of the SVR with that of LS regression and NN-based methods. In particular, we show that the flexibility of the SVR (as encoded by metaparameters such as the error insensitivity and the error penalization terms) can be very beneficial to optimize the prediction accuracy for buried residues. 
We conclude that the simple and computationally much more efficient linear SVR performs comparably to nonlinear models and thus can be used in order to facilitate further attempts to design more accurate RSA prediction methods, with applications to fold recognition and de novo protein structure prediction methods.
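To see why the ɛ-insensitive loss helps with robustness, the 1-D sketch below fits a slope exactly by scanning the breakpoints of the piecewise-linear SVR objective and compares it with least squares on data containing two gross outliers; the data and tube width are invented:

```python
# Hypothetical 1-D data: y ~ 2x with small alternating noise, plus two outliers.
xs = list(range(1, 21))
ys = [2 * x + 0.05 * (-1) ** x for x in xs]
ys[0] += 10.0          # gross outliers at x = 1 and x = 2
ys[1] += 10.0

def svr_slope(xs, ys, eps=0.1):
    """Exact 1-D eps-insensitive fit: the optimum of this piecewise-linear
    loss lies at a breakpoint w = (y_i +/- eps) / x_i, so scan them all."""
    def loss(w):
        return sum(max(abs(w * x - y) - eps, 0.0) for x, y in zip(xs, ys))
    bps = [(y + s * eps) / x for x, y in zip(xs, ys) for s in (-1.0, 1.0)]
    return min(bps, key=loss)

ls_slope = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
w = svr_slope(xs, ys)
print(abs(w - 2.0) < abs(ls_slope - 2.0))   # -> True: the tube resists outliers
```

Points inside the ɛ-tube contribute nothing to the loss, and points outside contribute only linearly, which is exactly the error-insensitivity and error-penalization flexibility the abstract credits for the improvement on buried residues.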

  2. Spatial prediction of landslides using a hybrid machine learning approach based on Random Subspace and Classification and Regression Trees

    NASA Astrophysics Data System (ADS)

    Pham, Binh Thai; Prakash, Indra; Tien Bui, Dieu

    2018-02-01

    A hybrid machine learning approach of Random Subspace (RSS) and Classification And Regression Trees (CART) is proposed to develop a model named RSSCART for spatial prediction of landslides. This model is a combination of the RSS method, which is known as an efficient ensemble technique, and CART, which is a state-of-the-art classifier. The Luc Yen district of Yen Bai province, a prominent landslide-prone area of Viet Nam, was selected for the model development. Performance of the RSSCART model was evaluated through the Receiver Operating Characteristic (ROC) curve, statistical analysis methods, and the Chi Square test. Results were compared with other benchmark landslide models, namely Support Vector Machines (SVM), single CART, Naïve Bayes Trees (NBT), and Logistic Regression (LR). In the development of the model, ten important landslide-affecting factors related to geomorphology, geology and geo-environment were considered, namely slope angles, elevation, slope aspect, curvature, lithology, distance to faults, distance to rivers, distance to roads, and rainfall. Performance of the RSSCART model (AUC = 0.841) is the best compared with the other popular landslide models, namely SVM (0.835), single CART (0.822), NBT (0.821), and LR (0.723). These results indicate that the RSSCART model is a promising method for spatial landslide prediction.
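A toy version of the Random Subspace idea, with one-level decision stumps standing in for full CART trees; the data, feature counts, and ensemble size are invented:

```python
import random

random.seed(8)

# Hypothetical data: 6 features, label depends only on features 0 and 1.
def sample():
    x = [random.uniform(0, 1) for _ in range(6)]
    return x, int(x[0] + x[1] > 1.0)

train = [sample() for _ in range(400)]
test = [sample() for _ in range(200)]

def fit_stump(data, feats):
    """Best single-feature threshold rule over the given feature subset."""
    best_acc, best_rule = -1.0, None
    for f in feats:
        for t in (i / 10 for i in range(1, 10)):
            agree = sum((x[f] > t) == bool(y) for x, y in data) / len(data)
            acc, pos = (agree, 1) if agree >= 0.5 else (1 - agree, 0)
            if acc > best_acc:
                best_acc, best_rule = acc, (f, t, pos)
    return best_rule

def predict(rule, x):
    f, t, pos = rule
    return pos if x[f] > t else 1 - pos

# Random Subspace: each base learner sees a random subset of the features.
ensemble = [fit_stump(train, random.sample(range(6), 3)) for _ in range(25)]

def vote(x):
    return int(sum(predict(r, x) for r in ensemble) > len(ensemble) / 2)

acc = sum(vote(x) == y for x, y in test) / len(test)
print(acc)
```

Each subspace draw decorrelates the base learners, which is the mechanism by which RSS improves on a single CART in the abstract's comparison.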

  3. Determinants of Health Service Responsiveness in Community-Based Vector Surveillance for Chagas Disease in Guatemala, El Salvador, and Honduras

    PubMed Central

    Hashimoto, Ken; Zúniga, Concepción; Romero, Eduardo; Morales, Zoraida; Maguire, James H.

    2015-01-01

    Background Central American countries face a major challenge in the control of Triatoma dimidiata, a widespread vector of Chagas disease that cannot be eliminated. The key to maintaining the risk of transmission of Trypanosoma cruzi at lowest levels is to sustain surveillance throughout endemic areas. Guatemala, El Salvador, and Honduras integrated community-based vector surveillance into local health systems. Community participation was effective in detection of the vector, but some health services had difficulty sustaining their response to reports of vectors from the population. To date, no research has investigated how best to maintain and reinforce health service responsiveness, especially in resource-limited settings. Methodology/Principal Findings We reviewed surveillance and response records of 12 health centers in Guatemala, El Salvador, and Honduras from 2008 to 2012 and analyzed the data in relation to the volume of reports of vector infestation, local geography, demography, human resources, managerial approach, and results of interviews with health workers. Health service responsiveness was defined as the percentage of households that reported vector infestation for which the local health service provided indoor residual spraying of insecticide or educational advice. Eight potential determinants of responsiveness were evaluated by linear and mixed-effects multi-linear regression. Health service responsiveness (overall 77.4%) was significantly associated with quarterly monitoring by departmental health offices. Other potential determinants of responsiveness were not found to be significant, partly because of short- and long-term strategies, such as temporary adjustments in manpower and redistribution of tasks among local participants in the effort. Conclusions/Significance Consistent monitoring within the local health system contributes to sustainability of health service responsiveness in community-based vector surveillance of Chagas disease. 
Even with limited resources, countries can improve health service responsiveness with thoughtful strategies and management practices in the local health systems. PMID:26252767

  4. Text mining approach to predict hospital admissions using early medical records from the emergency department.

    PubMed

    Lucini, Filipe R; S Fogliatto, Flavio; C da Silveira, Giovani J; L Neyeloff, Jeruza; Anzanello, Michel J; de S Kuchenbecker, Ricardo; D Schaan, Beatriz

    2017-04-01

    Emergency department (ED) overcrowding is a serious issue for hospitals. Early information on short-term inward bed demand from patients receiving care at the ED may reduce the overcrowding problem and optimize the use of hospital resources. In this study, we use text mining methods to process data from early ED patient records using the SOAP framework, and predict future hospitalizations and discharges. We try different approaches for pre-processing text records and predicting hospitalization. Sets-of-words are obtained via binary representation, term frequency, and term frequency-inverse document frequency. Unigrams, bigrams and trigrams are tested for feature formation. Feature selection is based on χ² and F-score metrics. In the prediction module, eight text mining methods are tested: Decision Tree, Random Forest, Extremely Randomized Tree, AdaBoost, Logistic Regression, Multinomial Naïve Bayes, Support Vector Machine (Kernel linear) and Nu-Support Vector Machine (Kernel linear). Prediction performance is evaluated by F1-scores. Precision and Recall values are also reported for all text mining methods tested. Nu-Support Vector Machine was the text mining method with the best overall performance. Its average F1-score in predicting hospitalization was 77.70%, with a standard deviation (SD) of 0.66%. The method could be used to manage daily routines in EDs such as capacity planning and resource allocation. Text mining could provide valuable information and facilitate decision-making by inward bed management teams. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.
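The term-weighting step can be sketched with a stdlib TF-IDF implementation; the note fragments below are invented, not from the study's records:

```python
import math
from collections import Counter

# Hypothetical early ED note fragments (SOAP-style) for illustration only.
docs = [
    "severe chest pain dyspnea admitted",
    "chest pain resolved discharged home",
    "mild headache discharged home",
    "dyspnea hypoxia admitted icu",
]

def tfidf(docs):
    """Term frequency-inverse document frequency vectors, one dict per doc."""
    n = len(docs)
    tokenized = [d.split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))
    out = []
    for toks in tokenized:
        tf = Counter(toks)
        out.append({w: (c / len(toks)) * math.log(n / df[w]) for w, c in tf.items()})
    return out

vecs = tfidf(docs)
# Rarer terms score higher; a term present in every document would score zero.
print(vecs[3]["hypoxia"] > vecs[3]["dyspnea"])
```

These weighted vectors would then feed a linear classifier such as the Nu-Support Vector Machine that performed best in the study.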

  5. Adenoviral Vector Immunity: Its Implications and circumvention strategies

    PubMed Central

    Ahi, Yadvinder S.; Bangari, Dinesh S.; Mittal, Suresh K.

    2014-01-01

    Adenoviral (Ad) vectors have emerged as a promising gene delivery platform for a variety of therapeutic and vaccine purposes during last two decades. However, the presence of preexisting Ad immunity and the rapid development of Ad vector immunity still pose significant challenges to the clinical use of these vectors. Innate inflammatory response following Ad vector administration may lead to systemic toxicity, drastically limit vector transduction efficiency and significantly abbreviate the duration of transgene expression. Currently, a number of approaches are being extensively pursued to overcome these drawbacks by strategies that target either the host or the Ad vector. In addition, significant progress has been made in the development of novel Ad vectors based on less prevalent human Ad serotypes and nonhuman Ad. This review provides an update on our current understanding of immune responses to Ad vectors and delineates various approaches for eluding Ad vector immunity. Approaches targeting the host and those targeting the vector are discussed in light of their promises and limitations. PMID:21453277

  6. Support vector machine regression (LS-SVM)--an alternative to artificial neural networks (ANNs) for the analysis of quantum chemistry data?

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-06-28

    A multilayer feed-forward artificial neural network (MLP-ANN) with a single hidden layer that contains a finite number of neurons can be regarded as a universal non-linear approximator. Today, the ANN method and linear regression (MLR) model are widely used for quantum chemistry (QC) data analysis (e.g., thermochemistry) to improve their accuracy (e.g., Gaussian G2-G4, B3LYP/B3-LYP, X1, or W1 theoretical methods). In this study, an alternative approach based on support vector machines (SVMs) is used, the least squares support vector machine (LS-SVM) regression. It has been applied to ab initio (first-principles) and density functional theory (DFT) quantum chemistry data. Thus, the QC + SVM methodology is an alternative to the QC + ANN one. The task of the study was to estimate the Møller-Plesset (MPn) or DFT (B3LYP, BLYP, BMK) energies calculated with large basis sets (e.g., 6-311G(3df,3pd)) using smaller ones (6-311G, 6-311G*, 6-311G**) plus molecular descriptors. A molecular set (BRM-208) containing a total of 208 organic molecules was constructed and used for the LS-SVM training, cross-validation, and testing. MP2, MP3, MP4(DQ), MP4(SDQ), and MP4/MP4(SDTQ) ab initio methods were tested. Hartree-Fock (HF/SCF) results were also reported for comparison. Furthermore, constitutional (CD: total number of atoms and mole fractions of different atoms) and quantum-chemical (QD: HOMO-LUMO gap, dipole moment, average polarizability, and quadrupole moment) molecular descriptors were used for the building of the LS-SVM calibration model. Prediction accuracies (MADs) of 1.62 ± 0.51 and 0.85 ± 0.24 kcal mol(-1) (1 kcal mol(-1) = 4.184 kJ mol(-1)) were reached for SVM-based approximations of ab initio and DFT energies, respectively. The LS-SVM model was more accurate than the MLR model. A comparison with the artificial neural network approach shows that the accuracy of the LS-SVM method is similar to the accuracy of ANN. 
The extrapolation and interpolation results show that LS-SVM is superior by almost an order of magnitude over the ANN method in terms of the stability, generality, and robustness of the final model. The LS-SVM model needs a much smaller number of samples (a much smaller sample set) to produce accurate predictions. Potential energy surface (PES) approximations for molecular dynamics (MD) studies are discussed as a promising application for the LS-SVM calibration approach. This journal is © the Owner Societies 2011
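    Unlike standard SVR, the LS-SVM regressor described above is trained by solving a single linear KKT system rather than a quadratic program. A minimal numpy sketch (RBF kernel; the sine target and hyperparameters are illustrative assumptions, not the BRM-208 setup):

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    # Squared Euclidean distances between all row pairs, mapped through a Gaussian
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    # LS-SVM dual problem: solve the linear KKT system
    # [0   1^T        ] [b    ]   [0]
    # [1   K + I/gamma] [alpha] = [y]
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]          # alpha, bias

def lssvm_predict(Xtrain, alpha, b, Xnew, sigma=1.0):
    return rbf_kernel(Xnew, Xtrain, sigma) @ alpha + b

X = np.linspace(0, 2 * np.pi, 40)[:, None]
y = np.sin(X).ravel()
alpha, b = lssvm_fit(X, y)
pred = lssvm_predict(X, alpha, b, X)
```

    Every training point becomes a "support vector" here; sparsity is traded away for the closed-form solve.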

  7. Regularizing portfolio optimization

    NASA Astrophysics Data System (ADS)

    Still, Susanne; Kondor, Imre

    2010-07-01

    The optimization of large portfolios displays an inherent instability due to estimation error. This poses a fundamental problem, because solutions that are not stable under sample fluctuations may look optimal for a given sample, but are, in effect, very far from optimal with respect to the average risk. In this paper, we approach the problem from the point of view of statistical learning theory. The occurrence of the instability is intimately related to over-fitting, which can be avoided using known regularization methods. We show how regularized portfolio optimization with the expected shortfall as a risk measure is related to support vector regression. The budget constraint dictates a modification. We present the resulting optimization problem and discuss the solution. The L2 norm of the weight vector is used as a regularizer, which corresponds to a diversification 'pressure'. This means that diversification, besides counteracting downward fluctuations in some assets by upward fluctuations in others, is also crucial because it improves the stability of the solution. The approach we provide here allows for the simultaneous treatment of optimization and diversification in one framework that enables the investor to trade off between the two, depending on the size of the available dataset.
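    The diversification 'pressure' of the L2 regularizer is easiest to see in closed form with variance as the risk measure (a deliberate simplification of the paper's expected-shortfall setting): under the budget constraint, the penalized minimum-variance weights are proportional to (Cov + lam*I)^{-1} 1, and a larger penalty shrinks them toward equal allocation.

```python
import numpy as np

def regularized_weights(returns, lam):
    """Minimize w' Cov w + lam * ||w||^2 subject to the budget constraint 1'w = 1.

    Closed-form solution: proportional to (Cov + lam*I)^{-1} 1, rescaled to sum to one.
    """
    cov = np.cov(returns, rowvar=False)
    n = cov.shape[0]
    raw = np.linalg.solve(cov + lam * np.eye(n), np.ones(n))
    return raw / raw.sum()

rng = np.random.default_rng(0)
returns = rng.normal(size=(60, 10))          # short sample: estimation error dominates
w_weak = regularized_weights(returns, lam=1e-6)   # nearly unregularized, unstable
w_strong = regularized_weights(returns, lam=10.0) # strong penalty -> near-equal weights
```

    The strongly regularized weights vary much less across assets, illustrating the stability gain described above.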

  8. Detecting glaucomatous change in visual fields: Analysis with an optimization framework.

    PubMed

    Yousefi, Siamak; Goldbaum, Michael H; Varnousfaderani, Ehsan S; Belghith, Akram; Jung, Tzyy-Ping; Medeiros, Felipe A; Zangwill, Linda M; Weinreb, Robert N; Liebmann, Jeffrey M; Girkin, Christopher A; Bowd, Christopher

    2015-12-01

    Detecting glaucomatous progression is an important aspect of glaucoma management. The assessment of longitudinal series of visual fields, measured using Standard Automated Perimetry (SAP), is considered the reference standard for this effort. We seek efficient techniques for determining progression from longitudinal visual fields by formulating the problem as an optimization framework, learned from a population of glaucoma data. The longitudinal data from each patient's eye were used in a convex optimization framework to find a vector that is representative of the progression direction of the sample population, as a whole. Post-hoc analysis of longitudinal visual fields across the derived vector led to optimal progression (change) detection. The proposed method was compared to recently described progression detection methods and to linear regression of instrument-defined global indices, and showed slightly higher sensitivities at the highest specificities than other methods (a clinically desirable result). The proposed approach is simpler, faster, and more efficient for detecting glaucomatous changes, compared to our previously proposed machine learning-based methods, although it provides somewhat less information. This approach has potential application in glaucoma clinics for patient monitoring and in research centers for classification of study participants. Copyright © 2015 Elsevier Inc. All rights reserved.

  9. Emergency Department Visit Forecasting and Dynamic Nursing Staff Allocation Using Machine Learning Techniques With Readily Available Open-Source Software.

    PubMed

    Zlotnik, Alexander; Gallardo-Antolín, Ascensión; Cuchí Alfaro, Miguel; Pérez Pérez, María Carmen; Montero Martínez, Juan Manuel

    2015-08-01

    Although emergency department visit forecasting can be of use for nurse staff planning, previous research has focused on models that lacked sufficient resolution and realistic error metrics for these predictions to be applied in practice. Using data from a 1100-bed specialized care hospital with 553,000 patients assigned to its healthcare area, forecasts with different prediction horizons, from 2 to 24 weeks ahead, with an 8-hour granularity, using support vector regression, M5P, and stratified average time-series models were generated with an open-source software package. As overstaffing and understaffing errors have different implications, error metrics and potential personnel monetary savings were calculated with a custom validation scheme, which simulated subsequent generation of predictions during a 4-year period. Results were then compared with a generalized estimating equation regression. Support vector regression and M5P models were found to be superior to the stratified average model at the 95% confidence level. Our findings suggest that medium and severe understaffing situations could be reduced by more than an order of magnitude and average yearly savings of up to €683,500 could be achieved if dynamic nursing staff allocation were performed with support vector regression instead of the static staffing levels currently in use.
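    Because overstaffing and understaffing errors have different implications, a symmetric metric such as RMSE is insufficient for this kind of validation. A hedged sketch of an asymmetric cost of the sort a custom validation scheme might use (the 3:1 weighting is an arbitrary illustration, not the paper's actual cost model):

```python
import numpy as np

def staffing_error_cost(predicted, actual, under_weight=3.0, over_weight=1.0):
    """Total weighted cost, penalizing understaffing more heavily than overstaffing."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    under = np.clip(actual - predicted, 0, None)   # demand exceeded planned staff
    over = np.clip(predicted - actual, 0, None)    # planned staff exceeded demand
    return under_weight * under.sum() + over_weight * over.sum()

# Three 8-hour shifts: one overstaffed by 2, one exact, one understaffed by 3
cost = staffing_error_cost([10, 10, 10], [8, 10, 13])
```

    Here the cost is 3*3 + 1*2 = 11, so the understaffed shift dominates even though its raw error is similar.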

  10. The application of artificial neural networks and support vector regression for simultaneous spectrophotometric determination of commercial eye drop contents

    NASA Astrophysics Data System (ADS)

    Valizadeh, Maryam; Sohrabi, Mahmoud Reza

    2018-03-01

    In the present study, artificial neural networks (ANNs) and support vector regression (SVR), as intelligent methods coupled with UV spectroscopy, were applied for the simultaneous quantitative determination of Dorzolamide (DOR) and Timolol (TIM) in eye drops. Several synthetic mixtures were analyzed to validate the proposed methods. At first, a neural network time series model, one type of artificial neural network, was employed and its efficiency was evaluated. Afterwards, a radial basis network was applied as another neural network; results showed that its performance is suitable for prediction. Finally, support vector regression was proposed to construct the Zilomole prediction model, and root mean square error (RMSE) and mean recovery (%) were calculated for the SVR method. Moreover, the proposed methods were compared to high-performance liquid chromatography (HPLC) as a reference method. A one-way analysis of variance (ANOVA) test at the 95% confidence level, applied to the comparison of the suggested and reference methods, showed that there were no significant differences between them. The effect of interferences was also investigated in spiked solutions.
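    As a baseline for what "simultaneous determination" means here, a classical least squares sketch under Beer-Lambert additivity (the band shapes, positions, and concentrations are invented for illustration; the study itself uses ANN and SVR models rather than this direct inversion):

```python
import numpy as np

wavelengths = np.linspace(220, 320, 60)
# Hypothetical pure-component absorptivity curves (Gaussian bands, arbitrary positions)
s_comp1 = np.exp(-((wavelengths - 253) / 12.0) ** 2)
s_comp2 = np.exp(-((wavelengths - 295) / 15.0) ** 2)
S = np.column_stack([s_comp1, s_comp2])       # pure-spectra matrix

true_conc = np.array([0.8, 1.5])
mixture = S @ true_conc                        # Beer-Lambert additivity, noise-free

# Classical least squares: recover both concentrations from one mixture spectrum
est_conc, *_ = np.linalg.lstsq(S, mixture, rcond=None)
```

    When spectra overlap heavily or respond non-linearly, this direct inversion degrades, which is the usual motivation for the ANN/SVR calibration models the abstract describes.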

  11. A new method for reconstruction of solar irradiance

    NASA Astrophysics Data System (ADS)

    Privalsky, Victor

    2018-07-01

    The purpose of this research is to show how time series should be reconstructed using an example with the data on total solar irradiance (TSI) of the Earth and on sunspot numbers (SSN) since 1749. The traditional approach through regression equation(s) is designed for time-invariant vectors of random variables and is not applicable to time series, which are random functions of time. The autoregressive reconstruction (ARR) method suggested here requires fitting a multivariate stochastic difference equation to the target/proxy time series. The reconstruction is done through the scalar equation for the target time series with the white noise term excluded. The time series approach is shown to provide a better reconstruction of TSI than the correlation/regression method. A reconstruction criterion is introduced which allows one to define in advance the achievable level of success in the reconstruction. The conclusion is that time series, including the total solar irradiance, cannot be reconstructed properly if the data are not treated as sample records of random processes and analyzed in both time and frequency domains.
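    The ARR recipe, fit a stochastic difference equation to the target/proxy pair and then run the target's scalar equation with the white-noise term excluded, can be sketched on synthetic data (the coefficients and noise level are arbitrary assumptions, not TSI/SSN estimates):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 600
x = rng.normal(size=n)                    # proxy series (an SSN-like driver)
y = np.zeros(n)
for t in range(1, n):                     # target obeys y_t = 0.6 y_{t-1} + 0.8 x_{t-1} + noise
    y[t] = 0.6 * y[t - 1] + 0.8 * x[t - 1] + 0.1 * rng.normal()

# Fit the scalar difference equation for the target by least squares
Z = np.column_stack([y[:-1], x[:-1]])
coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)

# Reconstruction: run the fitted equation with the white-noise term excluded
y_hat = np.zeros(n)
for t in range(1, n):
    y_hat[t] = coef[0] * y_hat[t - 1] + coef[1] * x[t - 1]

corr = np.corrcoef(y[1:], y_hat[1:])[0, 1]
```

    The reconstruction tracks the target closely because the proxy carries most of the variance; the irreducible gap is exactly the excluded white-noise term.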

  12. Irrigation in the arid regions of Tunisia impacts the abundance and apparent density of sand fly vectors of Leishmania infantum

    PubMed Central

    Barhoumi, Walid; Qualls, Whitney A.; Archer, Reginald; Fuller, Douglas O.; Chelbi, Ifhem; Cherni, Saifedine; Derbali, Mohamed; Arheart, Kristopher L.; Zhioua, Elyes; Beier, John C.

    2015-01-01

    The distribution expansion of important human visceral leishmaniasis (HVL) and sporadic cutaneous leishmaniasis (SCL) vector species, Phlebotomus perfiliewi and P. perniciosus, throughout central Tunisia is a major public health concern. This study was designed to investigate if the expansion of irrigation influences the abundance of sand fly species potentially involved in the transmission of HVL and SCL located in arid bioclimatic regions. Geographic and remote sensing approaches were used to predict the density of visceral leishmaniasis vectors in Tunisia. Entomological investigations were performed in the governorate of Sidi Bouzid, located in the arid bioclimatic region of Tunisia. In 2012, sand flies were collected by CDC light traps located at nine irrigated and nine non-irrigated sites to determine species abundance. Eight species in two genera were collected. Among sand flies of the subgenus Larroussius, P. perfiliewi was the only species collected significantly more in irrigated areas. Trap data were then used to develop Poisson regression models to map the apparent density of important sand fly species as a function of different environmental covariates including climate and vegetation density. The density of P. perfiliewi is predicted to be moderately high in the arid regions. These results highlight that the abundance of P. perfiliewi is associated with the development of irrigated areas and suggests that the expansion of this species will continue to more arid areas of the country as irrigation sites continue to be developed in the region. The continued increase in irrigated areas in the Middle East and North Africa region deserves attention, as it is associated with the spread of L. infantum vector P. perfiliewi. Integrated vector management strategies targeting irrigation structures to reduce sand fly vector populations should be evaluated in light of these findings. PMID:25447265
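    The modeling step described above, a Poisson regression of trap counts on environmental covariates, can be sketched with a hand-rolled IRLS fit (the single synthetic covariate and its coefficients are illustrative; real models would use several climate and vegetation-density layers):

```python
import numpy as np

def poisson_irls(x, y, n_iter=25):
    """Fit a log-link Poisson regression (counts ~ exp(b0 + b1*x)) by IRLS."""
    X = np.column_stack([np.ones(len(y)), x])
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]   # rough starting point
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu           # working response
        W = mu                            # IRLS weights for the Poisson family
        beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
    return beta

rng = np.random.default_rng(2)
veg_index = rng.normal(size=500)                      # hypothetical environmental covariate
counts = rng.poisson(np.exp(0.5 + 0.7 * veg_index))   # simulated trap counts
beta = poisson_irls(veg_index, counts)
```

    Mapping "apparent density" then amounts to evaluating exp(b0 + b1*covariate) over a raster of covariate values.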

  13. Knowledge, Attitude, and Practices Regarding Vector-borne Diseases in Western Jamaica.

    PubMed

    Alobuia, Wilson M; Missikpode, Celestin; Aung, Maung; Jolly, Pauline E

    2015-01-01

    Outbreaks of vector-borne diseases (VBDs) such as dengue and malaria can overwhelm health systems in resource-poor countries. Environmental management strategies that reduce or eliminate vector breeding sites combined with improved personal prevention strategies can help to significantly reduce transmission of these infections. The aim of this study was to assess the knowledge, attitudes, and practices (KAPs) of residents in western Jamaica regarding control of mosquito vectors and protection from mosquito bites. A cross-sectional study was conducted between May and August 2010 among patients or family members of patients waiting to be seen at hospitals in western Jamaica. Participants completed an interviewer-administered questionnaire on sociodemographic factors and KAPs regarding VBDs. KAP scores were calculated and categorized as high or low based on the number of correct or positive responses. Logistic regression analyses were conducted to identify predictors of KAP and linear regression analysis conducted to determine if knowledge and attitude scores predicted practice scores. In all, 361 (85 men and 276 women) people participated in the study. Most participants (87%) scored low on knowledge and practice items (78%). Conversely, 78% scored high on attitude items. By multivariate logistic regression, housewives were 82% less likely than laborers to have high attitude scores; homeowners were 65% less likely than renters to have high attitude scores. Participants from households with 1 to 2 children were 3.4 times more likely to have high attitude scores compared with those from households with no children. Participants from households with at least 5 people were 65% less likely than those from households with fewer than 5 people to have high practice scores. By multivariable linear regression knowledge and attitude scores were significant predictors of practice score. 
The study revealed poor knowledge of VBDs and poor prevention practices among participants. It identified specific groups that can be targeted with vector control and personal protection interventions to decrease transmission of the infections. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

  14. Support vector machines classifiers of physical activities in preschoolers

    USDA-ARS?s Scientific Manuscript database

    The goal of this study is to develop, test, and compare multinomial logistic regression (MLR) and support vector machines (SVM) in classifying preschool-aged children physical activity data acquired from an accelerometer. In this study, 69 children aged 3-5 years old were asked to participate in a s...

  15. Estimation of precipitable water vapor of the atmosphere using artificial neural network, support vector machine and multiple linear regression algorithms and their comparative study

    NASA Astrophysics Data System (ADS)

    Shastri, Niket; Pathak, Kamlesh

    2018-05-01

    The water vapor content of the atmosphere plays a very important role in climate. In this paper the application of GPS signals in meteorology is discussed, a useful technique for estimating the precipitable water vapor of the atmosphere. Various algorithms, namely artificial neural network, support vector machine and multiple linear regression, are used to predict precipitable water vapor. Comparative studies in terms of root mean square error and mean absolute error are also carried out for all the algorithms.
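    A minimal sketch of the comparison protocol: fit the multiple linear regression baseline and score it with both metrics (the data are synthetic stand-ins; the ANN and SVM competitors are omitted). Note that RMSE is always at least as large as MAE, so the two metrics bracket the typical error.

```python
import numpy as np

def rmse(y, p):
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(p)) ** 2)))

def mae(y, p):
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(p))))

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 3))                 # e.g. surface meteorological predictors
y = X @ np.array([1.0, -0.5, 0.3]) + 0.2 * rng.normal(size=200)

# Multiple linear regression via ordinary least squares
A = np.column_stack([np.ones(200), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef
```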

  16. The 11-year solar cycle in current reanalyses: a (non)linear attribution study of the middle atmosphere

    NASA Astrophysics Data System (ADS)

    Kuchar, A.; Sacha, P.; Miksovsky, J.; Pisoft, P.

    2015-06-01

    This study focusses on the variability of temperature, ozone and circulation characteristics in the stratosphere and lower mesosphere with regard to the influence of the 11-year solar cycle. It is based on attribution analysis using multiple nonlinear techniques (support vector regression, neural networks) besides the multiple linear regression approach. The analysis was applied to several current reanalysis data sets for the 1979-2013 period, including MERRA, ERA-Interim and JRA-55, with the aim to compare how these types of data resolve especially the double-peaked solar response in temperature and ozone variables and the consequent changes induced by these anomalies. Equatorial temperature signals in the tropical stratosphere were found to be in qualitative agreement with previous attribution studies, although the agreement with observational results was incomplete, especially for JRA-55. The analysis also pointed to the solar signal in the ozone data sets (i.e. MERRA and ERA-Interim) not being consistent with the observed double-peaked ozone anomaly extracted from satellite measurements. The results obtained by linear regression were confirmed by the nonlinear approach through all data sets, suggesting that linear regression is a relevant tool to sufficiently resolve the solar signal in the middle atmosphere. The seasonal evolution of the solar response was also discussed in terms of dynamical causalities in the winter hemispheres. The hypothetical mechanism of a weaker Brewer-Dobson circulation at solar maxima was reviewed together with a discussion of polar vortex behaviour.

  17. Adaptive surrogate modeling by ANOVA and sparse polynomial dimensional decomposition for global sensitivity analysis in fluid simulation

    NASA Astrophysics Data System (ADS)

    Tang, Kunkun; Congedo, Pietro M.; Abgrall, Rémi

    2016-06-01

    The Polynomial Dimensional Decomposition (PDD) is employed in this work for the global sensitivity analysis and uncertainty quantification (UQ) of stochastic systems subject to a moderate to large number of input random variables. Due to the intimate connection between the PDD and the Analysis of Variance (ANOVA) approaches, PDD is able to provide a simpler and more direct evaluation of the Sobol' sensitivity indices, when compared to the Polynomial Chaos expansion (PC). Unfortunately, the number of PDD terms grows exponentially with respect to the size of the input random vector, which makes the computational cost of standard methods unaffordable for real engineering applications. In order to address the problem of the curse of dimensionality, this work proposes essentially variance-based adaptive strategies aiming to build a cheap meta-model (i.e. surrogate model) by employing the sparse PDD approach with its coefficients computed by regression. Three levels of adaptivity are carried out in this paper: 1) the truncated dimensionality for ANOVA component functions, 2) the active dimension technique especially for second- and higher-order parameter interactions, and 3) the stepwise regression approach designed to retain only the most influential polynomials in the PDD expansion. During this adaptive procedure featuring stepwise regressions, the surrogate model representation keeps containing few terms, so that the cost to resolve repeatedly the linear systems of the least-squares regression problem is negligible. The size of the finally obtained sparse PDD representation is much smaller than the one of the full expansion, since only significant terms are eventually retained. Consequently, a much smaller number of calls to the deterministic model is required to compute the final PDD coefficients.
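    The third adaptivity level, a stepwise regression that retains only the most influential polynomials, can be sketched as greedy forward selection over a small candidate basis (the candidate terms and target are invented for illustration; a real sparse PDD basis would use orthogonal polynomials of the input variables):

```python
import numpy as np

def forward_stepwise(Phi, y, tol=1e-8):
    """Greedily add basis columns while they still reduce the residual sum of squares."""
    selected = []
    remaining = list(range(Phi.shape[1]))
    base_rss = float(y @ y)
    while remaining:
        best_j, best_rss = None, base_rss
        for j in remaining:
            cols = Phi[:, selected + [j]]
            coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
            r = y - cols @ coef
            rss = float(r @ r)
            if rss < best_rss - tol:
                best_j, best_rss = j, rss
        if best_j is None:
            break                               # no term improves the fit: stop
        selected.append(best_j)
        remaining.remove(best_j)
        base_rss = best_rss
    return selected

rng = np.random.default_rng(4)
x1, x2 = rng.normal(size=(2, 300))
Phi = np.column_stack([x1, x2, x1 * x2, x1 ** 2, x2 ** 2])   # candidate PDD-like terms
y = 2.0 * x1 + 0.5 * x1 * x2                                  # only two active terms
selected = forward_stepwise(Phi, y)
```

    Because only a handful of columns survive, each least-squares solve in the loop stays cheap, which is the point made above about negligible regression cost.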

  18. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression

    PubMed Central

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson’s statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran’s index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China’s regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271
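    The proposed index has the form e'We / e'e for the OLS residual vector e and a normalized spatial weight matrix W. A sketch on a one-dimensional chain of sites (the weight matrix and the two test signals are invented for illustration):

```python
import numpy as np

def residual_autocorrelation(X, y, W):
    """Moran-like index e'We / e'e for OLS residuals and a row-normalized weight matrix."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    e = y - A @ coef
    Wn = W / W.sum(axis=1, keepdims=True)     # row-normalize the spatial weights
    return float(e @ Wn @ e) / float(e @ e)

# A chain of 20 sites, each site neighboring the adjacent ones
n = 20
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

idx = np.arange(n)
X = idx.astype(float)[:, None]
i_smooth = residual_autocorrelation(X, np.sin(idx / 3.0), W)   # spatially smooth field
i_checker = residual_autocorrelation(X, (-1.0) ** idx, W)      # alternating field
```

    Spatially smooth residuals give a clearly positive index, alternating ones a clearly negative index; unlike Durbin-Watson, nothing here depends on an ordering of the data points beyond what W encodes.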

  19. A classical regression framework for mediation analysis: fitting one model to estimate mediation effects.

    PubMed

    Saunders, Christina T; Blume, Jeffrey D

    2017-10-26

    Mediation analysis explores the degree to which an exposure's effect on an outcome is diverted through a mediating variable. We describe a classical regression framework for conducting mediation analyses in which estimates of causal mediation effects and their variance are obtained from the fit of a single regression model. The vector of changes in exposure pathway coefficients, which we named the essential mediation components (EMCs), is used to estimate standard causal mediation effects. Because these effects are often simple functions of the EMCs, an analytical expression for their model-based variance follows directly. Given this formula, it is instructive to revisit the performance of routinely used variance approximations (e.g., delta method and resampling methods). Requiring the fit of only one model reduces the computation time required for complex mediation analyses and permits the use of a rich suite of regression tools that are not easily implemented on a system of three equations, as would be required in the Baron-Kenny framework. Using data from the BRAIN-ICU study, we provide examples to illustrate the advantages of this framework and compare it with the existing approaches. © The Author 2017. Published by Oxford University Press.
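    In the linear case the logic behind estimating mediation effects from coefficient changes can be checked directly: the product-of-coefficients and difference-in-coefficients estimators of the indirect effect coincide exactly when all models are fit by OLS on the same sample. A sketch with synthetic exposure/mediator/outcome data (not the BRAIN-ICU data, and using the classical three-equation decomposition rather than the paper's single-model EMC machinery):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.normal(size=n)                         # exposure
m = 0.8 * x + rng.normal(size=n)               # mediator
y = 0.5 * x + 0.6 * m + rng.normal(size=n)     # outcome

def ols(cols, target):
    A = np.column_stack([np.ones(n)] + cols)
    return np.linalg.lstsq(A, target, rcond=None)[0]

c_total = ols([x], y)[1]              # y ~ x      : total effect c
fit_full = ols([x, m], y)             # y ~ x + m  : direct effect c', mediator slope b
a_path = ols([x], m)[1]               # m ~ x      : path a

indirect_product = a_path * fit_full[2]         # a * b
indirect_difference = c_total - fit_full[1]     # c - c'
```

    The change in the exposure coefficient when the mediator enters the model is precisely the quantity the essential mediation components capture.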

  20. Inclusive τ lepton hadronic decay in vector and axial-vector channels within dispersive approach to QCD

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nesterenko, A. V.

    The dispersive approach to QCD, which properly embodies the intrinsically nonperturbative constraints originating in the kinematic restrictions on relevant physical processes and extends the applicability range of perturbation theory towards the infrared domain, is briefly overviewed. The study of OPAL (update 2012) and ALEPH (update 2014) experimental data on inclusive τ lepton hadronic decay in vector and axial-vector channels within dispersive approach is presented.

  1. A comparative evaluation of endemic and non-endemic regions of visceral leishmaniasis (Kala-azar) in India with ground survey and space technology.

    PubMed

    Kesari, Shreekant; Bhunia, Gouri Sankar; Kumar, Vijay; Jeyaram, Algarswamy; Ranjan, Alok; Das, Pradeep

    2011-08-01

    In visceral leishmaniasis, phlebotomine vectors are targets for control measures. Understanding the ecosystem of the vectors is a prerequisite for creating these control measures. This study endeavours to delineate the suitable locations of Phlebotomus argentipes with relation to environmental characteristics between endemic and non-endemic districts in India. A cross-sectional survey was conducted on 25 villages in each district. Environmental data were obtained through remote sensing images and vector density was measured using a CDC light trap. Simple linear regression analysis was used to measure the association between climatic parameters and vector density. Using factor analysis, the relationship between land cover classes and P. argentipes density among the villages in both districts was investigated. The results of the regression analysis indicated that indoor temperature and relative humidity are the best predictors for P. argentipes distribution. Factor analysis confirmed breeding preferences for P. argentipes by landscape element. Minimum Normalised Difference Vegetation Index, marshy land and orchard/settlement produced high loading in an endemic region, whereas water bodies and dense forest were preferred in non-endemic sites. Soil properties between the two districts were studied and indicated that soil pH and moisture content is higher in endemic sites compared to non-endemic sites. The present study should be utilised to make critical decisions for vector surveillance and controlling Kala-azar disease vectors.

  2. Multiple concurrent recursive least squares identification with application to on-line spacecraft mass-property identification

    NASA Technical Reports Server (NTRS)

    Wilson, Edward (Inventor)

    2006-01-01

    The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identifications of each said group are run, treating the other unknown parameters appearing in their regression equations as if they were known perfectly, with said values provided by recursive least squares estimation from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
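    The building block of the invention is the recursive least squares update itself; a standard-form sketch for one linearly isolated parameter group (the "mass-property" labels and values are illustrative, not from the patent):

```python
import numpy as np

def rls_update(theta, P, x, y, lam=1.0):
    """One recursive least squares step for a measurement y ~ x . theta.

    lam is the forgetting factor (1.0 = ordinary RLS, no forgetting).
    """
    Px = P @ x
    k = Px / (lam + x @ Px)                  # gain vector
    theta = theta + k * (y - x @ theta)      # correct by the innovation
    P = (P - np.outer(k, Px)) / lam          # covariance update
    return theta, P

rng = np.random.default_rng(6)
true_theta = np.array([2.0, -1.0, 0.5])      # e.g. mass-property parameters of one group
theta = np.zeros(3)
P = 1e6 * np.eye(3)                          # large P: no prior knowledge
for _ in range(400):
    x = rng.normal(size=3)                   # regressor built from sensed accelerations
    y = x @ true_theta + 0.01 * rng.normal() # noisy measurement
    theta, P = rls_update(theta, P, x, y)
```

    In the multi-group scheme described above, several such estimators run concurrently, each substituting the other groups' latest estimates into its own regressor.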

  3. An integrated simulation and optimization approach for managing human health risks of atmospheric pollutants by coal-fired power plants.

    PubMed

    Dai, C; Cai, X H; Cai, Y P; Guo, H C; Sun, W; Tan, Q; Huang, G H

    2014-06-01

    This research developed a simulation-aided nonlinear programming model (SNPM). This model incorporated the consideration of pollutant dispersion modeling, and the management of coal blending and the related human health risks within a general modeling framework. In SNPM, the simulation effort (i.e., California puff [CALPUFF]) was used to forecast the fate of air pollutants for quantifying the health risk under various conditions, while the optimization studies were used to identify the optimal coal blending strategies from a number of alternatives. To solve the model, a surrogate-based indirect search approach was proposed, where the support vector regression (SVR) was used to create a set of easy-to-use and rapid-response surrogates for identifying the function relationships between coal-blending operating conditions and health risks. Through replacing the CALPUFF and the corresponding hazard quotient equation with the surrogates, the computation efficiency could be improved. The developed SNPM was applied to minimize the human health risk associated with air pollutants discharged from the Gaojing and Shijingshan power plants in the west of Beijing. Solution results indicated that it could be used for reducing the health risk of the public in the vicinity of the two power plants, identifying desired coal blending strategies for decision makers, and considering a proper balance between coal purchase cost and human health risk. A simulation-aided nonlinear programming model (SNPM) is developed. It integrates the advantages of CALPUFF and nonlinear programming model. To solve the model, a surrogate-based indirect search approach based on the combination of support vector regression and genetic algorithm is proposed. SNPM is applied to reduce the health risk caused by air pollutants discharged from the Gaojing and Shijingshan power plants in the west of Beijing.
    Solution results indicate that it is useful for generating coal blending schemes, reducing the health risk of the public, and reflecting the trade-off between coal purchase cost and health risk.
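    The surrogate-based indirect search can be sketched end to end. In this toy version, kernel ridge regression stands in for SVR, a dense grid search for the genetic algorithm, and a cheap analytic function for the expensive CALPUFF run; all names and numbers are illustrative assumptions:

```python
import numpy as np

def expensive_model(x):
    # Stand-in for a costly simulation: health risk vs. one operating condition
    return (x - 2.0) ** 2 + np.sin(x)

# 1) Sample the expensive model sparsely
x_train = np.linspace(0.0, 5.0, 15)
y_train = expensive_model(x_train)

# 2) Fit a cheap surrogate (RBF kernel ridge regression stands in for SVR here)
def rbf(a, b, sigma=0.4):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))

alpha = np.linalg.solve(rbf(x_train, x_train) + 1e-8 * np.eye(15), y_train)

def surrogate(x):
    return rbf(np.atleast_1d(x), x_train) @ alpha

# 3) Search the surrogate instead of re-running the expensive model
x_grid = np.linspace(0.0, 5.0, 2001)
x_best = x_grid[np.argmin(surrogate(x_grid))]
x_true = x_grid[np.argmin(expensive_model(x_grid))]   # brute-force check, possible only for this toy
```

    Fifteen expensive evaluations buy thousands of cheap surrogate evaluations, which is where the computational efficiency gain described above comes from.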

  4. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.
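    Of the error-bar sources compared, the Gaussian Process is the most direct: its predictive standard deviation grows as a query point moves away from the training data, which is exactly the domain-of-applicability signal. A minimal numpy GP sketch (RBF kernel; the one-dimensional data and hyperparameters are illustrative, not molecular descriptors):

```python
import numpy as np

def gp_posterior(X, y, Xs, sigma_f=1.0, length=1.0, noise=1e-2):
    """GP regression with an RBF kernel; returns predictive mean and std at Xs."""
    def k(a, b):
        return sigma_f ** 2 * np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * length ** 2))
    K = k(X, X) + noise * np.eye(len(X))
    Ks = k(Xs, X)
    Kss = k(Xs, Xs)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0, None))

X = np.linspace(-2, 2, 20)          # training inputs
y = np.sin(X)
mean, std = gp_posterior(X, y, np.array([0.0, 6.0]))   # query inside vs. far outside
```

    The query at 0.0 sits inside the training range and gets a tight error bar; the query at 6.0 is far outside and its std reverts to the prior scale, flagging it as outside the DOA.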

  5. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

    PubMed

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  6. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-12-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  7. Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

    NASA Astrophysics Data System (ADS)

    Schroeter, Timon Sebastian; Schwaighofer, Anton; Mika, Sebastian; Ter Laak, Antonius; Suelzle, Detlev; Ganzer, Ursula; Heinrich, Nikolaus; Müller, Klaus-Robert

    2007-09-01

    We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approaches based on the Mahalanobis distance to training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation, and on an external validation set of 536 molecules) and in how far the individual error bars can faithfully represent the actual prediction error.

  8. Discordance between net analyte signal theory and practical multivariate calibration.

    PubMed

    Brown, Christopher D

    2004-08-01

    Lorber's concept of net analyte signal is reviewed in the context of classical and inverse least-squares approaches to multivariate calibration. It is shown that, in the presence of device measurement error, the classical and inverse calibration procedures have radically different theoretical prediction objectives, and the assertion that the popular inverse least-squares procedures (including partial least squares, principal components regression) approximate Lorber's net analyte signal vector in the limit is disproved. Exact theoretical expressions for the prediction error bias, variance, and mean-squared error are given under general measurement error conditions, which reinforce the very discrepant behavior between these two predictive approaches, and Lorber's net analyte signal theory. Implications for multivariate figures of merit and numerous recently proposed preprocessing treatments involving orthogonal projections are also discussed.
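    Lorber's net analyte signal itself is simple to state: the part of the analyte's pure spectrum orthogonal to the subspace spanned by the interferent spectra. A sketch (the spectra are invented Gaussian bands plus a sloping baseline):

```python
import numpy as np

def net_analyte_signal(s_k, S_others):
    """Project the analyte spectrum onto the orthogonal complement of the interferents."""
    P = S_others @ np.linalg.pinv(S_others)   # projector onto the interferent subspace
    return s_k - P @ s_k

w = np.linspace(0, 1, 50)
s_analyte = np.exp(-((w - 0.4) / 0.1) ** 2)   # hypothetical analyte band
s_inter1 = np.exp(-((w - 0.6) / 0.15) ** 2)   # overlapping interferent band
s_inter2 = w                                   # sloping baseline as a second interferent
S_others = np.column_stack([s_inter1, s_inter2])
nas = net_analyte_signal(s_analyte, S_others)
```

    The NAS vector is orthogonal to every interferent spectrum by construction while retaining a nonzero component along the analyte, which is what makes it usable for univariate-style figures of merit.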

  9. Novel SHM method to locate damages in substructures based on VARX models

    NASA Astrophysics Data System (ADS)

    Ugalde, U.; Anduaga, J.; Martínez, F.; Iturrospe, A.

    2015-07-01

    A novel damage localization method is proposed, which is based on a substructuring approach and makes use of Vector Auto-Regressive with eXogenous input (VARX) models. The substructuring approach divides the monitored structure into several multi-DOF isolated substructures. Each individual substructure is then modelled as a VARX model, and its health is determined by analyzing the variation of that VARX model. The method can detect whether an isolated substructure is damaged and, in addition, can locate and quantify the damage within the substructure. No theoretical model of the structure is required; only the measured displacement data are needed to estimate each isolated substructure's VARX model. The proposed method is validated by simulations of a two-dimensional lattice structure.
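    A minimal sketch of the underlying idea, under the assumption of a simple 2-DOF substructure: fit auto-regressive coefficients from measured displacements by least squares, then flag damage when the coefficients drift from the healthy baseline. For brevity the exogenous input is folded into the noise term, so this is a plain VAR(1) rather than a full VARX fit; the simulated data and threshold are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(a, n=500):
    """2-channel AR(1) response y_t = A y_{t-1} + u_t (input folded into noise u)."""
    A = np.array([[a, 0.1], [0.1, a]])
    y = np.zeros((n, 2))
    u = rng.normal(scale=0.1, size=(n, 2))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + u[t]
    return y

def fit_var1(y):
    """Least-squares estimate of A in y_t ~ A y_{t-1}."""
    X, *_ = np.linalg.lstsq(y[:-1], y[1:], rcond=None)
    return X.T

A_healthy = fit_var1(simulate(0.8))
A_damaged = fit_var1(simulate(0.5))   # a stiffness change alters the dynamics
indicator = np.linalg.norm(A_damaged - A_healthy)
# A large coefficient shift points to damage within this substructure.
```

In the paper's scheme, one such indicator per substructure localizes the damage to the substructure whose model has changed.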

  10. Prediction of Spirometric Forced Expiratory Volume (FEV1) Data Using Support Vector Regression

    NASA Astrophysics Data System (ADS)

    Kavitha, A.; Sujatha, C. M.; Ramakrishnan, S.

    2010-01-01

    In this work, prediction of forced expiratory volume in 1 second (FEV1) in pulmonary function testing is carried out using a spirometer and support vector regression analysis. Pulmonary function data are measured with a flow volume spirometer from volunteers (N=175) using a standard data acquisition protocol. The acquired data are then used to predict FEV1. Support vector machines with polynomial kernel functions of four different orders were employed to predict the values of FEV1. The performance is evaluated by computing the average prediction accuracy for normal and abnormal cases. Results show that support vector machines are capable of predicting FEV1 in both normal and abnormal cases, and the average prediction accuracy for normal subjects was higher than that for abnormal subjects. Accuracy in prediction was found to be high for a regularization constant of C=10. Since FEV1 is the most significant parameter in the analysis of spirometric data, this method of assessment appears useful for diagnosing pulmonary abnormalities from incomplete or poorly recorded data.
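    A hedged scikit-learn sketch of this setup: an epsilon-SVR with a polynomial kernel and regularization constant C=10, as reported above. The synthetic features and the FEV1-like target are assumptions standing in for the volunteer data.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.uniform(1.0, 6.0, size=(175, 3))   # stand-ins for spirometric inputs
# Hypothetical FEV1-like target, roughly linear in the first and third features.
y = 0.8 * X[:, 0] - 0.1 * X[:, 2] + rng.normal(scale=0.05, size=175)

# Polynomial-kernel SVR with C=10, matching the abstract's reported setting.
model = SVR(kernel="poly", degree=2, coef0=1.0, C=10.0)
model.fit(X[:150], y[:150])
pred = model.predict(X[150:])
mae = float(np.mean(np.abs(pred - y[150:])))  # held-out mean absolute error
```

The study varied the polynomial order; here `degree` would be swept over the four orders to reproduce that comparison.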

  11. Habitat suitability and ecological niche profile of major malaria vectors in Cameroon

    PubMed Central

    2009-01-01

    Background Suitability of environmental conditions determines a species distribution in space and time. Understanding and modelling the ecological niche of mosquito disease vectors can, therefore, be a powerful predictor of the risk of exposure to the pathogens they transmit. In Africa, five anophelines are responsible for over 95% of total malaria transmission. However, detailed knowledge of the geographic distribution and ecological requirements of these species is to date still inadequate. Methods Indoor-resting mosquitoes were sampled from 386 villages covering the full range of ecological settings available in Cameroon, Central Africa. Using a predictive species distribution modeling approach based only on presence records, habitat suitability maps were constructed for the five major malaria vectors Anopheles gambiae, Anopheles funestus, Anopheles arabiensis, Anopheles nili and Anopheles moucheti. The influence of 17 climatic, topographic, and land use variables on mosquito geographic distribution was assessed by multivariate regression and ordination techniques. Results Twenty-four anopheline species were collected, of which 17 are known to transmit malaria in Africa. Ecological Niche Factor Analysis, Habitat Suitability modeling and Canonical Correspondence Analysis revealed marked differences among the five major malaria vector species, both in terms of ecological requirements and niche breadth. Eco-geographical variables (EGVs) related to human activity had the highest impact on habitat suitability for the five major malaria vectors, with areas of low population density being of marginal or unsuitable habitat quality. Sunlight exposure, rainfall, evapo-transpiration, relative humidity, and wind speed were among the most discriminative EGVs separating "forest" from "savanna" species. 
Conclusions The distribution of major malaria vectors in Cameroon is strongly affected by the impact of humans on the environment, with variables related to proximity to human settings being among the best predictors of habitat suitability. The ecologically more tolerant species An. gambiae and An. funestus were recorded in a wide range of eco-climatic settings. The other three major vectors, An. arabiensis, An. moucheti, and An. nili, were more specialized. Ecological niche and species distribution modelling should help improve malaria vector control interventions by targeting places and times where the impact on vector populations and disease transmission can be optimized. PMID:20028559

  12. Habitat suitability and ecological niche profile of major malaria vectors in Cameroon.

    PubMed

    Ayala, Diego; Costantini, Carlo; Ose, Kenji; Kamdem, Guy C; Antonio-Nkondjio, Christophe; Agbor, Jean-Pierre; Awono-Ambene, Parfait; Fontenille, Didier; Simard, Frédéric

    2009-12-23

    Suitability of environmental conditions determines a species distribution in space and time. Understanding and modelling the ecological niche of mosquito disease vectors can, therefore, be a powerful predictor of the risk of exposure to the pathogens they transmit. In Africa, five anophelines are responsible for over 95% of total malaria transmission. However, detailed knowledge of the geographic distribution and ecological requirements of these species is to date still inadequate. Indoor-resting mosquitoes were sampled from 386 villages covering the full range of ecological settings available in Cameroon, Central Africa. Using a predictive species distribution modeling approach based only on presence records, habitat suitability maps were constructed for the five major malaria vectors Anopheles gambiae, Anopheles funestus, Anopheles arabiensis, Anopheles nili and Anopheles moucheti. The influence of 17 climatic, topographic, and land use variables on mosquito geographic distribution was assessed by multivariate regression and ordination techniques. Twenty-four anopheline species were collected, of which 17 are known to transmit malaria in Africa. Ecological Niche Factor Analysis, Habitat Suitability modeling and Canonical Correspondence Analysis revealed marked differences among the five major malaria vector species, both in terms of ecological requirements and niche breadth. Eco-geographical variables (EGVs) related to human activity had the highest impact on habitat suitability for the five major malaria vectors, with areas of low population density being of marginal or unsuitable habitat quality. Sunlight exposure, rainfall, evapo-transpiration, relative humidity, and wind speed were among the most discriminative EGVs separating "forest" from "savanna" species. 
The distribution of major malaria vectors in Cameroon is strongly affected by the impact of humans on the environment, with variables related to proximity to human settings being among the best predictors of habitat suitability. The ecologically more tolerant species An. gambiae and An. funestus were recorded in a wide range of eco-climatic settings. The other three major vectors, An. arabiensis, An. moucheti, and An. nili, were more specialized. Ecological niche and species distribution modelling should help improve malaria vector control interventions by targeting places and times where the impact on vector populations and disease transmission can be optimized.

  13. Spatial Support Vector Regression to Detect Silent Errors in the Exascale Era

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Subasi, Omer; Di, Sheng; Bautista-Gomez, Leonardo

    As the exascale era approaches, the increasing capacity of high-performance computing (HPC) systems with targeted power and energy budget goals introduces significant challenges in reliability. Silent data corruptions (SDCs), or silent errors, are one of the major sources that corrupt the execution results of HPC applications without being detected. In this work, we explore a low-memory-overhead SDC detector, by leveraging epsilon-insensitive support vector machine regression, to detect SDCs occurring in HPC applications that can be characterized by an impact error bound. The key contributions are threefold. (1) Our design takes spatial features (i.e., neighbouring data values for each data point in a snapshot) into the training data, such that little memory overhead (less than 1%) is introduced. (2) We provide an in-depth study of the detection ability and performance under different parameters, and we optimize the detection range carefully. (3) Experiments with eight real-world HPC applications show that our detector can achieve a detection sensitivity (i.e., recall) of up to 99% while incurring a false positive rate below 1% in most cases. Our detector incurs low performance overhead, 5% on average, for all benchmarks studied in the paper. Compared with other state-of-the-art techniques, our detector exhibits the best tradeoff between detection ability and overheads.
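    The spatial-feature idea can be illustrated on a toy 1-D snapshot: each point is predicted from its immediate neighbours with epsilon-insensitive SVR, and a point whose observed value deviates strongly from the prediction is flagged as a suspected SDC. The window size, epsilon, and injected corruption are assumptions for the demo, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR

x = np.linspace(0, 2 * np.pi, 400)
snapshot = np.sin(x)                      # smooth "application state"

def window_features(v, w):
    """For each interior point, collect the w neighbours on each side."""
    return np.array([np.concatenate([v[i - w:i], v[i + 1:i + 1 + w]])
                     for i in range(w, len(v) - w)])

w = 2
det = SVR(kernel="rbf", epsilon=0.01).fit(window_features(snapshot, w),
                                          snapshot[w:-w])

corrupted = snapshot.copy()
corrupted[200] += 0.5                     # inject a silent error
resid = np.abs(det.predict(window_features(corrupted, w)) - corrupted[w:-w])
flagged = int(np.argmax(resid)) + w       # largest residual -> suspected SDC
```

In practice the detector keeps only the regressor and the small neighbour window per point, which is where the low memory overhead comes from.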

  14. A Personalized Electronic Movie Recommendation System Based on Support Vector Machine and Improved Particle Swarm Optimization

    PubMed Central

    Wang, Xibin; Luo, Fengji; Qian, Ying; Ranzi, Gianluca

    2016-01-01

    With the rapid development of ICT and Web technologies, a large amount of information is becoming available, and this is producing, in some instances, a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, there are information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, that assist a person in identifying possible products or services of interest based on his/her preferences. Among available approaches, Collaborative Filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., the relatively simple similarity calculation, the cold start problem, etc. In this context, this paper presents a new regression model based on support vector machine (SVM) classification and an improved PSO (IPSO) for the development of an electronic movie PRS. In its implementation, an SVM classification model is first established to obtain a preliminary movie recommendation list, based on which an SVM regression model is applied to predict movies’ ratings. The proposed PRS not only considers the movie’s content information but also integrates the users’ demographic and behavioral information to better capture the users’ interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set. PMID:27898691

  15. A Personalized Electronic Movie Recommendation System Based on Support Vector Machine and Improved Particle Swarm Optimization.

    PubMed

    Wang, Xibin; Luo, Fengji; Qian, Ying; Ranzi, Gianluca

    2016-01-01

    With the rapid development of ICT and Web technologies, a large amount of information is becoming available, and this is producing, in some instances, a condition of information overload. Under these conditions, it is difficult for a person to locate and access useful information for making decisions. To address this problem, there are information filtering systems, such as the personalized recommendation system (PRS) considered in this paper, that assist a person in identifying possible products or services of interest based on his/her preferences. Among available approaches, Collaborative Filtering (CF) is one of the most widely used recommendation techniques. However, CF has some limitations, e.g., the relatively simple similarity calculation, the cold start problem, etc. In this context, this paper presents a new regression model based on support vector machine (SVM) classification and an improved PSO (IPSO) for the development of an electronic movie PRS. In its implementation, an SVM classification model is first established to obtain a preliminary movie recommendation list, based on which an SVM regression model is applied to predict movies' ratings. The proposed PRS not only considers the movie's content information but also integrates the users' demographic and behavioral information to better capture the users' interests and preferences. The efficiency of the proposed method is verified by a series of experiments based on the MovieLens benchmark data set.

  16. Noncoplanar minimum delta V two-impulse and three-impulse orbital transfer from a regressing oblate earth assembly parking ellipse onto a flyby trans-Mars asymptotic velocity vector.

    NASA Technical Reports Server (NTRS)

    Bean, W. C.

    1971-01-01

    Comparison of two-impulse and three-impulse orbital transfer, using data from a 63-case numerical study. For each case investigated for which coplanarity of the regressing assembly parking ellipse was attained with the target asymptotic velocity vector, a two-impulse maneuver (or a one-impulse equivalent) was found for which the velocity expenditure was within 1% of a reference absolute minimum lower bound. Therefore, for the coplanar cases, use of a minimum delta-V three-impulse maneuver afforded scant improvement in velocity penalty. However, as the noncoplanarity of the parking ellipse and the target asymptotic velocity vector increased, there was a significant increase in the superiority of minimum delta-V three-impulse maneuvers for slowing the growth of velocity expenditure. It is concluded that a multiple-impulse maneuver should be contemplated if nonnominal launch conditions could occur.

  17. NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms

    PubMed Central

    Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

    2014-01-01

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene, and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that it outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
PMID:24667482
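    The regression decomposition that GENIE3 popularized, and that NIMEFI generalizes, is compact enough to sketch: each gene in turn is the regression target, all other genes are predictors, and feature importances score candidate regulatory links. The 4-gene toy data with a single planted link is an assumption for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n_samples, n_genes = 300, 4
expr = rng.normal(size=(n_samples, n_genes))
expr[:, 1] = 0.9 * expr[:, 0] + 0.1 * rng.normal(size=n_samples)  # planted 0 -> 1 link

scores = np.zeros((n_genes, n_genes))     # scores[i, j]: evidence for link i -> j
for j in range(n_genes):                  # one regression problem per target gene
    preds = [i for i in range(n_genes) if i != j]
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(expr[:, preds], expr[:, j])
    scores[preds, j] = rf.feature_importances_

best = np.unravel_index(np.argmax(scores), scores.shape)  # top-ranked candidate link
```

NIMEFI's twist is to repeat this with several importance methods on subsamples and average the resulting link rankings.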

  18. Non-Gaussian spatiotemporal simulation of multisite daily precipitation: downscaling framework

    NASA Astrophysics Data System (ADS)

    Ben Alaya, M. A.; Ouarda, T. B. M. J.; Chebana, F.

    2018-01-01

    Probabilistic regression approaches for downscaling daily precipitation are very useful. They provide the whole conditional distribution at each forecast step to better represent the temporal variability. The question addressed in this paper is: how can the spatiotemporal characteristics of multisite daily precipitation be simulated from probabilistic regression models? Recent publications point out the complexity of the multisite properties of daily precipitation and highlight the need for a flexible non-Gaussian tool. This work proposes a reasonable compromise between simplicity and flexibility that avoids model misspecification. A suitable nonparametric bootstrapping (NB) technique is adopted. A downscaling model which merges a vector generalized linear model (VGLM, as a probabilistic regression tool) and the proposed bootstrapping technique is introduced to simulate realistic multisite precipitation series. The model is applied to data sets from the southern part of the province of Quebec, Canada. It is shown that the model is capable of reproducing both at-site properties and the spatial structure of daily precipitation. Results indicate the superiority of the proposed NB technique over a multivariate autoregressive Gaussian framework (i.e. Gaussian copula).

  19. Modeling and forecasting US presidential election using learning algorithms

    NASA Astrophysics Data System (ADS)

    Zolghadr, Mohammad; Niaki, Seyed Armin Akhavan; Niaki, S. T. A.

    2017-09-01

    The primary objective of this research is to obtain an accurate forecasting model for the US presidential election. To identify a reliable model, artificial neural networks (ANN) and support vector regression (SVR) models are compared based on some specified performance measures. Moreover, six independent variables such as GDP, unemployment rate, the president's approval rate, and others are considered in a stepwise regression to identify significant variables. The president's approval rate is identified as the most significant variable, based on which eight other variables are identified and considered in the model development. Preprocessing methods are applied to prepare the data for the learning algorithms. The proposed procedure significantly increases the accuracy of the model by 50%. The learning algorithms (ANN and SVR) proved to be superior to linear regression based on each method's calculated performance measures. The SVR model is identified as the most accurate model among the other models as this model successfully predicted the outcome of the election in the last three elections (2004, 2008, and 2012). The proposed approach significantly increases the accuracy of the forecast.

  20. A novel approach for prediction of tacrolimus blood concentration in liver transplantation patients in the intensive care unit through support vector regression.

    PubMed

    Van Looy, Stijn; Verplancke, Thierry; Benoit, Dominique; Hoste, Eric; Van Maele, Georges; De Turck, Filip; Decruyenaere, Johan

    2007-01-01

    Tacrolimus is an important immunosuppressive drug for organ transplantation patients. It has a narrow therapeutic range, toxic side effects, and a blood concentration with wide intra- and interindividual variability. Hence, it is of the utmost importance to monitor tacrolimus blood concentration, thereby ensuring clinical effect and avoiding toxic side effects. Prediction models for tacrolimus blood concentration can improve clinical care by optimizing monitoring of these concentrations, especially in the initial phase after transplantation during intensive care unit (ICU) stay. This is the first study in the ICU in which support vector machines, as a new data modeling technique, are investigated and tested in their prediction capabilities of tacrolimus blood concentration. Linear support vector regression (SVR) and nonlinear radial basis function (RBF) SVR are compared with multiple linear regression (MLR). Tacrolimus blood concentrations, together with 35 other relevant variables from 50 liver transplantation patients, were extracted from our ICU database. This resulted in a dataset of 457 blood samples, on average between 9 and 10 samples per patient, finally resulting in a database of more than 16,000 data values. Nonlinear RBF SVR, linear SVR, and MLR were performed after selection of clinically relevant input variables and model parameters. Differences between observed and predicted tacrolimus blood concentrations were calculated. Prediction accuracy of the three methods was compared after fivefold cross-validation (Friedman test and Wilcoxon signed rank analysis). Linear SVR and nonlinear RBF SVR had mean absolute differences between observed and predicted tacrolimus blood concentrations of 2.31 ng/ml (standard deviation [SD] 2.47) and 2.38 ng/ml (SD 2.49), respectively. MLR had a mean absolute difference of 2.73 ng/ml (SD 3.79). The difference between linear SVR and MLR was statistically significant (p < 0.001). 
RBF SVR had the advantage of requiring only 2 input variables to perform this prediction in comparison to 15 and 16 variables needed by linear SVR and MLR, respectively. This is an indication of the superior prediction capability of nonlinear SVR. Prediction of tacrolimus blood concentration with linear and nonlinear SVR was excellent, and accuracy was superior in comparison with an MLR model.
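    The comparison protocol described above can be sketched with scikit-learn: linear SVR, RBF SVR, and multiple linear regression scored by five-fold cross-validated mean absolute error on a common dataset. The synthetic data below (457 samples, matching the study's sample count) is an assumption standing in for the clinical ICU data.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(4)
X = rng.normal(size=(457, 5))                    # 457 samples, as in the study
y = X @ np.array([1.5, -0.7, 0.3, 0.0, 0.0]) + rng.normal(scale=0.5, size=457)

models = {
    "linear SVR": SVR(kernel="linear", C=1.0),
    "RBF SVR": SVR(kernel="rbf", C=10.0),
    "MLR": LinearRegression(),
}
# Five-fold cross-validated mean absolute error for each method.
mae = {name: -cross_val_score(m, X, y, cv=5,
                              scoring="neg_mean_absolute_error").mean()
       for name, m in models.items()}
```

The study additionally applied Friedman and Wilcoxon signed-rank tests to the per-fold differences, which `scipy.stats` could supply.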

  1. Measures of Residual Risk with Connections to Regression, Risk Tracking, Surrogate Models, and Ambiguity

    DTIC Science & Technology

    2015-01-07

    Measures of residual risk view a random variable of interest in concert with an auxiliary random vector that helps to manage, predict, and mitigate the risk in the original variable. Residual risk can be exemplified as a quantification of the improved…

  2. Classification of vegetation types in military region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose

    2015-10-01

    In the decision-making process regarding planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in a hostile region, namely when cartographic information behind enemy lines is scarce or non-existent. The objective of the present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic feature selector. The next step, using the k-fold cross-validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logistic Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation density and identification of the main vegetation components. It was found that the best classifier in the first approach is Sparse Multinomial Logistic Regression (SMLR). In the second approach, the implemented methodology applied to high-resolution images showed that the best performance was achieved by the KNN classifier and PCLDC. Comparing the two approaches reveals a multiscale issue: for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.

  3. DOA Finding with Support Vector Regression Based Forward-Backward Linear Prediction.

    PubMed

    Pan, Jingjing; Wang, Yide; Le Bastard, Cédric; Wang, Tianzhen

    2017-05-27

    Direction-of-arrival (DOA) estimation has drawn considerable attention in array signal processing, particularly with coherent signals and a limited number of snapshots. Forward-backward linear prediction (FBLP) is able to directly deal with coherent signals. Support vector regression (SVR) is robust with small samples. This paper proposes the combination of the advantages of FBLP and SVR in the estimation of DOAs of coherent incoming signals with low snapshots. The performance of the proposed method is validated with numerical simulations in coherent scenarios, in terms of different angle separations, numbers of snapshots, and signal-to-noise ratios (SNRs). Simulation results show the effectiveness of the proposed method.

  4. Applications of Support Vector Machines In Chemo And Bioinformatics

    NASA Astrophysics Data System (ADS)

    Jayaraman, V. K.; Sundararajan, V.

    2010-10-01

    Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced at a rapid pace by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable to inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and the structural risk minimization principle. SVM regression closely follows the classification methodology. In this work, recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.

  5. A Wireless Electronic Nose System Using a Fe2O3 Gas Sensing Array and Least Squares Support Vector Regression

    PubMed Central

    Song, Kai; Wang, Qi; Liu, Qi; Zhang, Hongquan; Cheng, Yingguo

    2011-01-01

    This paper describes the design and implementation of a wireless electronic nose (WEN) system which can online detect the combustible gases methane and hydrogen (CH4/H2) and estimate their concentrations, either singly or in mixtures. The system is composed of two wireless sensor nodes—a slave node and a master node. The former comprises a Fe2O3 gas sensing array for the combustible gas detection, a digital signal processor (DSP) system for real-time sampling and processing of the sensor array data, and a wireless transceiver unit (WTU) by which the detection results can be transmitted to the master node connected to a computer. A type of Fe2O3 gas sensor insensitive to humidity is developed for resistance to environmental influences. A threshold-based least squares support vector regression (LS-SVR) estimator is implemented on a DSP for classification and concentration measurements. Experimental results confirm that LS-SVR produces higher accuracy compared with artificial neural networks (ANNs) and a faster convergence rate than the standard support vector regression (SVR). The designed WEN system effectively achieves gas mixture analysis in a real-time process. PMID:22346587

  6. A support vector regression-firefly algorithm-based model for limiting velocity prediction in sewer pipes.

    PubMed

    Ebtehaj, Isa; Bonakdari, Hossein

    2016-01-01

    Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used to determine these SVM parameters. The parameters that actually affect the Fr calculation are identified by dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error = 0.116) compared with other methods.
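    The paper tunes the SVR parameters with the firefly algorithm; as a stand-in (FFA is not in standard libraries), this sketch tunes (C, gamma, epsilon) by cross-validated grid search on synthetic data built from dimensionless inputs like those named above (C_V, d/R, D_gr). All data, the Fr-like target, and the grids are assumptions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(5)
X = rng.uniform(0.1, 1.0, size=(200, 3))        # stand-ins for C_V, d/R, D_gr
y = 4.0 * X[:, 0] ** 0.5 + X[:, 1] + rng.normal(scale=0.05, size=200)  # Fr proxy

# Grid search as a generic substitute for the firefly-algorithm parameter search.
grid = {"C": [1, 10, 100], "gamma": [0.1, 1.0], "epsilon": [0.01, 0.1]}
search = GridSearchCV(SVR(kernel="rbf"), grid, cv=3,
                      scoring="neg_mean_absolute_error").fit(X, y)
best_params, best_mae = search.best_params_, -search.best_score_
```

Swapping grid search for FFA changes only how candidate (C, gamma, epsilon) triples are proposed; the cross-validated objective stays the same.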

  7. Multiple-output support vector machine regression with feature selection for arousal/valence space emotion assessment.

    PubMed

    Torres-Valencia, Cristian A; Álvarez, Mauricio A; Orozco-Gutiérrez, Alvaro A

    2014-01-01

    Human emotion recognition (HER) allows the assessment of the affective state of a subject. Until recently, such emotional states were described in terms of discrete emotions, like happiness or contempt. In order to cover a wide range of emotions, researchers in the field have introduced different dimensional spaces for emotion description that allow the characterization of affective states in terms of several variables or dimensions that measure distinct aspects of the emotion. One of the most common of such dimensional spaces is the bidimensional Arousal/Valence space. To the best of our knowledge, all HER systems so far have modelled the dimensions in these spaces independently. In this paper, we study the effect of modelling the output dimensions simultaneously and show experimentally the advantages of modelling them in this way. We consider a multimodal approach by including features from the Electroencephalogram and a few physiological signals. For modelling the multiple outputs, we employ a multiple-output regressor based on support vector machines. We also include a feature selection stage developed within an embedded approach known as Recursive Feature Elimination (RFE), proposed initially for SVM. The results show that several features can be eliminated using the multiple-output support vector regressor with RFE without affecting the performance of the regressor. From the analysis of the features selected in smaller subsets via RFE, it can be observed that the signals most informative for arousal and valence discrimination are the EEG, Electrooculogram/Electromyogram (EOG/EMG) and the Galvanic Skin Response (GSR).
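    A sketch of the RFE stage with a linear support vector regressor in scikit-learn. scikit-learn's RFE handles one output at a time, so arousal and valence are treated per dimension here; the paper's multiple-output regressor is not reproduced, and the synthetic "physiological" features are assumptions.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.feature_selection import RFE

rng = np.random.default_rng(6)
X = rng.normal(size=(120, 10))                  # 10 candidate features
arousal = X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=120)
valence = X[:, 1] - X[:, 2] + 0.1 * rng.normal(size=120)

# Recursive Feature Elimination: repeatedly drop the feature with the
# smallest linear-SVR coefficient until 3 features remain.
selected = {}
for name, y in [("arousal", arousal), ("valence", valence)]:
    rfe = RFE(SVR(kernel="linear"), n_features_to_select=3).fit(X, y)
    selected[name] = set(np.flatnonzero(rfe.support_))
```

RFE needs an estimator exposing `coef_`, hence the linear kernel; with an RBF kernel a different importance criterion would be required.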

  8. Carbon financial markets: A time-frequency analysis of CO2 prices

    NASA Astrophysics Data System (ADS)

    Sousa, Rita; Aguiar-Conraria, Luís; Soares, Maria Joana

    2014-11-01

    We characterize the interrelation of CO2 prices with energy prices (electricity, gas and coal), and with economic activity. Previous studies have relied on time-domain techniques, such as Vector Auto-Regressions. In this study, we use multivariate wavelet analysis, which operates in the time-frequency domain. Wavelet analysis provides convenient tools to distinguish relations at particular frequencies and at particular time horizons. Our empirical approach has the potential to identify relations getting stronger and then disappearing over specific time intervals and frequencies. We are able to examine the coherency of these variables and lead-lag relations at different frequencies for the time periods in focus.

  9. Energy-free machine learning force field for aluminum.

    PubMed

    Kruglov, Ivan; Sergeev, Oleg; Yanilkin, Alexey; Oganov, Artem R

    2017-08-17

    We used the machine learning technique of Li et al. (PRL 114, 2015) for molecular dynamics simulations. Atomic configurations were described by a feature matrix based on internal vectors, and linear regression was used as the learning technique. We implemented this approach in the LAMMPS code. The method was applied to crystalline and liquid aluminum and uranium at different temperatures and densities, and showed the highest accuracy among different published potentials. The phonon density of states, entropy, and melting temperature of aluminum were calculated using this machine learning potential. The results are in excellent agreement with experimental data and with the results of full ab initio calculations.

  10. Novel solutions for an old disease: diagnosis of acute appendicitis with random forest, support vector machines, and artificial neural networks.

    PubMed

    Hsieh, Chung-Ho; Lu, Ruey-Hwa; Lee, Nai-Hsin; Chiu, Wen-Ta; Hsu, Min-Huei; Li, Yu-Chuan Jack

    2011-01-01

    Diagnosing acute appendicitis clinically is still difficult. We developed random forest, support vector machine, and artificial neural network models to diagnose acute appendicitis. Between January 2006 and December 2008, patients who had a consultation session with surgeons for suspected acute appendicitis were enrolled. Seventy-five percent of the data set was used to construct models including random forest, support vector machines, artificial neural networks, and logistic regression. Twenty-five percent of the data set was withheld to evaluate model performance. The area under the receiver operating characteristic curve (AUC) was used to evaluate performance, which was compared with that of the Alvarado score. Data from a total of 180 patients were collected, 135 used for training and 45 for testing. The mean age of patients was 39.4 years (range, 16-85). Final diagnosis revealed 115 patients with and 65 without appendicitis. The AUC of random forest, support vector machines, artificial neural networks, logistic regression, and Alvarado was 0.98, 0.96, 0.91, 0.87, and 0.77, respectively. The sensitivity, specificity, and positive and negative predictive values of random forest were 94%, 100%, 100%, and 87%, respectively. Random forest performed better than artificial neural networks, logistic regression, and Alvarado. We demonstrated that random forest can predict acute appendicitis with good accuracy and, deployed appropriately, can be an effective tool in clinical decision making. Copyright © 2011 Mosby, Inc. All rights reserved.
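    The study design above, fitting a classifier on tabular clinical features and scoring it by AUC on a held-out 25% split, can be sketched with scikit-learn. The data below is synthetic and the feature count is an illustrative assumption; only the split ratio and the random forest/AUC pairing mirror the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(180, 8))    # 180 "patients", 8 clinical features
# synthetic diagnosis driven by two features plus noise
y = (X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=180) > 0).astype(int)

# 75% train / 25% held-out test, as in the study
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
```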

  11. Low-rank separated representation surrogates of high-dimensional stochastic functions: Application in Bayesian inference

    NASA Astrophysics Data System (ADS)

    Validi, AbdoulAhad

    2014-03-01

    This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow.

  12. Epigenome-wide cross-tissue predictive modeling and comparison of cord blood and placental methylation in a birth cohort

    PubMed Central

    De Carli, Margherita M; Baccarelli, Andrea A; Trevisi, Letizia; Pantic, Ivan; Brennan, Kasey JM; Hacker, Michele R; Loudon, Holly; Brunst, Kelly J; Wright, Robert O; Wright, Rosalind J; Just, Allan C

    2017-01-01

    Aim: We compared predictive modeling approaches to estimate placental methylation using cord blood methylation. Materials & methods: We performed locus-specific methylation prediction using both linear regression and support vector machine models with 174 matched pairs of 450k arrays. Results: At most CpG sites, both approaches gave poor predictions in spite of a misleading improvement in array-wide correlation. CpG islands and gene promoters, but not enhancers, were the genomic contexts where the correlation between measured and predicted placental methylation levels achieved higher values. We provide a list of 714 sites where both models achieved an R2 ≥0.75. Conclusion: The present study indicates the need for caution in interpreting cross-tissue predictions. Few methylation sites can be predicted between cord blood and placenta. PMID:28234020

  13. A Novel Continuous Blood Pressure Estimation Approach Based on Data Mining Techniques.

    PubMed

    Miao, Fen; Fu, Nan; Zhang, Yuan-Ting; Ding, Xiao-Rong; Hong, Xi; He, Qingyun; Li, Ye

    2017-11-01

    Continuous blood pressure (BP) estimation using pulse transit time (PTT) is a promising method for unobtrusive BP measurement. However, the accuracy of this approach must be improved for it to be viable for a wide range of applications. This study proposes a novel continuous BP estimation approach that combines data mining techniques with a traditional mechanism-driven model. First, 14 features derived from simultaneous electrocardiogram and photoplethysmogram signals were extracted for beat-to-beat BP estimation. A genetic algorithm-based feature selection method was then used to select BP indicators for each subject. Multivariate linear regression and support vector regression were employed to develop the BP model. The accuracy and robustness of the proposed approach were validated for static, dynamic, and follow-up performance. Experimental results based on 73 subjects showed that the proposed approach exhibited excellent accuracy in static BP estimation, with a correlation coefficient and mean error of 0.852 and -0.001 ± 3.102 mmHg for systolic BP, and 0.790 and -0.004 ± 2.199 mmHg for diastolic BP. Similar performance was observed for dynamic BP estimation. The robustness results indicated that the estimation accuracy was somewhat lower one day after model construction but remained relatively stable from one day to six months after construction. The proposed approach is superior to the state-of-the-art PTT-based model, with an approximately 2-mmHg reduction in the standard deviation at different time intervals, thus providing potentially novel insights for cuffless BP estimation.

  14. Improving near-infrared prediction model robustness with support vector machine regression: a pharmaceutical tablet assay example.

    PubMed

    Igne, Benoît; Drennen, James K; Anderson, Carl A

    2014-01-01

    Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression for handling nonlinearity and improving the robustness of calibration models, in scenarios where the calibration set did not include all the variability present in the test set, was evaluated. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.

  15. Effect of removing the common mode errors on linear regression analysis of noise amplitudes in position time series of a regional GPS network & a case study of GPS stations in Southern California

    NASA Astrophysics Data System (ADS)

    Jiang, Weiping; Ma, Jun; Li, Zhao; Zhou, Xiaohui; Zhou, Boye

    2018-05-01

    The analysis of the correlations between the noise in different components of GPS stations has positive significance for those trying to obtain more accurate uncertainty of velocity with respect to station motion. Previous research into noise in GPS position time series focused mainly on single-component evaluation, which affects the acquisition of precise station positions, the velocity field, and its uncertainty. In this study, before and after removing the common-mode error (CME), we performed one-dimensional linear regression analysis of the noise amplitude vectors in different components of 126 GPS stations in Southern California, using a combination of white noise, flicker noise, and random walk noise. The results show that, on the one hand, there are above-moderate degrees of correlation between the white noise amplitude vectors in all components of the stations before and after removal of the CME, while the correlations between flicker noise amplitude vectors in horizontal and vertical components are enhanced from uncorrelated to moderately correlated by removing the CME. On the other hand, the significance tests show that all of the obtained linear regression equations, which represent a unique function of the noise amplitude in any two components, are of practical value after removing the CME. According to the noise amplitude estimates in two components and the linear regression equations, more accurate noise amplitudes can be acquired for the two components.
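    The per-pair analysis is, at its core, a one-dimensional linear regression between two noise-amplitude vectors followed by a significance test. A minimal sketch with synthetic amplitudes (scipy's `linregress` reports both the correlation and the slope's p-value); the amplitude ranges and noise level are illustrative assumptions:

```python
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(0)
east = rng.uniform(1.0, 3.0, size=126)                 # white-noise amplitudes, east (mm)
north = 0.8 * east + rng.normal(scale=0.2, size=126)   # correlated north amplitudes

fit = linregress(east, north)
# fit.slope / fit.intercept give the regression equation between components;
# fit.rvalue is the correlation and fit.pvalue the slope's significance.
```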

  16. High dimensional linear regression models under long memory dependence and measurement error

    NASA Astrophysics Data System (ADS)

    Kaul, Abhishek

    This dissertation consists of three chapters. The first chapter introduces the models under consideration and motivates the problems of interest. A brief literature review is also provided in this chapter. The second chapter investigates the properties of the Lasso under long range dependent model errors. The Lasso is a computationally efficient approach to model selection and estimation, and its properties are well studied when the regression errors are independent and identically distributed. We study the case where the regression errors form a long memory moving average process. We establish a finite sample oracle inequality for the Lasso solution. We then show the asymptotic sign consistency in this setup. These results are established in the high dimensional setup (p > n), where p can increase exponentially with n. Finally, we show the n^(1/2-d)-consistency of the Lasso, along with the oracle property of the adaptive Lasso, in the case where p is fixed. Here d is the memory parameter of the stationary error sequence. The performance of the Lasso is also analysed in the present setup with a simulation study. The third chapter proposes and investigates the properties of a penalized quantile based estimator for measurement error models. Standard formulations of prediction problems in high dimension regression models assume the availability of fully observed covariates and sub-Gaussian and homogeneous model errors. This makes these methods inapplicable to measurement error models, where covariates are unobservable and observations are possibly non sub-Gaussian and heterogeneous. We propose weighted penalized corrected quantile estimators for the regression parameter vector in linear regression models with additive measurement errors, where the unobservable covariates are nonrandom. The proposed estimators forgo the need for the above mentioned model assumptions. We study these estimators in both the fixed dimensional and high dimensional sparse setups; in the latter setup, the dimensionality can grow exponentially with the sample size. In the fixed dimensional setting we provide the oracle properties associated with the proposed estimators. In the high dimensional setting, we provide bounds for the statistical error associated with the estimation that hold with asymptotic probability 1, thereby providing the ℓ1-consistency of the proposed estimator. We also establish the model selection consistency in terms of the correctly estimated zero components of the parameter vector. A simulation study that investigates the finite sample accuracy of the proposed estimator is also included in this chapter.
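    The p > n Lasso setup studied in the second chapter can be sketched with scikit-learn. The i.i.d. Gaussian errors below are a simplification (the chapter's point is precisely that long-memory errors change the analysis), and all sizes and the penalty level are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 100, 300                                   # high dimensional: p > n
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -2.0, 1.5]                       # sparse truth: 3 active coefficients
y = X @ beta + rng.normal(scale=0.5, size=n)      # i.i.d. errors (simplification)

fit = Lasso(alpha=0.1).fit(X, y)
support = set(np.flatnonzero(fit.coef_).tolist()) # estimated active set
```

Sign consistency, the property the chapter establishes under long-memory errors, means the estimated support recovers the true nonzero coefficients with their correct signs, as it does in this easy i.i.d. case.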

  17. Multiclass Reduced-Set Support Vector Machines

    NASA Technical Reports Server (NTRS)

    Tang, Benyang; Mazzoni, Dominic

    2006-01-01

    There are well-established methods for reducing the number of support vectors in a trained binary support vector machine, often with minimal impact on accuracy. We show how reduced-set methods can be applied to multiclass SVMs made up of several binary SVMs, with significantly better results than reducing each binary SVM independently. Our approach is based on Burges' approach that constructs each reduced-set vector as the pre-image of a vector in kernel space, but we extend this by recomputing the SVM weights and bias optimally using the original SVM objective function. This leads to greater accuracy for a binary reduced-set SVM, and also allows vectors to be 'shared' between multiple binary SVMs for greater multiclass accuracy with fewer reduced-set vectors. We also propose computing pre-images using differential evolution, which we have found to be more robust than gradient descent alone. We show experimental results on a variety of problems and find that this new approach is consistently better than previous multiclass reduced-set methods, sometimes with a dramatic difference.

  18. Predicting beta-turns in proteins using support vector machines with fractional polynomials

    PubMed Central

    2013-01-01

    Background β-turns are a secondary structure type that plays an essential role in molecular recognition, protein folding, and stability. They are the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for fold recognition and drug design. Results We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turns prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. Conclusions In this paper, we present a comprehensive approach for β-turns prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods. PMID:24565438

  19. Predicting beta-turns in proteins using support vector machines with fractional polynomials.

    PubMed

    Elbashir, Murtada; Wang, Jianxin; Wu, Fang-Xiang; Wang, Lusheng

    2013-11-07

    β-turns are a secondary structure type that plays an essential role in molecular recognition, protein folding, and stability. They are the most common type of non-repetitive structure, since 25% of amino acids in protein structures are situated on them. Their prediction is considered to be one of the crucial problems in bioinformatics and molecular biology, which can provide valuable insights and inputs for fold recognition and drug design. We propose an approach that combines support vector machines (SVMs) and logistic regression (LR) in a hybrid prediction method, which we call H-SVM-LR, to predict β-turns in proteins. Fractional polynomials are used for LR modeling. We utilize position specific scoring matrices (PSSMs) and predicted secondary structure (PSS) as features. Our simulation studies show that H-SVM-LR achieves Qtotal of 82.87%, 82.84%, and 82.32% on the BT426, BT547, and BT823 datasets, respectively. These values are the highest among β-turns prediction methods that are based on PSSMs and secondary structure information. H-SVM-LR also achieves favorable performance in predicting β-turns as measured by the Matthews correlation coefficient (MCC) on these datasets. Furthermore, H-SVM-LR shows good performance when considering shape strings as additional features. In this paper, we present a comprehensive approach for β-turns prediction. Experiments show that our proposed approach achieves better performance compared to other competing prediction methods.
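    The hybrid SVM-plus-LR idea can be sketched as simple stacking: an SVM's decision values become the input to a logistic regression. The fractional-polynomial terms, PSSM features, and β-turn data of H-SVM-LR are all omitted here; everything below is a synthetic stand-in for illustration only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = (X[:, 0] - X[:, 1] > 0).astype(int)        # synthetic binary labels

svm = SVC(kernel="rbf").fit(X[:200], y[:200])  # stage 1: SVM on raw features
z_tr = svm.decision_function(X[:200]).reshape(-1, 1)
lr = LogisticRegression().fit(z_tr, y[:200])   # stage 2: LR on SVM scores

z_te = svm.decision_function(X[200:]).reshape(-1, 1)
acc = float((lr.predict(z_te) == y[200:]).mean())
prob = lr.predict_proba(z_te)[:, 1]            # probabilistic output from LR stage
```

A practical benefit of the second stage is that the LR converts raw SVM margins into class probabilities, which is where H-SVM-LR's fractional-polynomial terms would enter.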

  20. RNAi in Arthropods: Insight into the Machinery and Applications for Understanding the Pathogen-Vector Interface

    PubMed Central

    Barnard, Annette-Christi; Nijhof, Ard M.; Fick, Wilma; Stutzer, Christian; Maritz-Olivier, Christine

    2012-01-01

    The availability of genome sequencing data in combination with knowledge of expressed genes via transcriptome and proteome data has greatly advanced our understanding of arthropod vectors of disease. Not only have we gained insight into vector biology, but also into their respective vector-pathogen interactions. By combining the strengths of postgenomic databases and reverse genetic approaches such as RNAi, the numbers of available drug and vaccine targets, as well as number of transgenes for subsequent transgenic or paratransgenic approaches, have expanded. These are now paving the way for in-field control strategies of vectors and their pathogens. Basic scientific questions, such as understanding the basic components of the vector RNAi machinery, is vital, as this allows for the transfer of basic RNAi machinery components into RNAi-deficient vectors, thereby expanding the genetic toolbox of these RNAi-deficient vectors and pathogens. In this review, we focus on the current knowledge of arthropod vector RNAi machinery and the impact of RNAi on understanding vector biology and vector-pathogen interactions for which vector genomic data is available on VectorBase. PMID:24705082

  1. A Novel Degradation Identification Method for Wind Turbine Pitch System

    NASA Astrophysics Data System (ADS)

    Guo, Hui-Dong

    2018-04-01

    It is difficult for the traditional threshold value method to identify degradation of operating equipment accurately. A novel degradation evaluation method suitable for implementing a wind turbine condition maintenance strategy is proposed in this paper. Based on an analysis of the typical variable-speed pitch-to-feather control principle and the monitored parameters of the pitch system, a multi-input multi-output (MIMO) regression model was applied to the pitch system, with wind speed and generated power as input parameters, and wheel rotation speed, pitch angle, and the motor driving current for the three blades as output parameters. Then, the difference between the on-line measurement and the value calculated by the MIMO regression model, fitted with the least squares support vector machine (LSSVM) method, was defined as the Observed Vector of the system. A Gaussian mixture model (GMM) was applied to fit the distribution of the multidimensional Observed Vectors. Applying the established model, the Degradation Index was calculated using the SCADA data of a wind turbine with a damaged pitch bearing retainer and rolling body, which illustrated the feasibility of the proposed method.
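    The second stage of the method, fitting a GMM to residual "Observed Vectors" from a healthy period and scoring new residuals against it, can be sketched with scikit-learn. The LSSVM regression stage is omitted, the residuals are synthetic, and the negative mean log-likelihood below is only a stand-in for the paper's Degradation Index.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
healthy = rng.normal(scale=0.1, size=(500, 5))            # residuals of 5 outputs, healthy period
degraded = rng.normal(loc=0.5, scale=0.3, size=(50, 5))   # shifted residuals after damage

gmm = GaussianMixture(n_components=2, random_state=0).fit(healthy)
index_healthy = -gmm.score_samples(healthy).mean()        # low: residuals look normal
index_degraded = -gmm.score_samples(degraded).mean()      # high: residuals are unlikely
```

Residuals that drift away from the healthy distribution receive low likelihood under the GMM, so the index rises with degradation.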

  2. A vector space model approach to identify genetically related diseases.

    PubMed

    Sarkar, Indra Neil

    2012-01-01

    The relationship between diseases and their causative genes can be complex, especially in the case of polygenic diseases. Further exacerbating the challenges in their study is that many genes may be causally related to multiple diseases. This study explored the relationship between diseases through the adaptation of an approach pioneered in the context of information retrieval: vector space models. A vector space model approach was developed that bridges gene-disease knowledge inferred across three knowledge bases: Online Mendelian Inheritance in Man, GenBank, and Medline. The approach was then used to identify potentially related diseases for two target diseases: Alzheimer disease and Prader-Willi Syndrome. In the case of both Alzheimer disease and Prader-Willi Syndrome, a set of plausible diseases was identified that may warrant further exploration. This study furthers seminal work by Swanson et al. that demonstrated the potential for mining literature for putative correlations. Using a vector space modeling approach, information from both biomedical literature and genomic resources (like GenBank) can be combined towards the identification of putative correlations of interest. To this end, the relevance of the predicted diseases of interest in this study using the vector space modeling approach was validated based on supporting literature. The results of this study suggest that a vector space model approach may be a useful means to identify potential relationships between complex diseases, and thereby enable the coordination of gene-based findings across multiple complex diseases.
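    The core vector space operation is representing each disease as a weighted vector over a shared gene vocabulary and ranking relatedness by cosine similarity. The weights and the third disease below are invented for illustration, though the gene-disease pairings (APP/PSEN1/APOE for Alzheimer disease, SNRPN/NDN for Prader-Willi) are conventional.

```python
import numpy as np

genes = ["APP", "PSEN1", "APOE", "SNRPN", "NDN"]           # shared gene vocabulary
vecs = {
    "Alzheimer disease":     np.array([3.0, 2.0, 4.0, 0.0, 0.0]),
    "Prader-Willi syndrome": np.array([0.0, 0.0, 0.0, 5.0, 3.0]),
    "Disease X":             np.array([1.0, 0.0, 2.0, 0.0, 0.0]),  # hypothetical
}

def cosine(u, v):
    # cosine similarity: 1 for parallel vectors, 0 for disjoint gene sets
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_ad_x = cosine(vecs["Alzheimer disease"], vecs["Disease X"])
sim_ad_pws = cosine(vecs["Alzheimer disease"], vecs["Prader-Willi syndrome"])
```

Diseases sharing weighted genes score high (Alzheimer disease vs. Disease X), while diseases with disjoint gene sets score zero, which is what makes the ranking usable for surfacing candidate relationships.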

  3. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.
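    One of the sparse coding algorithms compared in the brief, orthogonal matching pursuit, is available in scikit-learn. A minimal sparse-recovery sketch for linear regression, with assumed problem sizes and noise level:

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 40))                 # redundant features
w = np.zeros(40)
w[[3, 17, 25]] = [1.5, -2.0, 1.0]              # sparse ground truth
y = X @ w + rng.normal(scale=0.05, size=100)

omp = OrthogonalMatchingPursuit(n_nonzero_coefs=3).fit(X, y)
support = np.flatnonzero(omp.coef_)            # recovered feature subset
```

Like the l1-norm SVR, OMP selects a small subset of useful features, but it does so greedily, one feature per iteration, which is the source of the efficiency advantage the brief reports.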

  4. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    PubMed Central

    Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-01-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR), and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than the SVM, they achieved good classification, with CCRs of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as fast, accurate, and non-invasive sensors for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate, and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin. PMID:29596375

  5. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    PubMed

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR), and k-Nearest neighbours (k-NN). The SVM with radial basis kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than the SVM, they achieved good classification, with CCRs of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as fast, accurate, and non-invasive sensors for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate, and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.
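    The two figures of merit reported above, correct classification rate and Cohen's kappa, pair naturally with an RBF-kernel SVM in scikit-learn. The "image features" below are synthetic stand-ins (fewer than the study's 27 colour/texture features), and the labeling rule is an assumption.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 8))                  # 160 fish, 8 stand-in image features
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # diet label, synthetic rule

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr) # radial basis kernel, as in the study
pred = clf.predict(X_te)
ccr = accuracy_score(y_te, pred)               # correct classification rate
kappa = cohen_kappa_score(y_te, pred)          # chance-corrected agreement
```

Kappa complements the CCR by discounting agreement expected by chance, which matters when class proportions are unbalanced.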

  6. Canopy Spectral Reflectance as a Predictor of Soil Water Potential in Rice

    NASA Astrophysics Data System (ADS)

    Panigrahi, N.; Das, B. S.

    2018-04-01

    Soil water potential (SWP) is a key parameter for characterizing water stress. Typically, a tensiometer is used to measure SWP. However, the measurement range for commercially available tensiometers is limited to -90 kPa, and a tensiometer can only provide an estimate of SWP at a single location. In this study, a new approach was developed for estimating SWP from spectral reflectance data of a standing rice crop over the visible to shortwave-infrared region (wavelength: 350-2,500 nm). Five water stress treatments corresponding to targeted SWPs of -30, -50, -70, -120, and -140 kPa were examined by withholding irrigation during the vegetative growth stage of three rice varieties. Tensiometers and a mechanistic water flow model were used for monitoring SWP. Spectral models for SWP were developed using partial-least-squares regression (PLSR), support vector regression (SVR), and coupled PLSR and feature selection (PLSRFS) approaches. Results showed that the SVR approach was the best model for estimating SWP from spectral reflectance data, with coefficient of determination values of 0.71 and 0.55 for the calibration and validation data sets, respectively. Observed root-mean-squared residuals for the predicted SWPs were in the range of -7 to -19 kPa. A new spectral water stress index was also developed using the reflectance values at 745 and 2,002 nm, which showed strong correlation with relative water contents and electrolyte leakage. This new approach is rapid and noninvasive and may be used for estimating SWP over large areas.

  7. Continuous Space Estimation: Increasing WiFi-Based Indoor Localization Resolution without Increasing the Site-Survey Effort.

    PubMed

    Hernández, Noelia; Ocaña, Manuel; Alonso, Jose M; Kim, Euntai

    2017-01-13

    Although much research has taken place in WiFi indoor localization systems, their accuracy can still be improved. When designing this kind of system, fingerprint-based methods are a common choice. The problem with fingerprint-based methods comes with the need of site surveying the environment, which is effort consuming. In this work, we propose an approach, based on support vector regression, to estimate the received signal strength at non-site-surveyed positions of the environment. Experiments, performed in a real environment, show that the proposed method could be used to improve the resolution of fingerprint-based indoor WiFi localization systems without increasing the site survey effort.

  8. Continuous Space Estimation: Increasing WiFi-Based Indoor Localization Resolution without Increasing the Site-Survey Effort †

    PubMed Central

    Hernández, Noelia; Ocaña, Manuel; Alonso, Jose M.; Kim, Euntai

    2017-01-01

    Although much research has taken place in WiFi indoor localization systems, their accuracy can still be improved. When designing this kind of system, fingerprint-based methods are a common choice. The problem with fingerprint-based methods comes with the need of site surveying the environment, which is effort consuming. In this work, we propose an approach, based on support vector regression, to estimate the received signal strength at non-site-surveyed positions of the environment. Experiments, performed in a real environment, show that the proposed method could be used to improve the resolution of fingerprint-based indoor WiFi localization systems without increasing the site survey effort. PMID:28098773
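    A minimal sketch of the idea in the two records above: train an SVR on RSS fingerprints collected on a coarse grid and query it at positions that were never site-surveyed. The log-distance path-loss model, grid spacing, and SVR parameters are assumptions for illustration, not the paper's.

```python
import numpy as np
from sklearn.svm import SVR

ap = np.array([0.0, 0.0])                          # access point position (m)
# site-surveyed fingerprints on a coarse 2 m grid
grid = np.array([[x, y] for x in range(0, 11, 2) for y in range(0, 11, 2)], float)
d = np.linalg.norm(grid - ap, axis=1) + 1.0        # +1 avoids log(0) at the AP
rss = -40.0 - 20.0 * np.log10(d)                   # log-distance path loss (dBm)

model = SVR(kernel="rbf", C=100.0, epsilon=0.5).fit(grid, rss)
rss_unsurveyed = model.predict([[3.0, 3.0]])[0]    # estimate between grid points
```

The predicted RSS at the unsurveyed point can then be added to the fingerprint database, increasing localization resolution without extra surveying, which is the paper's contribution.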

  9. Real time flaw detection and characterization in tube through partial least squares and SVR: Application to eddy current testing

    NASA Astrophysics Data System (ADS)

    Ahmed, Shamim; Miorelli, Roberto; Calmon, Pierre; Anselmi, Nicola; Salucci, Marco

    2018-04-01

    This paper describes a Learning-By-Examples (LBE) technique for performing quasi-real-time flaw localization and characterization within a conductive tube based on Eddy Current Testing (ECT) signals. Within the framework of LBE, the combination of full-factorial (i.e., GRID) sampling and Partial Least Squares (PLS) feature extraction (i.e., GRID-PLS) techniques is applied for generating a suitable training set in the offline phase. Support Vector Regression (SVR) is utilized for model development and inversion during the offline and online phases, respectively. The performance and robustness of the proposed GRID-PLS/SVR strategy on a noisy test set is evaluated and compared with the standard GRID/SVR approach.

  10. A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data

    PubMed Central

    Sturm, Marc; Quinten, Sascha; Huber, Christian G.; Kohlbacher, Oliver

    2007-01-01

    We propose a new model for predicting the retention time of oligonucleotides. The model is based on ν-support vector regression using features derived from the base sequence and predicted secondary structure of the oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures, where the secondary structure is not suppressed by thermal denaturing. This makes prediction of oligonucleotide retention time possible for arbitrary temperatures, provided that the target temperature lies within the temperature range of the training data. We describe several ways of computing features from base sequence and secondary structure, present the results, and compare our model to existing models. PMID:17567619

  11. Analysis of the quality of image data acquired by the LANDSAT-4 Thematic Mapper and Multispectral Scanners

    NASA Technical Reports Server (NTRS)

    Colwell, R. N. (Principal Investigator)

    1984-01-01

    The geometric quality of TM film and digital products is evaluated by making selective photomeasurements and by measuring the coordinates of known features on both the TM products and map products. These paired observations are related using a standard linear least-squares regression approach. Using regression equations and coefficients developed from 225 (TM film product) and 20 (TM digital product) control points, map coordinates of test points are predicted. Residual error vectors were computed, and an analysis of variance (ANOVA) was performed on the east and north residuals using nine image segments (blocks) as treatments. Based on the root-mean-square error of the 223 (TM film product) and 22 (TM digital product) test points, users of TM data can expect the planimetric accuracy of mapped points to be within 91 and 117 meters for the film products, and within 12 and 14 meters for the digital products.
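    The paired-observation step above amounts to a least-squares fit of map coordinates on image coordinates, followed by prediction at test points and a residual RMS check. The coordinates, affine transform, and noise level below are invented for illustration.

```python
# Minimal sketch: relate image coordinates of control points to map
# coordinates by linear least squares, then assess the residual RMS error.
import numpy as np

rng = np.random.default_rng(2)
img = rng.uniform(0, 6000, size=(20, 2))             # (row, col) on the image
true_affine = np.array([[28.5, 0.3],                  # hypothetical metres/pixel
                        [-0.2, 28.5]])
offset = np.array([300000.0, 4500000.0])              # hypothetical map origin
maps = img @ true_affine + offset + rng.normal(0, 15, (20, 2))

# Least-squares fit of map coordinates on image coordinates (with intercept).
X = np.hstack([img, np.ones((20, 1))])
coef, *_ = np.linalg.lstsq(X, maps, rcond=None)

# Predict map coordinates of a test point, and compute the fit's RMS residual.
test_pt = np.array([[1500.0, 2200.0, 1.0]])
pred = test_pt @ coef
residual_rms = float(np.sqrt(np.mean((X @ coef - maps) ** 2)))
print(round(residual_rms, 1))
```

The RMS of the residuals plays the same role as the planimetric accuracy figures quoted in the record.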

  12. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach.

    PubMed

    Duarte, Belmiro P M; Wong, Weng Kee

    2015-08-01

    This paper uses semidefinite programming (SDP) to construct Bayesian optimal designs for nonlinear regression models. The setup extends the SDP formulation of the optimal design problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare our results with those in the literature. Additionally, we investigate how the optimal design is affected by different discretisation schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF, and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and for a two-variable generalised linear model with a gamma-distributed response are discussed, and some limitations of our approach are noted.
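    The quadrature step mentioned above (computing the expectation in the Bayesian criterion) can be sketched in isolation: Gauss-Hermite quadrature approximates an expectation under a normal prior. The integrand and prior below are illustrative stand-ins, not the paper's design criterion.

```python
# Sketch of the quadrature step only: E[f(theta)] under a normal prior,
# approximated with Gauss-Hermite quadrature and cross-checked by Monte Carlo.
import numpy as np

nodes, weights = np.polynomial.hermite.hermgauss(10)   # 10-point Gauss-Hermite
mu, sigma = 1.0, 0.5                                    # hypothetical normal prior

def f(theta):
    return np.log(1 + theta ** 2)   # stand-in for a log-determinant criterion

# Change of variables theta = mu + sqrt(2) * sigma * x maps the physicists'
# Hermite weight exp(-x^2) onto the N(mu, sigma^2) density.
approx = float(np.sum(weights * f(mu + np.sqrt(2) * sigma * nodes))
               / np.sqrt(np.pi))

# Brute-force Monte Carlo estimate for comparison.
mc = float(f(np.random.default_rng(10).normal(mu, sigma, 200000)).mean())
print(round(approx, 3), round(mc, 3))
```

Ten nodes already agree closely with the Monte Carlo estimate here, which is why quadrature is attractive inside an optimization loop.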

  13. Finding Bayesian Optimal Designs for Nonlinear Models: A Semidefinite Programming-Based Approach

    PubMed Central

    Duarte, Belmiro P. M.; Wong, Weng Kee

    2014-01-01

    Summary This paper uses semidefinite programming (SDP) to construct Bayesian optimal designs for nonlinear regression models. The setup extends the SDP formulation of the optimal design problem from linear to nonlinear models. Gaussian quadrature formulas (GQF) are used to compute the expectation in the Bayesian design criterion, such as D-, A- or E-optimality. As an illustrative example, we demonstrate the approach using the power-logistic model and compare our results with those in the literature. Additionally, we investigate how the optimal design is affected by different discretisation schemes for the design space, different amounts of uncertainty in the parameter values, different choices of GQF, and different prior distributions for the vector of model parameters, including normal priors with and without correlated components. Further applications to find Bayesian D-optimal designs with two regressors for a logistic model and for a two-variable generalised linear model with a gamma-distributed response are discussed, and some limitations of our approach are noted. PMID:26512159

  14. Retrieval and Mapping of Heavy Metal Concentration in Soil Using Time Series Landsat 8 Imagery

    NASA Astrophysics Data System (ADS)

    Fang, Y.; Xu, L.; Peng, J.; Wang, H.; Wong, A.; Clausi, D. A.

    2018-04-01

    Heavy metal pollution is a critical global environmental problem. The traditional approach to obtaining heavy metal concentrations, which relies on field sampling and laboratory testing, is expensive and time-consuming. Many related studies use spectrometer data to build a relational model between heavy metal concentration and spectral information and then apply the model to hyperspectral imagery, but this approach can hardly map soil metal concentrations of an area quickly and accurately because of discrepancies between spectrometer data and remote sensing imagery. Taking advantage of the easy accessibility of Landsat 8 data, this study uses Landsat 8 imagery to retrieve soil Cu concentration and map its distribution in the study area. To enlarge the spectral information for more accurate retrieval and mapping, 11 single-date Landsat 8 images from 2013-2017 are combined into a time series. Three regression methods, partial least squares regression (PLSR), artificial neural networks (ANN), and support vector regression (SVR), are used for model construction. After an unbiased comparison of these models, the best one is selected to map the Cu concentration distribution. The produced distribution map shows good spatial autocorrelation and consistency with the locations of mining areas.

  15. A New Approach for Mobile Advertising Click-Through Rate Estimation Based on Deep Belief Nets.

    PubMed

    Chen, Jie-Hao; Zhao, Zi-Qian; Shi, Ji-Yun; Zhao, Chong

    2017-01-01

    In recent years, with the rapid development of the mobile Internet and its business applications, mobile advertising Click-Through Rate (CTR) estimation has become a hot research direction in the field of computational advertising; it is used to achieve accurate advertisement delivery for the best benefit in the three-sided game between media, advertisers, and audiences. Current research on CTR estimation mainly uses machine learning methods and models, such as linear models or recommendation algorithms. However, most of these methods are insufficient to extract the data features and cannot capture the nonlinear relationships between different features. To solve these problems, we propose a new model based on Deep Belief Nets to predict the CTR of mobile advertising, which combines the powerful data representation and feature extraction capability of Deep Belief Nets with the simplicity of traditional Logistic Regression models. Based on a training dataset covering over 40 million mobile advertisements served over a period of 10 days, our experiments show that our new model improves estimation accuracy over the classic Logistic Regression (LR) model by 5.57% and over the Support Vector Regression (SVR) model by 5.80%.

  16. A New Approach for Mobile Advertising Click-Through Rate Estimation Based on Deep Belief Nets

    PubMed Central

    Zhao, Zi-Qian; Shi, Ji-Yun; Zhao, Chong

    2017-01-01

    In recent years, with the rapid development of the mobile Internet and its business applications, mobile advertising Click-Through Rate (CTR) estimation has become a hot research direction in the field of computational advertising; it is used to achieve accurate advertisement delivery for the best benefit in the three-sided game between media, advertisers, and audiences. Current research on CTR estimation mainly uses machine learning methods and models, such as linear models or recommendation algorithms. However, most of these methods are insufficient to extract the data features and cannot capture the nonlinear relationships between different features. To solve these problems, we propose a new model based on Deep Belief Nets to predict the CTR of mobile advertising, which combines the powerful data representation and feature extraction capability of Deep Belief Nets with the simplicity of traditional Logistic Regression models. Based on a training dataset covering over 40 million mobile advertisements served over a period of 10 days, our experiments show that our new model improves estimation accuracy over the classic Logistic Regression (LR) model by 5.57% and over the Support Vector Regression (SVR) model by 5.80%. PMID:29209363

  17. Partial least squares methods for spectrally estimating lunar soil FeO abundance: A stratified approach to revealing nonlinear effect and qualitative interpretation

    NASA Astrophysics Data System (ADS)

    Li, Lin

    2008-12-01

    Partial least squares (PLS) regression was applied to lunar highland and mare soil data characterized by the Lunar Soil Characterization Consortium (LSCC) for spectral estimation of the abundance of the lunar soil chemical constituents FeO and Al2O3. The LSCC data set was split into a number of subsets, including the total highland, Apollo 16, Apollo 14, and total mare soils, and PLS was applied to each to investigate the effect of nonlinearity on the method's performance. The weight-loading vectors resulting from PLS were analyzed to identify the mineral species responsible for spectral estimation of the soil chemicals. The results indicate that PLS performance depends on the correlation of the constituents of interest with their major mineral carriers, and that the Apollo 16 soils are responsible for the large errors in the FeO and Al2O3 estimates when these soils are modeled along with other types of soils. These large errors are attributed primarily to the degraded correlation of FeO with pyroxene in the relatively mature Apollo 16 soils as a result of space weathering, and secondarily to the interference of olivine. PLS consistently yields very accurate fits to the two soil chemicals when applied to mare soils. Although Al2O3 has no spectrally diagnostic characteristics, it can be predicted at high accuracy for all subsets by PLS modeling because of its correlation with FeO. This correlation is reflected in the symmetry of the PLS weight-loading vectors for FeO and Al2O3, which prove very useful for qualitative interpretation of the PLS results. Such qualitative interpretation cannot be achieved using principal component regression loading vectors.

  18. Fluoroscopic tumor tracking for image-guided lung cancer radiotherapy

    NASA Astrophysics Data System (ADS)

    Lin, Tong; Cerviño, Laura I.; Tang, Xiaoli; Vasconcelos, Nuno; Jiang, Steve B.

    2009-02-01

    Accurate real-time lung tumor tracking is a keystone of image-guided radiotherapy of lung cancers. Existing lung tumor tracking approaches can be roughly grouped into three categories: (1) deriving tumor position from external surrogates; (2) tracking implanted fiducial markers fluoroscopically or electromagnetically; (3) tracking the lung tumor fluoroscopically without implanted fiducial markers. The first approach suffers from insufficient accuracy, while the second may not be widely accepted due to the risk of pneumothorax. Previous studies in fluoroscopic markerless tracking are mainly based on template matching, which may fail when the tumor boundary is unclear in fluoroscopic images. In this paper we propose a novel markerless tumor tracking algorithm which exploits the correlation between the tumor position and surrogate anatomic features in the image. The positions of the surrogate features are not tracked directly; instead, we apply principal component analysis to regions of interest containing them to obtain parametric representations of their motion patterns. The tumor position can then be predicted from the parametric representations of the surrogates through regression. Four regression methods were tested: linear and second-degree polynomial regression, artificial neural networks (ANN) and support vector machines (SVM). Experimental results based on fluoroscopic sequences of ten lung cancer patients demonstrate a mean tracking error of 2.1 pixels and a maximum error at a 95% confidence level of 4.6 pixels (pixel size is about 0.5 mm) for the proposed tracking algorithm.
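    The PCA-then-regression pipeline described above can be sketched with simulated frames: PCA compresses a region of interest into a few scores, and a regressor maps those scores to tumor position. The breathing model, frame contents, and SVR settings below are illustrative assumptions.

```python
# Sketch: parametrize surrogate motion with PCA scores, then regress
# tumor position on the scores. Frames and motion are simulated.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR

rng = np.random.default_rng(5)
phase = rng.uniform(0, 2 * np.pi, 100)            # breathing phase per frame
tumor_y = 10 * np.sin(phase)                      # tumor position (pixels)

# Toy ROI "frames": a fixed pattern modulated by the same breathing phase.
base = rng.normal(size=200)
frames = np.outer(np.sin(phase), base) + rng.normal(0, 0.05, (100, 200))

scores = PCA(n_components=3).fit_transform(frames)   # parametric representation
reg = SVR(kernel="rbf", C=100).fit(scores[:80], tumor_y[:80])
err = np.abs(reg.predict(scores[80:]) - tumor_y[80:])
print(round(float(err.mean()), 2))
```

Because the ROI is never segmented, this avoids the unclear-boundary failure mode of template matching that the abstract mentions.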

  19. Solar cycle in current reanalyses: (non)linear attribution study

    NASA Astrophysics Data System (ADS)

    Kuchar, A.; Sacha, P.; Miksovsky, J.; Pisoft, P.

    2014-12-01

    This study focuses on the variability of temperature, ozone and circulation characteristics in the stratosphere and lower mesosphere with regard to the influence of the 11-year solar cycle. It is based on attribution analysis using multiple nonlinear techniques (Support Vector Regression, Neural Networks) besides the traditional linear approach. The analysis was applied to several current reanalysis datasets for the 1979-2013 period, including MERRA, ERA-Interim and JRA-55, with the aim of comparing how well this type of data resolves the double-peaked solar response in temperature and ozone and the consequent changes induced by these anomalies. Equatorial temperature signals in the lower and upper stratosphere were found to be sufficiently robust and in qualitative agreement with previous observational studies. The analysis also showed that the solar signal in the ozone datasets (i.e. MERRA and ERA-Interim) is not consistent with the observed double-peaked ozone anomaly extracted from satellite measurements. The results obtained by linear regression were confirmed by the nonlinear approach across all datasets, suggesting that linear regression is a sufficient tool for resolving the solar signal in the middle atmosphere. Furthermore, the seasonal dependence of the solar response was discussed, mainly as a source of dynamical causality in the wave propagation characteristics of the zonal wind and the induced meridional circulation in the winter hemispheres. The hypothetical mechanism of a weaker Brewer-Dobson circulation was reviewed, together with a discussion of polar vortex stability.

  20. Feature selection using probabilistic prediction of support vector regression.

    PubMed

    Yang, Jian-Bo; Ong, Chong-Jin

    2011-06-01

    This paper presents a new wrapper-based feature selection method for support vector regression (SVR) using its probabilistic predictions. The method computes the importance of a feature by aggregating, over the feature space, the difference between the conditional density functions of the SVR prediction with and without the feature. As the exact computation of this importance measure is expensive, two approximations are proposed. The effectiveness of the measure under these approximations, in comparison with several existing feature selection methods for SVR, is evaluated on both artificial and real-world problems. The results of the experiments show that the proposed method generally performs better than, or at least as well as, the existing methods, with a notable advantage when the dataset is sparse.
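    The paper's density-based importance measure is expensive to compute exactly; as a simplified stand-in that captures the same with/without-a-feature intuition, the sketch below scores each feature by how much permuting it degrades SVR predictions (i.e., permutation importance, not the authors' measure). Data and settings are invented.

```python
# Simplified stand-in for feature importance with SVR: permutation importance.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(6)
X = rng.normal(size=(150, 5))
# Only features 0 and 1 matter; features 2-4 are irrelevant.
y = 3 * X[:, 0] + 2 * np.sin(X[:, 1]) + rng.normal(0, 0.1, 150)

svr = SVR(kernel="rbf", C=10).fit(X, y)
base_err = np.mean((svr.predict(X) - y) ** 2)

importance = []
for j in range(5):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])      # destroy feature j only
    importance.append(np.mean((svr.predict(Xp) - y) ** 2) - base_err)

ranked = np.argsort(importance)[::-1]         # most important first
print([int(j) for j in ranked[:2]])
```

A wrapper method like this re-queries the fitted model for each candidate feature, which is exactly why the paper needs cheap approximations.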

  1. Predicting Jakarta composite index using hybrid of fuzzy time series and support vector regression models

    NASA Astrophysics Data System (ADS)

    Febrian Umbara, Rian; Tarwidi, Dede; Budi Setiawan, Erwin

    2018-03-01

    The paper discusses the prediction of Jakarta Composite Index (JCI) in Indonesia Stock Exchange. The study is based on JCI historical data for 1286 days to predict the value of JCI one day ahead. This paper proposes predictions done in two stages., The first stage using Fuzzy Time Series (FTS) to predict values of ten technical indicators, and the second stage using Support Vector Regression (SVR) to predict the value of JCI one day ahead, resulting in a hybrid prediction model FTS-SVR. The performance of this combined prediction model is compared with the performance of the single stage prediction model using SVR only. Ten technical indicators are used as input for each model.

  2. The dynamic correlation between policy uncertainty and stock market returns in China

    NASA Astrophysics Data System (ADS)

    Yang, Miao; Jiang, Zhi-Qiang

    2016-11-01

    The dynamic correlation between the government's policy uncertainty and Chinese stock market returns is examined for the period from January 1995 to December 2014. We find that the stock market is significantly correlated with policy uncertainty based on the results of Vector Auto Regression (VAR) and Structural Vector Auto Regression (SVAR) models. In contrast, the results of the Dynamic Conditional Correlation Generalized Multivariate Autoregressive Conditional Heteroscedasticity (DCC-MGARCH) model surprisingly show a low dynamic correlation coefficient between policy uncertainty and market returns, suggesting that the fluctuations of each variable are greatly influenced by their own values in the preceding period. Our analysis improves the understanding of the dynamic relationship between the stock market and fiscal and monetary policy.
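    The VAR step above can be sketched in miniature: a bivariate VAR(1) estimated by equation-wise least squares on simulated data (no real policy-uncertainty or market data is used here; the coefficient matrix is hypothetical).

```python
# Illustrative bivariate VAR(1) estimated by least squares on simulated data.
import numpy as np

rng = np.random.default_rng(7)
A = np.array([[0.5, 0.2],        # hypothetical true lag-1 coefficient matrix
              [0.1, 0.3]])
T = 500
z = np.zeros((T, 2))             # z[t] = [policy uncertainty, stock return]
for t in range(1, T):
    z[t] = A @ z[t - 1] + rng.normal(0, 1, 2)

# VAR(1) estimation: regress z[t] on z[t-1], one equation per variable.
X, Y = z[:-1], z[1:]
B, *_ = np.linalg.lstsq(X, Y, rcond=None)
A_hat = B.T                      # rows: equations, columns: lagged variables
print(np.round(A_hat, 2))
```

The off-diagonal entries of the estimated coefficient matrix are what carry the lead-lag relationship between the two series.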

  3. Application of support vector regression for optimization of vibration flow field of high-density polyethylene melts characterized by small angle light scattering

    NASA Astrophysics Data System (ADS)

    Xian, Guangming

    2018-03-01

    In this paper, the vibration flow field parameters of polymer melts in a visual slit die are optimized using an intelligent algorithm. Experimental small-angle light scattering (SALS) patterns are shown to characterize the processing. To capture the scattered light, a polarizer and an analyzer are placed before and after the polymer melt. The results reported in this study are obtained using high-density polyethylene (HDPE) at a rotation speed of 28 rpm. In addition, support vector regression (SVR) is introduced to optimize the parameters of the vibration flow field. This work establishes the general applicability of SVR for predicting the optimal parameters of the vibration flow field.

  4. A feature selection approach towards progressive vector transmission over the Internet

    NASA Astrophysics Data System (ADS)

    Miao, Ru; Song, Jia; Feng, Min

    2017-09-01

    WebGIS is widely used for visualizing and sharing geospatial information over the Internet. To improve the efficiency of client applications, a web-based progressive vector transmission approach is proposed, in which important features are selected and transferred first; methods for measuring the importance of features therefore deserve careful consideration. However, studies on progressive transmission of large-volume vector data have mostly focused on map generalization in the field of cartography and have rarely discussed the quantitative selection of geographic features. This paper applies information theory to measure feature importance in vector maps. A measurement model for the amount of information of vector features, involving a geometry factor, a spatial distribution factor, and a thematic attribute factor, is defined to address the feature selection problem. Moreover, a Real-time Transport Protocol (RTP)-based progressive transmission method is presented to improve the transmission of vector data. To demonstrate the methodology and key techniques, a prototype for web-based progressive vector transmission is presented, and an experiment on progressive selection and transmission of vector features is conducted. The experimental results indicate that our approach clearly improves the performance and end-user experience of delivering and manipulating large vector data over the Internet.

  5. Consolidating tactical planning and implementation frameworks for integrated vector management in Uganda.

    PubMed

    Okia, Michael; Okui, Peter; Lugemwa, Myers; Govere, John M; Katamba, Vincent; Rwakimari, John B; Mpeka, Betty; Chanda, Emmanuel

    2016-04-14

    Integrated vector management (IVM) is the recommended approach for controlling several vector-borne diseases (VBDs). In the face of current challenges to disease vector control, IVM is vital for achieving the national targets set for VBD control. Though global efforts, especially for combating malaria, now focus on elimination and eradication, IVM remains useful for Uganda, which is principally still in the control phase of the malaria continuum. This paper outlines the processes undertaken to consolidate tactical planning and implementation frameworks for IVM in Uganda. The Uganda National Malaria Control Programme, with its efforts to implement an IVM approach to vector control, was the 'case' for this study. Integrated management of malaria vectors in Uganda remained an underdeveloped component of malaria control policy. In 2012, knowledge and perceptions of malaria vector control policy and IVM were assessed, and recommendations for a specific IVM policy were made. In 2014, a thorough vector control needs assessment (VCNA) was conducted according to WHO recommendations. The findings of the VCNA informed the development of the national IVM strategic guidelines. Information sources for this study included all available data and accessible archived documentary records on VBD control in Uganda. The literature was reviewed, adapted to the local context, and translated into the consolidated tactical framework. WHO recommends IVM as the main strategy for vector control and has encouraged member states to adopt the approach. However, many VBD-endemic countries lack IVM policy frameworks to guide its implementation. In Uganda, most VBDs coexist and could be managed more effectively in tandem. To successfully control malaria and other VBDs and move towards their elimination, the country needs to scale up proven and effective vector control interventions and learn from the experience of other countries. The IVM strategy is important for consolidating inter-sectoral collaboration and coordination and for providing the tactical direction for effective deployment of vector control interventions along the five key elements of the approach, aligning them with the contemporary epidemiology of VBDs in the country. Uganda has successfully established an evidence-based IVM approach and consolidated strategic planning and operational frameworks for VBD control. However, operationalizing the implementation arrangements outlined in the national strategic guidelines for IVM, managing insecticide resistance, and improving vector surveillance remain imperative. In addition, strengthened information, education and communication/behaviour change communication, collaboration, and coordination will be crucial in scaling up vector control interventions.

  6. The Vector-Ballot Approach for Online Voting Procedures

    NASA Astrophysics Data System (ADS)

    Kiayias, Aggelos; Yung, Moti

    Looking at current cryptographic-based e-voting protocols, one can distinguish three basic design paradigms (or approaches): (a) Mix-Networks based, (b) Homomorphic Encryption based, and (c) Blind Signatures based. Each of the three possesses different advantages and disadvantages w.r.t. the basic properties of (i) efficient tallying, (ii) universal verifiability, and (iii) allowing write-in ballot capability (in addition to predetermined candidates). In fact, none of the approaches results in a scheme that simultaneously achieves all three. This is unfortunate, since the three basic properties are crucial for efficiency, integrity and versatility (flexibility), respectively. Further, one can argue that a serious business offering of voting technology should offer a flexible technology that achieves various election goals with a single user interface. This motivates our goal, which is to suggest a new "vector-ballot" based approach for secret-ballot e-voting that is based on three new notions: Provably Consistent Vector Ballot Encodings, Shrink-and-Mix Networks and Punch-Hole-Vector-Ballots. At the heart of our approach is the combination of mix networks and homomorphic encryption under a single user interface; given this, it is rather surprising that it achieves much more than any of the previous approaches for e-voting achieved in terms of the basic properties. Our approach is presented in two generic designs called "homomorphic vector-ballots with write-in votes" and "multi-candidate punch-hole vector-ballots"; both of our designs can be instantiated over any homomorphic encryption function.

  7. Fault detection in reciprocating compressor valves under varying load conditions

    NASA Astrophysics Data System (ADS)

    Pichler, Kurt; Lughofer, Edwin; Pichler, Markus; Buchegger, Thomas; Klement, Erich Peter; Huschenbett, Matthias

    2016-03-01

    This paper presents a novel approach for detecting cracked or broken reciprocating compressor valves under varying load conditions. The main idea is that the time-frequency representation of vibration measurement data shows typical patterns depending on the fault state; the problem is to detect these patterns reliably. For the detection task, we make a detour via the two-dimensional autocorrelation, which emphasizes the patterns and reduces noise effects, making it easier to define appropriate features. After feature extraction, classification is done using logistic regression and support vector machines. The method's performance is validated by analyzing real-world measurement data. The results show very high detection accuracy while keeping false alarm rates at a very low level across different compressor loads, thus achieving a load-independent method. The proposed approach is, to the best of our knowledge, the first automated method for reciprocating compressor valve fault detection that can handle varying load conditions.

  8. A computational visual saliency model based on statistics and machine learning.

    PubMed

    Lin, Ru-Je; Lin, Wei-Song

    2014-08-01

    Identifying the type of stimuli that attracts human visual attention has been an appealing topic for scientists for many years. In particular, marking the salient regions in images is useful for both psychologists and many computer vision applications. In this paper, we propose a computational approach for producing saliency maps using statistics and machine learning methods. Based on four assumptions, three properties (Feature-Prior, Position-Prior, and Feature-Distribution) can be derived and combined by a simple intersection operation to obtain a saliency map. These properties are implemented by a similarity computation, support vector regression (SVR) technique, statistical analysis of training samples, and information theory using low-level features. This technique is able to learn the preferences of human visual behavior while simultaneously considering feature uniqueness. Experimental results show that our approach performs better in predicting human visual attention regions than 12 other models in two test databases. © 2014 ARVO.

  9. Catchments as non-linear filters: evaluating data-driven approaches for spatio-temporal predictions in ungauged basins

    NASA Astrophysics Data System (ADS)

    Bellugi, D. G.; Tennant, C.; Larsen, L.

    2016-12-01

    Catchment and climate heterogeneity complicate prediction of runoff across time and space, and the resulting parameter uncertainty can lead to large accumulated errors in hydrologic models, particularly in ungauged basins. Recently, data-driven modeling approaches have been shown to avoid the accumulated uncertainty associated with many physically based models, providing an appealing alternative for hydrologic prediction. However, the effectiveness of different methods in hydrologically and geomorphically distinct catchments, and the robustness of these methods to changing climate and changing hydrologic processes, remain to be tested. Here, we evaluate the use of machine learning techniques to predict daily runoff across time and space using only essential climatic forcing time series (e.g. precipitation, temperature, and potential evapotranspiration) as model input. Model training and testing were done using a high-quality dataset of 25+ years of daily runoff and climate forcing data for 600+ minimally disturbed catchments (drainage area range 5-25,000 km2, median 336 km2) covering a wide range of climatic and physical characteristics. Preliminary results using Support Vector Regression (SVR) suggest that in some catchments this nonlinear regression technique can accurately predict daily runoff, while the same approach fails in other catchments, indicating that the representation of climate inputs and/or catchment filter characteristics in the model structure needs further refinement. We bolster this analysis by using Sparse Identification of Nonlinear Dynamics (a sparse symbolic regression technique) to uncover the governing equations that describe runoff processes in catchments where SVR performed well and in ones where it performed poorly, thereby enabling inference about governing processes. This provides a robust means of examining how catchment complexity influences runoff prediction skill and represents a contribution towards the integration of data-driven inference and physically based models.

  10. Multiple linear regression analysis

    NASA Technical Reports Server (NTRS)

    Edwards, T. R.

    1980-01-01

    Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
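    The program's stepwise procedure can be sketched in modern terms as forward selection: at each step, add the predictor giving the largest drop in residual sum of squares, stopping when the relative improvement is negligible. This is a simplified stand-in for the program's significance-based rule, on invented data.

```python
# Simplified forward-selection sketch of a stepwise multiple linear regression.
import numpy as np

rng = np.random.default_rng(8)
X = rng.normal(size=(200, 6))
y = 2 * X[:, 0] - 3 * X[:, 3] + rng.normal(0, 0.5, 200)  # only vars 0, 3 matter

def rss(cols):
    """Residual sum of squares of the least-squares fit on columns `cols`."""
    A = np.hstack([X[:, cols], np.ones((200, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((A @ coef - y) ** 2)

selected, current = [], rss([])
while len(selected) < 6:
    trials = {j: rss(selected + [j]) for j in range(6) if j not in selected}
    best = min(trials, key=trials.get)
    if current - trials[best] < 0.05 * current:   # negligible improvement: stop
        break
    selected.append(best)
    current = trials[best]
print(sorted(selected))
```

Like the FORTRAN program, the final regression keeps only the variables that meaningfully improve the fit.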

  11. Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory.

    PubMed

    Kruppa, Jochen; Liu, Yufeng; Biau, Gérard; Kohler, Michael; König, Inke R; Malley, James D; Ziegler, Andreas

    2014-07-01

    Probability estimation for binary and multicategory outcomes using logistic and multinomial logistic regression has a long-standing tradition in biostatistics. However, biases may occur if the model is misspecified. In contrast, outcome probabilities for individuals can be estimated consistently with machine learning approaches, including k-nearest neighbors (k-NN), bagged nearest neighbors (b-NN), random forests (RF), and support vector machines (SVM). Because machine learning methods are rarely used by applied biostatisticians, the primary goal of this paper is to explain the concept of probability estimation with these methods and to summarize recent theoretical findings. Probability estimation in k-NN, b-NN, and RF can be embedded into the class of nonparametric regression learning machines; therefore, we start with the construction of nonparametric regression estimates and review results on consistency and rates of convergence. In SVMs, outcome probabilities for individuals are estimated consistently by repeatedly solving classification problems. For SVMs, we first review the classification problem and then dichotomous probability estimation. Next we extend the algorithms for estimating probabilities using k-NN, b-NN, and RF to multicategory outcomes and discuss approaches to the multicategory probability estimation problem using SVM. In simulation studies for dichotomous and multicategory dependent variables we demonstrate the general validity of the machine learning methods and compare them with logistic regression. However, each method fails in at least one simulation scenario. We conclude with a discussion of the failures and give recommendations for selecting and tuning the methods. Applications to real data and example code are provided in a companion article (doi:10.1002/bimj.201300077). © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
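    The paper's theme, that machine learning methods can estimate individual outcome probabilities rather than just class labels, can be illustrated with a random forest alongside logistic regression on simulated dichotomous data. The data-generating model and settings below are illustrative assumptions.

```python
# Sketch: probability estimation with logistic regression and a random forest
# on simulated binary-outcome data whose true probabilities are known.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(9)
X = rng.normal(size=(400, 3))
p_true = 1 / (1 + np.exp(-(1.5 * X[:, 0] - X[:, 1])))   # true P(Y=1 | x)
y = rng.binomial(1, p_true)

lr = LogisticRegression().fit(X, y)
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Both models expose probability estimates, not just class labels.
p_lr = lr.predict_proba(X)[:, 1]
p_rf = rf.predict_proba(X)[:, 1]
mae = float(np.mean(np.abs(p_lr - p_true)))
print(round(mae, 3))
```

With a correctly specified linear predictor, logistic regression recovers the true probabilities closely; the paper's point is that the machine learning estimators remain consistent even when such a parametric model is misspecified.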

  12. Climate, Deer, Rodents, and Acorns as Determinants of Variation in Lyme-Disease Risk

    PubMed Central

    Canham, Charles D; Oggenfuss, Kelly; Winchcombe, Raymond J; Keesing, Felicia

    2006-01-01

    Risk of human exposure to vector-borne zoonotic pathogens is a function of the abundance and infection prevalence of vectors. We assessed the determinants of Lyme-disease risk (density and Borrelia burgdorferi-infection prevalence of nymphal Ixodes scapularis ticks) over 13 y on several field plots within eastern deciduous forests in the epicenter of US Lyme disease (Dutchess County, New York). We used a model comparison approach to simultaneously test the importance of ambient growing-season temperature, precipitation, two indices of deer (Odocoileus virginianus) abundance, and densities of white-footed mice (Peromyscus leucopus), eastern chipmunks (Tamias striatus), and acorns (Quercus spp.), in both simple and multiple regression models, in predicting entomological risk. Indices of deer abundance had no predictive power, and precipitation in the current year and temperature in the prior year had only weak effects on entomological risk. The strongest predictors of a current year's risk were the prior year's abundance of mice and chipmunks and abundance of acorns 2 y previously. In no case did inclusion of deer or climate variables improve the predictive power of models based on rodents, acorns, or both. We conclude that interannual variation in entomological risk of exposure to Lyme disease is correlated positively with prior abundance of key hosts for the immature stages of the tick vector and with critical food resources for those hosts. PMID:16669698

  13. Predictive spectroscopy and chemical imaging based on novel optical systems

    NASA Astrophysics Data System (ADS)

    Nelson, Matthew Paul

    1998-10-01

    This thesis describes two futuristic optical systems designed to surpass contemporary spectroscopic methods for predictive spectroscopy and chemical imaging. These systems are advantageous over current techniques in a number of ways, including lower cost, enhanced portability, shorter analysis time, and improved S/N. First, a novel optical approach to predicting chemical and physical properties based on principal component analysis (PCA) is proposed and evaluated. A regression vector produced by PCA is designed into the structure of a set of paired optical filters. Light passing through the paired filters produces an analog detector signal directly proportional to the chemical/physical property for which the regression vector was designed. Second, a novel optical system is described which takes a single-shot approach to chemical imaging with high spectroscopic resolution using a dimension-reduction fiber-optic array. Images are focused onto a two-dimensional matrix of optical fibers which are drawn into a linear distal array with specific ordering. The distal end is imaged with a spectrograph equipped with an ICCD camera for spectral analysis. Software is used to extract the spatial/spectral information contained in the ICCD images and deconvolute them into wavelength-specific reconstructed images or position-specific spectra which span a multi-wavelength space. This thesis includes a description of the fabrication of two dimension-reduction arrays as well as an evaluation of the system for spatial and spectral resolution, throughput, image brightness, resolving power, depth of focus, and channel cross-talk. PCA is performed on the images by treating rows of the ICCD images as spectra and plotting the scores of each PC as a function of reconstruction position. In addition, iterative target transformation factor analysis (ITTFA) is performed on the spectroscopic images to generate "true" chemical maps of samples.
Univariate zero-order images, univariate first-order spectroscopic images, bivariate first-order spectroscopic images, and multivariate first-order spectroscopic images of the temporal development of laser-induced plumes are presented and interpreted. Reconstructed chemical images generated using bivariate and trivariate wavelength techniques, bimodal and trimodal PCA methods, and bimodal and trimodal ITTFA approaches are also included.
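The paired-filter idea described in this abstract amounts to splitting the PCA regression vector into its positive and negative parts, realizing each part as one filter, and differencing the two detector signals, which recovers the dot product of the spectrum with the regression vector. A minimal numerical sketch, with an invented three-channel regression vector (not the thesis's actual design):

```python
# Hedged sketch of the paired-filter computation: the difference of the two
# detector signals equals the dot product s . b, i.e. the predicted property.
# The regression vector b and spectrum below are illustrative.
def paired_filter_signal(spectrum, b):
    b_pos = [max(c, 0.0) for c in b]       # transmission profile of filter 1
    b_neg = [max(-c, 0.0) for c in b]      # transmission profile of filter 2
    sig1 = sum(s * t for s, t in zip(spectrum, b_pos))
    sig2 = sum(s * t for s, t in zip(spectrum, b_neg))
    return sig1 - sig2                     # analog estimate of s . b

b = [0.4, -0.2, 0.1]                       # hypothetical regression vector
spectrum = [1.0, 2.0, 3.0]
direct = sum(s * c for s, c in zip(spectrum, b))
print(paired_filter_signal(spectrum, b), direct)   # the two values agree
```

The split into two filters is needed because optical transmission cannot be negative; the subtraction happens electronically at the detector.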

  14. Classifying injury narratives of large administrative databases for surveillance-A practical approach combining machine learning ensembles and human review.

    PubMed

    Marucci-Wellman, Helen R; Corns, Helen L; Lehto, Mark R

    2017-01-01

    Injury narratives are now available in real time and include useful information for injury surveillance and prevention. However, manual classification of the cause or events leading to injury found in large batches of narratives, such as workers compensation claims databases, can be prohibitive. In this study we compare the utility of four machine learning algorithms (Naïve Bayes single-word and bi-gram models, Support Vector Machine, and Logistic Regression) for classifying narratives into Bureau of Labor Statistics Occupational Injury and Illness event-leading-to-injury classifications for a large workers compensation database. These algorithms are known to do well classifying narrative text and are fairly easy to implement with off-the-shelf software packages such as Python. We propose human-machine learning ensemble approaches which maximize the power and accuracy of the algorithms for machine-assigned codes and allow for strategic filtering of rare, emerging or ambiguous narratives for manual review. We compare human-machine approaches based on filtering on the prediction strength of the classifier vs. agreement between algorithms. Regularized Logistic Regression (LR) was the best-performing algorithm alone. Using this algorithm and filtering out the bottom 30% of predictions for manual review resulted in high accuracy (overall sensitivity/positive predictive value of 0.89) of the final machine-human coded dataset. The best pairings of algorithms included Naïve Bayes with Support Vector Machine, whereby the triple ensemble NB-SW = NB-bi-gram = SVM (a code accepted only when all three classifiers agree) had very high performance (0.93 overall sensitivity/positive predictive value) and high accuracy (i.e. high sensitivity and positive predictive values) across both large and small categories, leaving 41% of the narratives for manual review. Integrating LR into this ensemble mix improved performance only slightly.
    For large administrative datasets we propose incorporation of methods based on human-machine pairings such as those used here, utilizing readily available off-the-shelf machine learning techniques and resulting in only a fraction of narratives that require manual review. Human-machine ensemble methods are likely to improve performance over total manual coding. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
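The agreement-based filtering the authors compare can be sketched in a few lines: a narrative gets a machine-assigned code only when every classifier in the ensemble agrees, and is otherwise routed to human review. The classifier outputs and event codes below are invented placeholders, not BLS codes:

```python
# Hedged sketch of agreement filtering in a human-machine ensemble.
# Each inner list holds one narrative's codes from the ensemble members.
def route(predictions):
    """Return ('auto', code) on unanimous agreement, else ('manual', None)."""
    if len(set(predictions)) == 1:
        return ("auto", predictions[0])
    return ("manual", None)

batch = [["FALL", "FALL", "FALL"],          # all three classifiers agree
         ["FALL", "STRUCK_BY", "FALL"]]     # disagreement -> manual review
print([route(p) for p in batch])
```

Filtering on prediction strength, the alternative the study compares, would instead threshold the top classifier's probability rather than require unanimity.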

  15. Inter-model comparison of the landscape determinants of vector-borne disease: implications for epidemiological and entomological risk modeling.

    PubMed

    Lorenz, Alyson; Dhingra, Radhika; Chang, Howard H; Bisanzio, Donal; Liu, Yang; Remais, Justin V

    2014-01-01

    Extrapolating landscape regression models for use in assessing vector-borne disease risk and other applications requires thoughtful evaluation of fundamental model choice issues. To examine implications of such choices, an analysis was conducted to explore the extent to which disparate landscape models agree in their epidemiological and entomological risk predictions when extrapolated to new regions. Agreement between six literature-drawn landscape models was examined by comparing predicted county-level distributions of either Lyme disease or Ixodes scapularis vector using Spearman ranked correlation. AUC analyses and multinomial logistic regression were used to assess the ability of these extrapolated landscape models to predict observed national data. Three models based on measures of vegetation, habitat patch characteristics, and herbaceous landcover emerged as effective predictors of observed disease and vector distribution. An ensemble model containing these three models improved precision and predictive ability over individual models. A priori assessment of qualitative model characteristics effectively identified models that subsequently emerged as better predictors in quantitative analysis. Both a methodology for quantitative model comparison and a checklist for qualitative assessment of candidate models for extrapolation are provided; both tools aim to improve collaboration between those producing models and those interested in applying them to new areas and research questions.

  16. 3D reconstruction of the magnetic vector potential using model based iterative reconstruction

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Prabhat, K. C.; Aditya Mohan, K.; Phatak, Charudatta

    Lorentz transmission electron microscopy (LTEM) observations of magnetic nanoparticles contain information on the magnetic and electrostatic potentials. Vector field electron tomography (VFET) can be used to reconstruct electromagnetic potentials of the nanoparticles from their corresponding LTEM images. The VFET approach is based on the conventional filtered back projection approach to tomographic reconstruction; because only an incomplete set of measurements is available due to experimental limitations, the reconstructed vector fields exhibit significant artifacts. In this paper, we outline a model-based iterative reconstruction (MBIR) algorithm to reconstruct the magnetic vector potential of magnetic nanoparticles. We combine a forward model for image formation in TEM experiments with a prior model to formulate the tomographic problem as a maximum a posteriori (MAP) probability estimation problem. The MAP cost function is minimized iteratively to determine the vector potential. Here, a comparative reconstruction study of simulated as well as experimental data sets shows that the MBIR approach yields quantifiably better reconstructions than the VFET approach.

  17. Transcranial Magnetic Stimulation: An Automated Procedure to Obtain Coil-specific Models for Field Calculations.

    PubMed

    Madsen, Kristoffer H; Ewald, Lars; Siebner, Hartwig R; Thielscher, Axel

    2015-01-01

    Field calculations for transcranial magnetic stimulation (TMS) are increasingly implemented online in neuronavigation systems and in more realistic offline approaches based on finite-element methods. They are often based on simplified and/or non-validated models of the magnetic vector potential of the TMS coils. Our aim was to develop an approach to reconstruct the magnetic vector potential based on automated measurements. We implemented a setup that simultaneously measures the three components of the magnetic field with high spatial resolution. This is complemented by a novel approach to determine the magnetic vector potential via volume integration of the measured field. The integration approach reproduces the vector potential with very good accuracy. The vector potential distribution of a standard figure-of-eight shaped coil determined with our setup corresponds well with that calculated using a model reconstructed from x-ray images. The setup can supply validated models for existing and newly appearing TMS coils. Copyright © 2015 Elsevier Inc. All rights reserved.

  18. 3D reconstruction of the magnetic vector potential using model based iterative reconstruction.

    PubMed

    Prabhat, K C; Aditya Mohan, K; Phatak, Charudatta; Bouman, Charles; De Graef, Marc

    2017-11-01

    Lorentz transmission electron microscopy (LTEM) observations of magnetic nanoparticles contain information on the magnetic and electrostatic potentials. Vector field electron tomography (VFET) can be used to reconstruct electromagnetic potentials of the nanoparticles from their corresponding LTEM images. The VFET approach is based on the conventional filtered back projection approach to tomographic reconstruction; because only an incomplete set of measurements is available due to experimental limitations, the reconstructed vector fields exhibit significant artifacts. In this paper, we outline a model-based iterative reconstruction (MBIR) algorithm to reconstruct the magnetic vector potential of magnetic nanoparticles. We combine a forward model for image formation in TEM experiments with a prior model to formulate the tomographic problem as a maximum a posteriori (MAP) probability estimation problem. The MAP cost function is minimized iteratively to determine the vector potential. A comparative reconstruction study of simulated as well as experimental data sets shows that the MBIR approach yields quantifiably better reconstructions than the VFET approach. Copyright © 2017 Elsevier B.V. All rights reserved.
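The MAP estimation step can be illustrated with a toy 1-D analogue: minimize a cost made of a data-fit term under a forward model plus a smoothness prior, by gradient descent. The moving-average "blur" forward model, the quadratic prior, and all numbers below are stand-ins chosen for illustration, not the paper's TEM image-formation or prior models:

```python
# Hedged toy sketch of MBIR: gradient descent on a MAP cost
#   f(x) = 0.5*||A x - y||^2 + (lam/2) * x^T L x
# where A is a circular 3-tap moving average and L a cyclic graph Laplacian.
def forward(x):                  # 3-tap moving-average "blur" (stand-in for A)
    n = len(x)
    return [(x[(i-1) % n] + x[i] + x[(i+1) % n]) / 3.0 for i in range(n)]

def mbir(y, lam=0.1, step=0.2, iters=500):
    x = [0.0] * len(y)
    n = len(y)
    for _ in range(iters):
        r = [a - b for a, b in zip(forward(x), y)]        # residual A x - y
        g_data = forward(r)                               # A^T r (A symmetric)
        g_prior = [2*x[i] - x[(i-1) % n] - x[(i+1) % n] for i in range(n)]
        x = [xi - step * (gd + lam * gp)
             for xi, gd, gp in zip(x, g_data, g_prior)]
    return x

y = forward([0.0, 0.0, 1.0, 1.0, 0.0, 0.0])   # simulated measurement
x_hat = mbir(y)
print([round(v, 2) for v in x_hat])           # peaks where the object was
```

The prior is what stabilizes the modes the forward model cannot see, which is the role the MBIR prior plays for the incomplete VFET measurements.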

  19. 3D reconstruction of the magnetic vector potential using model based iterative reconstruction

    DOE PAGES

    Prabhat, K. C.; Aditya Mohan, K.; Phatak, Charudatta; ...

    2017-07-03

    Lorentz transmission electron microscopy (LTEM) observations of magnetic nanoparticles contain information on the magnetic and electrostatic potentials. Vector field electron tomography (VFET) can be used to reconstruct electromagnetic potentials of the nanoparticles from their corresponding LTEM images. The VFET approach is based on the conventional filtered back projection approach to tomographic reconstruction; because only an incomplete set of measurements is available due to experimental limitations, the reconstructed vector fields exhibit significant artifacts. In this paper, we outline a model-based iterative reconstruction (MBIR) algorithm to reconstruct the magnetic vector potential of magnetic nanoparticles. We combine a forward model for image formation in TEM experiments with a prior model to formulate the tomographic problem as a maximum a posteriori (MAP) probability estimation problem. The MAP cost function is minimized iteratively to determine the vector potential. Here, a comparative reconstruction study of simulated as well as experimental data sets shows that the MBIR approach yields quantifiably better reconstructions than the VFET approach.

  20. A novel strategy for forensic age prediction by DNA methylation and support vector regression model

    PubMed Central

    Xu, Cheng; Qu, Hongzhu; Wang, Guangyu; Xie, Bingbing; Shi, Yi; Yang, Yaran; Zhao, Zhao; Hu, Lan; Fang, Xiangdong; Yan, Jiangwei; Feng, Lei

    2015-01-01

    High deviations resulting from the prediction model, gender, and population differences have limited the application of DNA methylation markers to age estimation. Here we identified 2,957 novel age-associated DNA methylation sites (P < 0.01 and R2 > 0.5) in blood of eight pairs of Chinese Han female monozygotic twins. Among them, nine novel sites (false discovery rate < 0.01), along with three other reported sites, were further validated in 49 unrelated female volunteers with ages of 20–80 years by Sequenom MassARRAY. A total of 95 CpGs were covered in the PCR products, and 11 of them were used to build the age prediction models. After comparing four different models, including multivariate linear regression, multivariate nonlinear regression, back-propagation neural network, and support vector regression, SVR was identified as the most robust model, with the least mean absolute deviation from real chronological age (2.8 years) and an average accuracy of 4.7 years when predicting with only six of the 11 loci, as well as a smaller cross-validated error compared with the linear regression model. Our novel strategy provides an accurate measurement that is highly useful in estimating individual age in forensic practice as well as in tracking the aging process in other related applications. PMID:26635134
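The model-selection criterion used here, mean absolute deviation (MAD) from chronological age, is simple to state in code. The two candidate "models" and the ages below are invented stand-ins, not the study's SVR or regression fits:

```python
# Hedged sketch: pick the age-prediction model with the smallest mean
# absolute deviation from chronological age. All numbers are illustrative.
def mad(predicted, actual):
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

ages = [25, 40, 55, 70]                  # chronological ages (toy)
model_a = [28, 38, 59, 65]               # e.g. a linear-regression predictor
model_b = [26, 41, 53, 72]               # e.g. an SVR predictor
best = min([("A", mad(model_a, ages)), ("B", mad(model_b, ages))],
           key=lambda t: t[1])
print(best)                              # model with the lower MAD wins
```

In the study this comparison (plus cross-validated error) is what singled out SVR over the three regression-style alternatives.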

  1. Analysis of an Environmental Exposure Health Questionnaire in a Metropolitan Minority Population Utilizing Logistic Regression and Support Vector Machines

    PubMed Central

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D.; Hood, Darryl B.; Skelton, Tyler

    2014-01-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminant power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire. PMID:23395953

  2. An ultra low power feature extraction and classification system for wearable seizure detection.

    PubMed

    Page, Adam; Pramod, Siddharth; Oates, Tim; Mohsenin, Tinoosh

    2015-01-01

    In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-element feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five different classifiers that were explored, logistic regression (LR) proved to have the minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to previous work.
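The inference step that makes LR attractive in hardware is tiny: a dot product, a bias, and a sigmoid. A sketch with invented weights (the real classifier operates on the 198-element feature vector described above):

```python
import math

# Hedged sketch of logistic-regression inference, the operation an ASIC/FPGA
# implementation maps to a MAC array plus one nonlinearity. Weights, bias,
# and features are illustrative stand-ins.
def logistic_predict(features, weights, bias, threshold=0.5):
    z = bias + sum(f * w for f, w in zip(features, weights))
    p = 1.0 / (1.0 + math.exp(-z))
    return p, p >= threshold           # (seizure probability, decision)

features = [0.2, 1.5, -0.3]            # stand-in for the 198-element vector
weights = [0.8, 1.1, -0.5]
print(logistic_predict(features, weights, bias=-1.0))
```

The small constant footprint of this computation, versus the distance searches or kernel evaluations of k-NN and SVM, is why LR wins on area, power, and latency.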

  3. [Quantitative structure-gas chromatographic retention relationship of polycyclic aromatic sulfur heterocycles using molecular electronegativity-distance vector].

    PubMed

    Li, Zhenghua; Cheng, Fansheng; Xia, Zhining

    2011-01-01

    The chemical structures of 114 polycyclic aromatic sulfur heterocycles (PASHs) have been studied by the molecular electronegativity-distance vector (MEDV). The linear relationships between the gas chromatographic retention index and the MEDV have been established by a multiple linear regression (MLR) model. The results of variable selection by stepwise multiple regression (SMR) and the predictive abilities of the optimized model, appraised by leave-one-out cross-validation, showed that the optimized model, with a correlation coefficient (R) of 0.9947 and a cross-validated correlation coefficient (Rcv) of 0.9940, possessed the best statistical quality. Furthermore, when the 114 PASH compounds were divided into calibration and test sets in the ratio of 2:1, the statistical analysis showed that our models possess almost equal statistical quality, very similar regression coefficients, and good robustness. The quantitative structure-retention relationship (QSRR) model established here may provide a convenient and powerful method for predicting the gas chromatographic retention of PASHs.
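The leave-one-out cross-validation used to appraise the model can be sketched for a one-variable linear fit: refit the model with each observation held out and accumulate the squared prediction errors (the PRESS statistic that Rcv is built from). The data below are invented, not MEDV descriptors:

```python
# Hedged sketch of leave-one-out cross-validation for a simple linear model.
# xs/ys are illustrative; the study's model has multiple MEDV descriptors.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return b, my - b * mx              # slope, intercept

def loo_press(xs, ys):
    """Sum of squared leave-one-out prediction errors (PRESS)."""
    press = 0.0
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i+1:], ys[:i] + ys[i+1:]
        b, a = fit_line(tx, ty)        # refit without observation i
        press += (ys[i] - (a + b * xs[i])) ** 2
    return press

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]        # roughly y = 2x with small noise
print(round(loo_press(xs, ys), 3))
```

A cross-validated correlation such as Rcv then compares PRESS to the total variance of the response, penalizing models that only fit their own calibration set.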

  4. Analysis of an environmental exposure health questionnaire in a metropolitan minority population utilizing logistic regression and Support Vector Machines.

    PubMed

    Chen, Chau-Kuang; Bruce, Michelle; Tyler, Lauren; Brown, Claudine; Garrett, Angelica; Goggins, Susan; Lewis-Polite, Brandy; Weriwoh, Mirabel L; Juarez, Paul D; Hood, Darryl B; Skelton, Tyler

    2013-02-01

    The goal of this study was to analyze a 54-item instrument for assessment of perception of exposure to environmental contaminants within the context of the built environment, or exposome. This exposome was defined in five domains to include 1) home and hobby, 2) school, 3) community, 4) occupation, and 5) exposure history. Interviews were conducted with child-bearing-age minority women at Metro Nashville General Hospital at Meharry Medical College. Data were analyzed utilizing DTReg software for Support Vector Machine (SVM) modeling followed by an SPSS package for a logistic regression model. The target (outcome) variable of interest was respondent's residence by ZIP code. The results demonstrate that the rank order of important variables with respect to SVM modeling versus traditional logistic regression models is almost identical. This is the first study documenting that SVM analysis has discriminant power for determination of higher-ordered spatial relationships on an environmental exposure history questionnaire.

  5. Alternative methods to evaluate trial level surrogacy.

    PubMed

    Abrahantes, Josè Cortiñas; Shkedy, Ziv; Molenberghs, Geert

    2008-01-01

    The evaluation and validation of surrogate endpoints have been extensively studied in the last decade. Prentice [1] and Freedman, Graubard and Schatzkin [2] laid the foundations for the evaluation of surrogate endpoints in randomized clinical trials. Later, Buyse et al. [5] proposed a meta-analytic methodology, producing different methods for different settings, which was further studied by Alonso and Molenberghs [9] in their unifying approach based on information theory. In this article, we focus our attention on trial-level surrogacy and propose alternative procedures to evaluate this surrogacy measure that do not pre-specify the type of association. A promising correction based on cross-validation is investigated, as well as the construction of confidence intervals for this measure. In order to avoid making assumptions about the type of relationship between the treatment effects and its distribution, a collection of alternative methods based on regression trees, bagging, random forests, and support vector machines, combined with bootstrap-based confidence intervals and, should one wish, a cross-validation-based correction, is proposed and applied. We apply the various strategies to data from three clinical studies: in ophthalmology, in advanced colorectal cancer, and in schizophrenia. The results obtained for the three case studies are compared; they indicate that using random forest or bagging models produces larger estimated values for the surrogacy measure, which are in general more stable and have narrower confidence intervals than those from linear regression and support vector regression. For the advanced colorectal cancer studies, we even found that the trial-level surrogacy is considerably different from what has been reported. In general the alternative methods are more computationally demanding; in particular, the calculation of the confidence intervals requires more computational time than the delta-method counterpart.
    First, more flexible modeling techniques can be used, allowing for other types of association. Second, when no cross-validation-based correction is applied, overly optimistic trial-level surrogacy estimates will be found; thus cross-validation is highly recommended. Third, the use of the delta method to calculate confidence intervals is not recommended, since it makes assumptions valid only in very large samples and may also produce range-violating limits. We therefore recommend alternatives: bootstrap methods in general. The information-theoretic approach also produces results comparable with the bagging and random forest approaches when cross-validation correction is applied. It is also important to observe that, even in cases where the linear model is a good option, bagging methods perform well and their confidence intervals are narrower.
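The bootstrap confidence interval the authors recommend over the delta method can be sketched as a percentile bootstrap of a correlation-type statistic between trial-level treatment effects on the surrogate and on the true endpoint. The effects, the choice of plain correlation as the statistic, and all tuning values below are invented for illustration:

```python
import random

# Hedged sketch of a percentile-bootstrap CI for a surrogacy-style measure.
# The statistic (plain Pearson correlation) and data are stand-ins.
def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (sx * sy)

def bootstrap_ci(xs, ys, stat=corr, n_boot=2000, alpha=0.05, seed=1):
    rng = random.Random(seed)
    stats, n = [], len(xs)
    for _ in range(n_boot):
        sample = [rng.randrange(n) for _ in range(n)]   # resample trials
        try:
            stats.append(stat([xs[i] for i in sample],
                              [ys[i] for i in sample]))
        except ZeroDivisionError:       # degenerate resample; skip it
            continue
    stats.sort()
    m = len(stats)
    return stats[int(alpha / 2 * m)], stats[int((1 - alpha / 2) * m) - 1]

surrogate = [0.2, 0.5, 0.9, 1.3, 1.8, 2.1, 2.6, 3.0]   # toy trial effects
true_end = [0.3, 0.6, 1.1, 1.2, 1.9, 2.0, 2.8, 2.9]
print(bootstrap_ci(surrogate, true_end))
```

Unlike the delta method, nothing here assumes large samples, and percentile limits cannot leave the statistic's admissible range.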

  6. Global Transport Networks and Infectious Disease Spread

    PubMed Central

    Tatem, A.J.; Rogers, D.J.; Hay, S.I.

    2011-01-01

    Air, sea and land transport networks continue to expand in reach, speed of travel and volume of passengers and goods carried. Pathogens and their vectors can now move further, faster and in greater numbers than ever before. Three important consequences of global transport network expansion are infectious disease pandemics, vector invasion events and vector-borne pathogen importation. This review briefly examines some of the important historical examples of these disease and vector movements, such as the global influenza pandemics, the devastating Anopheles gambiae invasion of Brazil and the recent increases in imported Plasmodium falciparum malaria cases. We then outline potential approaches for future studies of disease movement, focussing on vector invasion and vector-borne disease importation. Such approaches allow us to explore the potential implications of international air travel, shipping routes and other methods of transport on global pathogen and vector traffic. PMID:16647974

  7. Major vectors and vector-borne diseases in small ruminants in Ethiopia: A systematic review.

    PubMed

    Asmare, Kassahun; Abayneh, Takele; Sibhat, Berhanu; Shiferaw, Dessie; Szonyi, Barbara; Krontveit, Randi I; Skjerve, Eystein; Wieland, Barbara

    2017-06-01

    Vector-borne diseases are among the major health constraints of small ruminants in Ethiopia. While various studies on single vector-borne diseases or the presence of vectors have been conducted, no summarized evidence is available on the occurrence of these diseases and the related vectors. This systematic literature review provides a comprehensive summary of major vectors and vector-borne diseases in small ruminants in Ethiopia. The search for published and unpublished literature was conducted between the 8th of January and the 25th of June 2015. The search was both manual and electronic. The databases used in the electronic search were PubMed, Web of Science, CAB Direct and AJOL. For most of the vector-borne diseases, the summary was limited to narrative synthesis due to lack of sufficient data. Meta-analysis was computed for trypanosomosis and dermatophilosis, while meta-regression and sensitivity analysis were done only for trypanosomosis due to the lack of sufficient reports on dermatophilosis. Given their role as vectors, ticks and flies were summarized narratively at the genus/species level. In line with the inclusion criteria, 43 peer-reviewed articles out of 106 initially identified research reports passed the quality assessment. Data on 7 vector-borne diseases were extracted at species and region level from each source. Accordingly, the pooled prevalence estimate of trypanosomosis was 3.7% (95% confidence interval (CI): 2.8, 4.9), while that of dermatophilosis was 3.1% (95% CI: 1.6, 6.0). The between-study variance noted for trypanosomosis was statistically significant (p<0.05). Among the three covariates considered for meta-regression, only one (species) fitted the final model significantly (p<0.05) and explained 65.44% of the between-study variance (R²). The prevalence in sheep (5.5%) increased nearly by 34% compared to goats (2.9%). Parasitic presence in blood was documented for babesiosis (3.7% in goats) and anaplasmosis (3.9% in sheep).
    Serological evidence was retrieved for bluetongue, ranging from 34.1% to 46.67% in sheep, and for coxiellosis (10.4% in goats). There was also molecular evidence of the presence of theileriosis in sheep (93%, n=160) and goats (1.9%, n=265). Regarding vectors of veterinary importance, 14 species of ticks in five genera, four species of Glossina, and four genera of biting flies were reported. Despite the evidence on the presence of various vectors including ticks, flies, mosquitoes and midges, studies on vector-borne diseases in Ethiopia are surprisingly rare, especially considering risks related to climate change, which is likely to affect the distribution of vectors. Thus, better evidence on the current situation is urgently needed in order to prevent spread and to model future distribution scenarios. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Initial Flight Test Evaluation of the F-15 ACTIVE Axisymmetric Vectoring Nozzle Performance

    NASA Technical Reports Server (NTRS)

    Orme, John S.; Hathaway, Ross; Ferguson, Michael D.

    1998-01-01

    A full-envelope database of thrust-vectoring axisymmetric nozzle performance for the Pratt & Whitney Pitch/Yaw Balance Beam Nozzle (P/YBBN) is being developed using the F-15 Advanced Control Technology for Integrated Vehicles (ACTIVE) aircraft. At this time, flight research has been completed for steady-state pitch vector angles up to 20° at an altitude of 30,000 ft from low power settings to maximum afterburner power. The nozzle performance database includes vector forces, internal nozzle pressures, and temperatures, all of which can be used for regression analysis modeling. The database was used to substantiate a set of nozzle performance data from wind tunnel testing and computational fluid dynamic analyses. Findings from initial flight research at Mach 0.9 and 1.2 are presented in this paper. The results show that vector efficiency is strongly influenced by power setting. A significant discrepancy in nozzle performance has been discovered between predicted and measured results during vectoring.

  9. ADMET Evaluation in Drug Discovery. 18. Reliable Prediction of Chemical-Induced Urinary Tract Toxicity by Boosting Machine Learning Approaches.

    PubMed

    Lei, Tailong; Sun, Huiyong; Kang, Yu; Zhu, Feng; Liu, Hui; Zhou, Wenfang; Wang, Zhe; Li, Dan; Li, Youyong; Hou, Tingjun

    2017-11-06

    Xenobiotic chemicals and their metabolites are mainly excreted out of our bodies by the urinary tract through the urine. Chemical-induced urinary tract toxicity is one of the main reasons that cause failure during drug development, and it is a common adverse event for medications, natural supplements, and environmental chemicals. Despite its importance, there are only a few in silico models for assessing urinary tract toxicity for a large number of compounds with diverse chemical structures. Here, we developed a series of qualitative and quantitative structure-activity relationship (QSAR) models for predicting urinary tract toxicity. In our study, the recursive feature elimination method incorporated with random forests (RFE-RF) was used for dimension reduction, and then eight machine learning approaches were used for QSAR modeling, i.e., relevance vector machine (RVM), support vector machine (SVM), regularized random forest (RRF), C5.0 trees, eXtreme gradient boosting (XGBoost), AdaBoost.M1, SVM boosting (SVMBoost), and RVM boosting (RVMBoost). For building classification models, the synthetic minority oversampling technique was used to handle the imbalance data set problem. Among all the machine learning approaches, SVMBoost based on the RBF kernel achieves both the best quantitative (qext^2 = 0.845) and qualitative predictions for the test set (MCC of 0.787, AUC of 0.893, sensitivity of 89.6%, specificity of 94.1%, and global accuracy of 90.8%). The application domains were then analyzed, and all of the tested chemicals fall within the application domain coverage. We also examined the structure features of the chemicals with large prediction errors. In brief, both the regression and classification models developed by the SVMBoost approach have reliable prediction capability for assessing chemical-induced urinary tract toxicity.

  10. Directional Characteristics of Inner Shelf Internal Tides

    DTIC Science & Technology

    2007-06-01

    Figure 18 (YD 202-206): current vector plot of significant events, where significant events include internal tidal bores, solibores, and solitons. Companion figures regress the upper-column leading-edge cross-shore current velocity against the cross-shore wind, and the along-shore current velocity against the along-shore wind, for these significant events (bores, solibores, and solitons).

  11. Predicting surface fuel models and fuel metrics using lidar and CIR imagery in a dense mixed conifer forest

    Treesearch

    Marek K. Jakubowksi; Qinghua Guo; Brandon Collins; Scott Stephens; Maggi Kelly

    2013-01-01

    We compared the ability of several classification and regression algorithms to predict forest stand structure metrics and standard surface fuel models. Our study area spans a dense, topographically complex Sierra Nevada mixed-conifer forest. We used clustering, regression trees, and support vector machine algorithms to analyze high density (average 9 pulses/m

  12. A simple map-based localization strategy using range measurements

    NASA Astrophysics Data System (ADS)

    Moore, Kevin L.; Kutiyanawala, Aliasgar; Chandrasekharan, Madhumita

    2005-05-01

    In this paper we present a map-based approach to localization. We consider indoor navigation in known environments based on the idea of a "vector cloud" by observing that any point in a building has an associated vector defining its distance to the key structural components (e.g., walls, ceilings, etc.) of the building in any direction. Given a building blueprint we can derive the "ideal" vector cloud at any point in space. Then, given measurements from sensors on the robot we can compare the measured vector cloud to the possible vector clouds cataloged from the blueprint, thus determining location. We present algorithms for implementing this approach to localization, using the Hamming norm, the 1-norm, and the 2-norm. The effectiveness of the approach is verified by experiments on a 2-D testbed using a mobile robot with a 360° laser range-finder and through simulation analysis of robustness.
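
    The matching step described above reduces to a nearest-cloud search under the three norms mentioned; a minimal numpy sketch (the rounding tolerance used for the Hamming comparison is an assumption):

```python
import numpy as np

def localize(measured, catalog, norm="2"):
    """Match a measured range 'vector cloud' against clouds precomputed from
    the blueprint; return the index of the best-matching location.
    `catalog` has shape (n_locations, n_directions)."""
    diff = catalog - measured
    if norm == "hamming":
        # count of directions that disagree, after rounding to 0.1 units
        scores = np.count_nonzero(np.round(diff, 1), axis=1)
    elif norm == "1":
        scores = np.abs(diff).sum(axis=1)
    else:
        # Euclidean 2-norm
        scores = np.sqrt((diff ** 2).sum(axis=1))
    return int(np.argmin(scores))
```

    The returned index identifies the cataloged location whose ideal vector cloud best explains the sensor measurements.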

  13. Disabled infectious single cycle-herpes simplex virus (DISC-HSV) as a vector for immunogene therapy of cancer.

    PubMed

    Rees, Robert C; McArdle, Stephanie; Mian, Shahid; Li, Geng; Ahmad, Murrium; Parkinson, Richard; Ali, Selman A

    2002-02-01

    Disabled infectious single cycle-herpes simplex viruses (DISC-HSV) have been shown to be safe for use in humans and may be considered efficacious as vectors for immunogene therapy in cancer. Preclinical studies show that DISC-HSV is an efficient delivery system for cytokine genes and antigens. DISC-HSV infects a high proportion of cells, resulting in rapid gene expression for at least 72 h. The DISC-HSV-mGM-CSF vector, when inoculated into tumors, induces tumor regression in a high percentage of animals, concomitant with establishing a cytotoxic T-cell response, which is MHC class I restricted and directed against peptides of known tumor antigens. The inherent properties of DISC-HSV make it a suitable vector for consideration in human immunogene therapy trials.

  14. Word-level recognition of multifont Arabic text using a feature vector matching approach

    NASA Astrophysics Data System (ADS)

    Erlandson, Erik J.; Trenkle, John M.; Vogt, Robert C., III

    1996-03-01

    Many text recognition systems recognize text imagery at the character level and assemble words from the recognized characters. An alternative approach is to recognize text imagery at the word level, without analyzing individual characters. This approach avoids the problem of individual character segmentation, and can overcome local errors in character recognition. A word-level recognition system for machine-printed Arabic text has been implemented. Arabic is a script language, and is therefore difficult to segment at the character level. Character segmentation has been avoided by recognizing text imagery of complete words. The Arabic recognition system computes a vector of image-morphological features on a query word image. This vector is matched against a precomputed database of vectors from a lexicon of Arabic words. Vectors from the database with the highest match score are returned as hypotheses for the unknown image. Several feature vectors may be stored for each word in the database. Database feature vectors generated using multiple fonts and noise models allow the system to be tuned to its input stream. Used in conjunction with database pruning techniques, this Arabic recognition system has obtained promising word recognition rates on low-quality multifont text imagery.
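
    A word-level matcher of this kind amounts to ranking lexicon entries by a match score between feature vectors, keeping the best score across the several stored vectors per word; a minimal sketch using cosine similarity (the scoring function and the sample words are assumptions, not the paper's actual image-morphological features):

```python
import numpy as np

def word_hypotheses(query, database, words, top=3):
    """Rank lexicon words by cosine match score between the query word
    image's feature vector and precomputed database vectors; several
    database vectors may map to the same word (multiple fonts, noise)."""
    db = np.asarray(database, dtype=float)
    q = np.asarray(query, dtype=float)
    scores = db @ q / (np.linalg.norm(db, axis=1) * np.linalg.norm(q))
    # keep the best score per word, then rank words by that score
    best = {}
    for w, s in zip(words, scores):
        best[w] = max(s, best.get(w, -1.0))
    return sorted(best, key=best.get, reverse=True)[:top]
```

    Returning the top-scoring words as hypotheses, rather than a single answer, mirrors the paper's use of ranked candidates from the lexicon database.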

  15. Kernel approach to molecular similarity based on iterative graph similarity.

    PubMed

    Rupp, Matthias; Proschak, Ewgenij; Schneider, Gisbert

    2007-01-01

    Similarity measures for molecules are of basic importance in chemical, biological, and pharmaceutical applications. We introduce a molecular similarity measure defined directly on the annotated molecular graph, based on iterative graph similarity and optimal assignments. We give an iterative algorithm for the computation of the proposed molecular similarity measure, prove its convergence and the uniqueness of the solution, and provide an upper bound on the required number of iterations necessary to achieve a desired precision. Empirical evidence for the positive semidefiniteness of certain parametrizations of our function is presented. We evaluated our molecular similarity measure by using it as a kernel in support vector machine classification and regression applied to several pharmaceutical and toxicological data sets, with encouraging results.

  16. Polarization Control with Plasmonic Antenna Tips: A Universal Approach to Optical Nanocrystallography and Vector-Field Imaging

    NASA Astrophysics Data System (ADS)

    Park, Kyoung-Duck; Raschke, Markus B.

    2018-05-01

    Controlling the propagation and polarization vectors in linear and nonlinear optical spectroscopy makes it possible to probe the anisotropy of optical responses, providing structural-symmetry-selective contrast in optical imaging. Here we present a novel tilted antenna-tip approach to control the optical vector field by breaking the axial symmetry of the nano-probe in tip-enhanced near-field microscopy. This gives rise to a localized plasmonic antenna effect with significantly enhanced optical field vectors and control of both in-plane and out-of-plane components. We use the resulting vector-field specificity in the symmetry-selective nonlinear optical response of second-harmonic generation (SHG) for a generalized approach to optical nano-crystallography and -imaging. In tip-enhanced SHG imaging of monolayer MoS2 films and single-crystalline ferroelectric YMnO3, we reveal nano-crystallographic details of domain boundaries and domain topology with enhanced sensitivity and nanoscale spatial resolution. The approach is applicable to any anisotropic linear and nonlinear optical response, and provides for optical nano-crystallographic imaging of molecular or quantum materials.

  17. Classification of small lesions on dynamic breast MRI: Integrating dimension reduction and out-of-sample extension into CADx methodology

    PubMed Central

    Nagarajan, Mahesh B.; Huber, Markus B.; Schlossbauer, Thomas; Leinsinger, Gerda; Krol, Andrzej; Wismüller, Axel

    2014-01-01

    Objective While dimension reduction has been previously explored in computer aided diagnosis (CADx) as an alternative to feature selection, previous implementations of its integration into CADx do not ensure strict separation between training and test data required for the machine learning task. This compromises the integrity of the independent test set, which serves as the basis for evaluating classifier performance. Methods and Materials We propose, implement and evaluate an improved CADx methodology where strict separation is maintained. This is achieved by subjecting the training data alone to dimension reduction; the test data is subsequently processed with out-of-sample extension methods. Our approach is demonstrated in the research context of classifying small diagnostically challenging lesions annotated on dynamic breast magnetic resonance imaging (MRI) studies. The lesions were dynamically characterized through topological feature vectors derived from Minkowski functionals. These feature vectors were then subjected to dimension reduction with different linear and non-linear algorithms applied in conjunction with out-of-sample extension techniques. This was followed by classification through supervised learning with support vector regression. Area under the receiver-operating characteristic curve (AUC) was evaluated as the metric of classifier performance. Results Of the feature vectors investigated, the best performance was observed with the Minkowski functional ’perimeter’, while comparable performance was observed with ’area’. Of the dimension reduction algorithms tested with ’perimeter’, the best performance was observed with Sammon’s mapping (0.84 ± 0.10), while comparable performance was achieved with the exploratory observation machine (0.82 ± 0.09) and principal component analysis (0.80 ± 0.10).
Conclusions The results reported in this study with the proposed CADx methodology present a significant improvement over previous results reported with such small lesions on dynamic breast MRI. In particular, non-linear algorithms for dimension reduction exhibited better classification performance than linear approaches, when integrated into our CADx methodology. We also note that while dimension reduction techniques may not necessarily provide an improvement in classification performance over feature selection, they do allow for a higher degree of feature compaction. PMID:24355697

  18. Spectral functions with the density matrix renormalization group: Krylov-space approach for correction vectors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None, None

    Frequency-dependent correlations, such as the spectral function and the dynamical structure factor, help illustrate condensed matter experiments. Within the density matrix renormalization group (DMRG) framework, an accurate method for calculating spectral functions directly in frequency is the correction-vector method. The correction vector can be computed by solving a linear equation or by minimizing a functional. Our paper proposes an alternative to calculate the correction vector: to use the Krylov-space approach. This paper also studies the accuracy and performance of the Krylov-space approach when applied to the Heisenberg, the t-J, and the Hubbard models. The cases we studied indicate that the Krylov-space approach can be more accurate and efficient than the conjugate gradient, and that the error of the former integrates best when a Krylov-space decomposition is also used for ground state DMRG.

  19. Spectral functions with the density matrix renormalization group: Krylov-space approach for correction vectors

    DOE PAGES

    None, None

    2016-11-21

    Frequency-dependent correlations, such as the spectral function and the dynamical structure factor, help illustrate condensed matter experiments. Within the density matrix renormalization group (DMRG) framework, an accurate method for calculating spectral functions directly in frequency is the correction-vector method. The correction vector can be computed by solving a linear equation or by minimizing a functional. Our paper proposes an alternative to calculate the correction vector: to use the Krylov-space approach. This paper also studies the accuracy and performance of the Krylov-space approach when applied to the Heisenberg, the t-J, and the Hubbard models. The cases we studied indicate that the Krylov-space approach can be more accurate and efficient than the conjugate gradient, and that the error of the former integrates best when a Krylov-space decomposition is also used for ground state DMRG.

  20. Efficient modeling of vector hysteresis using a novel Hopfield neural network implementation of Stoner–Wohlfarth-like operators

    PubMed Central

    Adly, Amr A.; Abd-El-Hafiz, Salwa K.

    2012-01-01

    Incorporation of hysteresis models in electromagnetic analysis approaches is indispensable to accurate field computation in complex magnetic media. Throughout those computations, vector nature and computational efficiency of such models become especially crucial when sophisticated geometries requiring massive sub-region discretization are involved. Recently, an efficient vector Preisach-type hysteresis model constructed from only two scalar models having orthogonally coupled elementary operators has been proposed. This paper presents a novel Hopfield neural network approach for the implementation of Stoner–Wohlfarth-like operators that could lead to a significant enhancement in the computational efficiency of the aforementioned model. Advantages of this approach stem from the non-rectangular nature of these operators that substantially minimizes the number of operators needed to achieve an accurate vector hysteresis model. Details of the proposed approach, its identification and experimental testing are presented in the paper. PMID:25685446

  1. Versatile generation of optical vector fields and vector beams using a non-interferometric approach.

    PubMed

    Tripathi, Santosh; Toussaint, Kimani C

    2012-05-07

    We present a versatile, non-interferometric method for generating vector fields and vector beams which can produce all the states of polarization represented on a higher-order Poincaré sphere. The versatility and non-interferometric nature of this method are expected to enable exploration of various exotic properties of vector fields and vector beams. To illustrate this, we study the propagation properties of some vector fields and find that, in general, propagation alters both their intensity and polarization distribution and, more interestingly, converts some vector fields into vector beams. In the article, we also suggest a modified Jones vector formalism to represent vector fields and vector beams.

  2. Vectorized Jiles-Atherton hysteresis model

    NASA Astrophysics Data System (ADS)

    Szymański, Grzegorz; Waszak, Michał

    2004-01-01

    This paper deals with vector hysteresis modeling. A vector model consisting of individual Jiles-Atherton components placed along the principal axes is proposed. Cross-axis coupling ensures general vector model properties. Minor loops are obtained using a scaling method. The model is intended for efficient finite element method computations defined in terms of the magnetic vector potential. Numerical efficiency is ensured by a differential susceptibility approach.

  3. New perspectives in tracing vector-borne interaction networks.

    PubMed

    Gómez-Díaz, Elena; Figuerola, Jordi

    2010-10-01

    Disentangling trophic interaction networks in vector-borne systems has important implications for epidemiological and evolutionary studies. Molecular methods based on bloodmeal typing in vectors have been increasingly used to identify hosts. Although most molecular approaches benefit from good specificity and sensitivity, their temporal resolution is limited by the often rapid digestion of blood, and mixed bloodmeals still remain a challenge for bloodmeal identification in multi-host vector systems. Stable isotope analyses represent a novel complementary tool that can overcome some of these problems. The utility of these methods is discussed using examples from different vector-borne systems, and the extent to which they are complementary and versatile is highlighted. There are excellent opportunities for progress in the study of vector-borne transmission networks resulting from the integration of molecular and stable isotope approaches. Copyright © 2010 Elsevier Ltd. All rights reserved.

  4. Segmentation of discrete vector fields.

    PubMed

    Li, Hongyu; Chen, Wenbin; Shen, I-Fan

    2006-01-01

    In this paper, we propose an approach for 2D discrete vector field segmentation based on the Green function and normalized cut. The method is inspired by discrete Hodge Decomposition such that a discrete vector field can be broken down into three simpler components, namely, curl-free, divergence-free, and harmonic components. We show that the Green Function Method (GFM) can be used to approximate the curl-free and the divergence-free components to achieve our goal of the vector field segmentation. The final segmentation curves that represent the boundaries of the influence region of singularities are obtained from the optimal vector field segmentations. These curves are composed of piecewise smooth contours or streamlines. Our method is applicable to both linear and nonlinear discrete vector fields. Experiments show that the segmentations obtained using our approach essentially agree with human perceptual judgement.

  5. Enhancing vector shoreline data using a data fusion approach

    NASA Astrophysics Data System (ADS)

    Carlotto, Mark; Nebrich, Mark; DeMichele, David

    2017-05-01

    Vector shoreline (VSL) data is potentially useful in ATR systems that distinguish between objects on land or water. Unfortunately, available data such as the NOAA 1:250,000 World Vector Shoreline and NGA Prototype Global Shoreline data cannot be used by themselves to make a land/water determination because of the manner in which the data are compiled. We describe a data fusion approach for creating labeled VSL data using test points from Global 30 Arc-Second Elevation (GTOPO30) data to determine the direction of vector segments, i.e., whether they are in clockwise or counterclockwise order. We show that consistently labeled VSL data can be used to easily determine whether a point is on land or water using a vector cross-product test.
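
    The cross-product test described above can be sketched in a few lines; which side counts as water is a convention fixed by the segment ordering (assumed here to be the positive, left-hand side, an illustrative choice):

```python
def on_water(p, a, b):
    """Sign of the 2D cross product (b - a) x (p - a) gives the side of the
    directed shoreline segment a -> b on which point p lies.  With segments
    stored in a consistent (e.g., counterclockwise) order, water lies on one
    fixed side; here the positive (left) side is assumed to be water."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    return cross > 0
```

    Once every segment carries a consistent orientation, a single sign check per nearby segment settles the land/water question without any area computation.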

  6. Analyzing big data with the hybrid interval regression methods.

    PubMed

    Huang, Chia-Hui; Yang, Keng-Chieh; Kao, Han-Ying

    2014-01-01

    Big data is a new trend with significant impacts on information technologies. In big data applications, one of the chief concerns is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a major challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. The SSVM was recently proposed as an alternative to the standard SVM and has been proved more efficient than the traditional SVM in processing large-scale data. In addition, a soft-margin method is proposed to adjust the excursion of the separation margin and to remain effective in the gray zone, where the data distribution is hard to describe and the separation margin between classes is ambiguous.

  7. Analyzing Big Data with the Hybrid Interval Regression Methods

    PubMed Central

    Kao, Han-Ying

    2014-01-01

    Big data is a new trend with significant impacts on information technologies. In big data applications, one of the chief concerns is dealing with large-scale data sets that often require computation resources provided by public cloud services. How to analyze big data efficiently becomes a major challenge. In this paper, we combine interval regression with the smooth support vector machine (SSVM) to analyze big data. The SSVM was recently proposed as an alternative to the standard SVM and has been proved more efficient than the traditional SVM in processing large-scale data. In addition, a soft-margin method is proposed to adjust the excursion of the separation margin and to remain effective in the gray zone, where the data distribution is hard to describe and the separation margin between classes is ambiguous. PMID:25143968

  8. Dynamic RSA: Examining parasympathetic regulatory dynamics via vector-autoregressive modeling of time-varying RSA and heart period.

    PubMed

    Fisher, Aaron J; Reeves, Jonathan W; Chi, Cyrus

    2016-07-01

    Expanding on recently published methods, the current study presents an approach to estimating the dynamic, regulatory effect of the parasympathetic nervous system on heart period on a moment-to-moment basis. We estimated second-to-second variation in respiratory sinus arrhythmia (RSA) in order to estimate the contemporaneous and time-lagged relationships among RSA, interbeat interval (IBI), and respiration rate via vector autoregression. Moreover, we modeled these relationships at lags of 1 s to 10 s, in order to evaluate the optimal latency for estimating dynamic RSA effects. The IBI (t) on RSA (t-n) regression parameter was extracted from individual models as an operationalization of the regulatory effect of RSA on IBI, referred to as dynamic RSA (dRSA). Dynamic RSA positively correlated with standard averages of heart rate and negatively correlated with standard averages of RSA. We propose that dRSA reflects the active downregulation of heart period by the parasympathetic nervous system and thus represents a novel metric that provides incremental validity in the measurement of autonomic cardiac control; specifically, a method by which parasympathetic regulatory effects can be measured in process. © 2016 Society for Psychophysiological Research.
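
    The core lagged-regression idea, just the IBI(t)-on-RSA(t-n) coefficient rather than the full vector autoregression with respiration, can be sketched with ordinary least squares (a simplified stand-in, not the study's VAR estimation):

```python
import numpy as np

def lagged_effect(ibi, rsa, lag):
    """Least-squares slope of IBI(t) regressed on RSA(t - lag): a simplified
    stand-in for one coefficient of the vector-autoregressive model."""
    y = ibi[lag:]                                # IBI at time t
    x = rsa[:len(rsa) - lag]                     # RSA at time t - lag
    A = np.column_stack([x, np.ones_like(x)])    # slope and intercept design
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return coef[0]
```

    Sweeping `lag` from 1 to 10 and inspecting the resulting coefficients mirrors the study's search for the optimal latency of dynamic RSA effects.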

  9. An ecosystemic approach to evaluating ecological, socioeconomic and group dynamics affecting the prevalence of Aedes aegypti in two Colombian towns.

    PubMed

    Quintero, Juliana; Carrasquilla, Gabriel; Suárez, Roberto; González, Catalina; Olano, Victor A

    2009-01-01

    This article focuses on the epidemiological methods and results of a global Ecohealth study that explored the complexity of the relationship between ecological, biological, economic, social and political factors and vector presence. The study was carried out in two dengue endemic areas of Colombia. A transdisciplinary team gathered quantitative and qualitative data. A survey of randomly sampled households was administered and, simultaneously, direct observation of potential breeding sites was carried out. Logistic regressions and qualitative techniques were used. Qualitative and quantitative data were compared using triangulation. The presence of low water containers increases the risk of finding immature forms of Aedes aegypti in the household seven-fold (OR = 7.5; 95%CI: 1.7-32.2). An inverse association between socioeconomic stratum and presence of the vector was identified (Low stratum OR = 0.9; 95%CI: 0.6-1.4; High stratum OR = 0.4; 95%CI: 0.07-1.7). Water management is a complex social dynamic associated with the presence of Ae. aegypti. Dengue control is a challenge for public health authorities and researchers, as they should design promotion and prevention strategies that take into account cultural, behavioral, socioeconomic and health factors.

  10. Rate determination from vector observations

    NASA Technical Reports Server (NTRS)

    Weiss, Jerold L.

    1993-01-01

    Vector observations are a common class of attitude data provided by a wide variety of attitude sensors. Attitude determination from vector observations is a well-understood process, and numerous algorithms such as the TRIAD algorithm exist. These algorithms require measurement of the line of sight (LOS) vector to reference objects and knowledge of the LOS directions in some predetermined reference frame. Once attitude is determined, it is a simple matter to synthesize vehicle rate using some form of lead-lag filter and then use it for vehicle stabilization. Many situations arise, however, in which rate knowledge is required but knowledge of the nominal LOS directions is not available. This paper presents two methods for determining spacecraft angular rates from vector observations without a priori knowledge of the vector directions. The first approach uses an extended Kalman filter with a spacecraft dynamic model and a kinematic model representing the motion of the observed LOS vectors. The second approach uses a 'differential' TRIAD algorithm to compute the incremental direction cosine matrix, from which vehicle rate is then derived.
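
    For reference, the classical TRIAD computation mentioned above can be sketched in a few lines of numpy (the convention that the attitude matrix maps body-frame vectors into the reference frame is assumed):

```python
import numpy as np

def triad(b1, b2, r1, r2):
    """TRIAD attitude determination: build orthonormal triads from two
    measured body-frame vectors (b1, b2) and their known reference-frame
    directions (r1, r2); the product of the triad matrices is the
    body-to-reference rotation matrix."""
    def frame(v1, v2):
        t1 = v1 / np.linalg.norm(v1)
        t2 = np.cross(v1, v2)
        t2 = t2 / np.linalg.norm(t2)
        t3 = np.cross(t1, t2)
        return np.column_stack([t1, t2, t3])
    return frame(r1, r2) @ frame(b1, b2).T
```

    The first vector pair is trusted exactly while the second only fixes the remaining rotational degree of freedom, which is why TRIAD weights the more accurate sensor as (b1, r1).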

  11. Estimating top-of-atmosphere thermal infrared radiance using MERRA-2 atmospheric data

    NASA Astrophysics Data System (ADS)

    Kleynhans, Tania; Montanaro, Matthew; Gerace, Aaron; Kanan, Christopher

    2017-05-01

    Thermal infrared satellite images have been widely used in environmental studies. However, satellites have limited temporal resolution, e.g., 16 days for Landsat or 1 to 2 days for Terra MODIS. This paper investigates the use of the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) reanalysis data product, produced by NASA's Global Modeling and Assimilation Office (GMAO), to predict global top-of-atmosphere (TOA) thermal infrared radiance. The high temporal resolution of the MERRA-2 data product presents opportunities for novel research and applications. Various methods were applied to estimate TOA radiance from MERRA-2 variables, namely (1) a parameterized physics-based method, (2) linear regression models, and (3) non-linear support vector regression. Model prediction accuracy was evaluated using temporally and spatially coincident Moderate Resolution Imaging Spectroradiometer (MODIS) thermal infrared data as reference data. This research found that support vector regression with a radial basis function kernel produced the lowest error rates. Sources of error are discussed and defined. Further research is currently being conducted to train deep learning models to predict TOA thermal radiance.
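
    As a sketch of the kind of non-linear RBF-kernel regressor the study found most accurate, here is closed-form RBF kernel ridge regression. This is a stand-in for SVR, which additionally uses an epsilon-insensitive loss and sparse support vectors; all parameter values are illustrative:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian radial basis function kernel between two sample matrices."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_predict(X_train, y_train, X_test, gamma=1.0, alpha=1e-3):
    """RBF kernel ridge regression: fit dual coefficients in closed form,
    then predict at the test inputs."""
    K = rbf_kernel(X_train, X_train, gamma)
    coef = np.linalg.solve(K + alpha * np.eye(len(K)), y_train)
    return rbf_kernel(X_test, X_train, gamma) @ coef
```

    In the study's setting, rows of `X_train` would be MERRA-2 atmospheric variables and `y_train` the coincident MODIS-derived TOA radiance.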

  12. Mathematical models application for mapping soils spatial distribution on the example of the farm from the North of Udmurt Republic of Russia

    NASA Astrophysics Data System (ADS)

    Dokuchaev, P. M.; Meshalkina, J. L.; Yaroslavtsev, A. M.

    2018-01-01

    Comparative analysis of soil geospatial modeling using multinomial logistic regression, decision trees, random forest, regression trees and support vector machine algorithms was conducted. Visual interpretation of the resulting digital maps and their comparison with the existing map, as well as quantitative assessment of the overall accuracy and kappa of the models in detecting individual soil groups, showed that multinomial logistic regression, support vector machine, and random forest models with spatial prediction of the conditional soil group distribution can be reliably used for mapping the study area. Detection was most accurate for lightly and moderately eroded sod-podzolic soils (Phaeozems Albic). Next in mean overall prediction accuracy were non-eroded and warp sod-podzolic soils, as well as sod-gley soils (Umbrisols Gleyic) and alluvial soils (Fluvisols Dystric, Umbric). Heavily eroded sod-podzolic and gray forest soils (Phaeozems Albic) were detected worst of all by the automatic classification methods.

  13. On the role of cost-sensitive learning in multi-class brain-computer interfaces.

    PubMed

    Devlaminck, Dieter; Waegeman, Willem; Wyns, Bart; Otte, Georges; Santens, Patrick

    2010-06-01

    Brain-computer interfaces (BCIs) present an alternative way of communication for people with severe disabilities. One of the shortcomings in current BCI systems, recently put forward in the fourth BCI competition, is the asynchronous detection of motor imagery versus resting state. We investigated this extension to the three-class case, in which the resting state is considered virtually lying between two motor classes, resulting in a large penalty when one motor task is misclassified into the other motor class. We particularly focus on the behavior of different machine-learning techniques and on the role of multi-class cost-sensitive learning in such a context. To this end, four different kernel methods are empirically compared, namely pairwise multi-class support vector machines (SVMs), two cost-sensitive multi-class SVMs and kernel-based ordinal regression. The experimental results illustrate that ordinal regression performs better than the other three approaches when a cost-sensitive performance measure such as the mean-squared error is considered. By contrast, multi-class cost-sensitive learning enables us to control the number of large errors made between two motor tasks.

  14. Teaching High School Students Machine Learning Algorithms to Analyze Flood Risk Factors in River Deltas

    NASA Astrophysics Data System (ADS)

    Rose, R.; Aizenman, H.; Mei, E.; Choudhury, N.

    2013-12-01

    High school students interested in the STEM fields benefit most when actively participating, so I created a series of learning modules on how to analyze complex systems using machine learning that give automated feedback to students. The automated feedback gives timely responses that encourage the students to continue testing and enhancing their programs. I designed my modules to take a tactical learning approach in conveying the concepts behind correlation, linear regression, and vector-distance-based classification and clustering. On successful completion of these modules, students will know how to calculate linear regression and Pearson's correlation and how to apply classification and clustering techniques to a dataset. Working through these modules will allow the students to take what they've learned back to the classroom and apply it to the Earth Science curriculum. During my research this summer, we applied these lessons to analyzing river deltas; we looked at trends in the different variables over time, looked for similarities in NDVI, precipitation, inundation, runoff and discharge, and attempted to predict floods based on the precipitation, wave mean, area of discharge, NDVI, and inundation.
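
    The two basic quantities the modules teach can be computed directly; a minimal numpy sketch of Pearson's correlation and simple linear regression:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum())

def linreg(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    slope = ((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum()
    return slope, y.mean() - slope * x.mean()
```

    Applied to the delta data described above, `x` and `y` could be, for example, yearly precipitation and discharge series, with the correlation quantifying how closely they co-vary.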

  15. Domain-Invariant Partial-Least-Squares Regression.

    PubMed

    Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne

    2018-05-11

    Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.

  16. Interactive optimization approach for optimal impulsive rendezvous using primer vector and evolutionary algorithms

    NASA Astrophysics Data System (ADS)

    Luo, Ya-Zhong; Zhang, Jin; Li, Hai-yang; Tang, Guo-Jin

    2010-08-01

    In this paper, a new optimization approach combining primer vector theory and evolutionary algorithms for fuel-optimal non-linear impulsive rendezvous is proposed. The optimization approach is designed to seek the optimal number of impulses as well as the optimal impulse vectors. In this optimization approach, adding a midcourse impulse is determined by an interactive method, i.e. observing the primer-magnitude time history. An improved version of simulated annealing is employed to optimize the rendezvous trajectory with a fixed number of impulses. This interactive approach is evaluated by three test cases: coplanar circle-to-circle rendezvous, same-circle rendezvous and non-coplanar rendezvous. The results show that the interactive approach is effective and efficient in fuel-optimal non-linear rendezvous design. It can guarantee solutions that satisfy Lawden's necessary optimality conditions.

  17. Molecule kernels: a descriptor- and alignment-free quantitative structure-activity relationship approach.

    PubMed

    Mohr, Johannes A; Jain, Brijnesh J; Obermayer, Klaus

    2008-09-01

    Quantitative structure-activity relationship (QSAR) analysis is traditionally based on extracting a set of molecular descriptors and using them to build a predictive model. In this work, we propose a QSAR approach based directly on the similarity between the 3D structures of a set of molecules measured by a so-called molecule kernel, which is independent of the spatial prealignment of the compounds. Predictors can be built using the molecule kernel in conjunction with the potential support vector machine (P-SVM), a recently proposed machine learning method for dyadic data. The resulting models make direct use of the structural similarities between the compounds in the test set and a subset of the training set and do not require an explicit descriptor construction. We evaluated the predictive performance of the proposed method on one classification and four regression QSAR datasets and compared its results to the results reported in the literature for several state-of-the-art descriptor-based and 3D QSAR approaches. In this comparison, the proposed molecule kernel method performed better than the other QSAR methods.

  18. Rotating electrical machines: Poynting flow

    NASA Astrophysics Data System (ADS)

    Donaghy-Spargo, C.

    2017-09-01

    This paper presents a complementary approach to the traditional Lorentz and Faraday approaches that are typically adopted in the classroom when teaching the fundamentals of electrical machines—motors and generators. The approach adopted is based upon the Poynting vector, which illustrates the ‘flow’ of electromagnetic energy. It is shown through simple vector analysis that the energy-flux density flow approach can provide insight into the operation of electrical machines, and it is also shown that the results are in agreement with conventional Maxwell stress-based theory. The advantage of this approach is that it completes the physical picture of the electromechanical energy-conversion process; it also helps maintain student interest in the subject and offers an unconventional application of the Poynting vector during the normal study of electromagnetism.
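
    As a minimal numerical illustration of the central quantity (field values assumed for the example), the Poynting vector is simply the cross product of the electric and magnetic fields:

```python
# Minimal sketch (assumed field values): the Poynting vector S = E x H
# gives the electromagnetic energy-flux density in W/m^2.
import numpy as np

E = np.array([0.0, 100.0, 0.0])   # electric field, V/m (assumed)
H = np.array([0.0, 0.0, 2.0])     # magnetic field, A/m (assumed)

S = np.cross(E, H)                # energy flows along +x here
print(S)                          # -> [200.   0.   0.]
```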

  19. Quantile regression via vector generalized additive models.

    PubMed

    Yee, Thomas W

    2004-07-30

    One of the most popular methods for quantile regression is the LMS method of Cole and Green. The method naturally falls within a penalized likelihood framework, and consequently allows considerable flexibility because all three parameters may be modelled by cubic smoothing splines. The model is also very understandable: for a given value of the covariate, the LMS method applies a Box-Cox transformation to the response in order to transform it to standard normality; to obtain the quantiles, an inverse Box-Cox transformation is applied to the quantiles of the standard normal distribution. The purposes of this article are three-fold. Firstly, LMS quantile regression is presented within the framework of the class of vector generalized additive models. This confers a number of advantages such as a unifying theory and estimation process. Secondly, a new LMS method based on the Yeo-Johnson transformation is proposed, which has the advantage that the response is not restricted to be positive. Lastly, this paper describes a software implementation of three LMS quantile regression methods in the S language. This includes the LMS-Yeo-Johnson method, which is estimated efficiently by a new numerical integration scheme. The LMS-Yeo-Johnson method is illustrated by way of a large cross-sectional data set from a New Zealand working population. Copyright 2004 John Wiley & Sons, Ltd.
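
    The transform-then-invert idea can be sketched as follows with a Yeo-Johnson transform (toy data; this is not the paper's penalized-spline implementation, and no covariate dependence is modelled): map the skewed response toward normality, take standard-normal quantiles, and invert the transform.

```python
# Hedged sketch of the LMS idea using a Yeo-Johnson transform, which
# (unlike Box-Cox) accepts non-positive responses.
import numpy as np
from scipy.stats import norm
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(1)
y = rng.gamma(shape=2.0, scale=1.5, size=500) - 1.0  # skewed, can be negative

pt = PowerTransformer(method="yeo-johnson")          # standardizes by default
pt.fit(y.reshape(-1, 1))

# quantiles: inverse-transform the standard-normal quantiles
q = pt.inverse_transform(norm.ppf([[0.05], [0.5], [0.95]])).ravel()
print(q)                                             # increasing 5%, 50%, 95% quantiles
```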

  20. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
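
    A comparison of this kind can be sketched with cross-validated classifiers on a small 12-feature dataset; the synthetic data below is a stand-in for the sodium-MRI predictors, and the sample size mimics the 47 subjects.

```python
# Hypothetical sketch: three of the tested classifier families compared by
# 5-fold cross-validated accuracy on synthetic 12-predictor data.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

X, y = make_classification(n_samples=47, n_features=12, n_informative=3,
                           random_state=0)   # 47 "subjects", 12 predictors

accs = {}
for name, clf in [("logistic", LogisticRegression(max_iter=1000)),
                  ("linear SVM", SVC(kernel="linear")),
                  ("naive Bayes", GaussianNB())]:
    accs[name] = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {accs[name]:.2f}")
```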

  1. A general approach for developing system-specific functions to score protein-ligand docked complexes using support vector inductive logic programming.

    PubMed

    Amini, Ata; Shrimpton, Paul J; Muggleton, Stephen H; Sternberg, Michael J E

    2007-12-01

    Despite the increased recent use of protein-ligand and protein-protein docking in the drug discovery process due to the increases in computational power, the difficulty of accurately ranking the binding affinities of a series of ligands or a series of proteins docked to a protein receptor remains largely unsolved. This problem is of major concern in lead optimization procedures and has led to the development of scoring functions tailored to rank the binding affinities of a series of ligands to a specific system. However, such methods can take a long time to develop and their transferability to other systems remains open to question. Here we demonstrate that given a suitable amount of background information a new approach using support vector inductive logic programming (SVILP) can be used to produce system-specific scoring functions. Inductive logic programming (ILP) learns logic-based rules for a given dataset that can be used to describe properties of each member of the set in a qualitative manner. By combining ILP with support vector machine regression, a quantitative set of rules can be obtained. SVILP has previously been used in a biological context to examine datasets containing a series of singular molecular structures and properties. Here we describe the use of SVILP to produce binding affinity predictions of a series of ligands to a particular protein. We also for the first time examine the applicability of SVILP techniques to datasets consisting of protein-ligand complexes. Our results show that SVILP performs comparably with other state-of-the-art methods on five protein-ligand systems as judged by similar cross-validated squares of their correlation coefficients. A McNemar test comparing SVILP to CoMFA and CoMSIA across the five systems indicates our method to be significantly better on one occasion.
The ability to graphically display and understand the SVILP-produced rules is demonstrated, and this feature of ILP can be used to derive hypotheses for future ligand design in lead optimization procedures. The approach can readily be extended to evaluate the binding affinities of a series of protein-protein complexes. (c) 2007 Wiley-Liss, Inc.

  2. Reanalyzing the "far medial" (transcondylar-transtubercular) approach based on three anatomical vectors: the ventral posterolateral corridor.

    PubMed

    Chakravarthi, Srikant; Monroy-Sosa, Alejandro; Gonen, Lior; Fukui, Melanie; Rovin, Richard; Kojis, Nathaniel; Lindsay, Mark; Khalili, Sammy; Celix, Juanita; Corsten, Martin; Kassam, Amin B

    2018-06-01

    Endoscopic endonasal access to the jugular foramen and occipital condyle - the transcondylar-transtubercular approach - is anatomically complex and requires detailed knowledge of the relative position of critical neurovascular structures, in order to avoid inadvertent injury and resultant complications. However, access to this region can be confusing as the orientation and relationships of osseous, vascular, and neural structures are very much different from traditional dorsal approaches. This review aims at providing an organizational construct for a more understandable framework in accessing the transcondylar-transtubercular window. The region can be conceptualized using a three-vector coordinate system: vector 1 represents a dorsal or ventral corridor, vector 2 represents the outer and inner circumferential anatomical limits; in an "onion-skin" fashion, key osseous, vascular, and neural landmarks are organized based on a 360-degree skull base model, and vector 3 represents the final core or target of the surgical corridor. The creation of an organized "global-positioning system" may better guide the surgeon in accessing the far-medial transcondylar-transtubercular region, and related pathologies, and help understand the surgical limits to the occipital condyle and jugular foramen - the ventral posterolateral corridor - via the endoscopic endonasal approach.

  3. Construction of siRNA/miRNA expression vectors based on a one-step PCR process

    PubMed Central

    Xu, Jun; Zeng, Jie Qiong; Wan, Gang; Hu, Gui Bin; Yan, Hong; Ma, Li Xin

    2009-01-01

    Background RNA interference (RNAi) has become a powerful means for silencing target gene expression in mammalian cells and is envisioned to be useful in therapeutic approaches to human disease. In recent years, high-throughput, genome-wide screening of siRNA/miRNA libraries has emerged as a desirable approach. Current methods for constructing siRNA/miRNA expression vectors require the synthesis of long oligonucleotides, which is costly and suffers from mutation problems. Results Here we report an ingenious method to solve traditional problems associated with construction of siRNA/miRNA expression vectors. We synthesized shorter primers (< 50 nucleotides) to generate a linear expression structure by PCR. The PCR products were directly transformed into chemically competent E. coli and converted to functional vectors in vivo via homologous recombination. The positive clones could be easily screened under UV light. Using this method we successfully constructed over 500 functional siRNA/miRNA expression vectors. Sequencing of the vectors confirmed a high accuracy rate. Conclusion This novel, convenient, low-cost and highly efficient approach may be useful for high-throughput assays of RNAi libraries. PMID:19490634

  4. Accelerating 4D flow MRI by exploiting vector field divergence regularization.

    PubMed

    Santelli, Claudio; Loecher, Michael; Busch, Julia; Wieben, Oliver; Schaeffter, Tobias; Kozerke, Sebastian

    2016-01-01

    To improve velocity vector field reconstruction from undersampled four-dimensional (4D) flow MRI by penalizing divergence of the measured flow field. Iterative image reconstruction in which magnitude and phase are regularized separately in alternating iterations was implemented. The approach allows incorporating prior knowledge of the flow field being imaged. In the present work, velocity data were regularized to reduce divergence, using either divergence-free wavelets (DFW) or a finite difference (FD) method using the ℓ1-norm of divergence and curl. The reconstruction methods were tested on a numerical phantom and in vivo data. Results of the DFW and FD approaches were compared with data obtained with standard compressed sensing (CS) reconstruction. Relative to standard CS, directional errors of vector fields and divergence were reduced by 55-60% and 38-48% for three- and six-fold undersampled data with the DFW and FD methods. Velocity vector displays of the numerical phantom and in vivo data were found to be improved upon DFW or FD reconstruction. Regularization of vector field divergence in image reconstruction from undersampled 4D flow data is a valuable approach to improve reconstruction accuracy of velocity vector fields. © 2014 Wiley Periodicals, Inc.
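
    The quantity being penalized can be sketched with a finite-difference divergence of a 2D velocity field (assumed grid and field; the paper works with 4D data and an ℓ1 penalty inside the reconstruction):

```python
# Sketch: finite-difference divergence of a 2D velocity field. A physically
# plausible incompressible flow, like this rigid rotation, is divergence-free.
import numpy as np

n = 64
y, x = np.meshgrid(np.linspace(-1, 1, n), np.linspace(-1, 1, n), indexing="ij")
u, v = -y, x                        # x- and y-velocity components (rigid rotation)
h = 2.0 / (n - 1)                   # grid spacing

div = np.gradient(u, h, axis=1) + np.gradient(v, h, axis=0)   # du/dx + dv/dy
print(float(np.abs(div).max()))     # -> 0.0 for this divergence-free field
```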

  5. Rainfall-induced Landslide Susceptibility assessment at the Longnan county

    NASA Astrophysics Data System (ADS)

    Hong, Haoyuan; Zhang, Ying

    2017-04-01

    Landslides are a serious hazard in Longnan county, China; landslide susceptibility assessment is therefore a useful tool for government agencies and decision makers. The main objective of this study is to investigate and compare the frequency ratio, support vector machines, and logistic regression. The Longnan county (Jiangxi province, China) was selected as the case study. First, a landslide inventory map with 354 landslide locations was constructed. The landslide locations were then randomly divided in a ratio of 70/30 for training and validating the models. Second, fourteen landslide conditioning factors were prepared: slope, aspect, altitude, topographic wetness index (TWI), stream power index (SPI), sediment transport index (STI), plan curvature, lithology, distance to faults, distance to rivers, distance to roads, land use, normalized difference vegetation index (NDVI), and rainfall. Using the frequency ratio, support vector machines, and logistic regression, a total of three landslide susceptibility models were constructed. Finally, the overall performance of the resulting models was assessed and compared using the receiver operating characteristic (ROC) curve technique. The results showed that the support vector machines model is the best model in the study area, with a success rate of 88.39% and a prediction rate of 84.06%.
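
    The ROC-based validation step can be sketched as follows; the labels and scores are synthetic stand-ins for landslide / non-landslide locations and model susceptibility values.

```python
# Hedged sketch of ROC evaluation of a susceptibility model on toy data.
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=300)                 # 1 = landslide location
scores = y_true + rng.normal(scale=0.8, size=300)     # informative but noisy model

auc = roc_auc_score(y_true, scores)                   # area under the ROC curve
fpr, tpr, _ = roc_curve(y_true, scores)               # curve for plotting
print(round(auc, 2))
```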

  6. Introducing the Filtered Park's and Filtered Extended Park's Vector Approach to detect broken rotor bars in induction motors independently from the rotor slots number

    NASA Astrophysics Data System (ADS)

    Gyftakis, Konstantinos N.; Marques Cardoso, Antonio J.; Antonino-Daviu, Jose A.

    2017-09-01

    The Park's Vector Approach (PVA), together with its variations, has been one of the most widespread diagnostic methods for electrical machines and drives. Regarding broken rotor bar fault diagnosis in induction motors, the common practice is to rely on the width increase of the Park's Vector (PV) ring and then apply more sophisticated signal processing methods. It is shown in this paper that this method can be unreliable and is strongly dependent on the magnetic pole and rotor slot numbers. To overcome this constraint, the novel Filtered Park's/Extended Park's Vector Approach (FPVA/FEPVA) is introduced. The investigation is carried out with FEM simulations and experimental testing. The simulated and experimental results coincide well, and the proposed FPVA method proves reliable.
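
    The underlying Park's Vector computation can be sketched from three-phase currents; for a healthy, balanced machine the (id, iq) locus is a circle of constant radius, and fault signatures show up as deviations of that ring (assumed 50 Hz supply and unit-amplitude currents):

```python
# Minimal sketch of the Park's Vector from balanced three-phase currents.
import numpy as np

t = np.linspace(0, 0.04, 1000)            # two 50 Hz periods
w = 2 * np.pi * 50
ia = np.cos(w * t)
ib = np.cos(w * t - 2 * np.pi / 3)
ic = np.cos(w * t + 2 * np.pi / 3)

i_d = np.sqrt(2 / 3) * ia - ib / np.sqrt(6) - ic / np.sqrt(6)
i_q = (ib - ic) / np.sqrt(2)

radius = np.hypot(i_d, i_q)               # constant for a healthy machine
print(round(float(radius.std()), 6))      # -> 0.0: a perfectly circular PV ring
```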

  7. Anopheles Vectors in Mainland China While Approaching Malaria Elimination.

    PubMed

    Zhang, Shaosen; Guo, Shaohua; Feng, Xinyu; Afelt, Aneta; Frutos, Roger; Zhou, Shuisen; Manguin, Sylvie

    2017-11-01

    China is approaching malaria elimination; however, well-documented information on malaria vectors is still missing, which could hinder the development of appropriate surveillance strategies and WHO certification. This review summarizes the nationwide distribution of malaria vectors, their bionomic characteristics, control measures, and related studies. After several years of effort, the area of distribution of the principal malaria vectors was reduced, in particular for Anopheles lesteri (synonym: An. anthropophagus) and Anopheles dirus s.l., which nearly disappeared from their former endemic regions. Anopheles sinensis is becoming the predominant species in southwestern China. The bionomic characteristics of these species have changed, and resistance to insecticides was reported. There is a need to update surveillance tools and investigate the role of secondary vectors in malaria transmission. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Reverse chemical ecology approach for the identification of a mosquito oviposition attractant

    USDA-ARS?s Scientific Manuscript database

    Pheromones and other semiochemicals play a crucial role in today’s integrated pest and vector management strategies for controlling populations of insects causing losses to agriculture and vectoring diseases to humans. These semiochemicals are typically discovered by bioassay-guided approaches. Here,...

  9. A vectorial semantics approach to personality assessment.

    PubMed

    Neuman, Yair; Cohen, Yochai

    2014-04-23

    Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy.
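
    The core scoring step can be illustrated with a toy cosine similarity between a text vector and a "dimension" vector; the vocabulary, bag-of-words representation, and dimension weights below are all assumptions (the paper builds its vectors from richer semantic resources).

```python
# Toy sketch of the vectorial-semantics idea: cosine similarity between an
# essay's word-count vector and an assumed personality-dimension vector.
import numpy as np

vocab = ["party", "alone", "worry", "plan"]
extraversion = np.array([1.0, -1.0, 0.0, 0.0])   # assumed dimension vector
text_counts = np.array([3.0, 0.0, 1.0, 1.0])     # word counts in one essay

cos = text_counts @ extraversion / (
    np.linalg.norm(text_counts) * np.linalg.norm(extraversion))
print(round(float(cos), 3))                       # similarity score in [-1, 1]
```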

  10. A Vectorial Semantics Approach to Personality Assessment

    NASA Astrophysics Data System (ADS)

    Neuman, Yair; Cohen, Yochai

    2014-04-01

    Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy.

  11. A Vectorial Semantics Approach to Personality Assessment

    PubMed Central

    Neuman, Yair; Cohen, Yochai

    2014-01-01

    Personality assessment and, specifically, the assessment of personality disorders have traditionally been indifferent to computational models. Computational personality is a new field that involves the automatic classification of individuals' personality traits that can be compared against gold-standard labels. In this context, we introduce a new vectorial semantics approach to personality assessment, which involves the construction of vectors representing personality dimensions and disorders, and the automatic measurements of the similarity between these vectors and texts written by human subjects. We evaluated our approach by using a corpus of 2468 essays written by students who were also assessed through the five-factor personality model. To validate our approach, we measured the similarity between the essays and the personality vectors to produce personality disorder scores. These scores and their correspondence with the subjects' classification of the five personality factors reproduce patterns well-documented in the psychological literature. In addition, we show that, based on the personality vectors, we can predict each of the five personality factors with high accuracy. PMID:24755833

  12. Conceptualizing Vectors in College Geometry: A New Framework for Analysis of Student Approaches and Difficulties

    ERIC Educational Resources Information Center

    Kwon, Oh Hoon

    2012-01-01

    This dissertation documents a new way of conceptualizing vectors in college mathematics, especially in geometry. First, I will introduce three problems to show the complexity and subtlety of the construct of vectors with the classical vector representations. These highlight the need for a new framework that: (1) differentiates abstraction from a…

  13. Deep learning ensemble with asymptotic techniques for oscillometric blood pressure estimation.

    PubMed

    Lee, Soojeong; Chang, Joon-Hyuk

    2017-11-01

    This paper proposes a deep learning based ensemble regression estimator with asymptotic techniques, and offers a method that can decrease uncertainty for oscillometric blood pressure (BP) measurements using the bootstrap and Monte-Carlo approach. While the former is used to estimate SBP and DBP, the latter attempts to determine confidence intervals (CIs) for SBP and DBP based on oscillometric BP measurements. This work originally employs deep belief networks (DBN)-deep neural networks (DNN) to effectively estimate BPs based on oscillometric measurements. However, there are some inherent problems with these methods. First, it is not easy to determine the best DBN-DNN estimator, and worthy information might be omitted when selecting one DBN-DNN estimator and discarding the others. Additionally, our input feature vectors, obtained from only five measurements per subject, represent a very small sample size; this is a critical weakness when using the DBN-DNN technique and can cause overfitting or underfitting, depending on the structure of the algorithm. To address these problems, an ensemble with an asymptotic approach (based on combining the bootstrap with the DBN-DNN technique) is utilized to generate the pseudo features needed to estimate the SBP and DBP. In the first stage, the bootstrap-aggregation technique is used to create ensemble parameters. Afterward, the AdaBoost approach is employed for the second-stage SBP and DBP estimation. We then use the bootstrap and Monte-Carlo techniques in order to determine the CIs based on the target BP estimated using the DBN-DNN ensemble regression estimator with the asymptotic technique in the third stage. The proposed method can mitigate estimation uncertainty, such as a large standard deviation of error (SDE). Comparing the proposed DBN-DNN ensemble regression estimator with the single DBN-DNN regression estimator, we find that the SDEs of the SBP and DBP are reduced by 0.58 and 0.57 mmHg, respectively. These results indicate that the proposed method enhances performance by 9.18% and 10.88% compared with the single DBN-DNN estimator. The proposed methodology improves the accuracy of BP estimation and reduces its uncertainty. Copyright © 2017 Elsevier B.V. All rights reserved.
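
    The bootstrap CI step can be sketched on its own (toy measurements, not the DBN-DNN pipeline): resample the five per-subject readings with replacement and take percentiles of the resampled means.

```python
# Sketch: bootstrap 95% confidence interval from five BP measurements.
import numpy as np

rng = np.random.default_rng(0)
sbp = np.array([118.0, 122.0, 120.0, 125.0, 119.0])   # five readings (assumed)

boot_means = np.array([rng.choice(sbp, size=sbp.size, replace=True).mean()
                       for _ in range(2000)])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"95% CI: [{lo:.1f}, {hi:.1f}]")
```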

  14. Cloud field classification based upon high spatial resolution textural features. II - Simplified vector approaches

    NASA Technical Reports Server (NTRS)

    Chen, D. W.; Sengupta, S. K.; Welch, R. M.

    1989-01-01

    This paper compares the results of cloud-field classification derived from two simplified vector approaches, the Sum and Difference Histogram (SADH) and the Gray Level Difference Vector (GLDV), with the results produced by the Gray Level Cooccurrence Matrix (GLCM) approach described by Welch et al. (1988). It is shown that the SADH method produces accuracies equivalent to those obtained using the GLCM method, while the GLDV method fails to resolve error clusters. Compared to the GLCM method, the SADH method leads to a 31 percent saving in run time and a 50 percent saving in storage requirements, while the GLDV approach leads to a 40 percent saving in run time and an 87 percent saving in storage requirements.
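
    A sum-and-difference histogram can be sketched for a single horizontal displacement on a toy image (Unser-style texture features; the image and the "contrast" feature below are illustrative assumptions, not the paper's cloud data):

```python
# Hedged sketch of SADH texture features for one pixel displacement.
import numpy as np

rng = np.random.default_rng(0)
img = rng.integers(0, 16, size=(32, 32))       # 4-bit toy image

a, b = img[:, :-1], img[:, 1:]                 # pixel pairs, displacement (1, 0)
s = (a + b).ravel()                            # sums  in [0, 30]
d = (a - b).ravel()                            # diffs in [-15, 15]

hs = np.bincount(s, minlength=31) / s.size     # normalized sum histogram
hd = np.bincount(d + 15, minlength=31) / d.size

contrast = float(((np.arange(31) - 15) ** 2 * hd).sum())   # difference-based feature
print(round(contrast, 2))
```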

  15. Vibration-based damage detection in a concrete beam under temperature variations using AR models and state-space approaches

    NASA Astrophysics Data System (ADS)

    Clément, A.; Laurens, S.

    2011-07-01

    The Structural Health Monitoring of civil structures subjected to ambient vibrations is very challenging. Indeed, the variations of environmental conditions and the difficulty of characterizing the excitation make damage detection a hard task. Auto-regressive (AR) model coefficients are often used as a damage-sensitive feature. The presented work proposes a comparison of the AR approach with a state-space feature formed by the Jacobian matrix of the dynamical process. Since the detection of damage can be formulated as a novelty detection problem, Mahalanobis distance is applied to track new points from an undamaged reference collection of feature vectors. Data from a concrete beam subjected to temperature variations and damaged by several static loadings are analyzed. It is observed that the damage-sensitive features are also sensitive to temperature variations. However, the use of the Mahalanobis distance enables the detection of cracking with both feature types. Early damage (before cracking) is only revealed by the AR coefficients, with good sensitivity.
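
    The novelty-detection step can be sketched as follows; the "AR coefficient" feature vectors are synthetic stand-ins, and the shift applied to the damaged sample is an assumption for illustration.

```python
# Sketch: Mahalanobis distance of new feature vectors from an undamaged
# reference collection, the novelty-detection scheme described above.
import numpy as np

rng = np.random.default_rng(0)
ref = rng.normal(size=(200, 4))               # undamaged-state features (toy)
mu = ref.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(ref, rowvar=False))

def mahalanobis(x):
    diff = x - mu
    return float(np.sqrt(diff @ cov_inv @ diff))

healthy = rng.normal(size=4)                  # drawn from the reference state
damaged = rng.normal(size=4) + 3.0            # shifted features mimic damage
print(mahalanobis(healthy), mahalanobis(damaged))
```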

  16. Prediction of Backbreak in Open-Pit Blasting Operations Using the Machine Learning Method

    NASA Astrophysics Data System (ADS)

    Khandelwal, Manoj; Monjezi, M.

    2013-03-01

    Backbreak is an undesirable phenomenon in blasting operations. It can cause instability of mine walls, falling down of machinery, improper fragmentation, reduced efficiency of drilling, etc. The existence of various effective parameters and their unknown relationships are the main reasons for inaccuracy of the empirical models. Presently, the application of new approaches such as artificial intelligence is highly recommended. In this paper, an attempt has been made to predict backbreak in blasting operations of Soungun iron mine, Iran, incorporating rock properties and blast design parameters using the support vector machine (SVM) method. To investigate the suitability of this approach, the predictions by SVM have been compared with multivariate regression analysis (MVRA). The coefficient of determination (CoD) and the mean absolute error (MAE) were taken as performance measures. It was found that the CoD between measured and predicted backbreak was 0.987 and 0.89 by SVM and MVRA, respectively, whereas the MAE was 0.29 and 1.07 by SVM and MVRA, respectively.
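
    The SVM-versus-MVRA comparison can be sketched on synthetic data (toy stand-ins for the rock and blast-design inputs; kernel SVR with default RBF kernel plays the SVM role, linear regression the MVRA role), scored by the same measures, CoD (R²) and MAE:

```python
# Hypothetical sketch: kernel SVR vs. multivariate linear regression on a
# nonlinear synthetic target, scored by coefficient of determination and MAE.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 5))                      # toy blast-design inputs
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
results = {}
for name, model in [("SVM", SVR(C=10.0)), ("MVRA", LinearRegression())]:
    pred = model.fit(Xtr, ytr).predict(Xte)
    results[name] = (r2_score(yte, pred), mean_absolute_error(yte, pred))
    print(f"{name}: CoD={results[name][0]:.2f}, MAE={results[name][1]:.2f}")
```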

  17. Estimating atmospheric visibility using synergy of MODIS data and ground-based observations

    NASA Astrophysics Data System (ADS)

    Komeilian, H.; Mohyeddin Bateni, S.; Xu, T.; Nielson, J.

    2015-05-01

    Dust events are intricate climatic processes, which can have adverse effects on human health, safety, and the environment. In this study, two data mining approaches, namely, back-propagation artificial neural network (BP ANN) and support vector regression (SVR), were used to estimate atmospheric visibility through the synergistic use of Moderate Resolution Imaging Spectroradiometer (MODIS) Level 1B (L1B) data and ground-based observations at fourteen stations in the province of Khuzestan (southwestern Iran), during 2009-2010. Reflectance and brightness temperature in different bands (from MODIS) along with in situ meteorological data were input to the models to estimate atmospheric visibility. The results show that both models can accurately estimate atmospheric visibility. The visibility estimates from the BP ANN network had a root-mean-square error (RMSE) and Pearson's correlation coefficient (R) of 0.67 and 0.69, respectively. The corresponding RMSE and R from the SVR model were 0.59 and 0.71, implying that the SVR approach outperforms the BP ANN.

  18. Combined self-learning based single-image super-resolution and dual-tree complex wavelet transform denoising for medical images

    NASA Astrophysics Data System (ADS)

    Yang, Guang; Ye, Xujiong; Slabaugh, Greg; Keegan, Jennifer; Mohiaddin, Raad; Firmin, David

    2016-03-01

    In this paper, we propose a novel self-learning based single-image super-resolution (SR) method, which is coupled with dual-tree complex wavelet transform (DTCWT) based denoising to better recover high-resolution (HR) medical images. Unlike previous methods, this self-learning based SR approach enables us to reconstruct HR medical images from a single low-resolution (LR) image without extra training on HR image datasets in advance. The relationships between the given image and its scaled down versions are modeled using support vector regression with sparse coding and dictionary learning, without explicitly assuming reoccurrence or self-similarity across image scales. In addition, we perform DTCWT based denoising to initialize the HR images at each scale instead of simple bicubic interpolation. We evaluate our method on a variety of medical images. Both quantitative and qualitative results show that the proposed approach outperforms bicubic interpolation and state-of-the-art single-image SR methods while effectively removing noise.

  19. An improved EMD method for modal identification and a combined static-dynamic method for damage detection

    NASA Astrophysics Data System (ADS)

    Yang, Jinping; Li, Peizhen; Yang, Youfa; Xu, Dian

    2018-04-01

    Empirical mode decomposition (EMD) is a highly adaptable signal processing method. However, the EMD approach has certain drawbacks, including distortions from end effects and mode mixing. In the present study, these two problems are addressed using an end extension method based on the support vector regression machine (SVRM) and a modal decomposition method based on the characteristics of the Hilbert transform. The algorithm includes two steps: using the SVRM, the time series data are extended at both endpoints to reduce the end effects, and then, a modified EMD method using the characteristics of the Hilbert transform is performed on the resulting signal to reduce mode mixing. A new combined static-dynamic method for identifying structural damage is presented. This method combines the static and dynamic information in an equilibrium equation that can be solved using the Moore-Penrose generalized matrix inverse. The combination method uses the differences in displacements of the structure with and without damage and variations in the modal force vector. Tests on a four-story, steel-frame structure were conducted to obtain static and dynamic responses of the structure. The modal parameters are identified using data from the dynamic tests and improved EMD method. The new method is shown to be more accurate and effective than the traditional EMD method. Through tests with a shear-type test frame, the higher performance of the proposed static-dynamic damage detection approach, which can detect both single and multiple damage locations and the degree of the damage, is demonstrated. For structures with multiple damage, the combined approach is more effective than either the static or dynamic method. The proposed EMD method and static-dynamic damage detection method offer improved modal identification and damage detection, respectively, in structures.

  20. Understanding Phlebotomus perniciosus abundance in south-east Spain: assessing the role of environmental and anthropic factors.

    PubMed

    Risueño, José; Muñoz, Clara; Pérez-Cutillas, Pedro; Goyena, Elena; Gonzálvez, Moisés; Ortuño, María; Bernal, Luis Jesús; Ortiz, Juana; Alten, Bulent; Berriatua, Eduardo

    2017-04-19

Leishmaniosis is associated with Phlebotomus sand fly vector density, but our knowledge of the environmental framework that regulates highly overdispersed vector abundance distributions is limited. We used a standardized sampling procedure in the bioclimatically diverse Murcia Region in Spain and multilevel regression models for count data to estimate P. perniciosus abundance in relation to environmental and anthropic factors. Twenty-five dog and sheep premises were sampled for sand flies using adhesive and light-attraction traps from late May to early October 2015. Temperature, relative humidity and other animal- and premise-related data were recorded on site, and other environmental data were extracted from digital databases using a geographical information system. The relationship between sand fly abundance and the explanatory variables was analysed using binomial regression models. The total number of sand flies captured, mostly with light-attraction traps, was 3,644 specimens, including 80% P. perniciosus, the main L. infantum vector in Spain. Abundance varied between and within zones and was positively associated with increasing altitude from 0 to 900 m above sea level, except from 500 to 700 m where it was low. Populations peaked in July, and especially during a 3-day heat wave when relative humidity and wind speed plummeted. Regression models indicated that climate, not land use or soil characteristics, has the greatest impact on this species' density on a large geographical scale. In contrast, micro-environmental factors such as animal building characteristics and husbandry practices affect sand fly population size on a smaller scale. A standardised sampling procedure and statistical analysis for highly overdispersed distributions allow reliable estimation of P. perniciosus abundance and identification of environmental drivers. While climatic variables have the greatest impact at the macro-environmental scale, anthropic factors may be determinant at a micro-geographical scale. These findings may be used to elaborate predictive distribution maps useful for vector and pathogen control programs.

  1. Application of Artificial Neural Network and Support Vector Machines in Predicting Metabolizable Energy in Compound Feeds for Pigs.

    PubMed

    Ahmadi, Hamed; Rodehutscord, Markus

    2017-01-01

In the nutrition literature, there are several reports on the use of artificial neural network (ANN) and multiple linear regression (MLR) approaches for predicting feed composition and nutritive value, while the use of the support vector machine (SVM) method as a new alternative to MLR and ANN models has not yet been fully investigated. MLR, ANN, and SVM models were developed to predict the metabolizable energy (ME) content of compound feeds for pigs based on the German energy evaluation system from analyzed contents of crude protein (CP), ether extract (EE), crude fiber (CF), and starch. A total of 290 datasets from standardized digestibility studies with compound feeds were provided by several institutions and published papers, and ME was calculated thereon. The accuracy and precision of the developed models were evaluated based on their prediction values. The results revealed that the developed ANN [R2 = 0.95; root mean square error (RMSE) = 0.19 MJ/kg of dry matter] and SVM (R2 = 0.95; RMSE = 0.21 MJ/kg of dry matter) models produced better prediction values in estimating ME in compound feed than the conventional MLR model (R2 = 0.89; RMSE = 0.27 MJ/kg of dry matter); however, there were no obvious differences between the performance of the ANN and SVM models. Thus, the SVM model may also be considered a promising tool for modeling the relationship between chemical composition and ME of compound feeds for pigs. To provide readers and nutritionists with an easy and rapid tool, an Excel® calculator, namely SVM_ME_pig, was created to predict the metabolizable energy values in compound feeds for pigs using the developed support vector machine model.
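The SVM-regression side of such a model can be sketched with scikit-learn on synthetic CP/EE/CF/starch data. The generating rule and all coefficients below are invented for illustration and do not reproduce the German energy evaluation system or the paper's datasets.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

rng = np.random.default_rng(0)

# Synthetic stand-in for 290 compound-feed datasets (CP, EE, CF, starch,
# g/kg DM); the linear rule only mimics the general shape of an energy
# evaluation equation, with invented coefficients.
X = rng.uniform([120, 20, 20, 300], [250, 80, 120, 600], size=(290, 4))
me = 0.021 * X[:, 0] + 0.034 * X[:, 1] - 0.015 * X[:, 2] + 0.016 * X[:, 3]
me += rng.normal(0, 0.2, 290)  # measurement noise, MJ/kg DM

X_tr, X_te, y_tr, y_te = train_test_split(X, me, random_state=0)
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)).fit(X_tr, y_tr)

pred = model.predict(X_te)
rmse = np.sqrt(mean_squared_error(y_te, pred))
print(f"R2 = {r2_score(y_te, pred):.2f}, RMSE = {rmse:.2f} MJ/kg DM")
```

Scaling the four chemical inputs before the RBF-kernel SVR matters here, since CP and starch are on much larger numeric ranges than EE.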

  2. Detection of fraudulent financial statements using the hybrid data mining approach.

    PubMed

    Chen, Suduan

    2016-01-01

The purpose of this study is to construct a valid and rigorous fraudulent financial statement detection model. The research objects are companies that experienced both fraudulent and non-fraudulent financial statements between the years 2002 and 2013. In the first stage, two decision tree algorithms, the classification and regression trees (CART) and the Chi-squared automatic interaction detector (CHAID), are applied to the selection of major variables. The second stage combines CART, CHAID, Bayesian belief network, support vector machine and artificial neural network in order to construct fraudulent financial statement detection models. According to the results, the detection performance of the CHAID-CART model is the most effective, with an overall accuracy of 87.97% (the FFS detection accuracy is 92.69%).

  3. Just-in-Time Correntropy Soft Sensor with Noisy Data for Industrial Silicon Content Prediction.

    PubMed

    Chen, Kun; Liang, Yu; Gao, Zengliang; Liu, Yi

    2017-08-08

Development of accurate data-driven quality prediction models for industrial blast furnaces encounters several challenges, mainly because the collected data are nonlinear, non-Gaussian, and unevenly distributed. A just-in-time correntropy-based local soft sensing approach is presented in this work to predict the silicon content. Without cumbersome efforts for outlier detection, a correntropy support vector regression (CSVR) modeling framework is proposed to handle soft sensor development and outlier detection simultaneously. Moreover, with a continuously updating database and a clustering strategy, a just-in-time CSVR (JCSVR) method is developed. Consequently, more accurate prediction and efficient implementation of JCSVR can be achieved. The better prediction performance of JCSVR is validated on online silicon content prediction, compared with traditional soft sensors.

  4. Just-in-Time Correntropy Soft Sensor with Noisy Data for Industrial Silicon Content Prediction

    PubMed Central

    Chen, Kun; Liang, Yu; Gao, Zengliang; Liu, Yi

    2017-01-01

Development of accurate data-driven quality prediction models for industrial blast furnaces encounters several challenges, mainly because the collected data are nonlinear, non-Gaussian, and unevenly distributed. A just-in-time correntropy-based local soft sensing approach is presented in this work to predict the silicon content. Without cumbersome efforts for outlier detection, a correntropy support vector regression (CSVR) modeling framework is proposed to handle soft sensor development and outlier detection simultaneously. Moreover, with a continuously updating database and a clustering strategy, a just-in-time CSVR (JCSVR) method is developed. Consequently, more accurate prediction and efficient implementation of JCSVR can be achieved. The better prediction performance of JCSVR is validated on online silicon content prediction, compared with traditional soft sensors. PMID:28786957
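The robustness intuition behind a correntropy criterion can be shown numerically: the Gaussian-kernel correntropy of the residuals saturates for gross outliers, whereas the mean squared error is dominated by them. This sketch covers only the loss-side intuition, not the paper's full CSVR or just-in-time machinery.

```python
import numpy as np

def correntropy(e, sigma=1.0):
    """Sample correntropy of residuals e: the mean Gaussian kernel value.
    Large errors contribute almost nothing, so gross outliers are smoothly
    down-weighted (illustrative, not the paper's full formulation)."""
    return np.mean(np.exp(-e**2 / (2 * sigma**2)))

rng = np.random.default_rng(0)
clean = rng.normal(0, 0.1, 100)   # well-behaved residuals
dirty = clean.copy()
dirty[:5] += 10.0                 # five gross outliers

# MSE explodes under outliers; correntropy degrades only slightly.
print("MSE:        ", np.mean(clean**2), np.mean(dirty**2))
print("correntropy:", correntropy(clean), correntropy(dirty))
```

Maximizing correntropy (instead of minimizing squared error) is what lets a CSVR-style model tolerate contaminated training data without a separate outlier-removal pass.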

  5. Rapid construction of capsid-modified adenoviral vectors through bacteriophage lambda Red recombination.

    PubMed

    Campos, Samuel K; Barry, Michael A

    2004-11-01

    There are extensive efforts to develop cell-targeting adenoviral vectors for gene therapy wherein endogenous cell-binding ligands are ablated and exogenous ligands are introduced by genetic means. Although current approaches can genetically manipulate the capsid genes of adenoviral vectors, these approaches can be time-consuming and require multiple steps to produce a modified viral genome. We present here the use of the bacteriophage lambda Red recombination system as a valuable tool for the easy and rapid construction of capsid-modified adenoviral genomes.

  6. Robust model predictive control for satellite formation keeping with eccentricity/inclination vector separation

    NASA Astrophysics Data System (ADS)

    Lim, Yeerang; Jung, Youeyun; Bang, Hyochoong

    2018-05-01

This study presents model predictive formation control based on an eccentricity/inclination vector separation strategy. Collision avoidance can be accomplished by using eccentricity/inclination vectors and adding a simple goal function term to the optimization process. Real-time control is also achievable with a model predictive controller based on a convex formulation. A constraint-tightening approach is addressed as well to improve the robustness of the controller, and simulation results are presented to verify the performance enhancement of the proposed approach.

  7. Classification of suicide attempters in schizophrenia using sociocultural and clinical features: A machine learning approach.

    PubMed

    Hettige, Nuwan C; Nguyen, Thai Binh; Yuan, Chen; Rajakulendran, Thanara; Baddour, Jermeen; Bhagwat, Nikhil; Bani-Fatemi, Ali; Voineskos, Aristotle N; Mallar Chakravarty, M; De Luca, Vincenzo

    2017-07-01

Suicide is a major concern for those afflicted by schizophrenia. Identifying patients at the highest risk for future suicide attempts remains a complex problem for psychiatric interventions. Machine learning models allow for the integration of many risk factors in order to build an algorithm that predicts which patients are likely to attempt suicide. Currently it is unclear how to integrate previously identified risk factors into a clinically relevant predictive tool to estimate the probability that a patient with schizophrenia will attempt suicide. We conducted a cross-sectional assessment on a sample of 345 participants diagnosed with schizophrenia spectrum disorders. Suicide attempters and non-attempters were clearly identified using the Columbia Suicide Severity Rating Scale (C-SSRS) and the Beck Suicide Ideation Scale (BSS). We developed four classification algorithms using regularized regression, random forest, elastic net and support vector machine models, with sociocultural and clinical variables as features to train the models. All classification models performed similarly in identifying suicide attempters and non-attempters. Our regularized logistic regression model demonstrated an accuracy of 67% and an area under the curve (AUC) of 0.71, while the random forest model demonstrated 66% accuracy and an AUC of 0.67. The support vector classifier (SVC) model demonstrated an accuracy of 67% and an AUC of 0.70, and the elastic net model demonstrated an accuracy of 65% and an AUC of 0.71. Machine learning algorithms offer a relatively successful method for incorporating many clinical features to predict individuals at risk for future suicide attempts. Increased performance of these models using clinically relevant variables offers the potential to facilitate early treatment and intervention to prevent future suicide attempts. Copyright © 2017 Elsevier Inc. All rights reserved.
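A comparison of this kind can be sketched with scikit-learn. The data below are a synthetic stand-in for the 345-participant sample (invented feature structure and labels), so the AUCs are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

# Synthetic stand-in: 345 subjects, 20 sociocultural/clinical features,
# binary attempter label.
X, y = make_classification(n_samples=345, n_features=20, n_informative=6,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "regularized LR": LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
    "elastic net LR": LogisticRegression(penalty="elasticnet", l1_ratio=0.5,
                                         solver="saga", max_iter=5000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVC": SVC(probability=True, random_state=0),
}

aucs = {}
for name, model in models.items():
    # AUC on the held-out split, from each model's predicted probabilities.
    aucs[name] = roc_auc_score(y_te, model.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {aucs[name]:.2f}")
```

With a sample of this size, cross-validated AUCs (rather than a single split) would give more stable comparisons between the four models.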

  8. Beyond the hype: deep neural networks outperform established methods using a ChEMBL bioactivity benchmark set.

    PubMed

    Lenselink, Eelke B; Ten Dijke, Niels; Bongers, Brandon; Papadatos, George; van Vlijmen, Herman W T; Kowalczyk, Wojtek; IJzerman, Adriaan P; van Westen, Gerard J P

    2017-08-14

The increase of publicly available bioactivity data in recent years has fueled and catalyzed research in chemogenomics, data mining, and modeling approaches. As a direct result, over the past few years a multitude of different methods have been reported and evaluated, such as target fishing, nearest neighbor similarity-based methods, and Quantitative Structure Activity Relationship (QSAR)-based protocols. However, such studies are typically conducted on different datasets, using different validation strategies and different metrics. In this study, different methods were compared using one single standardized dataset obtained from ChEMBL, which is made available to the public, using standardized metrics (BEDROC and Matthews Correlation Coefficient). Specifically, the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods. All methods were validated using both a random split validation and a temporal validation, with the latter being a more realistic benchmark of expected prospective execution. Deep Neural Networks were the top-performing classifiers, highlighting their added value over other, more conventional methods. Moreover, the best method ('DNN_PCM') performed significantly better, at almost one standard deviation above the mean performance. Furthermore, multi-task and PCM implementations were shown to improve performance over single-task Deep Neural Networks. Conversely, target prediction performed almost two standard deviations below the mean performance. Random Forests, Support Vector Machines, and Logistic Regression performed around the mean performance. Finally, using an ensemble of DNNs, alongside additional tuning, enhanced the relative performance by another 27% (compared with the unoptimized 'DNN_PCM'). Here, a standardized set to test and evaluate different machine learning algorithms in the context of multi-task learning is offered by providing the data and the protocols.

  9. Data-driven mapping of the potential mountain permafrost distribution.

    PubMed

    Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

    2017-07-15

Existing mountain permafrost distribution models generally offer a good overview of the potential extent of this phenomenon at a regional scale. They are, however, not always able to reproduce the high spatial discontinuity of permafrost at the micro-scale (the scale of a specific landform; ten to several hundred meters). To overcome this limitation, we tested an alternative modelling approach using three classification algorithms from statistics and machine learning: Logistic regression, Support Vector Machines and Random forests. These supervised learning techniques infer a classification function from labelled training data (pixels of permafrost absence and presence) with the aim of predicting the permafrost occurrence where it is unknown. The research was carried out in a 588 km² area of the Western Swiss Alps. Permafrost evidences were mapped from ortho-image interpretation (rock glacier inventorying) and field data (mainly geoelectrical and thermal data). The relationship between the selected permafrost evidences and permafrost controlling factors was computed with the mentioned techniques. Classification performances, assessed with AUROC, range from 0.81 for Logistic regression to 0.85 for Support Vector Machines and 0.88 for Random forests. The adopted machine learning algorithms proved efficient for permafrost distribution modelling, yielding results consistent with the field reality. The high resolution of the input dataset (10 m) allows maps to be elaborated at the micro-scale, with a modelled permafrost spatial distribution less optimistic than that of classic spatial models. Moreover, the probability output of the adopted algorithms offers a more precise overview of the potential distribution of mountain permafrost than simple permafrost favorability indexes. These encouraging results also open the way to new possibilities of permafrost data analysis and mapping. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Discrete wavelength selection for the optical readout of a metamaterial biosensing system for glucose concentration estimation via a support vector regression model.

    PubMed

    Teutsch, T; Mesch, M; Giessen, H; Tarin, C

    2015-01-01

In this contribution, a method to select discrete wavelengths that allow an accurate estimation of the glucose concentration in a biosensing system based on metamaterials is presented. The sensing concept is adapted to the particular application of ophthalmic glucose sensing by covering the metamaterial with a glucose-sensitive hydrogel, and the sensor readout is performed optically. Because a spectrometer is not suitable in a mobile context, a few discrete wavelengths must be selected to estimate the glucose concentration. The developed selection methods are based on nonlinear support vector regression (SVR) models. Two selection methods are compared, and it is shown that the wavelengths selected by a sequential forward feature selection algorithm achieve an improvement in estimation accuracy. The presented method can be easily applied to different metamaterial layouts and hydrogel configurations.
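Sequential forward selection with an SVR estimator can be sketched with scikit-learn's SequentialFeatureSelector. The spectra below are synthetic (invented informative wavelength indices), so this illustrates the greedy wrapper idea rather than the paper's actual sensor data.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(1)

# Synthetic stand-in for spectra: 50 candidate wavelengths, of which only
# a few respond to the (hypothetical) glucose concentration.
n_samples, n_wavelengths = 120, 50
spectra = rng.normal(size=(n_samples, n_wavelengths))
glucose = (2.0 * spectra[:, 5] - 1.5 * spectra[:, 17]
           + 0.8 * spectra[:, 33] + rng.normal(0, 0.1, n_samples))

# Greedily add the wavelength that most improves the cross-validated SVR
# fit, until three discrete wavelengths are selected.
sfs = SequentialFeatureSelector(SVR(kernel="rbf"), n_features_to_select=3,
                                direction="forward", cv=5)
sfs.fit(spectra, glucose)
print("selected wavelength indices:", sorted(np.flatnonzero(sfs.get_support())))
```

The final SVR would then be retrained on only the selected wavelengths, which is what makes a spectrometer-free, few-wavelength readout feasible.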

  11. Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms

    PubMed Central

    Hu, Zhongyi; Xiong, Tao

    2013-01-01

Electricity load forecasting is an important issue that is widely explored and examined in the power systems operation literature as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering that the performance of SVR depends highly on its parameters, this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA is applied to explore the solution space, and pattern search is used to conduct individual learning and thus enhance the exploitation of the FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithm based SVR models and three well-known forecasting models, but can also outperform the hybrid algorithms in the related existing literature. PMID:24459425

  12. Electricity load forecasting using support vector regression with memetic algorithms.

    PubMed

    Hu, Zhongyi; Bao, Yukun; Xiong, Tao

    2013-01-01

Electricity load forecasting is an important issue that is widely explored and examined in the power systems operation literature as well as in the literature on commercial transactions in electricity markets. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering that the performance of SVR depends highly on its parameters, this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of the SVR forecasting model. In the proposed FA-MA algorithm, the FA is applied to explore the solution space, and pattern search is used to conduct individual learning and thus enhance the exploitation of the FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithm based SVR models and three well-known forecasting models, but can also outperform the hybrid algorithms in the related existing literature.
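The paper tunes the SVR parameters (C, epsilon, gamma) with a firefly-based memetic algorithm; as a simple, library-supported stand-in for that search, the same three parameters can be tuned by randomized search on a toy load series (invented daily profile, not real market data).

```python
import numpy as np
from scipy.stats import loguniform
from sklearn.svm import SVR
from sklearn.model_selection import RandomizedSearchCV

rng = np.random.default_rng(0)

# Toy hourly load series: a daily profile plus noise.
t = np.arange(500)
load = 100 + 20 * np.sin(2 * np.pi * t / 24) + rng.normal(0, 2, t.size)

# Lagged features: predict the load at hour t from the previous 24 hours.
n_lags = 24
X = np.array([load[i:i + n_lags] for i in range(load.size - n_lags)])
y = load[n_lags:]

# Randomized search over (C, epsilon, gamma) as a plain stand-in for the
# FA-MA metaheuristic described in the abstract.
search = RandomizedSearchCV(
    SVR(kernel="rbf"),
    {"C": loguniform(1e0, 1e3), "epsilon": loguniform(1e-3, 1e0),
     "gamma": loguniform(1e-4, 1e-1)},
    n_iter=20, cv=3, random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

The point of the FA-MA (or any such search) is exactly this outer loop: the SVR itself is convex, but its generalization error is a non-convex function of these three hyperparameters.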

  13. Supplier Short Term Load Forecasting Using Support Vector Regression and Exogenous Input

    NASA Astrophysics Data System (ADS)

Matijaš, Marin; Vukićević, Milan; Krajcar, Slavko

    2011-09-01

In power systems, the task of load forecasting is important for keeping the equilibrium between production and consumption. With the liberalization of electricity markets, the task of load forecasting has changed because each market participant has to forecast its own load. The consumption of end-consumers is stochastic in nature. Due to competition, suppliers are not in a position to transfer their costs to end-consumers; therefore, it is essential to keep the forecasting error as low as possible. Numerous papers investigate load forecasting from the perspective of the grid or production planning. We research forecasting models from the perspective of a supplier. In this paper, we investigate different combinations of exogenous inputs on simulated supplier loads and show that using points of delivery as a feature for Support Vector Regression leads to lower forecasting error, while adding the customer number in different datasets does the opposite.

  14. Herpes simplex virus type 1-derived recombinant and amplicon vectors.

    PubMed

    Fraefel, Cornel; Marconi, Peggy; Epstein, Alberto L

    2011-01-01

Herpes simplex virus type 1 (HSV-1) is a human pathogen whose lifestyle is based on a long-term dual interaction with the infected host, being able to establish both lytic and latent infections. The virus genome is a 153 kbp double-stranded DNA molecule encoding more than 80 genes. The interest in HSV-1 as a gene transfer vector stems from its ability to infect many different cell types, both quiescent and proliferating, the very high packaging capacity of the virus capsid, the outstanding neurotropic adaptations that this virus has evolved, and the fact that it never integrates into the cellular chromosomes, thus avoiding the risk of insertional mutagenesis. Two types of vectors can be derived from HSV-1, recombinant vectors and amplicon vectors, and different methodologies have been developed to prepare large stocks of each type of vector. This chapter summarizes (1) the two approaches most commonly used to prepare recombinant vectors through homologous recombination, either in eukaryotic cells or in bacteria, and (2) the two methodologies currently used to generate helper-free amplicon vectors, using either a bacterial artificial chromosome (BAC)-based approach or a Cre/loxP site-specific recombination strategy.

  15. A Novel Vaccine Approach for Chagas Disease Using Rare Adenovirus Serotype 48 Vectors

    PubMed Central

    Farrow, Anitra L.; Peng, Binghao J.; Gu, Linlin; Krendelchtchikov, Alexandre; Matthews, Qiana L.

    2016-01-01

Due to the increasing number of people afflicted worldwide with Chagas disease and its increasing prevalence in the United States, there is a greater need to develop a safe and effective vaccine for this neglected disease. Adenovirus serotype 5 (Ad5) is the most common adenovirus vector used for gene therapy and vaccine approaches, but its efficacy is limited by preexisting vector immunity in humans resulting from natural infections. Therefore, we have employed the rare serotype adenovirus 48 (Ad48) as an alternative choice for adenovirus/Chagas vaccine therapy. In this study, we modified Ad5 and Ad48 vectors to contain T. cruzi’s amastigote surface protein 2 (ASP-2) in the adenoviral early gene region. We also modified Ad5 and Ad48 vectors to utilize the “Antigen Capsid-Incorporation” strategy by adding T. cruzi epitopes to protein IX (pIX). Mice that were immunized with the modified vectors were able to elicit T. cruzi-specific humoral and cellular responses. This study indicates that the Ad48-modified vectors function comparably to, or even better than, the Ad5-modified vectors. This study provides novel data demonstrating that Ad48 can be used as a potential adenovirus vaccine vector against Chagas disease. PMID:26978385

  16. Optimal source coding, removable noise elimination, and natural coordinate system construction for general vector sources using replicator neural networks

    NASA Astrophysics Data System (ADS)

    Hecht-Nielsen, Robert

    1997-04-01

A new universal one-chart smooth manifold model for vector information sources is introduced. Natural coordinates (a particular type of chart) for such data manifolds are then defined. Uniformly quantized natural coordinates form an optimal vector quantization code for a general vector source. Replicator neural networks (a specialized type of multilayer perceptron with three hidden layers) are then introduced. As properly configured examples of replicator networks approach minimum mean squared error (e.g., via training and architecture adjustment using randomly chosen vectors from the source), these networks automatically develop a mapping which, in the limit, produces natural coordinates for arbitrary source vectors. The new concept of removable noise (a noise model applicable to a wide variety of real-world noise processes) is then discussed. Replicator neural networks, when configured to approach minimum mean squared reconstruction error (e.g., via training and architecture adjustment on randomly chosen examples from a vector source, each with randomly chosen additive removable noise contamination), in the limit eliminate removable noise and produce natural coordinates for the data vector portions of the noise-corrupted source vectors. Considerations regarding the selection of the dimension of a data manifold source model and the training/configuration of replicator neural networks are discussed.

  17. Static investigation of two STOL nozzle concepts with pitch thrust-vectoring capability

    NASA Technical Reports Server (NTRS)

    Mason, M. L.; Burley, J. R., II

    1986-01-01

A static investigation of the internal performance of two short take-off and landing (STOL) nozzle concepts with pitch thrust-vectoring capability has been conducted. An axisymmetric nozzle concept and a nonaxisymmetric nozzle concept were tested at dry and afterburning power settings. The axisymmetric concept consisted of a circular approach duct with a convergent-divergent nozzle. Pitch thrust vectoring was accomplished by vectoring the approach duct without changing the nozzle geometry. The nonaxisymmetric concept consisted of a two-dimensional convergent-divergent nozzle. Pitch thrust vectoring was implemented by blocking the nozzle exit and deflecting a door in the lower nozzle flap. The test nozzle pressure ratio was varied up to 10.0, depending on model geometry. Results indicate that both pitch-vectoring concepts produced resultant pitch vector angles nearly equal to the geometric pitch deflection angles. The axisymmetric nozzle concept had only small thrust losses at the largest pitch deflection angle of 70 deg., but the two-dimensional convergent-divergent nozzle concept had large performance losses at both of the pitch deflection angles tested, 60 deg. and 70 deg.

  18. Feature Vector Construction Method for IRIS Recognition

    NASA Astrophysics Data System (ADS)

    Odinokikh, G.; Fartukov, A.; Korobkin, M.; Yoo, J.

    2017-05-01

One of the basic stages of the iris recognition pipeline is the iris feature vector construction procedure. The procedure represents the extraction of iris texture information relevant to its subsequent comparison. A thorough investigation of feature vectors obtained from the iris showed that not all vector elements are equally relevant. There are two characteristics which determine the utility of a vector element: fragility and discriminability. Conventional iris feature extraction methods treat fragility as feature vector instability without regard to the nature of that instability. This work separates the sources of instability into natural and encoding-induced, which allows each source of instability to be investigated independently. Following this separation concept, a novel approach to iris feature vector construction is proposed. The approach consists of two steps: iris feature extraction using Gabor filtering with optimal parameters, and quantization with separately pre-optimized fragility thresholds. The proposed method has been tested on two different datasets of iris images captured under changing environmental conditions. The testing results show that the proposed method surpasses all methods considered as prior art in recognition accuracy on both datasets.

  19. Rapid Assembly of Customized TALENs into Multiple Delivery Systems

    PubMed Central

    Zhang, Zhengxing; Zhang, Siliang; Huang, Xin; Orwig, Kyle E.; Sheng, Yi

    2013-01-01

    Transcriptional activator-like effector nucleases (TALENs) have become a powerful tool for genome editing. Here we present an efficient TALEN assembly approach in which TALENs are assembled by direct Golden Gate ligation into Gateway® Entry vectors from a repeat variable di-residue (RVD) plasmid array. We constructed TALEN pairs targeted to mouse Ddx3 subfamily genes, and demonstrated that our modified TALEN assembly approach efficiently generates accurate TALEN moieties that effectively introduce mutations into target genes. We generated “user friendly” TALEN Entry vectors containing TALEN expression cassettes with fluorescent reporter genes that can be efficiently transferred via Gateway (LR) recombination into different delivery systems. We demonstrated that the TALEN Entry vectors can be easily transferred to an adenoviral delivery system to expand application to cells that are difficult to transfect. Since TALENs work in pairs, we also generated a TALEN Entry vector set that combines a TALEN pair into one PiggyBac transposon-based destination vector. The approach described here can also be modified for construction of TALE transcriptional activators, repressors or other functional domains. PMID:24244669

  20. Scalar-vector soliton fiber laser mode-locked by nonlinear polarization rotation.

    PubMed

    Wu, Zhichao; Liu, Deming; Fu, Songnian; Li, Lei; Tang, Ming; Zhao, Luming

    2016-08-08

We report a passively mode-locked fiber laser based on nonlinear polarization rotation (NPR), where both vector and scalar solitons can co-exist within the laser cavity. The mode-locked pulse evolves as a vector soliton in the strongly birefringent segment and is transformed into a regular scalar soliton after the polarizer within the laser cavity. The existence of solitons in a polarization-dependent cavity comprising a periodic combination of two distinct nonlinear waves is demonstrated for the first time and is likely to be applicable to various other nonlinear systems. For very large local birefringence, our laser approaches the operation regime of vector soliton lasers, while under the condition of very small birefringence it approaches scalar soliton fiber lasers.

  1. Logistic regression of family data from retrospective study designs.

    PubMed

    Whittemore, Alice S; Halpern, Jerry

    2003-11-01

    We wish to study the effects of genetic and environmental factors on disease risk, using data from families ascertained because they contain multiple cases of the disease. To do so, we must account for the way participants were ascertained, and for within-family correlations in both disease occurrences and covariates. We model the joint probability distribution of the covariates of ascertained family members, given family disease occurrence and pedigree structure. We describe two such covariate models: the random effects model and the marginal model. Both models assume a logistic form for the distribution of one person's covariates that involves a vector beta of regression parameters. The components of beta in the two models have different interpretations, and they differ in magnitude when the covariates are correlated within families. We describe ascertainment assumptions needed to estimate consistently the parameters beta(RE) in the random effects model and the parameters beta(M) in the marginal model. Under the ascertainment assumptions for the random effects model, we show that conditional logistic regression (CLR) of matched family data gives a consistent estimate beta(RE) for beta(RE) and a consistent estimate for the covariance matrix of beta(RE). Under the ascertainment assumptions for the marginal model, we show that unconditional logistic regression (ULR) gives a consistent estimate for beta(M), and we give a consistent estimator for its covariance matrix. The random effects/CLR approach is simple to use and to interpret, but it can use data only from families containing both affected and unaffected members. The marginal/ULR approach uses data from all individuals, but its variance estimates require special computations. A C program to compute these variance estimates is available at http://www.stanford.edu/dept/HRP/epidemiology. 
We illustrate these pros and cons by application to data on the effects of parity on ovarian cancer risk in mother/daughter pairs, and use simulations to study the performance of the estimates. Copyright 2003 Wiley-Liss, Inc.
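
    As a toy illustration of the marginal-model/ULR route described above, the sketch below fits an ordinary (unconditional) logistic regression of disease status on a single covariate. The data, the covariate name `parity`, and the coefficient value are hypothetical stand-ins, not the paper's mother/daughter data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400
parity = rng.integers(0, 5, size=n).astype(float)  # covariate (illustrative)
logit = -0.5 - 0.3 * parity                        # assumed true beta_M = -0.3
p = 1.0 / (1.0 + np.exp(-logit))
disease = rng.binomial(1, p)

# C=1e6 makes the L2 penalty negligible, i.e. effectively plain ML estimation
ulr = LogisticRegression(C=1e6, max_iter=1000).fit(parity.reshape(-1, 1), disease)
beta_m_hat = ulr.coef_[0, 0]
print(f"estimated beta_M: {beta_m_hat:.2f}")
```

    Note that valid variance estimates accounting for within-family correlation would still require the special computations (e.g. the referenced C program) noted in the abstract.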

  2. Multiscale asymmetric orthogonal wavelet kernel for linear programming support vector learning and nonlinear dynamic systems identification.

    PubMed

    Lu, Zhao; Sun, Jing; Butts, Kenneth

    2014-05-01

    Support vector regression for approximating nonlinear dynamic systems is more delicate than the approximation of indicator functions in support vector classification, particularly for systems that involve multitudes of time scales in their sampled data. The kernel used for support vector learning determines the class of functions from which a support vector machine can draw its solution, and the choice of kernel significantly influences the performance of a support vector machine. In this paper, to bridge the gap between wavelet multiresolution analysis and kernel learning, the closed-form orthogonal wavelet is exploited to construct new multiscale asymmetric orthogonal wavelet kernels for linear programming support vector learning. The closed-form multiscale orthogonal wavelet kernel provides a systematic framework to implement multiscale kernel learning via dyadic dilations and also enables us to represent complex nonlinear dynamics effectively. To demonstrate the superiority of the proposed multiscale wavelet kernel in identifying complex nonlinear dynamic systems, two case studies are presented that aim at building parallel models on benchmark datasets. The development of parallel models that address the long-term/mid-term prediction issue is more intricate and challenging than the identification of series-parallel models where only one-step ahead prediction is required. Simulation results illustrate the effectiveness of the proposed multiscale kernel learning.
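
    The mechanics of plugging a custom wavelet kernel into support vector learning can be sketched as below. This uses the well-known Morlet-style wavelet kernel rather than the asymmetric orthogonal wavelet kernel proposed in the paper, and standard (not linear programming) SVR, purely to show the precomputed-kernel workflow on a multi-scale target.

```python
import numpy as np
from sklearn.svm import SVR

def wavelet_kernel(X, Z, a=1.0):
    # K(x, z) = prod_i cos(1.75 * d_i / a) * exp(-d_i**2 / (2 * a**2)), d = x - z
    D = X[:, None, :] - Z[None, :, :]
    return np.prod(np.cos(1.75 * D / a) * np.exp(-D ** 2 / (2 * a ** 2)), axis=2)

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sinc(X[:, 0])                         # target containing several scales

# fit on the (n_train x n_train) Gram matrix, predict with (n_test x n_train)
model = SVR(kernel="precomputed", C=10.0).fit(wavelet_kernel(X, X), y)

X_test = np.linspace(-3, 3, 50).reshape(-1, 1)
err = np.max(np.abs(model.predict(wavelet_kernel(X_test, X)) - np.sinc(X_test[:, 0])))
print("max abs test error:", err)
```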

  3. Using Time Series Analysis to Predict Cardiac Arrest in a PICU.

    PubMed

    Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P

    2015-11-01

    To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
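
    The core comparison above, a regression baseline vs. a support vector machine on the same feature matrix scored by area under the ROC curve, can be sketched as follows. The data are synthetic stand-ins for the arrest/control feature sets, not the PICU dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# synthetic stand-in: ~103 cases + 109 controls, a handful of informative features
X, y = make_classification(n_samples=212, n_features=25, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
svm = SVC(probability=True, random_state=0).fit(X_tr, y_tr)

auc_lr = roc_auc_score(y_te, lr.predict_proba(X_te)[:, 1])
auc_svm = roc_auc_score(y_te, svm.predict_proba(X_te)[:, 1])
print(f"logistic AUC={auc_lr:.2f}, SVM AUC={auc_svm:.2f}")
```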

  4. SNPs selection using support vector regression and genetic algorithms in GWAS

    PubMed Central

    2014-01-01

    Introduction: This paper proposes a new methodology to select the most relevant SNP markers for the characterization of any measurable phenotype described by a continuous variable, using Support Vector Regression with the Pearson Universal kernel (PUK) as the fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute, in that it considers several markers simultaneously to explain the phenotype, and draws jointly on statistical tools, machine learning and computational intelligence. Results: The suggested method showed potential in simulated database 1, with additive effects only, and in the real database. In this simulated database of 1,000 markers in total, 7 with a major effect on the phenotype and the other 993 SNPs representing noise, the method identified 21 markers. Of these, 5 are among the 7 relevant SNPs, while 16 are false positives. In the real database, initially with 50,752 SNPs, we reduced the set to 3,073 markers, increasing the accuracy of the model. In simulated database 2, with additive effects and interactions (epistasis), the proposed method matched the methodology most commonly used in GWAS. Conclusions: The method suggested in this paper demonstrates its effectiveness in explaining the real phenotype (PTA for milk): by applying the genetic-algorithm wrapper with Support Vector Regression and the Pearson Universal kernel, many redundant markers were eliminated, increasing the prediction and accuracy of the model on the real database without quality-control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332
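
    A minimal, mutation-only sketch of the wrapper idea above: a binary genetic algorithm whose fitness is the cross-validated R2 of an SVR on the selected markers. The Pearson Universal kernel is not available in scikit-learn, so an RBF kernel stands in for it, and the "SNP" data are synthetic (two causal markers among noise columns).

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n, p = 120, 30
X = rng.integers(0, 3, size=(n, p)).astype(float)       # SNP genotypes 0/1/2
y = X[:, 0] + 0.8 * X[:, 1] + 0.1 * rng.normal(size=n)  # 2 causal markers

def fitness(mask):
    # cross-validated SVR accuracy on the selected marker subset
    if mask.sum() == 0:
        return -np.inf
    return cross_val_score(SVR(kernel="rbf"), X[:, mask.astype(bool)], y,
                           cv=3, scoring="r2").mean()

pop = rng.integers(0, 2, size=(20, p))                  # random initial masks
for _ in range(15):                                     # generations
    scores = np.array([fitness(m) for m in pop])
    parents = pop[np.argsort(scores)[-10:]]             # truncation selection
    children = parents[rng.integers(0, 10, size=10)].copy()
    flips = rng.random(children.shape) < 0.05           # bit-flip mutation
    children[flips] = 1 - children[flips]
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(m) for m in pop])]
print("selected markers:", np.flatnonzero(best))
```

    A full implementation would add crossover, a larger population, and the paper's PUK fitness kernel.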

  5. Bayesian dose-response analysis for epidemiological studies with complex uncertainty in dose estimation.

    PubMed

    Kwon, Deukwoo; Hoffman, F Owen; Moroz, Brian E; Simon, Steven L

    2016-02-10

    Most conventional risk analysis methods rely on a single best estimate of exposure per person, which does not allow for adjustment for exposure-related uncertainty. Here, we propose a Bayesian model averaging method to properly quantify the relationship between radiation dose and disease outcomes by accounting for shared and unshared uncertainty in estimated dose. Our Bayesian risk analysis method utilizes multiple realizations of sets (vectors) of doses generated by a two-dimensional Monte Carlo simulation method that properly separates shared and unshared errors in dose estimation. The exposure model used in this work is taken from a study of the risk of thyroid nodules among a cohort of 2376 subjects who were exposed to fallout from nuclear testing in Kazakhstan. We assessed the performance of our method through an extensive series of simulations and comparisons against conventional regression risk analysis methods. When the estimated doses contain relatively small amounts of uncertainty, the Bayesian method using multiple a priori plausible draws of dose vectors gave similar results to the conventional regression-based methods of dose-response analysis. However, when large and complex mixtures of shared and unshared uncertainties are present, the Bayesian method using multiple dose vectors had significantly lower relative bias than conventional regression-based risk analysis methods and better coverage, that is, a markedly increased capability to include the true risk coefficient within the 95% credible interval of the Bayesian-based risk estimate. An evaluation of the dose-response using our method is presented for an epidemiological study of thyroid disease following radiation exposure. Copyright © 2015 John Wiley & Sons, Ltd.
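
    The central idea, propagating many Monte Carlo realizations of the dose vector through the regression instead of a single best estimate, can be crudely sketched as below. This is a simplification of the paper's Bayesian model averaging, on hypothetical data with assumed shared/unshared error magnitudes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_realizations = 500, 200
true_dose = rng.gamma(2.0, 0.5, size=n)
risk = 0.1 + 0.3 * true_dose + rng.normal(0, 0.1, size=n)  # assumed slope 0.3

slopes = []
for _ in range(n_realizations):
    # shared (cohort-wide multiplicative) + unshared (per-person) dose error
    shared = rng.lognormal(0, 0.2)
    dose_r = true_dose * shared * rng.lognormal(0, 0.3, size=n)
    slope, _ = np.polyfit(dose_r, risk, 1)
    slopes.append(slope)

slopes = np.array(slopes)
lo, hi = np.percentile(slopes, [2.5, 97.5])
print(f"pooled slope {slopes.mean():.2f}, 95% interval ({lo:.2f}, {hi:.2f})")
```

    Pooling over realizations widens the interval to reflect dose uncertainty; note the classical attenuation of the slope toward zero that the paper's full Bayesian treatment is designed to handle properly.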

  6. Patient Stratification Using Electronic Health Records from a Chronic Disease Management Program.

    PubMed

    Chen, Robert; Sun, Jimeng; Dittus, Robert S; Fabbri, Daniel; Kirby, Jacqueline; Laffer, Cheryl L; McNaughton, Candace D; Malin, Bradley

    2016-01-04

    The goal of this study is to devise a machine learning framework to assist care coordination programs in prognostic stratification, so as to design and deliver personalized care plans and to allocate financial and medical resources effectively. This study is based on a de-identified cohort of 2,521 hypertension patients from a chronic care coordination program at the Vanderbilt University Medical Center. Patients were modeled as vectors of features derived from electronic health records (EHRs) over a six-year period. We applied stepwise regression to identify risk factors associated with a decrease in mean arterial pressure of at least 2 mmHg after program enrollment. The resulting features were subsequently validated via a logistic regression classifier. Finally, the risk factors were used to group the patients through model-based clustering. We identified a set of predictive features that consisted of a mix of demographic, medication, and diagnostic concepts. Logistic regression over these features yielded an area under the ROC curve (AUC) of 0.71 (95% CI: [0.67, 0.76]). Based on these features, four clinically meaningful groups were identified through clustering, two of which represented patients with more severe disease profiles, while the remaining two represented patients with mild disease profiles. Patients with hypertension can exhibit significant variation in their blood pressure control status and responsiveness to therapy. Yet this work shows that a clustering analysis can generate more homogeneous patient groups, which may aid clinicians in designing and implementing customized care programs. The study shows that predictive modeling and clustering using EHR data can provide a systematic, generalized approach for care providers to tailor their management based upon patient-level factors.

  7. Vector method for strain estimation in phase-sensitive optical coherence elastography

    NASA Astrophysics Data System (ADS)

    Matveyev, A. L.; Matveev, L. A.; Sovetsky, A. A.; Gelikonov, G. V.; Moiseev, A. A.; Zaitsev, V. Y.

    2018-06-01

    A noise-tolerant approach to strain estimation in phase-sensitive optical coherence elastography, robust to decorrelation distortions, is discussed. The method is based on evaluation of interframe phase-variation gradient, but its main feature is that the phase is singled out at the very last step of the gradient estimation. All intermediate steps operate with complex-valued optical coherence tomography (OCT) signals represented as vectors in the complex plane (hence, we call this approach the ‘vector’ method). In comparison with such a popular method as least-square fitting of the phase-difference slope over a selected region (even in the improved variant with amplitude weighting for suppressing small-amplitude noisy pixels), the vector approach demonstrates superior tolerance to both additive noise in the receiving system and speckle-decorrelation caused by tissue straining. Another advantage of the vector approach is that it obviates the usual necessity of error-prone phase unwrapping. Here, special attention is paid to modifications of the vector method that make it especially suitable for processing deformations with significant lateral inhomogeneity, which often occur in real situations. The method’s advantages are demonstrated using both simulated and real OCT scans obtained during reshaping of a collagenous tissue sample irradiated by an IR laser beam producing complex spatially inhomogeneous deformations.
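
    The key trick, carrying the interframe phase variation and its axial gradient as complex-valued products and taking the angle only once at the very end, can be sketched numerically on simulated speckle-like signals (not real OCT data):

```python
import numpy as np

rng = np.random.default_rng(0)
nz = 1024
strain_phase_rate = 0.05                 # assumed phase gradient, rad/pixel
z = np.arange(nz)
a = np.exp(1j * rng.uniform(0, 2 * np.pi, nz))   # frame 1: random speckle phase
b = a * np.exp(1j * strain_phase_rate * z)       # frame 2: depth-linear phase tilt
a = a + 0.05 * (rng.normal(size=nz) + 1j * rng.normal(size=nz))  # additive noise
b = b + 0.05 * (rng.normal(size=nz) + 1j * rng.normal(size=nz))

# interframe phase variation kept as a complex vector (phase stays implicit)
v = b * np.conj(a)
# axial gradient of the phase variation, again as a complex product ...
g = v[1:] * np.conj(v[:-1])
# ... averaged over depth BEFORE the single final angle extraction
est_rate = np.angle(g.mean())
print(f"estimated phase gradient: {est_rate:.3f} rad/pixel")
```

    Because the angle is taken only after averaging, noisy low-amplitude pixels are naturally down-weighted and no phase unwrapping is needed.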

  8. Datamining approaches for modeling tumor control probability.

    PubMed

    Naqa, Issam El; Deasy, Joseph O; Mu, Yi; Huang, Ellen; Hope, Andrew J; Lindsay, Patricia E; Apte, Aditya; Alaly, James; Bradley, Jeffrey D

    2010-11-01

    Tumor control probability (TCP) to radiotherapy is determined by complex interactions between tumor biology, tumor microenvironment, radiation dosimetry, and patient-related variables. The complexity of these heterogeneous variable interactions constitutes a challenge for building predictive models for routine clinical practice. We describe a datamining framework that can unravel the higher order relationships among dosimetric dose-volume prognostic variables, interrogate various radiobiological processes, and generalize to unseen data when applied prospectively. Several datamining approaches are discussed that include dose-volume metrics, equivalent uniform dose, mechanistic Poisson model, and model building methods using statistical regression and machine learning techniques. Institutional datasets of non-small cell lung cancer (NSCLC) patients are used to demonstrate these methods. The performance of the different methods was evaluated using bivariate Spearman rank correlations (rs). Over-fitting was controlled via resampling methods. Using a dataset of 56 patients with primary NSCLC tumors and 23 candidate variables, we estimated GTV volume and V75 to be the best model parameters for predicting TCP using statistical resampling and a logistic model. Using these variables, the support vector machine (SVM) kernel method provided superior performance for TCP prediction with an rs=0.68 on leave-one-out testing compared to logistic regression (rs=0.4), Poisson-based TCP (rs=0.33), and the cell kill equivalent uniform dose model (rs=0.17). The prediction of treatment response can be improved by utilizing datamining approaches, which are able to unravel important non-linear complex interactions among model variables and have the capacity to predict on unseen data for prospective clinical applications.
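
    The evaluation protocol above (leave-one-out predictions scored by Spearman rank correlation) can be sketched as follows, with synthetic stand-ins for the 56-patient, two-variable (GTV volume, V75) setting:

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(56, 2))                     # e.g. GTV volume and V75 (toy)
y = (X[:, 0] + 0.5 * X[:, 1] + 0.5 * rng.normal(size=56) > 0).astype(int)

# leave-one-out: each patient predicted by a model trained on the other 55
pred = cross_val_predict(SVC(), X, y, cv=LeaveOneOut())
rs, _ = spearmanr(pred, y)
print(f"leave-one-out Spearman r_s = {rs:.2f}")
```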

  9. Combining macula clinical signs and patient characteristics for age-related macular degeneration diagnosis: a machine learning approach.

    PubMed

    Fraccaro, Paolo; Nicolo, Massimo; Bonetto, Monica; Giacomini, Mauro; Weller, Peter; Traverso, Carlo Enrico; Prosperi, Mattia; OSullivan, Dympna

    2015-01-27

    To investigate machine learning methods, ranging from simpler interpretable techniques to complex (non-linear) "black-box" approaches, for automated diagnosis of Age-related Macular Degeneration (AMD). Data from healthy subjects and patients diagnosed with AMD or other retinal diseases were collected during routine visits via an Electronic Health Record (EHR) system. Patients' attributes included demographics and, for each eye, presence/absence of major AMD-related clinical signs (soft drusen, retinal pigment epithelium defects/pigment mottling, depigmentation area, subretinal haemorrhage, subretinal fluid, macula thickness, macular scar, subretinal fibrosis). Interpretable techniques known as white-box methods, including logistic regression and decision trees, as well as less interpretable techniques known as black-box methods, such as support vector machines (SVM), random forests and AdaBoost, were used to develop models (trained and validated on unseen data) to diagnose AMD. The gold standard was confirmed diagnosis of AMD by physicians. Sensitivity, specificity and area under the receiver operating characteristic curve (AUC) were used to assess performance. The study population included 487 patients (912 eyes). In terms of AUC, random forests, logistic regression and AdaBoost showed a mean performance of 0.92, followed by SVM and decision trees (0.90). All machine learning models identified soft drusen and age as the most discriminating variables in clinicians' decision pathways to diagnose AMD. Both black-box and white-box methods performed well in identifying diagnoses of AMD and their decision pathways. Machine learning models developed through the proposed approach, relying on clinical signs identified by retinal specialists, could be embedded into EHRs to provide physicians with real-time (interpretable) support.

  10. Predicting primary progressive aphasias with support vector machine approaches in structural MRI data.

    PubMed

    Bisenius, Sandrine; Mueller, Karsten; Diehl-Schmid, Janine; Fassbender, Klaus; Grimmer, Timo; Jessen, Frank; Kassubek, Jan; Kornhuber, Johannes; Landwehrmeyer, Bernhard; Ludolph, Albert; Schneider, Anja; Anderl-Straub, Sarah; Stuke, Katharina; Danek, Adrian; Otto, Markus; Schroeter, Matthias L

    2017-01-01

    Primary progressive aphasia (PPA) encompasses three subtypes, the nonfluent/agrammatic variant PPA, the semantic variant PPA, and the logopenic variant PPA, which are characterized by distinct patterns of language difficulties and regional brain atrophy. To validate the potential of structural magnetic resonance imaging data for early individual diagnosis, we used support vector machine classification on grey matter density maps obtained by voxel-based morphometry analysis to discriminate PPA subtypes (44 patients: 16 nonfluent/agrammatic variant PPA, 17 semantic variant PPA, 11 logopenic variant PPA) from 20 healthy controls (matched for sample size, age, and gender) in the cohort of the multi-center study of the German consortium for frontotemporal lobar degeneration. Here, we compared a whole-brain with a meta-analysis-based disease-specific regions-of-interest approach for support vector machine classification. We also used support vector machine classification to discriminate the three PPA subtypes from each other. Whole-brain support vector machine classification enabled a very high accuracy, between 91 and 97%, for identifying specific PPA subtypes vs. healthy controls, and 78/95% for the discrimination between the semantic variant vs. the nonfluent/agrammatic or logopenic PPA variants. Only for the discrimination between the nonfluent/agrammatic and logopenic PPA variants was accuracy low, at 55%. Interestingly, the regions that contributed the most to the support vector machine classification of patients corresponded largely to the regions that were atrophic in these patients as revealed by group comparisons. Although the whole-brain approach also took into account regions that were not covered in the regions-of-interest approach, both approaches showed similar accuracies due to the disease-specificity of the selected networks. 
In conclusion, support vector machine classification of multi-center structural magnetic resonance imaging data enables prediction of PPA subtypes with very high accuracy, paving the way for its application in clinical settings.

  11. Gene delivery strategies for the treatment of mucopolysaccharidoses.

    PubMed

    Baldo, Guilherme; Giugliani, Roberto; Matte, Ursula

    2014-03-01

    Mucopolysaccharidosis (MPS) disorders are genetic diseases caused by deficiencies in the lysosomal enzymes responsible for the degradation of glycosaminoglycans. Current treatments are not able to correct all disease symptoms and are not available for all MPS types, which makes gene therapy especially relevant. Multiple gene therapy approaches have been tested for different types of MPS, and our aim in this study is to critically analyze each of them. In this review, we have included the major studies that describe the use of adeno-associated, retroviral and lentiviral vectors, as well as relevant non-viral approaches for MPS disorders. Some protocols, such as the use of adeno-associated vectors and lentiviral vectors, are approaching the clinic for these disorders and, along with combined approaches, seem to be the future of gene therapy for MPS.

  12. Assessment of changes of vector borne diseases with wetland characteristics using multivariate analysis.

    PubMed

    Sheela, A M; Sarun, S; Justus, J; Vineetha, P; Sheeja, R V

    2015-04-01

    Vector borne diseases are a threat to human health, yet little attention has been paid to their prevention. We attempted to identify the significant wetland characteristics associated with the spread of chikungunya, dengue fever and malaria in Kerala, a tropical region of South West India, using multivariate analyses (hierarchical cluster analysis, factor analysis and multiple regression). High/medium-turbidity coastal lagoons and inland water-logged wetlands with aquatic vegetation have a significant effect on the incidence of chikungunya, while dengue is influenced by high-turbidity coastal beaches and malaria by medium-turbidity coastal beaches. The high turbidity of the water is due to urban waste discharge, namely sewage, sullage and garbage from the densely populated cities and towns. A large extent of wetlands in low-land areas favours the occurrence of vector borne diseases. Hence the provision of pollution control measures at source, including soil erosion control measures, is vital. The identification of vulnerable zones favouring vector borne diseases will help the authorities to control pollution, especially from urban areas, and prevent these diseases. Future research should cover land use/cover changes, climatic factors, seasonal variations in weather and pollution factors favouring the occurrence of vector borne diseases.

  13. A Hybrid Neuro-Fuzzy Model For Integrating Large Earth-Science Datasets

    NASA Astrophysics Data System (ADS)

    Porwal, A.; Carranza, J.; Hale, M.

    2004-12-01

    A GIS-based hybrid neuro-fuzzy approach to integration of large earth-science datasets for mineral prospectivity mapping is described. It implements a Takagi-Sugeno type fuzzy inference system in the framework of a four-layered feed-forward adaptive neural network. Each unique combination of the datasets is considered a feature vector whose components are derived by knowledge-based ordinal encoding of the constituent datasets. A subset of feature vectors with a known output target vector (i.e., unique conditions known to be associated with either a mineralized or a barren location) is used for the training of an adaptive neuro-fuzzy inference system. Training involves iterative adjustment of parameters of the adaptive neuro-fuzzy inference system using a hybrid learning procedure for mapping each training vector to its output target vector with minimum sum of squared error. The trained adaptive neuro-fuzzy inference system is used to process all feature vectors. The output for each feature vector is a value that indicates the extent to which a feature vector belongs to the mineralized class or the barren class. These values are used to generate a prospectivity map. The procedure is demonstrated by an application to regional-scale base metal prospectivity mapping in a study area located in the Aravalli metallogenic province (western India). A comparison of the hybrid neuro-fuzzy approach with pure knowledge-driven fuzzy and pure data-driven neural network approaches indicates that the former offers a superior method for integrating large earth-science datasets for predictive spatial mathematical modelling.
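
    The Takagi-Sugeno inference at the heart of such a system reduces, in its simplest zero-order form, to Gaussian rule firing strengths combined in a normalized weighted sum. The membership centers, widths, and rule consequents below are illustrative values, not trained parameters:

```python
import numpy as np

centers = np.array([0.2, 0.5, 0.8])     # rule membership centers (assumed)
widths = np.array([0.15, 0.15, 0.15])   # membership widths (assumed)
rule_out = np.array([0.1, 0.6, 0.9])    # per-rule consequent, e.g. prospectivity

def ts_infer(x):
    # Gaussian firing strength of each rule for input x
    w = np.exp(-((x - centers) ** 2) / (2 * widths ** 2))
    # normalized weighted sum of rule consequents (zero-order Takagi-Sugeno)
    return float((w * rule_out).sum() / w.sum())

print(ts_infer(0.5))   # dominated by the middle rule's consequent
```

    In the adaptive neuro-fuzzy (ANFIS-style) setting, the centers, widths and consequents become the network parameters adjusted by the hybrid learning procedure.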

  14. Forecasting Daily Patient Outflow From a Ward Having No Real-Time Clinical Data

    PubMed Central

    Tran, Truyen; Luo, Wei; Phung, Dinh; Venkatesh, Svetha

    2016-01-01

    Background: Modeling patient flow is crucial in understanding resource demand and prioritization. We study patient outflow from an open ward in an Australian hospital, where bed allocation is currently carried out by a manager relying on past experience and observed demand. Automatic methods that provide a reasonable estimate of total next-day discharges can aid efficient bed management. The challenges in building such methods lie in dealing with the large amount of discharge noise introduced by the nonlinear nature of hospital procedures, and in the nonavailability of real-time clinical information in wards. Objective: Our study investigates different models to forecast the total number of next-day discharges from an open ward having no real-time clinical data. Methods: We compared 5 popular regression algorithms to model total next-day discharges: (1) autoregressive integrated moving average (ARIMA), (2) autoregressive moving average with exogenous variables (ARMAX), (3) k-nearest neighbor regression, (4) random forest regression, and (5) support vector regression. While the ARIMA model relied on the past 3 months of discharges, nearest neighbor forecasting used the median of similar past discharges to estimate the next-day discharge. In addition, the ARMAX model used the day of the week and the number of patients currently in the ward as exogenous variables. For the random forest and support vector regression models, we designed a predictor set of 20 patient features and 88 ward-level features. Results: Our data consisted of 12,141 patient visits over 1826 days. Forecasting quality was measured using mean forecast error, mean absolute error, symmetric mean absolute percentage error, and root mean square error. When compared with a moving average prediction model, all 5 models demonstrated superior performance, with the random forests achieving a 22.7% improvement in mean absolute error for all days in the year 2014. 
Conclusions: In the absence of clinical information, our study recommends using patient-level and ward-level data in predicting next-day discharges. Random forest and support vector regression models are able to use all available features from such data, resulting in superior performance over traditional autoregressive methods. An intelligent estimate of available beds in wards plays a crucial role in relieving access block in emergency departments. PMID:27444059
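
    The model comparison above can be sketched on synthetic daily counts: k-nearest-neighbor, random-forest, and support vector regression over day-of-week and lagged-discharge features, against a 7-day moving-average baseline (the ARIMA/ARMAX models are omitted for brevity, and all features here are invented stand-ins):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
days = np.arange(1826)
# weekday-dependent discharge counts plus Poisson noise
discharges = 7 + 3 * (days % 7 < 5) + rng.poisson(2, size=1826).astype(float)

# features: day of week + previous 7 days of discharges
dow = days % 7
X = np.column_stack([dow[7:]] + [discharges[7 - k:-k] for k in range(1, 8)])
y = discharges[7:]
split = 1400
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

baseline = X_te[:, 1:].mean(axis=1)          # 7-day moving-average forecast
models = {"kNN": KNeighborsRegressor(),
          "RF": RandomForestRegressor(random_state=0),
          "SVR": SVR()}
mae = {"baseline": np.abs(baseline - y_te).mean()}
for name, m in models.items():
    mae[name] = np.abs(m.fit(X_tr, y_tr).predict(X_te) - y_te).mean()
print(mae)
```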

  15. A New Approach to Attitude Stability and Control for Low Airspeed Vehicles

    NASA Technical Reports Server (NTRS)

    Lim, K. B.; Shin, Y-Y.; Moerder, D. D.; Cooper, E. G.

    2004-01-01

    This paper describes an approach for controlling the attitude of statically unstable thrust-levitated vehicles in hover or slow translation. The large thrust vector that characterizes such vehicles can be modulated to provide control forces and moments to the airframe, but such modulation is accompanied by significant unsteady flow effects. These effects are difficult to model, and can compromise the practical value of thrust vectoring in closed-loop attitude stability, even if the thrust vectoring machinery has sufficient bandwidth for stabilization. The stabilization approach described in this paper is based on using internal angular momentum transfer devices for stability, augmented by thrust vectoring for trim and other "outer loop" control functions. The three main components of this approach are: (1) a z-body axis angular momentum bias enhances static attitude stability, reducing the amount of control activity needed for stabilization, (2) optionally, gimbaled reaction wheels provide high-bandwidth control torques for additional stabilization, or agility, and (3) the resulting strongly coupled system dynamics are controlled by a multivariable controller. A flight test vehicle is described, and nonlinear simulation results are provided that demonstrate the efficiency of the approach.

  16. High-efficiency and flexible generation of vector vortex optical fields by a reflective phase-only spatial light modulator.

    PubMed

    Cai, Meng-Qiang; Wang, Zhou-Xiang; Liang, Juan; Wang, Yan-Kun; Gao, Xu-Zhen; Li, Yongnan; Tu, Chenghou; Wang, Hui-Tian

    2017-08-01

    The scheme for generating vector optical fields should have not only high efficiency but also the flexibility to satisfy the requirements of various applications. However, in general, high efficiency and flexibility are not compatible. Here we present and experimentally demonstrate a solution to directly, flexibly, and efficiently generate vector vortex optical fields (VVOFs) with a reflective phase-only liquid crystal spatial light modulator (LC-SLM), based on the optical birefringence of liquid crystal molecules. To generate the VVOFs, this approach needs in principle only a half-wave plate, an LC-SLM, and a quarter-wave plate. The approach has several advantages, including a simple experimental setup, good flexibility, and high efficiency, making it very promising in applications where higher power is needed. It achieves a generation efficiency of 44.0%, much higher than the 1.1% of the common-path interferometric approach.

  17. Multiple scattering effects with cyclical terms in active remote sensing of vegetated surface using vector radiative transfer theory

    USDA-ARS?s Scientific Manuscript database

    The energy transport in a vegetated (corn) surface layer is examined by solving the vector radiative transfer equation using a numerical iterative approach. This approach allows a higher order that includes the multiple scattering effects. Multiple scattering effects are important when the optical t...

  18. A Fast Reduced Kernel Extreme Learning Machine.

    PubMed

    Deng, Wan-Yu; Ong, Yew-Soon; Zheng, Qing-Hua

    2016-04-01

    In this paper, we present a fast and accurate kernel-based supervised algorithm referred to as the Reduced Kernel Extreme Learning Machine (RKELM). In contrast to work on the Support Vector Machine (SVM) or Least Squares SVM (LS-SVM), which identifies the support vectors or weight vectors iteratively, the proposed RKELM randomly selects a subset of the available data samples as support vectors (or mapping samples). By avoiding the iterative steps of SVM, significant cost savings in the training process can be readily attained, especially on big datasets. RKELM is established based on a rigorous proof of universal learning involving a reduced kernel-based SLFN. In particular, we prove that RKELM can approximate any nonlinear function accurately under the condition of support-vector sufficiency. Experimental results on a wide variety of real-world small- and large-instance-size applications, in the contexts of binary classification, multi-class problems and regression, are then reported to show that RKELM can perform at a competitive level of generalization performance to the SVM/LS-SVM at only a fraction of the computational effort. Copyright © 2015 Elsevier Ltd. All rights reserved.
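
    The RKELM recipe above, randomly choosing a subset of samples as support vectors, forming the reduced kernel matrix, and solving the output weights in closed form, can be sketched as follows. This is a generic reconstruction under assumed settings (RBF kernel, ridge regularization), not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X[:, 0])

def rbf(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

m = 50                                          # number of random support vectors
sv = X[rng.choice(len(X), size=m, replace=False)]
K = rbf(X, sv)                                  # reduced (n x m) kernel matrix
# ridge-regularized least squares for the output weights: no iteration needed
beta = np.linalg.solve(K.T @ K + 1e-6 * np.eye(m), K.T @ y)

rmse = np.sqrt(np.mean((rbf(X, sv) @ beta - y) ** 2))
print("training RMSE:", rmse)
```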

  19. Machine Learning Intermolecular Potentials for 1,3,5-Triamino-2,4,6-trinitrobenzene (TATB) Using Symmetry-Adapted Perturbation Theory

    DTIC Science & Technology

    2018-04-25

    In this report, intermolecular potentials for 1,3,5-triamino-2,4,6-trinitrobenzene (TATB) are developed using machine learning techniques. Three potentials, based on support vector regression, kernel ridge regression, and a neural network, are fit using symmetry-adapted perturbation theory.
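
    One of the three fits named above, kernel ridge regression of interaction energies on a geometric descriptor, can be sketched like this. The "energies" come from a Lennard-Jones-form toy potential over a single separation coordinate, not TATB/SAPT data:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
r = rng.uniform(0.9, 3.0, size=(300, 1))          # intermolecular separation (toy)
energy = 4.0 * (r[:, 0] ** -12 - r[:, 0] ** -6)   # assumed LJ-form potential

krr = KernelRidge(kernel="rbf", alpha=1e-4, gamma=100.0).fit(r, energy)
r_grid = np.linspace(1.1, 2.9, 100).reshape(-1, 1)
err = np.abs(krr.predict(r_grid) - 4.0 * (r_grid[:, 0] ** -12 - r_grid[:, 0] ** -6))
print("max abs error on grid:", err.max())
```

    A real SAPT-based fit would regress on many-dimensional dimer descriptors rather than a single distance.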

  20. Multivariate analysis of fMRI time series: classification and regression of brain responses using machine learning.

    PubMed

    Formisano, Elia; De Martino, Federico; Valente, Giancarlo

    2008-09-01

    Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.
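
    The regression side of "brain reading" described above can be sketched with support vector regression from simulated voxel patterns to a continuous label (relevance vector machines have no scikit-learn implementation, so SVR stands in; all quantities are synthetic):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
n_trials, n_voxels = 120, 50
labels = rng.uniform(0, 1, n_trials)      # continuous perceptual/behavioral label
weights = rng.normal(size=n_voxels)       # assumed voxel sensitivity map
patterns = labels[:, None] * weights + 0.5 * rng.normal(size=(n_trials, n_voxels))

X_tr, X_te, y_tr, y_te = train_test_split(patterns, labels, random_state=0)
svr = SVR().fit(X_tr, y_tr)               # learn pattern -> label mapping
r = np.corrcoef(svr.predict(X_te), y_te)[0, 1]
print(f"predicted-vs-true correlation on unseen trials: r = {r:.2f}")
```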

  1. [Mapping environmental vulnerability from ETM + data in the Yellow River Mouth Area].

    PubMed

    Wang, Rui-Yan; Yu, Zhen-Wen; Xia, Yan-Ling; Wang, Xiang-Feng; Zhao, Geng-Xing; Jiang, Shu-Qian

    2013-10-01

    Retrieving environmental vulnerability from remote sensing data is important for providing continuous spatial coverage, and the spatial distribution of regional environmental vulnerability was obtained here through remote sensing retrieval. An environmental vulnerability evaluation index system was built from soil and vegetation indicators, and the environmental vulnerability of sampling points was calculated by the AHP-fuzzy method. The correlation between the sampling-point environmental vulnerability and ETM+ spectral reflectance, including several kinds of transformed data, was then analyzed to determine the sensitive spectral parameters. On this basis, correlation analysis, traditional regression, BP neural network and support vector regression models were used to quantify the relationship between spectral reflectance and environmental vulnerability, and the environmental vulnerability distribution of the Yellow River Mouth Area was retrieved with these models. The results showed that environmental vulnerability correlated most strongly with the spring NDVI, the September NDVI and the spring brightness, so these were selected as the sensitive spectral parameters. The precision assessment showed that all models except the support vector model reached the significance level, that all multi-variable regressions outperformed the one-variable regressions, and that the BP neural network model was the most accurate. This study serves as a reliable theoretical reference for large-scale environmental vulnerability estimation based on remote sensing data.

  2. Modeling daily soil temperature over diverse climate conditions in Iran—a comparison of multiple linear regression and support vector regression techniques

    NASA Astrophysics Data System (ADS)

    Delbari, Masoomeh; Sharifazari, Salman; Mohammadi, Ehsan

    2018-02-01

    Knowledge of soil temperature at different depths is important for the agricultural industry and for understanding climate change. The aim of this study is to evaluate the performance of a support vector regression (SVR)-based model in estimating daily soil temperature at 10, 30 and 100 cm depth under different climate conditions over Iran. The results were compared to those obtained from a more classical multiple linear regression (MLR) model. The sensitivity to input combinations and the effect of periodicity were also investigated. Climatic data used as model inputs were minimum and maximum air temperature, solar radiation, relative humidity, dew point, and atmospheric pressure (reduced to sea level), collected from five synoptic stations, Kerman, Ahvaz, Tabriz, Saghez, and Rasht, located respectively in hyper-arid, arid, semi-arid, Mediterranean, and hyper-humid climate conditions. According to the results, the performance of both the MLR and SVR models was quite good at the surface layer, i.e., 10-cm depth. However, SVR performed better than MLR in estimating soil temperature at deeper layers, especially 100 cm depth. Moreover, both models performed better under humid climate conditions than in arid and hyper-arid areas. Further, adding a periodicity component into the modeling process considerably improved the models' performance, especially in the case of SVR.
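    The MLR-versus-SVR comparison, including the benefit of a periodicity component, can be sketched on synthetic data (a seasonal air/soil temperature toy series; the station data are not reproduced in this abstract):

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    # Two years of synthetic daily data: soil temperature lags air temperature.
    day = np.arange(730)
    air_t = 15.0 + 12.0 * np.sin(2 * np.pi * day / 365) + rng.normal(0, 2, day.size)
    soil_t = 14.0 + 10.0 * np.sin(2 * np.pi * (day - 30) / 365) + rng.normal(0, 0.5, day.size)

    # Periodicity component: sine/cosine of day-of-year, alongside the climate input.
    X = np.column_stack([air_t,
                         np.sin(2 * np.pi * day / 365),
                         np.cos(2 * np.pi * day / 365)])
    X_tr, X_te, y_tr, y_te = train_test_split(X, soil_t, random_state=0)

    mlr = LinearRegression().fit(X_tr, y_tr)
    svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0)).fit(X_tr, y_tr)
    mlr_r2, svr_r2 = mlr.score(X_te, y_te), svr.score(X_te, y_te)
    ```

    On this toy series both models score well once the periodic terms are included, mirroring the study's observation that periodicity inputs help both MLR and SVR.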

  3. Raman spectroscopy based investigation of molecular changes associated with an early stage of dengue virus infection

    NASA Astrophysics Data System (ADS)

    Bilal, Maria; Bilal, Muhammad; Saleem, Muhammad; Khurram, Muhammad; Khan, Saranjam; Ullah, Rahat; Ali, Hina; Ahmed, Mushtaq; Shahzada, Shaista; Ullah Khan, Ehsan

    2017-04-01

    A Raman spectroscopy based investigation of the molecular changes associated with an early stage of dengue virus (DENV) infection using a partial least squares (PLS) regression model is presented. This study is based on non-structural protein 1 (NS1), which appears after three days of DENV infection. In total, 39 blood sera samples were collected and divided into two groups: the control group contained samples that were negative for both NS1 and antibodies, and the positive group contained samples in which NS1 was positive and antibodies were negative. Out of the 39 samples, 29 Raman spectra were used for model development while the remaining 10 were kept hidden for blind testing of the model. PLS regression yielded a vector of regression coefficients as a function of Raman shift, which was analyzed. Cytokines in the region 775-875 cm-1, lectins at 1003, 1238, 1340, 1449 and 1672 cm-1, DNA in the region 1040-1140 cm-1, and alpha and beta structures of proteins in the region 933-967 cm-1 have been identified in the regression vector for their role in an early stage of DENV infection. Validity of the model was established by its R-square value of 0.891. Sensitivity, specificity and accuracy were 100% each, and the area under the receiver operator characteristic curve was found to be 1.
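    How a PLS regression vector singles out informative Raman bands can be reproduced on simulated spectra (band positions borrowed from the list above; the spectra, sample sizes per band, and noise level are all invented):

    ```python
    import numpy as np
    from sklearn.cross_decomposition import PLSRegression

    rng = np.random.default_rng(1)
    # Hypothetical sera spectra: 29 training samples x 500 Raman channels, with
    # two Gaussian bands (placed at 1003 and 1672 cm-1 for illustration) whose
    # strength tracks NS1 status. Simulated, not the study's measured spectra.
    shifts = np.linspace(600.0, 1800.0, 500)
    labels = (np.arange(29) % 2).astype(float)      # 0 = control, 1 = NS1-positive
    band = (np.exp(-((shifts - 1003.0) ** 2) / (2 * 8.0 ** 2))
            + np.exp(-((shifts - 1672.0) ** 2) / (2 * 8.0 ** 2)))
    spectra = labels[:, None] * band + rng.normal(0.0, 0.05, (29, shifts.size))

    pls = PLSRegression(n_components=3).fit(spectra, labels)
    coef = pls.coef_.ravel()                        # regression vector vs Raman shift
    peak_shift = float(shifts[np.argmax(np.abs(coef))])
    ```

    Reading off where |coef| peaks is exactly how band assignments like those above are extracted from a PLS regression vector.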

  4. Real-data comparison of data mining methods in prediction of diabetes in iran.

    PubMed

    Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal

    2013-09-01

    Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-means, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factors surveillance, obtained through a cross-sectional survey. The sample was based on cluster sampling of the Iranian population, conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of the six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC curve (0.979); this method also showed high specificity (1.000) and sensitivity (0.820). All other methods produced total accuracy of more than 85%, but their sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested for the prediction of diabetes. This approach is therefore a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.
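    The four evaluation criteria can be computed from any classifier's outputs; a small deterministic example (invented labels and scores, not the study's data):

    ```python
    import numpy as np
    from sklearn.metrics import confusion_matrix, roc_auc_score

    # Toy predictions illustrating the four reported criteria.
    y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
    y_score = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.35, 0.1, 0.05, 0.15])
    y_pred = (y_score >= 0.5).astype(int)

    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / y_true.size
    auc = roc_auc_score(y_true, y_score)  # threshold-free ranking quality
    ```

    Note how a classifier can have perfect AUC (the scores rank every case correctly) yet imperfect sensitivity at a fixed 0.5 threshold; this is why the study reports both threshold-based and ROC criteria.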

  5. Data Mining Technologies Inspired from Visual Principle

    NASA Astrophysics Data System (ADS)

    Xu, Zongben

    In this talk we review recent work by our group on data mining (DM) technologies derived from simulating visual principles. By viewing a DM problem as a cognition problem and treating a data set as an image with a light point located at each datum position, we developed a series of highly efficient algorithms for clustering, classification and regression that mimic visual principles. In pattern recognition, human eyes seem to possess a singular aptitude to group objects and find important structure in an efficient way, so a DM algorithm simulating the visual system may solve some basic problems in DM research. From this point of view, we proposed a new approach to data clustering that models the blurring effect of lateral retinal interconnections based on scale-space theory. In this approach, as the data image blurs, smaller light blobs merge into larger ones until the whole image becomes one light blob at a sufficiently low level of resolution. By identifying each blob with a cluster, the blurring process generates a family of clusterings along the hierarchy. The proposed approach provides unique solutions to long-standing problems in clustering, such as cluster validity and sensitivity to initialization. We extended this approach to classification and regression problems by employing Weber's law from physiology together with known facts about cell-response classification. The resulting classification and regression algorithms prove very efficient and address the problems of model selection and of applicability to very large data sets in DM technologies. Finally, we applied a similar idea to the difficult parameter-setting problem in the support vector machine (SVM). Viewing parameter setting as a recognition problem of choosing a visual scale at which the global and local structures of a data set are preserved and the difference between the two structures is maximized in the feature space, we derived a direct parameter-setting formula for the Gaussian SVM. Simulations and applications show that the suggested formula significantly outperforms known model-selection methods in terms of efficiency and precision.
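    A minimal one-dimensional sketch of the blurring idea (the full method, and the Gaussian-SVM parameter formula, are the talk's own; this only shows modes merging under increasing blur):

    ```python
    import numpy as np

    rng = np.random.default_rng(7)
    # Two well-separated 1-D clusters, treated as light points in a "data image".
    data = np.concatenate([rng.normal(-3.0, 0.3, 50), rng.normal(3.0, 0.3, 50)])
    grid = np.linspace(-6.0, 6.0, 1201)

    def n_modes(bandwidth):
        """Count local maxima of the Gaussian-blurred data image at one scale."""
        density = np.exp(-((grid[:, None] - data[None, :]) ** 2)
                         / (2.0 * bandwidth ** 2)).sum(axis=1)
        interior = density[1:-1]
        return int(np.sum((interior > density[:-2]) & (interior > density[2:])))

    modes_fine = n_modes(0.3)     # fine scale: each cluster is its own blob
    modes_coarse = n_modes(5.0)   # coarse scale: blobs merged into one
    ```

    Sweeping the bandwidth between these extremes traces out the hierarchy of clusterings the talk describes.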

  6. Current Advances and Future Challenges in Adenoviral Vector Biology and Targeting

    PubMed Central

    Campos, Samuel K.; Barry, Michael A.

    2008-01-01

    Gene delivery vectors based on Adenoviral (Ad) vectors have enormous potential for the treatment of both hereditary and acquired disease. Detailed structural analysis of the Ad virion, combined with functional studies has broadened our knowledge of the structure/function relationships between Ad vectors and host cells/tissues and substantial achievement has been made towards a thorough understanding of the biology of Ad vectors. The widespread use of Ad vectors for clinical gene therapy is compromised by their inherent immunogenicity. The generation of safer and more effective Ad vectors, targeted to the site of disease, has therefore become a great ambition in the field of Ad vector development. This review provides a synopsis of the structure/function relationships between Ad vectors and host systems and summarizes the many innovative approaches towards achieving Ad vector targeting. PMID:17584037

  7. Pseudotyped Lentiviral Vectors for Retrograde Gene Delivery into Target Brain Regions

    PubMed Central

    Kobayashi, Kenta; Inoue, Ken-ichi; Tanabe, Soshi; Kato, Shigeki; Takada, Masahiko; Kobayashi, Kazuto

    2017-01-01

    Gene transfer through retrograde axonal transport of viral vectors offers a substantial advantage for analyzing roles of specific neuronal pathways or cell types forming complex neural networks. This genetic approach may also be useful in gene therapy trials by enabling delivery of transgenes into a target brain region distant from the injection site of the vectors. Pseudotyping of a lentiviral vector based on human immunodeficiency virus type 1 (HIV-1) with various fusion envelope glycoproteins composed of different combinations of rabies virus glycoprotein (RV-G) and vesicular stomatitis virus glycoprotein (VSV-G) enhances the efficiency of retrograde gene transfer in both rodent and nonhuman primate brains. The most recently developed lentiviral vector is a pseudotype with fusion glycoprotein type E (FuG-E), which demonstrates highly efficient retrograde gene transfer in the brain. The FuG-E–pseudotyped vector permits powerful experimental strategies for more precisely investigating the mechanisms underlying various brain functions. It also contributes to the development of new gene therapy approaches for neurodegenerative disorders, such as Parkinson’s disease, by delivering genes required for survival and protection into specific neuronal populations. In this review article, we report the properties of the FuG-E–pseudotyped vector, and we describe the application of the vector to neural circuit analysis and the potential use of the FuG-E vector in gene therapy for Parkinson’s disease. PMID:28824385

  8. Production of SV40-derived vectors.

    PubMed

    Strayer, David S; Mitchell, Christine; Maier, Dawn A; Nichols, Carmen N

    2010-06-01

    Recombinant simian virus 40 (rSV40)-derived vectors are particularly useful for gene delivery to bone marrow progenitor cells and their differentiated derivatives, certain types of epithelial cells (e.g., hepatocytes), and central nervous system neurons and microglia. They integrate rapidly into cellular DNA to provide long-term gene expression in vitro and in vivo in both resting and dividing cells. Here we describe a protocol for production and purification of these vectors. These procedures require only packaging cells (e.g., COS-7) and circular vector genome DNA. Amplification involves repeated infection of packaging cells with vector produced by transfection. Cotransfection is not required in any step. Viruses are purified by centrifugation using discontinuous sucrose or cesium chloride (CsCl) gradients and resulting vectors are replication-incompetent and contain no detectable wild-type SV40 revertants. These approaches are simple, give reproducible results, and may be used to generate vectors that are deleted only for large T antigen (Tag), or for all SV40-coding sequences capable of carrying up to 5 kb of foreign DNA. These vectors are best applied to long-term expression of proteins normally encoded by mammalian cells or by viruses that infect mammalian cells, or of untranslated RNAs (e.g., RNA interference). The preparative approaches described facilitate application of these vectors and allow almost any laboratory to exploit their strengths for diverse gene delivery applications.

  9. Bayesian data assimilation provides rapid decision support for vector-borne diseases.

    PubMed

    Jewell, Chris P; Brown, Richard G

    2015-07-06

    Predicting the spread of vector-borne diseases in response to incursions requires knowledge of both host and vector demographics in advance of an outbreak. Although host population data are typically available, for novel disease introductions there is a high chance of the pathogen using a vector for which data are unavailable. This presents a barrier to estimating the parameters of dynamical models representing host-vector-pathogen interaction, and hence limits their ability to provide quantitative risk forecasts. The Theileria orientalis (Ikeda) outbreak in New Zealand cattle demonstrates this problem: even though the vector has received extensive laboratory study, a high degree of uncertainty persists over its national demographic distribution. Addressing this, we develop a Bayesian data assimilation approach whereby indirect observations of vector activity inform a seasonal spatio-temporal risk surface within a stochastic epidemic model. We provide quantitative predictions for the future spread of the epidemic, quantifying uncertainty in the model parameters, case infection times and the disease status of undetected infections. Importantly, we demonstrate how our model learns sequentially as the epidemic unfolds and provide evidence for changing epidemic dynamics through time. Our approach therefore provides a significant advance in rapid decision support for novel vector-borne disease outbreaks. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
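    The sequential-learning idea, stripped to its bare essentials, can be illustrated with a conjugate Bayesian update (a Beta-Binomial toy for per-site vector activity; the study's actual model is a spatio-temporal stochastic epidemic model, so everything below is a simplified analogy with invented counts):

    ```python
    # Toy analog of sequential Bayesian data assimilation: a Beta prior on the
    # probability of vector activity is updated as weekly surveillance batches
    # of (detections, non-detections) arrive.
    alpha, beta = 1.0, 1.0                          # flat Beta(1, 1) prior
    weekly_batches = [(3, 17), (5, 15), (9, 11)]    # invented weekly counts

    posterior_means = []
    for detections, non_detections in weekly_batches:
        alpha += detections                         # conjugate update
        beta += non_detections
        posterior_means.append(alpha / (alpha + beta))
    ```

    Each batch tightens and shifts the posterior, which is the "model learns sequentially as the epidemic unfolds" behaviour in miniature; the real model does the analogous update over a seasonal spatio-temporal risk surface.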

  10. Ecdysis period and rate deviations of dengue mosquito vector, Aedes aegypti reared in different artificial water-holding containers.

    PubMed

    Almanzor, Beatriz Louise J; Ho, Howell T; Carvajal, Thaddeus M

    2016-03-01

    Artificial water-holding containers (AWHCs) have been well documented in many Aedes aegypti studies for dengue surveillance and developmental research. Hence, we investigated the role of different AWHCs in the development and ecdysis period of Ae. aegypti, a container-breeding mosquito and dengue vector. Nine types of AWHCs, namely glass, polystyrene foam, rubber, steel, porcelain, plastic, aluminum, clay and concrete, were chosen for the study. All AWHCs were subjected to the developmental assay for an observation period of 10 days. Regression and hazard analyses were applied to the developmental stages and the characteristics of the AWHCs. The observations revealed that Ae. aegypti development is fastest in glass and polystyrene containers and slowest in concrete containers. Moreover, based on the regression and hazard analyses, pupal ecdysis appears to be the stage most affected by the characteristics of the AWHCs. Container characteristics that regulate water temperature seem to be the driving force behind the slow or fast development of Ae. aegypti, most notably in pupal ecdysis. The results of the study further strengthen our understanding of the response of Ae. aegypti's developmental biology to different characteristics of artificial water containers. This, in turn, would aid in devising vector control strategies against dengue, especially in endemic areas.

  11. Hourly predictive Levenberg-Marquardt ANN and multi linear regression models for predicting of dew point temperature

    NASA Astrophysics Data System (ADS)

    Zounemat-Kermani, Mohammad

    2012-08-01

    In this study, the ability of two models, multiple linear regression (MLR) and a Levenberg-Marquardt (LM) feed-forward neural network, to estimate the hourly dew point temperature was examined. Dew point temperature is the temperature at which water vapor in the air condenses into liquid. It can be useful in estimating meteorological variables such as fog, rain, snow, dew, and evapotranspiration, and in investigating agronomical issues such as stomatal closure in plants. The availability of hourly records of climatic data (air temperature, relative humidity and pressure) that could be used to predict dew point temperature motivated the modeling exercise. Additionally, the wind vector (wind speed magnitude and direction) and a conceptual input of weather condition were employed as further input variables. Three quantitative standard statistical performance evaluation measures, i.e. the root mean squared error, the mean absolute error, and the absolute logarithmic Nash-Sutcliffe efficiency coefficient (|Log(NS)|), were employed to evaluate the performance of the developed models. The results showed that applying the wind vector and weather condition as inputs along with the meteorological variables could slightly increase the predictive accuracy of the ANN and MLR models. The results also revealed that LM-NN was superior to the MLR model, and that the best performance in terms of the different evaluation criteria was obtained by considering all potential input variables.
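    The three evaluation measures are easy to state in code; note that the |Log(NS)| criterion is interpreted below as the absolute base-10 logarithm of the Nash-Sutcliffe coefficient, which is an assumption about the paper's notation, and the observation/prediction values are invented:

    ```python
    import numpy as np

    def rmse(obs, pred):
        """Root mean squared error."""
        return float(np.sqrt(np.mean((obs - pred) ** 2)))

    def mae(obs, pred):
        """Mean absolute error."""
        return float(np.mean(np.abs(obs - pred)))

    def nash_sutcliffe(obs, pred):
        """NS efficiency: 1 minus error variance over observation variance."""
        return float(1.0 - np.sum((obs - pred) ** 2)
                     / np.sum((obs - obs.mean()) ** 2))

    # Toy hourly dew-point observations vs. model output (degrees C, invented).
    obs = np.array([10.0, 12.0, 15.0, 14.0, 11.0])
    pred = np.array([10.5, 11.5, 14.5, 14.5, 11.5])

    rmse_v, mae_v = rmse(obs, pred), mae(obs, pred)
    ns = nash_sutcliffe(obs, pred)
    abs_log_ns = abs(float(np.log10(ns)))   # assumed reading of |Log(NS)|
    ```

    A perfect model gives NS = 1 and hence |Log(NS)| = 0, so smaller values of all three measures are better.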

  12. Climatic, ecological, and socioeconomic factors associated with West Nile virus incidence in Atlanta, Georgia, U.S.A.

    PubMed

    Lockaby, Graeme; Noori, Navideh; Morse, Wayde; Zipperer, Wayne; Kalin, Latif; Governo, Robin; Sawant, Rajesh; Ricker, Matthew

    2016-12-01

    The integrated effects of the many risk factors associated with West Nile virus (WNV) incidence are complex and not well understood. We studied an array of risk factors in and around Atlanta, GA, that have been shown to be linked with WNV in other locations. This array was comprehensive and included climate and meteorological metrics, vegetation characteristics, land use / land cover analyses, and socioeconomic factors. Data on mosquito abundance and WNV mosquito infection rates were obtained for 58 sites and covered 2009-2011, a period following the combined storm water - sewer overflow remediation in that city. Risk factors were compared to mosquito abundance and the WNV vector index (VI) using regression analyses individually and in combination. Lagged climate variables, including soil moisture and temperature, were significantly correlated (positively) with vector index as were forest patch size and percent pine composition of patches (both negatively). Socioeconomic factors that were most highly correlated (positively) with the VI included the proportion of low income households and homes built before 1960 and housing density. The model selected through stepwise regression that related risk factors to the VI included (in the order of decreasing influence) proportion of houses built before 1960, percent of pine in patches, and proportion of low income households. © 2016 The Society for Vector Ecology.
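    A stepwise procedure of the kind used here can be sketched as greedy forward selection scored by cross-validated R²; the predictor names and data below are invented stand-ins for the study's risk factors:

    ```python
    import numpy as np
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(6)
    # Hypothetical risk-factor table: 6 candidate predictors of a vector index,
    # of which only the first three are truly informative in this simulation.
    names = ["homes_pre_1960", "pct_pine", "low_income",
             "housing_density", "soil_moisture", "patch_size"]
    X = rng.normal(size=(150, 6))
    vi = 1.0 * X[:, 0] - 0.7 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, 150)

    # Greedy forward stepwise selection by cross-validated R^2.
    selected, remaining, best_score = [], list(range(6)), -np.inf
    while remaining:
        scores = {j: cross_val_score(LinearRegression(), X[:, selected + [j]],
                                     vi, cv=5, scoring="r2").mean()
                  for j in remaining}
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best_score + 1e-3:   # stop when no meaningful gain
            break
        best_score = scores[j_best]
        selected.append(j_best)
        remaining.remove(j_best)

    chosen = [names[j] for j in selected]
    ```

    Variables enter in order of decreasing explanatory power, mirroring the ordered model reported in the abstract.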

  13. Vector-transmitted disease vaccines: targeting salivary proteins in transmission (SPIT).

    PubMed

    McDowell, Mary Ann

    2015-08-01

    More than half the population of the world is at risk for morbidity and mortality from vector-transmitted diseases, and emerging vector-transmitted infections are threatening new populations. Rising insecticide resistance and lack of efficacious vaccines highlight the need for novel control measures. One such approach is targeting the vector-host interface by incorporating vector salivary proteins in anti-pathogen vaccines. Debate remains about whether vector saliva exposure exacerbates or protects against more severe clinical manifestations, induces immunity through natural exposure or extends to all vector species and associated pathogens. Nevertheless, exploiting this unique biology holds promise as a viable strategy for the development of vaccines against vector-transmitted diseases. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Estimation of Surface Seawater Fugacity of Carbon Dioxide Using Satellite Data and Machine Learning

    NASA Astrophysics Data System (ADS)

    Jang, E.; Im, J.; Park, G.; Park, Y.

    2016-12-01

    The ocean controls the climate of Earth by absorbing and releasing CO2 through the carbon cycle. The amount of CO2 in the ocean has increased since the industrial revolution. High CO2 concentration in the ocean negatively affects marine organisms and reduces the ocean's ability to absorb CO2. This study estimated surface seawater fugacity of CO2 (fCO2) in the East Sea of Korea using Geostationary Ocean Color Imager (GOCI) and Moderate Resolution Imaging Spectroradiometer (MODIS) satellite data, along with Hybrid Coordinate Ocean Model (HYCOM) reanalysis data. GOCI is the world's first geostationary ocean color observation satellite sensor; it provides eight images per day (hourly from 9 am to 4 pm) in eight bands at 500 m resolution. Two machine learning approaches, random forest and support vector regression, were used to model fCO2 in this study. While most existing studies have used multiple linear regression to estimate the pressure of CO2 in the ocean, machine learning can handle the more complex relationship between surface seawater fCO2 and ocean parameters in a dynamic spatiotemporal environment. Five ocean-related parameters, colored dissolved organic matter (CDOM), chlorophyll-a (chl-a), sea surface temperature (SST), sea surface salinity (SSS), and mixed layer depth (MLD), were used as input variables. This study examined two schemes, one with GOCI-derived products and the other with MODIS-derived ones. Results show that random forest performed better than support vector regression regardless of the satellite data used. The accuracy of the GOCI-based estimation was higher than that of the MODIS-based one, possibly thanks to the better spatiotemporal resolution of GOCI data. MLD was identified as the most influential of the five ocean-related parameters in estimating surface seawater fCO2, which might be related to active deep convection in the East Sea. Surface seawater fCO2 was generally higher in summer than in the other seasons, with some spatial variation, because of higher SST.
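    The random-forest-versus-SVR comparison, including reading off variable importance, can be sketched on synthetic stand-ins for the five ocean parameters (the response function below is invented, not the study's relationship):

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    n = 600
    # Columns stand in for CDOM, chl-a, SST, SSS, MLD (standardized, synthetic).
    X = rng.normal(size=(n, 5))
    # Invented nonlinear fCO2 response dominated by the SST-like column (index 2).
    y = (350.0 + 20.0 * X[:, 2] + 10.0 * np.tanh(X[:, 4])
         + 5.0 * X[:, 0] * X[:, 1] + rng.normal(0, 2, n))

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
    svr = make_pipeline(StandardScaler(), SVR(C=100.0)).fit(X_tr, y_tr)

    rf_r2, svr_r2 = rf.score(X_te, y_te), svr.score(X_te, y_te)
    importances = rf.feature_importances_   # per-variable contribution ranking
    ```

    The `feature_importances_` ranking is the kind of diagnostic that identifies one parameter (MLD in the study; the SST-like column in this toy) as the dominant driver.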

  15. Bayesian data assimilation provides rapid decision support for vector-borne diseases

    PubMed Central

    Jewell, Chris P.; Brown, Richard G.

    2015-01-01

    Predicting the spread of vector-borne diseases in response to incursions requires knowledge of both host and vector demographics in advance of an outbreak. Although host population data are typically available, for novel disease introductions there is a high chance of the pathogen using a vector for which data are unavailable. This presents a barrier to estimating the parameters of dynamical models representing host–vector–pathogen interaction, and hence limits their ability to provide quantitative risk forecasts. The Theileria orientalis (Ikeda) outbreak in New Zealand cattle demonstrates this problem: even though the vector has received extensive laboratory study, a high degree of uncertainty persists over its national demographic distribution. Addressing this, we develop a Bayesian data assimilation approach whereby indirect observations of vector activity inform a seasonal spatio-temporal risk surface within a stochastic epidemic model. We provide quantitative predictions for the future spread of the epidemic, quantifying uncertainty in the model parameters, case infection times and the disease status of undetected infections. Importantly, we demonstrate how our model learns sequentially as the epidemic unfolds and provide evidence for changing epidemic dynamics through time. Our approach therefore provides a significant advance in rapid decision support for novel vector-borne disease outbreaks. PMID:26136225

  16. All That Glisters Is Not Gold: Sampling-Process Uncertainty in Disease-Vector Surveys with False-Negative and False-Positive Detections

    PubMed Central

    Abad-Franch, Fernando; Valença-Barbosa, Carolina; Sarquis, Otília; Lima, Marli M.

    2014-01-01

    Background Vector-borne diseases are major public health concerns worldwide. For many of them, vector control is still key to primary prevention, with control actions planned and evaluated using vector occurrence records. Yet vectors can be difficult to detect, and vector occurrence indices will be biased whenever spurious detection/non-detection records arise during surveys. Here, we investigate the process of Chagas disease vector detection, assessing the performance of the surveillance method used in most control programs – active triatomine-bug searches by trained health agents. Methodology/Principal Findings Control agents conducted triplicate vector searches in 414 man-made ecotopes of two rural localities. Ecotope-specific ‘detection histories’ (vectors or their traces detected or not in each individual search) were analyzed using ordinary methods that disregard detection failures and multiple detection-state site-occupancy models that accommodate false-negative and false-positive detections. Mean (±SE) vector-search sensitivity was ∼0.283±0.057. Vector-detection odds increased as bug colonies grew denser, and were lower in houses than in most peridomestic structures, particularly woodpiles. False-positive detections (non-vector fecal streaks misidentified as signs of vector presence) occurred with probability ∼0.011±0.008. The model-averaged estimate of infestation (44.5±6.4%) was ∼2.4–3.9 times higher than naïve indices computed assuming perfect detection after single vector searches (11.4–18.8%); about 106–137 infestation foci went undetected during such standard searches. Conclusions/Significance We illustrate a relatively straightforward approach to addressing vector detection uncertainty under realistic field survey conditions. Standard vector searches had low sensitivity except in certain singular circumstances. Our findings suggest that many infestation foci may go undetected during routine surveys, especially when vector density is low. 
Undetected foci can cause control failures and induce bias in entomological indices; this may confound disease risk assessment and mislead program managers into flawed decision making. By helping correct bias in naïve indices, the approach we illustrate has potential to critically strengthen vector-borne disease control-surveillance systems. PMID:25233352
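    The scale of the bias can be checked with back-of-the-envelope arithmetic from the reported mean per-search sensitivity (the illustrative naïve index below is invented within the reported 11.4-18.8% range; the full site-occupancy models also handle false positives, which this sketch ignores):

    ```python
    # First-order detection correction: if each search detects an infested
    # ecotope with probability p, a naive one-search index underestimates true
    # infestation by roughly the factor 1 / p.
    p = 0.283                  # mean per-search sensitivity reported above

    def detect_prob(p, k):
        """Chance a truly infested ecotope is detected in at least one of k searches."""
        return 1.0 - (1.0 - p) ** k

    naive_index = 0.126        # illustrative apparent infestation after one search
    corrected = naive_index / detect_prob(p, 1)   # first-order corrected estimate
    triple_search = detect_prob(p, 3)             # sensitivity of triplicate surveys
    ```

    With p near 0.28, the correction factor is about 3.5, consistent with the 2.4-3.9-fold gap between the naïve and model-averaged estimates; even triplicate searches detect only about 63% of infested sites in this first-order view.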

  17. Prediction of p38 map kinase inhibitory activity of 3, 4-dihydropyrido [3, 2-d] pyrimidone derivatives using an expert system based on principal component analysis and least square support vector machine

    PubMed Central

    Shahlaei, M.; Saghaie, L.

    2014-01-01

    A quantitative structure–activity relationship (QSAR) study is presented for the prediction of the biological activity (pIC50) of 3,4-dihydropyrido[3,2-d]pyrimidone derivatives as p38 inhibitors. Modeling of the biological activities of the compounds of interest as a function of molecular structure was established by means of principal component analysis (PCA) and least square support vector machine (LS-SVM) methods. The results showed that the pIC50 values calculated by LS-SVM are in good agreement with the experimental data, and the performance of the LS-SVM regression model is superior to that of the PCA-based model. The developed LS-SVM model was applied to the prediction of the biological activities of pyrimidone derivatives that were not included in the modeling procedure. The resulting model showed high prediction ability, with a root mean square error of prediction of 0.460 for LS-SVM. The study provides a novel and effective approach for predicting the biological activities of 3,4-dihydropyrido[3,2-d]pyrimidone derivatives as p38 inhibitors and shows that LS-SVM can be used as a powerful chemometrics tool for QSAR studies. PMID:26339262
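    A hedged sketch of a dimensionality-reduction-plus-kernel-regression pipeline of this general shape, on invented descriptor data; scikit-learn's KernelRidge stands in for LS-SVM here, since both solve a regularized least-squares problem in kernel space:

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    rng = np.random.default_rng(2)
    # Hypothetical descriptor matrix for 60 compounds x 40 descriptors; the
    # first three descriptors (given higher variance so that unsupervised PCA
    # retains them) drive a synthetic pIC50. Not the paper's data set.
    X = rng.normal(size=(60, 40))
    X[:, :3] *= 3.0
    pic50 = (6.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + 0.2 * X[:, 2]
             + rng.normal(0, 0.1, 60))

    # PCA compresses the descriptors; KernelRidge plays the LS-SVM role.
    model = make_pipeline(PCA(n_components=10),
                          KernelRidge(kernel="rbf", alpha=0.1, gamma=0.01))
    X_tr, X_te, y_tr, y_te = train_test_split(X, pic50, random_state=0)
    r2 = model.fit(X_tr, y_tr).score(X_te, y_te)
    ```

    Held-out R² on the compressed representation is the kind of figure compared against the reported RMSEP of 0.460.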

  18. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    PubMed Central

    Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2015-01-01

    The blood-brain barrier (BBB) is a highly complex physical barrier that determines which substances are allowed to enter the brain. The support vector machine (SVM) is a kernel-based machine learning method widely used in QSAR studies. For a successful SVM model, the kernel parameters and the feature subset selection are the most important factors affecting prediction accuracy. In most studies they are treated as two independent problems, but it has been shown that they can affect each other. We designed and implemented a genetic algorithm (GA) to optimize the kernel parameters and feature subset selection for SVM regression simultaneously, and applied it to BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models; optimizing both the SVM parameters and the feature subset simultaneously with a genetic algorithm is therefore a better approach than methods that treat the two problems separately. Analysis of our log BB model suggests that the carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among these properties, lipophilicity enhances BBB penetration while all the others are negatively correlated with it. PMID:26504797
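    A toy version of the joint optimization, with a deliberately tiny GA on synthetic descriptors (chromosome = 8 feature bits plus log10 C and log10 gamma genes; none of this is the paper's actual encoding or data):

    ```python
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVR

    rng = np.random.default_rng(0)
    # Synthetic stand-in for a log BB data set: 8 descriptors, first 3 informative.
    X = rng.normal(size=(120, 8))
    y = X[:, 0] + 0.5 * X[:, 1] - 0.5 * X[:, 2] + rng.normal(0, 0.1, 120)

    def fitness(chrom):
        """Cross-validated R^2 for a feature mask plus (log10 C, log10 gamma) genes."""
        mask = chrom[:8] > 0.5
        if not mask.any():
            return -np.inf
        svr = SVR(C=10.0 ** chrom[8], gamma=10.0 ** chrom[9])
        return cross_val_score(svr, X[:, mask], y, cv=3, scoring="r2").mean()

    # Tiny generational GA: truncation selection, uniform crossover, mutation.
    pop = np.column_stack([rng.integers(0, 2, (20, 8)).astype(float),
                           rng.uniform(-1, 2, 20), rng.uniform(-2, 0, 20)])
    for _ in range(5):
        fit = np.array([fitness(ind) for ind in pop])
        parents = pop[np.argsort(fit)[::-1][:10]]          # keep the best half
        children = []
        for _ in range(10):
            a, b = parents[rng.integers(0, 10, 2)]
            child = np.where(rng.random(10) < 0.5, a, b)   # uniform crossover
            flip = rng.random(8) < 0.1                     # bit-flip feature genes
            child[:8] = np.where(flip, 1.0 - child[:8], child[:8])
            child[8:] += rng.normal(0.0, 0.1, 2)           # jitter C / gamma genes
            children.append(child)
        pop = np.vstack([parents, children])

    fit = np.array([fitness(ind) for ind in pop])
    best, best_r2 = pop[np.argmax(fit)], float(fit.max())
    ```

    Because each chromosome carries both the mask and the kernel parameters, selection pressure optimizes them jointly, which is the point the paper makes against treating the two problems separately.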

  19. Use seismic colored inversion and power law committee machine based on imperial competitive algorithm for improving porosity prediction in a heterogeneous reservoir

    NASA Astrophysics Data System (ADS)

    Ansari, Hamid Reza

    2014-09-01

    In this paper we propose a new method for predicting rock porosity based on a combination of several artificial intelligence systems. The method focuses on one of the Iranian carbonate fields in the Persian Gulf. Because carbonate formations are strongly heterogeneous, estimating their rock properties is more challenging than for sandstone. For this purpose, seismic colored inversion (SCI) and a new committee machine approach are used to improve porosity estimation. The study comprises three major steps. First, a series of sample-based attributes is calculated from the 3D seismic volume; acoustic impedance is an important attribute, obtained here by the SCI method. Second, the porosity log is predicted from seismic attributes using common intelligent computation systems, including a probabilistic neural network (PNN), radial basis function network (RBFN), multi-layer feed-forward network (MLFN), ε-support vector regression (ε-SVR) and an adaptive neuro-fuzzy inference system (ANFIS). Finally, a power law committee machine (PLCM) is constructed based on the imperial competitive algorithm (ICA) to combine the results of all previous predictions into a single solution; this technique is called PLCM-ICA in this paper. The results show that the PLCM-ICA model improved on the results of the neural networks, support vector machine and neuro-fuzzy system.
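    The committee idea reduces to learning weights that combine several experts' predictions. The sketch below uses a plain least-squares linear committee on invented porosity data; the paper's PLCM instead optimizes a power-law combination with the imperial competitive algorithm:

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    # Toy committee machine: three imperfect "expert" porosity predictions
    # (invented noise levels standing in for PNN, SVR, ANFIS outputs).
    true_phi = rng.uniform(0.05, 0.30, 200)                  # porosity fraction
    experts = np.column_stack([true_phi + rng.normal(0.0, s, 200)
                               for s in (0.02, 0.03, 0.05)])

    # Fit combiner weights by least squares against the known target.
    weights, *_ = np.linalg.lstsq(experts, true_phi, rcond=None)
    committee = experts @ weights

    def rmse(a, b):
        return float(np.sqrt(np.mean((a - b) ** 2)))

    committee_rmse = rmse(committee, true_phi)
    best_single_rmse = min(rmse(experts[:, i], true_phi) for i in range(3))
    ```

    By construction the fitted committee can do no worse in-sample than its best member, and with independent expert errors it does strictly better, which is the rationale for combining the five predictors rather than picking one.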

  20. Periscope: quantitative prediction of soluble protein expression in the periplasm of Escherichia coli

    NASA Astrophysics Data System (ADS)

    Chang, Catherine Ching Han; Li, Chen; Webb, Geoffrey I.; Tey, Bengti; Song, Jiangning; Ramanan, Ramakrishnan Nagasundara

    2016-03-01

    Periplasmic expression of soluble proteins in Escherichia coli not only offers a much-simplified downstream purification process, but also enhances the probability of obtaining correctly folded and biologically active proteins. Different combinations of signal peptides and target proteins lead to different soluble protein expression levels, ranging from negligible to several grams per litre. Accurate algorithms for rational selection of promising candidates can serve as a powerful tool to complement current trial-and-error approaches. Accordingly, proteomics studies can be conducted with greater efficiency and cost-effectiveness. Here, we developed a predictor with a two-stage architecture to predict the real-valued expression level of a target protein in the periplasm. The output of the first-stage support vector machine (SVM) classifier determines which second-stage support vector regression (SVR) model is used. When tested on an independent test dataset, the predictor achieved an overall prediction accuracy of 78% and a Pearson's correlation coefficient (PCC) of 0.77. We further illustrate the relative importance of various features with respect to different models. The results indicate that the occurrence of the dipeptide glutamine-aspartic acid is the most important feature for the classification model. Finally, we provide access to the implemented predictor through the Periscope webserver, freely accessible at http://lightning.med.monash.edu/periscope/.
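
The two-stage architecture (a first-stage SVM classifier routing each sample to a regime-specific second-stage SVR) can be sketched as follows. The data and regimes here are hypothetical, not Periscope's features or classes.

```python
import numpy as np
from sklearn.svm import SVC, SVR

rng = np.random.default_rng(2)
# synthetic stand-in for sequence features: two regimes with different
# feature-to-expression relationships (the paper's classes are expression bins)
X = rng.normal(size=(300, 5))
regime = (X[:, 0] > 0).astype(int)
y = np.where(regime == 1, 3.0 + 2.0 * X[:, 1], 0.5 + 0.3 * X[:, 2])

# stage 1: SVM classifier decides which regime a sample belongs to
clf = SVC().fit(X, regime)

# stage 2: one SVR per regime, trained only on that regime's samples
regressors = {r: SVR(C=10.0).fit(X[regime == r], y[regime == r]) for r in (0, 1)}

def predict(Xnew):
    routes = clf.predict(Xnew)  # stage-1 output selects the stage-2 model
    out = np.empty(len(Xnew))
    for r, model in regressors.items():
        idx = routes == r
        if idx.any():
            out[idx] = model.predict(Xnew[idx])
    return out

yhat = predict(X)
rmse = float(np.sqrt(np.mean((yhat - y) ** 2)))
print(f"training RMSE: {rmse:.3f}")
```

Routing first lets each regressor specialize on a narrower output range, which is the design rationale the abstract describes.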

  1. Big genomics and clinical data analytics strategies for precision cancer prognosis.

    PubMed

    Ow, Ghim Siong; Kuznetsov, Vladimir A

    2016-11-07

    The field of personalized and precise medicine in the era of big data analytics is growing rapidly. Previously, we proposed our model of patient classification termed Prognostic Signature Vector Matching (PSVM) and identified a 37-variable signature comprising 36 let-7b-associated prognostically significant mRNAs and the age risk factor that stratified large high-grade serous ovarian cancer patient cohorts into three survival-significant risk groups. Here, we investigated the predictive performance of PSVM via optimization of the prognostic variable weights, which represent the relative importance of one prognostic variable over the others. In addition, we compared several multivariate prognostic models based on PSVM with classical machine learning techniques such as K-nearest-neighbor, support vector machine, random forest, neural networks and logistic regression. Our results revealed that negative log-rank p-values provide more robust weight values than other quantities such as hazard ratios, fold change, or combinations of those factors. PSVM and the classical machine learning classifiers were then combined in an ensemble (multi-test) voting system, which collectively provides a more precise and reproducible patient stratification. The use of this multi-test system, rather than a search for the single ideal classification/prediction method, might help to address the limitations of individual classification algorithms in specific situations.
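
A minimal multi-test voting system in the spirit described (a majority vote over heterogeneous classifiers) might look like this; the dataset and the particular classifier choices are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# synthetic two-class stand-in for high/low-risk patient labels
X, y = make_classification(n_samples=400, n_features=20, n_informative=6,
                           random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=0)

voters = [KNeighborsClassifier(),
          LogisticRegression(max_iter=1000),
          RandomForestClassifier(n_estimators=100, random_state=0)]
preds = np.stack([m.fit(Xtr, ytr).predict(Xte) for m in voters])

# multi-test voting: each sample gets the label most classifiers agree on
vote = (preds.mean(axis=0) >= 0.5).astype(int)
for name, p in zip(["kNN", "logreg", "forest", "vote"], [*preds, vote]):
    print(f"{name:6s} accuracy {(p == yte).mean():.3f}")
```

Majority voting tends to smooth out the idiosyncratic failures of any single classifier, which is the reproducibility argument made in the abstract.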

  2. A Hamiltonian approach to the planar optimization of mid-course corrections

    NASA Astrophysics Data System (ADS)

    Iorfida, E.; Palmer, P. L.; Roberts, M.

    2016-04-01

    Lawden's primer vector theory gives a set of necessary conditions that characterize the optimality of a transfer orbit, defined according to the possibility of adding mid-course corrections. In this paper a novel approach is proposed in which, through a polar coordinate transformation, the primer vector components decouple. Furthermore, the case where the transfer, departure and arrival orbits are coplanar is analyzed using a Hamiltonian approach. This procedure leads to approximate analytic solutions for the in-plane components of the primer vector. Moreover, the solution for the circular transfer case is proven to be Hill's solution. The novel procedure reduces the mathematical and computational complexity of the original case study. It is shown that the primer vector is independent of the semi-major axis of the transfer orbit. The case with a fixed transfer trajectory and variable initial and final thrust impulses is studied. The resulting optimality maps, which express the likelihood of a set of trajectories being optimal, are presented and analyzed. Furthermore, the requirements that a set of departure and arrival orbits must fulfill in order to share the same primer vector profile are presented.

  3. "Singing in the Tube"--audiovisual assay of plant oil repellent activity against mosquitoes (Culex pipiens).

    PubMed

    Adams, Temitope F; Wongchai, Chatchawal; Chaidee, Anchalee; Pfeiffer, Wolfgang

    2016-01-01

    Plant essential oils have been suggested as a promising alternative to the established mosquito repellent DEET (N,N-diethyl-meta-toluamide). Searching for an assay that uses generally available equipment, we designed a new audiovisual assay of repellent activity against mosquitoes, "Singing in the Tube", testing single mosquitoes in Drosophila cultivation tubes. Statistical evaluation with regression analysis was intended to compensate for the limitations of the simple hardware. The assay was established with female Culex pipiens mosquitoes in 60 experiments, 120 h of audio recording, and 2580 estimations of the distance between the mosquito's sitting position and the chemical. Correlations between parameters of sitting position, flight activity pattern, and flight tone spectrum were analyzed. Regression analysis of psycho-acoustic data from the audio files (dB[A]) used a squared and modified sinus function to determine the wing beat frequency WBF ± SD (357 ± 47 Hz). Application of logistic regression defined the repelling velocity constant, which showed a decreasing order of efficiency of the plant essential oils: rosemary (Rosmarinus officinalis), eucalyptus (Eucalyptus globulus), lavender (Lavandula angustifolia), citronella (Cymbopogon nardus), tea tree (Melaleuca alternifolia), clove (Syzygium aromaticum), lemon (Citrus limon), patchouli (Pogostemon cablin), DEET, cedar wood (Cedrus atlantica). In conclusion, we suggest (1) disease vector control (e.g., impregnation of bed nets) by the eight plant essential oils with repelling velocity superior to DEET, (2) simple mosquito repellency testing in Drosophila cultivation tubes, (3) automated approaches and room surveillance by generally available audio equipment (dB[A]: ISO standard 226), and (4) quantification of repellent activity by parameters of the audiovisual assay defined by correlation and regression analyses.

  4. Application of XGBoost algorithm in hourly PM2.5 concentration prediction

    NASA Astrophysics Data System (ADS)

    Pan, Bingyue

    2018-02-01

    To improve techniques for predicting hourly PM2.5 concentrations in China, this paper applies the XGBoost (Extreme Gradient Boosting) algorithm to hourly PM2.5 concentration prediction. Air quality monitoring data from the city of Tianjin were analyzed using the XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentrations using three measures of forecast accuracy. The XGBoost method is also compared with random forest, multiple linear regression, decision tree regression and support vector regression models. The results demonstrate that the XGBoost algorithm outperforms these other data mining methods.
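
A comparison of this kind can be sketched as below. The hourly records are synthetic, and scikit-learn's GradientBoostingRegressor stands in for XGBoost (same algorithm family, different implementation); the nonlinear wind-speed term is exactly the kind of structure that gives boosted trees an edge over a linear baseline.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)
n = 1000
# synthetic hourly records: temperature, wind speed, humidity, previous-hour PM2.5
X = np.column_stack([rng.normal(15, 8, n), rng.gamma(2, 2, n),
                     rng.uniform(20, 95, n), rng.gamma(3, 20, n)])
# nonlinear response: pollution builds up sharply when wind speed is low
y = 0.3 * X[:, 3] + 80 * np.exp(-X[:, 1]) + 0.3 * X[:, 2] + rng.normal(0, 5, n)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
results = {}
for name, model in {"boosted trees": GradientBoostingRegressor(random_state=0),
                    "linear": LinearRegression()}.items():
    pred = model.fit(Xtr, ytr).predict(Xte)
    results[name] = mean_squared_error(yte, pred) ** 0.5  # RMSE
    print(f"{name:13s} RMSE {results[name]:6.2f} "
          f"MAE {mean_absolute_error(yte, pred):6.2f}")
```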

  5. Partial F-tests with multiply imputed data in the linear regression framework via coefficient of determination.

    PubMed

    Chaurasia, Ashok; Harel, Ofer

    2015-02-10

    Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, there are several papers addressing tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculation with vectors and (inversion of) matrices. In this paper, we propose a simple method based on the scalar entity, coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.
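
The scalar-R² route to a partial F-test, shown here for the complete-data case (the paper's multiple-imputation pooling is omitted), avoids any matrix inversion beyond an ordinary least-squares fit:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200
X = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)  # x3 is irrelevant

def r_squared(X, y):
    # OLS R^2 from a least-squares fit on a design matrix with an intercept
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

r2_full = r_squared(X, y)            # full model: all three predictors
r2_reduced = r_squared(X[:, :1], y)  # reduced model: drop x2 and x3

# partial F = ((R2_full - R2_reduced)/q) / ((1 - R2_full)/(n - k - 1)),
# with q coefficients tested and k predictors in the full model
q, k = 2, 3
F = ((r2_full - r2_reduced) / q) / ((1.0 - r2_full) / (n - k - 1))
print(f"partial F for H0: beta2 = beta3 = 0 -> {F:.2f}")
```

Since x2 carries real signal, the statistic is large and the null is rejected; setting q = k and an intercept-only reduced model recovers the global F-test as a special case.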

  6. Muscle-specific CRISPR/Cas9 dystrophin gene editing ameliorates pathophysiology in a mouse model for Duchenne muscular dystrophy

    PubMed Central

    Bengtsson, Niclas E.; Hall, John K.; Odom, Guy L.; Phelps, Michael P.; Andrus, Colin R.; Hawkins, R. David; Hauschka, Stephen D.; Chamberlain, Joel R.; Chamberlain, Jeffrey S.

    2017-01-01

    Gene replacement therapies utilizing adeno-associated viral (AAV) vectors hold great promise for treating Duchenne muscular dystrophy (DMD). A related approach uses AAV vectors to edit specific regions of the DMD gene using CRISPR/Cas9. Here we develop multiple approaches for editing the mutation in dystrophic mdx4cv mice using single and dual AAV vector delivery of a muscle-specific Cas9 cassette together with single-guide RNA cassettes and, in one approach, a dystrophin homology region to fully correct the mutation. Muscle-restricted Cas9 expression enables direct editing of the mutation, multi-exon deletion or complete gene correction via homologous recombination in myogenic cells. Treated muscles express dystrophin in up to 70% of the myogenic area and increased force generation following intramuscular delivery. Furthermore, systemic administration of the vectors results in widespread expression of dystrophin in both skeletal and cardiac muscles. Our results demonstrate that AAV-mediated muscle-specific gene editing has significant potential for therapy of neuromuscular disorders. PMID:28195574

  7. Mitochondrial Neurogastrointestinal Encephalomyopathy Caused by Thymidine Phosphorylase Enzyme Deficiency: From Pathogenesis to Emerging Therapeutic Options

    PubMed Central

    Yadak, Rana; Sillevis Smitt, Peter; van Gisbergen, Marike W.; van Til, Niek P.; de Coo, Irenaeus F. M.

    2017-01-01

    Mitochondrial neurogastrointestinal encephalomyopathy (MNGIE) is a progressive metabolic disorder caused by thymidine phosphorylase (TP) enzyme deficiency. The lack of TP results in systemic accumulation of the deoxyribonucleosides thymidine (dThd) and deoxyuridine (dUrd). In these patients, clinical features include mental regression, ophthalmoplegia, and fatal gastrointestinal complications. The accumulation of nucleosides also causes imbalances in mitochondrial DNA (mtDNA) deoxyribonucleoside triphosphates (dNTPs), which may play a direct or indirect role in the mtDNA depletion/deletion abnormalities, although the exact underlying mechanism remains unknown. The available therapeutic approaches include dialysis and enzyme replacement therapy, both of which can only transiently reverse the biochemical imbalance. Allogeneic hematopoietic stem cell transplantation has been shown to restore normal enzyme activity and improve clinical manifestations in MNGIE patients. However, transplant-related complications and disease progression result in a high mortality rate. New therapeutic approaches, such as adeno-associated viral vector and hematopoietic stem cell gene therapy, have been tested in Tymp-/-Upp1-/- mice, a murine model for MNGIE. This review provides background information on disease manifestations of MNGIE with a focus on current management and treatment options. It also outlines the pre-clinical approaches toward future treatment of the disease. PMID:28261062

  8. Short-Term Load Forecasting Based Automatic Distribution Network Reconfiguration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Huaiguang; Ding, Fei; Zhang, Yingchen

    In a traditional dynamic network reconfiguration study, the optimal topology is determined at every scheduled time point by using the real load data measured at that time. The development of the load forecasting technique can provide an accurate prediction of the load power that will happen in a future time and provide more information about load changes. With the inclusion of load forecasting, the optimal topology can be determined based on the predicted load conditions during a longer time period instead of using a snapshot of the load at the time when the reconfiguration happens; thus, the distribution system operator can use this information to better operate the system reconfiguration and achieve optimal solutions. This paper proposes a short-term load forecasting approach to automatically reconfigure distribution systems in a dynamic and pre-event manner. Specifically, a short-term and high-resolution distribution system load forecasting approach is proposed with a forecaster based on support vector regression and parallel parameters optimization. The network reconfiguration problem is solved by using the forecasted load continuously to determine the optimal network topology with the minimum amount of loss at the future time. The simulation results validate and evaluate the proposed approach.
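
A forecaster of the general shape described (SVR on lagged load with a parallelized hyperparameter search) can be sketched as follows; the load series is synthetic and the parameter grid is illustrative, not the paper's configuration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(5)
hours = np.arange(24 * 60)  # 60 days of hourly load
load = (100 + 30 * np.sin(2 * np.pi * hours / 24)          # daily cycle
        + 10 * np.sin(2 * np.pi * hours / (24 * 7))        # weekly cycle
        + rng.normal(0, 3, hours.size))

# lagged-load features: predict the next hour from the previous 24 hours
lags = 24
X = np.stack([load[i:i + lags] for i in range(load.size - lags)])
y = load[lags:]

# parallel parameter optimization (n_jobs=-1) over the SVR hyperparameters,
# with time-ordered cross-validation splits
search = GridSearchCV(
    make_pipeline(StandardScaler(), SVR()),
    {"svr__C": [1, 10, 100], "svr__epsilon": [0.1, 1.0]},
    cv=TimeSeriesSplit(n_splits=3), n_jobs=-1)
search.fit(X[:-168], y[:-168])  # hold out the final week

pred = search.predict(X[-168:])
rmse = float(np.sqrt(np.mean((pred - y[-168:]) ** 2)))
print(f"best params {search.best_params_}, hold-out RMSE {rmse:.2f}")
```

The rolling forecast this produces is what would feed the reconfiguration optimizer at each future time step.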

  9. Short-Term Load Forecasting Based Automatic Distribution Network Reconfiguration: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jiang, Huaiguang; Ding, Fei; Zhang, Yingchen

    In the traditional dynamic network reconfiguration study, the optimal topology is determined at every scheduled time point by using the real load data measured at that time. The development of the load forecasting technique can provide an accurate prediction of the load power that will happen in a future time and provide more information about load changes. With the inclusion of load forecasting, the optimal topology can be determined based on the predicted load conditions during a longer time period instead of using a snapshot of the load at the time when the reconfiguration happens, and it can thus provide information to the distribution system operator (DSO) to better operate the system reconfiguration and achieve optimal solutions. This paper therefore proposes a short-term load forecasting based approach for automatically reconfiguring distribution systems in a dynamic and pre-event manner. Specifically, a short-term and high-resolution distribution system load forecasting approach is proposed with a support vector regression (SVR) based forecaster and parallel parameters optimization. The network reconfiguration problem is then solved by using the forecasted load continuously to determine the optimal network topology with the minimum loss at the future time. The simulation results validate and evaluate the proposed approach.

  11. Machine Learning Algorithms for prediction of regions of high Reynolds Averaged Navier Stokes Uncertainty

    NASA Astrophysics Data System (ADS)

    Mishra, Aashwin; Iaccarino, Gianluca

    2017-11-01

    In spite of their deficiencies, RANS models represent the workhorse for industrial investigations into turbulent flows. In this context, it is essential to provide diagnostic measures to assess the quality of RANS predictions. To this end, the primary step is to identify feature importances amongst massive sets of potentially descriptive and discriminative flow features. This aids the physical interpretability of the resultant discrepancy model and its extensibility to similar problems. Recent investigations have utilized approaches such as Random Forests, Support Vector Machines and the Least Absolute Shrinkage and Selection Operator for feature selection. With examples, we exhibit how such methods may not be suitable for turbulent flow datasets. The underlying rationale, such as the correlation bias and the required conditions for the success of penalized algorithms, are discussed with illustrative examples. Finally, we provide alternate approaches using convex combinations of regularized regression approaches and randomized sub-sampling in combination with feature selection algorithms, to infer model structure from data. This research was supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).
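
The randomized sub-sampling idea combined with a penalized selector (often called stability selection) can be sketched as below; the data, penalty strength, and frequency threshold are illustrative assumptions, not the authors' setup. The correlated feature pair shows the instability that motivates counting selections across subsamples rather than trusting a single fit.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(6)
n, p = 200, 30
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=n)  # a strongly correlated pair
y = 2.0 * X[:, 0] + 1.5 * X[:, 5] + rng.normal(size=n)

# stability selection: refit a penalized model on many random half-samples
# and count how often each feature survives the penalty
n_rounds = 100
counts = np.zeros(p)
for _ in range(n_rounds):
    idx = rng.choice(n, n // 2, replace=False)  # random sub-sample
    coef = Lasso(alpha=0.2).fit(X[idx], y[idx]).coef_
    counts += np.abs(coef) > 1e-8
freq = counts / n_rounds

print("selection frequency of features 0, 1, 5:", freq[[0, 1, 5]])
print("features selected in >80% of rounds:", np.flatnonzero(freq > 0.8))
```

The Lasso may split its votes between the correlated features 0 and 1 across subsamples, which is precisely the correlation-bias behaviour the abstract warns about for single penalized fits.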

  12. Approaches to control diseases vectored by ambrosia beetles in avocado and other American Lauraceae

    USDA-ARS?s Scientific Manuscript database

    Invasive ambrosia beetles and the plant pathogenic fungi they vector represent a significant challenge to North American agriculture, native and landscape trees. Ambrosia beetles encompass a range of insect species and they vector a diverse set of plant pathogenic fungi. Our lab has taken several bi...

  13. Structural Analysis of Biodiversity

    PubMed Central

    Sirovich, Lawrence; Stoeckle, Mark Y.; Zhang, Yu

    2010-01-01

    Large, recently-available genomic databases cover a wide range of life forms, suggesting opportunity for insights into the genetic structure of biodiversity. In this study we refine our recently-described technique using indicator vectors to analyze and visualize nucleotide sequences. The indicator vector approach generates correlation matrices, dubbed Klee diagrams, which represent a novel way of assembling and viewing large genomic datasets. To explore its potential utility, here we apply the improved algorithm to a collection of almost 17000 DNA barcode sequences covering 12 widely-separated animal taxa, demonstrating that indicator vectors for classification gave correct assignment in all 11000 test cases. Indicator vector analysis revealed discontinuities corresponding to species- and higher-level taxonomic divisions, suggesting an efficient approach to classification of organisms from poorly-studied groups. As compared to standard distance metrics, indicator vectors preserve diagnostic character probabilities, enable automated classification of test sequences, and generate high-information density single-page displays. These results support application of indicator vectors for comparative analysis of large nucleotide data sets and raise the prospect of gaining insight into broad-scale patterns in the genetic structure of biodiversity. PMID:20195371

  14. Evaluating predictive models for solar energy growth in the US states and identifying the key drivers

    NASA Astrophysics Data System (ADS)

    Chakraborty, Joheen; Banerji, Sugata

    2018-03-01

    Driven by a desire to control climate change and reduce the dependence on fossil fuels, governments around the world are increasing the adoption of renewable energy sources. However, among the US states, we observe a wide disparity in renewable penetration. In this study, we have identified and cleaned over a dozen datasets representing solar energy penetration in each US state, and the potentially relevant socioeconomic and other factors that may be driving the growth in solar. We have applied a number of predictive modeling approaches - including machine learning and regression - on these datasets over a 17-year period and evaluated the relative performance of the models. Our goals were: (1) identify the most important factors that are driving the growth in solar, (2) choose the most effective predictive modeling technique for solar growth, and (3) develop a model for predicting next year’s solar growth using this year’s data. We obtained very promising results with random forests (about 90% efficacy) and varying degrees of success with support vector machines and regression techniques (linear, polynomial, ridge). We also identified states with solar growth slower than expected and representing a potential for stronger growth in future.

  15. Development and Application of Modern Optimal Controllers for a Membrane Structure Using Vector Second Order Form

    NASA Astrophysics Data System (ADS)

    Ferhat, Ipar

    With increasing advancement in material science and the computational power of current computers that allows us to analyze high dimensional systems, very light and large structures are being designed and built for aerospace applications. One example is a reflector of a space telescope made of membrane structures. These reflectors are light and foldable, which makes shipment easier and cheaper, unlike traditional reflectors made of glass or other heavy materials. However, one of the disadvantages of membranes is that they are very sensitive to external changes, such as thermal load or maneuvering of the space telescope. These effects create vibrations that dramatically affect the performance of the reflector. To overcome vibrations in membranes, in this work, piezoelectric actuators are used to develop distributed controllers for membranes. These actuators generate bending effects to suppress the vibration. The actuators attached to a membrane are relatively thick, which makes the system heterogeneous; thus, an analytical solution cannot be obtained for the partial differential equation of the system. Therefore, the Finite Element Model is applied to obtain an approximate solution for the membrane actuator system. Another difficulty that arises with very flexible large structures is the dimension of the discretized system. To obtain an accurate result, the system needs to be discretized using smaller segments, which makes the dimension of the system very high. This issue will persist as long as improving technology allows increasingly complex and large systems to be designed and built. To deal with this difficulty, the analysis of the system and the controller development to suppress the vibration are carried out using vector second order form as an alternative to vector first order form. In vector second order form, the number of equations that need to be solved is half the number of equations in vector first order form.
    Analyzing the system for control characteristics such as stability, controllability and observability is a key step that needs to be carried out before developing a controller. This analysis determines what kind of system is being modeled and the appropriate approach for controller development. Therefore, the accuracy of the system analysis is crucial. The results of the system analysis using vector second order form and vector first order form show the computational advantages of using vector second order form. Using similar concepts, the LQR and LQG controllers developed to suppress the vibration are derived in vector second order form. To develop a controller using vector second order form, two different approaches are used. One is reducing the size of the Algebraic Riccati Equation by half by partitioning the solution matrix. The other approach is using the Hamiltonian method directly in vector second order form. Controllers are developed using both approaches and compared to each other. Some simple solutions for special cases are derived in vector second order form using the reduced Algebraic Riccati Equation. The advantages and drawbacks of both approaches are explained through examples. System analysis and controller applications are carried out for a square membrane system with four actuators. Two different systems with different actuator locations are analyzed: one has the actuators at the corners of the membrane, the other has the actuators away from the corners. The structural and control effects of the actuator locations are demonstrated with mode shapes and simulations. The results of the controller applications and the comparison of the vector first order form with the vector second order form demonstrate the efficacy of the controllers.
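
The halving of the equation count that the abstract refers to can be written compactly; the symbols below (mass matrix M, damping matrix D, stiffness matrix K, input matrix B) are standard structural-dynamics notation, which the abstract itself does not spell out:

```latex
% vector second order form: n coupled equations in the nodal coordinates q
M\ddot{q}(t) + D\dot{q}(t) + Kq(t) = Bu(t)

% equivalent vector first order (state-space) form: 2n equations,
% obtained by stacking positions and velocities into one state vector
x = \begin{bmatrix} q \\ \dot{q} \end{bmatrix}, \qquad
\dot{x} = \begin{bmatrix} 0 & I \\ -M^{-1}K & -M^{-1}D \end{bmatrix} x
        + \begin{bmatrix} 0 \\ M^{-1}B \end{bmatrix} u
```

Working directly with the n second-order equations avoids forming the 2n-dimensional state and, in the LQR/LQG setting, permits the partitioned (reduced) Algebraic Riccati Equation the abstract describes.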

  16. Noise-induced hearing loss and associated factors among vector control workers in a Malaysian state.

    PubMed

    Masilamani, Retneswari; Rasib, Abdul; Darus, Azlan; Ting, Anselm Su

    2014-11-01

    This study aims to determine the prevalence and associated factors of noise-induced hearing loss (NIHL) among vector control workers in the state of Negeri Sembilan, Malaysia. This was an analytical cross-sectional study conducted on 181 vector control workers who were working in district health offices in a state in Malaysia. Data were collected using a self-administered questionnaire and audiometry. Prevalence of NIHL was 26% among this group of workers. NIHL was significantly associated with the age-group of 40 years and older, length of service of 10 or more years, current occupational noise exposure, listening to loud music, history of firearms use, and history of mumps/measles infection. Following logistic regression, age of more than 40 years and noise exposure in current occupation were associated with NIHL with an odds ratio of 3.45 (95% confidence interval = 1.68-7.07) and 6.87 (95% confidence interval = 1.54-30.69), respectively, among this group of vector control workers. © 2012 APJPH.

  17. Integrating vector control across diseases.

    PubMed

    Golding, Nick; Wilson, Anne L; Moyes, Catherine L; Cano, Jorge; Pigott, David M; Velayudhan, Raman; Brooker, Simon J; Smith, David L; Hay, Simon I; Lindsay, Steve W

    2015-10-01

    Vector-borne diseases cause a significant proportion of the overall burden of disease across the globe, accounting for over 10 % of the burden of infectious diseases. Despite the availability of effective interventions for many of these diseases, a lack of resources prevents their effective control. Many existing vector control interventions are known to be effective against multiple diseases, so combining vector control programmes to simultaneously tackle several diseases could offer more cost-effective and therefore sustainable disease reductions. The highly successful cross-disease integration of vaccine and mass drug administration programmes in low-resource settings acts as a precedent for cross-disease vector control. Whilst deliberate implementation of vector control programmes across multiple diseases has yet to be trialled on a large scale, a number of examples of 'accidental' cross-disease vector control suggest the potential of such an approach. Combining contemporary high-resolution global maps of the major vector-borne pathogens enables us to quantify overlap in their distributions and to estimate the populations jointly at risk of multiple diseases. Such an analysis shows that over 80 % of the global population live in regions of the world at risk from one vector-borne disease, and more than half the world's population live in areas where at least two different vector-borne diseases pose a threat to health. Combining information on co-endemicity with an assessment of the overlap of vector control methods effective against these diseases allows us to highlight opportunities for such integration. Malaria, leishmaniasis, lymphatic filariasis, and dengue are prime candidates for combined vector control. All four of these diseases overlap considerably in their distributions and there is a growing body of evidence for the effectiveness of insecticide-treated nets, screens, and curtains for controlling all of their vectors.
The real-world effectiveness of cross-disease vector control programmes can only be evaluated by large-scale trials, but there is clear evidence of the potential of such an approach to enable greater overall health benefit using the limited funds available.

  18. Prediction of the distillation temperatures of crude oils using ¹H NMR and support vector regression with estimated confidence intervals.

    PubMed

    Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J

    2015-09-01

    This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using ¹H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
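
An ensemble-of-SVRs interval estimate can be sketched as follows. Note the assumptions: the data are synthetic stand-ins for NMR-derived features, and bootstrap resampling (bagging) is used here rather than the paper's boosting-type ensemble; the spread of member predictions gives only a rough confidence band around the ensemble mean, not a calibrated predictive interval.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(7)
# synthetic stand-in for NMR-derived features vs a distillation temperature
n = 150
X = rng.normal(size=(n, 5))
y = 300 + 25 * X[:, 0] - 10 * X[:, 1] + rng.normal(0, 5, n)
Xte = rng.normal(size=(40, 5))
yte = 300 + 25 * Xte[:, 0] - 10 * Xte[:, 1] + rng.normal(0, 5, 40)

# ensemble of SVRs fit on bootstrap resamples; the percentile spread of
# member predictions yields a per-sample confidence band
members = []
for _ in range(50):
    idx = rng.integers(0, n, n)  # bootstrap resample with replacement
    members.append(SVR(C=100.0).fit(X[idx], y[idx]).predict(Xte))
members = np.array(members)

mean = members.mean(axis=0)
lo, hi = np.percentile(members, [2.5, 97.5], axis=0)
rmsep = float(np.sqrt(np.mean((mean - yte) ** 2)))
print(f"RMSEP {rmsep:.1f} degC; first test sample band [{lo[0]:.1f}, {hi[0]:.1f}]")
```

Test samples whose true values fall far outside their band are natural outlier candidates, mirroring the outlier-flagging use of the ensemble described in the abstract.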

  19. Serendipity in dark photon searches

    NASA Astrophysics Data System (ADS)

    Ilten, Philip; Soreq, Yotam; Williams, Mike; Xue, Wei

    2018-06-01

    Searches for dark photons provide serendipitous discovery potential for other types of vector particles. We develop a framework for recasting dark photon searches to obtain constraints on more general theories, which includes a data-driven method for determining hadronic decay rates. We demonstrate our approach by deriving constraints on a vector that couples to the B-L current, a leptophobic B boson that couples directly to baryon number and to leptons via B-γ kinetic mixing, and on a vector that mediates a protophobic force. Our approach can easily be generalized to any massive gauge boson with vector couplings to the Standard Model fermions, and software to perform any such recasting is provided at https://gitlab.com/philten/darkcast.

  20. A simple method for construction of artificial microRNA vector in plant.

    PubMed

    Li, Yang; Li, Yang; Zhao, Sunping; Zhong, Sheng; Wang, Zhaohai; Ding, Bo; Li, Yangsheng

    2014-10-01

    Artificial microRNA (amiRNA) is a powerful tool for silencing genes in many plant species. Here we provide a simple method for constructing amiRNA vectors that builds on the Golden Gate cloning approach and features a novel system called top speed amiRNA construction (TAC). This speedy approach completes a single restriction-ligation step in only 5 min, allowing easy and high-throughput vector construction. Three primers are annealed to form a specific adaptor, which is then digested and ligated into our novel vector pTAC. Importantly, this method keeps the recombined amiRNA constructs identical to the osa-miR528 precursor except for the desired amiRNA/amiRNA* sequences. Using this method, our results showed the expected decrease in expression of the targeted genes in Nicotiana benthamiana and Oryza sativa.

  1. Expressing Transgenes That Exceed the Packaging Capacity of Adeno-Associated Virus Capsids

    PubMed Central

    Chamberlain, Kyle; Riyad, Jalish Mahmud; Weber, Thomas

    2016-01-01

    Recombinant adeno-associated virus vectors (rAAV) are being explored as gene delivery vehicles for the treatment of various inherited and acquired disorders. rAAVs are attractive vectors for several reasons: wild-type AAVs are nonpathogenic, and rAAVs can trigger long-term transgene expression even in the absence of genome integration—at least in postmitotic tissues. Moreover, rAAVs have a low immunogenic profile, and the various AAV serotypes and variants display broad but distinct tropisms. One limitation of rAAVs is that their genome-packaging capacity is only ∼5 kb. For most applications this is not of major concern because the median human protein size is 375 amino acids. Excluding the ITRs, for a protein of typical length, this allows the incorporation of ∼3.5 kb of DNA for the promoter, polyadenylation sequence, and other regulatory elements into a single AAV vector. Nonetheless, for certain diseases the packaging limit of AAV does not allow the delivery of a full-length therapeutic protein by a single AAV vector. Hence, approaches to overcome this limitation have become an important area of research for AAV gene therapy. Among the most promising approaches to overcome the limitation imposed by the packaging capacity of AAV is the use of dual-vector approaches, whereby a transgene is split across two separate AAV vectors. Coinfection of a cell with these two rAAVs will then—through a variety of mechanisms—result in the transcription of an assembled mRNA that could not be encoded by a single AAV vector because of the DNA packaging limits of AAV. The main purpose of this review is to assess the current literature with respect to dual-AAV-vector design, to highlight the effectiveness of the different methodologies and to briefly discuss future areas of research to improve the efficiency of dual-AAV-vector transduction. PMID:26757051

  2. Progress in malaria vector control.

    PubMed

    Pant, C P; Rishikesh, N; Bang, Y H; Smith, A

    1981-01-01

    Malaria control, except in tropical Africa, will probably continue to be based to a large extent on the use of insecticides for many years. However, the development of resistance to insecticides in the vectors has caused serious difficulties and it is necessary to change the strategy of insecticide use to maximize their efficacy. A thorough knowledge of the ecology and behaviour of each vector species is required before the control strategy can be adapted to different epidemiological situations. The behavioural differences between sibling species have been recognized for several years, but study of this problem has recently been simplified by improved means of identification that involve chromosomal banding patterns and electrophoretic analysis. Behavioural differences have also been associated with certain chromosomal rearrangements.New records of insecticide resistance among anophelines continue to appear and the impact of this on antimalaria operations has been seriously felt in Central America (multi-resistance in Anopheles albimanus), Turkey (A. sacharovi), India and several Asian countries (A. culicifacies and A. stephensi), and some other countries. Work continues on the screening and testing of newer insecticides that can be used as alternatives, but DDT, malathion, temephos, fenitrothion, and propoxur continue to be used as the main insecticides in many malaria control projects. The search for simpler and innovative approaches to insecticide application also continues.Biological control of vectors is receiving increased attention, as it could become an important component of integrated vector control strategies, and most progress has been made with the spore-forming bacterium, serotype H-14 of Bacillus thuringiensis. Larvivorous fish such as Gambusia spp. and Poecilia spp. 
continue to be used in some programmes.Application of environmental management measures, such as source reduction, source elimination, flushing of drainage and irrigation channels, and intermittent irrigation have been re-examined and currently a great deal of interest is being shown in these approaches.There has been limited interest in the genetic control of mosquitos and the phenomenon of refractoriness in some strains of the disease vectors, with the idea of replacing the vector species with the refractory strain. More research is needed before this approach can become a practical tool.It is apparent that in future a more integrated approach will have to be used for vector control within the context of antimalaria programmes. Training of staff, research, and cooperation at all levels will be an essential requirement for this approach.

  3. Development of ocean color algorithms for estimating chlorophyll-a concentrations and inherent optical properties using gene expression programming (GEP).

    PubMed

    Chang, Chih-Hua

    2015-03-09

    This paper proposes new inversion algorithms for the estimation of chlorophyll-a concentration (Chla) and the ocean's inherent optical properties (IOPs) from measurements of remote sensing reflectance (Rrs). With in situ data from the NASA bio-optical marine algorithm data set (NOMAD), inversion algorithms were developed by the novel gene expression programming (GEP) approach, which creates, manipulates and selects the most appropriate tree-structured functions based on evolutionary computing. The limitations and validity of the proposed algorithms are evaluated with simulated Rrs spectra with respect to NOMAD, and by a closure test for IOPs obtained at a single reference wavelength. The application of GEP-derived algorithms is validated against in situ, synthetic and satellite match-up data sets compiled by NASA and the International Ocean-Colour Coordinating Group (IOCCG). The new algorithms provide Chla and IOP retrievals comparable to those derived by other state-of-the-art regression approaches and to those obtained with the semi- and quasi-analytical algorithms, respectively. In practice, there are no significant differences among the GEP, support vector regression, and multilayer perceptron models in terms of overall performance. The GEP-derived algorithms were successfully applied to images taken by the Sea-viewing Wide Field-of-view Sensor (SeaWiFS), generating Chla and IOP maps that show better details of developing algal blooms and give more information on the distribution of water constituents between different water bodies.

  4. Improved Measurement of Blood Pressure by Extraction of Characteristic Features from the Cuff Oscillometric Waveform

    PubMed Central

    Lim, Pooi Khoon; Ng, Siew-Cheok; Jassim, Wissam A.; Redmond, Stephen J.; Zilany, Mohammad; Avolio, Alberto; Lim, Einly; Tan, Maw Pin; Lovell, Nigel H.

    2015-01-01

    We present a novel approach to improve the estimation of systolic (SBP) and diastolic blood pressure (DBP) from oscillometric waveform data using variable characteristic ratios between SBP and DBP with mean arterial pressure (MAP). This was verified in 25 healthy subjects, aged 28 ± 5 years. Multiple linear regression (MLR) and support vector regression (SVR) models were used to examine the relationship between the SBP and DBP ratios and ten features extracted from the oscillometric waveform envelope (OWE). An automatic algorithm based on relative changes in the cuff pressure and neighbouring oscillometric pulses was proposed to remove outlier points caused by movement artifacts. Substantial reductions in the mean and standard deviation of the blood pressure estimation errors were obtained upon artifact removal. Using the sequential forward floating selection (SFFS) approach, we were able to achieve a significant reduction in the mean and standard deviation of differences between the estimated SBP values and the reference scoring (MLR: mean ± SD = −0.3 ± 5.8 mmHg; SVR: −0.6 ± 5.4 mmHg) with only two features, i.e., Ratio2 and Area3, as compared to the conventional maximum amplitude algorithm (MAA) method (mean ± SD = −1.6 ± 8.6 mmHg). Comparing the performance of both models, our results showed that the MLR model was able to achieve performance comparable to that of the SVR model despite its simplicity. PMID:26087370
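For context, the conventional maximum amplitude algorithm (MAA) that this work improves on can be sketched as follows. The fixed characteristic ratios (0.55 systolic, 0.85 diastolic) and the synthetic Gaussian envelope are illustrative assumptions; the abstract's point is precisely that such fixed ratios should be replaced by variable, feature-driven ones.

```python
import numpy as np

def maa_blood_pressure(cuff_p, env, sbp_ratio=0.55, dbp_ratio=0.85):
    """Maximum amplitude algorithm (MAA): MAP is the cuff pressure at the peak
    of the oscillation envelope; SBP and DBP are where the envelope equals
    fixed characteristic ratios of that peak, on either side of it.
    Arrays are ordered by deflation, i.e. decreasing cuff pressure."""
    i = int(np.argmax(env))
    peak = env[i]
    map_ = cuff_p[i]
    # Systolic side: cuff pressure above MAP, envelope rising toward the peak.
    sbp = np.interp(sbp_ratio * peak, env[:i + 1], cuff_p[:i + 1])
    # Diastolic side: envelope falling after the peak; reverse for np.interp.
    dbp = np.interp(dbp_ratio * peak, env[i:][::-1], cuff_p[i:][::-1])
    return sbp, map_, dbp

# Synthetic deflation: cuff from 180 to 40 mmHg, Gaussian envelope peaking
# at 95 mmHg (the "true" MAP of this toy example).
cuff = np.linspace(180.0, 40.0, 300)
env = np.exp(-((cuff - 95.0) ** 2) / (2 * 20.0 ** 2))
sbp, map_, dbp = maa_blood_pressure(cuff, env)
```

The paper's approach would instead predict the two ratios per subject from OWE features via MLR or SVR before reading off SBP and DBP.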

  5. Effects of vector backbone and pseudotype on lentiviral vector-mediated gene transfer: studies in infant ADA-deficient mice and rhesus monkeys.

    PubMed

    Carbonaro Sarracino, Denise; Tarantal, Alice F; Lee, C Chang I; Martinez, Michele; Jin, Xiangyang; Wang, Xiaoyan; Hardee, Cinnamon L; Geiger, Sabine; Kahl, Christoph A; Kohn, Donald B

    2014-10-01

    Systemic delivery of a lentiviral vector carrying a therapeutic gene represents a new treatment for monogenic disease. Previously, we have shown that transfer of the adenosine deaminase (ADA) cDNA in vivo rescues the lethal phenotype and reconstitutes immune function in ADA-deficient mice. In order to translate this approach to ADA-deficient severe combined immune deficiency patients, neonatal ADA-deficient mice and newborn rhesus monkeys were treated with species-matched and mismatched vectors and pseudotypes. We compared gene delivery by the HIV-1-based vector to murine γ-retroviral vectors pseudotyped with vesicular stomatitis virus-glycoprotein or murine retroviral envelopes in ADA-deficient mice. The vesicular stomatitis virus-glycoprotein pseudotyped lentiviral vectors had the highest titer and resulted in the highest vector copy number in multiple tissues, particularly liver and lung. In monkeys, HIV-1 or simian immunodeficiency virus vectors resulted in similar biodistribution in most tissues including bone marrow, spleen, liver, and lung. Simian immunodeficiency virus pseudotyped with the gibbon ape leukemia virus envelope produced 10- to 30-fold lower titers than the vesicular stomatitis virus-glycoprotein pseudotype, but had a similar tissue biodistribution and similar copy number in blood cells. The relative copy numbers achieved in mice and monkeys were similar when adjusted to the administered dose per kg. These results suggest that this approach can be scaled-up to clinical levels for treatment of ADA-deficient severe combined immune deficiency subjects with suboptimal hematopoietic stem cell transplantation options.

  6. Alternatives to the stochastic "noise vector" approach

    NASA Astrophysics Data System (ADS)

    de Forcrand, Philippe; Jäger, Benjamin

    2018-03-01

    Several important observables, like the quark condensate and the Taylor coefficients of the expansion of the QCD pressure with respect to the chemical potential, are based on the trace of the inverse Dirac operator and of its powers. Such traces are traditionally estimated with "noise vectors" sandwiching the operator. We explore alternative approaches based on polynomial approximations of the inverse Dirac operator.
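The "noise vector" technique in question is the stochastic (Hutchinson-type) trace estimator. A toy sketch, with a small dense Hermitian positive-definite matrix standing in for the lattice Dirac operator (the matrix, the Z2 noise and the direct solve are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Small dense SPD stand-in for a lattice Dirac-type operator.
n = 50
M = rng.normal(size=(n, n))
A = M @ M.T + n * np.eye(n)

def hutchinson_trace_inverse(A, n_vectors=200, seed=0):
    """Stochastic 'noise vector' estimate of tr(A^{-1}): average of
    eta^T A^{-1} eta over random Z2 (+/-1) noise vectors eta."""
    noise = np.random.default_rng(seed)
    dim = A.shape[0]
    total = 0.0
    for _ in range(n_vectors):
        eta = noise.choice([-1.0, 1.0], size=dim)
        total += eta @ np.linalg.solve(A, eta)   # eta^T A^{-1} eta
    return total / n_vectors

est = hutchinson_trace_inverse(A)
exact = float(np.trace(np.linalg.inv(A)))
```

The polynomial-approximation alternatives explored in the paper would replace the solve by a truncated polynomial in A applied to basis vectors, trading stochastic noise for a controlled truncation error.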

  7. Sentence alignment using feed forward neural network.

    PubMed

    Fattah, Mohamed Abdel; Ren, Fuji; Kuroiwa, Shingo

    2006-12-01

    Parallel corpora have become an essential resource for work in multilingual natural language processing. However, sentence-aligned parallel corpora are more efficient than non-aligned parallel corpora for cross-language information retrieval and machine translation applications. In this paper, we present a new approach to aligning sentences in bilingual parallel corpora based on a feed-forward neural network classifier. A feature parameter vector is extracted from the text pair under consideration. This vector contains text features such as length, punctuation score, and cognate score values. A set of manually prepared training data was used to train the feed-forward neural network, and another set of data was used for testing. Using this new approach, we achieved an error reduction of 60% over the length-based approach when applied to English-Arabic parallel documents. Moreover, this approach is valid for any language pair and is quite flexible, since the feature parameter vector may contain more, fewer, or different features than those used in our system.
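A toy version of such a feature parameter vector might look like the following; the specific formulas (length ratio, punctuation agreement, token-overlap "cognate" score) are this sketch's assumptions, not the paper's exact features:

```python
import re

def sentence_pair_features(src, tgt):
    """Toy feature vector for a candidate sentence pair: length ratio,
    punctuation-count agreement, and a crude cognate score based on
    shared tokens (numbers and names often survive translation)."""
    len_ratio = min(len(src), len(tgt)) / max(len(src), len(tgt))
    punct = lambda s: len(re.findall(r"[.,;:!?]", s))
    p_src, p_tgt = punct(src), punct(tgt)
    punct_score = 1.0 / (1 + abs(p_src - p_tgt))
    src_tok = set(re.findall(r"\w+", src.lower()))
    tgt_tok = set(re.findall(r"\w+", tgt.lower()))
    cognate = len(src_tok & tgt_tok) / max(1, len(src_tok | tgt_tok))
    return [len_ratio, punct_score, cognate]

features = sentence_pair_features("The meeting starts at 10.",
                                  "La réunion commence à 10.")
```

Vectors like this, one per candidate pair, would then be fed to the feed-forward classifier to decide aligned vs. not aligned.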

  8. Link-Based Similarity Measures Using Reachability Vectors

    PubMed Central

    Yoon, Seok-Ho; Kim, Ji-Soo; Ryu, Minsoo; Choi, Ho-Jin

    2014-01-01

    We present a novel approach for computing link-based similarities among objects accurately by utilizing the link information pertaining to the objects involved. We discuss the problems with previous link-based similarity measures and propose a novel approach for computing link based similarities that does not suffer from these problems. In the proposed approach each target object is represented by a vector. Each element of the vector corresponds to all the objects in the given data, and the value of each element denotes the weight for the corresponding object. As for this weight value, we propose to utilize the probability of reaching from the target object to the specific object, computed using the “Random Walk with Restart” strategy. Then, we define the similarity between two objects as the cosine similarity of the two vectors. In this paper, we provide examples to show that our approach does not suffer from the aforementioned problems. We also evaluate the performance of the proposed methods in comparison with existing link-based measures, qualitatively and quantitatively, with respect to two kinds of data sets, scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures. PMID:24701188
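The construction described above — a reachability vector per object via Random Walk with Restart, then cosine similarity between vectors — can be sketched as follows (the toy graph, restart probability and iteration count are illustrative assumptions):

```python
import numpy as np

def rwr_vector(P, start, restart=0.15, n_iter=100):
    """Random Walk with Restart: visiting-probability vector from a start
    node over a column-stochastic transition matrix P."""
    n = P.shape[0]
    e = np.zeros(n)
    e[start] = 1.0
    r = e.copy()
    for _ in range(n_iter):
        r = (1 - restart) * P @ r + restart * e   # power iteration
    return r

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy link graph: symmetric adjacency, column-normalised to transitions.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
P = A / A.sum(axis=0, keepdims=True)

v0 = rwr_vector(P, 0)   # reachability vector of object 0
v1 = rwr_vector(P, 1)   # reachability vector of object 1
sim = cosine(v0, v1)    # link-based similarity of objects 0 and 1
```

Because each element of a reachability vector carries a weight for every object in the data, the cosine of two such vectors compares the objects' whole link neighbourhoods rather than only their direct links.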

  9. On estimating gravity anomalies - A comparison of least squares collocation with conventional least squares techniques

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Lowrey, B.

    1977-01-01

    The least squares collocation algorithm for estimating gravity anomalies from geodetic data is shown to be an application of the well known regression equations which provide the mean and covariance of a random vector (gravity anomalies) given a realization of a correlated random vector (geodetic data). It is also shown that the collocation solution for gravity anomalies is equivalent to the conventional least-squares-Stokes' function solution when the conventional solution utilizes properly weighted zero a priori estimates. The mathematical and physical assumptions underlying the least squares collocation estimator are described.
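In standard notation (the symbols here are assumed for illustration, not taken from the report): if the gravity anomalies $s$ and the geodetic data $t$ are jointly distributed with means $\bar{s}, \bar{t}$ and covariances $C_{ss}$, $C_{tt}$, $C_{st}$, the collocation estimate and its error covariance are

```latex
\hat{s} = \bar{s} + C_{st}\, C_{tt}^{-1}\,\bigl(t - \bar{t}\bigr),
\qquad
E = C_{ss} - C_{st}\, C_{tt}^{-1}\, C_{ts},
```

i.e., the conditional mean and covariance of $s$ given a realization of $t$; with properly weighted zero a priori estimates ($\bar{s} = 0$, $\bar{t} = 0$), this reduces to the conventional least-squares form noted in the abstract.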

  10. Support Vector Machine algorithm for regression and classification

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yu, Chenggang; Zavaljevski, Nela

    2001-08-01

    The software is an implementation of the Support Vector Machine (SVM) algorithm that was invented and developed by Vladimir Vapnik and his co-workers at AT&T Bell Laboratories. The specific implementation reported here is an Active Set method for solving the quadratic optimization problem that forms the major part of any SVM program. The implementation is tuned to the specific constraints generated in SVM learning, and is thus more efficient than general-purpose quadratic optimization programs. A decomposition method has been implemented in the software that enables processing of large data sets; the size of the learning data is virtually unlimited by the capacity of the computer's physical memory. The software is flexible and extensible. Two upper bounds are implemented to regulate SVM learning for classification, which allow users to adjust the false positive and false negative rates. The software can be used either as a standalone, general-purpose SVM regression or classification program, or be embedded into a larger software system.

  11. Boosted Regression Trees Outperforms Support Vector Machines in Predicting (Regional) Yields of Winter Wheat from Single and Cumulated Dekadal Spot-VGT Derived Normalized Difference Vegetation Indices

    NASA Astrophysics Data System (ADS)

    Stas, Michiel; Dong, Qinghan; Heremans, Stien; Zhang, Beier; Van Orshoven, Jos

    2016-08-01

    This paper compares two machine learning techniques for predicting regional winter wheat yields. The models, based on Boosted Regression Trees (BRT) and Support Vector Machines (SVM), are constructed from Normalized Difference Vegetation Indices (NDVI) derived from low-resolution SPOT VEGETATION satellite imagery. Three types of NDVI-related predictors were used: single NDVI, incremental NDVI and targeted NDVI. BRT and SVM were first used to select features with high relevance for predicting yield. Although the exact selections differed between the prefectures, certain periods with high influence scores for multiple prefectures could be identified, and the same period of high influence, stretching from March to June, was detected by both machine learning methods. After feature selection, BRT and SVM models were applied to the subset of selected features for actual yield forecasting. Whereas both machine learning methods returned very low prediction errors, BRT seems to slightly but consistently outperform SVM.

  12. Predicting pork loin intramuscular fat using computer vision system.

    PubMed

    Liu, J-H; Sun, X; Young, J M; Bachmeier, L A; Newman, D J

    2018-09-01

    The objective of this study was to investigate the ability of a computer vision system to predict pork intramuscular fat percentage (IMF%). Center-cut loin samples (n = 85) were trimmed of subcutaneous fat and connective tissue. Images were acquired, and pixels were segregated to estimate image IMF% and 18 image color features for each image. Subjective IMF% was determined by a trained grader, and ether extract IMF% was calculated using the ether extract method. Image color features and image IMF% were used as predictors for stepwise regression and support vector machine models. Results showed that subjective IMF% had a correlation of 0.81 with ether extract IMF%, while image IMF% had a 0.66 correlation with ether extract IMF%. Accuracy rates for the regression models were 0.63 for stepwise and 0.75 for support vector machine. Although subjective IMF% showed better prediction, the computer vision system results demonstrate its potential as a tool for predicting pork IMF% in the future. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. A hybrid sales forecasting scheme by combining independent component analysis with K-means clustering and support vector regression.

    PubMed

    Lu, Chi-Jie; Chang, Chi-Chang

    2014-01-01

    Sales forecasting plays an important role in operating a business, since it can be used to determine the inventory level required to meet consumer demand and avoid under- or overstocking. Improving the accuracy of sales forecasting has thus become an important issue in operating a business. This study proposes a hybrid sales forecasting scheme that combines independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses ICA to extract hidden information from the observed sales data. The extracted features are then passed to the K-means algorithm, which clusters the sales data into several disjoint clusters. Finally, SVR forecasting models are applied to each group to generate the final forecasting results. Experimental results on information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
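The three-stage scheme (ICA feature extraction, K-means clustering, then one SVR per cluster) can be sketched with scikit-learn. The synthetic data and all hyperparameters are assumptions of this illustration, and a real forecast would apply the per-cluster models to held-out data rather than in-sample as here:

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.cluster import KMeans
from sklearn.svm import SVR

rng = np.random.default_rng(2)

# Synthetic stand-in for sales series arranged as (samples, lagged features).
X = rng.normal(size=(200, 6))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

# 1) ICA extracts hidden components from the observed data.
S = FastICA(n_components=3, random_state=0).fit_transform(X)

# 2) K-means clusters the samples in the ICA feature space.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(S)

# 3) One SVR per cluster produces the final forecasts.
preds = np.empty_like(y)
for k in np.unique(labels):
    mask = labels == k
    model = SVR(kernel="rbf", C=10.0).fit(X[mask], y[mask])
    preds[mask] = model.predict(X[mask])
```

The intuition of the hybrid scheme is that each cluster groups samples with similar hidden structure, so each SVR fits a simpler, more homogeneous regression problem.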

  14. A Hybrid Sales Forecasting Scheme by Combining Independent Component Analysis with K-Means Clustering and Support Vector Regression

    PubMed Central

    2014-01-01

    Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738

  15. [Extraction Optimization of Rhizome of Curcuma longa by Response Surface Methodology and Support Vector Regression].

    PubMed

    Zhou, Pei-pei; Shan, Jin-feng; Jiang, Jian-lan

    2015-12-01

    To optimize the microwave-assisted extraction of curcuminoids from Curcuma longa. On the basis of single-factor experiments, the ethanol concentration, the liquid-to-solid ratio and the microwave time were selected for further optimization. Support Vector Regression (SVR) and Central Composite Design-Response Surface Methodology (CCD) were used to design the experiments and establish models, respectively, while Particle Swarm Optimization (PSO) was introduced to optimize the parameters of the SVR model and to search for the optimal points of both models. The sum of curcumin, demethoxycurcumin and bisdemethoxycurcumin, determined by HPLC, was used as the evaluation indicator. The optimal microwave-assisted extraction parameters were as follows: ethanol concentration of 69%, liquid-to-solid ratio of 21:1, and microwave time of 55 s. Under those conditions, the sum of the three curcuminoids was 28.97 mg/g (per gram of rhizome powder). Both the CCD model and the SVR model were credible, as they predicted similar process conditions and the deviation in yield was less than 1.2%.

  16. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    NASA Astrophysics Data System (ADS)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and the M5 model tree (M5T), are applied to model daily dissolved oxygen (DO) concentration using several water quality variables as inputs. DO concentration and water quality data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The selected water quality data consisted of daily measurements of water temperature (TE, °C), pH (std. units), specific conductance (SC, μS/cm) and discharge (DI, cfs), which were used as inputs to the LSSVM, MARS and M5T models. The three models were applied to each station separately and compared to each other. According to the results obtained, (i) the DO concentration could be successfully estimated using the three models, and (ii) the best-performing model differed from one station to another.
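Since LSSVM, MARS and M5 model trees have no canonical scikit-learn implementations, a rough sketch of the same comparison can use SVR and a CART regression tree as stand-ins; the synthetic DO relation below (oxygen solubility falling with temperature) and every hyperparameter are assumptions of this illustration:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(3)

# Synthetic stand-ins for the four inputs: TE, pH, SC, DI.
n = 300
TE = rng.uniform(0.0, 30.0, n)       # water temperature, °C
pH = rng.uniform(6.5, 8.5, n)        # pH, std. units
SC = rng.uniform(100.0, 800.0, n)    # specific conductance, µS/cm
DI = rng.uniform(1.0, 100.0, n)      # discharge, cfs
# Toy DO response: solubility falls with temperature; pH adds mild structure.
DO = 14.6 - 0.35 * TE + 0.5 * (pH - 7.0) + rng.normal(scale=0.3, size=n)

X = np.column_stack([TE, pH, SC, DI])
X_tr, X_te, y_tr, y_te = X[:240], X[240:], DO[:240], DO[240:]

models = {
    "SVR (stand-in for LSSVM)": make_pipeline(StandardScaler(), SVR(C=10.0)),
    "tree (stand-in for M5T)": DecisionTreeRegressor(max_depth=5, random_state=0),
}
rmse = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    err = model.predict(X_te) - y_te
    rmse[name] = float(np.sqrt(np.mean(err ** 2)))  # station-wise RMSE
```

Repeating this per station, as the study does, would show whether the ranking of the models is stable across sites.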

  17. Intelligent Optimization of the Film-to-Fiber Ratio of a Degradable Braided Bicomponent Ureteral Stent

    PubMed Central

    Liu, Xiaoyan; Li, Feng; Ding, Yongsheng; Zou, Ting; Wang, Lu; Hao, Kuangrong

    2015-01-01

    A hierarchical support vector regression model (HSVRM) was employed for the first time to correlate the compositions and mechanical properties of bicomponent stents composed of poly(lactic-co-glycolic acid) (PGLA) film and poly(glycolic acid) (PGA) fibers for ureteral repair. The PGLA film and PGA fibers provide ureteral stents with good compressive and tensile properties, respectively. In bicomponent stents, high film content led to high stiffness, while high fiber content resulted in poor compressional properties. To simplify the procedure for optimizing the ratio of PGLA film to PGA fiber in the stents, the HSVRM and a particle swarm optimization (PSO) algorithm were used to construct relationships between the film-to-fiber weight ratio and the measured compressional/tensile properties of the stents. The experimental and simulated data fit well, proving that the HSVRM can closely reflect the relationship between the component ratio and the performance properties of the ureteral stents. PMID:28793658

  18. Vector-model-supported approach in prostate plan optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Eva Sau Fan; Department of Health Technology and Informatics, The Hong Kong Polytechnic University; Wu, Vincent Wing Cheung

    The lengthy time consumed by traditional manual plan optimization can limit the use of step-and-shoot intensity-modulated radiotherapy/volumetric-modulated radiotherapy (S&S IMRT/VMAT). A vector-model-based system for retrieving similar radiotherapy cases was developed, using the structural and physiologic features extracted from the Digital Imaging and Communications in Medicine (DICOM) files. Planning parameters were retrieved from the selected similar reference case and applied to the test case to bypass the gradual adjustment of planning parameters, so the planning time spent on the traditional trial-and-error manual optimization at the beginning of optimization could be reduced. Each S&S IMRT/VMAT prostate reference database comprised 100 previously treated cases. Prostate cases were replanned with both traditional optimization and vector-model-supported optimization based on the oncologists' clinical dose prescriptions. A total of 360 plans, consisting of 30 S&S IMRT, 30 1-arc VMAT, and 30 2-arc VMAT cases, each with first and final optimization with and without vector-model-supported optimization, were compared using the 2-sided t-test and paired Wilcoxon signed rank test, with a significance level of 0.05 and a false discovery rate of less than 0.05. For S&S IMRT, 1-arc VMAT, and 2-arc VMAT prostate plans, vector-model-supported optimization reduced the planning time and iteration count by almost 50%. When the first optimization plans were compared, 2-arc VMAT prostate plans had better plan quality than 1-arc VMAT plans, and the femoral-head volume receiving 35 Gy in 2-arc VMAT plans was reduced with vector-model-supported optimization compared with the traditional manual optimization approach. Otherwise, the quality of plans from both approaches was comparable. Vector-model-supported optimization was shown to offer a much shortened planning time and iteration number without compromising plan quality.

  19. Geographical distribution of reference value of aging people's left ventricular end systolic diameter based on the support vector regression.

    PubMed

    Han, Xiao; Ge, Miao; Dong, Jie; Xue, Ranying; Wang, Zixuan; He, Jinwei

    2014-09-01

    The aim of this paper is to analyze the geographical distribution of the reference value of aging people's left ventricular end-systolic diameter (LVDs), and to provide a scientific basis for clinical examination. The study focuses on the relationship between the LVDs reference value of aging people and 14 geographical factors, using 2495 samples of LVDs of aging people from 71 units across China, comprising 1620 men and 875 women. Moran's I index was used to establish the relationship between the reference values and spatial geographical factors; five geographical factors significantly correlated with LVDs were extracted to build the support vector regression model; a paired-sample t-test was used to confirm the consistency between predicted and measured values; and finally, a distribution map was made through disjunctive kriging interpolation and the three-dimensional trend of the normal reference value was fitted. The correlation between the extracted geographical factors and the LVDs reference value is quite significant; the five indexes are latitude, annual mean air temperature, annual mean relative humidity, annual precipitation amount, and annual range of air temperature. The predicted and observed values are in good conformity, with no significant difference at the 95% confidence level. The overall trend of the predicted values increases from west to east, and first increases and then decreases from north to south. If the geographical values for a region are available, the LVDs reference value of aging people in that region can be obtained using the support vector regression model. It could be more scientific to formulate the different distributions on the basis of synthesizing the physiological and geographical factors. 
    Highlights: Moran's index was used to analyze the spatial correlation; support vector machines were chosen to build a model that overcomes the complexity of the variables; the normal distribution of the predicted data was tested to guarantee the interpolation results; and trend analysis explains the changes in the reference value clearly. Copyright © 2014 Elsevier Inc. All rights reserved.

  20. Host Life History Strategy, Species Diversity, and Habitat Influence Trypanosoma cruzi Vector Infection in Changing Landscapes

    PubMed Central

    Gottdenker, Nicole L.; Chaves, Luis Fernando; Calzada, José E.; Saldaña, Azael; Carroll, C. Ronald

    2012-01-01

    Background Anthropogenic land use may influence transmission of multi-host vector-borne pathogens by changing diversity, relative abundance, and community composition of reservoir hosts. These reservoir hosts may have varying competence for vector-borne pathogens depending on species-specific characteristics, such as life history strategy. The objective of this study is to evaluate how anthropogenic land use change influences blood meal species composition and the effects of changing blood meal species composition on the parasite infection rate of the Chagas disease vector Rhodnius pallescens in Panama. Methodology/Principal Findings R. pallescens vectors (N = 643) were collected in different habitat types across a gradient of anthropogenic disturbance. Blood meal species in DNA extracted from these vectors was identified in 243 (40.3%) vectors by amplification and sequencing of a vertebrate-specific fragment of the 12SrRNA gene, and T. cruzi vector infection was determined by pcr. Vector infection rate was significantly greater in deforested habitats as compared to contiguous forests. Forty-two different species of blood meal were identified in R. pallescens, and species composition of blood meals varied across habitat types. Mammals (88.3%) dominated R. pallescens blood meals. Xenarthrans (sloths and tamanduas) were the most frequently identified species in blood meals across all habitat types. A regression tree analysis indicated that blood meal species diversity, host life history strategy (measured as rmax, the maximum intrinsic rate of population increase), and habitat type (forest fragments and peridomiciliary sites) were important determinants of vector infection with T. cruzi. The mean intrinsic rate of increase and the skewness and variability of rmax were positively associated with higher vector infection rate at a site. Conclusions/Significance In this study, anthropogenic landscape disturbance increased vector infection with T. 
cruzi, potentially by changing host community structure to favor hosts that are short-lived with high reproductive rates. Study results apply to potential environmental management strategies for Chagas disease. PMID:23166846
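
    The regression-tree analysis described above can be sketched as follows. This is a hedged illustration only: the site-level data are synthetic placeholders (the study's dataset is not reproduced here), and the predictor effects are invented for the example.

```python
# Sketch of a regression-tree analysis of site-level vector infection,
# in the spirit of the study. All data below are synthetic/invented.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_sites = 40
# Hypothetical site predictors: blood meal diversity, mean host rmax,
# and habitat (0 = forest, 1 = deforested)
diversity = rng.uniform(1, 10, n_sites)
mean_rmax = rng.uniform(0.1, 2.0, n_sites)
habitat = rng.integers(0, 2, n_sites)
# Toy response: infection rate rises with rmax and deforestation
# (an assumed relationship, chosen only to make the tree interpretable)
infection = 0.1 + 0.2 * mean_rmax + 0.15 * habitat + rng.normal(0, 0.02, n_sites)

X = np.column_stack([diversity, mean_rmax, habitat])
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, infection)
importances = tree.feature_importances_  # which predictors drive the splits
```

    In a real analysis the feature importances and split thresholds would identify which host-community characteristics best separate high- from low-infection sites.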

  1. A multi-layered mechanistic modelling approach to understand how effector genes extend beyond phytoplasma to modulate plant hosts, insect vectors and the environment.

    PubMed

    Tomkins, Melissa; Kliot, Adi; Marée, Athanasius Fm; Hogenhout, Saskia A

    2018-03-13

Members of the Candidatus genus Phytoplasma are small bacterial pathogens that hijack their plant hosts via the secretion of virulence proteins (effectors) leading to a fascinating array of plant phenotypes, such as witches' brooms (stem proliferations) and phyllody (retrograde development of flowers into vegetative tissues). Phytoplasma depend on insect vectors for transmission, and interestingly, these insect vectors were found to be (in)directly attracted to plants with these phenotypes. Therefore, phytoplasma effectors appear to reprogram plant development and defence to lure insect vectors, similarly to social engineering malware, which employs tricks to lure people to infected computers and webpages. A multi-layered mechanistic modelling approach will enable a better understanding of how phytoplasma effector-mediated modulations of plant host development and insect vector behaviour contribute to phytoplasma spread, and ultimately to predict the long reach of phytoplasma effector genes. Copyright © 2018. Published by Elsevier Ltd.

  2. Earth Observation and Indicators Pertaining to Determinants of Health- An Approach to Support Local Scale Characterization of Environmental Determinants of Vector-Borne Diseases

    NASA Astrophysics Data System (ADS)

    Kotchi, Serge Olivier; Brazeau, Stephanie; Ludwig, Antoinette; Aube, Guy; Berthiaume, Philippe

    2016-08-01

    Environmental determinants (EVDs) have been identified as key determinants of health (DoH) for the emergence and re-emergence of several vector-borne diseases. Maintaining ongoing acquisition of data related to EVDs at local scale and for large regions constitutes a significant challenge. Earth observation (EO) satellites offer a framework to overcome this challenge. However, EO image analysis methods commonly used to estimate EVDs are time and resource consuming. Moreover, variations of microclimatic conditions combined with high landscape heterogeneity limit the effectiveness of climatic variables derived from EO. In this study, we describe DoH and EVDs, the impacts of EVDs on vector-borne diseases in the context of global environmental change, and the need to characterize EVDs of vector-borne diseases at local scale together with its challenges; finally, we propose an approach based on EO images to estimate, at local scale, indicators pertaining to EVDs of vector-borne diseases.

  3. Vector Blood Meals Are an Early Indicator of the Effectiveness of the Ecohealth Approach in Halting Chagas Transmission in Guatemala

    PubMed Central

    Pellecer, Mariele J.; Dorn, Patricia L.; Bustamante, Dulce M.; Rodas, Antonieta; Monroy, M. Carlota

    2013-01-01

    A novel method using vector blood meal sources to assess the impact of control efforts on the risk of transmission of Chagas disease was tested in the village of El Tule, Jutiapa, Guatemala. Control used Ecohealth interventions, where villagers ameliorated the factors identified as most important for transmission. First, after an initial insecticide application, house walls were plastered. Later, bedroom floors were improved and domestic animals were moved outdoors. Only vector blood meal sources revealed the success of the first interventions: human blood meals declined from 38% to 3% after insecticide application and wall plastering. Following all interventions both vector blood meal sources and entomological indices revealed the reduction in transmission risk. These results indicate that vector blood meals may reveal effects of control efforts early on, effects that may not be apparent using traditional entomological indices, and provide further support for the Ecohealth approach to Chagas control in Guatemala. PMID:23382165

  4. SMOS salinity retrieval by using Support Vector Regression (SVR)

    NASA Astrophysics Data System (ADS)

    Katagis, Thomas; Fernández-Prieto, Diego; Marconcini, Mattia; Sabia, Roberto; Martinez, Justino

    2013-04-01

    The Soil Moisture and Ocean Salinity (SMOS) mission was launched in November 2009 within the framework of the European Space Agency (ESA) Living Planet programme. Over the oceans, it aims at providing Sea Surface Salinity (SSS) maps with spatial and temporal coverage adequate for large scale oceanography. A comprehensive inversion scheme has been defined and implemented in the operational retrieval chain to allow proper SSS estimates in a single satellite overpass (L2 product) from the multi-angular brightness temperatures (TBs) measured by SMOS. This SMOS operational L2 salinity processor minimizes the difference between the measured and modeled TBs, including additional constraints on Sea Surface Temperature (SST) and wind speed auxiliary fields. In particular, by adopting a maximum-likelihood Bayesian approach, the inversion scheme retrieves salinity under an iterative convergence loop. However, although the implemented iterative technique is well established and robust, it is still prone to limitations; for instance, the presence of local minima in the cost function cannot be excluded. Moreover, previous studies have demonstrated that the background and observational terms of the cost function are not properly balanced and this is likely to introduce errors in the retrieval procedure. To overcome such potential drawbacks, this study proposes a novel approach to SSS estimation based on the ɛ-insensitive Support Vector Regression (SVR), where both SMOS L1 measurements and auxiliary parameters are used as input. The SVR technique has already proved capable of high generalization and robustness in a variety of different applications, with a limited complexity in handling the learning phase. Notably, instead of minimizing the observed training error, it attempts to minimize the generalization error bound so as to achieve generalized performance. 
For this purpose, the original input domain is mapped into a higher dimensionality space (where the function underlying the data is supposed to have increased flatness) and linear regression is performed. The SVR training is performed using suitable in situ SSS data (i.e., ARGO buoys data) collected in a representative region of the ocean. So far, in situ data coming from a match-up ARGO database in November 2010 over the South Pacific constitute the preliminary benchmark of the study. Ongoing activities point at extending this spatial and temporal frame to assess the robustness of the method. The in situ data have been collocated with SMOS TB measurements and additional parameters (e.g., SST and wind speed) in the learning phase of the SVR under various training/testing configurations. Afterwards, the SSS regression has been performed out of the SMOS TBs or emissivities. Estimated SVR salinity fields are in general (very) well correlated with ARGO data. The analysis of the different impact of the various features has been performed once a rigorous data filtering/flagging is applied, and misfit (SSSSVR-SSSARGO) statistics have been computed. For assessing the effectiveness of the proposed method, final results will be compared to those obtained using the official SMOS SSS retrieval algorithm.
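
    The ɛ-insensitive SVR retrieval described above can be sketched as follows. This is a hedged toy version: the brightness temperatures, auxiliary fields, and the forward relation to salinity are synthetic stand-ins for the SMOS/ARGO match-up data, and the kernel hyperparameters are illustrative, not the study's.

```python
# Sketch of epsilon-insensitive SVR for salinity retrieval on synthetic data.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(42)
n = 300
tb = rng.normal(100, 5, (n, 4))    # toy multi-angular brightness temperatures (K)
sst = rng.uniform(15, 30, (n, 1))  # toy sea surface temperature (deg C)
wind = rng.uniform(0, 15, (n, 1))  # toy wind speed (m/s)
X = np.hstack([tb, sst, wind])
# Invented forward relation: SSS depends weakly on TBs and wind, plus noise
sss = 35 + 0.05 * (tb.mean(axis=1) - 100) - 0.02 * wind[:, 0] + rng.normal(0, 0.1, n)

# Scale inputs, then fit an RBF SVR; C and epsilon are illustrative choices
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.05))
model.fit(X[:200], sss[:200])
# Misfit statistics analogous to (SSS_SVR - SSS_ARGO) in the abstract
misfit = model.predict(X[200:]) - sss[200:]
```

    The real retrieval would train on collocated ARGO salinities and apply rigorous filtering/flagging before computing misfit statistics.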

  5. Reviewing the connection between speech and obstructive sleep apnea.

    PubMed

    Espinoza-Cuadros, Fernando; Fernández-Pozo, Rubén; Toledano, Doroteo T; Alcázar-Ramírez, José D; López-Gonzalo, Eduardo; Hernández-Gómez, Luis A

    2016-02-20

    Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). The altered UA structure or function in OSA speakers has led to the hypothesis that automatic analysis of speech could support OSA assessment. In this paper we critically review several approaches using speech analysis and machine learning techniques for OSA detection, and discuss the limitations that can arise when using machine learning techniques for diagnostic applications. A large speech database including 426 male Spanish speakers suspected of suffering from OSA and referred to a sleep disorders unit was used to study the clinical validity of several proposals using machine learning techniques to predict the apnea-hypopnea index (AHI) or classify individuals according to their OSA severity. AHI describes the severity of a patient's condition. We first evaluate AHI prediction using state-of-the-art speaker recognition technologies: speech spectral information is modelled using supervector or i-vector techniques, and AHI is predicted through support vector regression (SVR). Using the same database we then critically review several OSA classification approaches previously proposed. The influence and possible interference of other clinical variables or characteristics available for our OSA population (age, height, weight, body mass index, and cervical perimeter) are also studied. The poor results obtained when estimating AHI using supervectors or i-vectors followed by SVR contrast with the positive results reported by previous research. This prompted a careful review of those approaches, including testing some reported results on our database. Several methodological limitations and deficiencies were detected that may have led to overoptimistic results. 
The methodological deficiencies observed after critically reviewing previous research can be relevant examples of potential pitfalls when using machine learning techniques for diagnostic applications. We have found two common limitations that can explain the likelihood of false discovery in previous research: (1) the use of prediction models derived from sources, such as speech, which are also correlated with other patient characteristics (age, height, sex,…) that act as confounding factors; and (2) overfitting of feature selection and validation methods when working with a high number of variables compared to the number of cases. We hope this study could not only be a useful example of relevant issues when using machine learning for medical diagnosis, but it will also help in guiding further research on the connection between speech and OSA.
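
    Limitation (2) above, overfitting of feature selection, can be demonstrated with a small sketch. This is an illustrative experiment on pure-noise data (not the OSA database): selecting features on the full dataset before cross-validation typically inflates accuracy, whereas refitting the selection inside each training fold (via a Pipeline) keeps the estimate honest.

```python
# Sketch of the feature-selection leakage pitfall on synthetic noise data.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 500))   # 60 "patients", 500 pure-noise features
y = rng.integers(0, 2, 60)       # random binary severity labels

# Leaky protocol: selection sees all labels first, then CV is run
X_sel = SelectKBest(f_classif, k=10).fit_transform(X, y)
leaky = cross_val_score(LogisticRegression(max_iter=1000), X_sel, y, cv=5).mean()

# Honest protocol: selection is refit inside every training fold
pipe = make_pipeline(SelectKBest(f_classif, k=10), LogisticRegression(max_iter=1000))
honest = cross_val_score(pipe, X, y, cv=5).mean()
```

    Since the labels are random, honest accuracy should sit near chance (0.5), while the leaky estimate is usually well above it, which is exactly the false-discovery mechanism the review warns about.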

  6. Local gene transfection in the cochlea (Review).

    PubMed

    Xia, Li; Yin, Shankai

    2013-07-01

    There is much interest in the potential application of vector-induced gene therapeutic approaches to several forms of hearing disorders due to the poor efficacy of existing treatments. The cochlea is an ideal site for local gene transfection due to its anatomical encapsulation and fluid flow within its ducts. However, this requires the development of novel technologies in materials science and microbial supply vectors for target gene delivery. This review focuses on the introduction of various viral and non-viral vectors as well as injection approaches to transfecting cochlear cells in vivo. Finally, the prospects for local gene therapy are discussed. Therapeutic approaches using local gene transfection may provide a means of cochlear cell and tissue protection and treatment in cases of exogenous hearing loss and endogenous disorders.

  7. Ensemble predictive model for more accurate soil organic carbon spectroscopic estimation

    NASA Astrophysics Data System (ADS)

    Vašát, Radim; Kodešová, Radka; Borůvka, Luboš

    2017-07-01

    A myriad of signal pre-processing strategies and multivariate calibration techniques have been explored in an attempt to improve the spectroscopic prediction of soil organic carbon (SOC) over the last few decades. Developing a novel, more powerful, and more accurate predictive approach has therefore become a challenging task. One way forward is to combine several individual predictions into a single final one, in line with ensemble learning theory. As this approach performs best when combining inherently different predictive algorithms calibrated with structurally different predictor variables, we tested predictors of two different kinds: 1) reflectance values (or transforms) at each wavelength and 2) absorption feature parameters. Accordingly, we applied four different calibration techniques, two for each type of predictor: a) partial least squares regression and support vector machines for type 1, and b) multiple linear regression and random forest for type 2. The weights to be assigned to individual predictions within the ensemble model (constructed as a weighted average) were determined by an automated procedure that ensured the best solution among all possible was selected. The approach was tested on soil samples taken from the surface horizons of four sites differing in their prevailing soil units. By employing the ensemble predictive model the prediction accuracy of SOC improved at all four sites. The coefficient of determination in cross-validation (R2cv) increased from 0.849, 0.611, 0.811 and 0.644 (the best individual predictions) to 0.864, 0.650, 0.824 and 0.698 for Sites 1, 2, 3 and 4, respectively. Generally, the ensemble model reduced the maximal deviations of predicted vs. observed values relative to the individual predictions, so that the correlation cloud became thinner, as desired.

  8. Genome Investigations of Vector Competence in Aedes aegypti to Inform Novel Arbovirus Disease Control Approaches

    PubMed Central

    Severson, David W.; Behura, Susanta K.

    2016-01-01

    Dengue (DENV), yellow fever, chikungunya, and Zika virus transmission to humans by a mosquito host is confounded by both intrinsic and extrinsic variables. Besides virulence factors of the individual arboviruses, likelihood of virus transmission is subject to variability in the genome of the primary mosquito vector, Aedes aegypti. The “vectorial capacity” of A. aegypti varies depending upon its density, biting rate, and survival rate, as well as its intrinsic ability to acquire, host and transmit a given arbovirus. This intrinsic ability is known as “vector competence”. Based on whole transcriptome analysis, several genes and pathways have been predicted to be associated with a susceptible or refractory response in A. aegypti to DENV infection. However, the functional genomics of vector competence of A. aegypti is not well understood, primarily due to lack of integrative approaches in genomic or transcriptomic studies. In this review, we focus on the present status of genomics studies of DENV vector competence in A. aegypti as limited information is available relative to the other arboviruses. We propose future areas of research needed to facilitate the integration of vector and virus genomics and environmental factors to work towards better understanding of vector competence and vectorial capacity in natural conditions. PMID:27809220

  9. Covariantized vector Galileons

    NASA Astrophysics Data System (ADS)

    Hull, Matthew; Koyama, Kazuya; Tasinato, Gianmassimo

    2016-03-01

    Vector Galileons are ghost-free systems containing higher derivative interactions of vector fields. They break the vector gauge symmetry, and the dynamics of the longitudinal vector polarizations acquire a Galileon symmetry in an appropriate decoupling limit in Minkowski space. Using an Arnowitt-Deser-Misner approach, we carefully reconsider the coupling with gravity of vector Galileons, with the aim of studying the necessary conditions to avoid the propagation of ghosts. We develop arguments that put on a more solid footing the results previously obtained in the literature. Moreover, working in analogy with the scalar counterpart, we find indications for the existence of a "beyond Horndeski" theory involving vector degrees of freedom that avoids the propagation of ghosts thanks to secondary constraints. In addition, we analyze a Higgs mechanism for generating vector Galileons through spontaneous symmetry breaking, and we present its consistent covariantization.

  10. Production of non viral DNA vectors.

    PubMed

    Schleef, Martin; Blaesen, Markus; Schmeer, Marco; Baier, Ruth; Marie, Corinne; Dickson, George; Scherman, Daniel

    2010-12-01

    After some decades of research, development and first clinical approaches to using DNA vectors in gene therapy, cell therapy and DNA vaccination, the requirements for the pharmaceutical manufacturing of gene vectors have advanced significantly, step by step. The expression level and specificity of non-viral DNA vectors have also been significantly improved, following the success of viral vectors. The strict separation of "viral" and "non-viral" gene transfer is a historic border between scientists, and we will show that both fields together enable the next step towards successful prevention and therapy. Here we summarize the features of producing and modifying these non-viral gene vectors to ensure the quality required to modify cells and to treat humans and animals.

  11. An innovative ecohealth intervention for Chagas disease vector control in Yucatan, Mexico.

    PubMed

    Waleckx, Etienne; Camara-Mejia, Javier; Ramirez-Sierra, Maria Jesus; Cruz-Chan, Vladimir; Rosado-Vallado, Miguel; Vazquez-Narvaez, Santos; Najera-Vazquez, Rosario; Gourbière, Sébastien; Dumonteil, Eric

    2015-02-01

    Non-domiciliated (intrusive) triatomine vectors remain a challenge for the sustainability of Chagas disease vector control as these triatomines are able to transiently (re-)infest houses. One of the best-characterized examples is Triatoma dimidiata from the Yucatan peninsula, Mexico, where adult insects seasonally infest houses between March and July. We focused our study on three rural villages in the state of Yucatan, Mexico, in which we performed a situation analysis as a first step before the implementation of an ecohealth (ecosystem approach to health) vector control intervention. The identification of the key determinants affecting the transient invasion of human dwellings by T. dimidiata was performed by exploring associations between bug presence and qualitative and quantitative variables describing the ecological, biological and social context of the communities. We then used a participatory action research approach for implementation and evaluation of a control strategy based on window insect screens to reduce house infestation by T. dimidiata. This ecohealth approach may represent a valuable alternative to vertically-organized insecticide spraying. Further evaluation may confirm that it is sustainable and provides effective control (in the sense of limiting infestation of human dwellings and vector/human contacts) of intrusive triatomines in the region. © The author 2015. The World Health Organization has granted Oxford University Press permission for the reproduction of this article.

  12. Hybrid Support Vector Regression and Autoregressive Integrated Moving Average Models Improved by Particle Swarm Optimization for Property Crime Rates Forecasting with Economic Indicators

    PubMed Central

    Alwee, Razana; Hj Shamsuddin, Siti Mariyam; Sallehuddin, Roselina

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components, and a single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) for crime rate forecasting. SVR is very robust with small training datasets and high-dimensional problems, while ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, and ARIMA is not robust when applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasts than the individual models. PMID:23766729

  13. Hybrid support vector regression and autoregressive integrated moving average models improved by particle swarm optimization for property crime rates forecasting with economic indicators.

    PubMed

    Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Sallehuddin, Roselina

    2013-01-01

    Crime forecasting is an important area in the field of criminology. Linear models, such as regression and econometric models, are commonly applied in crime forecasting. However, real crime data commonly consist of both linear and nonlinear components, and a single model may not be sufficient to identify all the characteristics of the data. The purpose of this study is to introduce a hybrid model that combines support vector regression (SVR) and autoregressive integrated moving average (ARIMA) for crime rate forecasting. SVR is very robust with small training datasets and high-dimensional problems, while ARIMA has the ability to model several types of time series. However, the accuracy of the SVR model depends on the values of its parameters, and ARIMA is not robust when applied to small data sets. Therefore, to overcome this problem, particle swarm optimization is used to estimate the parameters of the SVR and ARIMA models. The proposed hybrid model is used to forecast the property crime rates of the United States based on economic indicators. The experimental results show that the proposed hybrid model is able to produce more accurate forecasts than the individual models.
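
    The hybrid linear-plus-SVR idea with particle swarm tuning can be sketched as follows. This is a hedged toy version: the series is synthetic, a plain linear trend fit stands in for the ARIMA component to keep the sketch self-contained, and the swarm sizes and coefficients are illustrative, not the authors' settings.

```python
# Sketch: linear component + SVR on the residuals, with a minimal PSO
# tuning the SVR's (C, epsilon). All data and settings are illustrative.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(7)
t = np.arange(100, dtype=float)
y = 0.05 * t + np.sin(t / 3.0) + rng.normal(0, 0.1, 100)  # toy "crime rate" series

# Linear component (stand-in for the ARIMA part of the hybrid)
coef = np.polyfit(t, y, 1)
resid = y - np.polyval(coef, t)
X = t.reshape(-1, 1)

def holdout_error(params):
    """Mean squared error of an SVR on the residuals for given (C, epsilon)."""
    C, eps = params
    m = SVR(C=max(C, 1e-3), epsilon=max(eps, 1e-4)).fit(X[:80], resid[:80])
    return float(np.mean((m.predict(X[80:]) - resid[80:]) ** 2))

# Minimal particle swarm over (C, epsilon)
n_particles, iters = 8, 15
pos = rng.uniform([0.1, 0.01], [50.0, 0.5], (n_particles, 2))
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([holdout_error(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()
for _ in range(iters):
    r1, r2 = rng.random((n_particles, 1)), rng.random((n_particles, 1))
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.array([holdout_error(p) for p in pos])
    better = vals < pbest_val
    pbest[better], pbest_val[better] = pos[better], vals[better]
    gbest = pbest[pbest_val.argmin()].copy()

# Hybrid forecast: linear trend + PSO-tuned SVR correction on the residuals
svr = SVR(C=max(gbest[0], 1e-3), epsilon=max(gbest[1], 1e-4)).fit(X[:80], resid[:80])
forecast = np.polyval(coef, t[80:]) + svr.predict(X[80:])
```

    In the papers above the PSO would also tune the ARIMA orders, and the inputs would be economic indicators rather than a time index.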

  14. Urban air quality forecasting based on multi-dimensional collaborative Support Vector Regression (SVR): A case study of Beijing-Tianjin-Shijiazhuang

    PubMed Central

    Liu, Bing-Chun; Binaykia, Arihant; Chang, Pei-Chann; Tiwari, Manoj Kumar; Tsao, Cheng-Chin

    2017-01-01

    Today, China is facing a serious air pollution problem, given its harmful impact on human health as well as the environment. The urban cities in China are the most affected due to their rapid industrial and economic growth. Therefore, it is of extreme importance to develop new, better and more reliable forecasting models to accurately predict air quality. This paper selected Beijing, Tianjin and Shijiazhuang, three cities of the Jingjinji Region, for a study proposing a new collaborative forecasting model using Support Vector Regression (SVR) for Urban Air Quality Index (AQI) prediction in China. The present study aims to improve forecasting results by minimizing the prediction error of present machine learning algorithms, taking multi-city, multi-dimensional air quality information and weather conditions as input. The results show a decrease in MAPE for multi-city, multi-dimensional regression when there is a strong interaction and correlation of the air quality characteristic attributes with AQI. Also, geographical location is found to play a significant role in Beijing, Tianjin and Shijiazhuang AQI prediction. PMID:28708836
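
    The multi-city collaborative setup can be sketched as follows. This is a hedged toy version: the three AQI series are synthetic (a shared regional signal plus city noise), and the SVR hyperparameters are illustrative.

```python
# Sketch: predict one city's AQI from its own and its neighbours' values,
# scored by MAPE. All series are synthetic placeholders.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
n_days = 200
# Toy AQI series: a shared regional signal plus city-specific noise
regional = np.cumsum(rng.normal(0, 1, n_days))
city = {name: 100 + regional + rng.normal(0, 2, n_days)
        for name in ("beijing", "tianjin", "shijiazhuang")}

# Predict tomorrow's Beijing AQI from all three cities' AQI today
X = np.column_stack([city[c][:-1] for c in city])
y = city["beijing"][1:]
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

model = make_pipeline(StandardScaler(), SVR(C=100.0, epsilon=0.1))
pred = model.fit(X_tr, y_tr).predict(X_te)
mape = float(np.mean(np.abs((y_te - pred) / y_te)) * 100)  # percentage error
```

    The "collaborative" gain in the paper comes from the neighbouring cities' columns carrying correlated regional information that a single-city model would miss.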

  15. Modeling animal movements using stochastic differential equations

    Treesearch

    Haiganoush K. Preisler; Alan A. Ager; Bruce K. Johnson; John G. Kie

    2004-01-01

    We describe the use of bivariate stochastic differential equations (SDE) for modeling movements of 216 radiocollared female Rocky Mountain elk at the Starkey Experimental Forest and Range in northeastern Oregon. Spatially and temporally explicit vector fields were estimated using approximating difference equations and nonparametric regression techniques. Estimated...

  16. Estimation of Discontinuous Displacement Vector Fields with the Minimum Description Length Criterion.

    DTIC Science & Technology

    1990-10-01

    type of approach for finding a dense displacement vector field has a time complexity that allows a real-time implementation when an appropriate control...hardly vector fields as they appear in stereo or motion. The reason for this is the fact that local displacement vector field (DVF) estimates have...2 objects' motion, but that the quantitative optical flow is not a reliable measure of the real motion [VP87, SU87]. This applies even more to the

  17. Curriculum Assessment Using Artificial Neural Network and Support Vector Machine Modeling Approaches: A Case Study. IR Applications. Volume 29

    ERIC Educational Resources Information Center

    Chen, Chau-Kuang

    2010-01-01

    Artificial Neural Network (ANN) and Support Vector Machine (SVM) approaches have been on the cutting edge of science and technology for pattern recognition and data classification. In the ANN model, classification accuracy can be achieved by using the feed-forward of inputs, back-propagation of errors, and the adjustment of connection weights. In…

  18. Variable Speed CMG Control of a Dual-Spin Stabilized Unconventional VTOL Air Vehicle

    NASA Technical Reports Server (NTRS)

    Lim, Kyong B.; Moerder, Daniel D.; Shin, J-Y.

    2004-01-01

    This paper describes an approach based on using both bias momentum and multiple control moment gyros for controlling the attitude of statically unstable thrust-levitated vehicles in hover or slow translation. The stabilization approach described in this paper uses these internal angular momentum transfer devices for stability, augmented by thrust vectoring for trim and other outer loop control functions, including CMG stabilization/ desaturation under persistent external disturbances. Simulation results show the feasibility of (1) improved vehicle performance beyond bias momentum assisted vector thrusting control, and (2) using control moment gyros to significantly reduce the external torque required from the vector thrusting machinery.

  19. Statistical-learning strategies generate only modestly performing predictive models for urinary symptoms following external beam radiotherapy of the prostate: A comparison of conventional and machine-learning methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yahya, Noorazrul, E-mail: noorazrul.yahya@research.uwa.edu.au; Ebert, Martin A.; Bulsara, Max

    Purpose: Given the paucity of available data concerning radiotherapy-induced urinary toxicity, it is important to ensure derivation of the most robust models with superior predictive performance. This work explores multiple statistical-learning strategies for prediction of urinary symptoms following external beam radiotherapy of the prostate. Methods: The performance of logistic regression, elastic-net, support-vector machine, random forest, neural network, and multivariate adaptive regression splines (MARS) to predict urinary symptoms was analyzed using data from 754 participants accrued by TROG03.04-RADAR. Predictive features included dose-surface data, comorbidities, and medication-intake. Four symptoms were analyzed: dysuria, haematuria, incontinence, and frequency, each with three definitions (grade ≥ 1, grade ≥ 2 and longitudinal) with event rate between 2.3% and 76.1%. Repeated cross-validations producing matched models were implemented. A synthetic minority oversampling technique was utilized in endpoints with rare events. Parameter optimization was performed on the training data. Area under the receiver operating characteristic curve (AUROC) was used to compare performance using sample size to detect differences of ≥0.05 at the 95% confidence level. Results: Logistic regression, elastic-net, random forest, MARS, and support-vector machine were the highest-performing statistical-learning strategies in 3, 3, 3, 2, and 1 endpoints, respectively. Logistic regression, MARS, elastic-net, random forest, neural network, and support-vector machine were the best, or were not significantly worse than the best, in 7, 7, 5, 5, 3, and 1 endpoints. The best-performing statistical model was for dysuria grade ≥ 1 with AUROC ± standard deviation of 0.649 ± 0.074 using MARS. For longitudinal frequency and dysuria grade ≥ 1, all strategies produced AUROC>0.6 while all haematuria endpoints and longitudinal incontinence models produced AUROC<0.6. 
    Conclusions: Logistic regression and MARS were most likely to be the best-performing strategies for the prediction of urinary symptoms, with elastic-net and random forest producing competitive results. The predictive power of the models was modest and endpoint-dependent. New features, including spatial dose maps, may be necessary to achieve better models.
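
    The matched, repeated cross-validation comparison protocol can be sketched as follows. This is a hedged illustration on synthetic data: the features stand in for dose-surface/comorbidity variables, only two of the six strategies are shown, and the class imbalance is a rough analogue of the rarer endpoints.

```python
# Sketch: matched repeated CV of two strategies, scored by AUROC.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

# Synthetic stand-in for the cohort: ~20% event rate, 20 candidate features
X, y = make_classification(n_samples=300, n_features=20, n_informative=4,
                           weights=[0.8, 0.2], random_state=0)
# The same CV splitter gives every strategy identical ("matched") folds
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=0)
auroc = {
    "logistic": cross_val_score(LogisticRegression(max_iter=1000), X, y,
                                cv=cv, scoring="roc_auc").mean(),
    "random_forest": cross_val_score(RandomForestClassifier(random_state=0), X, y,
                                     cv=cv, scoring="roc_auc").mean(),
}
```

    Matching the folds across strategies, as the study does, removes split-to-split variance from the comparison so AUROC differences reflect the models rather than the partitioning.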

  20. Relationship between rice yield and climate variables in southwest Nigeria using multiple linear regression and support vector machine analysis

    NASA Astrophysics Data System (ADS)

    Oguntunde, Philip G.; Lischeid, Gunnar; Dietrich, Ottfried

    2018-03-01

    This study examines the variations of climate variables and rice yield and quantifies the relationships among them using multiple linear regression, principal component analysis, and support vector machine (SVM) analysis in southwest Nigeria. The climate and yield data used covered a period of 36 years, between 1980 and 2015. Similar to the observed decrease (P < 0.001) in rice yield, pan evaporation, solar radiation, and wind speed declined significantly. Eight principal components exhibited an eigenvalue > 1 and explained 83.1% of the total variance of predictor variables. The SVM regression function using the scores of the first principal component explained about 75% of the variance in rice yield data and linear regression about 64%. SVM regression between annual solar radiation values and yield explained 67% of the variance. Only the first component of the principal component analysis (PCA) exhibited a clear long-term trend and sometimes short-term variance similar to that of rice yield. Short-term fluctuations of the scores of the PC1 are closely coupled to those of rice yield during the 1986-1993 and the 2006-2013 periods, thereby revealing the inter-annual sensitivity of rice production to climate variability. Solar radiation stands out as the climate variable of highest influence on rice yield, and the influence was especially strong during monsoon and post-monsoon periods, which correspond to the vegetative, booting, flowering, and grain filling stages in the study area. The outcome is expected to provide a more in-depth, region-specific climate-rice linkage for screening of better cultivars that can positively respond to future climate fluctuations, as well as information that may help optimize planting dates for improved radiation use efficiency in the study area.
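
    The PCA-then-SVM-regression pipeline can be sketched as follows. This is a hedged toy version: the "climate" variables are synthetic series sharing one latent driver (a stand-in for the radiation-like signal), and the in-sample R2 is only an analogue of the variance-explained figures above.

```python
# Sketch: extract principal components of climate predictors, then regress
# yield on the first component's scores. All numbers are synthetic.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n_years = 36
# Eight toy climate variables sharing one latent driver
latent = rng.normal(size=n_years)
climate = np.column_stack([latent + rng.normal(0, 0.3, n_years) for _ in range(8)])
yield_t = 2.0 + 0.5 * latent + rng.normal(0, 0.1, n_years)  # toy rice yield (t/ha)

# Standardize, extract components, keep the first score as the predictor
scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(climate))
pc1 = scores[:, [0]]
r2 = SVR(C=10.0, epsilon=0.01).fit(pc1, yield_t).score(pc1, yield_t)
```

    The attraction of using PC1 as the regressor, as in the study, is that it summarizes the strongly intercorrelated climate variables into one score before the nonlinear fit.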

  1. A Systematic Approach for Model-Based Aircraft Engine Performance Estimation

    NASA Technical Reports Server (NTRS)

    Simon, Donald L.; Garg, Sanjay

    2010-01-01

    A requirement for effective aircraft engine performance estimation is the ability to account for engine degradation, generally described in terms of unmeasurable health parameters such as efficiencies and flow capacities related to each major engine module. This paper presents a linear point design methodology for minimizing the degradation-induced error in model-based aircraft engine performance estimation applications. The technique specifically focuses on the underdetermined estimation problem, where there are more unknown health parameters than available sensor measurements. A condition for Kalman filter-based estimation is that the number of health parameters estimated cannot exceed the number of sensed measurements. In this paper, the estimated health parameter vector will be replaced by a reduced order tuner vector whose dimension is equivalent to the sensed measurement vector. The reduced order tuner vector is systematically selected to minimize the theoretical mean squared estimation error of a maximum a posteriori estimator formulation. This paper derives theoretical estimation errors at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared to the estimation accuracy achieved through conventional maximum a posteriori and Kalman filter estimation approaches. Maximum a posteriori estimation results demonstrate that reduced order tuning parameter vectors can be found that approximate the accuracy of estimating all health parameters directly. Kalman filter estimation results based on the same reduced order tuning parameter vectors demonstrate that significantly improved estimation accuracy can be achieved over the conventional approach of selecting a subset of health parameters to serve as the tuner vector. 
However, additional development is necessary to fully extend the methodology to Kalman filter-based estimation applications.
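
    The reduced-order tuner idea can be sketched numerically. The following is a toy Monte Carlo illustration, not the paper's method: the matrices H, P, and R are random stand-ins rather than engine data, and choosing the tuner subspace from the singular vectors of H is only a simple proxy for the systematic tuner-selection routine described above. It shows why an underdetermined problem (here 8 health parameters, 4 sensors) is better served by a well-chosen 4-dimensional tuner subspace h ≈ Vq than by tuning a subset of the health parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
n_h, n_y, m = 8, 4, 4                  # 8 health parameters, 4 sensors, 4-element tuner
H = rng.standard_normal((n_y, n_h))    # sensor sensitivity matrix (random stand-in)
P = np.eye(n_h)                        # prior covariance of health parameters
R = 0.01 * np.eye(n_y)                 # sensor noise covariance

def map_estimate(V, y):
    """MAP estimate of h constrained to the tuner subspace h = V @ q."""
    Vp = np.linalg.pinv(V)
    Q = Vp @ P @ Vp.T                  # prior covariance induced on the tuner q
    Hv = H @ V
    K = Q @ Hv.T @ np.linalg.inv(Hv @ Q @ Hv.T + R)
    return V @ (K @ y)

V_subset = np.eye(n_h)[:, :m]          # conventional: tune a subset of parameters
V_svd = np.linalg.svd(H)[2][:m].T      # span the best-sensed directions instead

mse = {"subset": 0.0, "svd": 0.0}
trials = 2000
for _ in range(trials):
    h = rng.standard_normal(n_h)                 # h ~ N(0, P) with P = I
    y = H @ h + 0.1 * rng.standard_normal(n_y)   # noise std 0.1, i.e. R = 0.01 I
    mse["subset"] += np.sum((map_estimate(V_subset, y) - h) ** 2) / trials
    mse["svd"] += np.sum((map_estimate(V_svd, y) - h) ** 2) / trials
```

In this toy setup the SVD-based tuner attains a mean squared error near the unavoidable floor (the prior variance of the unobservable directions), while the subset tuner is additionally corrupted by the unmodeled health parameters.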

  2. Space-Time Point Pattern Analysis of Flavescence Dorée Epidemic in a Grapevine Field: Disease Progression and Recovery

    PubMed Central

    Maggi, Federico; Bosco, Domenico; Galetto, Luciana; Palmano, Sabrina; Marzachì, Cristina

    2017-01-01

    Analyses of space-time statistical features of a flavescence dorée (FD) epidemic in Vitis vinifera plants are presented. FD spread was surveyed from 2011 to 2015 in a vineyard of 17,500 m2 surface area in the Piemonte region, Italy; count and position of symptomatic plants were used to test the hypothesis of epidemic Complete Spatial Randomness and isotropy in the space-time static (year-by-year) point pattern measure. Space-time dynamic (year-to-year) point pattern analyses were applied to newly infected and recovered plants to highlight statistics of FD progression and regression over time. Results highlighted point patterns ranging from dispersed (at small scales) to aggregated (at large scales) over the years, suggesting that the FD epidemic is characterized by multiscale properties that may depend on infection incidence, vector population, and flight behavior. Dynamic analyses showed moderate preferential progression and regression along rows. Nearly uniform distributions of direction and negative exponential distributions of distance of newly symptomatic and recovered plants relative to existing symptomatic plants highlighted features of vector mobility similar to Brownian motion. This evidence indicates that space-time epidemic modeling should include the environmental setting (e.g., vineyard geometry and topography) to capture anisotropy, as well as statistical features of vector flight behavior, plant recovery and susceptibility, and plant mortality. PMID:28111581
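
    The static analysis above rests on testing Complete Spatial Randomness (CSR). A minimal Monte Carlo version of such a test can be sketched as follows (synthetic coordinates, not the vineyard data; the mean nearest-neighbour distance is used as the summary statistic, one of several statistics such an analysis could employ):

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_nn_distance(pts):
    """Mean nearest-neighbour distance of a 2-D point pattern."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return d.min(axis=1).mean()

def csr_test(pts, width, height, n_sim=199):
    """Monte Carlo test of CSR: compare the observed mean nearest-neighbour
    distance against simulated uniform patterns with the same point count."""
    obs = mean_nn_distance(pts)
    sims = np.array([
        mean_nn_distance(rng.uniform((0.0, 0.0), (width, height), (len(pts), 2)))
        for _ in range(n_sim)
    ])
    lo = (1 + (sims <= obs).sum()) / (n_sim + 1)   # evidence of aggregation
    hi = (1 + (sims >= obs).sum()) / (n_sim + 1)   # evidence of dispersion
    return obs, min(1.0, 2 * min(lo, hi))          # two-sided Monte Carlo p-value

# a strongly aggregated toy pattern: two tight clusters in a 100 m x 100 m plot
cluster = lambda cx, cy: rng.normal((cx, cy), 1.0, (50, 2))
aggregated = np.vstack([cluster(20, 20), cluster(80, 80)])
obs, p_value = csr_test(aggregated, 100, 100)
```

For the clustered toy pattern the observed statistic falls far below every CSR simulation, so the test rejects randomness in favour of aggregation.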

  3. Improved animal models for testing gene therapy for atherosclerosis.

    PubMed

    Du, Liang; Zhang, Jingwan; De Meyer, Guido R Y; Flynn, Rowan; Dichek, David A

    2014-04-01

    Gene therapy delivered to the blood vessel wall could augment current therapies for atherosclerosis, including systemic drug therapy and stenting. However, identification of clinically useful vectors and effective therapeutic transgenes remains at the preclinical stage. Identification of effective vectors and transgenes would be accelerated by availability of animal models that allow practical and expeditious testing of vessel-wall-directed gene therapy. Such models would include humanlike lesions that develop rapidly in vessels that are amenable to efficient gene delivery. Moreover, because human atherosclerosis develops in normal vessels, gene therapy that prevents atherosclerosis is most logically tested in relatively normal arteries. Similarly, gene therapy that causes atherosclerosis regression requires gene delivery to an existing lesion. Here we report development of three new rabbit models for testing vessel-wall-directed gene therapy that either prevents or reverses atherosclerosis. Carotid artery intimal lesions in these new models develop within 2-7 months after initiation of a high-fat diet and are 20-80 times larger than lesions in a model we described previously. Individual models allow generation of lesions that are relatively rich in either macrophages or smooth muscle cells, permitting testing of gene therapy strategies targeted at either cell type. Two of the models include gene delivery to essentially normal arteries and will be useful for identifying strategies that prevent lesion development. The third model generates lesions rapidly in vector-naïve animals and can be used for testing gene therapy that promotes lesion regression. These models are optimized for testing helper-dependent adenovirus (HDAd)-mediated gene therapy; however, they could be easily adapted for testing of other vectors or of different types of molecular therapies, delivered directly to the blood vessel wall. 
Our data also support the promise of HDAd to deliver long-term therapy from vascular endothelium without accelerating atherosclerotic disease.

  4. Behavioral plasticity in feeding by Diaphorina citri (Hemiptera, Liviidae): ingestion from phloem versus xylem is influenced by leaf age and surface

    USDA-ARS?s Scientific Manuscript database

    Diaphorina citri is a major pest of citrus because it transmits the bacterium that causes Huanglongbing (HLB) (a.k.a. citrus greening). One approach to disease management is vector management using insecticides. However, knowledge of vector mortality alone is not sufficient if the vector has had tim...

  5. A force vector and surface orientation sensor for intelligent grasping

    NASA Technical Reports Server (NTRS)

    Mcglasson, W. D.; Lorenz, R. D.; Duffie, N. A.; Gale, K. L.

    1991-01-01

    The paper discusses a force vector and surface orientation sensor suitable for intelligent grasping. The use of a novel four-degree-of-freedom force vector robotic fingertip sensor allows efficient, real-time intelligent grasping operations. The basis of sensing for intelligent grasping operations is presented, and experimental results demonstrate the accuracy and ease of implementation of this approach.

  6. Use of Mapping and Spatial and Space-Time Modeling Approaches in Operational Control of Aedes aegypti and Dengue

    PubMed Central

    Eisen, Lars; Lozano-Fuentes, Saul

    2009-01-01

    The aims of this review paper are to 1) provide an overview of how mapping and spatial and space-time modeling approaches have been used to date to visualize and analyze mosquito vector and epidemiologic data for dengue; and 2) discuss the potential for these approaches to be included as routine activities in operational vector and dengue control programs. Geographical information system (GIS) software is becoming more user-friendly and is now complemented by free mapping software that provides access to satellite imagery and basic feature-making tools and has the capacity to generate static maps as well as dynamic time-series maps. Our challenge is now to move beyond the research arena by transferring mapping and GIS technologies and spatial statistical analysis techniques in user-friendly packages to operational vector and dengue control programs. This will enable control programs to, for example, generate risk maps for exposure to dengue virus, develop Priority Area Classifications for vector control, and explore socioeconomic associations with dengue risk. PMID:19399163

  7. Thermal noise model of antiferromagnetic dynamics: A macroscopic approach

    NASA Astrophysics Data System (ADS)

    Li, Xilai; Semenov, Yuriy; Kim, Ki Wook

    In the search for post-silicon technologies, antiferromagnetic (AFM) spintronics is receiving widespread attention. Owing to its faster dynamics compared with the ferromagnetic counterpart, AFM enables ultra-fast magnetization switching and THz oscillations. A crucial factor that affects the stability of antiferromagnetic dynamics is thermal fluctuation, which is rarely considered in AFM research. Here, we derive from theory both stochastic dynamic equations for the macroscopic AFM Néel vector (L-vector) and the corresponding Fokker-Planck equation for the L-vector distribution function. For the dynamic equation approach, thermal noise is modeled by a stochastic fluctuating magnetic field that affects the AFM dynamics. The field is correlated within the correlation time, and its amplitude is derived from energy dissipation theory. For the distribution function approach, the inertial behavior of AFM dynamics forces consideration of the generalized space, including both coordinates and velocities. Finally, applying the proposed thermal noise model, we analyze a particular case of L-vector reversal of AFM nanoparticles by voltage-controlled perpendicular magnetic anisotropy (PMA) with a tailored pulse width. This work was supported, in part, by SRC/NRI SWAN.

  8. A country bug in the city: urban infestation by the Chagas disease vector Triatoma infestans in Arequipa, Peru

    PubMed Central

    2013-01-01

    Background Interruption of vector-borne transmission of Trypanosoma cruzi remains an unrealized objective in many Latin American countries. The task of vector control is complicated by the emergence of vector insects in urban areas. Methods Utilizing data from a large-scale vector control program in Arequipa, Peru, we explored the spatial patterns of infestation by Triatoma infestans in an urban and peri-urban landscape. Multilevel logistic regression was utilized to assess the associations between household infestation and household- and locality-level socio-environmental measures. Results Of 37,229 households inspected for infestation, 6,982 (18.8%; 95% CI: 18.4 – 19.2%) were infested by T. infestans. Eighty clusters of infestation were identified, ranging in area from 0.1 to 68.7 hectares and containing as few as one and as many as 1,139 infested households. Spatial dependence between infested households was significant at distances up to 2,000 meters. Household T. infestans infestation was associated with household- and locality-level factors, including housing density, elevation, land surface temperature, and locality type. Conclusions High levels of T. infestans infestation, characterized by spatial heterogeneity, were found across extensive urban and peri-urban areas prior to vector control. Several environmental and social factors, which may directly or indirectly influence the biology and behavior of T. infestans, were associated with infestation. Spatial clustering of infestation in the urban context may both challenge and inform surveillance and control of vector reemergence after insecticide intervention. PMID:24171704
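
    As a hedged illustration of the regression step above, the sketch below fits a single-level logistic regression by Newton's method (iteratively reweighted least squares) on synthetic data; the study itself used a multilevel model with locality-level effects, and the covariates here are placeholders for the real household-level measures such as housing density and elevation.

```python
import numpy as np

rng = np.random.default_rng(2)

# synthetic stand-ins for household-level predictors (z-scored)
n = 5000
X = np.column_stack([np.ones(n),            # intercept
                     rng.normal(0, 1, n),   # e.g. housing density (placeholder)
                     rng.normal(0, 1, n)])  # e.g. elevation (placeholder)
beta_true = np.array([-1.5, 0.8, -0.5])
p = 1 / (1 + np.exp(-X @ beta_true))
y = rng.binomial(1, p)                      # infested (1) or not (0)

# maximum-likelihood fit by Newton's method (IRLS)
beta = np.zeros(3)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (y - mu)                   # gradient of the log-likelihood
    W = mu * (1 - mu)                       # IRLS weights
    hess = X.T @ (X * W[:, None])           # Fisher information
    beta += np.linalg.solve(hess, grad)

odds_ratios = np.exp(beta[1:])              # effect of each covariate on the odds
```

Exponentiated coefficients are the odds ratios usually reported in such studies; a value above 1 means the covariate raises the odds of household infestation.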

  9. Effects of Vector Backbone and Pseudotype on Lentiviral Vector-mediated Gene Transfer: Studies in Infant ADA-Deficient Mice and Rhesus Monkeys

    PubMed Central

    Carbonaro Sarracino, Denise; Tarantal, Alice F; Lee, C Chang I.; Martinez, Michele; Jin, Xiangyang; Wang, Xiaoyan; Hardee, Cinnamon L; Geiger, Sabine; Kahl, Christoph A; Kohn, Donald B

    2014-01-01

    Systemic delivery of a lentiviral vector carrying a therapeutic gene represents a new treatment for monogenic disease. Previously, we have shown that transfer of the adenosine deaminase (ADA) cDNA in vivo rescues the lethal phenotype and reconstitutes immune function in ADA-deficient mice. In order to translate this approach to ADA-deficient severe combined immune deficiency patients, neonatal ADA-deficient mice and newborn rhesus monkeys were treated with species-matched and mismatched vectors and pseudotypes. We compared gene delivery by the HIV-1-based vector to murine γ-retroviral vectors pseudotyped with vesicular stomatitis virus-glycoprotein or murine retroviral envelopes in ADA-deficient mice. The vesicular stomatitis virus-glycoprotein pseudotyped lentiviral vectors had the highest titer and resulted in the highest vector copy number in multiple tissues, particularly liver and lung. In monkeys, HIV-1 or simian immunodeficiency virus vectors resulted in similar biodistribution in most tissues including bone marrow, spleen, liver, and lung. Simian immunodeficiency virus pseudotyped with the gibbon ape leukemia virus envelope produced 10- to 30-fold lower titers than the vesicular stomatitis virus-glycoprotein pseudotype, but had a similar tissue biodistribution and similar copy number in blood cells. The relative copy numbers achieved in mice and monkeys were similar when adjusted to the administered dose per kg. These results suggest that this approach can be scaled up to clinical levels for treatment of ADA-deficient severe combined immune deficiency subjects with suboptimal hematopoietic stem cell transplantation options. PMID:24925206

  10. Noninvasive spectral imaging of skin chromophores based on multiple regression analysis aided by Monte Carlo simulation

    NASA Astrophysics Data System (ADS)

    Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa

    2011-08-01

    In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580, and 600 nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation of diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments on skin of the human hand during upper-limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
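
    The regression step described above is ordinary multiple linear regression of the absorbance spectrum onto the chromophore extinction spectra. A minimal sketch, using made-up extinction coefficients (real values would come from published tables for melanin and oxy-/deoxy-hemoglobin) and skipping the Monte Carlo-derived conversion vectors:

```python
import numpy as np

rng = np.random.default_rng(3)

wavelengths = np.array([500, 520, 540, 560, 580, 600])   # nm, as in the paper
# hypothetical extinction coefficient spectra (illustrative shapes only)
eps_mel = 1.0 / (wavelengths / 500.0) ** 3               # smooth, decreasing
eps_hbo = np.array([0.3, 0.6, 1.0, 0.8, 1.1, 0.2])       # oxygenated hemoglobin
eps_hbr = np.array([0.5, 0.8, 1.1, 1.2, 0.9, 0.6])       # deoxygenated hemoglobin

c_true = np.array([0.7, 0.4, 0.2])                       # melanin, HbO, HbR
A = np.column_stack([eps_mel, eps_hbo, eps_hbr])
absorbance = A @ c_true + rng.normal(0, 1e-3, len(wavelengths))

# multiple regression: absorbance as response, extinction spectra as predictors
coef, *_ = np.linalg.lstsq(A, absorbance, rcond=None)
so2 = coef[1] / (coef[1] + coef[2])                      # oxygen saturation
```

As in the paper, oxygen saturation follows directly from the hemoglobin coefficients, here as HbO / (HbO + HbR).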

  11. Robust Image Regression Based on the Extended Matrix Variate Power Exponential Distribution of Dependent Noise.

    PubMed

    Luo, Lei; Yang, Jian; Qian, Jianjun; Tai, Ying; Lu, Gui-Fu

    2017-09-01

    Dealing with partial occlusion or illumination changes is one of the most challenging problems in image representation and classification. In this problem, the characterization of the representation error plays a crucial role. In most current approaches, the error matrix needs to be stretched into a vector, and each element is assumed to be independently corrupted; this ignores the dependence between the elements of the error. In this paper, it is assumed that the error image caused by partial occlusion or illumination changes is a random matrix variate that follows the extended matrix variate power exponential distribution. This distribution has heavy-tailed regions and can be used to describe a matrix pattern of l×m-dimensional observations that are not independent. This paper reveals the essence of the proposed distribution: it alleviates the correlations between pixels in an error matrix E and makes E approximately Gaussian. On the basis of this distribution, we derive a Schatten p-norm-based matrix regression model with Lq regularization. The alternating direction method of multipliers is applied to solve this model. To obtain a closed-form solution in each step of the algorithm, two singular value function thresholding operators are introduced. In addition, the extended Schatten p-norm is utilized to characterize the distance between the test samples and classes in the design of the classifier. Extensive experimental results for image reconstruction and classification with structural noise demonstrate that the proposed algorithm is much more robust than some existing regression-based methods.
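
    One building block named above, a singular value thresholding operator, is easy to illustrate for the p = 1 case, where it is the proximal operator of the nuclear norm (the paper's operators generalize this to other Schatten p-norms, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(4)

def svt(M, tau):
    """Singular value thresholding: soft-threshold the singular values of M.
    This is the proximal operator of tau * (nuclear norm), i.e. the Schatten
    p-norm with p = 1, a typical per-step update inside such ADMM solvers."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# a rank-2 matrix plus small dense noise
L = rng.standard_normal((30, 2)) @ rng.standard_normal((2, 30))
E = 0.05 * rng.standard_normal((30, 30))
D = svt(L + E, tau=1.0)   # thresholding recovers a low-rank estimate
```

Because the noise singular values fall below the threshold while the two signal singular values remain far above it, the output is exactly rank 2.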

  12. No-reference image quality assessment based on statistics of convolution feature maps

    NASA Astrophysics Data System (ADS)

    Lv, Xiaoxin; Qin, Min; Chen, Xiaohui; Wei, Guo

    2018-04-01

    We propose a Convolutional Feature Maps (CFM) driven approach to accurately predict image quality. Our motivation is based on the finding that Natural Scene Statistics (NSS) features computed on convolutional feature maps are significantly sensitive to the degree of distortion in an image. In our method, a Convolutional Neural Network (CNN) is trained to obtain kernels for generating the CFM. We design a forward NSS layer which operates on the CFM to better extract NSS features. The quality-aware features derived from the output of the NSS layer are effective in describing the type and degree of distortion an image has suffered. Finally, Support Vector Regression (SVR) is employed in our No-Reference Image Quality Assessment (NR-IQA) model to predict the subjective quality score of a distorted image. Experiments conducted on two public databases demonstrate that the performance of the proposed method is competitive with state-of-the-art NR-IQA methods.
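
    The final SVR stage can be sketched with a minimal linear epsilon-insensitive SVR trained by subgradient descent on synthetic features; the actual model regresses CNN-derived NSS features against subjective scores, and in practice a kernel SVR from an established library would normally be used instead of this hand-rolled solver:

```python
import numpy as np

rng = np.random.default_rng(5)

# synthetic stand-ins for quality-aware features and subjective scores
n, d = 400, 5
X = rng.normal(0, 1, (n, d))
w_true = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = X @ w_true + 3.0 + rng.normal(0, 0.05, n)

# linear epsilon-insensitive SVR: minimize 0.5*||w||^2 + (C/n) * sum of
# max(0, |residual| - eps), trained by plain subgradient descent
w, b = np.zeros(d), 0.0
C, eps, lr = 10.0, 0.1, 1e-3
for _ in range(3000):
    r = X @ w + b - y
    g = np.where(r > eps, 1.0, np.where(r < -eps, -1.0, 0.0))  # dloss/dr
    w -= lr * (w + C * X.T @ g / n)
    b -= lr * C * g.mean()

pred = X @ w + b
rmse = np.sqrt(np.mean((pred - y) ** 2))
```

Residuals inside the epsilon-tube contribute no loss, which is what makes SVR tolerant of small annotation noise in the subjective scores.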

  13. A scale-invariant change detection method for land use/cover change research

    NASA Astrophysics Data System (ADS)

    Xing, Jin; Sieber, Renee; Caelli, Terrence

    2018-07-01

    Land Use/Cover Change (LUCC) detection relies increasingly on comparing remote sensing images with different spatial and spectral scales. Based on scale-invariant image analysis algorithms in computer vision, we propose a scale-invariant LUCC detection method to identify changes from scale-heterogeneous images. This method is composed of an entropy-based spatial decomposition; two scale-invariant feature extraction methods, the Maximally Stable Extremal Region (MSER) and Scale-Invariant Feature Transform (SIFT) algorithms; a spatial regression voting method to integrate the MSER and SIFT results; a Markov Random Field-based smoothing method; and a support vector machine classification method to assign LUCC labels. We test the scale invariance of our new method with a LUCC case study in Montreal, Canada, 2005-2012. We found that the scale-invariant LUCC detection method provides accuracy similar to that of the resampling-based approach while avoiding the LUCC distortion incurred by resampling.

  14. Automated Scoring of Chinese Engineering Students' English Essays

    ERIC Educational Resources Information Center

    Liu, Ming; Wang, Yuqi; Xu, Weiwei; Liu, Li

    2017-01-01

    The number of Chinese engineering students has increased greatly since 1999. Rating the quality of these students' English essays has thus become time-consuming and challenging. This paper presents a novel automatic essay scoring algorithm called PSOSVR, based on a machine learning algorithm, Support Vector Machine for Regression (SVR), and a…

  15. Converging Human and Malaria Vector Diagnostics with Data Management towards an Integrated Holistic One Health Approach.

    PubMed

    Mitsakakis, Konstantinos; Hin, Sebastian; Müller, Pie; Wipf, Nadja; Thomsen, Edward; Coleman, Michael; Zengerle, Roland; Vontas, John; Mavridis, Konstantinos

    2018-02-03

    Monitoring malaria prevalence in humans, as well as vector populations, for the presence of Plasmodium, is an integral component of effective malaria control, and eventually, elimination. In the field of human diagnostics, a major challenge is the ability to define, precisely, the causative agent of fever, thereby differentiating among several candidate (also non-malaria) febrile diseases. This requires genetic-based pathogen identification and multiplexed analysis, which, in combination, are hardly provided by the current gold standard diagnostic tools. In the field of vectors, an essential component of control programs is the detection of Plasmodium species within their mosquito vectors, particularly in the salivary glands, where the infective sporozoites reside. In addition, the identification of species composition and insecticide resistance alleles within vector populations is a primary task in routine monitoring activities, aiming to support control efforts. In this context, the use of converging diagnostics is highly desirable for providing comprehensive information, including differential fever diagnosis in humans, and mosquito species composition, infection status, and resistance to insecticides of vectors. Nevertheless, the two fields of human diagnostics and vector control are rarely combined, both at the diagnostic and at the data management end, resulting in fragmented data and mis- or non-communication between various stakeholders. To this end, molecular technologies, their integration in automated platforms, and the co-assessment of data from multiple diagnostic sources through information and communication technologies are possible pathways towards a unified human-vector approach.

  17. An alternative subspace approach to EEG dipole source localization

    NASA Astrophysics Data System (ADS)

    Xu, Xiao-Liang; Xu, Bobby; He, Bin

    2004-01-01

    In the present study, we investigate a new approach to electroencephalography (EEG) three-dimensional (3D) dipole source localization by using a non-recursive subspace algorithm called FINES. In estimating source dipole locations, the present approach employs projections onto a subspace spanned by a small set of particular vectors (FINES vector set) in the estimated noise-only subspace instead of the entire estimated noise-only subspace in the case of classic MUSIC. The subspace spanned by this vector set is, in the sense of principal angle, closest to the subspace spanned by the array manifold associated with a particular brain region. By incorporating knowledge of the array manifold in identifying FINES vector sets in the estimated noise-only subspace for different brain regions, the present approach is able to estimate sources with enhanced accuracy and spatial resolution, thus enhancing the capability of resolving closely spaced sources and reducing estimation errors. The present computer simulations show, in EEG 3D dipole source localization, that compared to classic MUSIC, FINES has (1) better resolvability of two closely spaced dipolar sources and (2) better estimation accuracy of source locations. In comparison with RAP-MUSIC, FINES' performance is also better for the cases studied when the noise level is high and/or correlations among dipole sources exist.

  18. Assessment of different virus-mediated approaches for retinal gene therapy of Usher 1B.

    PubMed

    Lopes, Vanda S; Diemer, Tanja; Williams, David S

    2014-01-01

    Usher syndrome type 1B, which is characterized by congenital deafness and progressive retinal degeneration, is caused by the loss of the function of MYO7A. Prevention of the retinal degeneration should be possible by delivering functional MYO7A to retinal cells. Although this approach has been used successfully in clinical trials for Leber congenital amaurosis (LCA2), it remains a challenge for Usher 1B because of the large size of the MYO7A cDNA. Different viral vectors have been tested for use in MYO7A gene therapy. Here, we review approaches with lentiviruses, which can accommodate larger genes, as well as attempts to use adeno-associated virus (AAV), which has a smaller packaging capacity. In conclusion, both types of viral vector appear to be effective. Despite concerns about the ability of lentiviruses to access the photoreceptor cells, a phenotype of the photoreceptors of Myo7a-mutant mice can be corrected. Likewise, although the MYO7A cDNA is significantly larger than the nominal carrying capacity of AAV, AAV-MYO7A delivered in single vectors also corrected Myo7a-mutant phenotypes in photoreceptor and RPE cells. Interestingly, however, a dual AAV vector approach was found to be much less effective.

  19. Three-Month Real-Time Dengue Forecast Models: An Early Warning System for Outbreak Alerts and Policy Decision Support in Singapore.

    PubMed

    Shi, Yuan; Liu, Xu; Kok, Suet-Yheng; Rajarethinam, Jayanthi; Liang, Shaohong; Yap, Grace; Chong, Chee-Seng; Lee, Kim-Sung; Tan, Sharon S Y; Chin, Christopher Kuan Yew; Lo, Andrew; Kong, Waiming; Ng, Lee Ching; Cook, Alex R

    2016-09-01

    With its tropical rainforest climate, rapid urbanization, and changing demography and ecology, Singapore experiences endemic dengue; the last large outbreak in 2013 culminated in 22,170 cases. In the absence of a vaccine on the market, vector control is the key approach for prevention. We sought to forecast the evolution of dengue epidemics in Singapore to provide early warning of outbreaks and to facilitate the public health response to moderate an impending outbreak. We developed a set of statistical models using least absolute shrinkage and selection operator (LASSO) methods to forecast the weekly incidence of dengue notifications over a 3-month time horizon. This forecasting tool used a variety of data streams and was updated weekly, including recent case data, meteorological data, vector surveillance data, and population-based national statistics. The forecasting methodology was compared with alternative approaches that have been proposed to model dengue case data (seasonal autoregressive integrated moving average and step-down linear regression) by fielding them on the 2013 dengue epidemic, the largest on record in Singapore. Operationally useful forecasts were obtained at a 3-month lag using the LASSO-derived models. Based on the mean absolute percentage error, the LASSO approach provided more accurate forecasts than the other methods we assessed. We demonstrate its utility in Singapore's dengue control program by providing a forecast of the 2013 outbreak for advance preparation of outbreak response. Statistical models built using machine learning methods such as LASSO have the potential to markedly improve forecasting techniques for recurrent infectious disease outbreaks such as dengue. Shi Y, Liu X, Kok SY, Rajarethinam J, Liang S, Yap G, Chong CS, Lee KS, Tan SS, Chin CK, Lo A, Kong W, Ng LC, Cook AR. 2016. Three-month real-time dengue forecast models: an early warning system for outbreak alerts and policy decision support in Singapore.
Environ Health Perspect 124:1369-1375; http://dx.doi.org/10.1289/ehp.1509981.
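
    The LASSO at the heart of such forecasting models can be sketched with cyclic coordinate descent and soft-thresholding; the data below are synthetic stand-ins for the weekly predictor streams (lagged cases, meteorology, vector surveillance), not the Singapore data:

```python
import numpy as np

rng = np.random.default_rng(6)

# synthetic predictors: only features 0, 1, and 4 actually drive the response
n, d = 200, 10
X = rng.normal(0, 1, (n, d))
beta_true = np.array([1.5, -2.0, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(0, 0.1, n)

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by cyclic coordinate descent with soft-thresholding."""
    n, d = X.shape
    beta = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(d):
            r = y - X @ beta + X[:, j] * beta[j]          # partial residual
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

beta = lasso_cd(X, y, lam=20.0)
```

The L1 penalty zeroes out the irrelevant predictors exactly, which is why LASSO doubles as an automatic variable-selection step when many candidate data streams are available.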

  20. Novel Hybrid of LS-SVM and Kalman Filter for GPS/INS Integration

    NASA Astrophysics Data System (ADS)

    Xu, Zhenkai; Li, Yong; Rizos, Chris; Xu, Xiaosu

    Integration of Global Positioning System (GPS) and Inertial Navigation System (INS) technologies can overcome the drawbacks of the individual systems. One of the advantages is that the integrated solution can provide continuous navigation capability even during GPS outages. However, bridging the GPS outages is still a challenge when Micro-Electro-Mechanical System (MEMS) inertial sensors are used. Methods currently being explored by the research community include applying vehicle motion constraints, optimal smoothers, and artificial intelligence (AI) techniques. In the research area of AI, the neural network (NN) approach has been extensively utilised up to the present. In an NN-based integrated system, a Kalman filter (KF) estimates position, velocity and attitude errors, as well as the inertial sensor errors, to output navigation solutions while GPS signals are available. At the same time, an NN is trained to map the vehicle dynamics with corresponding KF states, and to correct INS measurements when GPS measurements are unavailable. To achieve good performance it is critical to select training samples of suitable quality and in an optimal number for the NN; this is sometimes too rigorous a requirement, which limits real-world application of NN-based methods. The support vector machine (SVM) approach is based on the structural risk minimisation principle, instead of the minimised empirical error principle that is commonly implemented in an NN. The SVM can avoid the local minima and over-fitting problems of an NN, and therefore can potentially achieve a higher level of global performance. This paper focuses on the least squares support vector machine (LS-SVM), which can solve highly nonlinear and noisy black-box modelling problems. This paper explores the application of the LS-SVM to aid the GPS/INS integrated system, especially during GPS outages. The paper describes the principles of the LS-SVM and of the KF hybrid method, and introduces the LS-SVM regression algorithm.
Field test data is processed to evaluate the performance of the proposed approach.
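
    The LS-SVM regression named above replaces the SVM's quadratic program with a single linear system: with kernel matrix K and regularisation gamma, solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y], then predict f(x) = sum_i alpha_i k(x_i, x) + b. A self-contained sketch on a toy 1-D function (a stand-in for the vehicle-dynamics/INS-error mapping, with illustrative hyperparameters):

```python
import numpy as np

rng = np.random.default_rng(7)

def rbf(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

# toy 1-D training data: noisy samples of sin(x)
X = rng.uniform(-3, 3, (100, 1))
y = np.sin(X[:, 0]) + rng.normal(0, 0.05, 100)

gamma = 100.0
K = rbf(X, X)
n = len(X)

# LS-SVM dual: one bordered linear system instead of the SVM's QP
A = np.zeros((n + 1, n + 1))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
A[1:, 1:] = K + np.eye(n) / gamma
rhs = np.concatenate([[0.0], y])
sol = np.linalg.solve(A, rhs)
b, alpha = sol[0], sol[1:]

X_test = np.linspace(-3, 3, 50)[:, None]
pred = rbf(X_test, X) @ alpha + b
```

The trade-off is that every training point becomes a "support vector" (alpha is dense), in exchange for training that is a single linear solve.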

  1. Coherent states for the relativistic harmonic oscillator

    NASA Technical Reports Server (NTRS)

    Aldaya, Victor; Guerrero, J.

    1995-01-01

    Recently we have obtained, on the basis of a group approach to quantization, a Bargmann-Fock-like realization of the Relativistic Harmonic Oscillator as well as a generalized Bargmann transform relating Fock wave functions and a set of relativistic Hermite polynomials. Nevertheless, the relativistic creation and annihilation operators satisfy typical relativistic commutation relations of the form [ẑ, ẑ†] ≈ Energy (an SL(2,R) algebra). Here we find higher-order polarization operators on the SL(2,R) group, providing canonical creation and annihilation operators satisfying [â, â†] = 1, the eigenstates of which are 'true' coherent states.

  2. Retroviral packaging cells encapsulated in TheraCyte immunoisolation devices enable long-term in vivo gene delivery.

    PubMed

    Krupetsky, Anna; Parveen, Zahida; Marusich, Elena; Goodrich, Adrienne; Dornburg, Ralph

    2003-05-01

    The method of delivering a therapeutic gene into a patient remains one of the major obstacles to successful human gene therapy. Here we describe a novel gene delivery approach using TheraCyte immunoisolation devices. Retroviral vector-producing cells, derived from the avian retrovirus spleen necrosis virus (SNV), were encapsulated in TheraCyte devices and tested for the release of retroviral vectors. In vitro experiments show that such devices release infectious retroviral vectors into the tissue culture medium for up to 4 months. When such devices were implanted subcutaneously in SCID mice, infectious virus was released into the blood stream. There, the vectors were transported to and infected tumors, which had been induced by subcutaneous injection of tissue culture cells. Thus, this novel concept of continuous, long-term gene delivery may constitute an attractive approach for future in vivo human gene therapy.

  3. X-31 quasi-tailless flight demonstration

    NASA Technical Reports Server (NTRS)

    Huber, Peter; Schellenger, Harvey G.

    1994-01-01

    The primary objective of the quasi-tailless flight demonstration is to demonstrate the feasibility of using thrust vectoring for directional control of an unstable aircraft. By using this low-cost, low-risk approach it is possible to obtain information about required thrust vector control power and deflection rates from an in-flight experiment, as well as insight into low-power thrust vectoring issues. The quasi-tailless flight demonstration series with the X-31 began in March 1994. The demonstration flight condition was Mach 1.2 at 37,500 feet. A series of basic flying quality maneuvers, doublets, bank-to-bank rolls, and wind-up turns were performed with a simulated 100% vertical tail reduction. Flight test and supporting simulation demonstrated that the quasi-tailless approach is effective in representing the reduced stability of tailless configurations. The flights also demonstrated that thrust vectoring could be effectively used to stabilize a directionally unstable configuration and provide control power for maneuver coordination.

  4. Interpolating of climate data using R

    NASA Astrophysics Data System (ADS)

    Reinhardt, Katja

    2017-04-01

    Interpolation methods are used in many different geoscientific areas, such as soil physics, climatology and meteorology. Thereby, unknown values are calculated by applying statistical estimation approaches to known values. So far, the majority of climatologists have been using computer languages such as FORTRAN or C++, but there is also an increasing number of climate scientists using R for data processing and visualization. Most of them, however, are still working with arrays and vector-based data, which is often associated with complex R code structures. For the presented study, I have decided to convert the climate data into geodata and to perform the whole data processing using the raster package, gstat and similar packages, providing a much more comfortable way of data handling. A central goal of my approach is to create an easy-to-use, powerful and fast R script, implementing the entire geodata processing and visualization in a single, fully automated R-based procedure, which avoids the need for other software packages such as ArcGIS or QGIS. Thus, large amounts of data with recurrent process sequences can be processed. The aim of the presented study, which is located in western Central Asia, is to interpolate wind data based on the European reanalysis data ERA-Interim, which are available as raster data with a resolution of 0.75° x 0.75°, to a finer grid. Therefore, various interpolation methods are used: inverse distance weighting, the geostatistical methods ordinary kriging and regression kriging, a generalized additive model, and the machine learning algorithms support vector machine and neural networks. Except for the first two methods, the methods are applied with influencing factors, e.g. geopotential and topography.
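
    All the interpolation methods listed above reduce to weighting known samples; inverse distance weighting is the simplest of them. As a language-neutral illustration (the study itself uses R with the raster and gstat packages), a minimal Python sketch of IDW might look like the following; the function name and signature are illustrative, not taken from the study:

```python
import math

def idw(known_points, values, query, power=2.0):
    """Inverse distance weighting: estimate the value at `query`
    as a distance-weighted average of the known sample values."""
    weights, weighted_sum = 0.0, 0.0
    for (x, y), v in zip(known_points, values):
        d = math.hypot(query[0] - x, query[1] - y)
        if d == 0.0:  # query coincides with a sample point
            return v
        w = 1.0 / d ** power
        weights += w
        weighted_sum += w * v
    return weighted_sum / weights
```

    Because the estimate is a convex combination of the samples, it always stays within their range; the `power` parameter controls how local the interpolation is.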

  5. Quaternion-Based Texture Analysis of Multiband Satellite Images: Application to the Estimation of Aboveground Biomass in the East Region of Cameroon.

    PubMed

    Djiongo Kenfack, Cedrigue Boris; Monga, Olivier; Mpong, Serge Moto; Ndoundam, René

    2018-03-01

    Within the last decade, several approaches using quaternion numbers to handle and model multiband images in a holistic manner were introduced. The quaternion Fourier transform can be efficiently used to model texture in multidimensional data such as color images. For practical application, multispectral satellite data appear as a primary source for measuring past trends and monitoring changes in forest carbon stocks. In this work, we propose a texture-color descriptor based on the quaternion Fourier transform to extract relevant information from multiband satellite images. We propose a new multiband image texture extraction model, called FOTO++, in order to address biomass estimation issues. The first stage consists in removing noise from the multispectral data while preserving the edges of canopies. Afterward, color texture descriptors are extracted using a discrete form of the quaternion Fourier transform, and finally the support vector regression method is used to deduce biomass estimates from texture indices. Our texture features are modeled using a vector composed of the radial spectrum coming from the amplitude of the quaternion Fourier transform. We conduct several experiments in order to study the sensitivity of our model to acquisition parameters. We also assess its performance both on synthetic images and on real multispectral images of Cameroonian forest. The results show that our model is more robust to acquisition parameters than the classical Fourier Texture Ordination model (FOTO). Our scheme is also more accurate for aboveground biomass estimation. We stress that a similar methodology could be implemented using quaternion wavelets. These results highlight the potential of the quaternion-based approach to study multispectral satellite images.
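
    The radial spectrum underlying FOTO-style texture ordination can be illustrated without the quaternion machinery: the amplitude spectrum is averaged over concentric frequency rings to give one feature vector per image window. The sketch below uses the ordinary 2-D FFT on a single grayscale band (the paper builds its descriptor from the quaternion Fourier transform over all bands); the function name and binning scheme are assumptions:

```python
import numpy as np

def radial_spectrum(image, n_bins=8):
    """Average FFT amplitude over concentric frequency rings,
    giving a compact, orientation-invariant texture descriptor."""
    amp = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = amp.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h // 2, xx - w // 2)   # radius from the DC component
    edges = np.linspace(0.0, r.max(), n_bins + 1)
    edges[-1] += 1.0                         # include the outermost radius
    spectrum = np.empty(n_bins)
    for k in range(n_bins):
        mask = (r >= edges[k]) & (r < edges[k + 1])
        spectrum[k] = amp[mask].mean() if mask.any() else 0.0
    return spectrum
```

    The resulting vectors, one per window, would then feed a regressor such as support vector regression, as in the paper's final stage.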

  6. Recombinase-Mediated Cassette Exchange Using Adenoviral Vectors.

    PubMed

    Kolb, Andreas F; Knowles, Christopher; Pultinevicius, Patrikas; Harbottle, Jennifer A; Petrie, Linda; Robinson, Claire; Sorrell, David A

    2017-01-01

    Site-specific recombinases are important tools for the modification of mammalian genomes. In conjunction with viral vectors, they can be utilized to mediate site-specific gene insertions in animals and in cell lines which are difficult to transfect. Here we describe a method for the generation and analysis of an adenovirus vector supporting a recombinase-mediated cassette exchange reaction and discuss the advantages and limitations of this approach.

  7. Approaches for Language Identification in Mismatched Environments

    DTIC Science & Technology

    2016-09-08

    different i-vector systems are considered, which differ in their feature extraction mechanism. The first, which we refer to as the standard i-vector, or...both conversational telephone speech and narrowband broadcast speech. Multiple experiments are conducted to assess the performance of the system in...bottleneck features using i-vectors. The proposed system results in a 30% improvement over the baseline result. Index Terms: language identification

  8. Boost OCR accuracy using iVector based system combination approach

    NASA Astrophysics Data System (ADS)

    Peng, Xujun; Cao, Huaigu; Natarajan, Prem

    2015-01-01

    Optical character recognition (OCR) is a challenging task because most existing preprocessing approaches are sensitive to writing style, writing material, noises and image resolution. Thus, a single recognition system cannot address all factors of real document images. In this paper, we describe an approach to combine diverse recognition systems by using iVector based features, which is a newly developed method in the field of speaker verification. Prior to system combination, document images are preprocessed and text line images are extracted with different approaches for each system, where iVector is transformed from a high-dimensional supervector of each text line and is used to predict the accuracy of OCR. We merge hypotheses from multiple recognition systems according to the overlap ratio and the predicted OCR score of text line images. We present evaluation results on an Arabic document database where the proposed method is compared against the single best OCR system using word error rate (WER) metric.
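
    The merging step described above, combining text-line hypotheses by overlap ratio and predicted OCR score, might be sketched as follows. The data layout (box, text, score triples) and function names are hypothetical, not taken from the paper:

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def combine(system_a, system_b, threshold=0.5):
    """Each hypothesis is (box, text, predicted_score); for line pairs
    whose boxes overlap enough, keep the higher-scoring text,
    otherwise keep hypotheses from both systems."""
    merged, used = [], set()
    for box_a, text_a, score_a in system_a:
        best = None
        for j, (box_b, text_b, score_b) in enumerate(system_b):
            if j not in used and overlap_ratio(box_a, box_b) >= threshold:
                best = (j, text_b, score_b)
                break
        if best and best[2] > score_a:
            merged.append(best[1])
        else:
            merged.append(text_a)
        if best:
            used.add(best[0])
    merged += [t for j, (_, t, _) in enumerate(system_b) if j not in used]
    return merged
```

    In the paper the per-line score is predicted from an iVector representation of the text line; here it is simply taken as given.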

  9. Cross-entropy embedding of high-dimensional data using the neural gas model.

    PubMed

    Estévez, Pablo A; Figueroa, Cristián J; Saito, Kazumi

    2005-01-01

    A cross-entropy approach to mapping high-dimensional data into a low-dimensional embedding space is presented. The method makes it possible to project the input data and the codebook vectors, obtained with the Neural Gas (NG) quantizer algorithm, simultaneously into a low-dimensional output space. The aim of this approach is to preserve the relationship defined by the NG neighborhood function for each pair of input and codebook vectors. A cost function based on the cross-entropy between input and output probabilities is minimized using a Newton-Raphson method. The new approach is compared with Sammon's non-linear mapping (NLM) and the hierarchical approach of combining a vector quantizer such as the self-organizing feature map (SOM) or NG with the NLM recall algorithm. In comparison with these techniques, our method delivers a clear visualization of both data points and codebooks, and it achieves a better mapping quality in terms of the topology preservation measure q(m).

  10. Characterization of GM events by insert knowledge adapted re-sequencing approaches

    PubMed Central

    Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

    2013-01-01

    Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders such as public risk assessors and regulators. Generally, the molecular characteristics of GM events are only incompletely revealed by current approaches, which are biased towards detecting transformation-vector-derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert-knowledge-adapted approaches for the characterization of GM events (TT51-1 and T1c-19 rice as examples) based on paired-end re-sequencing, with the advantages of comprehensiveness, accuracy, and automation. The comprehensive molecular characteristics of the two rice events were revealed, including additional unintended insertions, compared with the results from PCR and Southern blotting. Comprehensive transgene characterization of TT51-1 and T1c-19 is shown to be independent of a priori knowledge of the insert and vector sequences when employing the developed approaches. This provides an opportunity to identify and characterize even unknown GM events. PMID:24088728

  11. Characterization of GM events by insert knowledge adapted re-sequencing approaches.

    PubMed

    Yang, Litao; Wang, Congmao; Holst-Jensen, Arne; Morisset, Dany; Lin, Yongjun; Zhang, Dabing

    2013-10-03

    Detection methods and data from molecular characterization of genetically modified (GM) events are needed by stakeholders such as public risk assessors and regulators. Generally, the molecular characteristics of GM events are only incompletely revealed by current approaches, which are biased towards detecting transformation-vector-derived sequences. GM events are classified based on available knowledge of the sequences of vectors and inserts (insert knowledge). Herein we present three insert-knowledge-adapted approaches for the characterization of GM events (TT51-1 and T1c-19 rice as examples) based on paired-end re-sequencing, with the advantages of comprehensiveness, accuracy, and automation. The comprehensive molecular characteristics of the two rice events were revealed, including additional unintended insertions, compared with the results from PCR and Southern blotting. Comprehensive transgene characterization of TT51-1 and T1c-19 is shown to be independent of a priori knowledge of the insert and vector sequences when employing the developed approaches. This provides an opportunity to identify and characterize even unknown GM events.

  12. Can Emotional and Behavioral Dysregulation in Youth Be Decoded from Functional Neuroimaging?

    PubMed

    Portugal, Liana C L; Rosa, Maria João; Rao, Anil; Bebko, Genna; Bertocci, Michele A; Hinze, Amanda K; Bonar, Lisa; Almeida, Jorge R C; Perlman, Susan B; Versace, Amelia; Schirda, Claudiu; Travis, Michael; Gill, Mary Kay; Demeter, Christine; Diwadkar, Vaibhav A; Ciuffetelli, Gary; Rodriguez, Eric; Forbes, Erika E; Sunshine, Jeffrey L; Holland, Scott K; Kowatch, Robert A; Birmaher, Boris; Axelson, David; Horwitz, Sarah M; Arnold, Eugene L; Fristad, Mary A; Youngstrom, Eric A; Findling, Robert L; Pereira, Mirtes; Oliveira, Leticia; Phillips, Mary L; Mourao-Miranda, Janaina

    2016-01-01

    High comorbidity among pediatric disorders characterized by behavioral and emotional dysregulation poses problems for diagnosis and treatment, and suggests that these disorders may be better conceptualized as dimensions of abnormal behaviors. Furthermore, identifying neuroimaging biomarkers related to dimensional measures of behavior may provide targets to guide individualized treatment. We aimed to use functional neuroimaging and pattern regression techniques to determine whether patterns of brain activity could accurately decode individual-level severity on a dimensional scale measuring behavioral and emotional dysregulation at two different time points. A sample of fifty-seven youth (mean age: 14.5 years; 32 males) was selected from a multi-site study of youth with parent-reported behavioral and emotional dysregulation. Participants performed a block-design reward paradigm during functional Magnetic Resonance Imaging (fMRI). Pattern regression analyses consisted of Relevance Vector Regression (RVR) and two cross-validation strategies implemented in the Pattern Recognition for Neuroimaging toolbox (PRoNTo). Medication was treated as a binary confounding variable. Decoded and actual clinical scores were compared using Pearson's correlation coefficient (r) and mean squared error (MSE) to evaluate the models. A permutation test was applied to estimate significance levels. Relevance Vector Regression identified patterns of neural activity associated with symptoms of behavioral and emotional dysregulation at the initial study screen and close to the fMRI scanning session. The correlation and the mean squared error between actual and decoded symptoms were significant at the initial study screen and close to the fMRI scanning session. However, after controlling for potential medication effects, results remained significant only for decoding symptoms at the initial study screen. 
Neural regions with the highest contribution to the pattern regression model included cerebellum, sensory-motor and fronto-limbic areas. The combination of pattern regression models and neuroimaging can help to determine the severity of behavioral and emotional dysregulation in youth at different time points.

  13. Regression-assisted deconvolution.

    PubMed

    McIntyre, Julie; Stefanski, Leonard A

    2011-06-30

    We present a semi-parametric deconvolution estimator for the density function of a random variable X that is measured with error, a common challenge in many epidemiological studies. Traditional deconvolution estimators rely only on assumptions about the distribution of X and the error in its measurement, and ignore information available in auxiliary variables. Our method assumes the availability of a covariate vector statistically related to X by a mean-variance function regression model, where regression errors are normally distributed and independent of the measurement errors. Simulations suggest that the estimator achieves a much lower integrated squared error than the observed-data kernel density estimator when models are correctly specified and the assumption of normal regression errors is met. We illustrate the method using anthropometric measurements of newborns to estimate the density function of newborn length. Copyright © 2011 John Wiley & Sons, Ltd.

  14. Unified heat kernel regression for diffusion, kernel smoothing and wavelets on manifolds and its application to mandible growth modeling in CT images.

    PubMed

    Chung, Moo K; Qiu, Anqi; Seo, Seongho; Vorperian, Houri K

    2015-05-01

    We present a novel kernel regression framework for smoothing scalar surface data using the Laplace-Beltrami eigenfunctions. Starting with the heat kernel constructed from the eigenfunctions, we formulate a new bivariate kernel regression framework as a weighted eigenfunction expansion with the heat kernel as the weights. The new kernel method is mathematically equivalent to isotropic heat diffusion, kernel smoothing and recently popular diffusion wavelets. The numerical implementation is validated on a unit sphere using spherical harmonics. As an illustration, the method is applied to characterize the localized growth pattern of mandible surfaces obtained in CT images between ages 0 and 20 by regressing the length of displacement vectors with respect to a surface template. Copyright © 2015 Elsevier B.V. All rights reserved.
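
    The weighted eigenfunction expansion described above can be illustrated on a 1-D path graph, a toy stand-in for the Laplace-Beltrami operator on a surface mesh: the signal is expanded in Laplacian eigenvectors and each mode is damped by exp(-λt). This is a sketch under that simplification, not the authors' implementation:

```python
import numpy as np

def heat_kernel_smooth(signal, t=1.0):
    """Smooth a 1-D signal by expanding it in graph-Laplacian
    eigenvectors and damping each mode by exp(-lambda * t)."""
    n = len(signal)
    # path-graph Laplacian L = D - A (discrete second difference)
    L = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    L[0, 0] = L[-1, -1] = 1           # Neumann-like boundary rows
    lam, phi = np.linalg.eigh(L)      # eigenvalues / eigenfunctions
    coeffs = phi.T @ signal           # expansion coefficients
    return phi @ (np.exp(-lam * t) * coeffs)
```

    Since the constant vector is the zero-eigenvalue mode, constants pass through unchanged while high-frequency modes are damped, which is exactly the isotropic-diffusion behavior the paper's kernel regression is equivalent to.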

  15. Increasing the Efficacy of Oncolytic Adenovirus Vectors

    PubMed Central

    Toth, Karoly; Wold, William S. M.

    2010-01-01

    Oncolytic adenovirus (Ad) vectors present a new modality to treat cancer. These vectors attack tumors via replicating in and killing cancer cells. Upon completion of the vector replication cycle, the infected tumor cell lyses and releases progeny virions that are capable of infecting neighboring tumor cells. Repeated cycles of vector replication and cell lysis can destroy the tumor. Numerous Ad vectors have been generated and tested, some of them reaching human clinical trials. In 2005, the first oncolytic Ad was approved for the treatment of head-and-neck cancer by the Chinese FDA. Oncolytic Ads have been proven to be safe, with no serious adverse effects reported even when high doses of the vector were injected intravenously. The vectors demonstrated modest anti-tumor effect when applied as a single agent; their efficacy improved when they were combined with another modality. The efficacy of oncolytic Ads can be improved using various approaches, including vector design, delivery techniques, and ancillary treatment, which will be discussed in this review. PMID:21994711

  16. Eco-bio-social research on community-based approaches for Chagas disease vector control in Latin America.

    PubMed

    Gürtler, Ricardo E; Yadon, Zaida E

    2015-02-01

    This article provides an overview of three research projects which designed and implemented innovative interventions for Chagas disease vector control in Bolivia, Guatemala and Mexico. The research initiative was based on sound principles of community-based ecosystem management (ecohealth), integrated vector management, and interdisciplinary analysis. The initial situational analysis achieved a better understanding of ecological, biological and social determinants of domestic infestation. The key factors identified included: housing quality; type of peridomestic habitats; presence and abundance of domestic dogs, chickens and synanthropic rodents; proximity to public lights; and location in the periphery of the village. In Bolivia, plastering of mud walls with appropriate local materials and regular cleaning of beds and of clothes next to the walls substantially decreased domestic infestation and abundance of the insect vector Triatoma infestans. The Guatemalan project revealed close links between house infestation by rodents and Triatoma dimidiata, and vector infection with Trypanosoma cruzi. A novel community-operated rodent control program significantly reduced rodent infestation and bug infection. In Mexico, large-scale implementation of window screens translated into promising reductions in domestic infestation. A multi-pronged approach including community mobilisation and empowerment, intersectoral cooperation and adhesion to integrated vector management principles may be the key to sustainable vector and disease control in the affected regions. © World Health Organization 2015. The World Health Organization has granted Oxford University Press permission for the reproduction of this article.

  17. Aircraft Engine Thrust Estimator Design Based on GSA-LSSVM

    NASA Astrophysics Data System (ADS)

    Sheng, Hanlin; Zhang, Tianhong

    2017-08-01

    In view of the need for a highly precise and reliable thrust estimator to achieve direct thrust control of an aircraft engine, a GSA-LSSVM-based thrust estimator design is proposed, building on support vector regression (SVR), the least squares support vector machine (LSSVM), and a new optimization algorithm, the gravitational search algorithm (GSA), through integrated modelling and parameter optimization. The results show that, compared to the particle swarm optimization (PSO) algorithm, GSA finds the unknown optimization parameters better and yields a model with better prediction and generalization ability. The model can better predict aircraft engine thrust and thus fulfills the needs of direct thrust control of aircraft engines.
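
    The LSSVM regressor itself reduces to a single linear solve in the dual variables, which is what makes it attractive as a fast estimator. A minimal sketch follows, with fixed RBF-kernel hyperparameters standing in for the GSA-optimized ones (the hyperparameter values and function names here are illustrative):

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian RBF kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma=100.0, sigma=1.0):
    """LSSVM regression: solve the dual linear system
    [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    K = rbf_kernel(X, X, sigma)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]            # bias b, dual weights alpha

def lssvm_predict(X_train, b, alpha, X_new, sigma=1.0):
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b
```

    In the paper's setting, GSA would search over gamma and sigma to minimize a validation error, replacing the fixed values assumed here.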

  18. "Analytical" vector-functions I

    NASA Astrophysics Data System (ADS)

    Todorov, Vladimir Todorov

    2017-12-01

    In this note we try to give a new (or different) approach to the investigation of analytical vector functions. More precisely, a notion of a power x^n, n ∈ ℕ⁺, of a vector x ∈ ℝ³ is introduced, which allows one to define an "analytical" function f : ℝ³ → ℝ³. Let furthermore f(ξ) = Σ_{n=0}^∞ a_n ξ^n be an analytical function of the real variable ξ. Here we replace the power ξ^n of the number ξ with the power of a vector x ∈ ℝ³ to obtain a vector "power series" f(x) = Σ_{n=0}^∞ a_n x^n. We investigate some properties of the vector series as well as some applications of this idea. Note that an "analytical" vector function does not depend on any basis, which may be useful in research into some problems in physics.

  19. Engineered Lentivector Targeting of Dendritic Cells for In Vivo Immunization

    PubMed Central

    Yang, Lili; Yang, Haiguang; Rideout, Kendra; Cho, Taehoon; Joo, Kye il; Ziegler, Leslie; Elliot, Abigail; Walls, Anthony; Yu, Dongzi; Baltimore, David; Wang, Pin

    2008-01-01

    We report a method of inducing antigen production in dendritic cells (DCs) by in vivo targeting with lentiviral vectors that specifically bind to the DC surface protein, DC-SIGN. To target the DCs, the lentivector was enveloped with a viral glycoprotein from Sindbis virus, engineered to be DC-SIGN-specific. In vitro, this lentivector specifically transduced DCs and induced DC maturation. A remarkable frequency (up to 12%) of ovalbumin (OVA)-specific CD8+ T cells and a significant antibody response were observed 2 weeks following injection of a targeted lentiviral vector encoding an OVA transgene into naïve mice. These mice were solidly protected against the growth of the OVA-expressing E.G7 tumor and this methodology could even induce regression of an established tumor. Thus, lentiviral vectors targeting DCs provide a simple method of producing effective immunity and may provide an alternative route for immunization with protein antigens. PMID:18297056

  20. A transposase strategy for creating libraries of circularly permuted proteins.

    PubMed

    Mehta, Manan M; Liu, Shirley; Silberg, Jonathan J

    2012-05-01

    A simple approach for creating libraries of circularly permuted proteins is described that is called PERMutation Using Transposase Engineering (PERMUTE). In PERMUTE, the transposase MuA is used to randomly insert a minitransposon that can function as a protein expression vector into a plasmid that contains the open reading frame (ORF) being permuted. A library of vectors that express different permuted variants of the ORF-encoded protein is created by: (i) using bacteria to select for target vectors that acquire an integrated minitransposon; (ii) excising the ensemble of ORFs that contain an integrated minitransposon from the selected vectors; and (iii) circularizing the ensemble of ORFs containing integrated minitransposons using intramolecular ligation. Construction of a Thermotoga neapolitana adenylate kinase (AK) library using PERMUTE revealed that this approach produces vectors that express circularly permuted proteins with distinct sequence diversity from existing methods. In addition, selection of this library for variants that complement the growth of Escherichia coli with a temperature-sensitive AK identified functional proteins with novel architectures, suggesting that PERMUTE will be useful for the directed evolution of proteins with new functions.

  1. A transposase strategy for creating libraries of circularly permuted proteins

    PubMed Central

    Mehta, Manan M.; Liu, Shirley; Silberg, Jonathan J.

    2012-01-01

    A simple approach for creating libraries of circularly permuted proteins is described that is called PERMutation Using Transposase Engineering (PERMUTE). In PERMUTE, the transposase MuA is used to randomly insert a minitransposon that can function as a protein expression vector into a plasmid that contains the open reading frame (ORF) being permuted. A library of vectors that express different permuted variants of the ORF-encoded protein is created by: (i) using bacteria to select for target vectors that acquire an integrated minitransposon; (ii) excising the ensemble of ORFs that contain an integrated minitransposon from the selected vectors; and (iii) circularizing the ensemble of ORFs containing integrated minitransposons using intramolecular ligation. Construction of a Thermotoga neapolitana adenylate kinase (AK) library using PERMUTE revealed that this approach produces vectors that express circularly permuted proteins with distinct sequence diversity from existing methods. In addition, selection of this library for variants that complement the growth of Escherichia coli with a temperature-sensitive AK identified functional proteins with novel architectures, suggesting that PERMUTE will be useful for the directed evolution of proteins with new functions. PMID:22319214

  2. Plant Virus–Insect Vector Interactions: Current and Potential Future Research Directions

    PubMed Central

    Dietzgen, Ralf G.; Mann, Krin S.; Johnson, Karyn N.

    2016-01-01

    Acquisition and transmission by an insect vector is central to the infection cycle of the majority of plant pathogenic viruses. Plant viruses can interact with their insect host in a variety of ways including both non-persistent and circulative transmission; in some cases, the latter involves virus replication in cells of the insect host. Replicating viruses can also elicit both innate and specific defense responses in the insect host. A consistent feature is that the interaction of the virus with its insect host/vector requires specific molecular interactions between virus and host, commonly via proteins. Understanding the interactions between plant viruses and their insect host can underpin approaches to protect plants from infection by interfering with virus uptake and transmission. Here, we provide a perspective focused on identifying novel approaches and research directions to facilitate control of plant viruses by better understanding and targeting virus–insect molecular interactions. We also draw parallels with molecular interactions in insect vectors of animal viruses, and consider technical advances for their control that may be more broadly applicable to plant virus vectors. PMID:27834855

  3. Plant Virus-Insect Vector Interactions: Current and Potential Future Research Directions.

    PubMed

    Dietzgen, Ralf G; Mann, Krin S; Johnson, Karyn N

    2016-11-09

    Acquisition and transmission by an insect vector is central to the infection cycle of the majority of plant pathogenic viruses. Plant viruses can interact with their insect host in a variety of ways including both non-persistent and circulative transmission; in some cases, the latter involves virus replication in cells of the insect host. Replicating viruses can also elicit both innate and specific defense responses in the insect host. A consistent feature is that the interaction of the virus with its insect host/vector requires specific molecular interactions between virus and host, commonly via proteins. Understanding the interactions between plant viruses and their insect host can underpin approaches to protect plants from infection by interfering with virus uptake and transmission. Here, we provide a perspective focused on identifying novel approaches and research directions to facilitate control of plant viruses by better understanding and targeting virus-insect molecular interactions. We also draw parallels with molecular interactions in insect vectors of animal viruses, and consider technical advances for their control that may be more broadly applicable to plant virus vectors.

  4. A Vector Library for Silencing Central Carbon Metabolism Genes with Antisense RNAs in Escherichia coli

    PubMed Central

    Ohno, Satoshi; Yoshikawa, Katsunori; Shimizu, Hiroshi; Tamura, Tomohiro

    2014-01-01

    We describe here the construction of a series of 71 vectors to silence central carbon metabolism genes in Escherichia coli. The vectors inducibly express antisense RNAs called paired-terminus antisense RNAs, which have a higher silencing efficacy than ordinary antisense RNAs. By measuring mRNA amounts, measuring activities of target proteins, or observing specific phenotypes, it was confirmed that all the vectors were able to silence the expression of target genes efficiently. Using this vector set, each of the central carbon metabolism genes was silenced individually, and the accumulation of metabolites was investigated. We were able to obtain accurate information on ways to increase the production of pyruvate, an industrially valuable compound, from the silencing results. Furthermore, the experimental results of pyruvate accumulation were compared to in silico predictions, and both sets of results were consistent. Compared to the gene disruption approach, the silencing approach has an advantage in that any E. coli strain can be used and multiple gene silencing is easily possible in any combination. PMID:24212579

  5. Structured caustic vector vortex optical field: manipulating optical angular momentum flux and polarization rotation.

    PubMed

    Chen, Rui-Pin; Chen, Zhaozhong; Chew, Khian-Hooi; Li, Pei-Gang; Yu, Zhongliang; Ding, Jianping; He, Sailing

    2015-05-29

    A caustic vector vortex optical field is experimentally generated and demonstrated by a caustic-based approach. The desired caustic with arbitrary acceleration trajectories, as well as the structured states of polarization (SoP) and vortex orders located in different positions in the field cross-section, is generated by imposing the corresponding spatial phase function in a vector vortex optical field. Our study reveals that different spin and orbital angular momentum flux distributions (including opposite directions) in different positions in the cross-section of a caustic vector vortex optical field can be dynamically managed during propagation by intentionally choosing the initial polarization and vortex topological charges, as a result of the modulation of the caustic phase. We find that the SoP in the field cross-section rotates during propagation due to the existence of the vortex. The unique structured feature of the caustic vector vortex optical field opens the possibility of multi-manipulation of optical angular momentum fluxes and SoP, leading to more complex manipulation of the optical field scenarios. Thus this approach further expands the functionality of an optical system.

  6. Reflections on the Anopheles gambiae genome sequence, transgenic mosquitoes and the prospect for controlling malaria and other vector borne diseases.

    PubMed

    Tabachnick, Walter J

    2003-09-01

    The completion of the Anopheles gambiae Giles genome sequencing project is a milestone toward developing more effective strategies for reducing the impact of malaria and other vector borne diseases. The successes in developing transgenic approaches using mosquitoes have provided another essential new tool for further progress in basic vector genetics and the goal of disease control. The use of transgenic approaches to develop refractory mosquitoes is also possible. The ability to use genome sequence to identify genes, and transgenic approaches to construct refractory mosquitoes, has provided the prospect that, with the future development of an appropriate genetic drive system, refractory transgenes can be released into vector populations, leading to An. gambiae populations incapable of transmitting malaria. This compelling strategy will be very difficult to achieve and will require a broad, substantial research program for success. The fundamental information that is required on genome structure, gene function and environmental effects on genetic expression is largely unknown. The ability to predict gene effects on phenotype is rudimentary, particularly in natural populations. As a result, the release of a refractory transgene into natural mosquito populations is imprecise and there is little ability to predict unintended consequences. The new genetic tools at hand provide opportunities to address an array of important issues, many of which can have immediate impact on the effectiveness of a host of strategies to control vector borne disease. Transgenic release approaches represent only one strategy that should be pursued. A balanced research program is required.

  7. Propensity score estimation: machine learning and classification methods as alternatives to logistic regression

    PubMed Central

    Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson

    2010-01-01

    Objective: Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression which may accomplish the same goals but with fewer assumptions or greater accuracy. Study Design and Setting: We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. Results: We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (CART), and meta-classifiers (in particular, boosting). Conclusion: While the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and to a lesser extent decision trees (particularly CART) appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. PMID:20630332
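    As a minimal, hypothetical sketch of the comparison this review describes (assuming scikit-learn and synthetic data; this is not the review's own experiment), propensity scores can be estimated on the same covariates with a main-effects logistic model and with one of the discussed alternatives, boosting:

```python
# Sketch: propensity score estimation with logistic regression vs. boosting.
# Synthetic data only; scikit-learn is assumed to be available.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 3))                      # measured confounders
# treatment assignment depends nonlinearly on the confounders
logit = 0.8 * x[:, 0] + 0.5 * x[:, 1] ** 2 - 1.0
treated = rng.random(n) < 1 / (1 + np.exp(-logit))

lr = LogisticRegression().fit(x, treated)
gbm = GradientBoostingClassifier(random_state=0).fit(x, treated)

ps_lr = lr.predict_proba(x)[:, 1]    # main-effects logistic propensity scores
ps_gbm = gbm.predict_proba(x)[:, 1]  # boosting can pick up the x1**2 term itself

print(ps_lr[:3], ps_gbm[:3])
```

    Here boosting can capture the quadratic term without the analyst specifying it, which is the kind of assumption-lightening the review highlights; whether that translates into better effect estimates is exactly what the called-for simulation studies would need to establish.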

  8. Infinite-Dimensional Symmetry Algebras as a Help Toward Solutions of the Self-Dual Field Equations with One Killing Vector

    NASA Astrophysics Data System (ADS)

    Finley, Daniel; McIver, John K.

    2002-12-01

    The sDiff(2) Toda equation determines all self-dual, vacuum solutions of the Einstein field equations with one rotational Killing vector. Some history of the searches for non-trivial solutions is given, including those that begin with the limit as n → ∞ of the An Toda lattice equations. That approach is applied here to the known prolongation structure for the Toda lattice, hoping to use Bäcklund transformations to generate new solutions. Although this attempt has not yet succeeded, new faithful (tangent-vector) realizations of A∞ are described, and a direct approach via the continuum Lie algebras of Saveliev and Leznov is given.

  9. Adding localization information in a fingerprint binary feature vector representation

    NASA Astrophysics Data System (ADS)

    Bringer, Julien; Despiegel, Vincent; Favre, Mélanie

    2011-06-01

    At BTAS'10, a new framework to transform a fingerprint minutiae template into a binary feature vector of fixed length is described. A fingerprint is characterized by its similarity with a fixed number set of representative local minutiae vicinities. This approach by representative leads to a fixed length binary representation, and, as the approach is local, it enables to deal with local distortions that may occur between two acquisitions. We extend this construction to incorporate additional information in the binary vector, in particular on localization of the vicinities. We explore the use of position and orientation information. The performance improvement is promising for utilization into fast identification algorithms or into privacy protection algorithms.

  10. Vector Autoregression, Structural Equation Modeling, and Their Synthesis in Neuroimaging Data Analysis

    PubMed Central

    Chen, Gang; Glen, Daniel R.; Saad, Ziad S.; Hamilton, J. Paul; Thomason, Moriah E.; Gotlib, Ian H.; Cox, Robert W.

    2011-01-01

    Vector autoregression (VAR) and structural equation modeling (SEM) are two popular brain-network modeling tools. VAR, which is a data-driven approach, assumes that connected regions exert time-lagged influences on one another. In contrast, the hypothesis-driven SEM is used to validate an existing connectivity model where connected regions have contemporaneous interactions among them. We present the two models in detail and discuss their applicability to FMRI data, and interpretational limits. We also propose a unified approach that models both lagged and contemporaneous effects. The unifying model, structural vector autoregression (SVAR), may improve statistical and explanatory power, and avoids some prevalent pitfalls that can occur when VAR and SEM are utilized separately. PMID:21975109
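    To make the VAR side concrete, here is an illustrative sketch (not the authors' SVAR implementation) of estimating the lag-1 influence matrix of a two-region system by ordinary least squares on synthetic series; with real FMRI data the same regression would run on the measured time series:

```python
# Sketch: fitting a lag-1 VAR, y[t] = A y[t-1] + noise, by least squares.
# Synthetic two-"region" data with a known influence matrix A_true.
import numpy as np

rng = np.random.default_rng(1)
A_true = np.array([[0.5, 0.2],    # region 1 driven by its own past and region 2
                   [0.0, 0.4]])   # region 2 driven only by its own past
T = 5000
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A_true @ y[t - 1] + rng.normal(scale=0.1, size=2)

# Stack the regression: rows of Y are y[t], rows of X are y[t-1]
X, Y = y[:-1], y[1:]
A_hat = np.linalg.lstsq(X, Y, rcond=None)[0].T

print(np.round(A_hat, 2))
```

    The off-diagonal entries of `A_hat` are the estimated time-lagged influences between regions; SVAR adds a contemporaneous interaction matrix on top of this lagged structure.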

  11. Computing energy levels of CH4, CHD3, CH3D, and CH3F with a direct product basis and coordinates based on the methyl subsystem.

    PubMed

    Zhao, Zhiqiang; Chen, Jun; Zhang, Zhaojun; Zhang, Dong H; Wang, Xiao-Gang; Carrington, Tucker; Gatti, Fabien

    2018-02-21

    Quantum mechanical calculations of ro-vibrational energies of CH4, CHD3, CH3D, and CH3F were made with two different numerical approaches. Both use polyspherical coordinates. The computed energy levels agree, confirming the accuracy of the methods. In the first approach, for all the molecules, the coordinates are defined using three Radau vectors for the CH3 subsystem and a Jacobi vector between the remaining atom and the centre of mass of CH3. Euler angles specifying the orientation of a frame attached to CH3 with respect to a frame attached to the Jacobi vector are used as vibrational coordinates. A direct product potential-optimized discrete variable vibrational basis is used to build a Hamiltonian matrix. Ro-vibrational energies are computed using a re-started Arnoldi eigensolver. In the second approach, the coordinates are the spherical coordinates associated with four Radau vectors or three Radau vectors and a Jacobi vector, and the frame is an Eckart frame. Vibrational basis functions are products of contracted stretch and bend functions, and eigenvalues are computed with the Lanczos algorithm. For CH4, CHD3, and CH3D, we report the first J > 0 energy levels computed on the Wang-Carrington potential energy surface [X.-G. Wang and T. Carrington, J. Chem. Phys. 141(15), 154106 (2014)]. For CH3F, the potential energy surface of Zhao et al. [J. Chem. Phys. 144, 204302 (2016)] was used. All the results are in good agreement with experimental data.

  12. Computing energy levels of CH4, CHD3, CH3D, and CH3F with a direct product basis and coordinates based on the methyl subsystem

    NASA Astrophysics Data System (ADS)

    Zhao, Zhiqiang; Chen, Jun; Zhang, Zhaojun; Zhang, Dong H.; Wang, Xiao-Gang; Carrington, Tucker; Gatti, Fabien

    2018-02-01

    Quantum mechanical calculations of ro-vibrational energies of CH4, CHD3, CH3D, and CH3F were made with two different numerical approaches. Both use polyspherical coordinates. The computed energy levels agree, confirming the accuracy of the methods. In the first approach, for all the molecules, the coordinates are defined using three Radau vectors for the CH3 subsystem and a Jacobi vector between the remaining atom and the centre of mass of CH3. Euler angles specifying the orientation of a frame attached to CH3 with respect to a frame attached to the Jacobi vector are used as vibrational coordinates. A direct product potential-optimized discrete variable vibrational basis is used to build a Hamiltonian matrix. Ro-vibrational energies are computed using a re-started Arnoldi eigensolver. In the second approach, the coordinates are the spherical coordinates associated with four Radau vectors or three Radau vectors and a Jacobi vector, and the frame is an Eckart frame. Vibrational basis functions are products of contracted stretch and bend functions, and eigenvalues are computed with the Lanczos algorithm. For CH4, CHD3, and CH3D, we report the first J > 0 energy levels computed on the Wang-Carrington potential energy surface [X.-G. Wang and T. Carrington, J. Chem. Phys. 141(15), 154106 (2014)]. For CH3F, the potential energy surface of Zhao et al. [J. Chem. Phys. 144, 204302 (2016)] was used. All the results are in good agreement with experimental data.

  13. Fractal vector optical fields.

    PubMed

    Pan, Yue; Gao, Xu-Zhen; Cai, Meng-Qiang; Zhang, Guan-Lin; Li, Yongnan; Tu, Chenghou; Wang, Hui-Tian

    2016-07-15

    We introduce the concept of a fractal, which provides an alternative approach for flexibly engineering optical fields and their focal fields. We propose, design, and create a new family of optical fields, fractal vector optical fields, which build a bridge between fractals and vector optical fields. The fractal vector optical fields have polarization states exhibiting fractal geometry, and may also involve the phase and/or amplitude simultaneously. The results reveal that the focal fields exhibit self-similarity, and that the hierarchy of the fractal plays a "weeding" role. The fractal can be used to engineer the focal field.

  14. New Term Weighting Formulas for the Vector Space Method in Information Retrieval

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chisholm, E.; Kolda, T.G.

    The goal in information retrieval is to enable users to automatically and accurately find data relevant to their queries. One possible approach to this problem is to use the vector space model, which models documents and queries as vectors in the term space. The components of the vectors are determined by the term weighting scheme, a function of the frequencies of the terms in the document or query as well as throughout the collection. We discuss popular term weighting schemes and present several new schemes that offer improved performance.
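    The report's new formulas are not reproduced in this record, but the general shape of a term weighting scheme can be sketched with the classic log-tf x idf weighting on a toy corpus:

```python
# Sketch of one classic vector-space term weighting: (1 + log tf) * log(N/df).
# Toy corpus only; the report's own new weighting formulas are not shown here.
import math

docs = [
    "vector space model for information retrieval",
    "term weighting in the vector space model",
    "query vectors and document vectors",
]
corpus = [d.split() for d in docs]
vocab = sorted({t for d in corpus for t in d})
N = len(corpus)
df = {t: sum(t in d for d in corpus) for t in vocab}   # document frequency

def weight(term, doc):
    tf = doc.count(term)                                # term frequency in this doc
    if tf == 0:
        return 0.0
    return (1 + math.log(tf)) * math.log(N / df[term])  # log-tf times idf

vec0 = [weight(t, corpus[0]) for t in vocab]            # document 0 as a vector
print({t: round(w, 3) for t, w in zip(vocab, vec0) if w > 0})
```

    Rarer terms receive larger idf factors and so dominate the document vector; the schemes compared in the report vary exactly these tf and idf components.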

  15. Analysis of near infrared spectra for age-grading of wild populations of Anopheles gambiae.

    PubMed

    Krajacich, Benjamin J; Meyers, Jacob I; Alout, Haoues; Dabiré, Roch K; Dowell, Floyd E; Foy, Brian D

    2017-11-07

    Understanding the age structure of mosquito populations, especially of malaria vectors such as Anopheles gambiae, is important for assessing the risk of infectious mosquitoes and how vector control interventions may impact this risk. The use of near-infrared spectroscopy (NIRS) for age-grading has been demonstrated previously on laboratory and semi-field mosquitoes, but to date it has not been utilized on wild-caught mosquitoes whose age is externally validated via parity status or parasite infection stage. In this study, we developed regression and classification models using NIRS on datasets of wild An. gambiae (s.l.) reared from larvae collected from the field in Burkina Faso, and two laboratory strains. We compared the accuracy of these models for predicting the ages of wild-caught mosquitoes that had been scored for their parity status as well as for positivity for Plasmodium sporozoites. Regression models utilizing variable selection increased predictive accuracy over the more common full-spectrum partial least squares (PLS) approach for cross-validation of the datasets, validation, and independent test sets. Models produced from datasets that included the greatest range of mosquito samples (i.e. different sampling locations and times) had the highest predictive accuracy on independent testing sets, though overall accuracy on these samples was low. For classification, we found that intramodel accuracy ranged from 73.5% to 97.0% for grouping of mosquitoes into "early" and "late" age classes, with the highest prediction accuracy found in laboratory colonized mosquitoes. However, this accuracy decreased on test sets, with the highest classification accuracy on an independent set of wild-caught larvae reared to set ages being 69.6%. Variation in NIRS data, likely from dietary, genetic, and other factors, limits the accuracy of this technique with wild-caught mosquitoes. Alternative algorithms may help improve prediction accuracy, but care should be taken to either maximize variety in models or minimize confounders.

  16. Epithelial–mesenchymal transition biomarkers and support vector machine guided model in preoperatively predicting regional lymph node metastasis for rectal cancer

    PubMed Central

    Fan, X-J; Wan, X-B; Huang, Y; Cai, H-M; Fu, X-H; Yang, Z-L; Chen, D-K; Song, S-X; Wu, P-H; Liu, Q; Wang, L; Wang, J-P

    2012-01-01

    Background: Current imaging modalities are inadequate in preoperatively predicting regional lymph node metastasis (RLNM) status in rectal cancer (RC). Here, we designed a support vector machine (SVM) model to address this issue by integrating epithelial–mesenchymal-transition (EMT)-related biomarkers along with clinicopathological variables. Methods: Using tissue microarrays and immunohistochemistry, the expression of EMT-related biomarkers was measured in 193 RC patients. Of these, 74 patients were assigned to the training set to select the robust variables for designing the SVM model. The SVM model's predictive value was validated in the testing set (119 patients). Results: In the training set, eight variables, including six EMT-related biomarkers and two clinicopathological variables, were selected to devise the SVM model. In the testing set, we identified 63 patients at high risk of RLNM and 56 patients at low risk. The sensitivity, specificity and overall accuracy of the SVM in predicting RLNM were 68.3%, 81.1% and 72.3%, respectively. Importantly, multivariate logistic regression analysis showed that the SVM model was indeed an independent predictor of RLNM status (odds ratio, 11.536; 95% confidence interval, 4.113–32.361; P<0.0001). Conclusion: Our SVM-based model displayed moderately strong predictive power in defining RLNM status in RC patients, providing an important approach to select the RLNM high-risk subgroup for neoadjuvant chemoradiotherapy. PMID:22538975
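    The reported sensitivity, specificity and accuracy are standard confusion-matrix quantities; a small helper makes their definitions explicit (the counts below are illustrative, not the study's raw data):

```python
# Standard diagnostic metrics from a 2x2 confusion matrix.
# tp/fp/tn/fn counts here are illustrative only.
def diagnostics(tp, fp, tn, fn):
    return {
        "sensitivity": tp / (tp + fn),   # true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

print(diagnostics(tp=30, fp=10, tn=50, fn=10))
```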

  17. Integrating Transgenic Vector Manipulation with Clinical Interventions to Manage Vector-Borne Diseases.

    PubMed

    Okamoto, Kenichi W; Gould, Fred; Lloyd, Alun L

    2016-03-01

    Many vector-borne diseases lack effective vaccines and medications, and the limitations of traditional vector control have inspired novel approaches based on using genetic engineering to manipulate vector populations and thereby reduce transmission. Yet both the short- and long-term epidemiological effects of these transgenic strategies are highly uncertain. If neither vaccines, medications, nor transgenic strategies can by themselves suffice for managing vector-borne diseases, integrating these approaches becomes key. Here we develop a framework to evaluate how clinical interventions (i.e., vaccination and medication) can be integrated with transgenic vector manipulation strategies to prevent disease invasion and reduce disease incidence. We show that the ability of clinical interventions to accelerate disease suppression can depend on the nature of the transgenic manipulation deployed (e.g., whether vector population reduction or replacement is attempted). We find that making a specific, individual strategy highly effective may not be necessary for attaining public-health objectives, provided suitable combinations can be adopted. However, we show how combining only partially effective antimicrobial drugs or vaccination with transgenic vector manipulations that merely temporarily lower vector competence can amplify disease resurgence following transient suppression. Thus, transgenic vector manipulation that cannot be sustained can have adverse consequences, which ineffective clinical interventions can at best only mitigate, and at worst temporarily exacerbate. This result, which arises from differences between the time scale on which the interventions affect disease dynamics and the time scale of host population dynamics, highlights the importance of accounting for the potential delay in the effects of deploying public health strategies on long-term disease incidence. We find that for systems at the disease-endemic equilibrium, even modest perturbations induced by weak interventions can exhibit strong, albeit transient, epidemiological effects. This, together with our finding that under some conditions combining strategies could have transient adverse epidemiological effects, suggests that a relatively long time horizon may be necessary to discern the efficacy of alternative intervention strategies.

  18. Integrating Transgenic Vector Manipulation with Clinical Interventions to Manage Vector-Borne Diseases

    PubMed Central

    Okamoto, Kenichi W.; Gould, Fred; Lloyd, Alun L.

    2016-01-01

    Many vector-borne diseases lack effective vaccines and medications, and the limitations of traditional vector control have inspired novel approaches based on using genetic engineering to manipulate vector populations and thereby reduce transmission. Yet both the short- and long-term epidemiological effects of these transgenic strategies are highly uncertain. If neither vaccines, medications, nor transgenic strategies can by themselves suffice for managing vector-borne diseases, integrating these approaches becomes key. Here we develop a framework to evaluate how clinical interventions (i.e., vaccination and medication) can be integrated with transgenic vector manipulation strategies to prevent disease invasion and reduce disease incidence. We show that the ability of clinical interventions to accelerate disease suppression can depend on the nature of the transgenic manipulation deployed (e.g., whether vector population reduction or replacement is attempted). We find that making a specific, individual strategy highly effective may not be necessary for attaining public-health objectives, provided suitable combinations can be adopted. However, we show how combining only partially effective antimicrobial drugs or vaccination with transgenic vector manipulations that merely temporarily lower vector competence can amplify disease resurgence following transient suppression. Thus, transgenic vector manipulation that cannot be sustained can have adverse consequences, which ineffective clinical interventions can at best only mitigate, and at worst temporarily exacerbate. This result, which arises from differences between the time scale on which the interventions affect disease dynamics and the time scale of host population dynamics, highlights the importance of accounting for the potential delay in the effects of deploying public health strategies on long-term disease incidence. We find that for systems at the disease-endemic equilibrium, even modest perturbations induced by weak interventions can exhibit strong, albeit transient, epidemiological effects. This, together with our finding that under some conditions combining strategies could have transient adverse epidemiological effects, suggests that a relatively long time horizon may be necessary to discern the efficacy of alternative intervention strategies. PMID:26962871

  19. Comparison of Nine Statistical Model Based Warfarin Pharmacogenetic Dosing Algorithms Using the Racially Diverse International Warfarin Pharmacogenetic Consortium Cohort Database

    PubMed Central

    Liu, Rong; Li, Xi; Zhang, Wei; Zhou, Hong-Hao

    2015-01-01

    Objective: Multiple linear regression (MLR) and machine learning techniques in pharmacogenetic algorithm-based warfarin dosing have been reported. However, the performances of these algorithms in a racially diverse group have never been objectively evaluated and compared. In this literature-based study, we compared the performances of eight machine learning techniques with those of MLR in a large, racially diverse cohort. Methods: MLR, artificial neural network (ANN), regression tree (RT), multivariate adaptive regression splines (MARS), boosted regression tree (BRT), support vector regression (SVR), random forest regression (RFR), lasso regression (LAR) and Bayesian additive regression trees (BART) were applied in warfarin dose algorithms in a cohort from the International Warfarin Pharmacogenetics Consortium database. Covariates obtained by stepwise regression from 80% of randomly selected patients were used to develop the algorithms. To compare the performances of these algorithms, the mean percentage of patients whose predicted dose fell within 20% of the actual dose (mean percentage within 20%) and the mean absolute error (MAE) were calculated in the remaining 20% of patients. The performances of these techniques in different races, as well as across the dose ranges of therapeutic warfarin, were compared. Robust results were obtained after 100 rounds of resampling. Results: BART, MARS and SVR were statistically indistinguishable and significantly outperformed all the other approaches in the whole cohort (MAE: 8.84–8.96 mg/week, mean percentage within 20%: 45.88%–46.35%). In the White population, MARS and BART showed a higher mean percentage within 20% and lower MAE than MLR (all p values < 0.05). In the Asian population, SVR, BART, MARS and LAR performed the same as MLR. MLR and LAR performed best in the Black population. When patients were grouped in terms of warfarin dose range, all machine learning techniques except ANN and LAR showed a significantly higher mean percentage within 20% and lower MAE (all p values < 0.05) than MLR in the low- and high-dose ranges. Conclusion: Overall, the machine learning-based techniques BART, MARS and SVR performed better than MLR in warfarin pharmacogenetic dosing. Differences in the algorithms' performance exist among races. Moreover, machine learning-based algorithms tended to perform better than MLR in the low- and high-dose ranges. PMID:26305568
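    The two evaluation criteria used throughout this comparison can be written out directly (toy doses, not the consortium data):

```python
# The study's two metrics: mean absolute error, and the percentage of
# patients whose predicted weekly dose falls within 20% of the actual dose.
# The dose values below are illustrative only.
def mae(pred, actual):
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(pred)

def pct_within_20(pred, actual):
    hits = sum(abs(p - a) <= 0.2 * a for p, a in zip(pred, actual))
    return 100.0 * hits / len(pred)

actual = [21.0, 35.0, 52.5, 28.0]   # mg/week
pred   = [25.0, 33.0, 40.0, 27.0]

print(mae(pred, actual), pct_within_20(pred, actual))
```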

  20. Analytic calculation of finite-population reproductive numbers for direct- and vector-transmitted diseases with homogeneous mixing.

    PubMed

    Keegan, Lindsay; Dushoff, Jonathan

    2014-05-01

    The basic reproductive number, R0, provides a foundation for evaluating how various factors affect the incidence of infectious diseases. Recently, it has been suggested that, particularly for vector-transmitted diseases, R0 should be modified to account for the effects of finite host population within a single disease transmission generation. Here, we use a transmission factor approach to calculate such "finite-population reproductive numbers," under the assumption of homogeneous mixing, for both vector-borne and directly transmitted diseases. In the case of vector-borne diseases, we estimate finite-population reproductive numbers for both host-to-host and vector-to-vector generations, assuming that the vector population is effectively infinite. We find simple, interpretable formulas for all three of these quantities. In the direct case, we find that finite-population reproductive numbers diverge from R0 before R0 reaches half of the population size. In the vector-transmitted case, we find that the host-to-host number diverges at even lower values of R0, while the vector-to-vector number diverges very little over realistic parameter ranges.
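    The paper's exact formulas are not given in this record, but the flavor of a homogeneous-mixing finite-size correction can be illustrated with a generic collision argument (this is an illustration, not the authors' derivation): if an index case makes R0 infectious contacts, each landing on one of N hosts uniformly at random, the expected number of distinct hosts contacted falls below R0, and the gap grows as R0 approaches N.

```python
# Generic finite-population illustration (not the paper's formulas):
# expected number of *distinct* hosts hit by R0 uniform-random contacts
# among N hosts is N * (1 - (1 - 1/N)**R0), since each host escapes all
# R0 contacts with probability (1 - 1/N)**R0.
def expected_distinct(R0, N):
    return N * (1 - (1 - 1 / N) ** R0)

for R0 in (1, 5, 25, 50):
    print(R0, round(expected_distinct(R0, N=100), 2))
```

    For R0 = 50 contacts among N = 100 hosts, only about 39.5 distinct hosts are expected, illustrating why finite-population reproductive numbers diverge from R0 well before R0 reaches the population size.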

  1. Recent progresses in gene delivery-based bone tissue engineering.

    PubMed

    Lu, Chia-Hsin; Chang, Yu-Han; Lin, Shih-Yeh; Li, Kuei-Chang; Hu, Yu-Chen

    2013-12-01

    Gene therapy has converged with bone engineering over the past decade, by which a variety of therapeutic genes have been delivered to stimulate bone repair. These genes can be administered via in vivo or ex vivo approach using either viral or nonviral vectors. This article reviews the fundamental aspects and recent progresses in the gene therapy-based bone engineering, with emphasis on the new genes, viral vectors and gene delivery approaches. © 2013.

  2. Multiple regression analysis in nomogram development for myopic wavefront laser in situ keratomileusis: Improving astigmatic outcomes.

    PubMed

    Allan, Bruce D; Hassan, Hala; Ieong, Alvin

    2015-05-01

    To describe and evaluate a new multiple regression-derived nomogram for myopic wavefront laser in situ keratomileusis (LASIK). Moorfields Eye Hospital, London, United Kingdom. Prospective comparative case series. Multiple regression modeling was used to derive a simplified formula for adjusting attempted spherical correction in myopic LASIK. An adaptation of Thibos' power vector method was then applied to derive adjustments to attempted cylindrical correction in eyes with 1.0 diopter (D) or more of preoperative cylinder. These elements were combined in a new nomogram (nomogram II). The 3-month refractive results for myopic wavefront LASIK (spherical equivalent ≤11.0 D; cylinder ≤4.5 D) were compared between 299 consecutive eyes treated using the earlier nomogram (nomogram I) in 2009 and 2010 and 414 eyes treated using nomogram II in 2011 and 2012. There was no significant difference in treatment accuracy (variance in the postoperative manifest refraction spherical equivalent error) between nomogram I and nomogram II (P = .73, Bartlett test). Fewer patients treated with nomogram II had more than 0.5 D of residual postoperative astigmatism (P = .0001, Fisher exact test). There was no significant coupling between adjustments to the attempted cylinder and the achieved sphere (P = .18, t test). Discarding marginal influences from a multiple regression-derived nomogram for myopic wavefront LASIK had no clinically significant effect on treatment accuracy. Thibos' power vector method can be used to guide adjustments to the treatment cylinder alongside nomograms designed to optimize postoperative spherical equivalent results in myopic LASIK. Copyright © 2015 ASCRS and ESCRS. Published by Elsevier Inc. All rights reserved.

  3. Support vector machine learning model for the prediction of sentinel node status in patients with cutaneous melanoma.

    PubMed

    Mocellin, Simone; Ambrosi, Alessandro; Montesco, Maria Cristina; Foletto, Mirto; Zavagno, Giorgio; Nitti, Donato; Lise, Mario; Rossi, Carlo Riccardo

    2006-08-01

    Currently, approximately 80% of melanoma patients undergoing sentinel node biopsy (SNB) have negative sentinel lymph nodes (SLNs), and no prediction system is reliable enough to be implemented in the clinical setting to reduce the number of SNB procedures. In this study, the predictive power of support vector machine (SVM)-based statistical analysis was tested. The clinical records of 246 patients who underwent SNB at our institution were used for this analysis. The following clinicopathologic variables were considered: the patient's age and sex and the tumor's histological subtype, Breslow thickness, Clark level, ulceration, mitotic index, lymphocyte infiltration, regression, angiolymphatic invasion, microsatellitosis, and growth phase. The results of SVM-based prediction of SLN status were compared with those achieved with logistic regression. The SLN positivity rate was 22% (52 of 234). When the accuracy was ≥80%, the negative predictive value, positive predictive value, specificity, and sensitivity were 98%, 54%, 94%, and 77% with SVM and 82%, 41%, 69%, and 93% with logistic regression, respectively. Moreover, SVM and logistic regression were associated with a diagnostic error and an SNB percentage reduction of 1% and 60% versus 15% and 73%, respectively. The results from this pilot study suggest that SVM-based prediction of SLN status might be evaluated as a prognostic method to avoid the SNB procedure in 60% of patients currently eligible, with a very low error rate. If validated in larger series, this strategy would lead to obvious advantages in terms of both patient quality of life and costs for the health care system.

  4. Vectorization with SIMD extensions speeds up reconstruction in electron tomography.

    PubMed

    Agulleiro, J I; Garzón, E M; García, I; Fernández, J J

    2010-06-01

    Electron tomography allows structural studies of cellular structures at molecular detail. Large 3D reconstructions are needed to meet the resolution requirements, and the processing time to compute these large volumes may be considerable, so high performance computing techniques have traditionally been used. This work presents a vector approach to tomographic reconstruction that relies on the exploitation of the SIMD extensions available in modern processors in combination with other single-processor optimization techniques. This approach succeeds in producing full-resolution tomograms with an important reduction in processing time, as evaluated with the most common reconstruction algorithms, namely WBP and SIRT. The main advantage stems from the fact that this approach runs on standard computers without the need for specialized hardware, which facilitates the development, use and management of programs. Future trends in processor design open excellent opportunities for vector processing with processors' SIMD extensions in the field of 3D electron microscopy.

  5. Krylov subspace methods on supercomputers

    NASA Technical Reports Server (NTRS)

    Saad, Youcef

    1988-01-01

    A short survey of recent research on Krylov subspace methods with emphasis on implementation on vector and parallel computers is presented. Conjugate gradient methods have proven very useful on traditional scalar computers, and their popularity is likely to increase as three-dimensional models gain importance. A conservative approach to derive effective iterative techniques for supercomputers has been to find efficient parallel/vector implementations of the standard algorithms. The main source of difficulty in the incomplete factorization preconditionings is in the solution of the triangular systems at each step. A few approaches consisting of implementing efficient forward and backward triangular solutions are described in detail. Polynomial preconditioning as an alternative to standard incomplete factorization techniques is also discussed. Another efficient approach is to reorder the equations so as to improve the structure of the matrix to achieve better parallelism or vectorization. An overview of these and other ideas and their effectiveness or potential for different types of architectures is given.
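    The conjugate gradient kernel the survey discusses is dominated by matrix-vector products and vector updates, which is precisely what maps well onto vector and parallel hardware; a plain NumPy sketch of the textbook iteration (no machine-specific code, no preconditioning):

```python
# Textbook (unpreconditioned) conjugate gradient for an SPD system Ax = b.
# Each step costs one matvec plus a handful of dot products and axpys,
# the operations whose vector/parallel mapping the survey analyzes.
import numpy as np

def cg(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x                        # initial residual
    p = r.copy()                         # initial search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p                       # the dominant cost: one matvec per step
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p        # new direction, A-conjugate to the old
        rs = rs_new
    return x

rng = np.random.default_rng(3)
M = rng.normal(size=(50, 50))
A = M @ M.T + 50 * np.eye(50)            # symmetric positive definite test matrix
b = rng.normal(size=50)
x = cg(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))
```

    The incomplete-factorization preconditioners discussed in the survey would insert a triangular solve per iteration, which is exactly the step whose forward/backward substitutions are hard to vectorize.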

  6. Operational efficiency and sustainability of vector control of malaria and dengue: descriptive case studies from the Philippines

    PubMed Central

    2012-01-01

    Background: Analysis is lacking on the management of vector control systems in disease-endemic countries with respect to the efficiency and sustainability of operations. Methods: Three locations were selected, at the scale of province, municipality and barangay (i.e. village). Data on disease incidence, programme activities, and programme management were collected on-site through meetings and focus group discussions. Results: Adaptation of disease control strategies to the epidemiological situation per barangay, through micro-stratification, brings gains in efficiency, but should be accompanied by further capacity building on local situational analysis for better selection and targeting of vector control interventions within the barangay. An integrated approach to vector control, aiming to improve the rational use of resources, was evident with a multi-disease strategy for detection and response, and by the use of combinations of vector control methods. Collaboration within the health sector was apparent from the involvement of barangay health workers, re-orientation of job descriptions and the creation of a disease surveillance unit. The engagement of barangay leaders and use of existing community structures helped mobilize local resources and voluntary services for vector control. In one location, local authorities and the community were involved in the planning, implementation and evaluation of malaria control, which triggered local programme ownership. Conclusions: Strategies that contributed to an improved efficiency and sustainability of vector control operations were: micro-stratification, integration of vector control within the health sector, a multi-disease approach, involvement of local authorities, and empowerment of communities. Capacity building on situational analysis and vector surveillance should be addressed through national policy and guidelines. PMID:22873707

  7. The efficacy of support vector machines (SVM) in robust determination of earthquake early warning magnitudes in central Japan

    NASA Astrophysics Data System (ADS)

    Reddy, Ramakrushna; Nair, Rajesh R.

    2013-10-01

This work deals with a methodology applied to seismic early warning systems, which are designed to provide real-time estimation of the magnitude of an event. We reappraise the work of Simons et al. (2006), who, on the basis of a wavelet approach, predicted a magnitude error of ±1. We verify and improve upon the methodology of Simons et al. (2006) by applying an SVM statistical learning machine to time-scale wavelet decompositions. We used the data of 108 events in central Japan with magnitudes ranging from 3 to 7.4, recorded at KiK-net network stations at source-receiver distances of up to 150 km during the period 1998-2011. We applied a wavelet transform to the seismogram data and calculated scale-dependent threshold wavelet coefficients. These coefficients were then classified into low-magnitude and high-magnitude events by constructing a maximum-margin hyperplane between the two classes, which forms the essence of SVMs. Further, the classified events from both classes were picked up and linear regressions were fitted to determine the relationship between wavelet coefficient magnitude and earthquake magnitude, which in turn allowed us to estimate the earthquake magnitude of an event given its threshold wavelet coefficient. At wavelet scale number 7, we predicted the earthquake magnitude of an event within 2.7 seconds. This means that a magnitude determination is available within 2.7 s after the initial onset of the P-wave. These results shed light on the application of SVM as a way to choose the optimal regression function to estimate the magnitude from a few seconds of an incoming seismogram. This improves on the approach of Simons et al. (2006), which uses an average of two regression functions to estimate the magnitude.
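The scale-dependent coefficients described above can be illustrated with a plain Haar wavelet decomposition, keeping the largest-magnitude detail coefficient per scale as a crude threshold feature. This is a sketch only: the abstract does not specify the wavelet basis or thresholding rule, so both choices here are assumptions.

```python
# Sketch: per-scale Haar detail coefficients of a seismogram-like signal,
# with the max absolute coefficient per scale as a "threshold coefficient"
# feature. The basis and the max-abs rule are illustrative assumptions.
import math

def haar_decompose(signal):
    """Return per-scale detail coefficients of a length-2^n signal."""
    assert len(signal) & (len(signal) - 1) == 0, "length must be a power of two"
    scales = []
    approx = list(signal)
    while len(approx) > 1:
        details = [(approx[i] - approx[i + 1]) / math.sqrt(2)
                   for i in range(0, len(approx), 2)]
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
        scales.append(details)
    return scales  # scales[0] = finest scale, scales[-1] = coarsest

def scale_features(signal):
    """Max absolute detail coefficient at each scale."""
    return [max(abs(c) for c in s) for s in haar_decompose(signal)]
```

Features of this kind, one per scale, are what the SVM would separate into low- and high-magnitude classes before the per-class regressions are fitted.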

  8. Detection of Independent Associations of Plasma Lipidomic Parameters with Insulin Sensitivity Indices Using Data Mining Methodology.

    PubMed

    Kopprasch, Steffi; Dheban, Srirangan; Schuhmann, Kai; Xu, Aimin; Schulte, Klaus-Martin; Simeonovic, Charmaine J; Schwarz, Peter E H; Bornstein, Stefan R; Shevchenko, Andrej; Graessler, Juergen

    2016-01-01

Glucolipotoxicity is a major pathophysiological mechanism in the development of insulin resistance and type 2 diabetes mellitus (T2D). We aimed to detect subtle changes in the circulating lipid profile by shotgun lipidomics analyses and to associate them with four different insulin sensitivity indices. The cross-sectional study comprised 90 men with a broad range of insulin sensitivity, including normal glucose tolerance (NGT, n = 33), impaired glucose tolerance (IGT, n = 32) and newly detected T2D (n = 25). Prior to oral glucose challenge, plasma was obtained and quantitatively analyzed for 198 lipid molecular species from 13 different lipid classes including triacylglycerols (TAGs), phosphatidylcholine plasmalogens/ethers (PC O-s), sphingomyelins (SMs), and lysophosphatidylcholines (LPCs). To identify a lipidomic signature of individual insulin sensitivity we applied three data mining approaches, namely least absolute shrinkage and selection operator (LASSO), Support Vector Regression (SVR) and Random Forests (RF), to the following insulin sensitivity indices: homeostasis model of insulin resistance (HOMA-IR), glucose insulin sensitivity index (GSI), insulin sensitivity index (ISI), and disposition index (DI). The LASSO procedure offered high prediction accuracy and easier interpretability than SVR and RF. After LASSO selection, the plasma lipidome explained from 3% (DI) up to a maximum of 53% (HOMA-IR) of the variability of the sensitivity indices. Among the lipid species with the highest positive LASSO regression coefficients were TAG 54:2 (HOMA-IR), PC O- 32:0 (GSI), and SM 40:3:1 (ISI). The highest negative regression coefficients were obtained for LPC 22:5 (HOMA-IR), TAG 51:1 (GSI), and TAG 58:6 (ISI). Although a substantial part of the lipid molecular species showed a significant correlation with insulin sensitivity indices, we were able to identify a limited number of lipid metabolites of particular importance based on the LASSO approach. These few selected lipids with the closest connection to sensitivity indices may help to further improve disease risk prediction and disease and therapy monitoring.
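The LASSO selection step that drives these results can be sketched with cyclic coordinate descent and soft-thresholding, which shrinks coefficients of uninformative features exactly to zero. This is a generic textbook sketch, not the study's fitting pipeline, and assumes standardized feature columns.

```python
# Minimal LASSO via cyclic coordinate descent with soft-thresholding.
# A generic sketch of the selection mechanism, not the study's exact code.
def soft_threshold(z, gamma):
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0

def lasso(X, y, lam, n_iter=200):
    """Coordinate descent for min ||y - X b||^2 / (2n) + lam * ||b||_1.
    Assumes columns of X are standardized (mean 0, unit variance)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with feature j's contribution removed
            r = [y[i] - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n)) / n
            norm = sum(X[i][j] ** 2 for i in range(n)) / n
            beta[j] = soft_threshold(rho, lam) / norm
    return beta
```

With a large enough penalty `lam`, most lipid-species coefficients would vanish, leaving the short list of metabolites the abstract highlights.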

  9. Performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches in VQ codebook generation for image compression

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Chou, Jyh-Horng

    2015-11-01

The aim of this study is to generate vector quantisation (VQ) codebooks by integrating the principal component analysis (PCA) algorithm, the Linde-Buzo-Gray (LBG) algorithm, and evolutionary algorithms (EAs). The EAs include the genetic algorithm (GA), particle swarm optimisation (PSO), honey bee mating optimisation (HBMO), and the firefly algorithm (FF). The study provides performance comparisons between the PCA-EA-LBG and PCA-LBG-EA approaches. The PCA-EA-LBG approaches comprise PCA-GA-LBG, PCA-PSO-LBG, PCA-HBMO-LBG, and PCA-FF-LBG, while the PCA-LBG-EA approaches comprise PCA-LBG, PCA-LBG-GA, PCA-LBG-PSO, PCA-LBG-HBMO, and PCA-LBG-FF. All training vectors of the test images are grouped according to PCA. The PCA-EA-LBG approaches use the vectors grouped by PCA as initial individuals, and the best solution found by the EA is given to LBG to discover a codebook. The PCA-LBG approach uses PCA to select vectors as initial individuals for LBG to find a codebook, and the PCA-LBG-EA approaches use the final result of PCA-LBG as an initial individual for the EAs to find a codebook. The PCA-EA-LBG search schemes first perform a global search and then apply a local search, whereas the PCA-LBG-EA schemes first perform a local search and then a global search. The results verify that PCA-EA-LBG indeed attains superior results compared to PCA-LBG-EA, because PCA-EA-LBG explores a global area to find a solution and then exploits a better one in the local area around that solution. Furthermore, the proposed PCA-EA-LBG approaches to designing VQ codebooks outperform existing approaches reported in the literature.
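The LBG refinement that both pipelines end with (or start from) is a k-means-style alternation between nearest-codeword assignment and centroid update. The sketch below shows only that local-search core on toy 2-D vectors; the PCA grouping and EA initialization are omitted, and the function name is an assumption.

```python
# Sketch of the Linde-Buzo-Gray (LBG) refinement step: given initial
# codewords (e.g. from a PCA grouping or an EA's best individual),
# alternate nearest-codeword assignment and centroid update.
def lbg(vectors, codebook, n_iter=20):
    for _ in range(n_iter):
        # assignment: map each training vector to its nearest codeword
        clusters = [[] for _ in codebook]
        for v in vectors:
            j = min(range(len(codebook)),
                    key=lambda k: sum((a - b) ** 2
                                      for a, b in zip(v, codebook[k])))
            clusters[j].append(v)
        # update: move each codeword to the centroid of its cluster
        for j, c in enumerate(clusters):
            if c:
                codebook[j] = [sum(x) / len(c) for x in zip(*c)]
    return codebook
```

Because this loop only descends to a local optimum, the choice of initial individuals (the part the EAs supply) dominates final codebook quality, which is the point of the comparison above.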

  10. Application of Machine Learning Approaches for Classifying Sitting Posture Based on Force and Acceleration Sensors.

    PubMed

    Zemp, Roland; Tanadini, Matteo; Plüss, Stefan; Schnüriger, Karin; Singh, Navrag B; Taylor, William R; Lorenzetti, Silvio

    2016-01-01

    Occupational musculoskeletal disorders, particularly chronic low back pain (LBP), are ubiquitous due to prolonged static sitting or nonergonomic sitting positions. Therefore, the aim of this study was to develop an instrumented chair with force and acceleration sensors to determine the accuracy of automatically identifying the user's sitting position by applying five different machine learning methods (Support Vector Machines, Multinomial Regression, Boosting, Neural Networks, and Random Forest). Forty-one subjects were requested to sit four times in seven different prescribed sitting positions (total 1148 samples). Sixteen force sensor values and the backrest angle were used as the explanatory variables (features) for the classification. The different classification methods were compared by means of a Leave-One-Out cross-validation approach. The best performance was achieved using the Random Forest classification algorithm, producing a mean classification accuracy of 90.9% for subjects with which the algorithm was not familiar. The classification accuracy varied between 81% and 98% for the seven different sitting positions. The present study showed the possibility of accurately classifying different sitting positions by means of the introduced instrumented office chair combined with machine learning analyses. The use of such novel approaches for the accurate assessment of chair usage could offer insights into the relationships between sitting position, sitting behaviour, and the occurrence of musculoskeletal disorders.
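The Leave-One-Out comparison used above can be sketched generically: each sample is held out once, a model is fit on the rest, and accuracy is averaged. A 1-nearest-neighbour classifier stands in here for the five methods compared in the study; both function names are assumptions.

```python
# Sketch of Leave-One-Out cross-validation for classifier comparison.
# The 1-NN classifier is a placeholder for SVM/Boosting/Random Forest etc.
def nearest_neighbour(train, query):
    """Return the label of the training sample closest to query."""
    return min(train, key=lambda s: sum((a - b) ** 2
                                        for a, b in zip(s[0], query)))[1]

def loo_accuracy(samples):
    """samples: list of (feature_vector, label) pairs."""
    hits = 0
    for i, (x, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]
        if nearest_neighbour(train, x) == label:
            hits += 1
    return hits / len(samples)
```

In the study the held-out unit is a whole subject rather than a single sample, which is what makes the reported 90.9% an estimate for unfamiliar users.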

  11. A Systematic Approach to Predicting Spring Force for Sagittal Craniosynostosis Surgery.

    PubMed

    Zhang, Guangming; Tan, Hua; Qian, Xiaohua; Zhang, Jian; Li, King; David, Lisa R; Zhou, Xiaobo

    2016-05-01

Spring-assisted surgery (SAS) can effectively treat scaphocephaly by reshaping crania with the appropriate spring force. However, it is difficult to accurately estimate spring force without considering the biomechanical properties of tissues. This study presents and validates a reliable system to accurately predict the spring force for sagittal craniosynostosis surgery. The authors randomly chose 23 patients who underwent SAS and had been followed for at least 2 years. An elastic model was designed to characterize the biomechanical behavior of calvarial bone tissue for each individual. After simulating the contact force at the accurate position of the skull strip with the springs, the finite element method was applied to calculate the stress at each tissue node based on the elastic model. A support vector regression approach was then used to model the relationships between the biomechanical properties generated from spring force, bone thickness, and the change of cephalic index after surgery. Therefore, for a new patient, the optimal spring force can be predicted based on the learned model with virtual spring simulation and a dynamic programming approach prior to SAS. Leave-one-out cross-validation was implemented to assess the accuracy of the prediction. As a result, the mean prediction accuracy of this model was 93.35%, demonstrating the great potential of this model as a useful preoperative planning tool.
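The support vector regression step rests on two ingredients that can be sketched compactly: the epsilon-insensitive loss (errors within epsilon cost nothing) and the kernel-expansion form of the fitted regressor. The dual coefficients below are illustrative placeholders; in practice they come from solving the SVR quadratic program, which is omitted here.

```python
# Sketch of SVR's two ingredients: epsilon-insensitive loss and the
# kernel-expansion form of the fitted regressor f(x).
import math

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def rbf_kernel(u, v, gamma=1.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))

def svr_predict(support_vectors, dual_coefs, bias, x):
    """f(x) = sum_i alpha_i * K(sv_i, x) + b"""
    return sum(a * rbf_kernel(sv, x)
               for sv, a in zip(support_vectors, dual_coefs)) + bias
```

The epsilon tube is what gives SVR its sparse set of support vectors: only training points outside (or on) the tube receive nonzero dual coefficients.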

  12. Integrating machine learning to achieve an automatic parameter prediction for practical continuous-variable quantum key distribution

    NASA Astrophysics Data System (ADS)

    Liu, Weiqi; Huang, Peng; Peng, Jinye; Fan, Jianping; Zeng, Guihua

    2018-02-01

For supporting practical quantum key distribution (QKD), it is critical to stabilize the physical parameters of signals, e.g., the intensity, phase, and polarization of the laser signals, so that such QKD systems can achieve better performance and practical security. In this paper, an approach is developed by integrating a support vector regression (SVR) model to optimize the performance and practical security of the QKD system. First, an SVR model is learned to precisely predict the time-along evolutions of the physical parameters of signals. Second, such predicted time-along evolutions are employed as feedback to control the QKD system for achieving the optimal performance and practical security. Finally, our proposed approach is exemplified by using the intensity evolution of laser light and a local oscillator pulse in the Gaussian modulated coherent state QKD system. Our experimental results have demonstrated three significant benefits of our SVR-based approach: (1) it can allow the QKD system to achieve optimal performance and practical security, (2) it does not require any additional resources or any real-time monitoring module to support automatic prediction of the time-along evolutions of the physical parameters of signals, and (3) it is applicable to any measurable physical parameter of signals in the practical QKD system.

  13. Resurgent vector-borne diseases as a global health problem.

    PubMed Central

    Gubler, D. J.

    1998-01-01

    Vector-borne infectious diseases are emerging or resurging as a result of changes in public health policy, insecticide and drug resistance, shift in emphasis from prevention to emergency response, demographic and societal changes, and genetic changes in pathogens. Effective prevention strategies can reverse this trend. Research on vaccines, environmentally safe insecticides, alternative approaches to vector control, and training programs for health-care workers are needed. PMID:9716967

  14. AAV viral vector delivery to the brain by shape-conforming MR-guided infusions.

    PubMed

    Bankiewicz, Krystof S; Sudhakar, Vivek; Samaranch, Lluis; San Sebastian, Waldy; Bringas, John; Forsayeth, John

    2016-10-28

Gene transfer technology offers great promise as a potential therapeutic approach to the brain but has to be viewed as a very complex technology. Success of ongoing clinical gene therapy trials depends on many factors such as selection of the correct genetic and anatomical target in the brain. In addition, selection of a viral vector capable of transferring the therapeutic gene into target cells, along with long-term expression that avoids immunotoxicity, has to be established. As with any drug development strategy, delivery of gene therapy has to be consistent and predictable in each study subject. Failed drug and vector delivery will lead to failed clinical trials. In this article, we describe our experience with an AAV viral vector delivery system that allows us to optimize and monitor in real time viral vector administration into affected regions of the brain. In addition to discussing the MRI-guided technology for administration of AAV vectors that we have developed and now employ in current clinical trials, we also describe ways in which infusion cannula design and stereotactic trajectory may be used to maximize the anatomical coverage by using fluid backflow. This innovative approach enables more precise coverage by fitting the shape of the infusion to the shape of the anatomical target. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. Visualization of Morse connection graphs for topologically rich 2D vector fields.

    PubMed

    Szymczak, Andrzej; Sipeki, Levente

    2013-12-01

Recent advances in vector field topology make it possible to compute multi-scale graph representations for autonomous 2D vector fields in a robust and efficient manner. One of these representations is a Morse Connection Graph (MCG), a directed graph whose nodes correspond to Morse sets, generalizing stationary points and periodic trajectories, and whose arcs correspond to trajectories connecting them. While useful for simple vector fields, the MCG can be hard to comprehend for topologically rich vector fields containing a large number of features. This paper describes a visual representation of the MCG, inspired by previous work on graph visualization. Our approach aims to preserve the spatial relationships between the MCG arcs and nodes and highlight the coherent behavior of connecting trajectories. Using simulations of ocean flow, we show that it can provide useful information on the flow structure. This paper focuses specifically on MCGs computed for piecewise constant (PC) vector fields. In particular, we describe extensions of the PC framework that make it more flexible and better suited for analysis of data on complex shaped domains with a boundary. We also describe a topology simplification scheme that makes our MCG visualizations less ambiguous. Despite the focus on the PC framework, our approach could also be applied to graph representations or topological skeletons computed using different methods.

  16. Prediction task guided representation learning of medical codes in EHR.

    PubMed

    Cui, Liwen; Xie, Xiaolei; Shen, Zuojun

    2018-06-18

There have been rapidly growing applications using machine learning models for predictive analytics in Electronic Health Records (EHR) to improve the quality of hospital services and the efficiency of healthcare resource utilization. A fundamental and crucial step in developing such models is to convert medical codes in EHR to feature vectors. These medical codes are used to represent diagnoses or procedures. Their vector representations have a tremendous impact on the performance of machine learning models. Recently, some researchers have utilized representation learning methods from Natural Language Processing (NLP) to learn vector representations of medical codes. However, most previous approaches are unsupervised, i.e. the generation of medical code vectors is independent of prediction tasks. Thus, the obtained feature vectors may be inappropriate for a specific prediction task. Moreover, unsupervised methods often require many samples to obtain reliable results, but most practical problems have very limited patient samples. In this paper, we develop a new method called Prediction Task Guided Health Record Aggregation (PTGHRA), which aggregates health records guided by prediction tasks, to construct training corpora for various representation learning models. Compared with unsupervised approaches, representation learning models integrated with PTGHRA yield a significant improvement in the predictive capability of generated medical code vectors, especially for limited training samples. Copyright © 2018. Published by Elsevier Inc.
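The unsupervised baseline criticized above can be sketched simply: represent each medical code by its co-occurrence counts with other codes across visits, with no reference to any prediction task. The code names and function are hypothetical; PTGHRA would instead group records according to the prediction task before building such a corpus.

```python
# Sketch of a task-agnostic (unsupervised) code representation:
# co-occurrence count vectors over visits. Code names are hypothetical.
def cooccurrence_vectors(visits):
    """visits: list of sets of medical codes appearing together.
    Returns (vocabulary, {code: count vector over the vocabulary})."""
    vocab = sorted(set().union(*visits))
    index = {c: i for i, c in enumerate(vocab)}
    vecs = {c: [0] * len(vocab) for c in vocab}
    for visit in visits:
        for a in visit:
            for b in visit:
                if a != b:
                    vecs[a][index[b]] += 1
    return vocab, vecs
```

Two codes that never co-occur with the task-relevant codes still get representations here, which illustrates why an unguided corpus can be uninformative for a specific prediction target.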

  17. Volatility forecasting for low-volatility portfolio selection in the US and the Korean equity markets

    NASA Astrophysics Data System (ADS)

    Kim, Saejoon

    2018-01-01

We consider the problem of low-volatility portfolio selection, which has been the subject of extensive research in the field of portfolio selection. To improve on currently existing techniques that rely purely on past information to select low-volatility portfolios, this paper investigates the use of time series regression techniques that forecast future volatility to select the portfolios. In particular, for the first time, the utility of support vector regression and its enhancements as portfolio selection techniques is demonstrated. It is shown that our regression-based portfolio selection delivers attractive outperformance compared to the benchmark index and to a portfolio defined by a well-known strategy on the data-sets of the S&P 500 and the KOSPI 200.
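The "past information" baseline this paper improves on can be sketched directly: forecast each asset's next-period volatility by the trailing standard deviation of its returns and pick the lowest-volatility assets. An SVR forecaster would replace the trailing estimate with a learned regression on volatility features; function names below are assumptions.

```python
# Sketch of naive low-volatility selection: trailing standard deviation
# as the volatility forecast, then pick the k calmest assets.
import math

def trailing_vol(returns, window):
    """Population standard deviation of the last `window` returns."""
    r = returns[-window:]
    mean = sum(r) / len(r)
    return math.sqrt(sum((x - mean) ** 2 for x in r) / len(r))

def low_vol_portfolio(asset_returns, window, k):
    """Pick the k assets with the smallest forecast volatility.
    asset_returns: {asset name: list of past returns}."""
    forecasts = {name: trailing_vol(r, window)
                 for name, r in asset_returns.items()}
    return sorted(forecasts, key=forecasts.get)[:k]
```

The paper's contribution is in the forecasting step: swapping the trailing estimate for a model that generalizes from past volatility patterns changes which assets land in the selected set.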

  18. Viral vector-based reversible neuronal inactivation and behavioral manipulation in the macaque monkey

    PubMed Central

    Nielsen, Kristina J.; Callaway, Edward M.; Krauzlis, Richard J.

    2012-01-01

    Viral vectors are promising tools for the dissection of neural circuits. In principle, they can manipulate neurons at a level of specificity not otherwise achievable. While many studies have used viral vector-based approaches in the rodent brain, only a few have employed this technique in the non-human primate, despite the importance of this animal model for neuroscience research. Here, we report evidence that a viral vector-based approach can be used to manipulate a monkey's behavior in a task. For this purpose, we used the allatostatin receptor/allatostatin (AlstR/AL) system, which has previously been shown to allow inactivation of neurons in vivo. The AlstR was expressed in neurons in monkey V1 by injection of an adeno-associated virus 1 (AAV1) vector. Two monkeys were trained in a detection task, in which they had to make a saccade to a faint peripheral target. Injection of AL caused a retinotopic deficit in the detection task in one monkey. Specifically, the monkey showed marked impairment for detection targets placed at the visual field location represented at the virus injection site, but not for targets shown elsewhere. We confirmed that these deficits indeed were due to the interaction of AlstR and AL by injecting saline, or AL at a V1 location without AlstR expression. Post-mortem histology confirmed AlstR expression in this monkey. We failed to replicate the behavioral results in a second monkey, as AL injection did not impair the second monkey's performance in the detection task. However, post-mortem histology revealed a very low level of AlstR expression in this monkey. Our results demonstrate that viral vector-based approaches can produce effects strong enough to influence a monkey's performance in a behavioral task, supporting the further development of this approach for studying how neuronal circuits control complex behaviors in non-human primates. PMID:22723770

  19. Time-series panel analysis (TSPA): multivariate modeling of temporal associations in psychotherapy process.

    PubMed

    Ramseyer, Fabian; Kupper, Zeno; Caspar, Franz; Znoj, Hansjörg; Tschacher, Wolfgang

    2014-10-01

Processes occurring in the course of psychotherapy are characterized by the simple fact that they unfold in time and that the multiple factors engaged in change processes vary highly between individuals (idiographic phenomena). Previous research, however, has neglected the temporal perspective through its traditional focus on static phenomena, which were mainly assessed at the group level (nomothetic phenomena). To support a temporal approach, the authors introduce time-series panel analysis (TSPA), a statistical methodology explicitly focusing on the quantification of temporal, session-to-session aspects of change in psychotherapy. TSPA models are initially built at the level of individuals and are subsequently aggregated at the group level, thus allowing the exploration of prototypical models. TSPA is based on vector auto-regression (VAR), an extension of univariate auto-regression models to multivariate time-series data. The application of TSPA is demonstrated in a sample of 87 outpatient psychotherapy patients who were monitored by postsession questionnaires. Prototypical mechanisms of change were derived from the aggregation of individual multivariate models of psychotherapy process. In a second step, the associations between mechanisms of change (TSPA) and pre- to postsymptom change were explored. TSPA allowed a prototypical process pattern to be identified, in which the patient's alliance and self-efficacy were linked by a temporal feedback loop. Furthermore, the therapist's stability over time in both mastery and clarification interventions was positively associated with better outcomes. TSPA is a statistical tool that sheds new light on temporal mechanisms of change. Through this approach, clinicians may gain insight into prototypical patterns of change in psychotherapy. PsycINFO Database Record (c) 2014 APA, all rights reserved.
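The VAR core of TSPA can be sketched for two variables: each session's values are regressed on the previous session's values, one least-squares equation per variable. This is a minimal first-order sketch with hypothetical names; real TSPA fits such models per patient and then aggregates them.

```python
# Sketch of a two-variable VAR(1) fit by per-equation least squares:
# x_t = A[0][0]*x_{t-1} + A[0][1]*y_{t-1}, and similarly for y_t.
def var1_fit(series):
    """series: list of [x, y] observations. Returns the 2x2 matrix A."""
    X = series[:-1]   # lagged predictors
    Y = series[1:]    # responses
    def t_mat(P, Q):  # P^T Q for lists of 2-vectors
        return [[sum(p[i] * q[j] for p, q in zip(P, Q)) for j in range(2)]
                for i in range(2)]
    XtX = t_mat(X, X)
    YtX = t_mat(Y, X)
    # normal equations: A = (Y^T X) (X^T X)^{-1}, via explicit 2x2 inverse
    det = XtX[0][0] * XtX[1][1] - XtX[0][1] * XtX[1][0]
    inv = [[XtX[1][1] / det, -XtX[0][1] / det],
           [-XtX[1][0] / det, XtX[0][0] / det]]
    return [[sum(YtX[i][k] * inv[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
```

The off-diagonal entries of A are what carry the cross-lagged effects, e.g. a feedback loop between alliance and self-efficacy would appear as nonzero coefficients in both off-diagonal positions.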

  20. Reducing vector-borne disease by empowering farmers in integrated vector management.

    PubMed

    van den Berg, Henk; von Hildebrand, Alexander; Ragunathan, Vaithilingam; Das, Pradeep K

    2007-07-01

Irrigated agriculture exposes rural people to health risks associated with vector-borne diseases and pesticides used in agriculture and for public health protection. Most developing countries lack collaboration between the agricultural and health sectors to jointly address these problems. We present an evaluation of a project that uses the "farmer field school" method to teach farmers how to manage vector-borne diseases and how to improve rice yields. Teaching farmers about these two concepts together is known as "integrated pest and vector management". The setting was an intersectoral project targeting rice irrigation systems in Sri Lanka. Project partners developed a new curriculum for the field school that included a component on vector-borne diseases. Rice farmers in intervention villages who graduated from the field school took vector-control actions as well as improving environmental sanitation and their personal protection measures against disease transmission. They also reduced their use of agricultural pesticides, especially insecticides. The intervention motivated and enabled rural people to take part in vector-management activities and to reduce several environmental health risks. There is scope for expanding the curriculum to include information on the harmful effects of pesticides on human health and to address other public health concerns. The benefits of this approach for community-based health programmes have not yet been optimally assessed. Also, the institutional basis of the integrated management approach needs to be broadened so that people from a wider range of organizations take part. A monitoring and evaluation system needs to be established to measure the performance of integrated management initiatives.
