Sample records for prediction comparative-genetic algorithm

  1. Increasing Prediction the Original Final Year Project of Student Using Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Saragih, Rijois Iboy Erwin; Turnip, Mardi; Sitanggang, Delima; Aritonang, Mendarissan; Harianja, Eva

    2018-04-01

    The final year project is very important for a student's graduation. Unfortunately, many students do not take their final projects seriously, and many ask someone else to do the work for them. In this paper, an application of genetic algorithms to predict the original final year project of a student is proposed. In the simulation, final project data from the last 5 years were collected. The genetic algorithm has several operators, namely population, selection, crossover, and mutation. The results suggest that the genetic algorithm can predict better than other comparable models. Experimental results of the prediction showed that 70% was more accurate than the previous research.
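
    The operators named in the abstract above (population, selection, crossover, mutation) can be sketched on a toy problem. This is a minimal illustration only, not the authors' model: the bit-string encoding, the count-the-ones fitness, tournament selection, elitism, and all parameter values are assumptions chosen for brevity.

```python
import random

def run_ga(n_bits=20, pop_size=30, generations=60, p_mut=0.02, seed=0):
    """Toy genetic algorithm; every setting here is an illustrative assumption."""
    rng = random.Random(seed)
    fitness = sum  # toy objective: number of 1-bits in the chromosome

    # Population: random bit strings.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]

    def select():
        # Selection: binary tournament between two random individuals.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        best = max(population, key=fitness)
        children = [best[:]]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            p1, p2 = select(), select()
            # Crossover: single cut point, head of p1 joined to tail of p2.
            cut = rng.randint(1, n_bits - 1)
            child = p1[:cut] + p2[cut:]
            # Mutation: flip each bit with probability p_mut.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        population = children

    return max(population, key=fitness)

best = run_ga()
print(sum(best), best)
```

    In a prediction setting the chromosome would instead encode model parameters or selected features, and the fitness would be a prediction error on held-out data.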

  2. Real coded genetic algorithm for fuzzy time series prediction

    NASA Astrophysics Data System (ADS)

    Jain, Shilpa; Bisht, Dinesh C. S.; Singh, Phool; Mathpal, Prakash C.

    2017-10-01

    Genetic Algorithm (GA) forms a subset of evolutionary computing, a rapidly growing area of Artificial Intelligence (A.I.). Some variants of GA are binary GA, real GA, messy GA, micro GA, saw tooth GA, and differential evolution GA. This research article presents a real coded GA for predicting enrollments of the University of Alabama. The University of Alabama data form a fuzzy time series. Here, fuzzy logic is used to predict the enrollments and the genetic algorithm optimizes the fuzzy intervals. Results are compared with the work of other eminent authors and found satisfactory, indicating that real coded GAs are fast and accurate.

  3. Comparative Analysis of Soft Computing Models in Prediction of Bending Rigidity of Cotton Woven Fabrics

    NASA Astrophysics Data System (ADS)

    Guruprasad, R.; Behera, B. K.

    2015-10-01

    Quantitative prediction of fabric mechanical properties is an essential requirement for design engineering of textile and apparel products. In this work, the possibility of prediction of bending rigidity of cotton woven fabrics has been explored with the application of Artificial Neural Network (ANN) and two hybrid methodologies, namely Neuro-genetic modeling and Adaptive Neuro-Fuzzy Inference System (ANFIS) modeling. For this purpose, a set of cotton woven grey fabrics was desized, scoured and relaxed. The fabrics were then conditioned and tested for bending properties. With the database thus created, a neural network model was first developed using back propagation as the learning algorithm. The second model was developed by applying a hybrid learning strategy, in which genetic algorithm was first used as a learning algorithm to optimize the number of neurons and connection weights of the neural network. The Genetic algorithm optimized network structure was further allowed to learn using back propagation algorithm. In the third model, an ANFIS modeling approach was attempted to map the input-output data. The prediction performances of the models were compared and a sensitivity analysis was reported. The results show that the prediction by neuro-genetic and ANFIS models were better in comparison with that of back propagation neural network model.

  4. Firefly algorithm versus genetic algorithm as powerful variable selection tools and their effect on different multivariate calibration models in spectroscopy: A comparative study

    NASA Astrophysics Data System (ADS)

    Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed

    2017-01-01

    For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration.

  5. Refined genetic algorithm -- Economic dispatch example

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sheble, G.B.; Brittig, K.

    1995-02-01

    A genetic-based algorithm is used to solve an economic dispatch (ED) problem. The algorithm utilizes payoff information of perspective solutions to evaluate optimality. Thus, the constraints of classical LaGrangian techniques on unit curves are eliminated. Using an economic dispatch problem as a basis for comparison, several different techniques which enhance program efficiency and accuracy, such as mutation prediction, elitism, interval approximation and penalty factors, are explored. Two unique genetic algorithms are also compared. The results are verified for a sample problem using a classical technique.

  6. Firefly algorithm versus genetic algorithm as powerful variable selection tools and their effect on different multivariate calibration models in spectroscopy: A comparative study.

    PubMed

    Attia, Khalid A M; Nassar, Mohammed W I; El-Zeiny, Mohamed B; Serag, Ahmed

    2017-01-05

    For the first time, a new variable selection method based on swarm intelligence namely firefly algorithm is coupled with three different multivariate calibration models namely, concentration residual augmented classical least squares, artificial neural network and support vector regression in UV spectral data. A comparative study between the firefly algorithm and the well-known genetic algorithm was developed. The discussion revealed the superiority of using this new powerful algorithm over the well-known genetic algorithm. Moreover, different statistical tests were performed and no significant differences were found between all the models regarding their predictabilities. This ensures that simpler and faster models were obtained without any deterioration of the quality of the calibration. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. 3D Protein structure prediction with genetic tabu search algorithm

    PubMed Central

    2010-01-01

    Background: Protein structure prediction (PSP) has important applications in different fields, such as drug design, disease prediction, and so on. In protein structure prediction, there are two important issues. The first one is the design of the structure model and the second one is the design of the optimization technology. Because of the complexity of the realistic protein structure, the structure model adopted in this paper is a simplified model, which is called off-lattice AB model. After the structure model is assumed, optimization technology is needed for searching the best conformation of a protein sequence based on the assumed structure model. However, PSP is an NP-hard problem even if the simplest model is assumed. Thus, many algorithms have been developed to solve the global optimization problem. In this paper, a hybrid algorithm, which combines genetic algorithm (GA) and tabu search (TS) algorithm, is developed to complete this task. Results: In order to develop an efficient optimization algorithm, several improved strategies are developed for the proposed genetic tabu search algorithm. The combined use of these strategies can improve the efficiency of the algorithm. In these strategies, tabu search introduced into the crossover and mutation operators can improve the local search capability, the adoption of variable population size strategy can maintain the diversity of the population, and the ranking selection strategy can improve the possibility of an individual with low energy value entering into next generation. Experiments are performed with Fibonacci sequences and real protein sequences. Experimental results show that the lowest energy obtained by the proposed GATS algorithm is lower than that obtained by previous methods. Conclusions: The hybrid algorithm has the advantages from both genetic algorithm and tabu search algorithm. It makes use of the advantage of multiple search points in genetic algorithm, and can overcome poor hill-climbing capability in the conventional genetic algorithm by using the flexible memory functions of TS. Compared with some previous algorithms, GATS algorithm has better performance in global optimization and can predict 3D protein structure more effectively. PMID:20522256

  8. Prediction of Industrial Electric Energy Consumption in Anhui Province Based on GA-BP Neural Network

    NASA Astrophysics Data System (ADS)

    Zhang, Jiajing; Yin, Guodong; Ni, Youcong; Chen, Jinlan

    2018-01-01

    In order to improve the prediction accuracy of industrial electrical energy consumption, a prediction model based on a genetic algorithm and a neural network was proposed. The model uses the genetic algorithm to optimize the weights and thresholds of a BP neural network, and is used to predict industrial electric energy consumption in Anhui Province. A comparison experiment between the GA-BP prediction model and the BP neural network model shows that the GA-BP model is more accurate with a smaller number of neurons in the hidden layer.

  9. Process optimization of rolling for zincked sheet technology using response surface methodology and genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ji, Liang-Bo; Chen, Fang

    2017-07-01

    Numerical simulation and intelligent optimization technology were adopted for rolling and extrusion of zincked sheet. Using response surface methodology (RSM), genetic algorithm (GA) and data processing technology, an efficient optimization of process parameters for rolling of zincked sheet was investigated. First, the influence of roller gap, rolling speed and friction factor on reduction rate and plate shortening rate was analyzed. Then a predictive response surface model for the comprehensive quality index of the part was created using RSM. Simulated and predicted values were compared. The optimal process parameters for rolling were solved with the genetic algorithm and then verified. The approach is feasible and effective.

  10. Sequential and Mixed Genetic Algorithm and Learning Automata (SGALA, MGALA) for Feature Selection in QSAR

    PubMed Central

    MotieGhader, Habib; Gharaghani, Sajjad; Masoudi-Sobhanzadeh, Yosef; Masoudi-Nejad, Ali

    2017-01-01

    Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as GA, PSO, ACO and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR feature selection are proposed. SGALA algorithm uses advantages of Genetic algorithm and Learning Automata sequentially and the MGALA algorithm uses advantages of Genetic Algorithm and Learning Automata simultaneously. We applied our proposed algorithms to select the minimum possible number of features from three different datasets and also we observed that the MGALA and SGALA algorithms had the best outcome independently and in average compared to other feature selection algorithms. Through comparison of our proposed algorithms, we deduced that the rate of convergence to optimal result in MGALA and SGALA algorithms were better than the rate of GA, ACO, PSO and LA algorithms. In the end, the results of GA, ACO, PSO, LA, SGALA, and MGALA algorithms were applied as the input of LS-SVR model and the results from LS-SVR models showed that the LS-SVR model had more predictive ability with the input from SGALA and MGALA algorithms than the input from all other mentioned algorithms. Therefore, the results have corroborated that not only is the predictive efficiency of proposed algorithms better, but their rate of convergence is also superior to the all other mentioned algorithms. PMID:28979308

  11. Sequential and Mixed Genetic Algorithm and Learning Automata (SGALA, MGALA) for Feature Selection in QSAR.

    PubMed

    MotieGhader, Habib; Gharaghani, Sajjad; Masoudi-Sobhanzadeh, Yosef; Masoudi-Nejad, Ali

    2017-01-01

    Feature selection is of great importance in Quantitative Structure-Activity Relationship (QSAR) analysis. This problem has been solved using some meta-heuristic algorithms such as GA, PSO, ACO and so on. In this work two novel hybrid meta-heuristic algorithms i.e. Sequential GA and LA (SGALA) and Mixed GA and LA (MGALA), which are based on Genetic algorithm and learning automata for QSAR feature selection are proposed. SGALA algorithm uses advantages of Genetic algorithm and Learning Automata sequentially and the MGALA algorithm uses advantages of Genetic Algorithm and Learning Automata simultaneously. We applied our proposed algorithms to select the minimum possible number of features from three different datasets and also we observed that the MGALA and SGALA algorithms had the best outcome independently and in average compared to other feature selection algorithms. Through comparison of our proposed algorithms, we deduced that the rate of convergence to optimal result in MGALA and SGALA algorithms were better than the rate of GA, ACO, PSO and LA algorithms. In the end, the results of GA, ACO, PSO, LA, SGALA, and MGALA algorithms were applied as the input of LS-SVR model and the results from LS-SVR models showed that the LS-SVR model had more predictive ability with the input from SGALA and MGALA algorithms than the input from all other mentioned algorithms. Therefore, the results have corroborated that not only is the predictive efficiency of proposed algorithms better, but their rate of convergence is also superior to the all other mentioned algorithms.

  12. Near infrared spectrometric technique for testing fruit quality: optimisation of regression models using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.

    2016-02-01

    Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of food stuff, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development, for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm-1 were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential use of GA method to improve speed and accuracy of fruit quality prediction.

  13. Optimization of Straight Cylindrical Turning Using Artificial Bee Colony (ABC) Algorithm

    NASA Astrophysics Data System (ADS)

    Prasanth, Rajanampalli Seshasai Srinivasa; Hans Raj, Kandikonda

    2017-04-01

    Artificial bee colony (ABC) algorithm, that mimics the intelligent foraging behavior of honey bees, is increasingly gaining acceptance in the field of process optimization, as it is capable of handling nonlinearity, complexity and uncertainty. Straight cylindrical turning is a complex and nonlinear machining process which involves the selection of appropriate cutting parameters that affect the quality of the workpiece. This paper presents the estimation of optimal cutting parameters of the straight cylindrical turning process using the ABC algorithm. The ABC algorithm is first tested on four benchmark problems of numerical optimization and its performance is compared with genetic algorithm (GA) and ant colony optimization (ACO) algorithm. Results indicate that, the rate of convergence of ABC algorithm is better than GA and ACO. Then, the ABC algorithm is used to predict optimal cutting parameters such as cutting speed, feed rate, depth of cut and tool nose radius to achieve good surface finish. Results indicate that, the ABC algorithm estimated a comparable surface finish when compared with real coded genetic algorithm and differential evolution algorithm.

  14. Linear genetic programming application for successive-station monthly streamflow prediction

    NASA Astrophysics Data System (ADS)

    Danandeh Mehr, Ali; Kahya, Ercan; Yerdelen, Cahit

    2014-09-01

    In recent decades, artificial intelligence (AI) techniques have been pronounced as a branch of computer science to model wide range of hydrological phenomena. A number of researches have been still comparing these techniques in order to find more effective approaches in terms of accuracy and applicability. In this study, we examined the ability of linear genetic programming (LGP) technique to model successive-station monthly streamflow process, as an applied alternative for streamflow prediction. A comparative efficiency study between LGP and three different artificial neural network algorithms, namely feed forward back propagation (FFBP), generalized regression neural networks (GRNN), and radial basis function (RBF), has also been presented in this study. For this aim, firstly, we put forward six different successive-station monthly streamflow prediction scenarios subjected to training by LGP and FFBP using the field data recorded at two gauging stations on Çoruh River, Turkey. Based on Nash-Sutcliffe and root mean squared error measures, we then compared the efficiency of these techniques and selected the best prediction scenario. Eventually, GRNN and RBF algorithms were utilized to restructure the selected scenario and to compare with corresponding FFBP and LGP. Our results indicated the promising role of LGP for successive-station monthly streamflow prediction providing more accurate results than those of all the ANN algorithms. We found an explicit LGP-based expression evolved by only the basic arithmetic functions as the best prediction model for the river, which uses the records of the both target and upstream stations.
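
    The two efficiency measures named in the abstract above, root mean squared error and the Nash-Sutcliffe coefficient, have standard definitions that can be sketched as follows. The function names and the sample numbers are illustrative assumptions, not data from the study.

```python
import math

def rmse(observed, simulated):
    """Root mean squared error between observed and simulated series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)

def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model error
    to the variance of the observations around their mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

# Hypothetical monthly flows, for illustration only.
obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe(obs, obs))              # perfect model
print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # model that always predicts the mean
```

    A value of 1 indicates a perfect match, and a value of 0 means the model is no better than predicting the observed mean.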

  15. Genetic Bee Colony (GBC) algorithm: A new gene selection method for microarray cancer classification.

    PubMed

    Alshamlan, Hala M; Badr, Ghada H; Alohali, Yousef A

    2015-06-01

    Naturally inspired evolutionary algorithms prove effective when used for solving feature selection and classification problems. Artificial Bee Colony (ABC) is a relatively new swarm intelligence method. In this paper, we propose a new hybrid gene selection method, namely Genetic Bee Colony (GBC) algorithm. The proposed algorithm combines the use of a Genetic Algorithm (GA) along with Artificial Bee Colony (ABC) algorithm. The goal is to integrate the advantages of both algorithms. The proposed algorithm is applied to a microarray gene expression profile in order to select the most predictive and informative genes for cancer classification. In order to test the accuracy performance of the proposed algorithm, extensive experiments were conducted. Three binary microarray datasets are used, which include: colon, leukemia, and lung. In addition, another three multi-class microarray datasets are used, which are: SRBCT, lymphoma, and leukemia. Results of the GBC algorithm are compared with our recently proposed technique: mRMR when combined with the Artificial Bee Colony algorithm (mRMR-ABC). We also compared the combination of mRMR with GA (mRMR-GA) and Particle Swarm Optimization (mRMR-PSO) algorithms. In addition, we compared the GBC algorithm with other related algorithms that have been recently published in the literature, using all benchmark datasets. The GBC algorithm shows superior performance as it achieved the highest classification accuracy along with the lowest average number of selected genes. This proves that the GBC algorithm is a promising approach for solving the gene selection problem in both binary and multi-class cancer classification. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Comparison of genetic algorithm and imperialist competitive algorithms in predicting bed load transport in clean pipe.

    PubMed

    Ebtehaj, Isa; Bonakdari, Hossein

    2014-01-01

    The existence of sediments in wastewater greatly affects the performance of the sewer and wastewater transmission systems. Increased sedimentation in wastewater collection systems causes problems such as reduced transmission capacity and early combined sewer overflow. The article reviews the performance of the genetic algorithm (GA) and imperialist competitive algorithm (ICA) in minimizing the target function (mean square error of observed and predicted Froude number). To study the impact of bed load transport parameters, using four non-dimensional groups, six different models have been presented. Moreover, the roulette wheel selection method is used to select the parents. The ICA with root mean square error (RMSE) = 0.007, mean absolute percentage error (MAPE) = 3.5% show better results than GA (RMSE = 0.007, MAPE = 5.6%) for the selected model. All six models return better results than the GA. Also, the results of these two algorithms were compared with multi-layer perceptron and existing equations.
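
    Roulette wheel selection, mentioned in the abstract above for choosing parents, picks an individual with probability proportional to its weight. The sketch below is a generic version under stated assumptions: because the study minimizes an error, the conversion of error to weight via 1 / (1 + error) is a hypothetical choice made here, not one taken from the paper.

```python
import random

def roulette_wheel_select(weights, rng):
    """Return an index chosen with probability proportional to its weight."""
    total = sum(weights)
    pick = rng.uniform(0, total)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if pick < running:
            return i
    return len(weights) - 1  # guard against floating-point round-off

def weights_from_errors(errors):
    """Assumed mapping for a minimization problem: smaller error, larger weight."""
    return [1.0 / (1.0 + e) for e in errors]

rng = random.Random(0)
errors = [0.50, 0.10, 0.02]  # hypothetical mean square errors of three candidates
parent = roulette_wheel_select(weights_from_errors(errors), rng)
print(parent)
```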

  17. Optimum location of external markers using feature selection algorithms for real-time tumor tracking in external-beam radiotherapy: a virtual phantom study.

    PubMed

    Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-08

    In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers.

  18. A new warfarin dosing algorithm including VKORC1 3730 G > A polymorphism: comparison with results obtained by other published algorithms.

    PubMed

    Cini, Michela; Legnani, Cristina; Cosmi, Benilde; Guazzaloca, Giuliana; Valdrè, Lelia; Frascaro, Mirella; Palareti, Gualtiero

    2012-08-01

    Warfarin dosing is affected by clinical and genetic variants, but the contribution of the genotype associated with warfarin resistance in pharmacogenetic algorithms has not been well assessed yet. We developed a new dosing algorithm including polymorphisms associated both with warfarin sensitivity and resistance in the Italian population, and its performance was compared with those of eight previously published algorithms. Clinical and genetic data (CYP2C9*2, CYP2C9*3, VKORC1 -1639 G > A, and VKORC1 3730 G > A) were used to elaborate the new algorithm. Derivation and validation groups comprised 55 (58.2% men, mean age 69 years) and 40 (57.5% men, mean age 70 years) patients, respectively, who were on stable anticoagulation therapy for at least 3 months with different oral anticoagulation therapy (OAT) indications. Performance of the new algorithm, evaluated with mean absolute error (MAE) defined as the absolute value of the difference between observed daily maintenance dose and predicted daily dose, correlation with the observed dose and R(2) value, was comparable with or slightly lower than that obtained using the other algorithms. The new algorithm could correctly assign 53.3%, 50.0%, and 57.1% of patients to the low (≤25 mg/week), intermediate (26-44 mg/week) and high (≥ 45 mg/week) dosing range, respectively. Our data showed a significant increase in predictive accuracy among patients requiring high warfarin dose compared with the other algorithms (ranging from 0% to 28.6%). The algorithm including VKORC1 3730 G > A, associated with warfarin resistance, allowed a more accurate identification of resistant patients who require higher warfarin dosage.
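
    The evaluation described in the abstract above rests on two simple computations: the mean absolute error between observed and predicted doses, and the assignment of a patient to a dosing range. A minimal sketch follows. The thresholds are those quoted in the abstract; treating any weekly dose strictly between 25 and 45 mg as intermediate is an assumption, as are the function names and sample values.

```python
def mean_absolute_error(observed, predicted):
    """Mean of |observed - predicted| over all patients."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def dosing_range(weekly_dose_mg):
    """Classify a weekly dose into the three ranges quoted in the abstract."""
    if weekly_dose_mg <= 25:
        return "low"
    if weekly_dose_mg >= 45:
        return "high"
    return "intermediate"  # assumption: everything between the two cut-offs

# Hypothetical daily doses in mg, for illustration only.
observed = [5.0, 7.0]
predicted = [4.0, 9.0]
print(mean_absolute_error(observed, predicted))
print(dosing_range(30))
```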

  19. Genetic training of network using chaos concept: application to QSAR studies of vibration modes of tetrahedral halides.

    PubMed

    Lu, Qingzhang; Shen, Guoli; Yu, Ruqin

    2002-11-15

    The chaotic dynamical system is introduced in genetic algorithm to train ANN to formulate the CGANN algorithm. Logistic mapping as one of the most important chaotic dynamic mappings provides each new generation a high chance to hold GA's population diversity. This enhances the ability to overcome overfitting in training an ANN. The proposed CGANN has been used for QSAR studies to predict the tetrahedral modes (nu(1)(A1) and nu(2)(E)) of halides [MX(4)](epsilon). The frequencies predicted by QSAR were compared with those calculated by quantum chemistry methods including PM3, AM1, and MNDO/d. The possibility of improving the predictive ability of QSAR by including quantum chemistry parameters as feature variables has been investigated using tetrahedral tetrahalide examples. Copyright 2002 Wiley Periodicals, Inc.
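
    The logistic mapping referred to in the abstract above is the recurrence x(n+1) = r * x(n) * (1 - x(n)), which behaves chaotically for r = 4. The sketch below only generates such a sequence and rescales it to a parameter range. How CGANN actually injects these values into the population is not described in the abstract, so the rescaling step and all names are assumptions.

```python
def logistic_map(x0, n, r=4.0):
    """Return n successive iterates of the logistic map starting from x0."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs

def chaotic_values(x0, n, low, high):
    """Assumed usage: rescale chaotic iterates from [0, 1] to [low, high],
    e.g. to spread candidate network weights across a search range."""
    return [low + (high - low) * x for x in logistic_map(x0, n)]

print(chaotic_values(0.3, 5, -1.0, 1.0))
```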

  20. Research on prediction of agricultural machinery total power based on grey model optimized by genetic algorithm

    NASA Astrophysics Data System (ADS)

    Xie, Yan; Li, Mu; Zhou, Jin; Zheng, Chang-zheng

    2009-07-01

    Agricultural machinery total power is an important index to reflect and evaluate the level of agricultural mechanization. It is the power source of agricultural production, and is a main factor in enhancing comprehensive agricultural production capacity, expanding production scale and increasing the income of farmers. Its demand is affected by natural, economic, technological, social and other "grey" factors. Therefore, grey system theory can be used to analyze the development of agricultural machinery total power. A method in which a genetic algorithm optimizes the grey modeling process is introduced in this paper. This method makes full use of the advantages of the grey prediction model and of the genetic algorithm's ability to find the global optimum, so the prediction model is more accurate. Using data from a province, the GM (1, 1) model for predicting agricultural machinery total power was built based on grey system theory and the genetic algorithm. The result indicates that the model can be used as an effective tool for predicting agricultural machinery total power.
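
    For orientation, the classical GM (1, 1) grey model named in the abstract above can be sketched in a few lines. This is the textbook least-squares form, not the paper's GA-optimized variant: the abstract does not say which quantity the genetic algorithm tunes, so no GA step is included, and the sample series is hypothetical.

```python
import math

def gm11_fit(x0):
    """Fit GM(1,1): x0[k] + a * z[k] = b, with z the mean of adjacent
    accumulated values. Returns (a, b) by ordinary least squares."""
    x1, total = [], 0.0
    for v in x0:                      # accumulated generating sequence
        total += v
        x1.append(total)
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    m = len(y)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (m * szy - sz * sy) / (m * szz - sz * sz)  # y = slope * z + b
    b = (sy - slope * sz) / m
    return -slope, b                  # a is the negated slope

def gm11_predict(x0, a, b, k):
    """Predicted original-series value at index k (assumes a != 0)."""
    def x1_hat(i):
        return (x0[0] - b / a) * math.exp(-a * i) + b / a
    return x0[0] if k == 0 else x1_hat(k) - x1_hat(k - 1)

series = [100 * 1.1 ** k for k in range(6)]   # hypothetical yearly totals
a, b = gm11_fit(series)
print(gm11_predict(series, a, b, 6))          # one-step-ahead forecast
```

    A GA-based variant would typically wrap the fit in a search loop and score each candidate by its in-sample prediction error.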

  1. Estimation of leaf water contents from mid- and thermal infrared spectra by coupling genetic algorithm and partial least squares regression

    NASA Astrophysics Data System (ADS)

    Arshad, Muhammad; Ullah, Saleem; Khurshid, Khurram; Ali, Asad

    2017-10-01

    Leaf Water Content (LWC) is an essential constituent of plant leaves that determines vegetation health and its productivity. An accurate and on-time measurement of water content is crucial for planning irrigation, forecasting drought and predicting woodland fire. The retrieval of LWC from Visible to Shortwave Infrared (VSWIR: 0.4-2.5 μm) has been extensively investigated but little has been done in the Mid and Thermal Infrared (MIR and TIR: 2.50-14.0 μm) windows of the electromagnetic spectrum. This study is mainly focused on retrieval of LWC from Mid and Thermal Infrared, using Genetic Algorithm integrated with Partial Least Square Regression (PLSR). Genetic Algorithm fused with PLSR selects spectral wavebands with high predictive performance, i.e., yields high adjusted-R2 and low RMSE. In our case, GA-PLSR selected eight variables (bands) and yielded highly accurate models with adjusted-R2 of 0.93 and RMSEcv equal to 7.1%. The study also demonstrated that MIR is more sensitive to the variation in LWC as compared to TIR. However, the combined use of MIR and TIR spectra enhances the predictive performance in retrieval of LWC. The integration of Genetic Algorithm and PLSR not only increases the estimation precision by selecting the most sensitive spectral bands but also helps in identifying the important spectral regions for quantifying water stresses in vegetation. The findings of this study will allow future space missions (like HyspIRI) to position wavebands at sensitive regions for characterizing vegetation stresses.

  2. Application of a soft computing technique in predicting the percentage of shear force carried by walls in a rectangular channel with non-homogeneous roughness.

    PubMed

    Khozani, Zohreh Sheikh; Bonakdari, Hossein; Zaji, Amir Hossein

    2016-01-01

    Two new soft computing models, namely genetic programming (GP) and genetic artificial algorithm (GAA) neural network (a combination of modified genetic algorithm and artificial neural network methods) were developed in order to predict the percentage of shear force in a rectangular channel with non-homogeneous roughness. The ability of these methods to estimate the percentage of shear force was investigated. Moreover, the independent parameters' effectiveness in predicting the percentage of shear force was determined using sensitivity analysis. According to the results, the GP model demonstrated superior performance to the GAA model. A comparison was also made between the GP program determined as the best model and five equations obtained in prior research. The GP model with the lowest error values (root mean square error (RMSE) of 0.0515) had the best function compared with the other equations presented for rough and smooth channels as well as smooth ducts. The equation proposed for rectangular channels with rough boundaries (RMSE of 0.0642) outperformed the prior equations for smooth boundaries.

  3. Beam-column joint shear prediction using hybridized deep learning neural network with genetic algorithm

    NASA Astrophysics Data System (ADS)

    Mundher Yaseen, Zaher; Abdulmohsin Afan, Haitham; Tran, Minh-Tung

    2018-04-01

    It is scientifically evident that beam-column joints are a critical point in reinforced concrete (RC) structures under fluctuating loads. In this study, a novel hybrid data-intelligence model is developed to predict the joint shear behavior of exterior beam-column structure frames. The hybrid data-intelligence model is a genetic algorithm integrated with a deep learning neural network model (GA-DLNN). The genetic algorithm is used as a prior modelling phase for the input approximation, whereas the DLNN predictive model is used for the prediction phase. To demonstrate this structural problem, experimental data that define the dimensional and specimen properties are collected from the literature. The findings evidence the effectiveness of the hybrid GA-DLNN in modelling the beam-column joint shear problem. In addition, accurate prediction is achieved with fewer input variables owing to the feasibility of the evolutionary phase.

  4. Efficient genetic algorithms using discretization scheduling.

    PubMed

    McLay, Laura A; Goldberg, David E

    2005-01-01

    In many applications of genetic algorithms, there is a tradeoff between speed and accuracy in fitness evaluations when evaluations use numerical methods with varying discretization. In these types of applications, the cost and accuracy vary from discretization errors when implicit or explicit quadrature is used to estimate the function evaluations. This paper examines discretization scheduling, or how to vary the discretization within the genetic algorithm in order to use the least amount of computation time for a solution of a desired quality. The effectiveness of discretization scheduling can be determined by comparing its computation time to the computation time of a GA using a constant discretization. There are three ingredients for the discretization scheduling: population sizing, estimated time for each function evaluation and predicted convergence time analysis. Idealized one- and two-dimensional experiments and an inverse groundwater application illustrate the computational savings to be achieved from using discretization scheduling.

  5. Identification of handwriting by using the genetic algorithm (GA) and support vector machine (SVM)

    NASA Astrophysics Data System (ADS)

    Zhang, Qigui; Deng, Kai

    2016-12-01

    As portable digital cameras and camera phones become more and more popular, it is equally pressing to meet people's need to shoot at any time and to identify and store handwritten characters. In this paper, a genetic algorithm (GA) and a support vector machine (SVM) are used for identification of handwriting. Compared with the parameter-optimized method, this technique overcomes two defects: first, that method is easily trapped in a local optimum; second, searching for the best parameters over a larger range affects the efficiency of classification and prediction. As the experimental results suggest, GA-SVM has a higher recognition rate.

  6. Intelligent Soft Computing on Forex: Exchange Rates Forecasting with Hybrid Radial Basis Neural Network

    PubMed Central

    Marcek, Dusan; Durisova, Maria

    2016-01-01

    This paper deals with application of quantitative soft computing prediction models into financial area as reliable and accurate prediction models can be very helpful in management decision-making process. The authors suggest a new hybrid neural network which is a combination of the standard RBF neural network, a genetic algorithm, and a moving average. The moving average is supposed to enhance the outputs of the network using the error part of the original neural network. Authors test the suggested model on high-frequency time series data of USD/CAD and examine the ability to forecast exchange rate values for the horizon of one day. To determine the forecasting efficiency, they perform a comparative statistical out-of-sample analysis of the tested model with autoregressive models and the standard neural network. They also incorporate genetic algorithm as an optimizing technique for adapting parameters of ANN which is then compared with standard backpropagation and backpropagation combined with K-means clustering algorithm. Finally, the authors find out that their suggested hybrid neural network is able to produce more accurate forecasts than the standard models and can be helpful in eliminating the risk of making the bad decision in decision-making process. PMID:26977450

  7. Intelligent Soft Computing on Forex: Exchange Rates Forecasting with Hybrid Radial Basis Neural Network.

    PubMed

    Falat, Lukas; Marcek, Dusan; Durisova, Maria

    2016-01-01

    This paper deals with application of quantitative soft computing prediction models into financial area as reliable and accurate prediction models can be very helpful in management decision-making process. The authors suggest a new hybrid neural network which is a combination of the standard RBF neural network, a genetic algorithm, and a moving average. The moving average is supposed to enhance the outputs of the network using the error part of the original neural network. Authors test the suggested model on high-frequency time series data of USD/CAD and examine the ability to forecast exchange rate values for the horizon of one day. To determine the forecasting efficiency, they perform a comparative statistical out-of-sample analysis of the tested model with autoregressive models and the standard neural network. They also incorporate genetic algorithm as an optimizing technique for adapting parameters of ANN which is then compared with standard backpropagation and backpropagation combined with K-means clustering algorithm. Finally, the authors find out that their suggested hybrid neural network is able to produce more accurate forecasts than the standard models and can be helpful in eliminating the risk of making the bad decision in decision-making process.

  8. Prediction of road traffic death rate using neural networks optimised by genetic algorithm.

    PubMed

    Jafari, Seyed Ali; Jahandideh, Sepideh; Jahandideh, Mina; Asadabadi, Ebrahim Barzegari

    2015-01-01

    Road traffic injuries (RTIs) are realised as a main cause of public health problems at global, regional and national levels. Therefore, prediction of road traffic death rate will be helpful in its management. Based on this fact, we used an artificial neural network model optimised through Genetic algorithm to predict mortality. In this study, a five-fold cross-validation procedure on a data set containing total of 178 countries was used to verify the performance of models. The best-fit model was selected according to the root mean square errors (RMSE). Genetic algorithm, as a powerful model which has not been introduced in prediction of mortality to this extent in previous studies, showed high performance. The lowest RMSE obtained was 0.0808. Such satisfactory results could be attributed to the use of Genetic algorithm as a powerful optimiser which selects the best input feature set to be fed into the neural networks. Seven factors have been known as the most effective factors on the road traffic mortality rate by high accuracy. The gained results displayed that our model is very promising and may play a useful role in developing a better method for assessing the influence of road traffic mortality risk factors.

  9. Paroxysmal atrial fibrillation prediction based on HRV analysis and non-dominated sorting genetic algorithm III.

    PubMed

    Boon, K H; Khalil-Hani, M; Malarvili, M B

    2018-01-01

    This paper presents a method that is able to predict paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals than existing methods and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinically important because it increases the possibility to electrically stabilize and prevent the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, which consists of the stages of pre-processing, HRV feature extraction, and support vector machine (SVM) model. The pre-processing stage comprises heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, and non-linear HRV features are extracted from the pre-processed data in the feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most of the previous works. This accuracy rate is achieved even with the HRV signal length being reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is that the sensitivity rate, which is considered more important than other performance metrics in this paper, can be improved with the trade-off of lower specificity.

  10. Multi-model data fusion to improve an early warning system for hypo-/hyperglycemic events.

    PubMed

    Botwey, Ransford Henry; Daskalaki, Elena; Diem, Peter; Mougiakakou, Stavroula G

    2014-01-01

    Correct predictions of future blood glucose levels in individuals with Type 1 Diabetes (T1D) can be used to provide early warning of upcoming hypo-/hyperglycemic events and thus to improve the patient's safety. To increase prediction accuracy and efficiency, various approaches have been proposed which combine multiple predictors to produce superior results compared to single predictors. Three methods for model fusion are presented and comparatively assessed. Data from 23 T1D subjects under sensor-augmented pump (SAP) therapy were used in two adaptive data-driven models (an autoregressive model with output correction, cARX, and a recurrent neural network, RNN). Data fusion techniques based on i) Dempster-Shafer Evidential Theory (DST), ii) Genetic Algorithms (GA), and iii) Genetic Programming (GP) were used to merge the complementary performances of the prediction models. The fused output is used in a warning algorithm to issue alarms of upcoming hypo-/hyperglycemic events. The fusion schemes showed improved performance with lower root mean square errors, lower time lags, and higher correlation. In the warning algorithm, median daily false alarms (DFA) of 0.25%, and 100% correct alarms (CA) were obtained for both event types. The detection times (DT) before occurrence of events were 13.0 and 12.1 min respectively for hypo-/hyperglycemic events. Compared to the cARX and RNN models, and a linear fusion of the two, the proposed fusion schemes represent a significant improvement.

  11. An Efficient Rank Based Approach for Closest String and Closest Substring

    PubMed Central

    2012-01-01

    This paper aims to present a new genetic approach that uses rank distance for solving two known NP-hard problems, and to compare rank distance with other distance measures for strings. The two NP-hard problems we are trying to solve are closest string and closest substring. For each problem we build a genetic algorithm and we describe the genetic operations involved. Both genetic algorithms use a fitness function based on rank distance. We compare our algorithms with other genetic algorithms that use different distance measures, such as Hamming distance or Levenshtein distance, on real DNA sequences. Our experiments show that the genetic algorithms based on rank distance have the best results. PMID:22675483
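
    A rank-distance fitness function of the kind described above might look like the following minimal sketch. The abstract does not define rank distance, so the definition used here is an assumption: each character is indexed by its occurrence number, shared indexed characters contribute the absolute difference of their 1-based positions, and unshared ones contribute their own position. The minimax cost and the names `annotate`, `rank_distance` and `closest_string_cost` are illustrative, not taken from the paper.

```python
from collections import defaultdict


def annotate(s):
    """Map each indexed character (char, occurrence number) to its 1-based position."""
    seen = defaultdict(int)
    positions = {}
    for pos, ch in enumerate(s, start=1):
        seen[ch] += 1
        positions[(ch, seen[ch])] = pos
    return positions


def rank_distance(u, v):
    """Assumed definition: sum of position differences over indexed characters.

    An indexed character missing from one string is treated as position 0,
    so it contributes its full position in the other string.
    """
    pu, pv = annotate(u), annotate(v)
    return sum(abs(pu.get(k, 0) - pv.get(k, 0)) for k in pu.keys() | pv.keys())


def hamming_distance(u, v):
    """Hamming distance for equal-length strings, for comparison."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(u, v))


def closest_string_cost(candidate, strings, dist=rank_distance):
    """Hypothetical GA cost to minimise: worst-case distance to any input string."""
    return max(dist(candidate, s) for s in strings)
```

    Passing `dist=hamming_distance` instead gives the same cost under a different measure, which is the kind of comparison the abstract reports.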

  12. An improved partial least-squares regression method for Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Momenpour Tehran Monfared, Ali; Anis, Hanan

    2017-10-01

    It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using root mean square errors of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in limit of detection of Raman biosensing ranged from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.

  13. Predicting Protein Structure Using Parallel Genetic Algorithms.

    DTIC Science & Technology

    1994-12-01

    Molecular dynamics attempts to simulate the protein folding process. However, the time steps required for this simulation are on the order of one...harmonics. These two factors have limited molecular dynamics simulations to less than a few nanoseconds (10^-9 sec), even on today's fastest supercomputers...

  14. Efficient search, mapping, and optimization of multi-protein genetic systems in diverse bacteria

    PubMed Central

    Farasat, Iman; Kushwaha, Manish; Collens, Jason; Easterbrook, Michael; Guido, Matthew; Salis, Howard M

    2014-01-01

    Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs. PMID:24952589

  15. Optimization of Bioactive Ingredient Extraction from Chinese Herbal Medicine Glycyrrhiza glabra: A Comparative Study of Three Optimization Models

    PubMed Central

    Li, Xiaohong; Zhang, Yuyan

    2018-01-01

    The ultraviolet spectrophotometric method is often used for determining the content of glycyrrhizic acid from Chinese herbal medicine Glycyrrhiza glabra. Based on the traditional single variable approach, four extraction parameters of ammonia concentration, ethanol concentration, circumfluence time, and liquid-solid ratio are adopted as the independent extraction variables. In the present work, central composite design of four factors and five levels is applied to design the extraction experiments. Subsequently, the prediction models of response surface methodology, artificial neural networks, and genetic algorithm-artificial neural networks are developed to analyze the obtained experimental data, while the genetic algorithm is utilized to find the optimal extraction parameters for the above well-established models. It is found that the optimization of extraction technology is presented as ammonia concentration 0.595%, ethanol concentration 58.45%, return time 2.5 h, and liquid-solid ratio 11.065 : 1. Under these conditions, the model predictive value is 381.24 mg, the experimental average value is 376.46 mg, and the expectation discrepancy is 4.78 mg. For the first time, a comparative study of these three approaches is conducted for the evaluation and optimization of the effects of the extraction independent variables. Furthermore, it is demonstrated that the combinational method of genetic algorithm and artificial neural networks provides a more reliable and more accurate strategy for design and optimization of glycyrrhizic acid extraction from Glycyrrhiza glabra. PMID:29887907

  16. Optimization of Bioactive Ingredient Extraction from Chinese Herbal Medicine Glycyrrhiza glabra: A Comparative Study of Three Optimization Models.

    PubMed

    Yu, Li; Jin, Weifeng; Li, Xiaohong; Zhang, Yuyan

    2018-01-01

    The ultraviolet spectrophotometric method is often used for determining the content of glycyrrhizic acid from Chinese herbal medicine Glycyrrhiza glabra . Based on the traditional single variable approach, four extraction parameters of ammonia concentration, ethanol concentration, circumfluence time, and liquid-solid ratio are adopted as the independent extraction variables. In the present work, central composite design of four factors and five levels is applied to design the extraction experiments. Subsequently, the prediction models of response surface methodology, artificial neural networks, and genetic algorithm-artificial neural networks are developed to analyze the obtained experimental data, while the genetic algorithm is utilized to find the optimal extraction parameters for the above well-established models. It is found that the optimization of extraction technology is presented as ammonia concentration 0.595%, ethanol concentration 58.45%, return time 2.5 h, and liquid-solid ratio 11.065 : 1. Under these conditions, the model predictive value is 381.24 mg, the experimental average value is 376.46 mg, and the expectation discrepancy is 4.78 mg. For the first time, a comparative study of these three approaches is conducted for the evaluation and optimization of the effects of the extraction independent variables. Furthermore, it is demonstrated that the combinational method of genetic algorithm and artificial neural networks provides a more reliable and more accurate strategy for design and optimization of glycyrrhizic acid extraction from Glycyrrhiza glabra .

  17. Prediction of Unsteady Aerodynamic Coefficients at High Angles of Attack

    NASA Technical Reports Server (NTRS)

    Pamadi, Bandu N.; Murphy, Patrick C.; Klein, Vladislav; Brandon, Jay M.

    2001-01-01

    The nonlinear indicial response method is used to model the unsteady aerodynamic coefficients in the low speed longitudinal oscillatory wind tunnel test data of the 0.1 scale model of the F-16XL aircraft. Exponential functions are used to approximate the deficiency function in the indicial response. Using one set of oscillatory wind tunnel data and parameter identification method, the unknown parameters in the exponential functions are estimated. The genetic algorithm is used as a least square minimizing algorithm. The assumed model structures and parameter estimates are validated by comparing the predictions with other sets of available oscillatory wind tunnel test data.

  18. Using genetic algorithms to optimise current and future health planning--the example of ambulance locations.

    PubMed

    Sasaki, Satoshi; Comber, Alexis J; Suzuki, Hiroshi; Brunsdon, Chris

    2010-01-28

    Ambulance response time is a crucial factor in patient survival. The number of emergency cases (EMS cases) requiring an ambulance is increasing due to changes in population demographics. This is increasing ambulance response times to the emergency scene. This paper predicts EMS cases for 5-year intervals from 2020 to 2050 by correlating current EMS cases with demographic factors at the level of the census area and predicted population changes. It then applies a modified grouping genetic algorithm to compare current and future optimal locations and numbers of ambulances. Sets of potential locations were evaluated in terms of the (current and predicted) EMS case distances to those locations. Future EMS demands were predicted to increase by 2030 using the model (R2 = 0.71). The optimal locations of ambulances based on future EMS cases were compared with current locations and with optimal locations modelled on current EMS case data. Optimising the locations of ambulance stations reduced the average response times by 57 seconds. Current and predicted future EMS demand at modelled locations were calculated and compared. The reallocation of ambulances to optimal locations improved response times and could contribute to higher survival rates from life-threatening medical events. Modelling EMS case 'demand' over census areas allows the data to be correlated to population characteristics and optimal 'supply' locations to be identified. Comparing current and future optimal scenarios allows more nuanced planning decisions to be made. This is a generic methodology that could be used to provide evidence in support of public health planning and decision making.

  19. Comparison and optimization of in silico algorithms for predicting the pathogenicity of sodium channel variants in epilepsy.

    PubMed

    Holland, Katherine D; Bouley, Thomas M; Horn, Paul S

    2017-07-01

    Variants in neuronal voltage-gated sodium channel α-subunits genes SCN1A, SCN2A, and SCN8A are common in early onset epileptic encephalopathies and other autosomal dominant childhood epilepsy syndromes. However, in clinical practice, missense variants are often classified as variants of uncertain significance when missense variants are identified but heritability cannot be determined. Genetic testing reports often include results of computational tests to estimate pathogenicity and the frequency of that variant in population-based databases. The objective of this work was to enhance clinicians' understanding of results by (1) determining how effectively computational algorithms predict epileptogenicity of sodium channel (SCN) missense variants; (2) optimizing their predictive capabilities; and (3) determining if epilepsy-associated SCN variants are present in population-based databases. This will help clinicians better understand the results of indeterminate SCN test results in people with epilepsy. Pathogenic, likely pathogenic, and benign variants in SCNs were identified using databases of sodium channel variants. Benign variants were also identified from population-based databases. Eight algorithms commonly used to predict pathogenicity were compared. In addition, logistic regression was used to determine if a combination of algorithms could better predict pathogenicity. Based on American College of Medical Genetic Criteria, 440 variants were classified as pathogenic or likely pathogenic and 84 were classified as benign or likely benign. Twenty-eight variants previously associated with epilepsy were present in population-based gene databases. The output provided by most computational algorithms had a high sensitivity but low specificity with an accuracy of 0.52-0.77. Accuracy could be improved by adjusting the threshold for pathogenicity. 
Using this adjustment, the Mendelian Clinically Applicable Pathogenicity (M-CAP) algorithm had an accuracy of 0.90 and a combination of algorithms increased the accuracy to 0.92. Potentially pathogenic variants are present in population-based sources. Most computational algorithms overestimate pathogenicity; however, a weighted combination of several algorithms increased classification accuracy to >0.90.

  20. Combining neural networks and genetic algorithms for hydrological flow forecasting

    NASA Astrophysics Data System (ADS)

    Neruda, Roman; Srejber, Jan; Neruda, Martin; Pascenko, Petr

    2010-05-01

    We present a neural network approach to rainfall-runoff modeling for small river basins based on several time series of hourly measured data. Different neural networks are considered for short-term runoff predictions (from one to six hours lead time) based on runoff and rainfall data observed in previous time steps. Correlation analysis shows that runoff data, short-term rainfall history, and aggregated API values are the most significant data for the prediction. Neural models of multilayer perceptron and radial basis function networks with different numbers of units are used and compared with more traditional linear time series predictors. Out of a possible 48 hours of relevant history of all the input variables, the most important ones are selected by means of input filters created by a genetic algorithm. The genetic algorithm works with a population of binary encoded vectors defining input selection patterns. Standard genetic operators of two-point crossover, random bit-flipping mutation, and tournament selection were used. The evaluation of the objective function of each individual consists of several rounds of building and testing a particular neural network model. The whole procedure is computationally demanding (taking hours to days on a desktop PC), thus a high-performance mainframe computer has been used for our experiments. Results based on two years' worth of data from the Ploucnice river in Northern Bohemia suggest that the main problems connected with this approach to modeling are overtraining, which can lead to poor generalization, and a relatively small number of extreme events, which makes it difficult for a model to predict the amplitude of the event. Thus, experiments with both absolute and relative runoff predictions were carried out. In general it can be concluded that the neural models show about 5 per cent improvement in terms of efficiency coefficient over linear models. Multilayer perceptrons with one hidden layer trained by the back propagation algorithm and predicting relative runoff show the best behavior so far. Utilizing the genetically evolved input filter improves the performance by yet another 5 per cent. In the future we would like to continue with experiments in on-line prediction using real-time data from the Smeda River with a 6-hour lead time forecast. Following the operational reality we will focus on classification of the runoffs into flood alert levels, and reformulation of the time series prediction task as a classification problem. The main goal of all this work is to improve the flood warning system operated by the Czech Hydrometeorological Institute.
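
    The three operators named in this abstract (tournament selection, two-point crossover, bit-flip mutation on binary input masks) can be sketched as below. This is an illustration only, not the authors' code: the population size, mutation rate, tournament size and the `toy_fitness` function are assumptions, and the toy fitness stands in for the expensive step of building and testing a neural network for each mask.

```python
import random


def tournament_select(population, fitness, k=3, rng=random):
    """Return the fittest of k individuals drawn at random without replacement."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]


def two_point_crossover(a, b, rng=random):
    """Swap the segment between two random cut points of equal-length parents."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def bit_flip_mutation(mask, rate=0.02, rng=random):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]


def evolve(fitness_fn, n_inputs=48, pop_size=20, generations=30, seed=0):
    """Evolve a binary input-selection mask; all settings here are illustrative."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_inputs)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        fitness = [fitness_fn(ind) for ind in population]
        children = []
        while len(children) < pop_size:
            p1 = tournament_select(population, fitness, rng=rng)
            p2 = tournament_select(population, fitness, rng=rng)
            c1, c2 = two_point_crossover(p1, p2, rng)
            children.append(bit_flip_mutation(c1, rng=rng))
            children.append(bit_flip_mutation(c2, rng=rng))
        population = children[:pop_size]
    return max(population, key=fitness_fn)


def toy_fitness(mask):
    """Hypothetical stand-in: reward the first six inputs, penalise mask size."""
    return 2 * sum(mask[:6]) - sum(mask)


best_mask = evolve(toy_fitness)
```

    In a real setting `fitness_fn` would train and validate a model on the inputs selected by the mask, which is why the abstract reports run times of hours to days.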

  1. A genetic algorithm approach in interface and surface structure optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Jian

    The thesis is divided into two parts. In the first part a global optimization method is developed for interface and surface structure optimization. Two prototype systems are chosen to be studied. One is Si[001] symmetric tilted grain boundaries and the other is the Ag/Au induced Si(111) surface. It is found that the Genetic Algorithm is very efficient in finding lowest energy structures in both cases. Not only can existing structures from the experiments be reproduced, but many new structures can also be predicted using the Genetic Algorithm. Thus it is shown that the Genetic Algorithm is an extremely powerful tool for material structure prediction. The second part of the thesis is devoted to the explanation of an experimental observation of thermal radiation from three-dimensional tungsten photonic crystal structures. The experimental results seem astounding and confusing, yet the theoretical models in the paper reveal the physical insight behind the phenomena and reproduce the experimental results well.

  2. Evaluation of an automated spike-and-wave complex detection algorithm in the EEG from a rat model of absence epilepsy.

    PubMed

    Bauquier, Sebastien H; Lai, Alan; Jiang, Jonathan L; Sui, Yi; Cook, Mark J

    2015-10-01

    The aim of this prospective blinded study was to evaluate an automated algorithm for spike-and-wave discharge (SWD) detection applied to EEGs from genetic absence epilepsy rats from Strasbourg (GAERS). Five GAERS underwent four sessions of 20-min EEG recording. Each EEG was manually analyzed for SWDs longer than one second by two investigators and automatically using an algorithm developed in MATLAB®. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for the manual (reference) versus the automatic (test) methods. The results showed that the algorithm had specificity, sensitivity, PPV and NPV >94%, comparable to published methods that are based on analyzing EEG changes in the frequency domain. This provides a good alternative as a method designed to mimic human manual marking in the time domain.
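
    The four agreement measures reported here follow directly from the counts of true and false positives and negatives when the manual marking is taken as the reference. A minimal sketch, with a hypothetical function name and made-up counts rather than the study's data:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp/fn: reference-positive events the test method detected/missed.
    tn/fp: reference-negative segments the test method rejected/flagged.
    """
    return {
        "sensitivity": tp / (tp + fn),  # detected share of true events
        "specificity": tn / (tn + fp),  # rejected share of non-events
        "ppv": tp / (tp + fp),          # detections that were real events
        "npv": tn / (tn + fn),          # rejections that were real non-events
    }


# Illustrative counts only.
example = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
```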

  3. G/SPLINES: A hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's genetic algorithm

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1991-01-01

    G/SPLINES are a hybrid of Friedman's Multivariable Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.

  4. Refined Genetic Algorithms for Polypeptide Structure Prediction.

    DTIC Science & Technology

    1996-12-01

    III. Algorithm Analysis, Design, and Implementation; 3.1 Analysis...; 3.2 Algorithm Design and Implementation; 3.2.1...; IV. Experiment Design

  5. Using a genetic algorithm to optimize a water-monitoring network for accuracy and cost effectiveness

    NASA Astrophysics Data System (ADS)

    Julich, R. J.

    2004-05-01

    The purpose of this project is to determine the optimal spatial distribution of water-monitoring wells to maximize important data collection and to minimize the cost of managing the network. We have employed a genetic algorithm (GA) towards this goal. The GA uses a simple fitness measure with two parts: the first part awards a maximal score to those combinations of hydraulic head observations whose net uncertainty is closest to the value representing all observations present, thereby maximizing accuracy; the second part applies a penalty function to minimize the number of observations, thereby minimizing the overall cost of the monitoring network. We used the linear statistical inference equation to calculate standard deviations on predictions from a numerical model generated for the 501-observation Death Valley Regional Flow System as the basis for our uncertainty calculations. We have organized the results to address the following three questions: 1) what is the optimal design strategy for a genetic algorithm to optimize this problem domain; 2) what is the consistency of solutions over several optimization runs; and 3) how do these results compare to what is known about the conceptual hydrogeology? Our results indicate that genetic algorithms are a more efficient and robust method for solving this class of optimization problems than traditional optimization approaches.

  6. A Novel Admixture-Based Pharmacogenetic Approach to Refine Warfarin Dosing in Caribbean Hispanics.

    PubMed

    Duconge, Jorge; Ramos, Alga S; Claudio-Campos, Karla; Rivera-Miranda, Giselle; Bermúdez-Bosch, Luis; Renta, Jessicca Y; Cadilla, Carmen L; Cruz, Iadelisse; Feliu, Juan F; Vergara, Cunegundo; Ruaño, Gualberto

    2016-01-01

    This study is aimed at developing a novel admixture-adjusted pharmacogenomic approach to individually refine warfarin dosing in Caribbean Hispanic patients. A multiple linear regression analysis of effective warfarin doses versus relevant genotypes, admixture, clinical and demographic factors was performed in 255 patients and further validated externally in another cohort of 55 individuals. The admixture-adjusted, genotype-guided warfarin dosing refinement algorithm developed in Caribbean Hispanics showed better predictability (R2 = 0.70, MAE = 0.72mg/day) than a clinical algorithm that excluded genotypes and admixture (R2 = 0.60, MAE = 0.99mg/day), and outperformed two prior pharmacogenetic algorithms in predicting effective dose in this population. For patients at the highest risk of adverse events, 45.5% of the dose predictions using the developed pharmacogenetic model resulted in ideal dose as compared with only 29% when using the clinical non-genetic algorithm (p<0.001). The admixture-driven pharmacogenetic algorithm predicted 58% of warfarin dose variance when externally validated in 55 individuals from an independent validation cohort (MAE = 0.89 mg/day, 24% mean bias). Results supported our rationale to incorporate individual's genotypes and unique admixture metrics into pharmacogenetic refinement models in order to increase predictability when expanding them to admixed populations like Caribbean Hispanics. ClinicalTrials.gov NCT01318057.

  7. Chi-square-based scoring function for categorization of MEDLINE citations.

    PubMed

    Kastrin, A; Peterlin, B; Hristovski, D

    2010-01-01

    Text categorization has been used in biomedical informatics for identifying documents containing relevant topics of interest. We developed a simple method that uses a chi-square-based scoring function to determine the likelihood that a MEDLINE citation contains a genetics-relevant topic. Our procedure requires construction of a genetic and a nongenetic domain document corpus. We used MeSH descriptors assigned to MEDLINE citations for this categorization task. We compared frequencies of MeSH descriptors between the two corpora by applying the chi-square test. A MeSH descriptor was considered to be a positive indicator if its relative observed frequency in the genetic domain corpus was greater than its relative observed frequency in the nongenetic domain corpus. The output of the proposed method is a list of scores for all the citations, with the highest score given to those citations containing MeSH descriptors typical for the genetic domain. Validation was done on a set of 734 manually annotated MEDLINE citations. It achieved predictive accuracy of 0.87 with 0.69 recall and 0.64 precision. We evaluated the method by comparing it to three machine-learning algorithms (support vector machines, decision trees, naïve Bayes). Although the differences were not statistically significant, results showed that our chi-square scoring performs as well as the compared machine-learning algorithms. We suggest that the chi-square scoring is an effective solution to help categorize MEDLINE citations. The algorithm is implemented in the BITOLA literature-based discovery support system as a preprocessor for the gene symbol disambiguation process.
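
    The scoring idea in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: each descriptor gets the chi-square statistic of its 2x2 corpus contingency table, signed by whether it is relatively more frequent in the genetic corpus, and a citation's score is assumed to be the sum of its descriptors' signed weights. Both corpora are assumed to be non-empty.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def descriptor_weights(genetic_docs, nongenetic_docs):
    """Signed chi-square weight per descriptor; each document is a set of descriptors."""
    n_g, n_n = len(genetic_docs), len(nongenetic_docs)
    vocab = set().union(*genetic_docs, *nongenetic_docs)
    weights = {}
    for term in vocab:
        a = sum(term in doc for doc in genetic_docs)     # genetic docs with term
        c = sum(term in doc for doc in nongenetic_docs)  # nongenetic docs with term
        chi2 = chi_square_2x2(a, n_g - a, c, n_n - c)
        # Positive indicator if relatively more frequent in the genetic corpus.
        sign = 1.0 if a / n_g > c / n_n else -1.0
        weights[term] = sign * chi2
    return weights


def score(citation, weights):
    """Assumed aggregation: sum of signed weights of the citation's descriptors."""
    return sum(weights.get(term, 0.0) for term in citation)
```

    Citations can then be ranked by `score`, with a threshold on the score chosen on annotated data to trade recall against precision.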

  8. Improved hybrid optimization algorithm for 3D protein structure prediction.

    PubMed

    Zhou, Changjun; Hou, Caixia; Wei, Xiaopeng; Zhang, Qiang

    2014-07-01

    A new improved hybrid optimization algorithm, the PGATS algorithm, which is based on the toy off-lattice model, is presented for dealing with three-dimensional protein structure prediction problems. The algorithm combines the particle swarm optimization (PSO), genetic algorithm (GA), and tabu search (TS) algorithms. In addition, we adopt several improvement strategies: a stochastic disturbance factor is added to the particle swarm optimization to improve its search ability; the crossover and mutation operations of the genetic algorithm are changed to a kind of random linear method; and finally the tabu search algorithm is improved by appending a mutation operator. Through the combination of a variety of strategies and algorithms, protein structure prediction (PSP) in a 3D off-lattice model is achieved. The PSP problem is an NP-hard problem, but it can be treated as a global optimization problem with multiple extrema and multiple parameters. This is the theoretical principle of the hybrid optimization algorithm that is proposed in this paper. The algorithm combines local search and global search, which overcomes the shortcomings of a single algorithm and gives full play to the advantages of each algorithm. The method is verified on the current universal standard sequences: Fibonacci sequences and real protein sequences. Experiments show that the proposed new method outperforms single algorithms in the accuracy of calculating the protein sequence energy value, and that it is an effective way to predict the structure of proteins.

  9. Prediction of Compressional, Shear, and Stoneley Wave Velocities from Conventional Well Log Data Using a Committee Machine with Intelligent Systems

    NASA Astrophysics Data System (ADS)

    Asoodeh, Mojtaba; Bagheripour, Parisa

    2012-01-01

    Measurement of compressional, shear, and Stoneley wave velocities, carried out by dipole sonic imager (DSI) logs, provides invaluable data in geophysical interpretation, geomechanical studies and hydrocarbon reservoir characterization. The presented study proposes an improved methodology for making a quantitative formulation between conventional well logs and sonic wave velocities. First, sonic wave velocities were predicted from conventional well logs using artificial neural network, fuzzy logic, and neuro-fuzzy algorithms. Subsequently, a committee machine with intelligent systems was constructed by virtue of hybrid genetic algorithm-pattern search technique while outputs of artificial neural network, fuzzy logic and neuro-fuzzy models were used as inputs of the committee machine. It is capable of improving the accuracy of final prediction through integrating the outputs of aforementioned intelligent systems. The hybrid genetic algorithm-pattern search tool, embodied in the structure of committee machine, assigns a weight factor to each individual intelligent system, indicating its involvement in overall prediction of DSI parameters. This methodology was implemented in Asmari formation, which is the major carbonate reservoir rock of Iranian oil field. A group of 1,640 data points was used to construct the intelligent model, and a group of 800 data points was employed to assess the reliability of the proposed model. The results showed that the committee machine with intelligent systems performed more effectively compared with individual intelligent systems performing alone.
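
    A rough sketch of the committee-machine idea in this abstract: each expert model's output gets a weight factor, and a search procedure picks the weights that minimize prediction error. The small genetic algorithm below (truncation selection, arithmetic crossover, Gaussian mutation) is a simplified stand-in chosen for illustration; it is not the hybrid genetic algorithm-pattern search tool used in the study, and all names and parameter values are assumptions.

```python
import random


def combine(weights, expert_outputs):
    """Weighted sum of the experts' predictions for each sample."""
    return [sum(w * y for w, y in zip(weights, sample)) for sample in expert_outputs]


def mse(weights, expert_outputs, targets):
    preds = combine(weights, expert_outputs)
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)


def ga_weights(expert_outputs, targets, pop_size=40, generations=200, seed=0):
    """Search for committee weights with a small elitist GA."""
    rng = random.Random(seed)
    n = len(expert_outputs[0])  # number of experts
    pop = [[rng.random() for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: mse(w, expert_outputs, targets))
        parents = pop[: pop_size // 2]  # keep the better half unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            alpha = rng.random()
            # Arithmetic crossover followed by a one-gene Gaussian mutation.
            child = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
            child[rng.randrange(n)] += rng.gauss(0.0, 0.05)
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda w: mse(w, expert_outputs, targets))
```

    Here each row of `expert_outputs` would hold, for one depth sample, the predictions of the neural network, fuzzy logic, and neuro-fuzzy models, and `targets` the measured velocity.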

  10. Comparison of genetic algorithms with conjugate gradient methods

    NASA Technical Reports Server (NTRS)

    Bosworth, J. L.; Foo, N. Y.; Zeigler, B. P.

    1972-01-01

    Genetic algorithms for mathematical function optimization are modeled on search strategies employed in natural adaptation. Comparisons of genetic algorithms with conjugate gradient methods, which were made on an IBM 1800 digital computer, show that genetic algorithms display superior performance over gradient methods for functions which are poorly behaved mathematically, for multimodal functions, and for functions obscured by additive random noise. Genetic methods offer performance comparable to gradient methods for many of the standard functions.

  11. Genetic algorithm based adaptive neural network ensemble and its application in predicting carbon flux

    USGS Publications Warehouse

    Xue, Y.; Liu, S.; Hu, Y.; Yang, J.; Chen, Q.

    2007-01-01

    To improve the accuracy in prediction, Genetic Algorithm based Adaptive Neural Network Ensemble (GA-ANNE) is presented. Intersections are allowed between different training sets based on the fuzzy clustering analysis, which ensures the diversity as well as the accuracy of individual Neural Networks (NNs). Moreover, to improve the accuracy of the adaptive weights of individual NNs, GA is used to optimize the cluster centers. Empirical results in predicting carbon flux of Duke Forest reveal that GA-ANNE can predict the carbon flux more accurately than Radial Basis Function Neural Network (RBFNN), Bagging NN ensemble, and ANNE. © 2007 IEEE.

  12. A tuning algorithm for model predictive controllers based on genetic algorithms and fuzzy decision making.

    PubMed

    van der Lee, J H; Svrcek, W Y; Young, B R

    2008-01-01

    Model Predictive Control is a valuable tool for the process control engineer in a wide variety of applications. Because of this the structure of an MPC can vary dramatically from application to application. There have been a number of works dedicated to MPC tuning for specific cases. Since MPCs can differ significantly, this means that these tuning methods become inapplicable and a trial and error tuning approach must be used. This can be quite time consuming and can result in non-optimum tuning. In an attempt to resolve this, a generalized automated tuning algorithm for MPCs was developed. This approach is numerically based and combines a genetic algorithm with multi-objective fuzzy decision-making. The key advantages to this approach are that genetic algorithms are not problem specific and only need to be adapted to account for the number and ranges of tuning parameters for a given MPC. As well, multi-objective fuzzy decision-making can handle qualitative statements of what optimum control is, in addition to being able to use multiple inputs to determine tuning parameters that best match the desired results. This is particularly useful for multi-input, multi-output (MIMO) cases where the definition of "optimum" control is subject to the opinion of the control engineer tuning the system. A case study will be presented in order to illustrate the use of the tuning algorithm. This will include how different definitions of "optimum" control can arise, and how they are accounted for in the multi-objective decision making algorithm. The resulting tuning parameters from each of the definition sets will be compared, and in doing so show that the tuning parameters vary in order to meet each definition of optimum control, thus showing the generalized automated tuning algorithm approach for tuning MPCs is feasible.

  13. The Predicted Cross Value for Genetic Introgression of Multiple Alleles

    PubMed Central

    Han, Ye; Cameron, John N.; Wang, Lizhi; Beavis, William D.

    2017-01-01

    We consider the plant genetic improvement challenge of introgressing multiple alleles from a homozygous donor to a recipient. First, we frame the project as an algorithmic process that can be mathematically formulated. We then introduce a novel metric for selecting breeding parents that we refer to as the predicted cross value (PCV). Unlike estimated breeding values, which represent predictions of general combining ability, the PCV predicts specific combining ability. The PCV takes estimates of recombination frequencies as an input vector and calculates the probability that a pair of parents will produce a gamete with desirable alleles at all specified loci. We compared the PCV approach with existing estimated-breeding-value approaches in two simulation experiments, in which 7 and 20 desirable alleles were to be introgressed from a donor line into a recipient line. Results suggest that the PCV is more efficient and effective for multi-allelic trait introgression. We also discuss how operations research can be used for other crop genetic improvement projects and suggest several future research directions. PMID:28122824

  14. Classification of Alzheimer's disease and prediction of mild cognitive impairment-to-Alzheimer's conversion from structural magnetic resonance imaging using feature ranking and a genetic algorithm.

    PubMed

    Beheshti, Iman; Demirel, Hasan; Matsuda, Hiroshi

    2017-04-01

    We developed a novel computer-aided diagnosis (CAD) system that uses feature-ranking and a genetic algorithm to analyze structural magnetic resonance imaging data; using this system, we can predict conversion of mild cognitive impairment (MCI)-to-Alzheimer's disease (AD) at between one and three years before clinical diagnosis. The CAD system was developed in four stages. First, we used a voxel-based morphometry technique to investigate global and local gray matter (GM) atrophy in an AD group compared with healthy controls (HCs). Regions with significant GM volume reduction were segmented as volumes of interest (VOIs). Second, these VOIs were used to extract voxel values from the respective atrophy regions in AD, HC, stable MCI (sMCI) and progressive MCI (pMCI) patient groups. The voxel values were then extracted into a feature vector. Third, at the feature-selection stage, all features were ranked according to their respective t-test scores and a genetic algorithm designed to find the optimal feature subset. The Fisher criterion was used as part of the objective function in the genetic algorithm. Finally, the classification was carried out using a support vector machine (SVM) with 10-fold cross validation. We evaluated the proposed automatic CAD system by applying it to baseline values from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (160 AD, 162 HC, 65 sMCI and 71 pMCI subjects). The experimental results indicated that the proposed system is capable of distinguishing between sMCI and pMCI patients, and would be appropriate for practical use in a clinical setting. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Risk-Seeking Versus Risk-Avoiding Investments in Noisy Periodic Environments

    NASA Astrophysics Data System (ADS)

    Navarro-Barrientos, J. Emeterio; Walter, Frank E.; Schweitzer, Frank

    We study the performance of various agent strategies in an artificial investment scenario. Agents are equipped with a budget, x(t), and at each time step invest a particular fraction, q(t), of their budget. The return on investment (RoI), r(t), is characterized by a periodic function with different types and levels of noise. Risk-avoiding agents choose their fraction q(t) proportional to the expected positive RoI, while risk-seeking agents always choose a maximum value qmax if they predict the RoI to be positive ("everything on red"). In addition to these different strategies, agents have different capabilities to predict the future r(t), dependent on their internal complexity. Here, we compare "zero-intelligent" agents using technical analysis (such as moving least squares) with agents using reinforcement learning or genetic algorithms to predict r(t). The performance of agents is measured by their average budget growth after a certain number of time steps. We present results of extensive computer simulations, which show that, for our given artificial environment, (i) the risk-seeking strategy outperforms the risk-avoiding one, and (ii) the genetic algorithm was able to find this optimal strategy itself, and thus outperforms other prediction approaches considered.

  16. Inferring genetic interactions via a nonlinear model and an optimization algorithm.

    PubMed

    Chen, Chung-Ming; Lee, Chih; Chuang, Cheng-Long; Wang, Chia-Chang; Shieh, Grace S

    2010-02-26

    Biochemical pathways are gradually becoming recognized as central to complex human diseases and recently genetic/transcriptional interactions have been shown to be able to predict partial pathways. With the abundant information made available by microarray gene expression data (MGED), nonlinear modeling of these interactions is now feasible. Two of the latest advances in nonlinear modeling used sigmoid models to depict transcriptional interaction of a transcription factor (TF) for a target gene, but do not model cooperative or competitive interactions of several TFs for a target. An S-shape model and an optimization algorithm (GASA) were developed to infer genetic interactions/transcriptional regulation of several genes simultaneously using MGED. GASA consists of a genetic algorithm (GA) and a simulated annealing (SA) algorithm, which is enhanced by a steepest gradient descent algorithm to avoid being trapped in a local minimum. Using simulated data with various degrees of noise, we studied how GASA with two model selection criteria and two search spaces performed. Furthermore, GASA was shown to outperform network component analysis, the time series network inference algorithm (TSNI), GA with regular GA (GAGA) and GA with regular SA. Two applications are demonstrated. First, GASA is applied to infer a subnetwork of human T-cell apoptosis. Several of the predicted interactions are supported by the literature. Second, GASA was applied to infer the transcriptional factors of 34 cell cycle regulated targets in S. cerevisiae, and GASA performed better than one of the latest advances in nonlinear modeling, GAGA and TSNI. Moreover, GASA is able to predict multiple transcription factors for certain targets, and these results coincide with experimentally confirmed data in YEASTRACT. GASA is shown to infer both genetic interactions and transcriptional regulatory interactions well. In particular, GASA seems able to characterize the nonlinear mechanism of transcriptional regulatory interactions (TIs) in yeast, and may be applied to infer TIs in other organisms. The predicted genetic interactions of a subnetwork of human T-cell apoptosis coincide with existing partial pathways, suggesting the potential of GASA for inferring biochemical pathways.

  17. Experimental Performance of a Genetic Algorithm for Airborne Strategic Conflict Resolution

    NASA Technical Reports Server (NTRS)

    Karr, David A.; Vivona, Robert A.; Roscoe, David A.; DePascale, Stephen M.; Consiglio, Maria

    2009-01-01

    The Autonomous Operations Planner, a research prototype flight-deck decision support tool to enable airborne self-separation, uses a pattern-based genetic algorithm to resolve predicted conflicts between the ownship and traffic aircraft. Conflicts are resolved by modifying the active route within the ownship's flight management system according to a predefined set of maneuver pattern templates. The performance of this pattern-based genetic algorithm was evaluated in the context of batch-mode Monte Carlo simulations running over 3600 flight hours of autonomous aircraft in en-route airspace under conditions ranging from typical current traffic densities to several times that level. Encountering over 8900 conflicts during two simulation experiments, the genetic algorithm was able to resolve all but three conflicts, while maintaining a required time of arrival constraint for most aircraft. Actual elapsed running time for the algorithm was consistent with conflict resolution in real time. The paper presents details of the genetic algorithm's design, along with mathematical models of the algorithm's performance and observations regarding the effectiveness of using complementary maneuver patterns when multiple resolutions by the same aircraft were required.

  18. Experimental Performance of a Genetic Algorithm for Airborne Strategic Conflict Resolution

    NASA Technical Reports Server (NTRS)

    Karr, David A.; Vivona, Robert A.; Roscoe, David A.; DePascale, Stephen M.; Consiglio, Maria

    2009-01-01

    The Autonomous Operations Planner, a research prototype flight-deck decision support tool to enable airborne self-separation, uses a pattern-based genetic algorithm to resolve predicted conflicts between the ownship and traffic aircraft. Conflicts are resolved by modifying the active route within the ownship's flight management system according to a predefined set of maneuver pattern templates. The performance of this pattern-based genetic algorithm was evaluated in the context of batch-mode Monte Carlo simulations running over 3600 flight hours of autonomous aircraft in en-route airspace under conditions ranging from typical current traffic densities to several times that level. Encountering over 8900 conflicts during two simulation experiments, the genetic algorithm was able to resolve all but three conflicts, while maintaining a required time of arrival constraint for most aircraft. Actual elapsed running time for the algorithm was consistent with conflict resolution in real time. The paper presents details of the genetic algorithm's design, along with mathematical models of the algorithm's performance and observations regarding the effectiveness of using complementary maneuver patterns when multiple resolutions by the same aircraft were required.

  19. Novel applications of multitask learning and multiple output regression to multiple genetic trait prediction.

    PubMed

    He, Dan; Kuhn, David; Parida, Laxmi

    2016-06-15

    Given a set of biallelic molecular markers, such as SNPs, with genotype values encoded numerically on a collection of plant, animal or human samples, the goal of genetic trait prediction is to predict the quantitative trait values by simultaneously modeling all marker effects. Genetic trait prediction is usually represented as linear regression models. In many cases, for the same set of samples and markers, multiple traits are observed. Some of these traits might be correlated with each other. Therefore, modeling all the multiple traits together may improve the prediction accuracy. In this work, we view the multitrait prediction problem from a machine learning angle: as either a multitask learning problem or a multiple output regression problem, depending on whether different traits share the same genotype matrix or not. We then adapted multitask learning algorithms and multiple output regression algorithms to solve the multitrait prediction problem. We proposed a few strategies to improve the least square error of the prediction from these algorithms. Our experiments show that modeling multiple traits together could improve the prediction accuracy for correlated traits. The programs we used are either public or directly from the referred authors, such as MALSAR (http://www.public.asu.edu/~jye02/Software/MALSAR/) package. The Avocado data set has not been published yet and is available upon request. dhe@us.ibm.com. © The Author 2016. Published by Oxford University Press.

  20. Optimum location of external markers using feature selection algorithms for real‐time tumor tracking in external‐beam radiotherapy: a virtual phantom study

    PubMed Central

    Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-01

    In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position in clinical applications. The main challenge in this approach is tracking tumor motion with the highest accuracy, which depends heavily on the location of the external markers; this issue is the focus of this study. Four commercially available feature selection algorithms, namely 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief, were proposed to find the optimum location of external markers in combination with two search procedures, “Genetic” and “Ranker”. The performance of these algorithms was evaluated using the four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) used as the prediction model was taken as the metric for quantitatively evaluating the performance of the proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments, and the predefined tumor motion was predicted by ANFIS using the external motion data of the given markers at each small segment separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of the proposed feature selection algorithms was compared separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. A Duncan statistical test, followed by an F‐test, on the final results indicated that all proposed feature selection algorithms have the same performance accuracy for lung tumors. For liver tumors, however, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield the best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358

  1. A Test of Genetic Algorithms in Relevance Feedback.

    ERIC Educational Resources Information Center

    Lopez-Pujalte, Cristina; Guerrero Bote, Vicente P.; Moya Anegon, Felix de

    2002-01-01

    Discussion of information retrieval, query optimization techniques, and relevance feedback focuses on genetic algorithms, which are derived from artificial intelligence techniques. Describes an evaluation of different genetic algorithms using a residual collection method and compares results with the Ide dec-hi method (Salton and Buckley, 1990…

  2. Predicting mining activity with parallel genetic algorithms

    USGS Publications Warehouse

    Talaie, S.; Leigh, R.; Louis, S.J.; Raines, G.L.; Beyer, H.G.; O'Reilly, U.M.; Banzhaf, Arnold D.; Blum, W.; Bonabeau, C.; Cantu-Paz, E.W.

    2005-01-01

    We explore several different techniques in our quest to improve the overall model performance of a genetic-algorithm-calibrated probabilistic cellular automaton. We use the Kappa statistic to measure agreement between ground truth data and data predicted by the model. Within the genetic algorithm, we introduce a new evaluation function sensitive to spatial correctness and we explore the idea of evolving different rule parameters for different subregions of the land. We reduce the time required to run a simulation from 6 hours to 10 minutes by parallelizing the code and employing a 10-node cluster. Our empirical results suggest that using the spatially sensitive evaluation function does indeed improve the performance of the model and our preliminary results also show that evolving different rule parameters for different regions tends to improve overall model performance. Copyright 2005 ACM.
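
    For reference, a short sketch of how a Kappa statistic can be computed between ground-truth and predicted cell labels. This is the standard Cohen's kappa; whether the study used exactly this variant is an assumption, and the function name is illustrative.

```python
from collections import Counter


def cohen_kappa(truth, predicted):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    if len(truth) != len(predicted) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(truth)
    # Observed agreement: fraction of cells with identical labels.
    observed = sum(t == p for t, p in zip(truth, predicted)) / n
    # Chance agreement from the marginal label frequencies.
    truth_counts = Counter(truth)
    pred_counts = Counter(predicted)
    expected = sum(truth_counts[k] * pred_counts[k] for k in truth_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelings use a single identical class
    return (observed - expected) / (1.0 - expected)
```

    A value of 1 means perfect agreement and 0 means agreement no better than chance, which is why Kappa is a more informative fitness signal than raw accuracy when one land class dominates the map.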

  3. Comparison of algorithms for the detection of cancer-drivers at sub-gene resolution

    PubMed Central

    Porta-Pardo, Eduard; Kamburov, Atanas; Tamborero, David; Pons, Tirso; Grases, Daniela; Valencia, Alfonso; Lopez-Bigas, Nuria; Getz, Gad; Godzik, Adam

    2018-01-01

    Understanding genetic events that lead to cancer initiation and progression remains one of the biggest challenges in cancer biology. Traditionally most algorithms for cancer driver identification look for genes that have more mutations than expected from the average background mutation rate. However, there is now a wide variety of methods that look for non-random distribution of mutations within proteins as a signal they have a driving role in cancer. Here we classify and review the progress of such sub-gene resolution algorithms, compare their findings on four distinct cancer datasets from The Cancer Genome Atlas and discuss how predictions from these algorithms can be interpreted in the emerging paradigms that challenge the simple dichotomy between driver and passenger genes. PMID:28714987

  4. A Novel Admixture-Based Pharmacogenetic Approach to Refine Warfarin Dosing in Caribbean Hispanics

    PubMed Central

    Claudio-Campos, Karla; Rivera-Miranda, Giselle; Bermúdez-Bosch, Luis; Renta, Jessicca Y.; Cadilla, Carmen L.; Cruz, Iadelisse; Feliu, Juan F.; Vergara, Cunegundo; Ruaño, Gualberto

    2016-01-01

    Aim: This study is aimed at developing a novel admixture-adjusted pharmacogenomic approach to individually refine warfarin dosing in Caribbean Hispanic patients. Patients & Methods: A multiple linear regression analysis of effective warfarin doses versus relevant genotypes, admixture, clinical and demographic factors was performed in 255 patients and further validated externally in another cohort of 55 individuals. Results: The admixture-adjusted, genotype-guided warfarin dosing refinement algorithm developed in Caribbean Hispanics showed better predictability (R2 = 0.70, MAE = 0.72 mg/day) than a clinical algorithm that excluded genotypes and admixture (R2 = 0.60, MAE = 0.99 mg/day), and outperformed two prior pharmacogenetic algorithms in predicting effective dose in this population. For patients at the highest risk of adverse events, 45.5% of the dose predictions using the developed pharmacogenetic model resulted in ideal dose as compared with only 29% when using the clinical non-genetic algorithm (p<0.001). The admixture-driven pharmacogenetic algorithm predicted 58% of warfarin dose variance when externally validated in 55 individuals from an independent validation cohort (MAE = 0.89 mg/day, 24% mean bias). Conclusions: Results supported our rationale to incorporate individual’s genotypes and unique admixture metrics into pharmacogenetic refinement models in order to increase predictability when expanding them to admixed populations like Caribbean Hispanics. Trial Registration: ClinicalTrials.gov NCT01318057. PMID:26745506

  5. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

    PubMed Central

    Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

    2017-01-01

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571

  6. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

    PubMed

    Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

    2017-02-14

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.

  7. Comparison of genetic algorithm methods for fuel management optimization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeChaine, M.D.; Feltus, M.A.

    1995-12-31

    The CIGARO system was developed for genetic algorithm fuel management optimization. Tests were performed to find the best probability for the fuel location swap mutation operator and to compare the genetic algorithm with a truly random search method. The tests showed that the fuel swap probability should be between 0% and 10%, and that a probability of 50% definitely hampered the optimization. The genetic algorithm performed significantly better than the random search method, which did not even satisfy the peak normalized power constraint.
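    A swap mutation operator of the kind tested here can be sketched as follows. This is a minimal sketch, not the CIGARO implementation: the function name, the assembly labels, and the rule of swapping with a uniformly chosen position are all assumptions.

```python
import random


def swap_mutation(layout, swap_probability, rng):
    """Return a copy of `layout` in which each position is, with
    probability `swap_probability`, exchanged with a random position."""
    mutated = list(layout)
    n = len(mutated)
    for i in range(n):
        if rng.random() < swap_probability:
            j = rng.randrange(n)  # partner position (may equal i)
            mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


rng = random.Random(0)
core = ["A1", "A2", "B1", "B2", "C1", "C2"]  # hypothetical fuel assembly labels
print(swap_mutation(core, 0.10, rng))
```

    Because the operator only exchanges positions, every mutated layout contains the same set of assemblies as its parent. A high swap probability scrambles most of the layout in each generation, which is consistent with the reported finding that 50% hampers the search.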

  8. Prediction of composite fatigue life under variable amplitude loading using artificial neural network trained by genetic algorithm

    NASA Astrophysics Data System (ADS)

    Rohman, Muhamad Nur; Hidayat, Mas Irfan P.; Purniawan, Agung

    2018-04-01

    Neural networks (NN) have been widely used for fatigue life prediction. For polymeric-base composites, an NN model must be developed that works with limited fatigue data and can predict fatigue life under varying stress amplitudes at different stress ratios. In the present paper, a multilayer perceptron (MLP) neural network model is developed, and a genetic algorithm is employed to optimize the weights of the NN for fatigue life prediction of polymeric-base composite materials under variable amplitude loading. Simulation results obtained with two different composite systems, E-glass fabrics/epoxy (layups [(±45)/(0)2]S) and E-glass/polyester (layups [90/0/±45/0]S), show that NN models trained with fatigue data from two stress ratios, which represent limited fatigue data, can be used to predict another four and seven stress ratios, respectively, with high accuracy. The accuracy of the NN predictions was quantified by the small value of the mean square error (MSE). When 33% of the total fatigue data was used for training, the NN model was able to produce high accuracy for all stress ratios. When less fatigue data was used for training (22% of the total fatigue data), the NN model was still able to produce a high coefficient of determination between the predicted results and those obtained by experiment.

  9. Protein Structure Prediction with Evolutionary Algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hart, W.E.; Krasnogor, N.; Pelta, D.A.

    1999-02-08

    Evolutionary algorithms have been successfully applied to a variety of molecular structure prediction problems. In this paper we reconsider the design of genetic algorithms that have been applied to a simple protein structure prediction problem. Our analysis considers the impact of several algorithmic factors for this problem: the conformational representation, the energy formulation, and the way in which infeasible conformations are penalized. Further, we empirically evaluated the impact of these factors on a small set of polymer sequences. Our analysis leads to specific recommendations for both GAs and other heuristic methods for solving PSP on the HP model.

  10. Personalized Risk Prediction in Clinical Oncology Research: Applications and Practical Issues Using Survival Trees and Random Forests.

    PubMed

    Hu, Chen; Steingrimsson, Jon Arni

    2018-01-01

    A crucial component of making individualized treatment decisions is to accurately predict each patient's disease risk. In clinical oncology, disease risks are often measured through time-to-event data, such as overall survival and progression/recurrence-free survival, and are often subject to censoring. Risk prediction models based on recursive partitioning methods are becoming increasingly popular largely due to their ability to handle nonlinear relationships, higher-order interactions, and/or high-dimensional covariates. The most popular recursive partitioning methods are versions of the Classification and Regression Tree (CART) algorithm, which builds a simple interpretable tree structured model. With the aim of increasing prediction accuracy, the random forest algorithm averages multiple CART trees, creating a flexible risk prediction model. Risk prediction models used in clinical oncology commonly use both traditional demographic and tumor pathological factors as well as high-dimensional genetic markers and treatment parameters from multimodality treatments. In this article, we describe the most commonly used extensions of the CART and random forest algorithms to right-censored outcomes. We focus on how they differ from the methods for noncensored outcomes, and how the different splitting rules and methods for cost-complexity pruning impact these algorithms. We demonstrate these algorithms by analyzing a randomized Phase III clinical trial of breast cancer. We also conduct Monte Carlo simulations to compare the prediction accuracy of survival forests with more commonly used regression models under various scenarios. These simulation studies aim to evaluate how sensitive the prediction accuracy is to the underlying model specifications, the choice of tuning parameters, and the degrees of missing covariates.

  11. Predicting Student Grades in Learning Management Systems with Multiple Instance Genetic Programming

    ERIC Educational Resources Information Center

    Zafra, Amelia; Ventura, Sebastian

    2009-01-01

    The ability to predict a student's performance could be useful in a great number of different ways associated with university-level learning. In this paper, a grammar guided genetic programming algorithm, G3P-MI, has been applied to predict if the student will fail or pass a certain course and identifies activities to promote learning in a…

  12. Comparative Study on Prediction Effects of Short Fatigue Crack Propagation Rate by Two Different Calculation Methods

    NASA Astrophysics Data System (ADS)

    Yang, Bing; Liao, Zhen; Qin, Yahang; Wu, Yayun; Liang, Sai; Xiao, Shoune; Yang, Guangwu; Zhu, Tao

    2017-05-01

    To describe the complicated nonlinear process of the fatigue short crack evolution behavior, especially the change of the crack propagation rate, two different calculation methods are applied. The dominant effective short fatigue crack propagation rates are calculated based on the replica fatigue short crack test with nine smooth funnel-shaped specimens and the observation of the replica films according to the effective short fatigue cracks principle. Due to the fast decay and the nonlinear approximation ability of wavelet analysis, the self-learning ability of neural network, and the macroscopic searching and global optimization of genetic algorithm, the genetic wavelet neural network can reflect the implicit complex nonlinear relationship when considering multi-influencing factors synthetically. The effective short fatigue cracks and the dominant effective short fatigue crack are simulated and compared by the Genetic Wavelet Neural Network. The simulation results show that Genetic Wavelet Neural Network is a rational and available method for studying the evolution behavior of fatigue short crack propagation rate. Meanwhile, a traditional data fitting method for a short crack growth model is also utilized for fitting the test data. It is reasonable and applicable for predicting the growth rate. Finally, the reason for the difference between the prediction effects by these two methods is interpreted.

  13. Social Media: Menagerie of Metrics

    DTIC Science & Technology

    2010-01-27

    intelligence, an evolutionary algorithm (EA) is a subset of evolutionary computation, a generic population-based metaheuristic optimization algorithm . An EA...Cloning - 22 Animals were cloned to date; genetic algorithms can help prediction (e.g. “elitism” - attempts to ensure selection by including performers...28, 2010 Evolutionary Algorithm • Evolutionary algorithm From Wikipedia, the free encyclopedia Artificial intelligence portal In artificial

  14. Weather prediction using a genetic memory

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1990-01-01

    Kanerva's sparse distributed memory (SDM) is an associative memory model based on the mathematical properties of high dimensional binary address spaces. Holland's genetic algorithms are a search technique for high dimensional spaces inspired by evolutionary processes of DNA. Genetic Memory is a hybrid of the above two systems, in which the memory uses a genetic algorithm to dynamically reconfigure its physical storage locations to reflect correlations between the stored addresses and data. This architecture is designed to maximize the ability of the system to scale up to handle real world problems.

  15. Prediction of Aerodynamic Coefficients for Wind Tunnel Data using a Genetic Algorithm Optimized Neural Network

    NASA Technical Reports Server (NTRS)

    Rajkumar, T.; Aragon, Cecilia; Bardina, Jorge; Britten, Roy

    2002-01-01

    A fast, reliable way of predicting aerodynamic coefficients is produced using a neural network optimized by a genetic algorithm. Basic aerodynamic coefficients (e.g. lift, drag, pitching moment) are modelled as functions of angle of attack and Mach number. The neural network is first trained on a relatively rich set of data from wind tunnel tests or numerical simulations to learn an overall model. Most of the aerodynamic parameters can be well-fitted using polynomial functions. A new set of data, which can be relatively sparse, is then supplied to the network to produce a new model consistent with the previous model and the new data. Because the new model interpolates realistically between the sparse test data points, it is suitable for use in piloted simulations. The genetic algorithm is used to choose a neural network architecture that gives the best results, avoiding over- and under-fitting of the test data.

  16. Combining the genetic algorithm and successive projection algorithm for the selection of feature wavelengths to evaluate exudative characteristics in frozen-thawed fish muscle.

    PubMed

    Cheng, Jun-Hu; Sun, Da-Wen; Pu, Hongbin

    2016-04-15

    The potential use of feature wavelengths for predicting drip loss in grass carp fish, as affected by being frozen at -20°C for 24 h and thawed at 4°C for 1, 2, 4, and 6 days, was investigated. Hyperspectral images of frozen-thawed fish were obtained and their corresponding spectra were extracted. Least-squares support vector machine and multiple linear regression (MLR) models were established using five key wavelengths, selected by combining a genetic algorithm and successive projections algorithm, and this showed satisfactory performance in drip loss prediction. The MLR model with a determination coefficient of prediction (R(2)P) of 0.9258, and lower root mean square error estimated by a prediction (RMSEP) of 1.12%, was applied to transfer each pixel of the image and generate the distribution maps of exudation changes. The results confirmed that it is feasible to identify the feature wavelengths using variable selection methods and chemometric analysis for developing on-line multispectral imaging. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. Evolving hard problems: Generating human genetics datasets with a complex etiology.

    PubMed

    Himmelstein, Daniel S; Greene, Casey S; Moore, Jason H

    2011-07-07

    A goal of human genetics is to discover genetic factors that influence individuals' susceptibility to common diseases. Most common diseases are thought to result from the joint failure of two or more interacting components instead of single component failures. This greatly complicates both the task of selecting informative genetic variants and the task of modeling interactions between them. We and others have previously developed algorithms to detect and model the relationships between these genetic factors and disease. Previously these methods have been evaluated with datasets simulated according to pre-defined genetic models. Here we develop and evaluate a model free evolution strategy to generate datasets which display a complex relationship between individual genotype and disease susceptibility. We show that this model free approach is capable of generating a diverse array of datasets with distinct gene-disease relationships for an arbitrary interaction order and sample size. We specifically generate eight-hundred Pareto fronts; one for each independent run of our algorithm. In each run the predictiveness of single genetic variation and pairs of genetic variants have been minimized, while the predictiveness of third, fourth, or fifth-order combinations is maximized. Two hundred runs of the algorithm are further dedicated to creating datasets with predictive four or five order interactions and minimized lower-level effects. This method and the resulting datasets will allow the capabilities of novel methods to be tested without pre-specified genetic models. This allows researchers to evaluate which methods will succeed on human genetics problems where the model is not known in advance. We further make freely available to the community the entire Pareto-optimal front of datasets from each run so that novel methods may be rigorously evaluated. These 76,600 datasets are available from http://discovery.dartmouth.edu/model_free_data/.

  18. An Agent Inspired Reconfigurable Computing Implementation of a Genetic Algorithm

    NASA Technical Reports Server (NTRS)

    Weir, John M.; Wells, B. Earl

    2003-01-01

    Many software systems have been successfully implemented using an agent paradigm which employs a number of independent entities that communicate with one another to achieve a common goal. The distributed nature of such a paradigm makes it an excellent candidate for use in high speed reconfigurable computing hardware environments such as those present in modern FPGAs. In this paper, a distributed genetic algorithm that can be applied to the agent based reconfigurable hardware model is introduced. The effectiveness of this new algorithm is evaluated by comparing the quality of the solutions found by the new algorithm with those found by traditional genetic algorithms. The performance of a reconfigurable hardware implementation of the new algorithm on an FPGA is compared to traditional single processor implementations.

  19. Hyperspectral Imaging for Predicting the Internal Quality of Kiwifruits Based on Variable Selection Algorithms and Chemometric Models.

    PubMed

    Zhu, Hongyan; Chu, Bingquan; Fan, Yangyang; Tao, Xiaoya; Yin, Wenxin; He, Yong

    2017-08-10

    We investigated the feasibility and potentiality of determining firmness, soluble solids content (SSC), and pH in kiwifruits using hyperspectral imaging, combined with variable selection methods and calibration models. The images were acquired by a push-broom hyperspectral reflectance imaging system covering two spectral ranges. Weighted regression coefficients (BW), successive projections algorithm (SPA) and genetic algorithm-partial least square (GAPLS) were compared and evaluated for the selection of effective wavelengths. Moreover, multiple linear regression (MLR), partial least squares regression and least squares support vector machine (LS-SVM) were developed to predict quality attributes quantitatively using effective wavelengths. The established models, particularly SPA-MLR, SPA-LS-SVM and GAPLS-LS-SVM, performed well. The SPA-MLR models for firmness (R pre  = 0.9812, RPD = 5.17) and SSC (R pre  = 0.9523, RPD = 3.26) at 380-1023 nm showed excellent performance, whereas GAPLS-LS-SVM was the optimal model at 874-1734 nm for predicting pH (R pre  = 0.9070, RPD = 2.60). Image processing algorithms were developed to transfer the predictive model in every pixel to generate prediction maps that visualize the spatial distribution of firmness and SSC. Hence, the results clearly demonstrated that hyperspectral imaging has the potential as a fast and non-invasive method to predict the quality attributes of kiwifruits.

  20. Can we do better than the grid survey: Optimal synoptic surveys in presence of variable uncertainty and decorrelation scales

    NASA Astrophysics Data System (ADS)

    Frolov, Sergey; Garau, Bartolame; Bellingham, James

    2014-08-01

    Regular grid ("lawnmower") survey is a classical strategy for synoptic sampling of the ocean. Is it possible to achieve a more effective use of available resources if one takes into account a priori knowledge about variability in magnitudes of uncertainty and decorrelation scales? In this article, we develop and compare the performance of several path-planning algorithms: optimized "lawnmower," a graph-search algorithm (A*), and a fully nonlinear genetic algorithm. We use the machinery of the best linear unbiased estimator (BLUE) to quantify the ability of a vehicle fleet to synoptically map distribution of phytoplankton off the central California coast. We used satellite and in situ data to specify covariance information required by the BLUE estimator. Computational experiments showed that two types of sampling strategies are possible: a suboptimal space-filling design (produced by the "lawnmower" and the A* algorithms) and an optimal uncertainty-aware design (produced by the genetic algorithm). Unlike the space-filling designs that attempted to cover the entire survey area, the optimal design focused on revisiting areas of high uncertainty. Results of the multivehicle experiments showed that fleet performance predictors, such as cumulative speed or the weight of the fleet, predicted the performance of a homogeneous fleet well; however, these were poor predictors for comparing the performance of different platforms.

  1. Load balancing prediction method of cloud storage based on analytic hierarchy process and hybrid hierarchical genetic algorithm.

    PubMed

    Zhou, Xiuze; Lin, Fan; Yang, Lvqing; Nie, Jing; Tan, Qian; Zeng, Wenhua; Zhang, Nian

    2016-01-01

    With the continuous expansion of the cloud computing platform scale and rapid growth of users and applications, how to efficiently use system resources to improve the overall performance of cloud computing has become a crucial issue. To address this issue, this paper proposes a method that uses an analytic hierarchy process group decision (AHPGD) to evaluate the load state of server nodes. Training was carried out by using a hybrid hierarchical genetic algorithm (HHGA) to optimize a radial basis function neural network (RBFNN). The AHPGD produces an aggregate indicator for the virtual machines in the cloud, which becomes the input parameter of the predictive RBFNN. This paper also proposes a new dynamic load balancing scheduling algorithm combined with a weighted round-robin algorithm, which uses the predicted periodic load value of nodes based on AHPGD and the RBFNN optimized by HHGA, then calculates the corresponding weight values of the nodes and updates them continually. In this way, it keeps the advantages and avoids the shortcomings of the static weighted round-robin algorithm.
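    The idea of deriving round-robin weights from predicted load can be sketched as below. This is an illustrative sketch only: the linear load-to-weight mapping, the `scale` parameter, and the node names are assumptions, and the predicted loads are hard-coded rather than produced by an RBFNN.

```python
from collections import Counter
from itertools import islice


def weights_from_predicted_load(predicted_load, scale=10):
    """Map a predicted load in [0, 1] per node to an integer weight;
    lightly loaded nodes receive larger weights (assumed linear rule)."""
    return {
        node: max(1, round((1.0 - load) * scale))
        for node, load in predicted_load.items()
    }


def weighted_round_robin(weights):
    """Yield nodes in a repeating cycle in which each node appears
    `weight` times per cycle (simple, non-interleaved variant)."""
    while True:
        for node, weight in weights.items():
            for _ in range(weight):
                yield node


# Hypothetical predicted loads for the next scheduling period.
predicted = {"node-1": 0.2, "node-2": 0.5, "node-3": 0.9}
weights = weights_from_predicted_load(predicted)
one_cycle = list(islice(weighted_round_robin(weights), sum(weights.values())))
print(weights)
print(Counter(one_cycle))
```

    Recomputing `weights` at the start of each prediction period is what makes the scheme dynamic; with fixed weights it reduces to static weighted round-robin.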

  2. Multi-objective optimization to predict muscle tensions in a pinch function using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Bensghaier, Amani; Romdhane, Lotfi; Benouezdou, Fethi

    2012-03-01

    This work is focused on the determination of the thumb and the index finger muscle tensions in a tip pinch task. A biomechanical model of the musculoskeletal system of the thumb and the index finger is developed. Due to the assumptions made in carrying out the biomechanical model, the formulated force analysis problem is indeterminate leading to an infinite number of solutions. Thus, constrained single and multi-objective optimization methodologies are used in order to explore the muscular redundancy and to predict optimal muscle tension distributions. Various models are investigated using the optimization process. The basic criteria to minimize are the sum of the muscle stresses, the sum of individual muscle tensions and the maximum muscle stress. The multi-objective optimization is solved using a Pareto genetic algorithm to obtain non-dominated solutions, defined as the set of optimal distributions of muscle tensions. The results show the advantage of the multi-objective formulation over the single objective one. The obtained solutions are compared to those available in the literature demonstrating the effectiveness of our approach in the analysis of the fingers musculoskeletal systems when predicting muscle tensions.
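    The notion of non-dominated solutions used in the multi-objective formulation can be illustrated with a short filter. This is a sketch under assumptions: both objectives are minimized, and the candidate values are invented rather than taken from the study.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (sum of muscle stresses, maximum muscle stress).
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = non_dominated(candidates)
print(front)
```

    Here (3.0, 4.0) and (2.5, 3.5) are both dominated by (2.0, 3.0), so the remaining three points form the Pareto set from which a trade-off is chosen.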

  3. Estimating the solute transport parameters of the spatial fractional advection-dispersion equation using Bees Algorithm

    NASA Astrophysics Data System (ADS)

    Mehdinejadiani, Behrouz

    2017-08-01

    This study represents the first attempt to estimate the solute transport parameters of the spatial fractional advection-dispersion equation using Bees Algorithm. The numerical studies as well as the experimental studies were performed to certify the integrity of Bees Algorithm. The experimental ones were conducted in a sandbox for homogeneous and heterogeneous soils. A detailed comparative study was carried out between the results obtained from Bees Algorithm and those from Genetic Algorithm and LSQNONLIN routines in FracFit toolbox. The results indicated that, in general, the Bees Algorithm estimated the sFADE parameters much more accurately than Genetic Algorithm and LSQNONLIN, especially in the heterogeneous soil and for α values near 1 in the numerical study. Also, the results obtained from Bees Algorithm were more reliable than those from Genetic Algorithm. The Bees Algorithm showed relatively similar performance for all cases, while the Genetic Algorithm and the LSQNONLIN yielded different performances for various cases. The performance of LSQNONLIN strongly depends on the initial guess values so that, compared to the Genetic Algorithm, it can more accurately estimate the sFADE parameters when suitable initial guess values are supplied. To sum up, the Bees Algorithm was found to be a very simple, robust and accurate approach to estimate the transport parameters of the spatial fractional advection-dispersion equation.

  4. Estimating the solute transport parameters of the spatial fractional advection-dispersion equation using Bees Algorithm.

    PubMed

    Mehdinejadiani, Behrouz

    2017-08-01

    This study represents the first attempt to estimate the solute transport parameters of the spatial fractional advection-dispersion equation using Bees Algorithm. The numerical studies as well as the experimental studies were performed to certify the integrity of Bees Algorithm. The experimental ones were conducted in a sandbox for homogeneous and heterogeneous soils. A detailed comparative study was carried out between the results obtained from Bees Algorithm and those from Genetic Algorithm and LSQNONLIN routines in FracFit toolbox. The results indicated that, in general, the Bees Algorithm estimated the sFADE parameters much more accurately than Genetic Algorithm and LSQNONLIN, especially in the heterogeneous soil and for α values near 1 in the numerical study. Also, the results obtained from Bees Algorithm were more reliable than those from Genetic Algorithm. The Bees Algorithm showed relatively similar performance for all cases, while the Genetic Algorithm and the LSQNONLIN yielded different performances for various cases. The performance of LSQNONLIN strongly depends on the initial guess values so that, compared to the Genetic Algorithm, it can more accurately estimate the sFADE parameters when suitable initial guess values are supplied. To sum up, the Bees Algorithm was found to be a very simple, robust and accurate approach to estimate the transport parameters of the spatial fractional advection-dispersion equation. Copyright © 2017 Elsevier B.V. All rights reserved.

  5. Genetics-based control of a MIMO boiler-turbine plant

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dimeo, R.M.; Lee, K.Y.

    1994-12-31

    A genetic algorithm is used to develop an optimal controller for a non-linear, multi-input/multi-output boiler-turbine plant. The algorithm is used to train a control system for the plant over a wide operating range in an effort to obtain better performance. The results of the genetic algorithm's controller are compared with those of a controller designed from the linearized plant model at a nominal operating point. Because the genetic algorithm is well-suited to solving traditionally difficult optimization problems, it is found that the algorithm is capable of developing the controller based on input/output information only. This controller achieves a performance comparable to the standard linear quadratic regulator.

  6. Comparative genetic responses to climate for the varieties of Pinus ponderosa and Pseudotsuga menziesii: realized climate niches

    Treesearch

    Gerald E. Rehfeldt; Barry C. Jaquish; Javier Lopez-Upton; Cuauhtemoc Saenz-Romero; J. Bradley St Clair; Laura P. Leites; Dennis G. Joyce

    2014-01-01

    The Random Forests classification algorithm was used to predict the occurrence of the realized climate niche for two sub-specific varieties of Pinus ponderosa and three varieties of Pseudotsuga menziesii from presence-absence data in forest inventory ground plots. Analyses were based on ca. 271,000 observations for P. ponderosa and ca. 426,000 observations for P....

  7. Application of a single-objective, hybrid genetic algorithm approach to pharmacokinetic model building.

    PubMed

    Sherer, Eric A; Sale, Mark E; Pollock, Bruce G; Belani, Chandra P; Egorin, Merrill J; Ivy, Percy S; Lieberman, Jeffrey A; Manuck, Stephen B; Marder, Stephen R; Muldoon, Matthew F; Scher, Howard I; Solit, David B; Bies, Robert R

    2012-08-01

    A limitation in traditional stepwise population pharmacokinetic model building is the difficulty in handling interactions between model components. To address this issue, a method was previously introduced which couples NONMEM parameter estimation and model fitness evaluation to a single-objective, hybrid genetic algorithm for global optimization of the model structure. In this study, the generalizability of this approach for pharmacokinetic model building is evaluated by comparing (1) correct and spurious covariate relationships in a simulated dataset resulting from automated stepwise covariate modeling, Lasso methods, and single-objective hybrid genetic algorithm approaches to covariate identification and (2) information criteria values, model structures, convergence, and model parameter values resulting from manual stepwise versus single-objective, hybrid genetic algorithm approaches to model building for seven compounds. Both manual stepwise and single-objective, hybrid genetic algorithm approaches to model building were applied, blinded to the results of the other approach, for selection of the compartment structure as well as inclusion and model form of inter-individual and inter-occasion variability, residual error, and covariates from a common set of model options. For the simulated dataset, stepwise covariate modeling identified three of four true covariates and two spurious covariates; Lasso identified two of four true and 0 spurious covariates; and the single-objective, hybrid genetic algorithm identified three of four true covariates and one spurious covariate. 
    For the clinical datasets, the Akaike information criterion was a median of 22.3 points lower (range of 470.5 point decrease to 0.1 point decrease) for the best single-objective hybrid genetic-algorithm candidate model versus the final manual stepwise model: the Akaike information criterion was lower by greater than 10 points for four compounds and differed by less than 10 points for three compounds. The root mean squared error and absolute mean prediction error of the best single-objective hybrid genetic algorithm candidates were a median of 0.2 points higher (range of 38.9 point decrease to 27.3 point increase) and 0.02 points lower (range of 0.98 point decrease to 0.74 point increase), respectively, than that of the final stepwise models. In addition, the best single-objective, hybrid genetic algorithm candidate models had successful convergence and covariance steps for each compound, used the same compartment structure as the manual stepwise approach for 6 of 7 (86 %) compounds, and identified 54 % (7 of 13) of covariates included by the manual stepwise approach and 16 covariate relationships not included by manual stepwise models. The model parameter values between the final manual stepwise and best single-objective, hybrid genetic algorithm models differed by a median of 26.7 % (q₁ = 4.9 % and q₃ = 57.1 %). Finally, the single-objective, hybrid genetic algorithm approach was able to identify models capable of estimating absorption rate parameters for four compounds that the manual stepwise approach did not identify. The single-objective, hybrid genetic algorithm represents a general pharmacokinetic model building methodology whose ability to rapidly search the feasible solution space leads to nearly equivalent or superior model fits to pharmacokinetic data.

  8. Application of a hybrid model of neural networks and genetic algorithms to evaluate landslide susceptibility

    NASA Astrophysics Data System (ADS)

    Wang, H. B.; Li, J. W.; Zhou, B.; Yuan, Z. Q.; Chen, Y. P.

    2013-03-01

    In the last few decades, the development of Geographical Information Systems (GIS) technology has provided a method for the evaluation of landslide susceptibility and hazard. Slope units were found to be appropriate for the fundamental morphological elements in landslide susceptibility evaluation. Following the DEM construction in a loess area susceptible to landslides, the direct-reverse DEM technology was employed to generate 216 slope units in the studied area. After a detailed investigation, the landslide inventory was mapped in which 39 landslides, including paleo-landslides, old landslides and recent landslides, were present. Of the 216 slope units, 123 involved landslides. To analyze the mechanism of these landslides, six environmental factors were selected to evaluate landslide occurrence: slope angle, aspect, the height and shape of the slope, distance to river and human activities. These factors were extracted in terms of the slope unit within the ArcGIS software. The spatial analysis demonstrates that most of the landslides are located on convex slopes at an elevation of 100-150 m with slope angles from 135°-225° and 40°-60°. Landslide occurrence was then checked according to these environmental factors using an artificial neural network with back propagation, optimized by genetic algorithms. A dataset of 120 slope units was chosen for training the neural network model, i.e., 80 units with landslide presence and 40 units without landslide presence. The parameters of genetic algorithms and neural networks were then set: population size of 100, crossover probability of 0.65, mutation probability of 0.01, momentum factor of 0.60, learning rate of 0.7, max learning number of 10 000, and target error of 0.000001. After training on the datasets, the susceptibility of landslides was mapped for the land-use plan and hazard mitigation. 
    Comparing the susceptibility map with landslide inventory, it was noted that the prediction accuracy of landslide occurrence is 93.02%, whereas units without landslide occurrence are predicted with an accuracy of 81.13%. To sum up, the verification shows satisfactory agreement with an accuracy of 86.46% between the susceptibility map and the landslide locations. In the landslide susceptibility assessment, ten new slopes were predicted to show potential for failure, which can be confirmed by the engineering geological conditions of these slopes. It was also observed that some disadvantages could be overcome in the application of the neural networks with back propagation, for example, the low convergence rate and local minimum, after the network was optimized using genetic algorithms. To conclude, neural networks with back propagation that are optimized by genetic algorithms are an effective method to predict landslide susceptibility with high accuracy.

  9. Automatic Data Filter Customization Using a Genetic Algorithm

    NASA Technical Reports Server (NTRS)

    Mandrake, Lukas

    2013-01-01

    This work predicts whether a retrieval algorithm will usefully determine CO2 concentration from an input spectrum of GOSAT (Greenhouse Gases Observing Satellite). This was done to eliminate needless runtime on atmospheric soundings that would never yield useful results. A space of 50 dimensions was examined for predictive power on the final CO2 results. Retrieval algorithms are frequently expensive to run, and wasted effort defeats requirements and expends needless resources. This algorithm could be used to help predict and filter unneeded runs in any computationally expensive regime. Traditional methods such as the Fisher discriminant analysis and decision trees can attempt to predict whether a sounding will be properly processed. However, this work sought to detect a subsection of the dimensional space that can be simply filtered out to eliminate unwanted runs. LDAs (linear discriminant analyses) and other systems examine the entire data and judge a "best fit," giving equal weight to complex and problematic regions as well as simple, clear-cut regions. In this implementation, a genetic space of "left" and "right" thresholds outside of which all data are rejected was defined. These left/right pairs are created for each of the 50 input dimensions. A genetic algorithm then runs through countless potential filter settings using a JPL computer cluster, optimizing the tossed-out data's yield (proper vs. improper run removal) and number of points tossed. This solution is robust to an arbitrary decision boundary within the data and avoids the global optimization problem of whole-dataset fitting using LDA or decision trees. It filters out runs that would not have produced useful CO2 values to save needless computation. This would be an algorithmic preprocessing improvement to any computationally expensive system.
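
    The left/right threshold idea can be illustrated with a minimal sketch. Everything below is hypothetical: the function names, the two-dimensional toy data, and the scoring rule (count of correctly rejected runs minus a penalty for wrongly rejected ones) are assumptions for illustration, not the fitness function used in the report. A genetic algorithm would search over the `thresholds` list to maximize such a score.

```python
def passes_filter(sounding, thresholds):
    """True if every feature lies inside its [left, right] window."""
    return all(left <= value <= right
               for value, (left, right) in zip(sounding, thresholds))


def score_filter(soundings, useful, thresholds, wrong_removal_penalty=2.0):
    """Hypothetical fitness: reward removing useless runs,
    penalize removing runs that would have been useful."""
    bad_removed = 0
    good_removed = 0
    for sounding, is_useful in zip(soundings, useful):
        if not passes_filter(sounding, thresholds):
            if is_useful:
                good_removed += 1
            else:
                bad_removed += 1
    return bad_removed - wrong_removal_penalty * good_removed


# Toy data with two features per sounding (the report uses 50 dimensions).
soundings = [(0.2, 5.0), (0.9, 5.5), (0.4, 9.0), (0.5, 4.8)]
useful = [True, False, False, True]

loose = [(0.0, 0.6), (4.0, 6.0)]   # rejects only the two useless runs
tight = [(0.0, 0.3), (4.0, 6.0)]   # also rejects one useful run

loose_score = score_filter(soundings, useful, loose)
tight_score = score_filter(soundings, useful, tight)
print(loose_score, tight_score)
```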

  10. A hybrid clustering and classification approach for predicting crash injury severity on rural roads.

    PubMed

    Hasheminejad, Seyed Hessam-Allah; Zahedi, Mohsen; Hasheminejad, Seyed Mohammad Hossein

    2018-03-01

    As a threat to transportation systems, traffic crashes have a wide range of social consequences for governments. Traffic crashes are increasing in developing countries, and Iran, as a developing country, is not immune from this risk. Several studies in the literature predict traffic crash severity based on artificial neural networks (ANNs), support vector machines and decision trees. This paper investigates crash injury severity on rural roads by using a hybrid clustering and classification approach to compare the performance of classification algorithms before and after applying the clustering. In this paper, a novel rule-based genetic algorithm (GA) is proposed to predict crash injury severity, which is evaluated by performance criteria in comparison with classification algorithms like ANN. The results obtained from analysis of 13,673 crashes (5600 property damage, 778 fatal crashes, 4690 slight injuries and 2605 severe injuries) on rural roads in Tehran Province of Iran during 2011-2013 revealed that the proposed GA method outperforms other classification algorithms based on classification metrics like precision (86%), recall (88%) and accuracy (87%). Moreover, the proposed GA method has the highest level of interpretation, is easy to understand and provides feedback to analysts.

  11. Data-Driven Property Estimation for Protective Clothing

    DTIC Science & Technology

    2014-09-01

    reliable predictions falls under the rubric “machine learning”. Inspired by the applications of machine learning in pharmaceutical drug design and...using genetic algorithms, for instance— descriptor selection can be automated as well. A well-known structured learning technique—Artificial Neural...descriptors automatically, by iteration, e.g., using a genetic algorithm [49]. 4.2.4 Avoiding Overfitting A peril of all regression—least squares as

  12. Comparisons of forecasting for hepatitis in Guangxi Province, China by using three neural networks models.

    PubMed

    Gan, Ruijing; Chen, Ni; Huang, Daizheng

    2016-01-01

    This study compares and evaluates the prediction of hepatitis in Guangxi Province, China by using back propagation neural networks based genetic algorithm (BPNN-GA), generalized regression neural networks (GRNN), and wavelet neural networks (WNN). In order to compare the results of forecasting, the data obtained from 2004 to 2013 and 2014 were used as modeling and forecasting samples, respectively. The results show that when the small data set of hepatitis has seasonal fluctuation, the prediction result by BPNN-GA will be better than the two other methods. The WNN method is suitable for predicting the large data set of hepatitis that has seasonal fluctuation and the same for the GRNN method when the data increases steadily.

  13. Optimising the production of succinate and lactate in Escherichia coli using a hybrid of artificial bee colony algorithm and minimisation of metabolic adjustment.

    PubMed

    Tang, Phooi Wah; Choon, Yee Wen; Mohamad, Mohd Saberi; Deris, Safaai; Napis, Suhaimi

    2015-03-01

    Metabolic engineering is a research field that focuses on the design of models for metabolism, and uses computational procedures to suggest genetic manipulation. It aims to improve the yield of particular chemical or biochemical products. Several traditional metabolic engineering methods are commonly used to increase the production of a desired target, but the products are always far below their theoretical maximums. Using numerical optimisation algorithms to identify gene knockouts may stall at a local minimum in a multivariable function. This paper proposes a hybrid of the artificial bee colony (ABC) algorithm and the minimisation of metabolic adjustment (MOMA) to predict an optimal set of solutions in order to optimise the production rate of succinate and lactate. The dataset used in this work was from the iJO1366 Escherichia coli metabolic network. The experimental results include the production rate, growth rate and a list of knockout genes. From the comparative analysis, ABCMOMA produced better results compared to previous works, showing potential for solving genetic engineering problems. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  14. DenguePredict: An Integrated Drug Repositioning Approach towards Drug Discovery for Dengue.

    PubMed

    Wang, QuanQiu; Xu, Rong

    2015-01-01

    Dengue is a viral disease of expanding global incidence without cures. Here we present a drug repositioning system (DenguePredict) leveraging upon a unique drug treatment database and vast amounts of disease- and drug-related data. We first constructed a large-scale genetic disease network with enriched dengue genetics data curated from biomedical literature. We applied a network-based ranking algorithm to find dengue-related diseases from the disease network. We then developed a novel algorithm to prioritize FDA-approved drugs from dengue-related diseases to treat dengue. When tested in a de-novo validation setting, DenguePredict found the only two drugs tested in clinical trials for treating dengue and ranked them highly: chloroquine ranked at top 0.96% and ivermectin at top 22.75%. We showed that drugs targeting immune systems and arachidonic acid metabolism-related apoptotic pathways might represent innovative drugs to treat dengue. In summary, DenguePredict, by combining comprehensive disease- and drug-related data and novel algorithms, may greatly facilitate drug discovery for dengue.

  15. Nonlinear dynamics optimization with particle swarm and genetic algorithms for SPEAR3 emittance upgrade

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Huang, Xiaobiao; Safranek, James

    2014-09-01

    Nonlinear dynamics optimization is carried out for a low emittance upgrade lattice of SPEAR3 in order to improve its dynamic aperture and Touschek lifetime. Two multi-objective optimization algorithms, a genetic algorithm and a particle swarm algorithm, are used for this study. The performance of the two algorithms is compared. The result shows that the particle swarm algorithm converges significantly faster to similar or better solutions than the genetic algorithm and it does not require seeding of good solutions in the initial population. These advantages of the particle swarm algorithm may make it more suitable for many accelerator optimization applications.

  16. On-line monitoring of extraction process of Flos Lonicerae Japonicae using near infrared spectroscopy combined with synergy interval PLS and genetic algorithm

    NASA Astrophysics Data System (ADS)

    Yang, Yue; Wang, Lei; Wu, Yongjiang; Liu, Xuesong; Bi, Yuan; Xiao, Wei; Chen, Yong

    2017-07-01

    There is a growing need for effective on-line process monitoring during the manufacture of traditional Chinese medicine (TCM) to ensure quality consistency. In this study, the potential of the near infrared (NIR) spectroscopy technique to monitor the extraction process of Flos Lonicerae Japonicae was investigated. A new algorithm of synergy interval PLS with genetic algorithm (Si-GA-PLS) was proposed for modeling. Four different PLS models, namely Full-PLS, Si-PLS, GA-PLS, and Si-GA-PLS, were established, and their performances in predicting two quality parameters (viz. total acid and soluble solid contents) were compared. In conclusion, the Si-GA-PLS model gave the best results because it combines the strengths of Si-PLS and GA. For Si-GA-PLS, the determination coefficient (Rp²) and root-mean-square error for the prediction set (RMSEP) were 0.9561 and 147.6544 μg/ml for total acid, and 0.9062 and 0.1078% for soluble solid contents, respectively. The overall results demonstrated that the NIR spectroscopy technique combined with Si-GA-PLS calibration is a reliable and non-destructive alternative method for on-line monitoring of the extraction process of TCM on the production scale.

  17. Modeling an aquatic ecosystem: application of an evolutionary algorithm with genetic doping to reduce prediction uncertainty

    NASA Astrophysics Data System (ADS)

    Friedel, Michael; Buscema, Massimo

    2016-04-01

    Aquatic ecosystem models can potentially be used to understand the influence of stresses on catchment resource quality. Given that catchment responses are functions of natural and anthropogenic stresses reflected in sparse and spatiotemporal biological, physical, and chemical measurements, an ecosystem is difficult to model using statistical or numerical methods. We propose an artificial adaptive systems approach to model ecosystems. First, an unsupervised machine-learning (ML) network is trained using the set of available sparse and disparate data variables. Second, an evolutionary algorithm with genetic doping is applied to reduce the number of ecosystem variables to an optimal set. Third, the optimal set of ecosystem variables is used to retrain the ML network. Fourth, a stochastic cross-validation approach is applied to quantify and compare the nonlinear uncertainty in selected predictions of the original and reduced models. Results are presented for aquatic ecosystems (tens of thousands of square kilometers) undergoing landscape change in the USA: Upper Illinois River Basin and Central Colorado Assessment Project Area, and Southland region, NZ.

  18. Application of matrix-assisted laser desorption ionization time-of-flight mass spectrometry in the screening of vanA-positive Enterococcus faecium.

    PubMed

    Wang, Li-jun; Lu, Xin-xin; Wu, Wei; Sui, Wen-jun; Zhang, Gui

    2014-01-01

    In order to evaluate a rapid matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS) assay in screening vancomycin-resistant Enterococcus faecium, a total of 150 E. faecium clinical strains were studied, including 60 vancomycin-resistant E. faecium (VREF) isolates and 90 vancomycin-susceptible (VSEF) strains. Vancomycin resistance genes were detected by sequencing. E. faecium were identified by MALDI-TOF MS. A genetic algorithm model with ClinProTools software was generated using spectra of 30 VREF isolates and 30 VSEF isolates. Using this model, 90 test isolates were discriminated between VREF and VSEF. The results showed that all sixty VREF isolates carried the vanA gene. The performance of VREF detection by the genetic algorithm model of MALDI-TOF MS compared to the sequencing method was sensitivity = 80%, specificity = 90%, false positive rate = 10%, false negative rate = 10%, positive predictive value = 80%, negative predictive value = 90%. MALDI-TOF MS can be used as a screening test for discrimination between vanA-positive E. faecium and vanA-negative E. faecium.
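
    The screening statistics quoted above follow from a 2x2 confusion matrix. The sketch below assumes that the 90 test isolates are the 30 VREF and 60 VSEF strains not used to build the model, and uses inferred cell counts (24 true positives, 6 false negatives, 54 true negatives, 6 false positives); these counts are not given in the abstract and were chosen only because they are consistent with the reported sensitivity, specificity and predictive values.

```python
def screening_metrics(tp, fn, tn, fp):
    """Standard diagnostic-test ratios from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # detected resistant / all resistant
        "specificity": tn / (tn + fp),   # detected susceptible / all susceptible
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Assumption: inferred counts for 30 VREF and 60 VSEF test isolates.
metrics = screening_metrics(tp=24, fn=6, tn=54, fp=6)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```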

  19. Wind power prediction based on genetic neural network

    NASA Astrophysics Data System (ADS)

    Zhang, Suhan

    2017-04-01

    The scale of grid-connected wind farms keeps increasing. To ensure stable power system operation, make reasonable scheduling schemes, and improve the competitiveness of wind farms in the electricity generation market, it is important to forecast short-term wind power accurately. To reduce the influence of the nonlinear relationship between the disturbance factors and the wind power, an improved prediction model based on a genetic algorithm and a neural network is established. To overcome the long training time of the BP neural network and its tendency to fall into local minima, and to improve the accuracy of the neural network, a genetic algorithm is adopted to optimize the parameters and topology of the neural network. Historical data are used as input to predict short-term wind power. The effectiveness and feasibility of the method are verified using actual data from a wind farm as an example.

  20. ASPsiRNA: A Resource of ASP-siRNAs Having Therapeutic Potential for Human Genetic Disorders and Algorithm for Prediction of Their Inhibitory Efficacy

    PubMed Central

    Monga, Isha; Qureshi, Abid; Thakur, Nishant; Gupta, Amit Kumar; Kumar, Manoj

    2017-01-01

    Allele-specific siRNAs (ASP-siRNAs) have emerged as promising therapeutic molecules owing to their selectivity to inhibit the mutant allele or associated single-nucleotide polymorphisms (SNPs) sparing the expression of the wild-type counterpart. Thus, a dedicated bioinformatics platform encompassing updated ASP-siRNAs and an algorithm for the prediction of their inhibitory efficacy will be helpful in tackling currently intractable genetic disorders. In the present study, we have developed the ASPsiRNA resource (http://crdd.osdd.net/servers/aspsirna/) covering three components viz (i) ASPsiDb, (ii) ASPsiPred, and (iii) analysis tools like ASP-siOffTar. ASPsiDb is a manually curated database harboring 4543 (including 422 chemically modified) ASP-siRNAs targeting 78 unique genes involved in 51 different diseases. It furnishes comprehensive information from experimental studies on ASP-siRNAs along with multidimensional genetic and clinical information for numerous mutations. ASPsiPred is a two-layered algorithm to predict efficacy of ASP-siRNAs for fully complementary mutant (Effmut) and wild-type allele (Effwild) with one mismatch by ASPsiPredSVM and ASPsiPredmatrix, respectively. In ASPsiPredSVM, 922 unique ASP-siRNAs with experimentally validated quantitative Effmut were used. During 10-fold cross-validation (10nCV) employing various sequence features on the training/testing dataset (T737), the best predictive model achieved a maximum Pearson’s correlation coefficient (PCC) of 0.71. Further, the accuracy of the classifier to predict Effmut against novel genes was assessed by leave one target out cross-validation approach (LOTOCV). ASPsiPredmatrix was constructed from rule-based studies describing the effect of single siRNA:mRNA mismatches on the efficacy at 19 different locations of siRNA. Thus, ASPsiRNA encompasses the first database, prediction algorithm, and off-target analysis tool that is expected to accelerate research in the field of RNAi-based therapeutics for human genetic diseases. PMID:28696921

  1. Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend

    PubMed Central

    Boonjing, Veera; Intakosum, Sarun

    2016-01-01

    This study investigated the use of Artificial Neural Network (ANN) and Genetic Algorithm (GA) for prediction of Thailand's SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid's prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span. PMID:27974883

  2. Artificial Neural Network and Genetic Algorithm Hybrid Intelligence for Predicting Thai Stock Price Index Trend.

    PubMed

    Inthachot, Montri; Boonjing, Veera; Intakosum, Sarun

    2016-01-01

    This study investigated the use of Artificial Neural Network (ANN) and Genetic Algorithm (GA) for prediction of Thailand's SET50 index trend. ANN is a widely accepted machine learning method that uses past data to predict future trend, while GA is an algorithm that can find better subsets of input variables for importing into ANN, hence enabling more accurate prediction by its efficient feature selection. The imported data were chosen technical indicators highly regarded by stock analysts, each represented by 4 input variables that were based on past time spans of 4 different lengths: 3-, 5-, 10-, and 15-day spans before the day of prediction. This import undertaking generated a big set of diverse input variables with an exponentially higher number of possible subsets that GA culled down to a manageable number of more effective ones. SET50 index data of the past 6 years, from 2009 to 2014, were used to evaluate this hybrid intelligence prediction accuracy, and the hybrid's prediction results were found to be more accurate than those made by a method using only one input variable for one fixed length of past time span.

  3. Genome analysis of Legionella pneumophila strains using a mixed-genome microarray.

    PubMed

    Euser, Sjoerd M; Nagelkerke, Nico J; Schuren, Frank; Jansen, Ruud; Den Boer, Jeroen W

    2012-01-01

    Legionella, the causative agent for Legionnaires' disease, is ubiquitous in both natural and man-made aquatic environments. The distribution of Legionella genotypes within clinical strains is significantly different from that found in environmental strains. Developing novel genotypic methods that offer the ability to distinguish clinical from environmental strains could help to focus on more relevant (virulent) Legionella species in control efforts. Mixed-genome microarray data can be used to perform a comparative-genome analysis of strain collections, and advanced statistical approaches, such as the Random Forest algorithm are available to process these data. Microarray analysis was performed on a collection of 222 Legionella pneumophila strains, which included patient-derived strains from notified cases in The Netherlands in the period 2002-2006 and the environmental strains that were collected during the source investigation for those patients within the Dutch National Legionella Outbreak Detection Programme. The Random Forest algorithm combined with a logistic regression model was used to select predictive markers and to construct a predictive model that could discriminate between strains from different origin: clinical or environmental. Four genetic markers were selected that correctly predicted 96% of the clinical strains and 66% of the environmental strains collected within the Dutch National Legionella Outbreak Detection Programme. The Random Forest algorithm is well suited for the development of prediction models that use mixed-genome microarray data to discriminate between Legionella strains from different origin. The identification of these predictive genetic markers could offer the possibility to identify virulence factors within the Legionella genome, which in the future may be implemented in the daily practice of controlling Legionella in the public health environment.

  4. Assessment of genetic and nongenetic interactions for the prediction of depressive symptomatology: an analysis of the Wisconsin Longitudinal Study using machine learning algorithms.

    PubMed

    Roetker, Nicholas S; Page, C David; Yonker, James A; Chang, Vicky; Roan, Carol L; Herd, Pamela; Hauser, Taissa S; Hauser, Robert M; Atwood, Craig S

    2013-10-01

    We examined depression within a multidimensional framework consisting of genetic, environmental, and sociobehavioral factors and, using machine learning algorithms, explored interactions among these factors that might better explain the etiology of depressive symptoms. We measured current depressive symptoms using the Center for Epidemiologic Studies Depression Scale (n = 6378 participants in the Wisconsin Longitudinal Study). Genetic factors were 78 single nucleotide polymorphisms (SNPs); environmental factors were 13 stressful life events (SLEs), plus a composite proportion of SLEs index; and sociobehavioral factors were 18 personality, intelligence, and other health or behavioral measures. We performed traditional SNP associations via logistic regression likelihood ratio testing and explored interactions with support vector machines and Bayesian networks. After correction for multiple testing, we found no significant single genotypic associations with depressive symptoms. Machine learning algorithms showed no evidence of interactions. Naïve Bayes produced the best models in both subsets and included only environmental and sociobehavioral factors. We found no single or interactive associations with genetic factors and depressive symptoms. Various environmental and sociobehavioral factors were more predictive of depressive symptoms, yet their impacts were independent of one another. A genome-wide analysis of genetic alterations using machine learning methodologies will provide a framework for identifying genetic-environmental-sociobehavioral interactions in depressive symptoms.

  5. Genetic Algorithm for Initial Orbit Determination with Too Short Arc (Continued)

    NASA Astrophysics Data System (ADS)

    Li, X. R.; Wang, X.

    2016-03-01

    When using the genetic algorithm to solve the problem of too-short-arc (TSA) determination, due to the difference of computing processes between the genetic algorithm and classical method, the methods for outliers editing are no longer applicable. In the genetic algorithm, the robust estimation is acquired by means of using different loss functions in the fitness function, then the outlier problem of TSAs is solved. Compared with the classical method, the application of loss functions in the genetic algorithm is greatly simplified. Through the comparison of results of different loss functions, it is clear that the methods of least median square and least trimmed square can greatly improve the robustness of TSAs, and have a high breakdown point.
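
    The effect of swapping the loss function inside a fitness function can be shown on a toy residual vector. This is a minimal sketch, not the authors' implementation: the function names, the `keep_fraction` parameter, and the residual values are hypothetical. It compares ordinary least squares with least median of squares and least trimmed squares when one observation is a gross outlier.

```python
import statistics


def least_squares(residuals):
    """Sum of all squared residuals (not robust)."""
    return sum(r * r for r in residuals)


def least_median_of_squares(residuals):
    """Median of the squared residuals."""
    return statistics.median(r * r for r in residuals)


def least_trimmed_squares(residuals, keep_fraction=0.75):
    """Sum of the smallest squared residuals; the largest ones are discarded."""
    squared = sorted(r * r for r in residuals)
    keep = max(1, int(len(squared) * keep_fraction))
    return sum(squared[:keep])


# Hypothetical observed-minus-computed residuals for one candidate orbit.
clean = [1, -2, 1, 0, -1, 2, 1, -1]
contaminated = [1, -2, 1, 0, -1, 2, 1, 50]   # last observation is an outlier

for loss in (least_squares, least_median_of_squares, least_trimmed_squares):
    print(loss.__name__, loss(clean), loss(contaminated))
```

    In this toy case the single outlier inflates the least-squares value by two orders of magnitude, while the median-based and trimmed values barely move, which is the behavior a GA fitness function needs in order to rank candidate orbits sensibly in the presence of outliers.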

  6. Prediction of chemical biodegradability using support vector classifier optimized with differential evolution.

    PubMed

    Cao, Qi; Leung, K M

    2014-09-22

    Reliable computer models for the prediction of chemical biodegradability from molecular descriptors and fingerprints are very important for making health and environmental decisions. Coupling of the differential evolution (DE) algorithm with the support vector classifier (SVC) in order to optimize the main parameters of the classifier resulted in an improved classifier called the DE-SVC, which is introduced in this paper for use in chemical biodegradability studies. The DE-SVC was applied to predict the biodegradation of chemicals on the basis of extensive sample data sets and known structural features of molecules. Our optimization experiments showed that DE can efficiently find the proper parameters of the SVC. The resulting classifier possesses strong robustness and reliability compared with grid search, genetic algorithm, and particle swarm optimization methods. The classification experiments conducted here showed that the DE-SVC exhibits better classification performance than models previously used for such studies. It is a more effective and efficient prediction model for chemical biodegradability.

  7. Genetic algorithms for protein threading.

    PubMed

    Yadgari, J; Amir, A; Unger, R

    1998-01-01

    Despite many years of efforts, a direct prediction of protein structure from sequence is still not possible. As a result, in the last few years researchers have started to address the "inverse folding problem": Identifying and aligning a sequence to the fold with which it is most compatible, a process known as "threading". In two meetings in which protein folding predictions were objectively evaluated, it became clear that threading as a concept promises a real breakthrough, but that much improvement is still needed in the technique itself. Threading is an NP-hard problem, and thus no general polynomial solution can be expected. Still a practical approach with demonstrated ability to find optimal solutions in many cases, and acceptable solutions in other cases, is needed. We applied the technique of Genetic Algorithms in order to significantly improve the ability of threading algorithms to find the optimal alignment of a sequence to a structure, i.e. the alignment with the minimum free energy. A major progress reported here is the design of a representation of the threading alignment as a string of fixed length. With this representation validation of alignments and genetic operators are effectively implemented. Appropriate data structure and parameters have been selected. It is shown that Genetic Algorithm threading is effective and is able to find the optimal alignment in a few test cases. Furthermore, the described algorithm is shown to perform well even without pre-definition of core elements. Existing threading methods are dependent on such constraints to make their calculations feasible. But the concept of core elements is inherently arbitrary and should be avoided if possible. While a rigorous proof is hard to give as yet, we present indications that indeed Genetic Algorithm threading is capable of finding consistently good solutions of full alignments in search spaces of size up to 10^70.

  8. Determination of the Spatial Distribution in Hydraulic Conductivity Using Genetic Algorithm Optimization

    NASA Astrophysics Data System (ADS)

    Aksoy, A.; Lee, J. H.; Kitanidis, P. K.

    2016-12-01

    Heterogeneity in hydraulic conductivity (K) impacts the transport and fate of contaminants in subsurface as well as design and operation of managed aquifer recharge (MAR) systems. Recently, improvements in computational resources and availability of big data through electrical resistivity tomography (ERT) and remote sensing have provided opportunities to better characterize the subsurface. Yet, there is need to improve prediction and evaluation methods in order to obtain information from field measurements for better field characterization. In this study, genetic algorithm optimization, which has been widely used in optimal aquifer remediation designs, was used to determine the spatial distribution of K. A hypothetical 2 km by 2 km aquifer was considered. A genetic algorithm library, PGAPack, was linked with a fast Fourier transform based random field generator as well as a groundwater flow and contaminant transport simulation model (BIO2D-KE). The objective of the optimization model was to minimize the total squared error between measured and predicted field values. It was assumed measured K values were available through ERT. Performance of genetic algorithm in predicting the distribution of K was tested for different cases. In the first one, it was assumed that observed K values were evaluated using the random field generator only as the forward model. In the second case, as well as K-values obtained through ERT, measured head values were incorporated into evaluation in which BIO2D-KE and random field generator were used as the forward models. Lastly, tracer concentrations were used as additional information in the optimization model. Initial results indicated enhanced performance when random field generator and BIO2D-KE are used in combination in predicting the spatial distribution in K.

  9. The fatigue life prediction of aluminium alloy using genetic algorithm and neural network

    NASA Astrophysics Data System (ADS)

    Susmikanti, Mike

    2013-09-01

    The fatigue-life behavior of industrial materials is very important. In many cases fatigue cannot be avoided; however, there are many ways to control it. Many investigations of the fatigue life of alloys have been carried out, but they are costly and computationally time-consuming. This paper reports modeling and simulation approaches to predict the fatigue-life behavior of aluminum alloys and resolves some of these computational problems. First, a simulation using a genetic algorithm was used to optimize the load to obtain the stress values. These results can be used to provide the N-cycle fatigue life of the material. Furthermore, the experimental data were used as input for neural network learning, while the sample data were used for testing. Finally, the multilayer perceptron algorithm is applied to predict whether the given data sets are in accordance with the fatigue life of the alloy. To achieve rapid convergence, the Levenberg-Marquardt algorithm was also employed. The simulation results show that the fatigue behavior of aluminum under pressure can be predicted. In addition, the neural network implementation successfully identified a model for material fatigue life.

  10. Evolvable Neuronal Paths: A Novel Basis for Information and Search in the Brain

    PubMed Central

    Fernando, Chrisantha; Vasas, Vera; Szathmáry, Eörs; Husbands, Phil

    2011-01-01

    We propose a previously unrecognized kind of informational entity in the brain that is capable of acting as the basis for unlimited hereditary variation in neuronal networks. This unit is a path of activity through a network of neurons, analogous to a path taken through a hidden Markov model. To prove in principle the capabilities of this new kind of informational substrate, we show how a population of paths can be used as the hereditary material for a neuronally implemented genetic algorithm, (the swiss-army knife of black-box optimization techniques) which we have proposed elsewhere could operate at somatic timescales in the brain. We compare this to the same genetic algorithm that uses a standard ‘genetic’ informational substrate, i.e. non-overlapping discrete genotypes, on a range of optimization problems. A path evolution algorithm (PEA) is defined as any algorithm that implements natural selection of paths in a network substrate. A PEA is a previously unrecognized type of natural selection that is well suited for implementation by biological neuronal networks with structural plasticity. The important similarities and differences between a standard genetic algorithm and a PEA are considered. Whilst most experiments are conducted on an abstract network model, at the conclusion of the paper a slightly more realistic neuronal implementation of a PEA is outlined based on Izhikevich spiking neurons. Finally, experimental predictions are made for the identification of such informational paths in the brain. PMID:21887266

  11. Multiple Query Evaluation Based on an Enhanced Genetic Algorithm.

    ERIC Educational Resources Information Center

    Tamine, Lynda; Chrisment, Claude; Boughanem, Mohand

    2003-01-01

    Explains the use of genetic algorithms to combine results from multiple query evaluations to improve relevance in information retrieval. Discusses niching techniques, relevance feedback techniques, and evolution heuristics, and compares retrieval results obtained by both genetic multiple query evaluation and classical single query evaluation…

  12. Genetic Algorithms and Classification Trees in Feature Discovery: Diabetes and the NHANES database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Heredia-Langner, Alejandro; Jarman, Kristin H.; Amidan, Brett G.

    2013-09-01

    This paper presents a feature selection methodology that can be applied to datasets containing a mixture of continuous and categorical variables. Using a Genetic Algorithm (GA), this method explores a dataset and selects a small set of features relevant for the prediction of a binary (1/0) response. Binary classification trees and an objective function based on conditional probabilities are used to measure the fitness of a given subset of features. The method is applied to health data in order to find factors useful for the prediction of diabetes. Results show that our algorithm is capable of narrowing down the set of predictors to around 8 factors that can be validated using reputable medical and public health resources.

  13. Accuracy of whole-genome prediction using a genetic architecture-enhanced variance-covariance matrix.

    PubMed

    Zhang, Zhe; Erbe, Malena; He, Jinlong; Ober, Ulrike; Gao, Ning; Zhang, Hao; Simianer, Henner; Li, Jiaqi

    2015-02-09

    Obtaining accurate predictions of unobserved genetic or phenotypic values for complex traits in animal, plant, and human populations is possible through whole-genome prediction (WGP), a combined analysis of genotypic and phenotypic data. Because the underlying genetic architecture of the trait of interest is an important factor affecting model selection, we propose a new strategy, termed BLUP|GA (BLUP-given genetic architecture), which can use genetic architecture information within the dataset at hand rather than from public sources. This is achieved by using a trait-specific covariance matrix (T), which is a weighted sum of a genetic architecture part (S matrix) and the realized relationship matrix (G). The algorithm of BLUP|GA (BLUP-given genetic architecture) is provided and illustrated with real and simulated datasets. Predictive ability of BLUP|GA was validated with three model traits in a dairy cattle dataset and 11 traits in three public datasets with a variety of genetic architectures and compared with GBLUP and other approaches. Results show that BLUP|GA outperformed GBLUP in 20 of 21 scenarios in the dairy cattle dataset and outperformed GBLUP, BayesA, and BayesB in 12 of 13 traits in the analyzed public datasets. Further analyses showed that the difference of accuracies for BLUP|GA and GBLUP significantly correlate with the distance between the T and G matrices. The new strategy applied in BLUP|GA is a favorable and flexible alternative to the standard GBLUP model, allowing to account for the genetic architecture of the quantitative trait under consideration when necessary. This feature is mainly due to the increased similarity between the trait-specific relationship matrix (T matrix) and the genetic relationship matrix at unobserved causal loci. Applying BLUP|GA in WGP would ease the burden of model selection. Copyright © 2015 Zhang et al.
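
    The weighted sum described in this abstract can be sketched as follows. This is only an illustration: the abstract says that T is "a weighted sum" of S and G but does not give the weights, so the single scalar `weight` and the convex form T = weight * S + (1 - weight) * G are assumptions, as are the function name and the toy 2 x 2 matrices.

```python
def trait_specific_matrix(S, G, weight):
    """Return T = weight * S + (1 - weight) * G for equally sized matrices.

    `weight` is a hypothetical scalar in [0, 1]; the abstract does not
    specify how the two parts are weighted.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [
        [weight * s + (1.0 - weight) * g for s, g in zip(row_s, row_g)]
        for row_s, row_g in zip(S, G)
    ]


# Toy matrices, invented for illustration only.
G = [[1.0, 0.2], [0.2, 1.0]]  # realized relationship matrix
S = [[1.0, 0.6], [0.6, 1.0]]  # genetic architecture part
T = trait_specific_matrix(S, G, weight=0.25)
```

    With `weight=0` the sketch returns G unchanged, which corresponds to falling back to an ordinary relationship matrix as used by GBLUP.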

  14. Use of Artificial Intelligence and Machine Learning Algorithms with Gene Expression Profiling to Predict Recurrent Nonmuscle Invasive Urothelial Carcinoma of the Bladder.

    PubMed

    Bartsch, Georg; Mitra, Anirban P; Mitra, Sheetal A; Almal, Arpit A; Steven, Kenneth E; Skinner, Donald G; Fry, David W; Lenehan, Peter F; Worzel, William P; Cote, Richard J

    2016-02-01

    Due to the high recurrence risk of nonmuscle invasive urothelial carcinoma it is crucial to distinguish patients at high risk from those with indolent disease. In this study we used a machine learning algorithm to identify the genes in patients with nonmuscle invasive urothelial carcinoma at initial presentation that were most predictive of recurrence. We used the genes in a molecular signature to predict recurrence risk within 5 years after transurethral resection of bladder tumor. Whole genome profiling was performed on 112 frozen nonmuscle invasive urothelial carcinoma specimens obtained at first presentation on Human WG-6 BeadChips (Illumina®). A genetic programming algorithm was applied to evolve classifier mathematical models for outcome prediction. Cross-validation based resampling and gene use frequencies were used to identify the most prognostic genes, which were combined into rules used in a voting algorithm to predict the sample target class. Key genes were validated by quantitative polymerase chain reaction. The classifier set included 21 genes that predicted recurrence. Quantitative polymerase chain reaction was done for these genes in a subset of 100 patients. A 5-gene combined rule incorporating a voting algorithm yielded 77% sensitivity and 85% specificity to predict recurrence in the training set, and 69% and 62%, respectively, in the test set. A singular 3-gene rule was constructed that predicted recurrence with 80% sensitivity and 90% specificity in the training set, and 71% and 67%, respectively, in the test set. Using primary nonmuscle invasive urothelial carcinoma from initial occurrences genetic programming identified transcripts in reproducible fashion, which were predictive of recurrence. These findings could potentially impact nonmuscle invasive urothelial carcinoma management. Copyright © 2016 American Urological Association Education and Research, Inc. Published by Elsevier Inc. All rights reserved.

  15. Genetic algorithm dynamics on a rugged landscape

    NASA Astrophysics Data System (ADS)

    Bornholdt, Stefan

    1998-04-01

    The genetic algorithm is an optimization procedure motivated by biological evolution and is successfully applied to optimization problems in different areas. A statistical mechanics model for its dynamics is proposed based on the parent-child fitness correlation of the genetic operators, making it applicable to general fitness landscapes. It is compared to a recent model based on a maximum entropy ansatz. Finally it is applied to modeling the dynamics of a genetic algorithm on the rugged fitness landscape of the NK model.

  16. Context-sensitive network-based disease genetics prediction and its implications in drug discovery

    PubMed Central

    Chen, Yang; Xu, Rong

    2017-01-01

    Abstract Motivation: Disease phenotype networks play an important role in computational approaches to identifying new disease-gene associations. Current disease phenotype networks often model disease relationships based on pairwise similarities, therefore ignore the specific context on how two diseases are connected. In this study, we propose a new strategy to model disease associations using context-sensitive networks (CSNs). We developed a CSN-based phenome-driven approach for disease genetics prediction, and investigated the translational potential of the predicted genes in drug discovery. Results: We constructed CSNs by directly connecting diseases with associated phenotypes. Here, we constructed two CSNs using different data sources; the two networks contain 26 790 and 13 822 nodes respectively. We integrated the CSNs with a genetic functional relationship network and predicted disease genes using a network-based ranking algorithm. For comparison, we built Similarity-Based disease Networks (SBN) using the same disease phenotype data. In a de novo cross validation for 3324 diseases, the CSN-based approach significantly increased the average rank from top 12.6 to top 8.8% for all tested genes comparing with the SBN-based approach (p

  17. A synthetic genetic edge detection program.

    PubMed

    Tabor, Jeffrey J; Salis, Howard M; Simpson, Zachary Booth; Chevalier, Aaron A; Levskaya, Anselm; Marcotte, Edward M; Voigt, Christopher A; Ellington, Andrew D

    2009-06-26

    Edge detection is a signal processing algorithm common in artificial intelligence and image recognition programs. We have constructed a genetically encoded edge detection algorithm that programs an isogenic community of E. coli to sense an image of light, communicate to identify the light-dark edges, and visually present the result of the computation. The algorithm is implemented using multiple genetic circuits. An engineered light sensor enables cells to distinguish between light and dark regions. In the dark, cells produce a diffusible chemical signal that diffuses into light regions. Genetic logic gates are used so that only cells that sense light and the diffusible signal produce a positive output. A mathematical model constructed from first principles and parameterized with experimental measurements of the component circuits predicts the performance of the complete program. Quantitatively accurate models will facilitate the engineering of more complex biological behaviors and inform bottom-up studies of natural genetic regulatory networks.

  18. A Synthetic Genetic Edge Detection Program

    PubMed Central

    Tabor, Jeffrey J.; Salis, Howard; Simpson, Zachary B.; Chevalier, Aaron A.; Levskaya, Anselm; Marcotte, Edward M.; Voigt, Christopher A.; Ellington, Andrew D.

    2009-01-01

    Summary Edge detection is a signal processing algorithm common in artificial intelligence and image recognition programs. We have constructed a genetically encoded edge detection algorithm that programs an isogenic community of E.coli to sense an image of light, communicate to identify the light-dark edges, and visually present the result of the computation. The algorithm is implemented using multiple genetic circuits. An engineered light sensor enables cells to distinguish between light and dark regions. In the dark, cells produce a diffusible chemical signal that diffuses into light regions. Genetic logic gates are used so that only cells that sense light and the diffusible signal produce a positive output. A mathematical model constructed from first principles and parameterized with experimental measurements of the component circuits predicts the performance of the complete program. Quantitatively accurate models will facilitate the engineering of more complex biological behaviors and inform bottom-up studies of natural genetic regulatory networks. PMID:19563759

  19. Genetic Algorithm for Initial Orbit Determination with Too Short Arc (Continued)

    NASA Astrophysics Data System (ADS)

    Li, Xin-ran; Wang, Xin

    2017-04-01

    When the genetic algorithm is used to solve the too-short-arc (TSA) orbit determination problem, the original method for outlier deletion is no longer applicable, because the computing process of the genetic algorithm differs from that of the classical method. In the genetic algorithm, robust estimation is realized by introducing different loss functions into the fitness function, which solves the outlier problem of TSA orbit determination. Compared with the classical method, the genetic algorithm is greatly simplified by introducing different loss functions. A comparison of calculations with multiple loss functions shows that least median square (LMS) estimation and least trimmed square (LTS) estimation can greatly improve the robustness of TSA orbit determination and have a high breakdown point.
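
    The effect of swapping the loss function can be illustrated with a minimal sketch. The functions below are the textbook definitions of least squares, LMS and LTS applied to a list of residuals; the function names, the `keep_fraction` parameter and the residual values are hypothetical and are not taken from the paper.

```python
from statistics import median


def least_squares_loss(residuals):
    """Classical sum of squared residuals; one gross outlier can dominate it."""
    return sum(r * r for r in residuals)


def lms_loss(residuals):
    """Least median of squares: the median of the squared residuals."""
    return median(r * r for r in residuals)


def lts_loss(residuals, keep_fraction=0.75):
    """Least trimmed squares: sum of the h smallest squared residuals.

    `keep_fraction` (hypothetical name) sets h as a share of the sample.
    """
    squared = sorted(r * r for r in residuals)
    h = max(1, int(len(squared) * keep_fraction))
    return sum(squared[:h])


# Invented residuals: seven small ones plus one gross outlier.
clean = [0.1, -0.2, 0.15, -0.05, 0.1, -0.1, 0.05, 0.2]
with_outlier = clean[:-1] + [50.0]
```

    On `with_outlier` the least-squares loss is dominated by the single bad observation, while the LMS and LTS losses stay small. A genetic algorithm that uses either robust loss as its fitness would therefore not need a separate outlier-deletion step.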

  20. New knowledge-based genetic algorithm for excavator boom structural optimization

    NASA Astrophysics Data System (ADS)

    Hua, Haiyan; Lin, Shuwen

    2014-03-01

    Due to the insufficiency of utilizing knowledge to guide the complex optimal searching, existing genetic algorithms fail to effectively solve excavator boom structural optimization problem. To improve the optimization efficiency and quality, a new knowledge-based real-coded genetic algorithm is proposed. A dual evolution mechanism combining knowledge evolution with genetic algorithm is established to extract, handle and utilize the shallow and deep implicit constraint knowledge to guide the optimal searching of genetic algorithm circularly. Based on this dual evolution mechanism, knowledge evolution and population evolution can be connected by knowledge influence operators to improve the configurability of knowledge and genetic operators. Then, the new knowledge-based selection operator, crossover operator and mutation operator are proposed to integrate the optimal process knowledge and domain culture to guide the excavator boom structural optimization. Eight kinds of testing algorithms, which include different genetic operators, are taken as examples to solve the structural optimization of a medium-sized excavator boom. By comparing the results of optimization, it is shown that the algorithm including all the new knowledge-based genetic operators can more remarkably improve the evolutionary rate and searching ability than other testing algorithms, which demonstrates the effectiveness of knowledge for guiding optimal searching. The proposed knowledge-based genetic algorithm by combining multi-level knowledge evolution with numerical optimization provides a new effective method for solving the complex engineering optimization problem.

  1. Recurrent neural network-based modeling of gene regulatory network using elephant swarm water search algorithm.

    PubMed

    Mandal, Sudip; Saha, Goutam; Pal, Rajat Kumar

    2017-08-01

    Correct inference of genetic regulation inside a cell from biological data, such as time-series microarray data, is one of the greatest challenges of the post-genomic era for biologists and researchers. The Recurrent Neural Network (RNN) is one of the most popular and simple approaches for modeling the dynamics and inferring the correct dependencies among genes. Inspired by the behavior of social elephants, we propose a new metaheuristic, the Elephant Swarm Water Search Algorithm (ESWSA), to infer Gene Regulatory Networks (GRNs). This algorithm is mainly based on the water search strategy of intelligent and social elephants during drought, utilizing different types of communication techniques. Initially, the algorithm is tested against benchmark small- and medium-scale artificial genetic networks with and without different noise levels, and its efficiency is observed in terms of parametric error, minimum fitness value, execution time, accuracy of prediction of true regulation, etc. Next, the proposed algorithm is tested against the real time gene expression data of the Escherichia coli SOS network, and the results are compared with other state-of-the-art optimization methods. The experimental results suggest that ESWSA is very efficient for the GRN inference problem and performs better than other methods in many ways.

  2. First flights of genetic-algorithm Kitty Hawk

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Goldberg, D.E.

    1994-12-31

    The design of complex systems requires an effective methodology of invention. This paper considers the methodology of the Wright brothers in inventing the powered airplane and suggests how successes in the design of genetic algorithms have come at the hands of a Wright-brothers-like approach. Recent reliable subquadratic results in solving hard problems with nontraditional GAs and predictions of the limits of simple GAs are presented as two accomplishments achieved in this manner.

  3. Portfolio optimization by using linear programing models based on genetic algorithm

    NASA Astrophysics Data System (ADS)

    Sukono; Hidayat, Y.; Lesmana, E.; Putra, A. S.; Napitupulu, H.; Supian, S.

    2018-01-01

    In this paper, we discussed the investment portfolio optimization using linear programming model based on genetic algorithms. It is assumed that the portfolio risk is measured by absolute standard deviation, and each investor has a risk tolerance on the investment portfolio. To complete the investment portfolio optimization problem, the issue is arranged into a linear programming model. Furthermore, determination of the optimum solution for linear programming is done by using a genetic algorithm. As a numerical illustration, we analyze some of the stocks traded on the capital market in Indonesia. Based on the analysis, it is shown that the portfolio optimization performed by genetic algorithm approach produces more optimal efficient portfolio, compared to the portfolio optimization performed by a linear programming algorithm approach. Therefore, genetic algorithms can be considered as an alternative on determining the investment portfolio optimization, particularly using linear programming models.

  4. Assessment of the Clinical Relevance of BRCA2 Missense Variants by Functional and Computational Approaches.

    PubMed

    Guidugli, Lucia; Shimelis, Hermela; Masica, David L; Pankratz, Vernon S; Lipton, Gary B; Singh, Namit; Hu, Chunling; Monteiro, Alvaro N A; Lindor, Noralane M; Goldgar, David E; Karchin, Rachel; Iversen, Edwin S; Couch, Fergus J

    2018-01-17

    Many variants of uncertain significance (VUS) have been identified in BRCA2 through clinical genetic testing. VUS pose a significant clinical challenge because the contribution of these variants to cancer risk has not been determined. We conducted a comprehensive assessment of VUS in the BRCA2 C-terminal DNA binding domain (DBD) by using a validated functional assay of BRCA2 homologous recombination (HR) DNA-repair activity and defined a classifier of variant pathogenicity. Among 139 variants evaluated, 54 had ≥99% probability of pathogenicity, and 73 had ≥95% probability of neutrality. Functional assay results were compared with predictions of variant pathogenicity from the Align-GVGD protein-sequence-based prediction algorithm, which has been used for variant classification. Relative to the HR assay, Align-GVGD significantly (p < 0.05) over-predicted pathogenic variants. We subsequently combined functional and Align-GVGD prediction results in a Bayesian hierarchical model (VarCall) to estimate the overall probability of pathogenicity for each VUS. In addition, to predict the effects of all other BRCA2 DBD variants and to prioritize variants for functional studies, we used the endoPhenotype-Optimized Sequence Ensemble (ePOSE) algorithm to train classifiers for BRCA2 variants by using data from the HR functional assay. Together, the results show that systematic functional assays in combination with in silico predictors of pathogenicity provide robust tools for clinical annotation of BRCA2 VUS. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.

  5. Genetic algorithms with memory- and elitism-based immigrants in dynamic environments.

    PubMed

    Yang, Shengxiang

    2008-01-01

    In recent years the genetic algorithm community has shown a growing interest in studying dynamic optimization problems. Several approaches have been devised. The random immigrants and memory schemes are two major ones. The random immigrants scheme addresses dynamic environments by maintaining the population diversity while the memory scheme aims to adapt genetic algorithms quickly to new environments by reusing historical information. This paper investigates a hybrid memory and random immigrants scheme, called memory-based immigrants, and a hybrid elitism and random immigrants scheme, called elitism-based immigrants, for genetic algorithms in dynamic environments. In these schemes, the best individual from memory or the elite from the previous generation is retrieved as the base to create immigrants into the population by mutation. This way, not only can diversity be maintained but it is done more efficiently to adapt genetic algorithms to the current environment. Based on a series of systematically constructed dynamic problems, experiments are carried out to compare genetic algorithms with the memory-based and elitism-based immigrants schemes against genetic algorithms with traditional memory and random immigrants schemes and a hybrid memory and multi-population scheme. The sensitivity analysis regarding some key parameters is also carried out. Experimental results show that the memory-based and elitism-based immigrants schemes efficiently improve the performance of genetic algorithms in dynamic environments.
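
    The elitism-based immigrants idea can be sketched in a few lines. This is a minimal illustration under stated assumptions: individuals are bit lists, immigrants replace the worst individuals, and the names `elitism_based_immigrants`, `ratio` and `rate` are hypothetical. The abstract only states that the elite from the previous generation is the base from which immigrants are created by mutation.

```python
import random


def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def elitism_based_immigrants(population, fitness, elite,
                             ratio=0.2, rate=0.05, rng=random):
    """Replace the worst individuals with mutated copies of `elite`.

    `elite` is the best individual of the previous generation. Replacing
    the worst `ratio` share of the population is an assumption.
    """
    n_immigrants = int(len(population) * ratio)
    ranked = sorted(population, key=fitness, reverse=True)  # best first
    if n_immigrants == 0:
        return ranked
    survivors = ranked[:len(ranked) - n_immigrants]
    immigrants = [mutate(elite, rate, rng) for _ in range(n_immigrants)]
    return survivors + immigrants
```

    A memory-based variant would pass the best individual retrieved from memory as `elite` instead. Because the immigrants are perturbations of a good solution rather than random individuals, they add diversity while staying close to the region the search has already found useful.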

  6. Improved classification accuracy by feature extraction using genetic algorithms

    NASA Astrophysics Data System (ADS)

    Patriarche, Julia; Manduca, Armando; Erickson, Bradley J.

    2003-05-01

    A feature extraction algorithm has been developed for the purposes of improving classification accuracy. The algorithm uses a genetic algorithm / hill-climber hybrid to generate a set of linearly recombined features, which may be of reduced dimensionality compared with the original set. The genetic algorithm performs the global exploration, and a hill climber explores local neighborhoods. Hybridizing the genetic algorithm with a hill climber improves both the rate of convergence, and the final overall cost function value; it also reduces the sensitivity of the genetic algorithm to parameter selection. The genetic algorithm includes the operators: crossover, mutation, and deletion / reactivation - the last of these effects dimensionality reduction. The feature extractor is supervised, and is capable of deriving a separate feature space for each tissue (which are reintegrated during classification). A non-anatomical digital phantom was developed as a gold standard for testing purposes. In tests with the phantom, and with images of multiple sclerosis patients, classification with feature extractor derived features yielded lower error rates than using standard pulse sequences, and with features derived using principal components analysis. Using the multiple sclerosis patient data, the algorithm resulted in a mean 31% reduction in classification error of pure tissues.

  7. A New Efficient Hybrid Intelligent Model for Biodegradation Process of DMP with Fuzzy Wavelet Neural Networks

    NASA Astrophysics Data System (ADS)

    Huang, Mingzhi; Zhang, Tao; Ruan, Jujun; Chen, Xiaohong

    2017-01-01

    A new efficient hybrid intelligent approach based on fuzzy wavelet neural network (FWNN) was proposed for effectively modeling and simulating biodegradation process of Dimethyl phthalate (DMP) in an anaerobic/anoxic/oxic (AAO) wastewater treatment process. With the self learning and memory abilities of neural networks (NN), handling uncertainty capacity of fuzzy logic (FL), analyzing local details superiority of wavelet transform (WT) and global search of genetic algorithm (GA), the proposed hybrid intelligent model can extract the dynamic behavior and complex interrelationships from various water quality variables. For finding the optimal values for parameters of the proposed FWNN, a hybrid learning algorithm integrating an improved genetic optimization and gradient descent algorithm is employed. The results show that, compared with the NN model (optimized by GA) and the kinetic model, the proposed FWNN model has faster convergence, higher prediction performance, smaller RMSE (0.080), MSE (0.0064) and MAPE (1.8158) values, and a higher R2 (0.9851) value, which illustrates that the FWNN model simulates effluent DMP more accurately than the mechanism model.
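
    The reported RMSE and MSE are consistent with each other, since RMSE is the square root of MSE. The sketch below checks this with the two values quoted in the abstract. The helper functions are generic textbook definitions, not code from the paper.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error of paired observations and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


reported_rmse = 0.080   # from the abstract
reported_mse = 0.0064   # from the abstract

# 0.080 squared is 0.0064, so the two reported figures agree.
consistent = math.isclose(reported_rmse ** 2, reported_mse)
```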

  8. A New Efficient Hybrid Intelligent Model for Biodegradation Process of DMP with Fuzzy Wavelet Neural Networks

    PubMed Central

    Huang, Mingzhi; Zhang, Tao; Ruan, Jujun; Chen, Xiaohong

    2017-01-01

    A new efficient hybrid intelligent approach based on fuzzy wavelet neural network (FWNN) was proposed for effectively modeling and simulating biodegradation process of Dimethyl phthalate (DMP) in an anaerobic/anoxic/oxic (AAO) wastewater treatment process. With the self learning and memory abilities of neural networks (NN), handling uncertainty capacity of fuzzy logic (FL), analyzing local details superiority of wavelet transform (WT) and global search of genetic algorithm (GA), the proposed hybrid intelligent model can extract the dynamic behavior and complex interrelationships from various water quality variables. For finding the optimal values for parameters of the proposed FWNN, a hybrid learning algorithm integrating an improved genetic optimization and gradient descent algorithm is employed. The results show that, compared with the NN model (optimized by GA) and the kinetic model, the proposed FWNN model has faster convergence, higher prediction performance, smaller RMSE (0.080), MSE (0.0064) and MAPE (1.8158) values, and a higher R2 (0.9851) value, which illustrates that the FWNN model simulates effluent DMP more accurately than the mechanism model. PMID:28120889

  9. Harmony Search as a Powerful Tool for Feature Selection in QSPR Study of the Drugs Lipophilicity.

    PubMed

    Bahadori, Behnoosh; Atabati, Morteza

    2017-01-01

    Aims & Scope: Lipophilicity represents one of the most studied and most frequently used fundamental physicochemical properties. In the present work, the harmony search (HS) algorithm is proposed for feature selection in quantitative structure-property relationship (QSPR) modeling to predict the lipophilicity of neutral, acidic, basic and amphoteric drugs, which was determined by UHPLC. Harmony search is a music-based metaheuristic optimization algorithm. It was inspired by the observation that the aim of music is to search for a perfect state of harmony. Semi-empirical quantum-chemical calculations at the AM1 level were used to find the optimum 3D geometry of the studied molecules, and various descriptors (1497 descriptors) were calculated with the Dragon software. The descriptors selected by the harmony search algorithm (9 descriptors) were applied for model development using multiple linear regression (MLR). In comparison with other feature selection methods such as genetic algorithm and simulated annealing, the harmony search algorithm gives better results. The root mean square errors (RMSE) with and without leave-one-out cross validation (LOOCV) were 0.417 and 0.302, respectively. The results were compared with those obtained from the genetic algorithm and simulated annealing methods and showed that HS is a helpful tool for feature selection with fine performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  10. Optimal design of the first stage of the plate-fin heat exchanger for the EAST cryogenic system

    NASA Astrophysics Data System (ADS)

    Qingfeng, JIANG; Zhigang, ZHU; Qiyong, ZHANG; Ming, ZHUANG; Xiaofei, LU

    2018-03-01

    The size of the heat exchanger is an important factor determining the dimensions of the cold box in helium cryogenic systems. In this paper, a counter-flow multi-stream plate-fin heat exchanger is optimized by means of a spatial interpolation method coupled with a hybrid genetic algorithm. Compared with empirical correlations, this spatial interpolation algorithm based on a kriging model can be adopted to more precisely predict the Colburn heat transfer factors and Fanning friction factors of offset-strip fins. Moreover, strict computational fluid dynamics simulations can be carried out to predict the heat transfer and friction performance in the absence of reliable experimental data. Within the constraints of heat exchange requirements, maximum allowable pressure drop, existing manufacturing techniques and structural strength, a mathematical model of an optimized design with discrete and continuous variables based on a hybrid genetic algorithm is established in order to minimize the volume. The results show that for the first-stage heat exchanger in the EAST refrigerator, the structural size could be decreased from the original 2.200 × 0.600 × 0.627 (m3) to the optimized 1.854 × 0.420 × 0.340 (m3), with a large reduction in volume. The current work demonstrates that the proposed method could be a useful tool to achieve optimization in an actual engineering project during the practical design process.
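
    The size of the "large reduction in volume" can be worked out directly from the dimensions quoted in the abstract. The sketch below is plain arithmetic on those six numbers. The variable names are illustrative, and the abstract does not say which dimension is length, width or height.

```python
original = (2.200, 0.600, 0.627)   # metres, from the abstract
optimized = (1.854, 0.420, 0.340)  # metres, from the abstract


def volume(dims):
    """Volume of a rectangular box given its three edge lengths."""
    a, b, c = dims
    return a * b * c


v_original = volume(original)      # about 0.828 m^3
v_optimized = volume(optimized)    # about 0.265 m^3
reduction = 1.0 - v_optimized / v_original  # about 0.68

print(f"{v_original:.3f} m^3 -> {v_optimized:.3f} m^3 ({reduction:.0%} smaller)")
```

    Under these figures the optimized first-stage heat exchanger occupies roughly one third of the original volume.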

  11. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling

    PubMed Central

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems. PMID:25961028

  12. mRMR-ABC: A Hybrid Gene Selection Algorithm for Cancer Classification Using Microarray Gene Expression Profiling.

    PubMed

    Alshamlan, Hala; Badr, Ghada; Alohali, Yousef

    2015-01-01

    An artificial bee colony (ABC) is a relatively recent swarm intelligence optimization approach. In this paper, we propose the first attempt at applying ABC algorithm in analyzing a microarray gene expression profile. In addition, we propose an innovative feature selection algorithm, minimum redundancy maximum relevance (mRMR), and combine it with an ABC algorithm, mRMR-ABC, to select informative genes from microarray profile. The new approach is based on a support vector machine (SVM) algorithm to measure the classification accuracy for selected genes. We evaluate the performance of the proposed mRMR-ABC algorithm by conducting extensive experiments on six binary and multiclass gene expression microarray datasets. Furthermore, we compare our proposed mRMR-ABC algorithm with previously known techniques. We reimplemented two of these techniques for the sake of a fair comparison using the same parameters. These two techniques are mRMR when combined with a genetic algorithm (mRMR-GA) and mRMR when combined with a particle swarm optimization algorithm (mRMR-PSO). The experimental results prove that the proposed mRMR-ABC algorithm achieves accurate classification performance using small number of predictive genes when tested using both datasets and compared to previously suggested methods. This shows that mRMR-ABC is a promising approach for solving gene selection and cancer classification problems.

  13. ASPsiRNA: A Resource of ASP-siRNAs Having Therapeutic Potential for Human Genetic Disorders and Algorithm for Prediction of Their Inhibitory Efficacy.

    PubMed

    Monga, Isha; Qureshi, Abid; Thakur, Nishant; Gupta, Amit Kumar; Kumar, Manoj

    2017-09-07

    Allele-specific siRNAs (ASP-siRNAs) have emerged as promising therapeutic molecules owing to their selectivity to inhibit the mutant allele or associated single-nucleotide polymorphisms (SNPs) sparing the expression of the wild-type counterpart. Thus, a dedicated bioinformatics platform encompassing updated ASP-siRNAs and an algorithm for the prediction of their inhibitory efficacy will be helpful in tackling currently intractable genetic disorders. In the present study, we have developed the ASPsiRNA resource (http://crdd.osdd.net/servers/aspsirna/) covering three components, viz. (i) ASPsiDb, (ii) ASPsiPred, and (iii) analysis tools like ASP-siOffTar. ASPsiDb is a manually curated database harboring 4543 (including 422 chemically modified) ASP-siRNAs targeting 78 unique genes involved in 51 different diseases. It furnishes comprehensive information from experimental studies on ASP-siRNAs along with multidimensional genetic and clinical information for numerous mutations. ASPsiPred is a two-layered algorithm to predict efficacy of ASP-siRNAs for fully complementary mutant (Eff_mut) and wild-type allele (Eff_wild) with one mismatch by ASPsiPred_SVM and ASPsiPred_matrix, respectively. In ASPsiPred_SVM, 922 unique ASP-siRNAs with experimentally validated quantitative Eff_mut were used. During 10-fold cross-validation (10nCV) employing various sequence features on the training/testing dataset (T737), the best predictive model achieved a maximum Pearson's correlation coefficient (PCC) of 0.71. Further, the accuracy of the classifier to predict Eff_mut against novel genes was assessed by leave one target out cross-validation approach (LOTOCV). ASPsiPred_matrix was constructed from rule-based studies describing the effect of single siRNA:mRNA mismatches on the efficacy at 19 different locations of siRNA.
Thus, ASPsiRNA encompasses the first database, prediction algorithm, and off-target analysis tool that is expected to accelerate research in the field of RNAi-based therapeutics for human genetic diseases. Copyright © 2017 Monga et al.

  14. A genetic algorithm for replica server placement

    NASA Astrophysics Data System (ADS)

    Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

    2012-01-01

    Modern distribution systems use replication to reduce the communication delay experienced by their clients. Several techniques have been developed for web server replica placement. One earlier approach is the Greedy algorithm proposed by Qiu et al., which requires knowledge of the network topology. In this paper, we first introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with the Greedy algorithm of Qiu et al. and with the Optimum algorithm. We found that our approach can achieve better results than the Greedy algorithm, but its computational time is longer.

  15. A genetic algorithm for replica server placement

    NASA Astrophysics Data System (ADS)

    Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

    2011-12-01

    Modern distribution systems use replication to reduce the communication delay experienced by their clients. Several techniques have been developed for web server replica placement. One earlier approach is the Greedy algorithm proposed by Qiu et al., which requires knowledge of the network topology. In this paper, we first introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with the Greedy algorithm of Qiu et al. and with the Optimum algorithm. We found that our approach can achieve better results than the Greedy algorithm, but its computational time is longer.

  16. Ternary alloy material prediction using genetic algorithm and cluster expansion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Chong

    2015-12-01

    This thesis summarizes our study on the crystal structure prediction of the Fe-V-Si system using a genetic algorithm and cluster expansion. Our goal is to explore and look for new stable compounds. We started from the ten currently known experimental phases and calculated the formation energies of those compounds using a density functional theory (DFT) package, namely VASP. The convex hull was generated based on the DFT calculations of the experimentally known phases. Then we did a random search on some metal-rich (Fe and V) compositions and found that the lowest-energy structures had a body-centered cubic (bcc) underlying lattice, under which we did our systematic computational searches using the genetic algorithm and cluster expansion. Among hundreds of the searched compositions, thirteen were selected and their DFT formation energies were obtained with VASP. The stability of those thirteen compounds was checked against the experimental convex hull. We found that the composition 24-8-16, i.e., Fe3VSi2, is a new stable phase, which may inspire future experiments.

  17. Prediction protein structural classes with pseudo-amino acid composition: approximate entropy and hydrophobicity pattern.

    PubMed

    Zhang, Tong-Liang; Ding, Yong-Sheng; Chou, Kuo-Chen

    2008-01-07

    Compared with the conventional amino acid (AA) composition, the pseudo-amino acid (PseAA) composition as originally introduced for protein subcellular location prediction can incorporate much more information of a protein sequence, so as to remarkably enhance the power of using a discrete model to predict various attributes of a protein. In this study, based on the concept of PseAA composition, the approximate entropy and hydrophobicity pattern of a protein sequence are used to characterize the PseAA components. Also, the immune genetic algorithm (IGA) is applied to search the optimal weight factors in generating the PseAA composition. Thus, for a given protein sequence sample, a 27-D (dimensional) PseAA composition is generated as its descriptor. The fuzzy K nearest neighbors (FKNN) classifier is adopted as the prediction engine. The results thus obtained in predicting protein structural classification are quite encouraging, indicating that the current approach may also be used to improve the prediction quality of other protein attributes, or at least can play a complementary role to the existing methods in the relevant areas. Our algorithm is written in Matlab and is available by contacting the corresponding author.

  18. Hybrid genetic algorithm in the Hopfield network for maximum 2-satisfiability problem

    NASA Astrophysics Data System (ADS)

    Kasihmuddin, Mohd Shareduwan Mohd; Sathasivam, Saratha; Mansor, Mohd. Asyraf

    2017-08-01

    Heuristic methods are designed to find optimal solutions more quickly than classical methods, which are too complex to comprehend. In this study, a hybrid approach that uses a Hopfield network and a genetic algorithm to solve the maximum 2-satisfiability problem (MAX-2SAT) is proposed. The Hopfield neural network was used to minimize logical inconsistency in interpretations of logic clauses or programs. The genetic algorithm (GA) has pioneered the implementation of methods that exploit the idea of combination and reproduce a better solution. Simulations with and without the genetic algorithm are examined using Microsoft Visual 2013 C++ Express software. The performance of both search techniques on MAX-2SAT was evaluated based on global minima ratio, ratio of satisfied clauses and computation time. The results obtained from the computer simulation demonstrate the effectiveness and acceleration features of the genetic algorithm for MAX-2SAT in the Hopfield network.
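
    One of the evaluation measures mentioned above, the ratio of satisfied clauses, is easy to state in code. The sketch below is only an illustration under assumptions: literals are encoded DIMACS-style as nonzero integers (k for variable k being true, -k for it being false), and the function name `satisfied_ratio` is hypothetical. It does not model the Hopfield network or the genetic algorithm.

```python
def satisfied_ratio(clauses, assignment):
    """Fraction of 2-literal clauses satisfied by a truth assignment.

    clauses:    list of (lit1, lit2) pairs; literal k means x_k, -k means NOT x_k
                (assumed encoding, not taken from the paper).
    assignment: dict mapping variable index to True/False.
    """
    def holds(literal):
        value = assignment[abs(literal)]
        return value if literal > 0 else not value

    satisfied = sum(1 for a, b in clauses if holds(a) or holds(b))
    return satisfied / len(clauses)


# All four sign combinations over two variables: no assignment can satisfy
# every clause, so the best achievable ratio for this instance is 3/4.
toy_clauses = [(1, 2), (-1, 2), (1, -2), (-1, -2)]
ratio = satisfied_ratio(toy_clauses, {1: True, 2: True})
```

    A MAX-2SAT search, whatever its mechanism, tries to push this ratio as high as the instance allows; the toy instance shows a case where 1.0 is unreachable.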

  19. A modified genetic algorithm with fuzzy roulette wheel selection for job-shop scheduling problems

    NASA Astrophysics Data System (ADS)

    Thammano, Arit; Teekeng, Wannaporn

    2015-05-01

    The job-shop scheduling problem is one of the most difficult production planning problems. Since it is in the NP-hard class, a recent trend in solving the job-shop scheduling problem is shifting towards the use of heuristic and metaheuristic algorithms. This paper proposes a novel metaheuristic algorithm, which is a modification of the genetic algorithm. This proposed algorithm introduces two new concepts to the standard genetic algorithm: (1) fuzzy roulette wheel selection and (2) the mutation operation with tabu list. The proposed algorithm has been evaluated and compared with several state-of-the-art algorithms in the literature. The experimental results on 53 JSSPs show that the proposed algorithm is very effective in solving the combinatorial optimization problems. It outperforms all state-of-the-art algorithms on all benchmark problems in terms of the ability to achieve the optimal solution and the computational time.
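
    For orientation, the standard (non-fuzzy) roulette wheel selection that the paper modifies can be sketched as follows. This is the textbook operator, not the fuzzy variant proposed above, and the names `selection_probabilities` and `roulette_select` are hypothetical. It assumes non-negative fitness values where larger is better, so a makespan to be minimized would first have to be converted into such a fitness.

```python
import random


def selection_probabilities(fitnesses):
    """Normalize non-negative fitness values into selection probabilities."""
    total = sum(fitnesses)
    if total <= 0:
        raise ValueError("at least one fitness value must be positive")
    return [f / total for f in fitnesses]


def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    probabilities = selection_probabilities(fitnesses)
    spin = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for individual, p in zip(population, probabilities):
        cumulative += p
        if spin < cumulative:
            return individual
    return population[-1]  # guards against floating-point rounding


probabilities = selection_probabilities([1.0, 3.0])
picked = roulette_select(["schedule_a", "schedule_b"], [0.0, 5.0], random.Random(0))
```

    An individual with zero fitness is never chosen under this rule, which is one reason selection pressure is often adjusted in practice.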

  20. Application of genetic algorithm to land use optimization for non-point source pollution control based on CLUE-S and SWAT

    NASA Astrophysics Data System (ADS)

    Wang, Qingrui; Liu, Ruimin; Men, Cong; Guo, Lijia

    2018-05-01

    The genetic algorithm (GA) was combined with the Conversion of Land Use and its Effect at Small regional extent (CLUE-S) model to obtain an optimized land use pattern for controlling non-point source (NPS) pollution. The performance of the combination was evaluated. The effect of the optimized land use pattern on the NPS pollution control was estimated by the Soil and Water Assessment Tool (SWAT) model and an assistant map was drawn to support the land use plan for the future. The Xiangxi River watershed was selected as the study area. Two scenarios were used to simulate the land use change. Under the historical trend scenario (Markov chain prediction), the forest area decreased by 2035.06 ha, and was mainly converted into paddy and dryland area. In contrast, under the optimized scenario (genetic algorithm (GA) prediction), up to 3370 ha of dryland area was converted into forest area. Spatially, the conversion of paddy and dryland into forest occurred mainly in the northwest and southeast of the watershed, where the slope land occupied a large proportion. The organic and inorganic phosphorus loads decreased by 3.6% and 3.7%, respectively, in the optimized scenario compared to those in the historical trend scenario. GA showed a better performance in optimized land use prediction. A comparison of the land use patterns in 2010 under the real situation and in 2020 under the optimized situation showed that Shennongjia and Shuiyuesi should convert 1201.76 ha and 1115.33 ha of dryland into forest areas, respectively, which represented the greatest changes in all regions in the watershed. The results of this study indicated that GA and the CLUE-S model can be used to optimize the land use patterns in the future and that SWAT can be used to evaluate the effect of land use optimization on non-point source pollution control. These methods may provide support for land use plan of an area.

  1. Image reconstruction through thin scattering media by simulated annealing algorithm

    NASA Astrophysics Data System (ADS)

    Fang, Longjie; Zuo, Haoyi; Pang, Lin; Yang, Zuogang; Zhang, Xicheng; Zhu, Jianhua

    2018-07-01

    An idea for reconstructing the image of an object behind thin scattering media is proposed by phase modulation. The optimized phase mask is achieved by modulating the scattered light using simulated annealing algorithm. The correlation coefficient is exploited as a fitness function to evaluate the quality of reconstructed image. The reconstructed images optimized from simulated annealing algorithm and genetic algorithm are compared in detail. The experimental results show that our proposed method has better definition and higher speed than genetic algorithm.

  2. A Genetic Algorithm Based Support Vector Machine Model for Blood-Brain Barrier Penetration Prediction

    PubMed Central

    Zhang, Daqing; Xiao, Jianfeng; Zhou, Nannan; Luo, Xiaomin; Jiang, Hualiang; Chen, Kaixian

    2015-01-01

    The blood-brain barrier (BBB) is a highly complex physical barrier determining what substances are allowed to enter the brain. Support vector machine (SVM) is a kernel-based machine learning method that is widely used in QSAR studies. For a successful SVM model, the kernel parameters for SVM and feature subset selection are the most important factors affecting prediction accuracy. In most studies, they are treated as two independent problems, but it has been proven that they could affect each other. We designed and implemented a genetic algorithm (GA) to optimize kernel parameters and feature subset selection for SVM regression and applied it to the BBB penetration prediction. The results show that our GA/SVM model is more accurate than other currently available log BB models. Therefore, to optimize both SVM parameters and feature subset simultaneously with genetic algorithm is a better approach than other methods that treat the two problems separately. Analysis of our log BB model suggests that carboxylic acid group, polar surface area (PSA)/hydrogen-bonding ability, lipophilicity, and molecular charge play important roles in BBB penetration. Among those properties relevant to BBB penetration, lipophilicity could enhance the BBB penetration while all the others are negatively correlated with BBB penetration. PMID:26504797

  3. A novel structure-aware sparse learning algorithm for brain imaging genetics.

    PubMed

    Du, Lei; Jingwen, Yan; Kim, Sungeun; Risacher, Shannon L; Huang, Heng; Inlow, Mark; Moore, Jason H; Saykin, Andrew J; Shen, Li

    2014-01-01

    Brain imaging genetics is an emergent research field where the association between genetic variations such as single nucleotide polymorphisms (SNPs) and neuroimaging quantitative traits (QTs) is evaluated. Sparse canonical correlation analysis (SCCA) is a bi-multivariate analysis method that has the potential to reveal complex multi-SNP-multi-QT associations. Most existing SCCA algorithms are designed using the soft threshold strategy, which assumes that the features in the data are independent from each other. This independence assumption usually does not hold in imaging genetic data, and thus inevitably limits the capability of yielding optimal solutions. We propose a novel structure-aware SCCA (denoted as S2CCA) algorithm to not only eliminate the independence assumption for the input data, but also incorporate group-like structure in the model. Empirical comparison with a widely used SCCA implementation, on both simulated and real imaging genetic data, demonstrated that S2CCA could yield improved prediction performance and biologically meaningful findings.
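
    The soft threshold strategy referred to above is usually an elementwise shrinkage operator. The sketch below shows that common form; whether the SCCA implementations discussed in the abstract use exactly this expression is an assumption, and the name `soft_threshold` is hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink a single coefficient toward zero by lam; small values become exactly 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Each weight is shrunk on its own, without looking at its neighbours.
weights = [2.5, -0.3, -1.5]
shrunk = [soft_threshold(w, 1.0) for w in weights]
```

    Because every coefficient is treated independently, correlated features (for example SNPs in the same linkage block) receive no joint treatment, which is the limitation the structure-aware variant is meant to address.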

  4. Discovery of Novel HIV-1 Integrase Inhibitors Using QSAR-Based Virtual Screening of the NCI Open Database.

    PubMed

    Ko, Gene M; Garg, Rajni; Bailey, Barbara A; Kumar, Sunil

    2016-01-01

    Quantitative structure-activity relationship (QSAR) models can be used as a predictive tool for virtual screening of chemical libraries to identify novel drug candidates. The aims of this paper were to report the results of a study performed for descriptor selection, QSAR model development, and virtual screening for identifying novel HIV-1 integrase inhibitor drug candidates. First, three evolutionary algorithms were compared for descriptor selection: differential evolution-binary particle swarm optimization (DE-BPSO), binary particle swarm optimization, and genetic algorithms. Next, three QSAR models were developed from an ensemble of multiple linear regression, partial least squares, and extremely randomized trees models. A comparison of the performances of three evolutionary algorithms showed that DE-BPSO has a significant improvement over the other two algorithms. QSAR models developed in this study were used in consensus as a predictive tool for virtual screening of the NCI Open Database containing 265,242 compounds to identify potential novel HIV-1 integrase inhibitors. Six compounds were predicted to be highly active (pIC50 > 6) by each of the three models. The use of a hybrid evolutionary algorithm (DE-BPSO) for descriptor selection and QSAR model development in drug design is a novel approach. Consensus modeling may provide better predictivity by taking into account a broader range of chemical properties within the data set conducive for inhibition that may be missed by an individual model. The six compounds identified provide novel drug candidate leads in the design of next generation HIV-1 integrase inhibitors targeting drug resistant mutant viruses.

  5. Hill-Climbing search and diversification within an evolutionary approach to protein structure prediction.

    PubMed

    Chira, Camelia; Horvath, Dragos; Dumitrescu, D

    2011-07-30

    Proteins are complex structures made of amino acids having a fundamental role in the correct functioning of living cells. The structure of a protein is the result of the protein folding process. However, the general principles that govern the folding of natural proteins into a native structure are unknown. The problem of predicting a protein structure with minimum-energy starting from the unfolded amino acid sequence is a highly complex and important task in molecular and computational biology. Protein structure prediction has important applications in fields such as drug design and disease prediction. The protein structure prediction problem is NP-hard even in simplified lattice protein models. An evolutionary model based on hill-climbing genetic operators is proposed for protein structure prediction in the hydrophobic - polar (HP) model. Problem-specific search operators are implemented and applied using a steepest-ascent hill-climbing approach. Furthermore, the proposed model enforces an explicit diversification stage during the evolution in order to avoid local optimum. The main features of the resulting evolutionary algorithm - hill-climbing mechanism and diversification strategy - are evaluated in a set of numerical experiments for the protein structure prediction problem to assess their impact to the efficiency of the search process. Furthermore, the emerging consolidated model is compared to relevant algorithms from the literature for a set of difficult bidimensional instances from lattice protein models. The results obtained by the proposed algorithm are promising and competitive with those of related methods.
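
    The objective minimized in the two-dimensional HP lattice model can be written down compactly: each pair of hydrophobic (H) residues that are lattice neighbours without being consecutive in the chain contributes -1. The sketch below assumes a square lattice and uses the hypothetical name `hp_energy`; it covers only the energy function, not the hill-climbing operators or the diversification stage described above.

```python
def hp_energy(sequence, coords):
    """Energy of a 2D square-lattice HP conformation.

    sequence: string over {'H', 'P'}.
    coords:   list of (x, y) integer lattice positions, one per residue.
    Returns minus the number of non-consecutive H-H lattice contacts.
    """
    if len(sequence) != len(coords):
        raise ValueError("sequence and coords must have the same length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbours")

    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                (x1, y1), (x2, y2) = coords[i], coords[j]
                if abs(x1 - x2) + abs(y1 - y2) == 1:
                    energy -= 1
    return energy


# The same chain folded into a square (one H-H contact) and stretched out (none).
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

    A search operator in this model proposes a new list of coordinates and keeps it when the energy does not get worse, which is where the hill-climbing step comes in.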

  6. Optimizing multiple sequence alignments using a genetic algorithm based on three objectives: structural information, non-gaps percentage and totally conserved columns.

    PubMed

    Ortuño, Francisco M; Valenzuela, Olga; Rojas, Fernando; Pomares, Hector; Florido, Javier P; Urquiza, Jose M; Rojas, Ignacio

    2013-09-01

    Multiple sequence alignments (MSAs) are widely used approaches in bioinformatics to carry out other tasks such as structure predictions, biological function analyses or phylogenetic modeling. However, current tools usually provide partially optimal alignments, as each one is focused on specific biological features. Thus, the same set of sequences can produce different alignments, above all when sequences are less similar. Consequently, researchers and biologists do not agree about which is the most suitable way to evaluate MSAs. Recent evaluations tend to use more complex scores including further biological features. Among them, 3D structures are increasingly being used to evaluate alignments. Because structures are more conserved in proteins than sequences, scores with structural information are better suited to evaluate more distant relationships between sequences. The proposed multiobjective algorithm, based on the non-dominated sorting genetic algorithm, aims to jointly optimize three objectives: STRIKE score, non-gaps percentage and totally conserved columns. It was significantly assessed on the BAliBASE benchmark according to the Kruskal-Wallis test (P < 0.01). This algorithm also outperforms other aligners, such as ClustalW, Multiple Sequence Alignment Genetic Algorithm (MSA-GA), PRRP, DIALIGN, Hidden Markov Model Training (HMMT), Pattern-Induced Multi-sequence Alignment (PIMA), MULTIALIGN, Sequence Alignment Genetic Algorithm (SAGA), PILEUP, Rubber Band Technique Genetic Algorithm (RBT-GA) and Vertical Decomposition Genetic Algorithm (VDGA), according to the Wilcoxon signed-rank test (P < 0.05), whereas it shows results not significantly different to 3D-COFFEE (P > 0.05) with the advantage of being able to use fewer structures. Structural information is included within the objective function to evaluate more accurately the obtained alignments. The source code is available at http://www.ugr.es/~fortuno/MOSAStrE/MO-SAStrE.zip.
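
    Two of the three objectives listed above have straightforward definitions that can be sketched directly. The code below is an illustration under assumptions: gaps are written as "-", all rows have equal length, and the exact normalization used by the authors may differ. The function names are hypothetical, and the STRIKE score, which needs structural data, is not covered.

```python
def non_gap_percentage(alignment):
    """Percentage of alignment cells that hold a residue instead of a gap."""
    total_cells = sum(len(row) for row in alignment)
    gap_cells = sum(row.count("-") for row in alignment)
    return 100.0 * (total_cells - gap_cells) / total_cells


def totally_conserved_columns(alignment):
    """Number of columns in which every sequence shows the same residue and no gap."""
    return sum(
        1
        for column in zip(*alignment)
        if "-" not in column and len(set(column)) == 1
    )


toy_alignment = [
    "AC-GT",
    "ACTGT",
    "AC-GA",
]
non_gaps = non_gap_percentage(toy_alignment)
conserved = totally_conserved_columns(toy_alignment)
```

    In a multiobjective setting such as the one described, these values would be kept as separate objectives and compared by dominance, not summed into one score.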

  7. [Clinical applications of dosing algorithm in the predication of warfarin maintenance dose].

    PubMed

    Huang, Sheng-wen; Xiang, Dao-kang; An, Bang-quan; Li, Gui-fang; Huang, Ling; Wu, Hai-li

    2011-12-27

    To evaluate the feasibility of clinical application of a genetics-based dosing algorithm for predicting the warfarin maintenance dose in a Chinese population. Clinical data were collected and blood samples were taken from a total of 126 patients undergoing heart valve replacement. The genotypes of VKORC1 and CYP2C9 were determined by melting curve analysis after PCR. The patients were divided randomly into study and control groups. In the study group, the first three doses of warfarin were prescribed according to the predicted warfarin maintenance dose, while warfarin was initiated at 2.5 mg/d in the control group. The warfarin doses were adjusted according to the measured international normalized ratio (INR) values. All subjects were followed for 50 days after the initiation of warfarin therapy. At the end of the 50-day follow-up period, the proportions of patients on a stable dose were 82.4% (42/51) and 62.5% (30/48) for the study and control groups respectively. The mean durations of reaching a stable dose of warfarin were (27.5 ± 1.8) and (34.7 ± 1.8) days and the median durations were (24.0 ± 1.7) and (33.0 ± 4.5) days in the study and control groups respectively. Significant differences existed in the durations of reaching a stable dose between the two groups (P = 0.012). Compared with the control group, the hazard ratio (HR) for the duration of reaching a stable dose was 1.786 in the study group (95%CI 1.088 - 2.875, P = 0.026). The dosing algorithm incorporating genetic and non-genetic factors may shorten the time needed to reach a stable dose of warfarin. The present study validates the feasibility of its clinical application.

  8. Assessment of Genetic and Nongenetic Interactions for the Prediction of Depressive Symptomatology: An Analysis of the Wisconsin Longitudinal Study Using Machine Learning Algorithms

    PubMed Central

    Roetker, Nicholas S.; Yonker, James A.; Chang, Vicky; Roan, Carol L.; Herd, Pamela; Hauser, Taissa S.; Hauser, Robert M.

    2013-01-01

    Objectives. We examined depression within a multidimensional framework consisting of genetic, environmental, and sociobehavioral factors and, using machine learning algorithms, explored interactions among these factors that might better explain the etiology of depressive symptoms. Methods. We measured current depressive symptoms using the Center for Epidemiologic Studies Depression Scale (n = 6378 participants in the Wisconsin Longitudinal Study). Genetic factors were 78 single nucleotide polymorphisms (SNPs); environmental factors—13 stressful life events (SLEs), plus a composite proportion of SLEs index; and sociobehavioral factors—18 personality, intelligence, and other health or behavioral measures. We performed traditional SNP associations via logistic regression likelihood ratio testing and explored interactions with support vector machines and Bayesian networks. Results. After correction for multiple testing, we found no significant single genotypic associations with depressive symptoms. Machine learning algorithms showed no evidence of interactions. Naïve Bayes produced the best models in both subsets and included only environmental and sociobehavioral factors. Conclusions. We found no single or interactive associations with genetic factors and depressive symptoms. Various environmental and sociobehavioral factors were more predictive of depressive symptoms, yet their impacts were independent of one another. A genome-wide analysis of genetic alterations using machine learning methodologies will provide a framework for identifying genetic–environmental–sociobehavioral interactions in depressive symptoms. PMID:23927508

  9. A Hybrid Color Space for Skin Detection Using Genetic Algorithm Heuristic Search and Principal Component Analysis Technique

    PubMed Central

    2015-01-01

    Color is one of the most prominent features of an image and used in many skin and face detection applications. Color space transformation is widely used by researchers to improve face and skin detection performance. Despite the substantial research efforts in this area, choosing a proper color space in terms of skin and face classification performance which can address issues like illumination variations, various camera characteristics and diversity in skin color tones has remained an open issue. This research proposes a new three-dimensional hybrid color space termed SKN by employing the Genetic Algorithm heuristic and Principal Component Analysis to find the optimal representation of human skin color in over seventeen existing color spaces. Genetic Algorithm heuristic is used to find the optimal color component combination setup in terms of skin detection accuracy while the Principal Component Analysis projects the optimal Genetic Algorithm solution to a less complex dimension. Pixel wise skin detection was used to evaluate the performance of the proposed color space. We have employed four classifiers including Random Forest, Naïve Bayes, Support Vector Machine and Multilayer Perceptron in order to generate the human skin color predictive model. The proposed color space was compared to some existing color spaces and shows superior results in terms of pixel-wise skin detection accuracy. Experimental results show that by using Random Forest classifier, the proposed SKN color space obtained an average F-score and True Positive Rate of 0.953 and False Positive Rate of 0.0482 which outperformed the existing color spaces in terms of pixel wise skin detection accuracy. The results also indicate that among the classifiers used in this study, Random Forest is the most suitable classifier for pixel wise skin detection applications. PMID:26267377
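
    The three figures quoted above (F-score, True Positive Rate, False Positive Rate) all derive from pixel-level confusion counts. The sketch below shows the standard definitions; the function name `detection_metrics` and the example counts are hypothetical and are not data from the study. It assumes both classes are present and at least one pixel is predicted as skin, so no denominator is zero.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary-classification rates from confusion-matrix counts.

    tp/fp: skin predictions that are correct / incorrect.
    tn/fn: non-skin predictions that are correct / incorrect.
    """
    tpr = tp / (tp + fn)          # recall, or sensitivity
    fpr = fp / (fp + tn)          # share of non-skin pixels flagged as skin
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)
    return {"tpr": tpr, "fpr": fpr, "precision": precision, "f_score": f_score}


# Made-up counts for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

    Reporting the False Positive Rate next to the F-score matters for skin detection because non-skin pixels usually dominate an image, so even a small rate can mean many false detections.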

  10. Minimalist ensemble algorithms for genome-wide protein localization prediction.

    PubMed

    Lin, Jhih-Rong; Mondal, Ananda Mohan; Liu, Rong; Hu, Jianjun

    2012-07-03

    Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors. 
We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi.

  11. Minimalist ensemble algorithms for genome-wide protein localization prediction

    PubMed Central

    2012-01-01

    Background: Computational prediction of protein subcellular localization can greatly help to elucidate its functions. Despite the existence of dozens of protein localization prediction algorithms, the prediction accuracy and coverage are still low. Several ensemble algorithms have been proposed to improve the prediction performance, which usually include as many as 10 or more individual localization algorithms. However, their performance is still limited by the running complexity and redundancy among individual prediction algorithms. Results: This paper proposed a novel method for rational design of minimalist ensemble algorithms for practical genome-wide protein subcellular localization prediction. The algorithm is based on combining a feature selection based filter and a logistic regression classifier. Using a novel concept of contribution scores, we analyzed issues of algorithm redundancy, consensus mistakes, and algorithm complementarity in designing ensemble algorithms. We applied the proposed minimalist logistic regression (LR) ensemble algorithm to two genome-wide datasets of Yeast and Human and compared its performance with current ensemble algorithms. Experimental results showed that the minimalist ensemble algorithm can achieve high prediction accuracy with only 1/3 to 1/2 of individual predictors of current ensemble algorithms, which greatly reduces computational complexity and running time. It was found that the high performance ensemble algorithms are usually composed of the predictors that together cover most of available features. Compared to the best individual predictor, our ensemble algorithm improved the prediction accuracy from AUC score of 0.558 to 0.707 for the Yeast dataset and from 0.628 to 0.646 for the Human dataset. Compared with popular weighted voting based ensemble algorithms, our classifier-based ensemble algorithms achieved much better performance without suffering from inclusion of too many individual predictors.
Conclusions: We proposed a method for rational design of minimalist ensemble algorithms using feature selection and classifiers. The proposed minimalist ensemble algorithm based on logistic regression can achieve equal or better prediction performance while using only half or one-third of individual predictors compared to other ensemble algorithms. The results also suggested that meta-predictors that take advantage of a variety of features by combining individual predictors tend to achieve the best performance. The LR ensemble server and related benchmark datasets are available at http://mleg.cse.sc.edu/LRensemble/cgi-bin/predict.cgi. PMID:22759391

  12. A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

    PubMed

    Baur, Brittany; Bozdag, Serdar

    2016-01-01

    DNA methylation is an important epigenetic event that affects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
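
    The sequential forward selection loop mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `sequential_forward_selection`, the `score` callback, the probe names, and the toy coverage-based score are all hypothetical. In the study the score would come from a classifier such as K-Nearest Neighbors evaluated against mRNA expression; a simple stand-in is used here to keep the example self-contained.

```python
def sequential_forward_selection(candidates, score, max_features):
    """Greedily add the candidate that raises score(subset) the most.

    Stops when max_features is reached or no candidate improves the score.
    """
    selected = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining and len(selected) < max_features:
        # Score every one-feature extension of the current subset.
        trials = [(score(selected + [c]), c) for c in remaining]
        top_score, top = max(trials, key=lambda pair: pair[0])
        if top_score <= best_score:
            break  # no extension helps, so stop early
        selected.append(top)
        remaining.remove(top)
        best_score = top_score
    return selected, best_score


# Hypothetical toy data: each probe "explains" a set of samples.
PROBE_INFO = {"p1": {1, 2, 3}, "p2": {3, 4}, "p3": {1, 2}, "p4": {5}}


def toy_score(subset):
    """Stand-in for classifier accuracy: coverage minus a size penalty."""
    covered = set().union(*(PROBE_INFO[p] for p in subset))
    return len(covered) - 0.5 * len(subset)
```

    With the toy data above, `sequential_forward_selection(list(PROBE_INFO), toy_score, max_features=4)` stops after three probes, because adding the fourth no longer raises the score.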

  13. Investigation on application of genetic algorithms to optimal reactive power dispatch of power systems

    NASA Astrophysics Data System (ADS)

    Wu, Q. H.; Ma, J. T.

    1993-09-01

    A primary investigation into the application of genetic algorithms in optimal reactive power dispatch and voltage control is presented. The application was achieved, based on the (United Kingdom) National Grid 48 bus network model, using a novel genetic search approach. Simulation results, compared with those obtained using nonlinear programming methods, are included to show the potential of the genetic search methodology for the economical and secure operation of power systems.

  14. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM)

    PubMed Central

    Skinnider, Michael A.; Dejong, Chris A.; Rees, Philip N.; Johnston, Chad W.; Li, Haoxin; Webster, Andrew L. H.; Wyatt, Morgan A.; Magarvey, Nathan A.

    2015-01-01

    Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. PMID:26442528

  15. On the suitability of different representations of solid catalysts for combinatorial library design by genetic algorithms.

    PubMed

    Gobin, Oliver C; Schüth, Ferdi

    2008-01-01

    Genetic algorithms are widely used to solve and optimize combinatorial problems and are more often applied for library design in combinatorial chemistry. Because of their flexibility, however, their implementation can be challenging. In this study, the influence of the representation of solid catalysts on the performance of genetic algorithms was systematically investigated on the basis of a new, constrained, multiobjective, combinatorial test problem with properties common to problems in combinatorial materials science. Constraints were satisfied by penalty functions, repair algorithms, or special representations. The tests were performed using three state-of-the-art evolutionary multiobjective algorithms by performing 100 optimization runs for each algorithm and test case. Experimental data obtained during the optimization of a noble metal-free solid catalyst system active in the selective catalytic reduction of nitric oxide with propene was used to build up a predictive model to validate the results of the theoretical test problem. A significant influence of the representation on the optimization performance was observed. Binary encodings were found to be the preferred encoding in most of the cases, and depending on the experimental test unit, repair algorithms or penalty functions performed best.

  16. Design of Quiet Rotorcraft Approach Trajectories

    NASA Technical Reports Server (NTRS)

    Padula, Sharon L.; Burley, Casey L.; Boyd, D. Douglas, Jr.; Marcolini, Michael A.

    2009-01-01

    An optimization procedure for identifying quiet rotorcraft approach trajectories is proposed and demonstrated. The procedure employs a multi-objective genetic algorithm in order to reduce noise and create approach paths that will be acceptable to pilots and passengers. The concept is demonstrated by application to two different helicopters. The optimized paths are compared with one another and to a standard 6-deg approach path. The two demonstration cases validate the optimization procedure but highlight the need for improved noise prediction techniques and for additional rotorcraft acoustic data sets.

  17. A comparative study of electrochemical machining process parameters by using GA and Taguchi method

    NASA Astrophysics Data System (ADS)

    Soni, S. K.; Thomas, B.

    2017-11-01

    In electrochemical machining, the quality of the machined surface depends strongly on the selection of optimal parameter settings. This work applies the Taguchi method and a genetic algorithm in MATLAB to maximize the metal removal rate and minimize the surface roughness and overcut. A comparative study is presented for drilling of LM6 AL/B4C composites, examining the impact of machining process parameters such as electrolyte concentration (g/l), machining voltage (V), and frequency (Hz) on the response parameters (surface roughness, material removal rate, and overcut). A Taguchi L27 orthogonal array was chosen in Minitab 17 software for the investigation of experimental results, and multiobjective optimization by genetic algorithm was carried out in MATLAB. The optimized results obtained from the Taguchi method and the genetic algorithm are then compared.

  18. Self-Tuning of Design Variables for Generalized Predictive Control

    NASA Technical Reports Server (NTRS)

    Lin, Chaung; Juang, Jer-Nan

    2000-01-01

    Three techniques are introduced to determine the order and control weighting for the design of a generalized predictive controller. These techniques are based on the application of fuzzy logic, genetic algorithms, and simulated annealing to conduct an optimal search on specific performance indexes or objective functions. Fuzzy logic is found to be feasible for real-time and on-line implementation due to its smooth and quick convergence. On the other hand, genetic algorithms and simulated annealing are applicable for initial estimation of the model order and control weighting, and final fine-tuning within a small region of the solution space. Several numerical simulations for a multiple-input and multiple-output system are given to illustrate the techniques developed in this paper.

  19. Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability.

    PubMed

    Hossain, Monowar; Mekhilef, Saad; Afifi, Firdaus; Halabi, Laith M; Olatomiwa, Lanre; Seyedmahmoudian, Mehdi; Horan, Ben; Stojcevski, Alex

    2018-01-01

    In this paper, the suitability and performance of ANFIS (adaptive neuro-fuzzy inference system), ANFIS-PSO (particle swarm optimization), ANFIS-GA (genetic algorithm) and ANFIS-DE (differential evolution) have been investigated for the prediction of monthly and weekly wind power density (WPD) at four locations in Malaysia: Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. For this aim, standalone ANFIS, ANFIS-PSO, ANFIS-GA and ANFIS-DE prediction algorithms are developed on the MATLAB platform. The performance of the proposed hybrid ANFIS models is determined by computing different statistical parameters such as mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R2). ANFIS-PSO and ANFIS-GA achieve higher performance and accuracy than the other models, and they can be suggested for practical application to predict monthly and weekly mean wind power density. In addition, the capability of the proposed hybrid ANFIS models to predict wind data for locations where measured wind data are not available is examined, and the results are compared with the measured wind data from nearby stations.
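
    The four statistical parameters named in this abstract can be computed as in the sketch below. The abstract does not give the formulas, so the common textbook definitions are assumed here; the function names and the sample numbers are illustrative only and are not taken from the paper.

```python
import math


def mabe(observed, predicted):
    """Mean absolute bias error: average of |predicted - observed|."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error in percent (observed values must be nonzero)."""
    total = sum(abs((p - o) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    total = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(total / len(observed))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up wind power density values (W/m^2), for illustration only.
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
```

    Some studies report R2 as the squared correlation coefficient instead; the two definitions agree only for certain model classes, so the choice should be checked against the paper before comparing numbers.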

  20. Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability

    PubMed Central

    Mekhilef, Saad; Afifi, Firdaus; Halabi, Laith M.; Olatomiwa, Lanre; Seyedmahmoudian, Mehdi; Stojcevski, Alex

    2018-01-01

    In this paper, the suitability and performance of ANFIS (adaptive neuro-fuzzy inference system), ANFIS-PSO (particle swarm optimization), ANFIS-GA (genetic algorithm) and ANFIS-DE (differential evolution) have been investigated for the prediction of monthly and weekly wind power density (WPD) at four locations in Malaysia: Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas. For this aim, standalone ANFIS, ANFIS-PSO, ANFIS-GA and ANFIS-DE prediction algorithms are developed on the MATLAB platform. The performance of the proposed hybrid ANFIS models is determined by computing different statistical parameters such as mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R2). ANFIS-PSO and ANFIS-GA achieve higher performance and accuracy than the other models, and they can be suggested for practical application to predict monthly and weekly mean wind power density. In addition, the capability of the proposed hybrid ANFIS models to predict wind data for locations where measured wind data are not available is examined, and the results are compared with the measured wind data from nearby stations. PMID:29702645

  1. Prediction of dynamical systems by symbolic regression

    NASA Astrophysics Data System (ADS)

    Quade, Markus; Abel, Markus; Shafi, Kamran; Niven, Robert K.; Noack, Bernd R.

    2016-07-01

    We study the modeling and prediction of dynamical systems based on conventional models derived from measurements. Such algorithms are highly desirable in situations where the underlying dynamics are hard to model from physical principles or simplified models need to be found. We focus on symbolic regression methods as a part of machine learning. These algorithms are capable of learning an analytically tractable model from data, a highly valuable property. Symbolic regression methods can be considered as generalized regression methods. We investigate two particular algorithms, the so-called fast function extraction which is a generalized linear regression algorithm, and genetic programming which is a very general method. Both are able to combine functions in a certain way such that a good model for the prediction of the temporal evolution of a dynamical system can be identified. We illustrate the algorithms by finding a prediction for the evolution of a harmonic oscillator based on measurements, by detecting an arriving front in an excitable system, and as a real-world application, the prediction of solar power production based on energy production observations at a given site together with the weather forecast.
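
    The idea of treating symbolic regression as generalized linear regression over a library of candidate functions can be sketched as below. This is a simplified illustration, not the algorithm from the paper: it uses plain least squares through the normal equations and omits any regularization or pruning of basis functions that a full method may apply. The basis library, the synthetic harmonic-oscillator signal, and all function names are assumptions made for the example.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_basis(ts, ys, basis):
    """Least-squares coefficients of y(t) ~ sum_k c_k * basis[k](t)."""
    rows = [[f(t) for f in basis] for t in ts]
    k = len(basis)
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(ata, aty)


# Hypothetical basis library: constant, linear, sine, cosine.
BASIS = [lambda t: 1.0, lambda t: t, math.sin, math.cos]

# Synthetic "measurements" of a harmonic oscillator (noise-free for clarity).
ts = [0.1 * i for i in range(100)]
ys = [2.0 * math.cos(t) + 0.5 * math.sin(t) for t in ts]
coeffs = fit_basis(ts, ys, BASIS)  # close to [0, 0, 0.5, 2]
```

    Because the fitted model is a weighted sum of named functions, the result stays analytically readable, which is the property the abstract highlights.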

  2. Series Hybrid Electric Vehicle Power System Optimization Based on Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Zhu, Tianjun; Li, Bin; Zong, Changfu; Wu, Yang

    2017-09-01

    Hybrid electric vehicles (HEV), compared with conventional vehicles, have complex structures and more component parameters. Optimizing all of these parameters as design variables would increase the difficulty of the problem and hamper convergence of the algorithm, so this paper selects the parameters that have a major influence on vehicle fuel consumption so that the components work at maximum efficiency. First, models of the HEV powertrain components are built. Second, taking a series hybrid structure as an example, a genetic algorithm is used to optimize fuel consumption and emissions. Simulation results in ADVISOR verify the feasibility of the proposed genetic optimization algorithm.

  3. Modelling soil water retention using support vector machines with genetic algorithm optimisation.

    PubMed

    Lamorski, Krzysztof; Sławiński, Cezary; Moreno, Felix; Barna, Gyöngyi; Skierucha, Wojciech; Arrue, José L

    2014-01-01

    This work presents point pedotransfer function (PTF) models of the soil water retention curve. The developed models allowed for estimation of the soil water content for the specified soil water potentials: -0.98, -3.10, -9.81, -31.02, -491.66, and -1554.78 kPa, based on the following soil characteristics: soil granulometric composition, total porosity, and bulk density. Support Vector Machines (SVM) methodology was used for model development. A new methodology for elaboration of retention function models is proposed. In contrast to previous attempts known from the literature, the ν-SVM method was used for model development and the results were compared with the formerly used C-SVM method. Genetic algorithms were used as an optimisation framework for the search of model parameters. A new form of the aim function used for the model parameter search is proposed, which allowed for development of models with better prediction capabilities. This new aim function avoids the overestimation of models that is typically encountered when root mean squared error is used as an aim function. The elaborated models showed good agreement with measured soil water retention data. The achieved coefficients of determination were in the range 0.67-0.92. The studies demonstrated the usability of the ν-SVM methodology together with genetic algorithm optimisation for retention modelling, which gave better performing models than other tested approaches.

  4. 4D-Qsar Study of Some Pyrazole Pyridine Carboxylic Acid Derivatives by Electron Conformational-Genetic Algorithm Method.

    PubMed

    Tuzun, Burak; Yavuz, Sevtap Caglar; Sabanci, Nazmiye; Saripinar, Emin

    2018-05-13

    In the present work, pharmacophore identification and biological activity prediction for 86 pyrazole pyridine carboxylic acid derivatives were made using the electron conformational genetic algorithm approach, which we introduced as a 4D-QSAR analysis in recent years. In the light of the data obtained from quantum chemical calculations at HF/6-311 G** level, the electron conformational matrices of congruity (ECMC) were constructed by EMRE software. By comparing the matrices, the electron conformational submatrix of activity (ECSA, Pha), which is common to these compounds within a minimum tolerance, was revealed. A parameter pool was generated considering the obtained pharmacophore. To determine the theoretical biological activity of molecules and identify the best subset of variables affecting bioactivities, we used the nonlinear least square regression method and genetic algorithm. The results obtained in this study are in good agreement with the experimental data presented in the literature. The model for training and test sets attained by the optimum 12 parameters gave highly satisfactory results with R2training = 0.889, q2 = 0.839, SEtraining = 0.066, q2ext1 = 0.770, q2ext2 = 0.750, q2ext3 = 0.824, ccctr = 0.941, ccctest = 0.869 and cccall = 0.927.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tumuluru, Jaya Shankar; McCulloch, Richard Chet James

    In this work a new hybrid genetic algorithm was developed which combines a rudimentary adaptive steepest ascent hill climbing algorithm with a sophisticated evolutionary algorithm in order to optimize complex multivariate design problems. By combining a highly stochastic algorithm (evolutionary) with a simple deterministic optimization algorithm (adaptive steepest ascent) computational resources are conserved and the solution converges rapidly when compared to either algorithm alone. In genetic algorithms natural selection is mimicked by random events such as breeding and mutation. In the adaptive steepest ascent algorithm each variable is perturbed by a small amount and the variable that caused the most improvement is incremented by a small step. If the direction of most benefit is exactly opposite of the previous direction with the most benefit then the step size is reduced by a factor of 2, thus the step size adapts to the terrain. A graphical user interface was created in MATLAB to provide an interface between the hybrid genetic algorithm and the user. Additional features such as bounding the solution space and weighting the objective functions individually are also built into the interface. The algorithm developed was tested to optimize the functions developed for a wood pelleting process. Using process variables (such as feedstock moisture content, die speed, and preheating temperature) pellet properties were appropriately optimized. Specifically, variables were found which maximized unit density, bulk density, tapped density, and durability while minimizing pellet moisture content and specific energy consumption. The time and computational resources required for the optimization were dramatically decreased using the hybrid genetic algorithm when compared to MATLAB's native evolutionary optimization tool.
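
    The adaptive steepest ascent step described in this abstract can be sketched as follows. This is an interpretation under stated assumptions, not the authors' code: the function name, the separate `probe` and `step` sizes, and the stopping rules are hypothetical. One deliberate deviation is labeled in the code: a reversal is tracked per variable rather than only against the immediately preceding move, which avoids cycling when moves alternate between variables.

```python
def adaptive_steepest_ascent(objective, start, step=1.0, probe=1e-4,
                             min_step=1e-6, max_iter=10_000):
    """Maximize objective(x) by moving one variable at a time."""
    x = list(start)
    last_sign = [0] * len(x)  # assumption: last move direction per variable
    for _ in range(max_iter):
        if step < min_step:
            break
        base = objective(x)
        best_move, best_gain = None, 0.0
        # Perturb each variable by a small amount in both directions.
        for i in range(len(x)):
            for sign in (1, -1):
                trial = list(x)
                trial[i] += sign * probe
                gain = objective(trial) - base
                if gain > best_gain:
                    best_move, best_gain = (i, sign), gain
        if best_move is None:
            break  # no perturbation improves: optimum at probe resolution
        i, sign = best_move
        if last_sign[i] == -sign:
            step /= 2.0  # direction reversed, so halve the step size
        x[i] += sign * step
        last_sign[i] = sign
    return x, objective(x)


def example_objective(v):
    """Toy concave objective with its maximum at (3, -1)."""
    return -(v[0] - 3.0) ** 2 - (v[1] + 1.0) ** 2
```

    In a hybrid scheme like the one the abstract outlines, a routine of this kind would refine candidates produced by the evolutionary search; how the two are interleaved is not specified in the abstract and is left out here.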

  6. Genomic selection and complex trait prediction using a fast EM algorithm applied to genome-wide markers

    PubMed Central

    2010-01-01

    Background The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However a full Bayesian analysis is often not feasible due to the large computational time involved. Results This article proposes an expectation-maximization (EM) algorithm called emBayesB which allows only a proportion of SNP to be in LD with QTL and incorporates prior information about the distribution of SNP effects. The posterior probability of being in LD with at least one QTL is calculated for each SNP along with estimates of the hyperparameters for the mixture prior. A simulated example of genomic selection from an international workshop is used to demonstrate the features of the EM algorithm. The accuracy of prediction is comparable to a full Bayesian analysis but the EM algorithm is considerably faster. The EM algorithm was accurate in locating QTL which explained more than 1% of the total genetic variation. A computational algorithm for very large SNP panels is described. Conclusions emBayesB is a fast and accurate EM algorithm for implementing genomic selection and predicting complex traits by mapping QTL in genome-wide dense SNP marker data. Its accuracy is similar to Bayesian methods but it takes only a fraction of the time. PMID:20969788

  7. Optimal placement of tuning masses on truss structures by genetic algorithms

    NASA Technical Reports Server (NTRS)

    Ponslet, Eric; Haftka, Raphael T.; Cudney, Harley H.

    1993-01-01

    Optimal placement of tuning masses, actuators and other peripherals on large space structures is a combinatorial optimization problem. This paper surveys several techniques for solving this problem. The genetic algorithm approach to the solution of the placement problem is described in detail. An example of minimizing the difference between the two lowest frequencies of a laboratory truss by adding tuning masses is used for demonstrating some of the advantages of genetic algorithms. The relative efficiencies of different codings are compared using the results of a large number of optimization runs.

  8. Context-sensitive network-based disease genetics prediction and its implications in drug discovery.

    PubMed

    Chen, Yang; Xu, Rong

    2017-04-01

    Disease phenotype networks play an important role in computational approaches to identifying new disease-gene associations. Current disease phenotype networks often model disease relationships based on pairwise similarities, therefore ignore the specific context on how two diseases are connected. In this study, we propose a new strategy to model disease associations using context-sensitive networks (CSNs). We developed a CSN-based phenome-driven approach for disease genetics prediction, and investigated the translational potential of the predicted genes in drug discovery. We constructed CSNs by directly connecting diseases with associated phenotypes. Here, we constructed two CSNs using different data sources; the two networks contain 26 790 and 13 822 nodes respectively. We integrated the CSNs with a genetic functional relationship network and predicted disease genes using a network-based ranking algorithm. For comparison, we built Similarity-Based disease Networks (SBN) using the same disease phenotype data. In a de novo cross validation for 3324 diseases, the CSN-based approach significantly increased the average rank from top 12.6 to top 8.8% for all tested genes comparing with the SBN-based approach ( p

  9. Determination of thiamine HCl and pyridoxine HCl in pharmaceutical preparations using UV-visible spectrophotometry and genetic algorithm based multivariate calibration methods.

    PubMed

    Ozdemir, Durmus; Dinc, Erdal

    2004-07-01

    Simultaneous determination of binary mixtures pyridoxine hydrochloride and thiamine hydrochloride in a vitamin combination using UV-visible spectrophotometry and classical least squares (CLS) and three newly developed genetic algorithm (GA) based multivariate calibration methods was demonstrated. The three genetic multivariate calibration methods are Genetic Classical Least Squares (GCLS), Genetic Inverse Least Squares (GILS) and Genetic Regression (GR). The sample data set contains the UV-visible spectra of 30 synthetic mixtures (8 to 40 microg/ml) of these vitamins and 10 tablets containing 250 mg from each vitamin. The spectra cover the range from 200 to 330 nm in 0.1 nm intervals. Several calibration models were built with the four methods for the two components. Overall, the standard error of calibration (SEC) and the standard error of prediction (SEP) for the synthetic data were in the range of <0.01 and 0.43 microg/ml for all the four methods. The SEP values for the tablets were in the range of 2.91 and 11.51 mg/tablets. A comparison of genetic algorithm selected wavelengths for each component using GR method was also included.

  10. Pile-up correction by Genetic Algorithm and Artificial Neural Network

    NASA Astrophysics Data System (ADS)

    Kafaee, M.; Saramad, S.

    2009-08-01

    Pile-up distortion is a common problem for high counting rates radiation spectroscopy in many fields such as industrial, nuclear and medical applications. It is possible to reduce pulse pile-up using hardware-based pile-up rejections. However, this phenomenon may not be eliminated completely by this approach and the spectrum distortion caused by pile-up rejection can be increased as well. In addition, inaccurate correction or rejection of pile-up artifacts in applications such as energy dispersive X-ray (EDX) spectrometers can lead to losses of counts, will give poor quantitative results and even false element identification. Therefore, it is highly desirable to use software-based models to predict and correct any recognized pile-up signals in data acquisition systems. The present paper describes two new intelligent approaches for pile-up correction; the Genetic Algorithm (GA) and Artificial Neural Networks (ANNs). The validation and testing results of these new methods have been compared, which shows excellent agreement with the measured data with 60Co source and NaI detector. The Monte Carlo simulation of these new intelligent algorithms also shows their advantages over hardware-based pulse pile-up rejection methods.

  11. Application of XGBoost algorithm in hourly PM2.5 concentration prediction

    NASA Astrophysics Data System (ADS)

    Pan, Bingyue

    2018-02-01

    In view of prediction techniques of hourly PM2.5 concentration in China, this paper applied the XGBoost (Extreme Gradient Boosting) algorithm to predict hourly PM2.5 concentration. The monitoring data of air quality in Tianjin city was analyzed by using the XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentration using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression models using computational results. The results demonstrate that the XGBoost algorithm outperforms other data mining methods.

  12. Pose estimation for augmented reality applications using genetic algorithm.

    PubMed

    Yu, Ying Kin; Wong, Kin Hong; Chang, Michael Ming Yuen

    2005-12-01

    This paper describes a genetic algorithm that tackles the pose-estimation problem in computer vision. Our genetic algorithm can find the rotation and translation of an object accurately when the three-dimensional structure of the object is given. In our implementation, each chromosome encodes both the pose and the indexes to the selected point features of the object. Instead of only searching for the pose as in the existing work, our algorithm, at the same time, searches for a set containing the most reliable feature points in the process. This mismatch filtering strategy successfully makes the algorithm more robust under the presence of point mismatches and outliers in the images. Our algorithm has been tested with both synthetic and real data with good results. The accuracy of the recovered pose is compared to the existing algorithms. Our approach outperformed Lowe's method and the other two genetic algorithms under the presence of point mismatches and outliers. In addition, it has been used to estimate the pose of a real object. It is shown that the proposed method is applicable to augmented reality applications.

  13. RNA secondary structure prediction using soft computing.

    PubMed

    Ray, Shubhra Sankar; Pal, Sankar K

    2013-01-01

    Prediction of RNA structure is invaluable in creating new drugs and understanding genetic diseases. Several deterministic algorithms and soft computing-based techniques have been developed for more than a decade to determine the structure from a known RNA sequence. Soft computing gained importance with the need to get approximate solutions for RNA sequences by considering the issues related with kinetic effects, cotranscriptional folding, and estimation of certain energy parameters. A brief description of some of the soft computing-based techniques, developed for RNA secondary structure prediction, is presented along with their relevance. The basic concepts of RNA and its different structural elements like helix, bulge, hairpin loop, internal loop, and multiloop are described. These are followed by different methodologies, employing genetic algorithms, artificial neural networks, and fuzzy logic. The role of various metaheuristics, like simulated annealing, particle swarm optimization, ant colony optimization, and tabu search is also discussed. A relative comparison among different techniques, in predicting 12 known RNA secondary structures, is presented, as an example. Future challenging issues are then mentioned.

  14. Evolving neural networks with genetic algorithms to study the string landscape

    NASA Astrophysics Data System (ADS)

    Ruehle, Fabian

    2017-08-01

    We study possible applications of artificial neural networks to examine the string landscape. Since the field of application is rather versatile, we propose to dynamically evolve these networks via genetic algorithms. This means that we start from basic building blocks and combine them such that the neural network performs best for the application we are interested in. We study three areas in which neural networks can be applied: to classify models according to a fixed set of (physically) appealing features, to find a concrete realization for a computation for which the precise algorithm is known in principle but very tedious to actually implement, and to predict or approximate the outcome of some involved mathematical computation which is too inefficient to apply, e.g. in model scans within the string landscape. We present simple examples that arise in string phenomenology for all three types of problems and discuss how they can be addressed by evolving neural networks from genetic algorithms.

  15. Gene selection for cancer classification with the help of bees.

    PubMed

    Moosa, Johra Muhammad; Shakur, Rameen; Kaykobad, Mohammad; Rahman, Mohammad Sohel

    2016-08-10

    Development of biologically relevant models from gene expression data, notably microarray data, has become a topic of great interest in the field of bioinformatics and clinical genetics and oncology. Only a small number of gene expression data compared to the total number of genes explored possess a significant correlation with a certain phenotype. Gene selection enables researchers to obtain substantial insight into the genetic nature of the disease and the mechanisms responsible for it. Besides improvement of the performance of cancer classification, it can also cut down the time and cost of medical diagnoses. This study presents a modified Artificial Bee Colony Algorithm (ABC) to select a minimum number of genes that are deemed to be significant for cancer along with improvement of predictive accuracy. The search equation of ABC is believed to be good at exploration but poor at exploitation. To overcome this limitation we have modified the ABC algorithm by incorporating the concept of pheromones, which is one of the major components of the Ant Colony Optimization (ACO) algorithm, and a new operation in which successive bees communicate to share their findings. The proposed algorithm is evaluated using a suite of ten publicly available datasets after the parameters are tuned scientifically with one of the datasets. Obtained results are compared to other works that used the same datasets. The performance of the proposed method is proved to be superior. The method presented in this paper can provide a subset of genes leading to more accurate classification results while the number of selected genes is smaller. Additionally, the proposed modified Artificial Bee Colony Algorithm could conceivably be applied to problems in other areas as well.

  16. Fireworks algorithm for mean-VaR/CVaR models

    NASA Astrophysics Data System (ADS)

    Zhang, Tingting; Liu, Zhifeng

    2017-10-01

    Intelligent algorithms have been widely applied to portfolio optimization problems. In this paper, we introduce a novel intelligent algorithm, named fireworks algorithm, to solve the mean-VaR/CVaR model for the first time. The results show that, compared with the classical genetic algorithm, fireworks algorithm not only improves the optimization accuracy and the optimization speed, but also makes the optimal solution more stable. We repeat our experiments at different confidence levels and different degrees of risk aversion, and the results are robust. It suggests that fireworks algorithm has more advantages than genetic algorithm in solving the portfolio optimization problem, and it is feasible and promising to apply it into this field.

  17. Probability distribution functions for unit hydrographs with optimization using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Ghorbani, Mohammad Ali; Singh, Vijay P.; Sivakumar, Bellie; H. Kashani, Mahsa; Atre, Atul Arvind; Asadi, Hakimeh

    2017-05-01

    A unit hydrograph (UH) of a watershed may be viewed as the unit pulse response function of a linear system. In recent years, the use of probability distribution functions (pdfs) for determining a UH has received much attention. In this study, a nonlinear optimization model is developed to transmute a UH into a pdf. The potential of six popular pdfs, namely two-parameter gamma, two-parameter Gumbel, two-parameter log-normal, two-parameter normal, three-parameter Pearson distribution, and two-parameter Weibull is tested on data from the Lighvan catchment in Iran. The probability distribution parameters are determined using the nonlinear least squares optimization method in two ways: (1) optimization by programming in Mathematica; and (2) optimization by applying genetic algorithm. The results are compared with those obtained by the traditional linear least squares method. The results show comparable capability and performance of the two nonlinear methods. The gamma and Pearson distributions are the most successful models in preserving the rising and recession limbs of the unit hydrographs. The log-normal distribution has a high ability in predicting both the peak flow and time to peak of the unit hydrograph. The nonlinear optimization method does not outperform the linear least squares method in determining the UH (especially for excess rainfall of one pulse), but is comparable.
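
    The least-squares fit described in this record can be illustrated with a minimal sketch for the two-parameter gamma case. It is not the authors' code: the function names are hypothetical, and it assumes the UH ordinates have already been scaled so that the hydrograph has unit area and can be compared with a density. A genetic algorithm (or any other optimizer) would then search for the shape and scale values that minimize the objective.

```python
import math


def gamma_pdf(t, shape, scale):
    """Two-parameter gamma density for t > 0.

    Returns 0.0 for t <= 0, which is a simplification at t = 0.
    """
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def sum_squared_error(params, times, observed):
    """Least-squares objective: squared gap between UH ordinates and the pdf."""
    shape, scale = params
    return sum(
        (obs - gamma_pdf(t, shape, scale)) ** 2
        for t, obs in zip(times, observed)
    )
```

    With ordinates generated from the model itself, the objective is zero at the generating parameters and positive elsewhere, which is the property an optimizer exploits.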

  18. Design of sparse Halbach magnet arrays for portable MRI using a genetic algorithm.

    PubMed

    Cooley, Clarissa Zimmerman; Haskell, Melissa W; Cauley, Stephen F; Sappo, Charlotte; Lapierre, Cristen D; Ha, Christopher G; Stockmann, Jason P; Wald, Lawrence L

    2018-01-01

    Permanent magnet arrays offer several attributes attractive for the development of a low-cost portable MRI scanner for brain imaging. They offer the potential for a relatively lightweight, low to mid-field system with no cryogenics, a small fringe field, and no electrical power requirements or heat dissipation needs. The cylindrical Halbach array, however, requires external shimming or mechanical adjustments to produce B0 fields with standard MRI homogeneity levels (e.g., 0.1 ppm over FOV), particularly when constrained or truncated geometries are needed, such as a head-only magnet where the magnet length is constrained by the shoulders. For portable scanners using rotation of the magnet for spatial encoding with generalized projections, the spatial pattern of the field is important since it acts as the encoding field. In either a static or rotating magnet, it will be important to be able to optimize the field pattern of cylindrical Halbach arrays in a way that retains construction simplicity. To achieve this, we present a method for designing an optimized cylindrical Halbach magnet using the genetic algorithm to achieve either homogeneity (for standard MRI applications) or a favorable spatial encoding field pattern (for rotational spatial encoding applications). We compare the chosen designs against a standard, fully populated sparse Halbach design, and evaluate optimized spatial encoding fields using point-spread-function and image simulations. We validate the calculations by comparing to the measured field of a constructed magnet. The experimentally implemented design produced fields in good agreement with the predicted fields, and the genetic algorithm was successful in improving the chosen metrics. For the uniform target field, an order of magnitude homogeneity improvement was achieved compared to the un-optimized, fully populated design. For the rotational encoding design the resolution uniformity is improved by 95% compared to a uniformly populated design.

  19. Efficient experimental design of high-fidelity three-qubit quantum gates via genetic programming

    NASA Astrophysics Data System (ADS)

    Devra, Amit; Prabhu, Prithviraj; Singh, Harpreet; Arvind; Dorai, Kavita

    2018-03-01

    We have designed efficient quantum circuits for the three-qubit Toffoli (controlled-controlled-NOT) and the Fredkin (controlled-SWAP) gate, optimized via genetic programming methods. The gates thus obtained were experimentally implemented on a three-qubit NMR quantum information processor, with a high fidelity. Toffoli and Fredkin gates in conjunction with the single-qubit Hadamard gates form a universal gate set for quantum computing and are an essential component of several quantum algorithms. Genetic algorithms are stochastic search algorithms based on the logic of natural selection and biological genetics and have been widely used for quantum information processing applications. We devised a new selection mechanism within the genetic algorithm framework to select individuals from a population. We call this mechanism the "Luck-Choose" mechanism and were able to achieve faster convergence to a solution using this mechanism, as compared to existing selection mechanisms. The optimization was performed under the constraint that the experimentally implemented pulses are of short duration and can be implemented with high fidelity. We demonstrate the advantage of our pulse sequences by comparing our results with existing experimental schemes and other numerical optimization methods.

  20. Intelligent automated control of life support systems using proportional representations.

    PubMed

    Wu, Annie S; Garibay, Ivan I

    2004-06-01

    Effective automatic control of Advanced Life Support Systems (ALSS) is a crucial component of space exploration. An ALSS is a coupled dynamical system which can be extremely sensitive and difficult to predict. As a result, such systems can be difficult to control using deliberative and deterministic methods. We investigate the performance of two machine learning algorithms, a genetic algorithm (GA) and a stochastic hill-climber (SH), on the problem of learning how to control an ALSS, and compare the impact of two different types of problem representations on the performance of both algorithms. We perform experiments on three ALSS optimization problems using five strategies with multiple variations of a proportional representation for a total of 120 experiments. Results indicate that although a proportional representation can effectively boost GA performance, it does not necessarily have the same effect on other algorithms such as SH. Results also support previous conclusions that multivector control strategies are an effective method for control of coupled dynamical systems.

  1. Ab-initio conformational epitope structure prediction using genetic algorithm and SVM for vaccine design.

    PubMed

    Moghram, Basem Ameen; Nabil, Emad; Badr, Amr

    2018-01-01

    T-cell epitope structure identification is a significant and challenging immunoinformatics problem within epitope-based vaccine design. Epitopes or antigenic peptides are a set of amino acids that bind with the Major Histocompatibility Complex (MHC) molecules. These complexes are presented by Antigen Presenting Cells to be inspected by T-cells. MHC-molecule-binding epitopes are responsible for triggering the immune response to antigens. The epitope's three-dimensional (3D) molecular structure (i.e., tertiary structure) reflects its proper function. Therefore, the identification of MHC class-II epitopes structure is a significant step towards epitope-based vaccine design and understanding of the immune system. In this paper, we propose a new technique using a Genetic Algorithm for Predicting the Epitope Structure (GAPES), to predict the structure of MHC class-II epitopes based on their sequence. The proposed Elitist-based genetic algorithm for predicting the epitope's tertiary structure is based on Ab-Initio Empirical Conformational Energy Program for Peptides (ECEPP) Force Field Model. The developed secondary structure prediction technique relies on Ramachandran Plot. We used two alignment algorithms: the ROSS alignment and TM-Score alignment. We applied four different alignment approaches to calculate the similarity scores of the dataset under test. We utilized the support vector machine (SVM) classifier as an evaluation of the prediction performance. The prediction accuracy and the Area Under Receiver Operating Characteristic (ROC) Curve (AUC) were calculated as measures of performance. The calculations are performed on twelve similarity-reduced datasets of the Immune Epitope Data Base (IEDB) and a large dataset of peptide-binding affinities to HLA-DRB1*0101. The results showed that GAPES was reliable and very accurate. We achieved an average prediction accuracy of 93.50% and an average AUC of 0.974 in the IEDB dataset. Also, we achieved an accuracy of 95.125% and an AUC of 0.987 on the HLA-DRB1*0101 allele of the Wang benchmark dataset. The results indicate that the proposed prediction technique "GAPES" is a promising technique that will help researchers and scientists to predict the protein structure and it will assist them in the intelligent design of new epitope-based vaccines. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. Resonance assignment of the NMR spectra of disordered proteins using a multi-objective non-dominated sorting genetic algorithm.

    PubMed

    Yang, Yu; Fritzsching, Keith J; Hong, Mei

    2013-11-01

    A multi-objective genetic algorithm is introduced to predict the assignment of protein solid-state NMR (SSNMR) spectra with partial resonance overlap and missing peaks due to broad linewidths, molecular motion, and low sensitivity. This non-dominated sorting genetic algorithm II (NSGA-II) aims to identify all possible assignments that are consistent with the spectra and to compare the relative merit of these assignments. Our approach is modeled after the recently introduced Monte-Carlo simulated-annealing (MC/SA) protocol, with the key difference that NSGA-II simultaneously optimizes multiple assignment objectives instead of searching for possible assignments based on a single composite score. The multiple objectives include maximizing the number of consistently assigned peaks between multiple spectra ("good connections"), maximizing the number of used peaks, minimizing the number of inconsistently assigned peaks between spectra ("bad connections"), and minimizing the number of assigned peaks that have no matching peaks in the other spectra ("edges"). Using six SSNMR protein chemical shift datasets with varying levels of imperfection that was introduced by peak deletion, random chemical shift changes, and manual peak picking of spectra with moderately broad linewidths, we show that the NSGA-II algorithm produces a large number of valid and good assignments rapidly. For high-quality chemical shift peak lists, NSGA-II and MC/SA perform similarly well. However, when the peak lists contain many missing peaks that are uncorrelated between different spectra and have chemical shift deviations between spectra, the modified NSGA-II produces a larger number of valid solutions than MC/SA, and is more effective at distinguishing good from mediocre assignments by avoiding the hazard of suboptimal weighting factors for the various objectives. These two advantages, namely diversity and better evaluation, lead to a higher probability of predicting the correct assignment for a larger number of residues. On the other hand, when there are multiple equally good assignments that are significantly different from each other, the modified NSGA-II is less efficient than MC/SA in finding all the solutions. This problem is solved by a combined NSGA-II/MC algorithm, which appears to have the advantages of both NSGA-II and MC/SA. This combination algorithm is robust for the three most difficult chemical shift datasets examined here and is expected to give the highest-quality de novo assignment of challenging protein NMR spectra.
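
    The difference between a single composite score and simultaneous optimization of several objectives rests on Pareto dominance. The following minimal sketch shows a dominance test and a non-dominated filter over the four objectives named in this record. It is only an illustration: the `Scores` type and function names are hypothetical, and the full NSGA-II procedure (ranking into successive fronts, crowding distance, selection) is not reproduced.

```python
from typing import List, NamedTuple


class Scores(NamedTuple):
    good: int   # consistently assigned peaks ("good connections"), maximize
    used: int   # number of used peaks, maximize
    bad: int    # inconsistently assigned peaks ("bad connections"), minimize
    edges: int  # assigned peaks with no match in other spectra, minimize


def as_costs(s: Scores):
    """Turn every objective into a quantity to minimize."""
    return (-s.good, -s.used, s.bad, s.edges)


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is no worse than b on all objectives and better on one."""
    ca, cb = as_costs(a), as_costs(b)
    no_worse = all(x <= y for x, y in zip(ca, cb))
    better = any(x < y for x, y in zip(ca, cb))
    return no_worse and better


def non_dominated(population: List[Scores]) -> List[Scores]:
    """Keep the candidates that no other candidate dominates."""
    return [p for p in population
            if not any(dominates(q, p) for q in population)]
```

    No weighting factors appear anywhere in the filter, which is why this style of comparison avoids the hazard of suboptimal weights mentioned in the abstract.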

  3. Smoking Gun or Circumstantial Evidence? Comparison of Statistical Learning Methods using Functional Annotations for Prioritizing Risk Variants.

    PubMed

    Gagliano, Sarah A; Ravji, Reena; Barnes, Michael R; Weale, Michael E; Knight, Jo

    2015-08-24

    Although technology has triumphed in facilitating routine genome sequencing, new challenges have been created for the data-analyst. Genome-scale surveys of human variation generate volumes of data that far exceed capabilities for laboratory characterization. By incorporating functional annotations as predictors, statistical learning has been widely investigated for prioritizing genetic variants likely to be associated with complex disease. We compared three published prioritization procedures, which use different statistical learning algorithms and different predictors with regard to the quantity, type and coding. We also explored different combinations of algorithm and annotation set. As an application, we tested which methodology performed best for prioritizing variants using data from a large schizophrenia meta-analysis by the Psychiatric Genomics Consortium. Results suggest that all methods have considerable (and similar) predictive accuracies (AUCs 0.64-0.71) in test set data, but there is more variability in the application to the schizophrenia GWAS. In conclusion, a variety of algorithms and annotations seem to have a similar potential to effectively enrich true risk variants in genome-scale datasets, however none offer more than incremental improvement in prediction. We discuss how methods might be evolved for risk variant prediction to address the impending bottleneck of the new generation of genome re-sequencing studies.

  4. Firefly as a novel swarm intelligence variable selection method in spectroscopy.

    PubMed

    Goodarzi, Mohammad; dos Santos Coelho, Leandro

    2014-12-10

    A critical step in multivariate calibration is wavelength selection, which is used to build models with better prediction performance when applied to spectral data. Up to now, many feature selection techniques have been developed. Among all different types of feature selection techniques, those based on swarm intelligence optimization methodologies are more interesting since they are usually simulated based on animal and insect life behavior to, e.g., find the shortest path between a food source and their nests. This decision is made by a crowd, leading to a more robust model that is less prone to falling into local minima during the optimization cycle. This paper presents a novel feature selection approach to the selection of spectroscopic data, leading to more robust calibration models. The performance of the firefly algorithm, a swarm intelligence paradigm, was evaluated and compared with genetic algorithm and particle swarm optimization. All three techniques were coupled with partial least squares (PLS) and applied to three spectroscopic data sets. They demonstrate improved prediction results in comparison to when only a PLS model was built using all wavelengths. Results show that firefly algorithm as a novel swarm paradigm leads to a lower number of selected wavelengths while the prediction performance of built PLS stays the same. Copyright © 2014. Published by Elsevier B.V.

  5. A Smart Itsy Bitsy Spider for the Web.

    ERIC Educational Resources Information Center

    Chen, Hsinchun; Chung, Yi-Ming; Ramsey, Marshall; Yang, Christopher C.

    1998-01-01

    This study tested two Web personal spiders (i.e., agents that take users' requests and perform real-time customized searches) based on best-first search and genetic-algorithm techniques. Both results were comparable and complementary, although the genetic algorithm obtained higher recall value. The Java-based interface was found to be necessary…

  6. Family-Based Benchmarking of Copy Number Variation Detection Software.

    PubMed

    Nutsua, Marcel Elie; Fischer, Annegret; Nebel, Almut; Hofmann, Sylvia; Schreiber, Stefan; Krawczak, Michael; Nothnagel, Michael

    2015-01-01

    The analysis of structural variants, in particular of copy-number variations (CNVs), has proven valuable in unraveling the genetic basis of human diseases. Hence, a large number of algorithms have been developed for the detection of CNVs in SNP array signal intensity data. Using the European and African HapMap trio data, we undertook a comparative evaluation of six commonly used CNV detection software tools, namely Affymetrix Power Tools (APT), QuantiSNP, PennCNV, GLAD, R-gada and VEGA, and assessed their level of pair-wise prediction concordance. The tool-specific CNV prediction accuracy was assessed in silico by way of intra-familial validation. Software tools differed greatly in terms of the number and length of the CNVs predicted as well as the number of markers included in a CNV. All software tools predicted substantially more deletions than duplications. Intra-familial validation revealed consistently low levels of prediction accuracy as measured by the proportion of validated CNVs (34-60%). Moreover, up to 20% of apparent family-based validations were found to be due to chance alone. Software using Hidden Markov models (HMM) showed a trend to predict fewer CNVs than segmentation-based algorithms albeit with greater validity. PennCNV yielded the highest prediction accuracy (60.9%). Finally, the pairwise concordance of CNV prediction was found to vary widely with the software tools involved. We recommend HMM-based software, in particular PennCNV, rather than segmentation-based algorithms when validity is the primary concern of CNV detection. QuantiSNP may be used as an additional tool to detect sets of CNVs not detectable by the other tools. Our study also reemphasizes the need for laboratory-based validation, such as qPCR, of CNVs predicted in silico.

  7. MAC Protocol for Ad Hoc Networks Using a Genetic Algorithm

    PubMed Central

    Elizarraras, Omar; Panduro, Marco; Méndez, Aldo L.

    2014-01-01

    The problem of obtaining the transmission rate in an ad hoc network consists in adjusting the power of each node to ensure the signal to interference ratio (SIR) and the energy required to transmit from one node to another is obtained at the same time. Therefore, an optimal transmission rate for each node in a medium access control (MAC) protocol based on CSMA-CDMA (carrier sense multiple access-code division multiple access) for ad hoc networks can be obtained using evolutionary optimization. This work proposes a genetic algorithm for the transmission rate election considering a perfect power control, and our proposition achieves improvement of 10% compared with the scheme that handles the handshaking phase to adjust the transmission rate. Furthermore, this paper proposes a genetic algorithm that solves the problem of power combining, interference, data rate, and energy ensuring the signal to interference ratio in an ad hoc network. The result of the proposed genetic algorithm has a better performance (15%) compared to the CSMA-CDMA protocol without optimizing. Therefore, we show by simulation the effectiveness of the proposed protocol in terms of the throughput. PMID:25140339

  8. Optimization of culture conditions and bench-scale production of L-asparaginase by submerged fermentation of Aspergillus terreus MTCC 1782.

    PubMed

    Gurunathan, Baskar; Sahadevan, Renganathan

    2012-07-01

    Optimization of culture conditions for L-asparaginase production by submerged fermentation of Aspergillus terreus MTCC 1782 was studied using a 3-level central composite design of response surface methodology and artificial neural network linked genetic algorithm. The artificial neural network linked genetic algorithm was found to be more efficient than response surface methodology. The experimental L-asparaginase activity of 43.29 IU/ml was obtained at the optimum culture conditions of temperature 35 degrees C, initial pH 6.3, inoculum size 1% (v/v), agitation rate 140 rpm, and incubation time 58.5 h of the artificial neural network linked genetic algorithm, which was close to the predicted activity of 44.38 IU/ml. Characteristics of L-asparaginase production by A. terreus MTCC 1782 were studied in a 3 L bench-scale bioreactor.

  9. A theoretical comparison of evolutionary algorithms and simulated annealing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hart, W.E.

    1995-08-28

    This paper theoretically compares the performance of simulated annealing and evolutionary algorithms. Our main result is that under mild conditions a wide variety of evolutionary algorithms can be shown to have greater performance than simulated annealing after a sufficiently large number of function evaluations. This class of EAs includes variants of evolutionary strategies and evolutionary programming, the canonical genetic algorithm, as well as a variety of genetic algorithms that have been applied to combinatorial optimization problems. The proof of this result is based on a performance analysis of a very general class of stochastic optimization algorithms, which has implications for the performance of a variety of other optimization algorithms.

  10. Air data system optimization using a genetic algorithm

    NASA Technical Reports Server (NTRS)

    Deshpande, Samir M.; Kumar, Renjith R.; Seywald, Hans; Siemers, Paul M., III

    1992-01-01

    An optimization method for flush-orifice air data system design has been developed using the Genetic Algorithm approach. The optimization of the orifice array minimizes the effect of normally distributed random noise in the pressure readings on the calculation of air data parameters, namely, angle of attack, sideslip angle and freestream dynamic pressure. The optimization method is applied to the design of Pressure Distribution/Air Data System experiment (PD/ADS) proposed for inclusion in the Aeroassist Flight Experiment (AFE). Results obtained by the Genetic Algorithm method are compared to the results obtained by conventional gradient search method.

  11. A novel hybrid genetic algorithm for optimal design of IPM machines for electric vehicle

    NASA Astrophysics Data System (ADS)

    Wang, Aimeng; Guo, Jiayu

    2017-12-01

    A novel hybrid genetic algorithm (HGA) is proposed to optimize the rotor structure of an IPM machine which is used in EV applications. The finite element (FE) simulation results of the HGA design are compared with those of the genetic algorithm (GA) design and of the design before optimization. It is shown that the performance of the IPMSM is effectively improved by employing the GA and HGA, especially the HGA. Moreover, higher flux-weakening capability and less magnet usage are also obtained. Therefore, the validity of the HGA method in IPMSM optimization design is verified.

  12. A Genetic Algorithm and Fuzzy Logic Approach for Video Shot Boundary Detection

    PubMed Central

    Thounaojam, Dalton Meitei; Khelchandra, Thongam; Singh, Kh. Manglem; Roy, Sudipta

    2016-01-01

    This paper proposes a shot boundary detection approach using Genetic Algorithm and Fuzzy Logic. In this approach, the membership functions of the fuzzy system are calculated using Genetic Algorithm by taking preobserved actual values for shot boundaries. The classification of the types of shot transitions is done by the fuzzy system. Experimental results show that the accuracy of the shot boundary detection increases with the increase in iterations or generations of the GA optimization process. The proposed system is compared to the latest techniques and yields better results in terms of the F1 score. PMID:27127500

  13. Multivariate Methods for Prediction of Geologic Sample Composition with Laser-Induced Breakdown Spectroscopy

    NASA Technical Reports Server (NTRS)

    Morris, Richard; Anderson, R.; Clegg, S. M.; Bell, J. F., III

    2010-01-01

    Laser-induced breakdown spectroscopy (LIBS) uses pulses of laser light to ablate a material from the surface of a sample and produce an expanding plasma. The optical emission from the plasma produces a spectrum which can be used to classify target materials and estimate their composition. The ChemCam instrument on the Mars Science Laboratory (MSL) mission will use LIBS to rapidly analyze targets remotely, allowing more resource- and time-intensive in-situ analyses to be reserved for targets of particular interest. ChemCam will also be used to analyze samples that are not reachable by the rover's in-situ instruments. Due to these tactical and scientific roles, it is important that ChemCam-derived sample compositions are as accurate as possible. We have compared the results of partial least squares (PLS), multilayer perceptron (MLP) artificial neural networks (ANNs), and cascade correlation (CC) ANNs to determine which technique yields better estimates of quantitative element abundances in rock and mineral samples. The number of hidden nodes in the MLP ANNs was optimized using a genetic algorithm. The influence of two data preprocessing techniques was also investigated: genetic algorithm feature selection and averaging the spectra for each training sample prior to training the PLS and ANN algorithms. We used a ChemCam-like laboratory stand-off LIBS system to collect spectra of 30 pressed powder geostandards and a diverse suite of 196 geologic slab samples of known bulk composition. We tested the performance of PLS and ANNs on a subset of these samples, choosing to focus on silicate rocks and minerals with a loss on ignition of less than 2 percent. This resulted in a set of 22 pressed powder geostandards and 80 geologic samples. Four of the geostandards were used as a validation set and 18 were used as the training set for the algorithms. We found that PLS typically resulted in the lowest average absolute error in its predictions, but that the optimized MLP ANN and the CC ANN often gave results comparable to PLS. Averaging the spectra for each training sample and/or using feature selection to choose a small subset of wavelengths to use for predictions gave mixed results, with degraded performance in some cases and similar or slightly improved performance in other cases. However, training time was significantly reduced for both PLS and ANN methods by implementing feature selection, making this a potentially appealing method for initial, rapid-turn-around analyses necessary for ChemCam's tactical role on MSL. Choice of training samples has a strong influence on the accuracy of predictions. We are currently investigating the use of clustering algorithms (e.g. k-means, neural gas, etc.) to identify training sets that are spectrally similar to the unknown samples that are being predicted, and therefore result in improved predictions.

  14. An efficient genetic algorithm for maximum coverage deployment in wireless sensor networks.

    PubMed

    Yoon, Yourim; Kim, Yong-Hyuk

    2013-10-01

    Sensor networks have a lot of applications such as battlefield surveillance, environmental monitoring, and industrial diagnostics. Coverage is one of the most important performance metrics for sensor networks since it reflects how well a sensor field is monitored. In this paper, we introduce the maximum coverage deployment problem in wireless sensor networks and analyze the properties of the problem and its solution space. Random deployment is the simplest way to deploy sensor nodes but may cause unbalanced deployment and therefore, we need a more intelligent way for sensor deployment. We found that the phenotype space of the problem is a quotient space of the genotype space from a mathematical point of view. Based on this property, we propose an efficient genetic algorithm using a novel normalization method. A Monte Carlo method is adopted to design an efficient evaluation function, and its computation time is decreased without loss of solution quality using a method that starts from a small number of random samples and gradually increases the number for subsequent generations. The proposed genetic algorithms could be further improved by combining with a well-designed local search. The performance of the proposed genetic algorithm is shown by a comparative experimental study. When compared with random deployment and existing methods, our genetic algorithm was not only about twice as fast, but also showed significant performance improvement in quality.
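
    The Monte Carlo evaluation function with a growing sample count can be sketched as follows. This is a minimal illustration, not the authors' implementation: the circular sensing model, the rectangular field, the function names, and the linear sample schedule with its default values are all assumptions made for the example.

```python
import random


def coverage_estimate(sensors, radius, n_samples, rng, width=1.0, height=1.0):
    """Estimate the covered fraction of a width x height field.

    A random point counts as covered if it lies within `radius` of any
    sensor position (an assumed disc sensing model).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    hits = 0
    for _ in range(n_samples):
        x, y = rng.uniform(0.0, width), rng.uniform(0.0, height)
        if any((x - sx) ** 2 + (y - sy) ** 2 <= radius ** 2
               for sx, sy in sensors):
            hits += 1
    return hits / n_samples


def samples_for_generation(generation, start=100, growth=50, cap=5000):
    """Hypothetical schedule: few samples early, more in later generations."""
    return min(cap, start + growth * generation)
```

    Early generations then get a cheap, noisy fitness estimate, and later generations, where candidate deployments differ only slightly, get a more precise one.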

  15. A Genetic-Based Scheduling Algorithm to Minimize the Makespan of the Grid Applications

    NASA Astrophysics Data System (ADS)

    Entezari-Maleki, Reza; Movaghar, Ali

    Task scheduling algorithms in grid environments strive to maximize the overall throughput of the grid. In order to maximize the throughput of the grid environments, the makespan of the grid tasks should be minimized. In this paper, a new task scheduling algorithm is proposed to assign tasks to the grid resources with the goal of minimizing the total makespan of the tasks. The algorithm uses the genetic approach to find a suitable assignment within grid resources. The experimental results obtained from applying the proposed algorithm to schedule independent tasks within grid environments demonstrate the applicability of the algorithm in achieving schedules with lower makespan than other well-known scheduling algorithms such as the Min-min, Max-min, RASA and Sufferage algorithms.
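
    A fitness function for this kind of scheduling problem can be sketched in a few lines. The encoding below (one resource index per task) and the execution-time model (task length divided by resource speed) are assumptions chosen for illustration; the abstract does not specify the authors' chromosome design.

```python
def makespan(assignment, task_lengths, resource_speeds):
    """Completion time of the busiest resource for independent tasks.

    assignment[i] is the index of the resource that runs task i.
    """
    finish = [0.0] * len(resource_speeds)
    for task, resource in enumerate(assignment):
        finish[resource] += task_lengths[task] / resource_speeds[resource]
    return max(finish)
```

    For example, with task lengths `[4, 6, 2]` and speeds `[1, 2]`, moving the last task from the slow resource to the fast one lowers the makespan from 6.0 to 4.0. A genetic algorithm would use this value as the quantity to minimize.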

  16. Estimates of the atmospheric parameters of M-type stars: a machine-learning perspective

    NASA Astrophysics Data System (ADS)

    Sarro, L. M.; Ordieres-Meré, J.; Bello-García, A.; González-Marcos, A.; Solano, E.

    2018-05-01

    Estimating the atmospheric parameters of M-type stars has been a difficult task due to the lack of simple diagnostics in the stellar spectra. We aim at uncovering good sets of predictive features of stellar atmospheric parameters (Teff, log (g), [M/H]) in spectra of M-type stars. We define two types of potential features (equivalent widths and integrated flux ratios) able to explain the atmospheric physical parameters. We search the space of feature sets using a genetic algorithm that evaluates solutions by their prediction performance in the framework of the BT-Settl library of stellar spectra. Thereafter, we construct eight regression models using different machine-learning techniques and compare their performances with those obtained using the classical χ² approach and independent component analysis (ICA) coefficients. Finally, we validate the various alternatives using two sets of real spectra from the NASA Infrared Telescope Facility (IRTF) and Dwarf Archives collections. We find that the cross-validation errors are poor measures of the performance of regression models in the context of physical parameter prediction in M-type stars. For R ~ 2000 spectra with signal-to-noise ratios typical of the IRTF and Dwarf Archives, feature selection with genetic algorithms or alternative techniques produces only marginal advantages with respect to representation spaces that are unconstrained in wavelength (full spectrum or ICA). We make available the atmospheric parameters for the two collections of observed spectra as online material.

  17. Advancing X-ray scattering metrology using inverse genetic algorithms.

    PubMed

    Hannon, Adam F; Sunday, Daniel F; Windover, Donald; Kline, R Joseph

    2016-01-01

    We compare the speed and effectiveness of two genetic optimization algorithms to the results of statistical sampling via a Markov chain Monte Carlo algorithm to find which is the most robust method for determining real space structure in periodic gratings measured using critical dimension small angle X-ray scattering. Both a covariance matrix adaptation evolutionary strategy and differential evolution algorithm are implemented and compared using various objective functions. The algorithms and objective functions are used to minimize differences between diffraction simulations and measured diffraction data. These simulations are parameterized with an electron density model known to roughly correspond to the real space structure of our nanogratings. The study shows that for X-ray scattering data, the covariance matrix adaptation coupled with a mean-absolute error log objective function is the most efficient combination of algorithm and goodness of fit criterion for finding structures with little foreknowledge about the underlying fine scale structure features of the nanograting.
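
    The abstract names a "mean-absolute error log" objective without defining it. One plausible reading, sketched here as an assumption, is the mean absolute difference between the base-10 logarithms of simulated and measured intensities. The function name `mae_log` and the `floor` guard against non-positive values are illustrative choices.

```python
import math


def mae_log(simulated, measured, floor=1e-12):
    """Mean absolute error between log10 intensities (assumed form of the objective)."""
    if len(simulated) != len(measured):
        raise ValueError("simulated and measured must have the same length")
    total = 0.0
    for s, m in zip(simulated, measured):
        # Clamp to a small positive floor so the logarithm is always defined.
        total += abs(math.log10(max(s, floor)) - math.log10(max(m, floor)))
    return total / len(simulated)
```

    Working in log space weights weak and strong diffraction peaks more evenly than a plain absolute error, which is one reason such an objective may suit scattering data spanning several orders of magnitude.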

  18. Prediction of microRNA target genes using an efficient genetic algorithm-based decision tree.

    PubMed

    Rabiee-Ghahfarrokhi, Behzad; Rafiei, Fariba; Niknafs, Ali Akbar; Zamani, Behzad

    2015-01-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen-host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets experimentally. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with a C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to a validated human dataset. We achieved nearly 93.9% classification accuracy, which may be attributable to the selection of the best rules.

  19. Prediction of microRNA target genes using an efficient genetic algorithm-based decision tree

    PubMed Central

    Rabiee-Ghahfarrokhi, Behzad; Rafiei, Fariba; Niknafs, Ali Akbar; Zamani, Behzad

    2015-01-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that regulate gene expression in almost all plants and animals. They play an important role in key processes, such as proliferation, apoptosis, and pathogen–host interactions. Nevertheless, the mechanisms by which miRNAs act are not fully understood. The first step toward unraveling the function of a particular miRNA is the identification of its direct targets. This step has proven to be quite challenging in animals, primarily because of incomplete complementarity between miRNAs and target mRNAs. In recent years, the use of machine-learning techniques has greatly improved the prediction of miRNA targets, avoiding the need for costly and time-consuming experiments to identify miRNA targets experimentally. Among the most important machine-learning algorithms are decision trees, which classify data based on extracted rules. In the present work, we used a genetic algorithm in combination with a C4.5 decision tree for prediction of miRNA targets. We applied our proposed method to a validated human dataset. We achieved nearly 93.9% classification accuracy, which may be attributable to the selection of the best rules. PMID:26649272

  20. Automatic variable selection method and a comparison for quantitative analysis in laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Duan, Fajie; Fu, Xiao; Jiang, Jiajia; Huang, Tingting; Ma, Ling; Zhang, Cong

    2018-05-01

    In this work, an automatic variable selection method for quantitative analysis of soil samples using laser-induced breakdown spectroscopy (LIBS) is proposed, which is based on full spectrum correction (FSC) and modified iterative predictor weighting-partial least squares (mIPW-PLS). The method features automatic selection without manual intervention. To illustrate the feasibility and effectiveness of the method, a comparison with the genetic algorithm (GA) and the successive projections algorithm (SPA) for the detection of different elements (copper, barium and chromium) in soil was carried out. The experimental results showed that all three methods could accomplish variable selection effectively, among which FSC-mIPW-PLS required significantly shorter computation time (approximately 12 s for 40,000 initial variables) than the others. Moreover, improved quantification models were obtained with the variable selection approaches. The root mean square errors of prediction (RMSEP) of models utilizing the new method were 27.47 (copper), 37.15 (barium) and 39.70 (chromium) mg/kg, which showed prediction performance comparable to that of GA and SPA.
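
    For readers unfamiliar with the metric, the root mean square error of prediction (RMSEP) quoted above is the square root of the mean squared difference between reference and predicted concentrations on a prediction set. The helper below is a minimal sketch; the name `rmsep` and the list-based interface are illustrative.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction, in the units of the inputs (e.g. mg/kg)."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))
```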

  1. [Application of genetic algorithm in blending technology for extractions of Cortex Fraxini].

    PubMed

    Yang, Ming; Zhou, Yinmin; Chen, Jialei; Yu, Minying; Shi, Xiufeng; Gu, Xijun

    2009-10-01

    The aim was to explore the feasibility of a genetic algorithm (GA) for multiple-objective blending of Cortex Fraxini extracts. Taking the combination of fingerprint similarity and the root-mean-square error of multiple key constituents as the optimization objective, a new multiple-objective optimization model for 10 batches of Cortex Fraxini extracts was built. The blending coefficients were obtained by the genetic algorithm. The quality of the 10 batches of extracts after blending was evaluated using fingerprint similarity and root-mean-square error as indexes. The quality of the 10 batches after blending was markedly improved. Compared with the fingerprint of the control sample, the similarity increased while the degree of variation decreased. The relative deviation of the key constituents was less than 10%. The results indicate that the genetic algorithm works well for multiple-objective blending of Cortex Fraxini extracts. This method can serve as a reference for controlling the quality of Cortex Fraxini extracts. The use of genetic algorithms in blending technology for extracts of Chinese medicines is advisable.

  2. A multifactorial analysis of obesity as CVD risk factor: use of neural network based methods in a nutrigenetics context.

    PubMed

    Valavanis, Ioannis K; Mougiakakou, Stavroula G; Grimaldi, Keith A; Nikita, Konstantina S

    2010-09-08

    Obesity is a multifactorial trait, which comprises an independent risk factor for cardiovascular disease (CVD). The aim of the current work is to study the complex etiology beneath obesity and identify genetic variations and/or factors related to nutrition that contribute to its variability. To this end, a set of more than 2300 white subjects who participated in a nutrigenetics study was used. For each subject a total of 63 factors describing genetic variants related to CVD (24 in total), gender, and nutrition (38 in total), e.g. average daily intake in calories and cholesterol, were measured. Each subject was categorized according to body mass index (BMI) as normal (BMI ≤ 25) or overweight (BMI > 25). Two artificial neural network (ANN) based methods were designed and used towards the analysis of the available data. These corresponded to i) a multi-layer feed-forward ANN combined with a parameter decreasing method (PDM-ANN), and ii) a multi-layer feed-forward ANN trained by a hybrid method (GA-ANN) which combines genetic algorithms and the popular back-propagation training algorithm. PDM-ANN and GA-ANN were comparatively assessed in terms of their ability to identify the most important factors among the initial 63 variables describing genetic variations, nutrition and gender, able to classify a subject into one of the BMI related classes: normal and overweight. The methods were designed and evaluated using appropriate training and testing sets provided by 3-fold Cross Validation (3-CV) resampling. Classification accuracy, sensitivity, specificity and area under receiver operating characteristics curve were utilized to evaluate the resulting predictive ANN models. The most parsimonious set of factors was obtained by the GA-ANN method and included gender, six genetic variations and 18 nutrition-related variables. The corresponding predictive model was characterized by a mean accuracy of 61.46% in the 3-CV testing sets. The ANN based methods revealed factors that interactively contribute to the obesity trait and provided predictive models with a promising generalization ability. In general, results showed that ANNs and their hybrids can provide useful tools for the study of complex traits in the context of nutrigenetics.
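
    The evaluation measures named in this abstract can be illustrated with a short sketch. It assumes binary labels in which 1 means overweight (BMI > 25) and 0 means normal, following the categorization stated above; the function names are illustrative and the code is not taken from the study.

```python
def bmi_class(bmi):
    """Label used in the abstract: 1 = overweight (BMI > 25), 0 = normal (BMI <= 25)."""
    return 1 if bmi > 25 else 0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Fraction of truly overweight subjects that were predicted overweight.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Fraction of truly normal subjects that were predicted normal.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

    In a 3-fold scheme these values would be computed on each held-out fold and then averaged, which is how a mean accuracy such as the one reported above is typically obtained.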

  3. Applications of information theory, genetic algorithms, and neural models to predict oil flow

    NASA Astrophysics Data System (ADS)

    Ludwig, Oswaldo; Nunes, Urbano; Araújo, Rui; Schnitman, Leizer; Lepikson, Herman Augusto

    2009-07-01

    This work introduces a new information-theoretic methodology for choosing variables and their time lags in a prediction setting, particularly when neural networks are used in non-linear modeling. The first contribution of this work is the Cross Entropy Function (XEF) proposed to select input variables and their lags in order to compose the input vector of black-box prediction models. The proposed XEF method is more appropriate than the usually applied Cross Correlation Function (XCF) when the relationship among the input and output signals comes from a non-linear dynamic system. The second contribution is a method that minimizes the Joint Conditional Entropy (JCE) between the input and output variables by means of a Genetic Algorithm (GA). The aim is to take into account the dependence among the input variables when selecting the most appropriate set of inputs for a prediction problem. In short, these methods can be used to assist the selection of input training data that have the necessary information to predict the target data. The proposed methods are applied to a petroleum engineering problem: predicting oil production. Experimental results obtained with a real-world dataset are presented demonstrating the feasibility and effectiveness of the method.
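
    The point that correlation can miss a non-linear dependence at a given lag can be illustrated with a toy example. The sketch below does not implement the paper's XEF or JCE; it substitutes a plain plug-in mutual information estimate for discrete signals, and the synthetic data (y[t] = x[t-1]²) and all function names are assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def lagged_pairs(x, y, lag):
    """Pair x[t - lag] with y[t] for every t where both samples exist."""
    return [(x[t - lag], y[t]) for t in range(lag, len(y))]


def pearson(pairs):
    """Linear correlation coefficient of the paired samples."""
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    cov = sum((a - mx) * (b - my) for a, b in pairs)
    vx = sum((a - mx) ** 2 for a, _ in pairs)
    vy = sum((b - my) ** 2 for _, b in pairs)
    return cov / math.sqrt(vx * vy)


def mutual_information(pairs):
    """Plug-in mutual information in bits for discrete-valued pairs."""
    n = len(pairs)
    pxy = Counter(pairs)
    px = Counter(a for a, _ in pairs)
    py = Counter(b for _, b in pairs)
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


# Synthetic signals: the output depends non-linearly on the input one step earlier.
rng = random.Random(0)
x = [rng.choice([-2, -1, 0, 1, 2]) for _ in range(2000)]
y = [0] + [v * v for v in x[:-1]]
```

    On this toy data the correlation at lag 1 stays close to zero because the dependence is symmetric, while the mutual information at lag 1 is clearly larger than at lag 0, which is the kind of situation where an entropy-based criterion is preferable to cross-correlation.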

  4. Comparative Analysis of Rank Aggregation Techniques for Metasearch Using Genetic Algorithm

    ERIC Educational Resources Information Center

    Kaur, Parneet; Singh, Manpreet; Singh Josan, Gurpreet

    2017-01-01

    Rank Aggregation techniques have found wide applications for metasearch along with other streams such as Sports, Voting System, Stock Markets, and Reduction in Spam. This paper presents the optimization of rank lists for web queries put by the user on different MetaSearch engines. A metaheuristic approach such as Genetic algorithm based rank…

  5. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    PubMed Central

    2012-01-01

    Background: Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results: We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions: Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170

  6. Genomes to natural products PRediction Informatics for Secondary Metabolomes (PRISM).

    PubMed

    Skinnider, Michael A; Dejong, Chris A; Rees, Philip N; Johnston, Chad W; Li, Haoxin; Webster, Andrew L H; Wyatt, Morgan A; Magarvey, Nathan A

    2015-11-16

    Microbial natural products are an invaluable source of evolved bioactive small molecules and pharmaceutical agents. Next-generation and metagenomic sequencing indicates untapped genomic potential, yet high rediscovery rates of known metabolites increasingly frustrate conventional natural product screening programs. New methods to connect biosynthetic gene clusters to novel chemical scaffolds are therefore critical to enable the targeted discovery of genetically encoded natural products. Here, we present PRISM, a computational resource for the identification of biosynthetic gene clusters, prediction of genetically encoded nonribosomal peptides and type I and II polyketides, and bio- and cheminformatic dereplication of known natural products. PRISM implements novel algorithms which render it uniquely capable of predicting type II polyketides, deoxygenated sugars, and starter units, making it a comprehensive genome-guided chemical structure prediction engine. A library of 57 tailoring reactions is leveraged for combinatorial scaffold library generation when multiple potential substrates are consistent with biosynthetic logic. We compare the accuracy of PRISM to existing genomic analysis platforms. PRISM is an open-source, user-friendly web application available at http://magarveylab.ca/prism/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  7. Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network.

    PubMed

    Liu, Ruixin; Zhang, Xiaodong; Zhang, Lu; Gao, Xiaojie; Li, Huiling; Shi, Junhan; Li, Xuelin

    2014-06-01

    The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue.

  8. Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network

    PubMed Central

    LIU, RUIXIN; ZHANG, XIAODONG; ZHANG, LU; GAO, XIAOJIE; LI, HUILING; SHI, JUNHAN; LI, XUELIN

    2014-01-01

    The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue. PMID:24926369

  9. A new compound arithmetic crossover-based genetic algorithm for constrained optimisation in enterprise systems

    NASA Astrophysics Data System (ADS)

    Jin, Chenxia; Li, Fachao; Tsang, Eric C. C.; Bulysheva, Larissa; Kataev, Mikhail Yu

    2017-01-01

    In many real industrial applications, the integration of raw data with a methodology can support economically sound decision-making. Furthermore, most of these tasks involve complex optimisation problems. Seeking better solutions is critical. As an intelligent search optimisation algorithm, genetic algorithm (GA) is an important technique for complex system optimisation, but it has internal drawbacks such as low computation efficiency and prematurity. Improving the performance of GA is a vital topic in academic and applications research. In this paper, a new real-coded crossover operator, called compound arithmetic crossover operator (CAC), is proposed. CAC is used in conjunction with a uniform mutation operator to define a new genetic algorithm CAC10-GA. This GA is compared with an existing genetic algorithm (AC10-GA) that comprises an arithmetic crossover operator and a uniform mutation operator. To judge the performance of CAC10-GA, two kinds of analysis are performed. First the analysis of the convergence of CAC10-GA is performed by the Markov chain theory; second, a pair-wise comparison is carried out between CAC10-GA and AC10-GA through two test problems available in the global optimisation literature. The overall comparative study shows that the CAC performs quite well and the CAC10-GA defined outperforms the AC10-GA.
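
    The abstract does not define the compound arithmetic crossover (CAC), so it is not reproduced here. As background, the sketch below shows the two standard real-coded operators that the baseline AC10-GA is described as using: plain arithmetic crossover and uniform mutation. The function names, the `alpha` weight and the `bounds` format are illustrative assumptions.

```python
import random


def arithmetic_crossover(p1, p2, alpha):
    """Standard arithmetic crossover: children are convex combinations of the parents."""
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    c2 = [(1 - alpha) * a + alpha * b for a, b in zip(p1, p2)]
    return c1, c2


def uniform_mutation(individual, bounds, rate, rng):
    """Uniform mutation: each gene is redrawn uniformly within its bounds with probability rate."""
    return [rng.uniform(lo, hi) if rng.random() < rate else gene
            for gene, (lo, hi) in zip(individual, bounds)]
```

    Because each child is a convex combination, arithmetic crossover never leaves the box spanned by the two parents, so exploration outside that box comes from mutation alone in this plain form.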

  10. Validation of clinical testing for warfarin sensitivity: comparison of CYP2C9-VKORC1 genotyping assays and warfarin-dosing algorithms.

    PubMed

    Langley, Michael R; Booker, Jessica K; Evans, James P; McLeod, Howard L; Weck, Karen E

    2009-05-01

    Responses to warfarin (Coumadin) anticoagulation therapy are affected by genetic variability in both the CYP2C9 and VKORC1 genes. Validation of pharmacogenetic testing for warfarin responses includes demonstration of analytical validity of testing platforms and of the clinical validity of testing. We compared four platforms for determining the relevant single nucleotide polymorphisms (SNPs) in both CYP2C9 and VKORC1 that are associated with warfarin sensitivity (Third Wave Invader Plus, ParagonDx/Cepheid Smart Cycler, Idaho Technology LightCycler, and AutoGenomics Infiniti). Each method was examined for accuracy, cost, and turnaround time. All genotyping methods demonstrated greater than 95% accuracy for identifying the relevant SNPs (CYP2C9 *2 and *3; VKORC1 -1639 or 1173). The ParagonDx and Idaho Technology assays had the shortest turnaround and hands-on times. The Third Wave assay was readily scalable to higher test volumes but had the longest hands-on time. The AutoGenomics assay interrogated the largest number of SNPs but had the longest turnaround time. Four published warfarin-dosing algorithms (Washington University, UCSF, Louisville, and Newcastle) were compared for accuracy for predicting warfarin dose in a retrospective analysis of a local patient population on long-term, stable warfarin therapy. The predicted doses from both the Washington University and UCSF algorithms demonstrated the best correlation with actual warfarin doses.

  11. Biological engineering applications of feedforward neural networks designed and parameterized by genetic algorithms.

    PubMed

    Ferentinos, Konstantinos P

    2005-09-01

    Two neural network (NN) applications in the field of biological engineering are developed, designed and parameterized by an evolutionary method based on the evolutionary process of genetic algorithms. The developed systems are a fault detection NN model and a predictive modeling NN system. An indirect or 'weak specification' representation was used for the encoding of NN topologies and training parameters into genes of the genetic algorithm (GA). Some a priori knowledge of the demands in network topology for specific application cases is required by this approach, so that the infinite search space of the problem is limited to some reasonable degree. Both one-hidden-layer and two-hidden-layer network architectures were explored by the GA. In addition to the network architecture, each gene of the GA also encoded the type of activation functions in both hidden and output nodes of the NN and the type of minimization algorithm that was used by the backpropagation algorithm for the training of the NN. Both models achieved satisfactory performance, while the GA system proved to be a powerful tool that can successfully replace the problematic trial-and-error approach that is usually used for these tasks.
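
    To make the idea of encoding topology and training choices in genes concrete, here is a hypothetical decoding step. The five-gene layout, the option lists `ACTIVATIONS` and `TRAINERS`, and the `max_nodes` cap are all assumptions for illustration; the abstract does not specify the actual encoding.

```python
ACTIVATIONS = ("logistic", "tanh", "linear")  # hypothetical activation options
TRAINERS = ("gradient_descent", "quasi_newton", "conjugate_gradient")  # hypothetical


def decode(chromosome, max_nodes=20):
    """Map five integer genes to a network configuration (illustrative layout)."""
    h1, h2, act_hidden, act_out, trainer = chromosome
    hidden = [1 + h1 % max_nodes]          # first hidden layer: 1..max_nodes nodes
    second = h2 % (max_nodes + 1)          # 0 means "no second hidden layer"
    if second > 0:
        hidden.append(second)
    return {
        "hidden_layers": hidden,
        "hidden_activation": ACTIVATIONS[act_hidden % len(ACTIVATIONS)],
        "output_activation": ACTIVATIONS[act_out % len(ACTIVATIONS)],
        "trainer": TRAINERS[trainer % len(TRAINERS)],
    }
```

    The cap on layer size plays the role of the a priori knowledge mentioned above: it bounds an otherwise unbounded search space before the GA starts.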

  12. Race influences warfarin dose changes associated with genetic factors

    PubMed Central

    Brown, Todd M.; Yan, Qi; Thigpen, Jonathan L.; Shendre, Aditi; Liu, Nianjun; Hill, Charles E.; Arnett, Donna K.; Beasley, T. Mark

    2015-01-01

    Warfarin dosing algorithms adjust for race, assigning a fixed effect size to each predictor, thereby attenuating the differential effect by race. Attenuation likely occurs in both race groups but may be more pronounced in the less-represented race group. Therefore, we evaluated whether the effect of clinical (age, body surface area [BSA], chronic kidney disease [CKD], and amiodarone use) and genetic factors (CYP2C9*2, *3, *5, *6, *11, rs12777823, VKORC1, and CYP4F2) on warfarin dose differs by race using regression analyses among 1357 patients enrolled in a prospective cohort study and compared predictive ability of race-combined vs race-stratified models. Differential effect of predictors by race was assessed using predictor-race interactions in race-combined analyses. Warfarin dose was influenced by age, BSA, CKD, amiodarone use, and CYP2C9*3 and VKORC1 variants in both races, by CYP2C9*2 and CYP4F2 variants in European Americans, and by rs12777823 in African Americans. CYP2C9*2 was associated with a lower dose only among European Americans (20.6% vs 3.0%, P < .001) and rs12777823 only among African Americans (12.3% vs 2.3%, P = .006). Although VKORC1 was associated with dose decrease in both races, the proportional decrease was higher among European Americans (28.9% vs 19.9%, P = .003) compared with African Americans. Race-stratified analysis improved dose prediction in both race groups compared with race-combined analysis. We demonstrate that the effect of predictors on warfarin dose differs by race, which may explain divergent findings reported by recent warfarin pharmacogenetic trials. We recommend that warfarin dosing algorithms should be stratified by race rather than adjusted for race. PMID:26024874

  13. Race influences warfarin dose changes associated with genetic factors.

    PubMed

    Limdi, Nita A; Brown, Todd M; Yan, Qi; Thigpen, Jonathan L; Shendre, Aditi; Liu, Nianjun; Hill, Charles E; Arnett, Donna K; Beasley, T Mark

    2015-07-23

    Warfarin dosing algorithms adjust for race, assigning a fixed effect size to each predictor, thereby attenuating the differential effect by race. Attenuation likely occurs in both race groups but may be more pronounced in the less-represented race group. Therefore, we evaluated whether the effect of clinical (age, body surface area [BSA], chronic kidney disease [CKD], and amiodarone use) and genetic factors (CYP2C9*2, *3, *5, *6, *11, rs12777823, VKORC1, and CYP4F2) on warfarin dose differs by race using regression analyses among 1357 patients enrolled in a prospective cohort study and compared predictive ability of race-combined vs race-stratified models. Differential effect of predictors by race was assessed using predictor-race interactions in race-combined analyses. Warfarin dose was influenced by age, BSA, CKD, amiodarone use, and CYP2C9*3 and VKORC1 variants in both races, by CYP2C9*2 and CYP4F2 variants in European Americans, and by rs12777823 in African Americans. CYP2C9*2 was associated with a lower dose only among European Americans (20.6% vs 3.0%, P < .001) and rs12777823 only among African Americans (12.3% vs 2.3%, P = .006). Although VKORC1 was associated with dose decrease in both races, the proportional decrease was higher among European Americans (28.9% vs 19.9%, P = .003) compared with African Americans. Race-stratified analysis improved dose prediction in both race groups compared with race-combined analysis. We demonstrate that the effect of predictors on warfarin dose differs by race, which may explain divergent findings reported by recent warfarin pharmacogenetic trials. We recommend that warfarin dosing algorithms should be stratified by race rather than adjusted for race. © 2015 by The American Society of Hematology.

  14. A Swarm Optimization Genetic Algorithm Based on Quantum-Behaved Particle Swarm Optimization.

    PubMed

    Sun, Tao; Xu, Ming-Hai

    2017-01-01

    The quantum-behaved particle swarm optimization (QPSO) algorithm is a variant of traditional particle swarm optimization (PSO). QPSO, which was originally developed for continuous search spaces, outperforms traditional PSO in search ability. This paper analyzes the main factors that impact the search ability of QPSO and converts the particle movement formula to a mutation condition by introducing a rejection region, thus proposing a new binary algorithm, named swarm optimization genetic algorithm (SOGA) because it resembles a genetic algorithm (GA) more than PSO in form. Like GA, SOGA has crossover and mutation operators, but it does not require the crossover and mutation probabilities to be set, so it has fewer parameters to control. The proposed algorithm was tested with several nonlinear high-dimension functions in the binary search space, and the results were compared with those from BPSO, BQPSO, and GA. The experimental results show that SOGA is distinctly superior to the other three algorithms in terms of solution accuracy and convergence.

  15. Peak-to-average power ratio reduction in orthogonal frequency division multiplexing-based visible light communication systems using a modified partial transmit sequence technique

    NASA Astrophysics Data System (ADS)

    Liu, Yan; Deng, Honggui; Ren, Shuang; Tang, Chengying; Qian, Xuewen

    2018-01-01

    We propose an efficient partial transmit sequence technique based on a genetic algorithm and a peak-value optimization algorithm (GAPOA) to reduce the high peak-to-average power ratio (PAPR) in visible light communication systems based on orthogonal frequency division multiplexing (VLC-OFDM). By analyzing the pros and cons of the hill-climbing algorithm, we propose the POA, which has excellent local search ability, to further process the signals whose PAPR is still over the threshold after being processed by the genetic algorithm (GA). To verify the effectiveness of the proposed technique and algorithm, we evaluate the PAPR performance and the bit error rate (BER) performance and compare them with the partial transmit sequence (PTS) technique based on GA (GA-PTS), the PTS technique based on the genetic and hill-climbing algorithm (GH-PTS), and PTS based on the shuffled frog leaping algorithm and hill-climbing algorithm (SFLAHC-PTS). The results show that our technique and algorithm have not only better PAPR performance but also lower computational complexity and BER than the GA-PTS, GH-PTS, and SFLAHC-PTS techniques.
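
    As background for the quantities discussed here, the sketch below computes the PAPR of one OFDM symbol and performs a brute-force partial transmit sequence search over ±1 phase factors. It is not the GAPOA method: the exhaustive loop stands in for the search that a GA would replace, the adjacent-block partition is an assumption, and VLC-specific constraints such as real-valued non-negative signals are ignored.

```python
import cmath
import itertools
import math


def idft(X):
    """Inverse discrete Fourier transform of the frequency-domain symbol X."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def papr_db(x):
    """Peak-to-average power ratio of a non-zero time-domain signal, in dB."""
    power = [abs(v) ** 2 for v in x]
    return 10 * math.log10(max(power) / (sum(power) / len(power)))


def pts_exhaustive(X, n_blocks):
    """Try every +1/-1 phase vector over adjacent sub-blocks; return (phases, PAPR in dB)."""
    n = len(X)
    size = n // n_blocks  # assumes len(X) is divisible by n_blocks
    best = None
    for phases in itertools.product((1, -1), repeat=n_blocks):
        rotated = [X[k] * phases[k // size] for k in range(n)]
        value = papr_db(idft(rotated))
        if best is None or value < best[1]:
            best = (phases, value)
    return best
```

    The exhaustive search costs 2 to the power of the block count evaluations, which is why heuristic searches such as GA or hill climbing are used once the number of sub-blocks grows.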

  16. Fast optimization of glide vehicle reentry trajectory based on genetic algorithm

    NASA Astrophysics Data System (ADS)

    Jia, Jun; Dong, Ruixing; Yuan, Xuejun; Wang, Chuangwei

    2018-02-01

    An optimization method for reentry trajectories based on a genetic algorithm is presented to meet the need for reentry trajectory optimization of glide vehicles. The dynamic model of the glide vehicle during the reentry period is established. Considering constraints on heat flux, dynamic pressure, overload, etc., the optimization of the reentry trajectory is investigated using the genetic algorithm. The simulation shows that the presented method is effective for optimizing the reentry trajectory of a glide vehicle. The efficiency and speed of this method are comparable to those reported in the references. The optimization results meet all constraints, and fast on-line optimization is potentially achievable by pre-processing offline samples.

  17. On Directly Solving Schrödinger Equation for H₂⁺ Ion by Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Saha, Rajendra; Bhattacharyya, S. P.

    We seek to solve the Schrödinger equation (SE) directly for the ground state of the H₂⁺ ion by invoking a genetic algorithm (GA). In one approach the internuclear distance (R) is kept fixed, the corresponding electronic SE for H₂⁺ is solved by GA at each R and the full potential energy curve (PEC) is constructed. The minimum of the PEC is then located, giving Ve and Re. Alternatively, Ve and Re are located in a single run by allowing R to vary simultaneously while solving the electronic SE by genetic algorithm. The performance patterns of the two strategies are compared.

  18. Gene selection for microarray cancer classification using a new evolutionary method employing artificial intelligence concepts.

    PubMed

    Dashtban, M; Balafar, Mohammadali

    2017-03-01

    Gene selection is a demanding task for microarray data analysis. The diverse complexity of different cancers makes this issue still challenging. In this study, a novel evolutionary method based on genetic algorithms and artificial intelligence is proposed to identify predictive genes for cancer classification. A filter method was first applied to reduce the dimensionality of feature space followed by employing an integer-coded genetic algorithm with dynamic-length genotype, intelligent parameter settings, and modified operators. The algorithmic behaviors including convergence trends, mutation and crossover rate changes, and running time were studied, conceptually discussed, and shown to be coherent with literature findings. Two well-known filter methods, Laplacian and Fisher score, were examined considering similarities, the quality of selected genes, and their influences on the evolutionary approach. Several statistical tests concerning choice of classifier, choice of dataset, and choice of filter method were performed, and they revealed some significant differences between the performance of different classifiers and filter methods over datasets. The proposed method was benchmarked upon five popular high-dimensional cancer datasets; for each, top explored genes were reported. Comparing the experimental results with several state-of-the-art methods revealed that the proposed method outperforms previous methods in DLBCL dataset. Copyright © 2017 Elsevier Inc. All rights reserved.

  19. Use of Genetic Algorithms to solve Inverse Problems in Relativistic Hydrodynamics

    NASA Astrophysics Data System (ADS)

    Guzmán, F. S.; González, J. A.

    2018-04-01

    We present the use of Genetic Algorithms (GAs) as a strategy to solve inverse problems associated with models of relativistic hydrodynamics. The signal we consider to emulate an observation is the density of a relativistic gas, measured at a point where a shock is traveling. This shock is generated numerically out of a Riemann problem with mildly relativistic conditions. The inverse problem we propose is the prediction of the initial conditions of density, velocity and pressure of the Riemann problem that gave origin to that signal. For this we use the density, velocity and pressure of the gas at both sides of the discontinuity, as the six genes of an organism, initially with random values within a tolerance. We then prepare an initial population of N of these organisms and evolve them using methods based on GAs. In the end, the organism with the best fitness of each generation is compared to the signal and the process ends when the set of initial conditions of the organisms of a later generation fit the Signal within a tolerance.

  20. Characterization and prediction of the backscattered form function of an immersed cylindrical shell using hybrid fuzzy clustering and bio-inspired algorithms.

    PubMed

    Agounad, Said; Aassif, El Houcein; Khandouch, Younes; Maze, Gérard; Décultot, Dominique

    2018-02-01

    The acoustic scattering of a plane wave by an elastic cylindrical shell is studied. A new approach is developed to predict the form function of an immersed cylindrical shell of the radius ratio b/a ('b' is the inner radius and 'a' is the outer radius). The prediction of the backscattered form function is investigated by a combined approach between fuzzy clustering algorithms and bio-inspired algorithms. Four famous fuzzy clustering algorithms: the fuzzy c-means (FCM), the Gustafson-Kessel algorithm (GK), the fuzzy c-regression model (FCRM) and the Gath-Geva algorithm (GG) are combined with particle swarm optimization and genetic algorithm. The symmetric and antisymmetric circumferential waves A, S0, A1, S1 and S2 are investigated in a reduced frequency (k1a) range extends over 0.1

  1. A support vector regression-firefly algorithm-based model for limiting velocity prediction in sewer pipes.

    PubMed

    Ebtehaj, Isa; Bonakdari, Hossein

    2016-01-01

    Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this particular study, FFA is used to determine these SVM parameters. The actual effective parameters on Fr calculation are generally identified by employing dimensional analysis. The different dimensionless variables along with the models are introduced. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error = 0.116) compared with other methods.
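
    The two error measures quoted above can be sketched as follows. This is a minimal illustration that assumes the standard textbook definitions of mean absolute percentage error and root mean square error; the function names are hypothetical, and the abstract does not state which exact variants the authors used.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy values only; these are not data from the study.
observed = [2.0, 4.0]
estimated = [1.0, 5.0]
print(mape(observed, estimated), rmse(observed, estimated))
```

    MAPE is scale-free, whereas RMSE carries the units of the predicted quantity, which is why the abstract reports one as a percentage and the other as a plain number.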

  2. Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

    NASA Astrophysics Data System (ADS)

    Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

    2018-01-01

    The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernel was investigated for the first time. A partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content, which ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results, with a coefficient of determination of prediction (R2P) of 0.901, a root mean square error of prediction (RMSEP) of 0.108 and a residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is feasible for determination of the protein content in peanut kernels.

  3. Influenza detection and prediction algorithms: comparative accuracy trial in Östergötland county, Sweden, 2008-2012.

    PubMed

    Spreco, A; Eriksson, O; Dahlström, Ö; Timpka, T

    2017-07-01

    Methods for the detection of influenza epidemics and prediction of their progress have seldom been comparatively evaluated using prospective designs. This study aimed to perform a prospective comparative trial of algorithms for the detection and prediction of increased local influenza activity. Data on clinical influenza diagnoses recorded by physicians and syndromic data from a telenursing service were used. Five detection and three prediction algorithms previously evaluated in public health settings were calibrated and then evaluated over 3 years. When applied on diagnostic data, only detection using the Serfling regression method and prediction using the non-adaptive log-linear regression method showed acceptable performances during winter influenza seasons. For the syndromic data, none of the detection algorithms displayed a satisfactory performance, while non-adaptive log-linear regression was the best performing prediction method. We conclude that evidence was found that available algorithms for influenza detection and prediction display satisfactory performance when applied on local diagnostic data during winter influenza seasons. When applied on local syndromic data, the evaluated algorithms did not display consistent performance. Further evaluations and research on combination of methods of these types in public health information infrastructures for 'nowcasting' (integrated detection and prediction) of influenza activity are warranted.

  4. Comparing genetic algorithm and particle swarm optimization for solving capacitated vehicle routing problem

    NASA Astrophysics Data System (ADS)

    Iswari, T.; Asih, A. M. S.

    2018-04-01

    In a logistics system, transportation plays an important role in connecting every element of the supply chain, but it can also incur the greatest cost. It is therefore important to keep transportation costs as low as possible. One way to reduce transportation cost is to optimize the routing of vehicles, which is known as the Vehicle Routing Problem (VRP). The most common type of VRP is the Capacitated Vehicle Routing Problem (CVRP). In CVRP, each vehicle has its own capacity, and the total demand of the customers it serves should not exceed that capacity. CVRP belongs to the class of NP-hard problems, so exact algorithms become highly time-consuming as the problem size increases. Thus, for large-scale problem instances, as typically found in industrial applications, finding an optimal solution is not practicable. Therefore, this paper uses two metaheuristic approaches to solve CVRP: Genetic Algorithm and Particle Swarm Optimization. The paper compares the results and performance of the two algorithms. The results show that both algorithms perform well in solving CVRP but still need to be improved. In algorithm testing and a numerical example, Genetic Algorithm yields a better solution than Particle Swarm Optimization in terms of total distance travelled.
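
    The capacity constraint and the total-distance objective described above can be made concrete with a short sketch. It assumes Euclidean distances, a single depot with index 0, and identical vehicle capacities; the names `route_distance` and `evaluate` are hypothetical and do not come from the paper.

```python
import math

def route_distance(route, coords, depot=0):
    """Length of one vehicle route that starts and ends at the depot."""
    path = [depot, *route, depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(path, path[1:]))

def evaluate(routes, coords, demand, capacity):
    """Total distance of a CVRP solution, or None if any route exceeds capacity."""
    for route in routes:
        if sum(demand[c] for c in route) > capacity:
            return None  # infeasible: this vehicle is overloaded
    return sum(route_distance(route, coords) for route in routes)

# Toy instance (not from the paper): depot 0 and two customers on a line.
coords = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (6.0, 8.0)}
demand = {1: 4, 2: 5}
one_vehicle = evaluate([[1, 2]], coords, demand, capacity=10)
two_vehicles = evaluate([[1], [2]], coords, demand, capacity=5)
```

    A GA or PSO for CVRP would call an evaluation of this kind as its fitness function, either rejecting infeasible solutions as here or penalizing them.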

  5. A study on the performance comparison of metaheuristic algorithms on the learning of neural networks

    NASA Astrophysics Data System (ADS)

    Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline

    2017-08-01

    The learning or training process of neural networks entails the task of finding the most optimal set of parameters, which includes translation vectors, dilation parameter, synaptic weights, and bias terms. Apart from the traditional gradient descent-based methods, metaheuristic methods can also be used for this learning purpose. Since the inception of genetic algorithm half a century ago, the last decade witnessed the explosion of a variety of novel metaheuristic algorithms, such as harmony search algorithm, bat algorithm, and whale optimization algorithm. Despite the proof of the no free lunch theorem in the discipline of optimization, a survey in the literature of machine learning gives contrasting results. Some researchers report that certain metaheuristic algorithms are superior to the others, whereas some others argue that different metaheuristic algorithms give comparable performance. As such, this paper aims to investigate if a certain metaheuristic algorithm will outperform the other algorithms. In this work, three metaheuristic algorithms, namely genetic algorithms, particle swarm optimization, and harmony search algorithm are considered. The algorithms are incorporated in the learning of neural networks and their classification results on the benchmark UCI machine learning data sets are compared. It is found that all three metaheuristic algorithms give similar and comparable performance, as captured in the average overall classification accuracy. The results corroborate the findings reported in the works done by previous researchers. Several recommendations are given, which include the need of statistical analysis to verify the results and further theoretical works to support the obtained empirical results.

  6. Gene and translation initiation site prediction in metagenomic sequences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hyatt, Philip Douglas; LoCascio, Philip F; Hauser, Loren John

    2012-01-01

    Gene prediction in metagenomic sequences remains a difficult problem. Current sequencing technologies do not achieve sufficient coverage to assemble the individual genomes in a typical sample; consequently, sequencing runs produce a large number of short sequences whose exact origin is unknown. Since these sequences are usually smaller than the average length of a gene, algorithms must make predictions based on very little data. We present MetaProdigal, a metagenomic version of the gene prediction program Prodigal, that can identify genes in short, anonymous coding sequences with a high degree of accuracy. The novel value of the method consists of enhanced translation initiation site identification, ability to identify sequences that use alternate genetic codes and confidence values for each gene call. We compare the results of MetaProdigal with other methods and conclude with a discussion of future improvements.

  7. Comparing a Coevolutionary Genetic Algorithm for Multiobjective Optimization

    NASA Technical Reports Server (NTRS)

    Lohn, Jason D.; Kraus, William F.; Haith, Gary L.; Clancy, Daniel (Technical Monitor)

    2002-01-01

    We present results from a study comparing a recently developed coevolutionary genetic algorithm (CGA) against a set of evolutionary algorithms using a suite of multiobjective optimization benchmarks. The CGA embodies competitive coevolution and employs a simple, straightforward target population representation and fitness calculation based on developmental theory of learning. Because of these properties, setting up the additional population is trivial making implementation no more difficult than using a standard GA. Empirical results using a suite of two-objective test functions indicate that this CGA performs well at finding solutions on convex, nonconvex, discrete, and deceptive Pareto-optimal fronts, while giving respectable results on a nonuniform optimization. On a multimodal Pareto front, the CGA finds a solution that dominates solutions produced by eight other algorithms, yet the CGA has poor coverage across the Pareto front.
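
    The notion of one solution dominating another, which the comparison above relies on, can be sketched as follows. The sketch assumes that all objectives are minimized; the function names are hypothetical and are not taken from the report.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives minimized).

    a dominates b when it is no worse in every objective and strictly
    better in at least one.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

# Toy two-objective values (not from the study).
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
front = pareto_front(candidates)
```

    Dominance says nothing about how evenly solutions are spread, which is why an algorithm can find a dominating solution and still have poor coverage of the front.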

  8. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.
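
    The objective that such a search minimizes can be illustrated with a small sketch. It assumes the common 2D square-lattice form of the HP model, in which every pair of hydrophobic (H) residues that are lattice neighbours but not consecutive in the chain contributes -1 to the energy; the abstract does not state which lattice was used, and the function name `hp_energy` is hypothetical.

```python
def hp_energy(sequence, coords):
    """Energy of an HP-model conformation on a 2D square lattice.

    sequence: string of 'H' and 'P'; coords: one (x, y) lattice site per residue.
    Each non-consecutive H-H contact between adjacent sites counts as -1.
    """
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    index_at = {site: i for i, site in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so that each contact is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy

# Folding a four-residue chain into a square brings its two ends into contact.
folded = hp_energy("HPPH", [(0, 0), (1, 0), (1, 1), (0, 1)])
extended = hp_energy("HPPH", [(0, 0), (1, 0), (2, 0), (3, 0)])
```

    Lower energy corresponds to more buried H-H contacts, which matches the hydrophobic core mentioned in the abstract.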

  9. Combinatorial Multiobjective Optimization Using Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Crossley, William A.; Martin, Eric T.

    2002-01-01

    The research proposed in this document investigated multiobjective optimization approaches based upon the Genetic Algorithm (GA). Several versions of the GA have been adopted for multiobjective design, but, prior to this research, there had not been significant comparisons of the most popular strategies. The research effort first generalized the two-branch tournament genetic algorithm into an N-branch genetic algorithm, then the N-branch GA was compared with a version of the popular Multi-Objective Genetic Algorithm (MOGA). Because the genetic algorithm is well suited to combinatorial (mixed discrete / continuous) optimization problems, the GA can be used in the conceptual phase of design to combine selection (discrete variable) and sizing (continuous variable) tasks. Using a multiobjective formulation for the design of a 50-passenger aircraft to meet the competing objectives of minimizing takeoff gross weight and minimizing trip time, the GA generated a range of tradeoff designs that illustrate which aircraft features change from a low-weight, slow trip-time aircraft design to a heavy-weight, short trip-time aircraft design. Given the objective formulation and analysis methods used, the results of this study identify where turboprop-powered aircraft and turbofan-powered aircraft become more desirable for the 50 seat passenger application. This aircraft design application also begins to suggest how a combinatorial multiobjective optimization technique could be used to assist in the design of morphing aircraft.

  10. Monthly prediction of air temperature in Australia and New Zealand with machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Salcedo-Sanz, S.; Deo, R. C.; Carro-Calvo, L.; Saavedra-Moreno, B.

    2016-07-01

    Long-term air temperature prediction is of major importance in a large number of applications, including climate-related studies, energy, agricultural, or medical. This paper examines the performance of two Machine Learning algorithms (Support Vector Regression (SVR) and Multi-layer Perceptron (MLP)) in a problem of monthly mean air temperature prediction, from the previous measured values in observational stations of Australia and New Zealand, and climate indices of importance in the region. The performance of the two considered algorithms is discussed in the paper and compared to alternative approaches. The results indicate that the SVR algorithm is able to obtain the best prediction performance among all the algorithms compared in the paper. Moreover, the results obtained have shown that the mean absolute error made by the two algorithms considered is significantly larger for the last 20 years than in the previous decades, in what can be interpreted as a change in the relationship among the prediction variables involved in the training of the algorithms.

  11. Integrative genetic risk prediction using non-parametric empirical Bayes classification.

    PubMed

    Zhao, Sihai Dave

    2017-06-01

    Genetic risk prediction is an important component of individualized medicine, but prediction accuracies remain low for many complex diseases. A fundamental limitation is the sample sizes of the studies on which the prediction algorithms are trained. One way to increase the effective sample size is to integrate information from previously existing studies. However, it can be difficult to find existing data that examine the target disease of interest, especially if that disease is rare or poorly studied. Furthermore, individual-level genotype data from these auxiliary studies are typically difficult to obtain. This article proposes a new approach to integrative genetic risk prediction of complex diseases with binary phenotypes. It accommodates possible heterogeneity in the genetic etiologies of the target and auxiliary diseases using a tuning parameter-free non-parametric empirical Bayes procedure, and can be trained using only auxiliary summary statistics. Simulation studies show that the proposed method can provide superior predictive accuracy relative to non-integrative as well as integrative classifiers. The method is applied to a recent study of pediatric autoimmune diseases, where it substantially reduces prediction error for certain target/auxiliary disease combinations. The proposed method is implemented in the R package ssa. © 2016, The International Biometric Society.

  12. Binary Classification using Decision Tree based Genetic Programming and Its Application to Analysis of Bio-mass Data

    NASA Astrophysics Data System (ADS)

    To, Cuong; Pham, Tuan D.

    2010-01-01

    In machine learning, pattern recognition may be the most popular task. Identification of "similar" patterns is also very important in biology because, first, it is useful for prediction of patterns associated with disease, for example cancer tissue (normal or tumor); second, similarity or dissimilarity of the kinetic patterns is used to identify coordinately controlled genes or proteins involved in the same regulatory process; and third, similar genes (proteins) share similar functions. In this paper, we present an algorithm which uses genetic programming to create a decision tree for binary classification problems. The algorithm was applied to five real biological databases. Based on the results of comparisons with well-known methods, we see that the algorithm performs outstandingly in most cases.

  13. An application of traveling salesman problem using the improved genetic algorithm on android google maps

    NASA Astrophysics Data System (ADS)

    Narwadi, Teguh; Subiyanto

    2017-03-01

    The Travelling Salesman Problem (TSP) is one of the best-known NP-hard problems, which means that no exact algorithm is known that solves it in polynomial time. This paper presents a new variant of the genetic algorithm approach, combined with a local search technique, developed to solve the TSP. For the local search technique, an iterative hill climbing method is used. The system is implemented on the Android OS because Android is a mobile system now widely used around the world. It is also integrated with the Google API, which provides the geographical locations of and distances between the cities and displays the route. Experiments were carried out to test the behavior of the application. To test its effectiveness, the hybrid genetic algorithm (HGA) is compared with a simple GA on 5 samples of cities in Central Java, Indonesia, with different numbers of cities. According to the experimental results, the average solution of the HGA is better than that of the simple GA in 5 tests out of 5 (100%). The results show that the hybrid genetic algorithm outperforms the genetic algorithm, especially for problems of higher complexity.
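
    The general shape of a genetic algorithm hybridized with hill climbing can be sketched as below. This is only an illustration under stated assumptions, not the authors' implementation: it uses Euclidean distances instead of Google-provided road distances, order crossover, swap mutation, truncation selection, and 2-opt segment reversal as the hill-climbing move. All function names and parameter values are hypothetical.

```python
import math
import random

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def hill_climb(tour, pts):
    """Local search: reverse segments (2-opt) while that shortens the tour."""
    best, improved = tour[:], True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(cand, pts) < tour_length(best, pts) - 1e-12:
                    best, improved = cand, True
    return best

def order_crossover(p1, p2, rng):
    """Copy a slice from p1, then fill the gaps in the order cities appear in p2."""
    n = len(p1)
    a, b = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[a:b + 1] = p1[a:b + 1]
    kept = set(child[a:b + 1])
    fill = iter(c for c in p2 if c not in kept)
    return [c if c is not None else next(fill) for c in child]

def hybrid_ga(pts, pop_size=20, generations=30, mutation_rate=0.2, seed=0):
    rng = random.Random(seed)
    n = len(pts)
    pop = [rng.sample(range(n), n) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: tour_length(t, pts))
        parents = pop[:pop_size // 2]            # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            child = order_crossover(*rng.sample(parents, 2), rng)
            if rng.random() < mutation_rate:     # swap mutation
                i, j = rng.sample(range(n), 2)
                child[i], child[j] = child[j], child[i]
            children.append(hill_climb(child, pts))  # the hybrid step
        pop = parents + children
    return min(pop, key=lambda t: tour_length(t, pts))

square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
best = hybrid_ga(square)
```

    Dropping the `hill_climb` call turns this into a plain GA, which is the kind of baseline the abstract compares against.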

  14. Optimization of controlled release nanoparticle formulation of verapamil hydrochloride using artificial neural networks with genetic algorithm and response surface methodology.

    PubMed

    Li, Yongqiang; Abbaspour, Mohammadreza R; Grootendorst, Paul V; Rauth, Andrew M; Wu, Xiao Yu

    2015-08-01

    This study was performed to optimize the formulation of polymer-lipid hybrid nanoparticles (PLN) for the delivery of an ionic water-soluble drug, verapamil hydrochloride (VRP) and to investigate the roles of formulation factors. Modeling and optimization were conducted based on a spherical central composite design. Three formulation factors, i.e., weight ratio of drug to lipid (X1), and concentrations of Tween 80 (X2) and Pluronic F68 (X3), were chosen as independent variables. Drug loading efficiency (Y1) and mean particle size (Y2) of PLN were selected as dependent variables. The predictive performance of artificial neural networks (ANN) and the response surface methodology (RSM) were compared. As ANN was found to exhibit better recognition and generalization capability over RSM, multi-objective optimization of PLN was then conducted based upon the validated ANN models and continuous genetic algorithms (GA). The optimal PLN possess a high drug loading efficiency (92.4%, w/w) and a small mean particle size (∼100nm). The predicted response variables matched well with the observed results. The three formulation factors exhibited different effects on the properties of PLN. ANN in coordination with continuous GA represent an effective and efficient approach to optimize the PLN formulation of VRP with desired properties. Copyright © 2015 Elsevier B.V. All rights reserved.

  15. Hybrid algorithms for fuzzy reverse supply chain network design.

    PubMed

    Che, Z H; Chiang, Tzu-An; Kuo, Y C; Cui, Zhihua

    2014-01-01

    In consideration of capacity constraints, fuzzy defect ratio, and fuzzy transport loss ratio, this paper attempted to establish an optimized decision model for production planning and distribution of a multiphase, multiproduct reverse supply chain, which addresses defects returned to original manufacturers, and in addition, develops hybrid algorithms such as Particle Swarm Optimization-Genetic Algorithm (PSO-GA), Genetic Algorithm-Simulated Annealing (GA-SA), and Particle Swarm Optimization-Simulated Annealing (PSO-SA) for solving the optimized model. During a case study of a multi-phase, multi-product reverse supply chain network, this paper explained the suitability of the optimized decision model and the applicability of the algorithms. Finally, the hybrid algorithms showed excellent solving capability when compared with original GA and PSO methods.

  16. Hybrid Algorithms for Fuzzy Reverse Supply Chain Network Design

    PubMed Central

    Che, Z. H.; Chiang, Tzu-An; Kuo, Y. C.

    2014-01-01

    In consideration of capacity constraints, fuzzy defect ratio, and fuzzy transport loss ratio, this paper attempted to establish an optimized decision model for production planning and distribution of a multiphase, multiproduct reverse supply chain, which addresses defects returned to original manufacturers, and in addition, develops hybrid algorithms such as Particle Swarm Optimization-Genetic Algorithm (PSO-GA), Genetic Algorithm-Simulated Annealing (GA-SA), and Particle Swarm Optimization-Simulated Annealing (PSO-SA) for solving the optimized model. During a case study of a multi-phase, multi-product reverse supply chain network, this paper explained the suitability of the optimized decision model and the applicability of the algorithms. Finally, the hybrid algorithms showed excellent solving capability when compared with original GA and PSO methods. PMID:24892057

  17. Development of hybrid genetic-algorithm-based neural networks using regression trees for modeling air quality inside a public transportation bus.

    PubMed

    Kadiyala, Akhil; Kaur, Devinder; Kumar, Ashok

    2013-02-01

    The present study developed a novel approach to modeling indoor air quality (IAQ) of a public transportation bus by the development of hybrid genetic-algorithm-based neural networks (also known as evolutionary neural networks) with input variables optimized using regression trees, referred to as the GART approach. This study validated the applicability of the GART modeling approach in solving complex nonlinear systems by accurately predicting the monitored contaminants of carbon dioxide (CO2), carbon monoxide (CO), nitric oxide (NO), sulfur dioxide (SO2), 0.3-0.4 microm sized particle numbers, 0.4-0.5 microm sized particle numbers, particulate matter (PM) concentrations less than 1.0 microm (PM1.0), and PM concentrations less than 2.5 microm (PM2.5) inside a public transportation bus operating on 20% grade biodiesel in Toledo, OH. First, the important variables affecting each monitored in-bus contaminant were determined using regression trees. Second, the analysis of variance was used as a complementary sensitivity analysis to the regression tree results to determine a subset of statistically significant variables affecting each monitored in-bus contaminant. Finally, the identified subsets of statistically significant variables were used as inputs to develop three artificial neural network (ANN) models. The models developed were regression tree-based back-propagation network (BPN-RT), regression tree-based radial basis function network (RBFN-RT), and GART models. Performance measures were used to validate the predictive capacity of the developed IAQ models. The results from this approach were compared with the results obtained from using a theoretical approach and a generalized practicable approach to modeling IAQ that included the consideration of additional independent variables when developing the aforementioned ANN models. The hybrid GART models were able to capture the majority of the variance in the monitored in-bus contaminants. The genetic-algorithm-based neural network IAQ models outperformed the traditional ANN methods of the back-propagation and the radial basis function networks. The novelty of this research is the development of a novel approach to modeling vehicular indoor air quality by integration of the advanced methods of genetic algorithms, regression trees, and the analysis of variance for the monitored in-vehicle gaseous and particulate matter contaminants, and comparing the results obtained from using the developed approach with conventional artificial intelligence techniques of back propagation networks and radial basis function networks. This study validated the newly developed approach using holdout and threefold cross-validation methods. These results are of great interest to scientists, researchers, and the public in understanding the various aspects of modeling an indoor microenvironment. This methodology can easily be extended to other fields of study also.

  18. Molecular descriptor subset selection in theoretical peptide quantitative structure-retention relationship model development using nature-inspired optimization algorithms.

    PubMed

    Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz

    2015-10-06

    In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics.

  19. From synthetic coiled coils to functional proteins: automated design of a receptor for the calmodulin-binding domain of calcineurin.

    PubMed

    Ghirlanda, G; Lear, J D; Lombardi, A; DeGrado, W F

    1998-08-14

    A series of synthetic receptors capable of binding to the calmodulin-binding domain of calcineurin (CN393-414) was designed, synthesized and characterized. The design was accomplished by docking CN393-414 against a two-helix receptor, using an idealized three-stranded coiled coil as a starting geometry. The sequence of the receptor was chosen using a side-chain re-packing program, which employed a genetic algorithm to select potential binders from a total of 7.5 × 10^6 possible sequences. A total of 25 receptors were prepared, representing 13 sequences predicted by the algorithm as well as 12 related sequences that were not predicted. The receptors were characterized by CD spectroscopy, analytical ultracentrifugation, and binding assays. The receptors predicted by the algorithm bound CN393-414 with apparent dissociation constants ranging from 0.2 µM to >50 µM. Many of the receptors that were not predicted by the algorithm also bound to CN393-414. Methods to circumvent this problem and to improve the automated design of functional proteins are discussed. Copyright 1998 Academic Press

  20. Packing Boxes into Multiple Containers Using Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Menghani, Deepak; Guha, Anirban

    2016-07-01

    Container loading problems have been studied extensively in the literature and various analytical, heuristic and metaheuristic methods have been proposed. This paper presents two different variants of a genetic algorithm framework for the three-dimensional container loading problem for optimally loading boxes into multiple containers with constraints. The algorithms are designed so that it is easy to incorporate various constraints found in real life problems. The algorithms are tested on data of standard test cases from literature and are found to compare well with the benchmark algorithms in terms of utilization of containers. This, along with the ability to easily incorporate a wide range of practical constraints, makes them attractive for implementation in real life scenarios.

  1. An automated diagnosis system of liver disease using artificial immune and genetic algorithms.

    PubMed

    Liang, Chunlin; Peng, Lingxi

    2013-04-01

    The rise of health care costs is one of the world's most important problems. Disease prediction is also a vibrant research area. Researchers have approached this problem using various techniques such as support vector machines, artificial neural networks, etc. This study exploits the immune system's characteristics of learning and memory to solve the problem of liver disease diagnosis. The proposed system combines two methods, an artificial immune system and a genetic algorithm, to diagnose liver disease. The system architecture is based on the artificial immune system. The learning procedure of the system uses the genetic algorithm to intervene in the evolution of the antibody population. The experiments use two benchmark datasets acquired from the UCI machine learning repository. The obtained diagnosis accuracies are very promising compared with the other diagnosis systems in the literature. These results suggest that this system may be a useful automatic diagnosis tool for liver disease.

  2. Modeling Self-Healing of Concrete Using Hybrid Genetic Algorithm-Artificial Neural Network.

    PubMed

    Ramadan Suleiman, Ahmed; Nehdi, Moncef L

    2017-02-07

    This paper presents an approach to predicting the intrinsic self-healing in concrete using a hybrid genetic algorithm-artificial neural network (GA-ANN). A genetic algorithm was implemented in the network as a stochastic optimizing tool for the initial optimal weights and biases. This approach can assist the network in achieving a global optimum and avoid the possibility of the network getting trapped at local optima. The proposed model was trained and validated using an especially built database using various experimental studies retrieved from the open literature. The model inputs include the cement content, water-to-cement ratio (w/c), type and dosage of supplementary cementitious materials, bio-healing materials, and both expansive and crystalline additives. Self-healing indicated by means of crack width is the model output. The results showed that the proposed GA-ANN model is capable of capturing the complex effects of various self-healing agents (e.g., biochemical material, silica-based additive, expansive and crystalline components) on the self-healing performance in cement-based materials.

  3. Detecting REM sleep from the finger: an automatic REM sleep algorithm based on peripheral arterial tone (PAT) and actigraphy.

    PubMed

    Herscovici, Sarah; Pe'er, Avivit; Papyan, Surik; Lavie, Peretz

    2007-02-01

    Scoring of REM sleep based on polysomnographic recordings is a laborious and time-consuming process. The growing number of ambulatory devices designed for cost-effective home-based diagnostic sleep recordings necessitates the development of a reliable automatic REM sleep detection algorithm that is not based on the traditional trio of electroencephalographic, electrooculographic and electromyographic recordings. This paper presents an automatic REM detection algorithm based on the peripheral arterial tone (PAT) signal and actigraphy, which are recorded with an ambulatory wrist-worn device (Watch-PAT100). The PAT signal is a measure of the pulsatile volume changes at the fingertip reflecting sympathetic tone variations. The algorithm was developed using a training set of 30 patients recorded simultaneously with polysomnography and Watch-PAT100. Sleep records were divided into 5 min intervals and two time series were constructed from the PAT amplitudes and PAT-derived inter-pulse periods in each interval. A prediction function based on 16 features extracted from the above time series that determines the likelihood of detecting a REM epoch was developed. The coefficients of the prediction function were determined using a genetic algorithm (GA) optimizing process tuned to maximize a price function depending on the sensitivity, specificity and agreement of the algorithm in comparison with the gold standard of polysomnographic manual scoring. Based on a separate validation set of 30 patients, overall sensitivity, specificity and agreement of the automatic algorithm to identify standard 30 s epochs of REM sleep were 78%, 92%, 89%, respectively. Deploying this REM detection algorithm in a wrist-worn device could be very useful for unattended ambulatory sleep monitoring. The innovative method of optimization using a genetic algorithm has been proven to yield robust results in the validation set.
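
    The three quantities that enter the price function in this record have standard definitions, sketched below for paired per-epoch labels (True = REM). The abstract does not give the form of the price function itself, so the weighted sum in `price` is purely a hypothetical stand-in, as are the function names and default weights.

```python
def epoch_metrics(predicted, reference):
    """Return (sensitivity, specificity, agreement) for paired boolean labels.

    predicted: automatic REM/non-REM decision per epoch
    reference: manual polysomnographic score per epoch (the gold standard)
    """
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("label sequences must be non-empty and equally long")
    pairs = [(bool(p), bool(r)) for p, r in zip(predicted, reference)]
    tp = sum(p and r for p, r in pairs)              # REM scored as REM
    tn = sum((not p) and (not r) for p, r in pairs)  # non-REM scored as non-REM
    fp = sum(p and (not r) for p, r in pairs)        # non-REM scored as REM
    fn = sum((not p) and r for p, r in pairs)        # REM scored as non-REM
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    agreement = (tp + tn) / len(pairs)
    return sensitivity, specificity, agreement

def price(predicted, reference, weights=(1.0, 1.0, 1.0)):
    """Hypothetical price function: a weighted sum of the three metrics."""
    return sum(w * m for w, m in zip(weights, epoch_metrics(predicted, reference)))
```

    A GA tuning the coefficients of the prediction function would call something like `price` once per candidate coefficient vector and keep the candidates that score highest.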

  4. Design of artificial neural networks using a genetic algorithm to predict collection efficiency in venturi scrubbers.

    PubMed

    Taheri, Mahboobeh; Mohebbi, Ali

    2008-08-30

    In this study, a new approach for the auto-design of neural networks, based on a genetic algorithm (GA), has been used to predict collection efficiency in venturi scrubbers. The experimental input data, including particle diameter, throat gas velocity, liquid to gas flow rate ratio, throat hydraulic diameter, pressure drop across the venturi scrubber and collection efficiency as an output, have been used to create a GA-artificial neural network (ANN) model. The testing results from the model are in good agreement with the experimental data. Comparison of the results of the GA optimized ANN model with the results from the trial-and-error calibrated ANN model indicates that the GA-ANN model is more efficient. Finally, the effects of operating parameters such as liquid to gas flow rate ratio, throat gas velocity, and particle diameter on collection efficiency were determined.

  5. Optimization to the Culture Conditions for Phellinus Production with Regression Analysis and Gene-Set Based Genetic Algorithm

    PubMed Central

    Li, Zhongwei; Xin, Yuezhen; Wang, Xun; Sun, Beibei; Xia, Shengyu; Li, Hui

    2016-01-01

    Phellinus is a kind of fungus and is known as one of the elemental components in drugs to avoid cancers. With the purpose of finding optimized culture conditions for Phellinus production in the laboratory, many single-factor experiments were carried out and a large amount of experimental data was generated. In this work, we use the data collected from experiments for regression analysis, and then a mathematical model for predicting Phellinus production is obtained. Subsequently, a gene-set based genetic algorithm is developed to optimize the values of parameters involved in culture conditions, including inoculum size, pH value, initial liquid volume, temperature, seed age, fermentation time, and rotation speed. These optimized values of the parameters agree with biological experimental results, which indicates that our method has good predictability for culture conditions optimization. PMID:27610365

  6. Nuclear fuel management optimization using genetic algorithms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DeChaine, M.D.; Feltus, M.A.

    1995-07-01

    The code independent genetic algorithm reactor optimization (CIGARO) system has been developed to optimize nuclear reactor loading patterns. It uses genetic algorithms (GAs) and a code-independent interface, so any reactor physics code (e.g., CASMO-3/SIMULATE-3) can be used to evaluate the loading patterns. The system is compared to other GA-based loading pattern optimizers. Tests were carried out to maximize the beginning of cycle k_eff for a pressurized water reactor core loading with a penalty function to limit power peaking. The CIGARO system performed well, increasing the k_eff after lowering the peak power. Tests of a prototype parallel evaluation method showed the potential for a significant speedup.

  7. Application of neural networks with novel independent component analysis methodologies to a Prussian blue modified glassy carbon electrode array.

    PubMed

    Wang, Liang; Yang, Die; Fang, Cheng; Chen, Zuliang; Lesniewski, Peter J; Mallavarapu, Megharaj; Naidu, Ravendra

    2015-01-01

    Sodium potassium absorption ratio (SPAR) is an important measure of agricultural water quality, wherein four exchangeable cations (K(+), Na(+), Ca(2+) and Mg(2+)) should be simultaneously determined. An ISE-array is suitable for this application because of its simplicity, rapid response characteristics and lower cost. However, cross-interferences caused by the poor selectivity of ISEs need to be overcome using multivariate chemometric methods. In this paper, a solid contact ISE array, based on a Prussian blue modified glassy carbon electrode (PB-GCE), was applied with a novel chemometric strategy. One of the most popular independent component analysis (ICA) methods, the fast fixed-point algorithm for ICA (fastICA), was implemented by the genetic algorithm (geneticICA) to avoid the local maxima problem commonly observed with fastICA. This geneticICA can be implemented as a data preprocessing method to improve the prediction accuracy of the back-propagation neural network (BPNN). The ISE array system was validated using 20 real irrigation water samples from South Australia, and acceptable prediction accuracies were obtained. Copyright © 2014 Elsevier B.V. All rights reserved.

  8. Improving Brain Magnetic Resonance Image (MRI) Segmentation via a Novel Algorithm based on Genetic and Regional Growth

    PubMed Central

    A., Javadpour; A., Mohammadi

    2016-01-01

    Background: Given the importance of correct diagnosis in medical applications, various methods have been exploited for processing medical images so far. Segmentation is used to analyze anatomical structures in medical imaging. Objective: This study describes a new method for brain Magnetic Resonance Image (MRI) segmentation via a novel algorithm based on genetic and regional growth. Methods: Among medical imaging methods, brain MRI segmentation is important due to the high contrast of non-intrusive soft tissue and high spatial resolution. Size variations of brain tissues often accompany various diseases such as Alzheimer’s disease. As our knowledge about the relation between various brain diseases and deviation of brain anatomy increases, MRI segmentation is exploited as the first step in early diagnosis. In this paper, the regional growth method and automatic selection of initial points by a genetic algorithm are used to introduce a new method for MRI segmentation. Primary pixels and the similarity criterion are selected automatically by genetic algorithms to maximize the accuracy and validity of image segmentation. Results: By using genetic algorithms and defining the fixed function of image segmentation, the initial points for the algorithm were found. The proposed algorithm was applied to the images, and its results were compared with those of regional growth in which the initial points were selected manually. The results showed that the proposed algorithm could reduce segmentation error effectively. Conclusion: The study concluded that the proposed algorithm could reduce segmentation error effectively and help us to diagnose brain diseases. PMID:27672629

  9. Strain gage selection in loads equations using a genetic algorithm

    NASA Technical Reports Server (NTRS)

    1994-01-01

    Traditionally, structural loads are measured using strain gages. A loads calibration test must be done before loads can be accurately measured. In one measurement method, a series of point loads is applied to the structure, and loads equations are derived via the least squares curve fitting algorithm using the strain gage responses to the applied point loads. However, many research structures are highly instrumented with strain gages, and the number and selection of gages used in a loads equation can be problematic. This paper presents an improved technique using a genetic algorithm to choose the strain gages used in the loads equations. Also presented are a comparison of the genetic algorithm performance with the current T-value technique and a variant known as the Best Step-down technique. Examples are shown using aerospace vehicle wings of high and low aspect ratio. In addition, a significant limitation in the current methods is revealed. The genetic algorithm arrived at a comparable or superior set of gages with significantly less human effort, and could be applied in instances when the current methods could not.
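
    The least-squares step that this record builds on can be sketched as follows: given the gage responses to a series of applied point loads, fit a loads equation over a chosen subset of gages and report its residual error. This is an illustrative sketch only; the function name, the absence of an intercept term and the RMS error measure are assumptions, and the genetic algorithm of the paper would sit on top of such a routine, proposing candidate `gages` subsets and scoring each fit.

```python
import numpy as np

def fit_loads_equation(strains, loads, gages):
    """Least-squares fit of load = sum_i c_i * strain_i over the chosen gages.

    strains: (n_load_cases, n_gages) gage responses to the applied point loads
    loads:   (n_load_cases,) applied loads from the calibration test
    gages:   indices of the gages used in the loads equation
    Returns the coefficients and the RMS error of the fit.
    """
    A = np.asarray(strains, dtype=float)[:, list(gages)]
    b = np.asarray(loads, dtype=float)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    rms_error = float(np.sqrt(np.mean((A @ coef - b) ** 2)))
    return coef, rms_error
```

    Comparing the RMS error across subsets shows why the selection matters: a subset that omits a gage carrying real load information leaves a residual that no choice of coefficients can remove.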

  10. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes

    PubMed Central

    2013-01-01

    Motivation: Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. Results: We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. Availability: The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana. PMID:24564704

  11. Random forests on Hadoop for genome-wide association studies of multivariate neuroimaging phenotypes.

    PubMed

    Wang, Yue; Goh, Wilson; Wong, Limsoon; Montana, Giovanni

    2013-01-01

    Multivariate quantitative traits arise naturally in recent neuroimaging genetics studies, in which both structural and functional variability of the human brain is measured non-invasively through techniques such as magnetic resonance imaging (MRI). There is growing interest in detecting genetic variants associated with such multivariate traits, especially in genome-wide studies. Random forests (RFs) classifiers, which are ensembles of decision trees, are amongst the best performing machine learning algorithms and have been successfully employed for the prioritisation of genetic variants in case-control studies. RFs can also be applied to produce gene rankings in association studies with multivariate quantitative traits, and to estimate genetic similarities measures that are predictive of the trait. However, in studies involving hundreds of thousands of SNPs and high-dimensional traits, a very large ensemble of trees must be inferred from the data in order to obtain reliable rankings, which makes the application of these algorithms computationally prohibitive. We have developed a parallel version of the RF algorithm for regression and genetic similarity learning tasks in large-scale population genetic association studies involving multivariate traits, called PaRFR (Parallel Random Forest Regression). Our implementation takes advantage of the MapReduce programming model and is deployed on Hadoop, an open-source software framework that supports data-intensive distributed applications. Notable speed-ups are obtained by introducing a distance-based criterion for node splitting in the tree estimation process. PaRFR has been applied to a genome-wide association study on Alzheimer's disease (AD) in which the quantitative trait consists of a high-dimensional neuroimaging phenotype describing longitudinal changes in the human brain structure. 
PaRFR provides a ranking of SNPs associated to this trait, and produces pair-wise measures of genetic proximity that can be directly compared to pair-wise measures of phenotypic proximity. Several known AD-related variants have been identified, including APOE4 and TOMM40. We also present experimental evidence supporting the hypothesis of a linear relationship between the number of top-ranked mutated states, or frequent mutation patterns, and an indicator of disease severity. The Java codes are freely available at http://www2.imperial.ac.uk/~gmontana.

  12. Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D

    DOE PAGES

    Preciat Gonzalez, German A.; El Assal, Lemmer R. P.; Noronha, Alberto; ...

    2017-06-14

    The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.

  13. Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Preciat Gonzalez, German A.; El Assal, Lemmer R. P.; Noronha, Alberto

    The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.

  14. Comparative evaluation of atom mapping algorithms for balanced metabolic reactions: application to Recon 3D.

    PubMed

    Preciat Gonzalez, German A; El Assal, Lemmer R P; Noronha, Alberto; Thiele, Ines; Haraldsdóttir, Hulda S; Fleming, Ronan M T

    2017-06-14

    The mechanism of each chemical reaction in a metabolic network can be represented as a set of atom mappings, each of which relates an atom in a substrate metabolite to an atom of the same element in a product metabolite. Genome-scale metabolic network reconstructions typically represent biochemistry at the level of reaction stoichiometry. However, a more detailed representation at the underlying level of atom mappings opens the possibility for a broader range of biological, biomedical and biotechnological applications than with stoichiometry alone. Complete manual acquisition of atom mapping data for a genome-scale metabolic network is a laborious process. However, many algorithms exist to predict atom mappings. How do their predictions compare to each other and to manually curated atom mappings? For more than four thousand metabolic reactions in the latest human metabolic reconstruction, Recon 3D, we compared the atom mappings predicted by six atom mapping algorithms. We also compared these predictions to those obtained by manual curation of atom mappings for over five hundred reactions distributed among all top level Enzyme Commission number classes. Five of the evaluated algorithms had similarly high prediction accuracy of over 91% when compared to manually curated atom mapped reactions. On average, the accuracy of the prediction was highest for reactions catalysed by oxidoreductases and lowest for reactions catalysed by ligases. In addition to prediction accuracy, the algorithms were evaluated on their accessibility, their advanced features, such as the ability to identify equivalent atoms, and their ability to map hydrogen atoms. In addition to prediction accuracy, we found that software accessibility and advanced features were fundamental to the selection of an atom mapping algorithm in practice.

  15. Analysis of algorithms for predicting canopy fuel

    Treesearch

    Katharine L. Gray; Elizabeth Reinhardt

    2003-01-01

    We compared observed canopy fuel characteristics with those predicted by existing biomass algorithms. We specifically examined the accuracy of the biomass equations developed by Brown (1978). We used destructively sampled data obtained at 5 different study areas. We compared predicted and observed quantities of foliage and crown biomass for individual trees in our study...

  16. Discrete sequence prediction and its applications

    NASA Technical Reports Server (NTRS)

    Laird, Philip

    1992-01-01

    Learning from experience to predict sequences of discrete symbols is a fundamental problem in machine learning with many applications. We apply sequence prediction using a simple and practical sequence-prediction algorithm, called TDAG. The TDAG algorithm is first tested by comparing its performance with some common data compression algorithms. Then it is adapted to the detailed requirements of dynamic program optimization, with excellent results.

  17. Parameter optimization of the QUAL2K model for a multiple-reach river using an influence coefficient algorithm.

    PubMed

    Cho, Jae Heon; Ha, Sung Ryong

    2010-03-15

    An influence coefficient algorithm and a genetic algorithm (GA) were introduced to develop an automatic calibration model for QUAL2K, the latest version of the QUAL2E river and stream water-quality model. The influence coefficient algorithm was used for the parameter optimization in unsteady state, open channel flow. The GA, used in solving the optimization problem, is very simple and comprehensible yet still applicable to any complicated mathematical problem, where it can find the global-optimum solution quickly and effectively. The previously established model QUAL2Kw was used for the automatic calibration of the QUAL2K. The parameter-optimization method using the influence coefficient and genetic algorithm (POMIG) developed in this study and QUAL2Kw were each applied to the Gangneung Namdaecheon River, which has multiple reaches, and the results of the two models were compared. In the modeling, the river reach was divided into two parts based on considerations of the water quality and hydraulic characteristics. The calibration results by POMIG showed a good correspondence between the calculated and observed values for most of water-quality variables. In the application of POMIG and QUAL2Kw, relatively large errors were generated between the observed and predicted values in the case of the dissolved oxygen (DO) and chlorophyll-a (Chl-a) in the lowest part of the river; therefore, two weighting factors (1 and 5) were applied for DO and Chl-a in the lower river. The sums of the errors for DO and Chl-a with a weighting factor of 5 were slightly lower compared with the application of a factor of 1. However, with a weighting factor of 5 the sums of errors for other water-quality variables were slightly increased in comparison to the case with a factor of 1. Generally, the results of the POMIG were slightly better than those of the QUAL2Kw.

  18. Genetic algorithm applied to the selection of factors in principal component-artificial neural networks: application to QSAR study of calcium channel antagonist activity of 1,4-dihydropyridines (nifedipine analogous).

    PubMed

    Hemmateenejad, Bahram; Akhond, Morteza; Miri, Ramin; Shamsipur, Mojtaba

    2003-01-01

    A QSAR algorithm, principal component-genetic algorithm-artificial neural network (PC-GA-ANN), has been applied to a set of newly synthesized calcium channel blockers, which are of special interest because of their role in cardiac diseases. A data set of 124 1,4-dihydropyridines bearing different ester substituents at the C-3 and C-5 positions of the dihydropyridine ring and nitroimidazolyl, phenylimidazolyl, and methylsulfonylimidazolyl groups at the C-4 position with known Ca(2+) channel binding affinities was employed in this study. Ten different sets of descriptors (837 descriptors) were calculated for each molecule. The principal component analysis was used to compress the descriptor groups into principal components. The most significant descriptors of each set were selected and used as input for the ANN. The genetic algorithm (GA) was used for the selection of the best set of extracted principal components. A feed forward artificial neural network with a back-propagation of error algorithm was used to process the nonlinear relationship between the selected principal components and biological activity of the dihydropyridines. A comparison between PC-GA-ANN and routine PC-ANN shows that the first model yields better prediction ability.

  19. Predictive Feature Selection for Genetic Policy Search

    DTIC Science & Technology

    2014-05-22

    inverted pendulum balancing problem (Gomez and Miikkulainen, 1999), where the agent must learn a policy in a continuous state space using discrete...algorithms to automate the process of training and/or designing NNs, mitigate these drawbacks and allow NNs to be easily applied to RL domains (Sher, 2012...racing simulator and the double inverted pendulum balance environments. It also includes parameter settings for all algorithms included in the study

  20. Searching for discrimination rules in protease proteolytic cleavage activity using genetic programming with a min-max scoring function.

    PubMed

    Yang, Zheng Rong; Thomson, Rebecca; Hodgman, T Charles; Dry, Jonathan; Doyle, Austin K; Narayanan, Ajit; Wu, XiKun

    2003-11-01

    This paper presents an algorithm which is able to extract discriminant rules from oligopeptides for protease proteolytic cleavage activity prediction. The algorithm is developed using genetic programming. Three important components in the algorithm are a min-max scoring function, the reverse Polish notation (RPN) and the use of minimum description length. The min-max scoring function is developed using amino acid similarity matrices for measuring the similarity between an oligopeptide and a rule, which is a complex algebraic equation of amino acids rather than a simple pattern sequence. The Fisher ratio is then calculated on the scoring values using the class label associated with the oligopeptides. The discriminant ability of each rule can therefore be evaluated. The use of RPN makes the evolutionary operations simpler and therefore reduces the computational cost. To prevent overfitting, the concept of minimum description length is used to penalize over-complicated rules. A fitness function is therefore composed of the Fisher ratio and the use of minimum description length for an efficient evolutionary process. In the application to four protease datasets (Trypsin, Factor Xa, Hepatitis C Virus and HIV protease cleavage site prediction), our algorithm is superior to C5, a conventional method for deriving decision trees.
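
    Two ingredients named in this record can be sketched briefly. The abstract does not spell out its formulas, so the code below uses the common two-class Fisher ratio, (mean1 - mean2)^2 / (var1 + var2), and a generic stack-based evaluator for reverse Polish notation; the operator set, token names and function names are assumptions for illustration, not the authors' rule language.

```python
from operator import add, mul, sub
from statistics import fmean, pvariance

def fisher_ratio(scores, labels):
    """Two-class Fisher ratio of rule scores: (m1 - m0)^2 / (v1 + v0)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("both classes need at least one score")
    gap = (fmean(pos) - fmean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Perfectly tight classes: infinitely separable unless the means coincide.
        return float("inf") if gap > 0 else 0.0
    return gap / spread

# Assumed binary operators; the real rule grammar is not given in the abstract.
OPERATORS = {"+": add, "-": sub, "*": mul, "min": min, "max": max}

def eval_rpn(tokens, values):
    """Evaluate a rule stored as a flat RPN token list against named values."""
    stack = []
    for token in tokens:
        if token in OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[token](left, right))
        else:
            stack.append(values[token])
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack[0]
```

    Storing a rule as a flat token list is what makes evolutionary operators cheap: crossover and mutation act on list positions without any tree bookkeeping, although a real implementation still has to keep the resulting expressions well formed.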

  1. Multiple sequence alignment using multi-objective based bacterial foraging optimization algorithm.

    PubMed

    Rani, R Ranjani; Ramyachitra, D

    2016-12-01

    Multiple sequence alignment (MSA) is a widespread approach in computational biology and bioinformatics. MSA deals with how sequences of nucleotides and amino acids are aligned with a minimum number of gaps between them, which points to the functional, evolutionary and structural relationships among the sequences. Computing an MSA that is both accurate and statistically significant remains a challenging task. In this work, the Bacterial Foraging Optimization Algorithm was employed to align the biological sequences, which resulted in a non-dominated optimal solution. It employs multiple objectives: maximization of similarity, non-gap percentage and conserved blocks, and minimization of gap penalty. The BAliBASE 3.0 benchmark database was utilized to examine the proposed algorithm against other methods. In this paper, two algorithms have been proposed: Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC) and Bacterial Foraging Optimization Algorithm. It was found that Hybrid Genetic Algorithm with Artificial Bee Colony performed better than the existing optimization algorithms, but the conserved blocks were still not obtained using GA-ABC. BFO was then used for the alignment and the conserved blocks were obtained. The proposed Multi-Objective Bacterial Foraging Optimization Algorithm (MO-BFO) was compared with the widely used MSA methods Clustal Omega, Kalign, MUSCLE, MAFFT, Genetic Algorithm (GA), Ant Colony Optimization (ACO), Artificial Bee Colony (ABC), Particle Swarm Optimization (PSO) and Hybrid Genetic Algorithm with Artificial Bee Colony (GA-ABC). The final results show that the proposed MO-BFO algorithm yields better alignment than most widely used methods. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  2. Multi-objective evolutionary algorithms for fuzzy classification in survival prediction.

    PubMed

    Jiménez, Fernando; Sánchez, Gracia; Juárez, José M

    2014-03-01

    This paper presents a novel rule-based fuzzy classification methodology for survival/mortality prediction in severely burned patients. Due to the ethical aspects involved in this medical scenario, physicians tend not to accept a computer-based evaluation unless they understand why and how such a recommendation is given. Therefore, any fuzzy classifier model must be both accurate and interpretable. The proposed methodology is a three-step process: (1) multi-objective constrained optimization of a patient's data set, using Pareto-based elitist multi-objective evolutionary algorithms to maximize accuracy and minimize the complexity (number of rules) of classifiers, subject to interpretability constraints; this step produces a set of alternative (Pareto) classifiers; (2) linguistic labeling, which assigns a linguistic label to each fuzzy set of the classifiers; this step is essential to the interpretability of the classifiers; (3) decision making, whereby a classifier is chosen, if it is satisfactory, according to the preferences of the decision maker. If no classifier is satisfactory for the decision maker, the process starts again in step (1) with a different input parameter set. The performance of three multi-objective evolutionary algorithms, niched pre-selection multi-objective algorithm, elitist Pareto-based multi-objective evolutionary algorithm for diversity reinforcement (ENORA) and the non-dominated sorting genetic algorithm (NSGA-II), was tested using a patient's data set from an intensive care burn unit and a standard machine learning data set from a standard machine learning repository. The results are compared using the hypervolume multi-objective metric. In addition, the results have been compared with other non-evolutionary techniques and validated with a multi-objective cross-validation technique. Our proposal improves the classification rate obtained by other non-evolutionary techniques (decision trees, artificial neural networks, Naive Bayes, and case-based reasoning), obtaining with ENORA a classification rate of 0.9298, specificity of 0.9385, and sensitivity of 0.9364, with 14.2 interpretable fuzzy rules on average. Our proposal improves the accuracy and interpretability of the classifiers, compared with other non-evolutionary techniques. We also conclude that ENORA outperforms niched pre-selection and NSGA-II algorithms. Moreover, given that our multi-objective evolutionary methodology is non-combinational based on real parameter optimization, the time cost is significantly reduced compared with other evolutionary approaches existing in literature based on combinational optimization. Copyright © 2014 Elsevier B.V. All rights reserved.
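
    Step (1) of the methodology yields a set of Pareto classifiers that trade accuracy against the number of rules. The sketch below shows only the non-dominance filter behind that idea; it is not ENORA or NSGA-II, and the candidate (accuracy, rule count) pairs are invented for illustration.

```python
def dominates(a, b):
    """True if classifier a is at least as good as b on both objectives
    (higher accuracy, fewer rules) and strictly better on at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical classifiers as (accuracy, number_of_rules).
classifiers = [(0.93, 14), (0.90, 8), (0.88, 12), (0.93, 20), (0.85, 5)]
front = pareto_front(classifiers)
print(front)
```

    The decision maker in step (3) would then choose among the surviving classifiers, for example preferring fewer rules when interpretability matters most.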

  3. [Reconstruction of Vehicle-human Crash Accident and Injury Analysis Based on 3D Laser Scanning, Multi-rigid-body Reconstruction and Optimized Genetic Algorithm].

    PubMed

    Sun, J; Wang, T; Li, Z D; Shao, Y; Zhang, Z Y; Feng, H; Zou, D H; Chen, Y J

    2017-12-01

    To reconstruct a vehicle-bicycle-cyclist crash accident and analyse the injuries using 3D laser scanning technology, multi-rigid-body dynamics and optimized genetic algorithm, and to provide biomechanical basis for the forensic identification of death cause. The vehicle was measured by 3D laser scanning technology. The multi-rigid-body models of cyclist, bicycle and vehicle were developed based on the measurements. The value range of optimal variables was set. A multi-objective genetic algorithm and the nondominated sorting genetic algorithm were used to find the optimal solutions, which were compared to the record of the surveillance video around the accident scene. The reconstruction result of laser scanning on vehicle was satisfactory. In the optimal solutions found by optimization method of genetic algorithm, the dynamical behaviours of dummy, bicycle and vehicle corresponded to that recorded by the surveillance video. The injury parameters of dummy were consistent with the situation and position of the real injuries on the cyclist in accident. The motion status before accident, damage process by crash and mechanical analysis on the injury of the victim can be reconstructed using 3D laser scanning technology, multi-rigid-body dynamics and optimized genetic algorithm, which have application value in the identification of injury manner and analysis of death cause in traffic accidents. Copyright© by the Editorial Department of Journal of Forensic Medicine

  4. A learning-based autonomous driver: emulate human driver's intelligence in low-speed car following

    NASA Astrophysics Data System (ADS)

    Wei, Junqing; Dolan, John M.; Litkouhi, Bakhtiar

    2010-04-01

    In this paper, an offline learning mechanism based on the genetic algorithm is proposed for autonomous vehicles to emulate human driver behaviors. The autonomous driving ability is implemented based on a Prediction- and Cost function-Based algorithm (PCB). PCB is designed to emulate a human driver's decision process, which is modeled as traffic scenario prediction and evaluation. This paper focuses on using a learning algorithm to optimize PCB with very limited training data, so that PCB can predict and evaluate traffic scenarios similarly to human drivers. 80 seconds of human driving data was collected in low-speed (< 30 miles/h) car-following scenarios. In the low-speed car-following tests, PCB was able to perform more human-like car-following after learning. A more general 120 kilometer-long simulation showed that PCB performs robustly even in scenarios that are not part of the training set.

  5. Genetic algorithms

    NASA Technical Reports Server (NTRS)

    Wang, Lui; Bayer, Steven E.

    1991-01-01

    Genetic algorithms are mathematical, highly parallel, adaptive search procedures (i.e., problem solving methods) based loosely on the processes of natural genetics and Darwinian survival of the fittest. Basic genetic algorithms concepts are introduced, genetic algorithm applications are introduced, and results are presented from a project to develop a software tool that will enable the widespread use of genetic algorithm technology.
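
    For readers new to the technique, a minimal genetic algorithm can be sketched as below. This is a generic textbook loop (tournament selection, one-point crossover, bit-flip mutation, elitism) and not the software tool described in the report; the function name `run_ga`, all parameter values, and the bit-counting fitness are assumptions for illustration.

```python
import random


def run_ga(fitness, n_bits=20, pop_size=30, generations=40, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit strings; returns (best, initial_best_fitness)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    initial_best = max(fitness(ind) for ind in pop)

    def tournament():
        # Selection: the fitter of two randomly drawn individuals.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        children = [max(pop, key=fitness)[:]]
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randint(1, n_bits - 1)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = children

    return max(pop, key=fitness), initial_best


# Toy objective: count of 1-bits in the string.
best, initial_best = run_ga(sum)
print(sum(best), initial_best)
```

    Because the best individual is always copied into the next generation, the best fitness can never decrease from one generation to the next.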

  6. Construction of regulatory networks using expression time-series data of a genotyped population.

    PubMed

    Yeung, Ka Yee; Dombek, Kenneth M; Lo, Kenneth; Mittler, John E; Zhu, Jun; Schadt, Eric E; Bumgarner, Roger E; Raftery, Adrian E

    2011-11-29

    The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.

  7. A Study of Penalty Function Methods for Constraint Handling with Genetic Algorithm

    NASA Technical Reports Server (NTRS)

    Ortiz, Francisco

    2004-01-01

    COMETBOARDS (Comparative Evaluation Testbed of Optimization and Analysis Routines for Design of Structures) is a design optimization test bed that can evaluate the performance of several different optimization algorithms. A few of these optimization algorithms are the sequence of unconstrained minimization techniques (SUMT), sequential linear programming (SLP) and the sequential quadratic programming techniques (SQP). A genetic algorithm (GA) is a search technique that is based on the principles of natural selection or "survival of the fittest". Instead of using gradient information, the GA uses the objective function directly in the search. The GA searches the solution space by maintaining a population of potential solutions. Then, using evolving operations such as recombination, mutation and selection, the GA creates successive generations of solutions that will evolve and take on the positive characteristics of their parents and thus gradually approach optimal or near-optimal solutions. By using the objective function directly in the search, genetic algorithms can be effectively applied in non-convex, highly nonlinear, complex problems. The genetic algorithm is not guaranteed to find the global optimum, but it is less likely to get trapped at a local optimum than traditional gradient-based search methods when the objective function is not smooth and generally well behaved. The purpose of this research is to assist in the integration of the genetic algorithm (GA) into COMETBOARDS. COMETBOARDS casts the design of structures as a constrained nonlinear optimization problem. One method used to solve a constrained optimization problem with a GA is to convert the constrained optimization problem into an unconstrained optimization problem by developing a penalty function that penalizes infeasible solutions. There have been several suggested penalty functions in the literature, each with their own strengths and weaknesses. A statistical analysis of some suggested penalty functions is performed in this study. Also, a response surface approach to robust design is used to develop a new penalty function approach. This new penalty function approach is then compared with the other existing penalty functions.
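
    The conversion from a constrained to an unconstrained problem can be illustrated with a static quadratic penalty, one of the simplest forms found in the literature. It is not the new penalty function developed in the study; the function name, the `weight` value, and the toy problem below are assumptions for illustration.

```python
def penalized_objective(x, objective, constraints, weight=1000.0):
    """Static quadratic penalty for a minimization problem.

    Each constraint g is written so that g(x) <= 0 means "feasible";
    only positive values (violations) contribute to the penalty.
    """
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return objective(x) + weight * violation


def objective(x):
    # Toy objective: squared distance from the origin.
    return x[0] ** 2 + x[1] ** 2


def sum_at_least_one(x):
    # Constraint x0 + x1 >= 1, rewritten as 1 - x0 - x1 <= 0.
    return 1.0 - x[0] - x[1]


feasible = penalized_objective((0.5, 0.5), objective, [sum_at_least_one])
infeasible = penalized_objective((0.0, 0.0), objective, [sum_at_least_one])
print(feasible, infeasible)
```

    A GA can then rank individuals by the penalized value alone; the infeasible origin, which has the lowest raw objective, is ranked far below the feasible point.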

  8. An enhanced deterministic K-Means clustering algorithm for cancer subtype prediction from gene expression data.

    PubMed

    Nidheesh, N; Abdul Nazeer, K A; Ameer, P M

    2017-12-01

    Clustering algorithms with steps involving randomness usually give different results on different executions for the same dataset. This non-deterministic nature of algorithms such as the K-Means clustering algorithm limits their applicability in areas such as cancer subtype prediction using gene expression data. It is hard to sensibly compare the results of such algorithms with those of other algorithms. The non-deterministic nature of K-Means is due to its random selection of data points as initial centroids. We propose an improved, density based version of K-Means, which involves a novel and systematic method for selecting initial centroids. The key idea of the algorithm is to select data points which belong to dense regions and which are adequately separated in feature space as the initial centroids. We compared the proposed algorithm to a set of eleven widely used single clustering algorithms and a prominent ensemble clustering algorithm which is being used for cancer data classification, based on the performances on a set of datasets comprising ten cancer gene expression datasets. The proposed algorithm has shown better overall performance than the others. There is a pressing need in the Biomedical domain for simple, easy-to-use and more accurate Machine Learning tools for cancer subtype prediction. The proposed algorithm is simple, easy-to-use and gives stable results. Moreover, it provides comparatively better predictions of cancer subtypes from gene expression data. Copyright © 2017 Elsevier Ltd. All rights reserved.
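
    The key idea, choosing initial centroids from dense regions that are well separated, can be sketched as follows. The abstract does not specify how density or separation is measured, so the neighbor-count density, the `radius` and `min_sep` parameters, and the tie-breaking rule are all assumptions; this is not the authors' algorithm.

```python
from math import dist


def dense_separated_seeds(points, k, radius, min_sep):
    """Pick up to k initial centroids deterministically.

    Density of a point = number of points within `radius` of it (itself included).
    Points are visited from densest to sparsest (ties broken by input order) and
    accepted only if at least `min_sep` away from every seed chosen so far.
    """
    density = [sum(1 for q in points if dist(p, q) <= radius) for p in points]
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    seeds = []
    for i in order:
        if all(dist(points[i], s) >= min_sep for s in seeds):
            seeds.append(points[i])
        if len(seeds) == k:
            break
    return seeds


# Two tight groups plus one isolated point (made-up 2D data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 5)]
seeds = dense_separated_seeds(points, k=2, radius=1.5, min_sep=5.0)
print(seeds)
```

    With no random step involved, repeated runs on the same data return the same seeds, which is the property the abstract highlights.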

  9. Can machine-learning improve cardiovascular risk prediction using routine clinical data?

    PubMed Central

    Kai, Joe; Garibaldi, Jonathan M.; Qureshi, Nadeem

    2017-01-01

    Background: Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Methods: Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the ‘receiver operating curve’ (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). Findings: 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723–0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739–0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755–0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755–0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759–0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Conclusions: Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others. PMID:28376093

  10. Can machine-learning improve cardiovascular risk prediction using routine clinical data?

    PubMed

    Weng, Stephen F; Reps, Jenna; Kai, Joe; Garibaldi, Jonathan M; Qureshi, Nadeem

    2017-01-01

    Current approaches to predict cardiovascular risk fail to identify many people who would benefit from preventive treatment, while others receive unnecessary intervention. Machine-learning offers opportunity to improve accuracy by exploiting complex interactions between risk factors. We assessed whether machine-learning can improve cardiovascular risk prediction. Prospective cohort study using routine clinical data of 378,256 patients from UK family practices, free from cardiovascular disease at outset. Four machine-learning algorithms (random forest, logistic regression, gradient boosting machines, neural networks) were compared to an established algorithm (American College of Cardiology guidelines) to predict first cardiovascular event over 10-years. Predictive accuracy was assessed by area under the 'receiver operating curve' (AUC); and sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) to predict 7.5% cardiovascular risk (threshold for initiating statins). 24,970 incident cardiovascular events (6.6%) occurred. Compared to the established risk prediction algorithm (AUC 0.728, 95% CI 0.723-0.735), machine-learning algorithms improved prediction: random forest +1.7% (AUC 0.745, 95% CI 0.739-0.750), logistic regression +3.2% (AUC 0.760, 95% CI 0.755-0.766), gradient boosting +3.3% (AUC 0.761, 95% CI 0.755-0.766), neural networks +3.6% (AUC 0.764, 95% CI 0.759-0.769). The highest achieving (neural networks) algorithm predicted 4,998/7,404 cases (sensitivity 67.5%, PPV 18.4%) and 53,458/75,585 non-cases (specificity 70.7%, NPV 95.7%), correctly predicting 355 (+7.6%) more patients who developed cardiovascular disease compared to the established algorithm. Machine-learning significantly improves accuracy of cardiovascular risk prediction, increasing the number of patients identified who could benefit from preventive treatment, while avoiding unnecessary treatment of others.
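
    The four reported percentages for the neural network follow directly from the counts given in the abstract (4,998 of 7,404 cases and 53,458 of 75,585 non-cases correctly predicted). The short calculation below reproduces them; the function name is a hypothetical helper, not part of the study.

```python
def screening_metrics(tp, cases, tn, non_cases):
    """Derive sensitivity, specificity, PPV and NPV from the reported counts."""
    fn = cases - tp          # cases the model missed
    fp = non_cases - tn      # non-cases flagged as high risk
    return {
        "sensitivity": tp / cases,
        "specificity": tn / non_cases,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


metrics = screening_metrics(tp=4998, cases=7404, tn=53458, non_cases=75585)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

    The low PPV next to a high NPV reflects the low share of cases in the evaluated group: most people flagged at the 7.5% threshold do not go on to have an event.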

  11. Genetically Predicted Body Mass Index and Alzheimer’s Disease Related Phenotypes in Three Large Samples: Mendelian Randomization Analyses

    PubMed Central

    Mukherjee, Shubhabrata; Walter, Stefan; Kauwe, John S.K.; Saykin, Andrew J.; Bennett, David A.; Larson, Eric B.; Crane, Paul K.; Glymour, M. Maria

    2015-01-01

    Observational research shows that higher body mass index (BMI) increases Alzheimer’s disease (AD) risk, but it is unclear whether this association is causal. We applied genetic variants that predict BMI in Mendelian Randomization analyses, an approach that is not biased by reverse causation or confounding, to evaluate whether higher BMI increases AD risk. We evaluated individual level data from the AD Genetics Consortium (ADGC: 10,079 AD cases and 9,613 controls), the Health and Retirement Study (HRS: 8,403 participants with algorithm-predicted dementia status) and published associations from the Genetic and Environmental Risk for AD consortium (GERAD1: 3,177 AD cases and 7,277 controls). No evidence from individual SNPs or polygenic scores indicated BMI increased AD risk. Mendelian Randomization effect estimates per BMI point (95% confidence intervals) were: ADGC OR=0.95 (0.90, 1.01); HRS OR=1.00 (0.75, 1.32); GERAD1 OR=0.96 (0.87, 1.07). One subscore (cellular processes not otherwise specified) unexpectedly predicted lower AD risk. PMID:26079416

  12. Genetic Algorithms and Local Search

    NASA Technical Reports Server (NTRS)

    Whitley, Darrell

    1996-01-01

    The first part of this presentation is a tutorial level introduction to the principles of genetic search and models of simple genetic algorithms. The second half covers the combination of genetic algorithms with local search methods to produce hybrid genetic algorithms. Hybrid algorithms can be modeled within the existing theoretical framework developed for simple genetic algorithms. An application of a hybrid to geometric model matching is given. The hybrid algorithm yields results that improve on the current state-of-the-art for this problem.

  13. Integrating Genetic and Functional Genomic Data to Elucidate Common Disease Tra

    NASA Astrophysics Data System (ADS)

    Schadt, Eric

    2005-03-01

    The reconstruction of genetic networks in mammalian systems is one of the primary goals in biological research, especially as such reconstructions relate to elucidating not only common, polygenic human diseases, but living systems more generally. Here I present a statistical procedure for inferring causal relationships between gene expression traits and more classic clinical traits, including complex disease traits. This procedure has been generalized to the gene network reconstruction problem, where naturally occurring genetic variations in segregating mouse populations are used as a source of perturbations to elucidate tissue-specific gene networks. Differences in the extent of genetic control between genders and among four different tissues are highlighted. I also demonstrate that the networks derived from expression data in segregating mouse populations using the novel network reconstruction algorithm are able to capture causal associations between genes that result in increased predictive power, compared to more classically reconstructed networks derived from the same data. This approach to causal inference in large segregating mouse populations over multiple tissues not only elucidates fundamental aspects of transcriptional control, it also allows for the objective identification of key drivers of common human diseases.

  14. Evaluation of Residual Static Corrections by Hybrid Genetic Algorithm Steepest Ascent Autostatics Inversion. Application southern Algerian fields

    NASA Astrophysics Data System (ADS)

    Eladj, Said; Bansir, Fateh; Ouadfeul, Sid Ali

    2016-04-01

    The application of a genetic algorithm starts with an initial population of chromosomes representing a "model space". Chromosome chains are preferentially reproduced based on their fitness compared to the total population, so a good chromosome has a greater opportunity to produce offspring than other chromosomes in the population. The advantage of the HGA/SAA combination is the use of a global search approach on a large population of local maxima to significantly improve the performance of the method. To define the parameters of the Hybrid Genetic Algorithm Steepest Ascent Autostatics (HGA/SAA) job, we evaluated by testing, in the first "Steepest Ascent" stage, the optimal parameters related to the data used: (1) The number of hill-climbing iterations ("Number of hill climbing iteration") is equal to 40. This parameter defines the participation of the "SA" algorithm in this hybrid approach. (2) The minimum eigenvalue for SA = 0.8. This is linked to the quality of the data and the S/N ratio. To assess the performance of hybrid genetic algorithms in the inversion for estimating the residual static corrections, tests were performed to determine the number of generations of HGA/SAA. Using the values of residual static corrections already calculated by the "SAA and CSAA" approaches, learning proved very effective in building the cross-correlation table. To determine the optimal number of generations, we conducted a series of tests ranging from 10 to 200 generations. The application to real seismic data in southern Algeria allowed us to judge the performance and capacity of the inversion with this hybrid method "HGA/SAA". This experience clarified the influence of the quality of the corrections estimated from "SAA/CSAA" and the optimum number of generations of the hybrid genetic algorithm "HGA" required for satisfactory performance. Twenty (20) generations were enough to improve continuity and resolution of seismic horizons. This will allow us to achieve a more accurate structural interpretation. Key words: Hybrid Genetic Algorithm, number of generations, model space, local maxima, Number of hill climbing iteration, Minimum eigenvalue, cross-correlation table

  15. Streamflow Prediction based on Chaos Theory

    NASA Astrophysics Data System (ADS)

    Li, X.; Wang, X.; Babovic, V. M.

    2015-12-01

    Chaos theory is a popular method in hydrologic time series prediction. The local model (LM) based on this theory utilizes time-delay embedding to reconstruct the phase-space diagram. The efficacy of this method depends on the embedding parameters, i.e. embedding dimension, time lag, and nearest neighbor number. The optimal estimation of these parameters is thus critical to the application of the local model. However, these embedding parameters are conventionally estimated using Average Mutual Information (AMI) and False Nearest Neighbors (FNN) separately. This may lead to local optimization and thus limits prediction accuracy. Considering these limitations, this paper applies a local model combined with simulated annealing (SA) to find the global optimum of the embedding parameters. It is also compared with another global optimization approach, the Genetic Algorithm (GA). The proposed hybrid methods are applied to daily and monthly streamflow time series for examination. The results show that global optimization helps the local model provide more accurate prediction results than local optimization. The LM combined with SA shows more advantages in terms of computational efficiency. The scheme proposed here can also be applied to other fields such as prediction of hydro-climatic time series, error correction, etc.
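
    The three embedding parameters named above (embedding dimension, time lag, nearest neighbor number) appear directly in a basic local-model forecast. The sketch below is a generic nearest-neighbor predictor on a delay embedding, with a made-up periodic series; it does not include the SA or GA parameter search, and averaging the neighbors' successors is an assumption about the local model's form.

```python
from math import dist


def delay_embed(series, dim, lag):
    """Return delay vectors (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n = len(series) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("series too short for this dim and lag")
    return [tuple(series[i + j * lag] for j in range(dim)) for i in range(n)]


def local_predict(series, dim, lag, k):
    """One-step forecast: average the successors of the k nearest delay vectors."""
    vectors = delay_embed(series, dim, lag)
    query = vectors[-1]
    candidates = []
    for i, vec in enumerate(vectors[:-1]):
        successor = series[i + (dim - 1) * lag + 1]   # value following this vector
        candidates.append((dist(vec, query), successor))
    candidates.sort(key=lambda c: c[0])
    nearest = candidates[:k]
    return sum(s for _, s in nearest) / len(nearest)


# Made-up periodic "streamflow" series.
series = [0, 1, 2, 3] * 5
print(delay_embed([1, 2, 3, 4, 5], dim=2, lag=2))
print(local_predict(series, dim=2, lag=1, k=1))
```

    A global optimizer such as SA or a GA would treat `(dim, lag, k)` as the search variables and the forecast error on held-out data as the objective.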

  16. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types.

  17. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    DOE PAGES

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.; ...

    2017-02-13

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types.

  18. A Modified Decision Tree Algorithm Based on Genetic Algorithm for Mobile User Classification Problem

    PubMed Central

    Liu, Dong-sheng; Fan, Shu-jiang

    2014-01-01

    In order to offer mobile customers better service, we should first classify the mobile users. To address the limitations of previous classification methods, this paper puts forward a modified decision tree algorithm for mobile user classification, which introduces a genetic algorithm to optimize the results of the decision tree algorithm. We also take context information as a classification attribute for the mobile user, and we classify the context into public context and private context classes. Then we analyze the processes and operators of the algorithm. Finally, we run an experiment on mobile users with the algorithm: we can classify mobile users into Basic service user, E-service user, Plus service user, and Total service user classes, and we can also obtain some rules about the mobile user. Compared to the C4.5 decision tree algorithm and the SVM algorithm, the algorithm proposed in this paper has higher accuracy and greater simplicity. PMID:24688389

  19. Development of a Dynamic Operational Scheduling Algorithm for an Independent Micro-Grid with Renewable Energy

    NASA Astrophysics Data System (ADS)

    Obara, Shin'ya

    A micro-grid with the capacity for sustainable energy is expected to be a distributed energy system that exhibits quite a small environmental impact. In an independent micro-grid, “green energy,” which is typically thought of as unstable, can be utilized effectively by introducing a battery. In a past study, the production-of-electricity prediction algorithm (PAS) of the solar cell was developed. In PAS, a layered neural network is trained on past weather data, and the operation plan of the compound system of a solar cell and other energy systems was examined using this prediction algorithm. In this paper, a dynamic operational scheduling algorithm is developed using a neural network (PAS) and a genetic algorithm (GA) to provide predictions for solar cell power output. We also perform a case study analysis in which we use this algorithm to plan the operation of a system that connects nine houses in Sapporo to a micro-grid composed of power equipment and a polycrystalline silicon solar cell. In this work, the relationship between the accuracy of output prediction of the solar cell and the operation plan of the micro-grid was clarified. Moreover, we found that operating the micro-grid according to the plan derived with PAS was far superior, in terms of equipment hours of operation, to that using past average weather data.

  20. Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers.

    PubMed

    Chen, Peng; Li, Jinyan

    2010-05-17

    Prediction of long-range inter-residue contacts is an important topic in bioinformatics research. It is helpful for determining protein structures, understanding protein folding, and therefore advancing the annotation of protein functions. In this paper, we propose a novel ensemble of genetic algorithm classifiers (GaCs) to address the long-range contact prediction problem. Our method is based on a key idea called sequence profile centers (SPCs). Each SPC is the average of the sequence profiles of residue pairs belonging to the same contact class or non-contact class. GaCs train on multiple but different pairs of long-range contact data (positive data) and long-range non-contact data (negative data). The negative data sets, having roughly the same sizes as the positive ones, are constructed by random sampling over the original imbalanced negative data. As a result, about 21.5% of long-range contacts are correctly predicted. We also found that the ensemble of GaCs indeed improves accuracy by around 5.6% over the single GaC. Classifiers that use sequence profile centers may advance long-range contact prediction. In line with this approach, key structural features in proteins would be determined with high efficiency and accuracy.
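
    Two ingredients of the abstract, averaging profiles into a class center and undersampling the negative class to the size of the positive class, can be sketched as below. The nearest-center decision rule, the three-element toy profiles, and the helper names are assumptions; the genetic algorithm component of the GaCs is not reproduced here.

```python
import random
from math import dist


def profile_center(profiles):
    """Element-wise mean of equally sized profile vectors (one SPC)."""
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def balanced_negatives(negatives, n_positive, seed=0):
    """Randomly undersample the negative class to the positive class size."""
    return random.Random(seed).sample(negatives, n_positive)


def predict_contact(profile, contact_center, non_contact_center):
    """Assign the class whose center is closer (hypothetical decision rule)."""
    return dist(profile, contact_center) < dist(profile, non_contact_center)


# Made-up profiles for residue pairs.
contact_profiles = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]
all_non_contact = [[0.1, 0.2, 0.7], [0.3, 0.2, 0.5], [0.2, 0.1, 0.7], [0.2, 0.3, 0.5]]

sampled_non_contact = balanced_negatives(all_non_contact, len(contact_profiles))
contact_center = profile_center(contact_profiles)
non_contact_center = profile_center(sampled_non_contact)
print(predict_contact([0.65, 0.25, 0.10], contact_center, non_contact_center))
```

    Drawing a different negative sample per classifier (a different `seed` here) is one way an ensemble can see "multiple but different" training pairs.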

  1. In silico identification of genetic variants in glucocerebrosidase (GBA) gene involved in Gaucher's disease using multiple software tools.

    PubMed

    Manickam, Madhumathi; Ravanan, Palaniyandi; Singh, Pratibha; Talwar, Priti

    2014-01-01

    Gaucher's disease (GD) is an autosomal recessive disorder caused by the deficiency of glucocerebrosidase, a lysosomal enzyme that catalyses the hydrolysis of the glycolipid glucocerebroside to ceramide and glucose. Polymorphisms in the GBA gene have been associated with the development of Gaucher's disease. We hypothesize that prediction of SNPs using multiple state-of-the-art software tools will help increase confidence in the identification of SNPs involved in GD. Enzyme replacement therapy is the only option for GD. Our goal is to use several state-of-the-art SNP algorithms to predict/address harmful SNPs using comparative studies. In this study, seven different algorithms (SIFT, MutPred, nsSNP Analyzer, PANTHER, PMUT, PROVEAN, and SNPs&GO) were used to predict the harmful polymorphisms. Among the seven programs, SIFT found 47 nsSNPs to be deleterious and MutPred found 46 nsSNPs to be harmful. nsSNP Analyzer classified 43 of the 47 nsSNPs as disease-causing, PANTHER found 32 of 47 to be highly deleterious, PMUT classified 22 of 47 as pathological mutations, the PROVEAN server predicted 44 of 47 to be deleterious, and SNPs&GO flagged all 47 as disease-related mutations. Twenty-two nsSNPs were commonly predicted by all seven algorithms. The common 22 targeted mutations are F251L, C342G, W312C, P415R, R463C, D127V, A309V, G46E, G202E, P391L, Y363C, Y205C, W378C, I402T, S366R, F397S, Y418C, P401L, G195E, W184R, R48W, and T43R.

  2. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space.

    PubMed

    Akbar, Shahid; Hayat, Maqsood; Iqbal, Muhammad; Jan, Mian Ahmad

    2017-06-01

    Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side-effects on human cells. Due to the perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for identification of anticancer peptides. In this paper, an evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic Pseudo amino acid composition, g-Gap dipeptide composition, and Reduced amino acid alphabet composition. The performance of the extracted feature spaces is investigated separately, and the spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined using an optimized genetic algorithm and a simple majority technique in order to enhance the true classification rate. It is observed that genetic algorithm-based ensemble classification outperforms individual classifiers as well as the simple majority voting-based ensemble. The highest performance of genetic algorithm-based ensemble classification is reported on the hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, the 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. A multifactorial analysis of obesity as CVD risk factor: Use of neural network based methods in a nutrigenetics context

    PubMed Central

    2010-01-01

    Background: Obesity is a multifactorial trait, which comprises an independent risk factor for cardiovascular disease (CVD). The aim of the current work is to study the complex etiology beneath obesity and identify genetic variations and/or factors related to nutrition that contribute to its variability. To this end, a set of more than 2300 white subjects who participated in a nutrigenetics study was used. For each subject a total of 63 factors describing genetic variants related to CVD (24 in total), gender, and nutrition (38 in total), e.g. average daily intake in calories and cholesterol, were measured. Each subject was categorized according to body mass index (BMI) as normal (BMI ≤ 25) or overweight (BMI > 25). Two artificial neural network (ANN) based methods were designed and used for the analysis of the available data. These corresponded to i) a multi-layer feed-forward ANN combined with a parameter decreasing method (PDM-ANN), and ii) a multi-layer feed-forward ANN trained by a hybrid method (GA-ANN) which combines genetic algorithms and the popular back-propagation training algorithm. Results: PDM-ANN and GA-ANN were comparatively assessed in terms of their ability to identify the most important factors among the initial 63 variables describing genetic variations, nutrition and gender, able to classify a subject into one of the BMI related classes: normal and overweight. The methods were designed and evaluated using appropriate training and testing sets provided by 3-fold Cross Validation (3-CV) resampling. Classification accuracy, sensitivity, specificity and area under the receiver operating characteristic curve were used to evaluate the resulting predictive ANN models. The most parsimonious set of factors was obtained by the GA-ANN method and included gender, six genetic variations and 18 nutrition-related variables. The corresponding predictive model was characterized by a mean accuracy of 61.46% in the 3-CV testing sets.
Conclusions: The ANN based methods revealed factors that interactively contribute to the obesity trait and provided predictive models with a promising generalization ability. In general, results showed that ANNs and their hybrids can provide useful tools for the study of complex traits in the context of nutrigenetics. PMID:20825661

  4. Mechanobiological simulations of peri-acetabular bone ingrowth: a comparative analysis of cell-phenotype specific and phenomenological algorithms.

    PubMed

    Mukherjee, Kaushik; Gupta, Sanjay

    2017-03-01

    Several mechanobiology algorithms have been employed to simulate bone ingrowth around porous coated implants. However, there is a scarcity of quantitative comparison between the efficacies of commonly used mechanoregulatory algorithms. The objectives of this study are: (1) to predict peri-acetabular bone ingrowth using a cell-phenotype specific algorithm and to compare these predictions with those obtained using a phenomenological algorithm and (2) to investigate the influences of cellular parameters on bone ingrowth. The variation in host bone material property and interfacial micromotion of the implanted pelvis were mapped onto the microscale model of the implant-bone interface. An overall variation of 17-88% in peri-acetabular bone ingrowth was observed. Despite differences in predicted tissue differentiation patterns during the initial period, both algorithms predicted a similar spatial distribution of the neo-tissue layer after attainment of equilibrium. Results indicated that the phenomenological algorithm, being computationally faster than the cell-phenotype specific algorithm, might be used to predict peri-prosthetic bone ingrowth. The cell-phenotype specific algorithm, however, was found to be useful in numerically investigating the influence of alterations in cellular activities on bone ingrowth, owing to biologically related factors. Amongst the host of cellular activities, the matrix production rate of bone tissue was found to have the predominant influence on peri-acetabular bone ingrowth.

  5. Genetic algorithms for multicriteria shape optimization of induction furnace

    NASA Astrophysics Data System (ADS)

    Kůs, Pavel; Mach, František; Karban, Pavel; Doležel, Ivo

    2012-09-01

    In this contribution we deal with a multi-criteria shape optimization of an induction furnace. We want to find shape parameters of the furnace such that two different criteria are optimized. Since they cannot be optimized simultaneously, instead of one optimum we find a set of partially optimal designs, the so-called Pareto front. We compare two different approaches to the optimization, one using a nonlinear conjugate gradient method and the second using a variation of a genetic algorithm. As can be seen from the numerical results, the genetic algorithm seems to be the right choice for this problem. The direct problem (a coupled problem consisting of magnetic and heat fields) is solved using our own code, Agros2D. It uses finite elements of higher order, leading to a fast and accurate solution of a relatively complicated coupled problem. It also provides advanced scripting support, allowing us to prepare a parametric model of the furnace and easily incorporate various types of optimization algorithms.
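
    The Pareto front mentioned in the record above can be made concrete with a short sketch that filters a set of candidate designs down to the non-dominated ones. It assumes both criteria are to be minimized; the design tuples and the names `dominates` and `pareto_front` are hypothetical and unrelated to Agros2D.

```python
def dominates(a, b):
    """True if a is no worse than b in every criterion and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(designs):
    """Keep the designs that no other design dominates (minimization)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]

# Hypothetical (criterion_1, criterion_2) values for five furnace shapes.
designs = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (5.0, 2.0)]
front = pareto_front(designs)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

    This quadratic-time filter is fine for small candidate sets; multi-objective genetic algorithms usually apply a faster non-dominated sort to each generation.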

  6. Crystal Structure Predictions Using Adaptive Genetic Algorithm and Motif Search methods

    NASA Astrophysics Data System (ADS)

    Ho, K. M.; Wang, C. Z.; Zhao, X.; Wu, S.; Lyu, X.; Zhu, Z.; Nguyen, M. C.; Umemoto, K.; Wentzcovitch, R. M. M.

    2017-12-01

    Material informatics is a new initiative which has attracted a lot of attention in recent scientific research. The basic strategy is to construct comprehensive data sets and use machine learning to solve a wide variety of problems in material design and discovery. In pursuit of this goal, a key element is the quality and completeness of the databases used. Recent advances in the development of crystal structure prediction algorithms have made them a complementary and more efficient approach to exploring the structure/phase space in materials using computers. In this talk, we discuss the importance of structural motifs and motif-networks in crystal structure predictions. Correspondingly, powerful methods are developed to improve the sampling of the low-energy structure landscape.

  7. A Prediction Algorithm for Drug Response in Patients with Mesial Temporal Lobe Epilepsy Based on Clinical and Genetic Information

    PubMed Central

    Carvalho, Benilton S.; Bilevicius, Elizabeth; Alvim, Marina K. M.; Lopes-Cendes, Iscia

    2017-01-01

    Mesial temporal lobe epilepsy is the most common form of adult epilepsy in surgical series. Currently, the only characteristic used to predict poor response to clinical treatment in this syndrome is the presence of hippocampal sclerosis. Single nucleotide polymorphisms (SNPs) located in genes encoding drug transporter and metabolism proteins could influence response to therapy. Therefore, we aimed to evaluate whether combining information from clinical variables as well as SNPs in candidate genes could improve the accuracy of predicting response to drug therapy in patients with mesial temporal lobe epilepsy. For this, we divided 237 patients into two groups: 75 responsive and 162 refractory to antiepileptic drug therapy. We genotyped 119 SNPs in ABCB1, ABCC2, CYP1A1, CYP1A2, CYP1B1, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4, and CYP3A5 genes. We used 98 additional SNPs to evaluate population stratification. We assessed a first scenario using only clinical variables and a second one including SNP information. The random forests algorithm combined with leave-one-out cross-validation was used to identify the best predictive model in each scenario, and the accuracies of the two models were compared using the area under the curve statistic. Additionally, we built a variable importance plot to present the set of most relevant predictors in the best model. The selected best model included the presence of hippocampal sclerosis and 56 SNPs. Furthermore, including SNPs in the model improved accuracy from 0.4568 to 0.8177. Our findings suggest that adding genetic information provided by SNPs, located on drug transport and metabolism genes, can improve the accuracy for predicting which patients with mesial temporal lobe epilepsy are likely to be refractory to drug treatment, making it possible to identify patients who may benefit from epilepsy surgery sooner. PMID:28052106

  8. An Improved Cuckoo Search Optimization Algorithm for the Problem of Chaotic Systems Parameter Estimation

    PubMed Central

    Wang, Jun; Zhou, Bihua; Zhou, Shudao

    2016-01-01

    This paper proposes an improved cuckoo search (ICS) algorithm to establish the parameters of chaotic systems. In order to improve the optimization capability of the basic cuckoo search (CS) algorithm, orthogonal design and a simulated annealing operation are incorporated into the CS algorithm to enhance its exploitation search ability. The proposed algorithm is then used to establish parameters of the Lorenz chaotic system and the Chen chaotic system under noiseless and noisy conditions, respectively. The numerical results demonstrate that the algorithm can estimate parameters with high accuracy and reliability. Finally, the results are compared with those of the CS algorithm, genetic algorithm, and particle swarm optimization algorithm, and the comparison demonstrates that the method is energy-efficient and superior. PMID:26880874

  9. Depression comorbidity in spinocerebellar ataxia.

    PubMed

    Schmitz-Hübsch, Tanja; Coudert, Mathieu; Tezenas du Montcel, Sophie; Giunti, Paola; Labrum, Robyn; Dürr, Alexandra; Ribai, Pascale; Charles, Perrine; Linnemann, Christoph; Schöls, Ludger; Rakowicz, Maryla; Rola, Rafal; Zdzienicka, Elszbieta; Fancellu, Roberto; Mariotti, Caterina; Baliko, Lazlo; Melegh, Bela; Filla, Alessandro; Salvatore, Elena; van de Warrenburg, Bart P C; Szymanski, Sandra; Infante, Jon; Timmann, Dagmar; Boesch, Sylvia; Depondt, Chantal; Kang, Jun-Suk; Schulz, Jörg B; Klopstock, Thomas; Lossnitzer, Nicole; Löwe, Bernd; Frick, Caroline; Rottländer, Daniela; Schlaepfer, Thomas E; Klockgether, Thomas

    2011-04-01

    This is a description of the prevalence and profile of depressive symptoms in dominant spinocerebellar ataxia (SCA). Depressive symptoms were assessed in a convenience sample of 526 genetically confirmed and clinically affected patients (117 SCA1, 163 SCA2, 139 SCA3, and 107 SCA6) using the Patient Health Questionnaire (PHQ). In addition, depressive status according to the examiner and the use of antidepressants was recorded. Depression self-assessment was compared with an interview-based psychiatric assessment in a subset of 26 patients. Depression prevalence estimates were 17.1% according to the PHQ algorithm and 15.4% when assessed clinically. The sensitivity of clinical impression compared with PHQ classification was low (0.35), whereas diagnostic accuracy of PHQ compared with psychiatric interview in the subset was high. Antidepressants were used by 17.7% of patients and in >10% of patients without current clinically relevant depressive symptoms. Depression profile in SCA did not differ from a sample of patients with major depressive disorder except for the movement-related item. Neither depression prevalence nor use of antidepressants differed between genetic subtypes, with only sleep disturbance more common in SCA3. In a multivariate analysis, ataxia severity and female sex independently predicted depressive status in SCA. The PHQ algorithmic classification is appropriate for use in SCA but should stimulate further psychiatric evaluation if depression is indicated. Despite a higher risk for depression with more severe disease, the relation of depressive symptoms to SCA neurodegeneration remains to be shown. Copyright © 2011 Movement Disorder Society.

  10. Clustering for Binary Data Sets by Using Genetic Algorithm-Incremental K-means

    NASA Astrophysics Data System (ADS)

    Saharan, S.; Baragona, R.; Nor, M. E.; Salleh, R. M.; Asrah, N. M.

    2018-04-01

    This research was initially driven by the lack of clustering algorithms that specifically focus on binary data. To close this gap, a promising technique for analysing this type of data, namely Genetic Algorithms (GA), became the main subject of this research. For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster binary data streams. In GAIKM, the objective function was based on a few sufficient statistics that may be easily and quickly calculated on binary numbers. The implementation of IKM gives an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared to other clustering algorithms and to IKM itself. In conclusion, GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering, and paves the way for future research involving missing data and outliers.

  11. A High-Performance Genetic Algorithm: Using Traveling Salesman Problem as a Case

    PubMed Central

    Tsai, Chun-Wei; Tseng, Shih-Pang; Yang, Chu-Sing

    2014-01-01

    This paper presents a simple but efficient algorithm for reducing the computation time of genetic algorithm (GA) and its variants. The proposed algorithm is motivated by the observation that genes common to all the individuals of a GA have a high probability of surviving the evolution and ending up being part of the final solution; as such, they can be saved away to eliminate the redundant computations at the later generations of a GA. To evaluate the performance of the proposed algorithm, we use it not only to solve the traveling salesman problem but also to provide an extensive analysis on the impact it may have on the quality of the end result. Our experimental results indicate that the proposed algorithm can significantly reduce the computation time of GA and GA-based algorithms while limiting the degradation of the quality of the end result to a very small percentage compared to traditional GA. PMID:24892038
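
    The observation in the record above, that genes shared by every individual tend to survive into the final solution, can be sketched as a simple scan over the population. The position-wise comparison and the name `common_genes` are assumptions made for illustration; the abstract does not say how common genes are detected or stored for the traveling salesman problem.

```python
def common_genes(population):
    """Return {position: value} for positions where every individual holds the same gene."""
    first = population[0]
    return {
        i: gene
        for i, gene in enumerate(first)
        if all(individual[i] == gene for individual in population)
    }

# Hypothetical population of three tours over five cities.
population = [
    [0, 3, 1, 2, 4],
    [0, 3, 2, 1, 4],
    [0, 3, 1, 4, 2],
]
fixed = common_genes(population)  # {0: 0, 1: 3}
```

    Positions found this way could be frozen so that later generations only evaluate and recombine the remaining genes, which is where a saving in computation time would come from.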

  12. A high-performance genetic algorithm: using traveling salesman problem as a case.

    PubMed

    Tsai, Chun-Wei; Tseng, Shih-Pang; Chiang, Ming-Chao; Yang, Chu-Sing; Hong, Tzung-Pei

    2014-01-01

    This paper presents a simple but efficient algorithm for reducing the computation time of genetic algorithm (GA) and its variants. The proposed algorithm is motivated by the observation that genes common to all the individuals of a GA have a high probability of surviving the evolution and ending up being part of the final solution; as such, they can be saved away to eliminate the redundant computations at the later generations of a GA. To evaluate the performance of the proposed algorithm, we use it not only to solve the traveling salesman problem but also to provide an extensive analysis on the impact it may have on the quality of the end result. Our experimental results indicate that the proposed algorithm can significantly reduce the computation time of GA and GA-based algorithms while limiting the degradation of the quality of the end result to a very small percentage compared to traditional GA.

  13. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems

    PubMed Central

    Cao, Leilei; Xu, Lihong; Goodman, Erik D.

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared. PMID:27293421
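
    The guiding step described in the record above, crossing every individual with the current best one rather than a random partner, can be sketched as follows. This is not the authors' GEA: the uniform crossover, the fixed mutation probability `p_mut`, the Gaussian mutation width `sigma`, the greedy acceptance rule, and the sphere test function are all illustrative assumptions, and the dynamic probabilities and local search mentioned in the abstract are omitted.

```python
import random

def sphere(x):
    """Simple test objective to minimize; its global optimum is 0 at the origin."""
    return sum(v * v for v in x)

def guided_step(population, fitness, rng, p_mut=0.1, sigma=0.1):
    """One generation: cross each individual with the current best, mutate, keep if not worse."""
    best = min(population, key=fitness)
    next_population = []
    for individual in population:
        # Uniform crossover with the global best (assumed operator).
        child = [b if rng.random() < 0.5 else v for v, b in zip(individual, best)]
        # Gaussian mutation applied gene by gene with fixed probability.
        child = [v + rng.gauss(0.0, sigma) if rng.random() < p_mut else v for v in child]
        # Greedy acceptance: the child replaces its parent only if it is at least as good.
        next_population.append(child if fitness(child) <= fitness(individual) else individual)
    return next_population

rng = random.Random(0)
population = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(10)]
start = min(sphere(p) for p in population)
for _ in range(50):
    population = guided_step(population, sphere, rng)
end = min(sphere(p) for p in population)
```

    Because a child is accepted only when it is no worse than its parent, the best fitness in the population can never get worse from one generation to the next in this sketch.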

  14. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems.

    PubMed

    Cao, Leilei; Xu, Lihong; Goodman, Erik D

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared.

  15. Application of different spectrophotometric methods for simultaneous determination of elbasvir and grazoprevir in pharmaceutical preparation

    NASA Astrophysics Data System (ADS)

    Attia, Khalid A. M.; El-Abasawi, Nasr M.; El-Olemy, Ahmed; Abdelazim, Ahmed H.

    2018-01-01

    The first three UV spectrophotometric methods have been developed for the simultaneous determination of two new FDA-approved drugs, namely elbasvir and grazoprevir, in their combined pharmaceutical dosage form. These methods include simultaneous equation, and partial least squares with and without a variable selection procedure (genetic algorithm). For the simultaneous equation method, the absorbance values at 369 nm (λmax of elbasvir) and 253 nm (λmax of grazoprevir) have been selected for the formation of the two simultaneous equations required for the mathematical processing and quantitative analysis of the studied drugs. Alternatively, partial least squares with and without a variable selection procedure (genetic algorithm) has been applied to the spectral analysis, because the simultaneous inclusion of many wavelengths, rather than a single or dual wavelength, greatly increases the precision and predictive ability of the methods. The drugs have been successfully assayed in their pharmaceutical formulation by the proposed methods. A statistical comparison of the obtained results with those of the manufacturing methods has been performed. There was no significant difference between the proposed methods and the manufacturing one with respect to the validation parameters.

  16. Highly polygenic architecture of antidepressant treatment response: Comparative analysis of SSRI and NRI treatment in an animal model of depression.

    PubMed

    Malki, Karim; Tosto, Maria Grazia; Mouriño-Talín, Héctor; Rodríguez-Lorenzo, Sabela; Pain, Oliver; Jumhaboy, Irfan; Liu, Tina; Parpas, Panos; Newman, Stuart; Malykh, Artem; Carboni, Lucia; Uher, Rudolf; McGuffin, Peter; Schalkwyk, Leonard C; Bryson, Kevin; Herbster, Mark

    2017-04-01

    Response to antidepressant (AD) treatment may be a more polygenic trait than previously hypothesized, with many genetic variants interacting in yet unclear ways. In this study we used methods that can automatically learn to detect patterns of statistical regularity from a sparsely distributed signal across hippocampal transcriptome measurements in a large-scale animal pharmacogenomic study to uncover genomic variations associated with AD. The study used four inbred mouse strains of both sexes, two drug treatments, and a control group (escitalopram, nortriptyline, and saline). Multi-class and binary classification using Machine Learning (ML) and regularization algorithms using iterative and univariate feature selection methods, including InfoGain, mRMR, ANOVA, and Chi Square, were used to uncover genomic markers associated with AD response. Relevant genes were selected based on Jaccard distance and carried forward for gene-network analysis. Linear association methods uncovered only one gene associated with drug treatment response. The implementation of ML algorithms, together with feature reduction methods, revealed a set of 204 genes associated with SSRI and 241 genes associated with NRI response. Although only 10% of genes overlapped across the two drugs, network analysis shows that both drugs modulated the CREB pathway, through different molecular mechanisms. Through careful implementation and optimisations, the algorithms detected a weak signal used to predict whether an animal was treated with nortriptyline (77%) or escitalopram (67%) on an independent testing set. The results from this study indicate that the molecular signature of AD treatment may include a much broader range of genomic markers than previously hypothesized, suggesting that response to medication may be as complex as the pathology. The search for biomarkers of antidepressant treatment response could therefore consider a higher number of genetic markers and their interactions. 
Through predominately different molecular targets and mechanisms of action, the two drugs modulate the same Creb1 pathway which plays a key role in neurotrophic responses and in inflammatory processes. © 2016 The Authors. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics Published by Wiley Periodicals, Inc.

  17. Distributed query plan generation using multiobjective genetic algorithm.

    PubMed

    Panicker, Shina; Kumar, T V Vijay

    2014-01-01

    A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability.

  18. Distributed Query Plan Generation Using Multiobjective Genetic Algorithm

    PubMed Central

    Panicker, Shina; Vijay Kumar, T. V.

    2014-01-01

    A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability. PMID:24963513

  19. DOES GARP REALLY FAIL MISERABLY? A RESPONSE TO STOCKMAN ET AL. (2006)

    EPA Science Inventory

    Stockman et al. (2006) found that ecological niche models built using DesktopGARP 'failed miserably' to predict trapdoor spider (genus Promyrmekiaphila) distributions in California. This apparent failure of GARP (Genetic Algorithm for Rule-Set Production) was actually a failure ...

  20. Neural network-based run-to-run controller using exposure and resist thickness adjustment

    NASA Astrophysics Data System (ADS)

    Geary, Shane; Barry, Ronan

    2003-06-01

    This paper describes the development of a run-to-run control algorithm using a feedforward neural network, trained using the backpropagation training method. The algorithm is used to predict the critical dimension of the next lot using previous lot information. It is compared to a common prediction algorithm - the exponentially weighted moving average (EWMA) and is shown to give superior prediction performance in simulations. The manufacturing implementation of the final neural network showed significantly improved process capability when compared to the case where no run-to-run control was utilised.
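
    The exponentially weighted moving average (EWMA) baseline named in the record above can be written in a few lines. The sketch uses the standard EWMA recursion, estimate = lam * observation + (1 - lam) * previous estimate; the weight `lam`, the toy critical-dimension values, and the name `ewma_predict` are illustrative assumptions rather than details from the paper.

```python
def ewma_predict(observations, lam=0.3, initial=None):
    """Return one-step-ahead EWMA predictions and the final estimate.

    predictions[i] is the forecast for observations[i], made before seeing it.
    """
    estimate = observations[0] if initial is None else initial
    predictions = []
    for y in observations:
        predictions.append(estimate)
        # Blend the new observation into the running estimate.
        estimate = lam * y + (1.0 - lam) * estimate
    return predictions, estimate

# Hypothetical critical-dimension measurements for three consecutive lots.
lots = [100.0, 102.0, 101.0]
predictions, next_lot = ewma_predict(lots, lam=0.5)
# predictions == [100.0, 100.0, 101.0], next_lot == 101.0
```

    A larger `lam` reacts faster to process drift but passes more measurement noise into the next-lot prediction.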

  1. Tuning of Kalman filter parameters via genetic algorithm for state-of-charge estimation in battery management system.

    PubMed

    Ting, T O; Man, Ka Lok; Lim, Eng Gee; Leach, Mark

    2014-01-01

    In this work, a state-space battery model is derived mathematically to estimate the state-of-charge (SoC) of a battery system. Subsequently, Kalman filter (KF) is applied to predict the dynamical behavior of the battery model. Results show an accurate prediction as the accumulated error, in terms of root-mean-square (RMS), is a very small value. From this work, it is found that different sets of Q and R values (KF's parameters) can be applied for better performance and hence lower RMS error. This is the motivation for the application of a metaheuristic algorithm. Hence, the result is further improved by applying a genetic algorithm (GA) to tune Q and R parameters of the KF. In an online application, a GA can be applied to obtain the optimal parameters of the KF before its application to a real plant (system). This simply means that the instantaneous response of the KF is not affected by the time consuming GA as this approach is applied only once to obtain the optimal parameters. The relevant workable MATLAB source codes are given in the appendix to ease future work and analysis in this area.
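
    The offline tuning idea in the record above, running a genetic algorithm once to pick the Kalman filter's Q and R and then using the filter unchanged, can be sketched as follows. This is not the authors' MATLAB code or battery model: the scalar random-walk state model, the synthetic "state-of-charge" data, the GA operators, and the names `kalman_rms` and `tune_qr` are all assumptions made for illustration.

```python
import math
import random

def kalman_rms(q, r, measurements, truth):
    """Run a scalar random-walk Kalman filter and return the RMS error of its estimates."""
    x, p = measurements[0], 1.0
    squared_error = 0.0
    for z, t in zip(measurements, truth):
        p = p + q                # predict: state unchanged, variance grows by Q
        k = p / (p + r)          # Kalman gain
        x = x + k * (z - x)      # correct the estimate with measurement z
        p = (1.0 - k) * p
        squared_error += (x - t) ** 2
    return math.sqrt(squared_error / len(truth))

def tune_qr(measurements, truth, rng, pop_size=20, generations=30):
    """Tiny GA over (log10 Q, log10 R): elitism, averaging crossover, Gaussian mutation."""
    def cost(genes):
        return kalman_rms(10 ** genes[0], 10 ** genes[1], measurements, truth)

    population = [(rng.uniform(-6, 2), rng.uniform(-6, 2)) for _ in range(pop_size)]
    population[0] = (0.0, 0.0)   # seed with Q = R = 1 so the baseline is always a candidate
    for _ in range(generations):
        population.sort(key=cost)
        parents = population[: pop_size // 2]      # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(tuple((u + v) / 2 + rng.gauss(0, 0.3) for u, v in zip(a, b)))
        population = parents + children
    best = min(population, key=cost)
    return 10 ** best[0], 10 ** best[1], cost(best)

rng = random.Random(1)
truth = [0.9 - 0.001 * i for i in range(200)]            # slowly falling synthetic SoC
measurements = [t + rng.gauss(0, 0.02) for t in truth]   # noisy sensor readings
baseline = kalman_rms(1.0, 1.0, measurements, truth)
q, r, tuned = tune_qr(measurements, truth, rng)
```

    Since the better half of the population is carried over unchanged and the baseline pair is seeded into it, the tuned RMS error cannot exceed the baseline on the tuning data. The sketch also assumes the true state is known during tuning, which holds for simulated data but not for a real plant.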

  2. Tuning of Kalman Filter Parameters via Genetic Algorithm for State-of-Charge Estimation in Battery Management System

    PubMed Central

    Ting, T. O.; Lim, Eng Gee

    2014-01-01

    In this work, a state-space battery model is derived mathematically to estimate the state-of-charge (SoC) of a battery system. Subsequently, Kalman filter (KF) is applied to predict the dynamical behavior of the battery model. Results show an accurate prediction as the accumulated error, in terms of root-mean-square (RMS), is a very small value. From this work, it is found that different sets of Q and R values (KF's parameters) can be applied for better performance and hence lower RMS error. This is the motivation for the application of a metaheuristic algorithm. Hence, the result is further improved by applying a genetic algorithm (GA) to tune Q and R parameters of the KF. In an online application, a GA can be applied to obtain the optimal parameters of the KF before its application to a real plant (system). This simply means that the instantaneous response of the KF is not affected by the time consuming GA as this approach is applied only once to obtain the optimal parameters. The relevant workable MATLAB source codes are given in the appendix to ease future work and analysis in this area. PMID:25162041

  3. Hybrid modelling based on support vector regression with genetic algorithms in forecasting the cyanotoxins presence in the Trasona reservoir (Northern Spain).

    PubMed

    García Nieto, P J; Alonso Fernández, J R; de Cos Juez, F J; Sánchez Lasheras, F; Díaz Muñiz, C

    2013-04-01

    Cyanotoxins, poisonous substances produced by cyanobacteria, are responsible for health risks in drinking and recreational waters. As a result, anticipating their presence is important for preventing risks. The aim of this study is to use a hybrid approach based on support vector regression (SVR) in combination with genetic algorithms (GAs), known as a genetic algorithm support vector regression (GA-SVR) model, to forecast the presence of cyanotoxins in the Trasona reservoir (Northern Spain). The GA-SVR approach is aimed at highly nonlinear biological problems with sharp peaks, and the tests carried out proved its high performance. Some physical-chemical parameters have been considered along with the biological ones. The results obtained are two-fold. First, the significance of each biological and physical-chemical variable for the presence of cyanotoxins in the reservoir is successfully determined. Second, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

    PubMed Central

    Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy

    2014-01-01

    Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods for classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above-mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to exploit the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results, which included arrays of the top-ranked features, were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive to the existing state-of-the-art classification models. PMID:25419659
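
The three-step pipeline in this abstract (rank features, seed a genetic algorithm with the top-ranked arrays, classify on the selected features) can be sketched as follows. This is a rough illustration under assumptions: the data are synthetic, the classifier is a plain 1-nearest-neighbour rule scored by leave-one-out accuracy rather than the paper's modified k-NN or backpropagation network, and all function names are hypothetical.

```python
import random

def fisher_ratio(values, labels):
    """Two-class Fisher discriminant ratio of one feature: (m0 - m1)^2 / (v0 + v1)."""
    groups = [[v for v, c in zip(values, labels) if c == k] for k in (0, 1)]
    means = [sum(g) / len(g) for g in groups]
    variances = [sum((v - m) ** 2 for v in g) / len(g) for g, m in zip(groups, means)]
    return (means[0] - means[1]) ** 2 / (variances[0] + variances[1] + 1e-12)

def loo_accuracy(mask, X, y):
    """Leave-one-out accuracy of a plain 1-nearest-neighbour classifier on the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        dist = lambda k: sum((xi[j] - X[k][j]) ** 2 for j in cols)
        nearest = min((k for k in range(len(X)) if k != i), key=dist)
        hits += y[nearest] == y[i]
    return hits / len(X)

def ga_select(X, y, seeds, rng, pop_size=12, generations=15):
    """Bit-mask GA: one-point crossover, bit-flip mutation, best individual always kept."""
    n = len(X[0])
    cache = {}
    def fit(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = loo_accuracy(mask, X, y)
        return cache[key]
    pop = [list(s) for s in seeds]                 # initial population seeded by the ranking
    while len(pop) < pop_size:
        pop.append([rng.randint(0, 1) for _ in range(n)])
    for _ in range(generations):
        pop.sort(key=fit, reverse=True)
        nxt = [pop[0]]                             # elitism
        while len(nxt) < pop_size:
            a, b = rng.sample(pop[: pop_size // 2], 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]              # one-point crossover
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    best = max(pop, key=fit)
    return best, fit(best)

rng = random.Random(1)
y = [i % 2 for i in range(40)]
# Synthetic data: features 0 and 1 carry class signal, features 2-5 are pure noise.
X = [[rng.gauss(2.0 * c, 1.0), rng.gauss(-1.5 * c, 1.0)] + [rng.gauss(0, 1) for _ in range(4)] for c in y]
scores = [fisher_ratio([row[j] for row in X], y) for j in range(6)]
ranked = sorted(range(6), key=lambda j: scores[j], reverse=True)
seeds = [[1 if j in ranked[:k] else 0 for j in range(6)] for k in (1, 2, 3)]
best_mask, best_acc = ga_select(X, y, seeds, rng)
print("ranking:", ranked, "best mask:", best_mask, "LOO accuracy:", round(best_acc, 3))
```

Because the seeded masks enter the initial population and the best individual is always carried over, the GA result can never score below the best ranking-only mask.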

  5. New application of intelligent agents in sporadic amyotrophic lateral sclerosis identifies unexpected specific genetic background.

    PubMed

    Penco, Silvana; Buscema, Massimo; Patrosso, Maria Cristina; Marocchi, Alessandro; Grossi, Enzo

    2008-05-30

    Few genetic factors predisposing to the sporadic form of amyotrophic lateral sclerosis (ALS) have been identified, but the pathology itself seems to be a true multifactorial disease in which complex interactions between environmental and genetic susceptibility factors take place. The purpose of this study was to approach genetic data with an innovative statistical method such as artificial neural networks to identify a possible genetic background predisposing to the disease. A DNA multiarray panel was applied to genotype more than 60 polymorphisms within 35 genes selected from pathways of lipid and homocysteine metabolism, regulation of blood pressure, coagulation, inflammation, cellular adhesion and matrix integrity, in 54 sporadic ALS patients and 208 controls. Advanced intelligent systems based on novel coupling of artificial neural networks and evolutionary algorithms have been applied. The results obtained have been compared with those derived from the use of standard neural networks and classical statistical analysis. An unexpected discovery of a strong genetic background in sporadic ALS using a DNA multiarray panel and analytical processing of the data with advanced artificial neural networks was found. The predictive accuracy obtained with Linear Discriminant Analysis and Standard Artificial Neural Networks ranged from 70% to 79% (average 75.31%) and from 69.1% to 86.2% (average 76.6%), respectively. The corresponding value obtained with Advanced Intelligent Systems reached an average of 96.0% (range 94.4% to 97.6%). This latter approach allowed the identification of seven genetic variants essential to differentiate cases from controls: apolipoprotein E arg158cys; hepatic lipase -480 C/T; endothelial nitric oxide synthase 690 C/T and glu298asp; vitamin K-dependent coagulation factor seven arg353glu; glycoprotein Ia/IIa 873 G/A; and E-selectin ser128arg. This study provides an alternative and reliable method to approach complex diseases. Indeed, the application of a novel artificial intelligence-based method offers a new insight into genetic markers of sporadic ALS, pointing out the existence of a strong genetic background.

  6. A fast hybrid algorithm combining regularized motion tracking and predictive search for reducing the occurrence of large displacement errors.

    PubMed

    Jiang, Jingfeng; Hall, Timothy J

    2011-04-01

    A hybrid approach that inherits both the robustness of the regularized motion tracking approach and the efficiency of the predictive search approach is reported. The basic idea is to use regularized speckle tracking to obtain high-quality seeds in an explorative search that can be used in the subsequent intelligent predictive search. The performance of the hybrid speckle-tracking algorithm was compared with three published speckle-tracking methods using in vivo breast lesion data. We found that the hybrid algorithm provided higher displacement quality metric values, lower root mean squared errors compared with a locally smoothed displacement field, and higher improvement ratios compared with the classic block-matching algorithm. On the basis of these comparisons, we concluded that the hybrid method can further enhance the accuracy of speckle tracking compared with its real-time counterparts, at the expense of slightly higher computational demands. © 2011 IEEE

  7. Phase 2 development of Great Lakes algorithms for Nimbus-7 coastal zone color scanner

    NASA Technical Reports Server (NTRS)

    Tanis, Fred J.

    1984-01-01

    A series of experiments have been conducted in the Great Lakes designed to evaluate the application of the NIMBUS-7 Coastal Zone Color Scanner (CZCS). Atmospheric and water optical models were used to relate surface and subsurface measurements to satellite measured radiances. Absorption and scattering measurements were reduced to obtain a preliminary optical model for the Great Lakes. Algorithms were developed for geometric correction, correction for Rayleigh and aerosol path radiance, and prediction of chlorophyll-a pigment and suspended mineral concentrations. The atmospheric algorithm developed compared favorably with existing algorithms and was the only algorithm found to adequately predict the radiance variations in the 670 nm band. The atmospheric correction algorithm developed was designed to extract needed algorithm parameters from the CZCS radiance values. The Gordon/NOAA ocean algorithms could not be demonstrated to work for Great Lakes waters. Predicted values of chlorophyll-a concentration compared favorably with expected and measured data for several areas of the Great Lakes.

  8. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    PubMed

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to perform artificial intelligence models on a medical data sheet and compare to logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
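
The likelihood ratios quoted in this abstract follow from the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short Python sketch below recomputes them as a consistency check; the function name is illustrative, and small differences from the published figures are expected from rounding.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions in (0, 1)."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Sensitivity and specificity (%) as reported in the abstract.
reported = {
    "ANN": (94.9, 78.4),
    "GA rule 1": (67.6, 76.47),
    "GA rule 2": (56.8, 86.3),
    "Logistic regression": (95.5, 47.1),
}

ratios = {name: likelihood_ratios(se / 100, sp / 100) for name, (se, sp) in reported.items()}
for name, (lr_pos, lr_neg) in ratios.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```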

  9. A statistical framework for genetic association studies of power curves in bird flight

    PubMed Central

    Lin, Min; Zhao, Wei

    2006-01-01

    How the power required for bird flight varies as a function of forward speed can be used to predict the flight style and behavioral strategy of a bird for feeding and migration. A U-shaped curve was observed between the power and flight velocity in many birds, which is consistent with the theoretical prediction of aerodynamic models. In this article, we present a general genetic model for fine mapping of quantitative trait loci (QTL) responsible for power curves in a sample of birds drawn from a natural population. This model is developed within the maximum likelihood context, implemented with the EM algorithm for estimating the population genetic parameters of QTL and the simplex algorithm for estimating the QTL genotype-specific parameters of power curves. Using Monte Carlo simulation derived from empirical observations of power curves in the European starling (Sturnus vulgaris), we demonstrate how the underlying QTL for power curves can be detected from molecular markers and how the QTL detected affect the most appropriate flight speeds used to design an optimal migration strategy. The results from our model can be directly integrated into a conceptual framework for understanding flight origin and evolution. PMID:17066123

  10. Limitations and potentials of current motif discovery algorithms

    PubMed Central

    Hu, Jianjun; Li, Bin; Kihara, Daisuke

    2005-01-01

    Computational methods for de novo identification of gene regulation elements, such as transcription factor binding sites, have proved to be useful for deciphering genetic regulatory networks. However, despite the availability of a large number of algorithms, their strengths and weaknesses are not sufficiently understood. Here, we designed a comprehensive set of performance measures and benchmarked five modern sequence-based motif discovery algorithms using large datasets generated from Escherichia coli RegulonDB. Factors that affect the prediction accuracy, scalability and reliability are characterized. It is revealed that the nucleotide and the binding site level accuracy are very low, while the motif level accuracy is relatively high, which indicates that the algorithms can usually capture at least one correct motif in an input sequence. To exploit diverse predictions from multiple runs of one or more algorithms, a consensus ensemble algorithm has been developed, which achieved 6–45% improvement over the base algorithms by increasing both the sensitivity and specificity. Our study illustrates limitations and potentials of existing sequence-based motif discovery algorithms. Taking advantage of the revealed potentials, several promising directions for further improvements are discussed. Since the sequence-based algorithms are the baseline of most of the modern motif discovery algorithms, this paper suggests substantial improvements would be possible for them. PMID:16284194

  11. A systematic investigation of computation models for predicting Adverse Drug Reactions (ADRs).

    PubMed

    Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

    2014-01-01

    Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computational models to obtain general conclusions that can provide useful guidance for constructing more effective computational models to predict ADRs. In the current study, the main work is to compare and analyze the performance of existing computational methods to predict ADRs, by implementing and evaluating additional algorithms that have been used earlier for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and that the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. By comparing the structure of each algorithm, we found that the final formulas of these algorithms can all be converted to a linear model in form. Based on this finding, we propose a new algorithm called the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper. Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms.

  12. Optimal reentry prediction of space objects from LEO using RSM and GA

    NASA Astrophysics Data System (ADS)

    Mutyalarao, M.; Raj, M. Xavier James

    2012-07-01

    The accurate estimation of the orbital lifetime (OLT) of decaying near-Earth objects is of considerable importance for the prediction of risk object re-entry time and hazard assessment as well as for mitigation strategies. Recently, the re-entries of a large number of risk objects, which pose a threat to human life and property, have raised great concern in the space science community worldwide. The evolution of objects in Low Earth Orbit (LEO) is determined by a complex interplay of perturbing forces, mainly atmospheric drag and Earth's gravity. These orbits mostly have low eccentricity (eccentricity < 0.2) and show variations in perigee and apogee altitudes due to perturbations during a revolution. The changes in the perigee and apogee altitudes of these orbits are mainly due to the gravitational perturbations of the Earth and the atmospheric density. It has become necessary to use extremely complex force models to match present operational requirements and observational techniques. Further, the re-entry time of objects in such orbits is sensitive to the initial conditions. In this paper, the problem of predicting re-entry time is treated as an optimal estimation problem. It is known that, for observations based on two-line elements (TLEs), the errors are larger in eccentricity. Thus two parameters, initial eccentricity and ballistic coefficient, are chosen for optimal estimation. These two parameters are computed with the response surface method (RSM) using a genetic algorithm (GA) for the selected time zones, based on the rough linear variation of the response parameter, the mean semi-major axis during orbit evolution. Error minimization between the observed and predicted mean semi-major axis is achieved by applying an optimization algorithm such as a genetic algorithm (GA). The basic feature of the present approach is that the model and measurement errors are accounted for by adjusting the ballistic coefficient and eccentricity. The methodology is tested with the recently re-entered ROSAT and PHOBOS GRUNT satellites. The study reveals good agreement with the actual re-entry time of these objects. It is also observed that the absolute percentage error in the predicted re-entry time for both objects is very small. Keywords: low eccentric, Response surface method, Genetic algorithm, apogee altitude, Ballistic coefficient

  13. Optimization of Contrast Detection Power with Probabilistic Behavioral Information

    PubMed Central

    Cordes, Dietmar; Herzmann, Grit; Nandy, Rajesh; Curran, Tim

    2012-01-01

    Recent progress in the experimental design for event-related fMRI experiments made it possible to find the optimal stimulus sequence for maximum contrast detection power using a genetic algorithm. In this study, a novel algorithm is proposed for optimization of contrast detection power by including probabilistic behavioral information, based on pilot data, in the genetic algorithm. As a particular application, a recognition memory task is studied and the design matrix optimized for contrasts involving the familiarity of individual items (pictures of objects) and the recollection of qualitative information associated with the items (left/right orientation). Optimization of contrast efficiency is a complicated issue whenever subjects’ responses are not deterministic but probabilistic. Contrast efficiencies are not predictable unless behavioral responses are included in the design optimization. However, available software for design optimization does not include options for probabilistic behavioral constraints. If the anticipated behavioral responses are included in the optimization algorithm, the design is optimal for the assumed behavioral responses, and the resulting contrast efficiency is greater than what either a block design or a random design can achieve. Furthermore, improvements of contrast detection power depend strongly on the behavioral probabilities, the perceived randomness, and the contrast of interest. The present genetic algorithm can be applied to any case in which fMRI contrasts are dependent on probabilistic responses that can be estimated from pilot data. PMID:22326984

  14. An improved shuffled frog leaping algorithm based evolutionary framework for currency exchange rate prediction

    NASA Astrophysics Data System (ADS)

    Dash, Rajashree

    2017-11-01

    Forecasting the purchasing power of one currency with respect to another is always an interesting topic in the field of financial time series prediction. Despite the existence of several traditional and computational models for currency exchange rate forecasting, there is always a need for simpler and more efficient models with better prediction capability. In this paper, an evolutionary framework is proposed that uses an improved shuffled frog leaping (ISFL) algorithm with a computationally efficient functional link artificial neural network (CEFLANN) for prediction of currency exchange rates. The model is validated by observing the monthly prediction measures obtained for three currency exchange data sets, USD/CAD, USD/CHF, and USD/JPY, accumulated within the same period of time. The model performance is also compared with two other evolutionary learning techniques, the shuffled frog leaping algorithm and the particle swarm optimization algorithm. Practical analysis of the results suggests that the proposed model developed using the ISFL algorithm with the CEFLANN network is a promising predictor for currency exchange rates compared to the other models included in the study.

  15. A genetic-algorithm approach for assessing the liquefaction potential of sandy soils

    NASA Astrophysics Data System (ADS)

    Sen, G.; Akyol, E.

    2010-04-01

    The determination of liquefaction potential is required to take into account a large number of parameters, which creates a complex nonlinear structure of the liquefaction phenomenon. The conventional methods rely on simple statistical and empirical relations or charts. However, they cannot characterise these complexities. Genetic algorithms are suited to solve these types of problems. A genetic algorithm-based model has been developed to determine the liquefaction potential by confirming Cone Penetration Test datasets derived from case studies of sandy soils. Software has been developed that uses genetic algorithms for the parameter selection and assessment of liquefaction potential. Then several estimation functions for the assessment of a Liquefaction Index have been generated from the dataset. The generated Liquefaction Index estimation functions were evaluated by assessing the training and test data. The suggested formulation estimates the liquefaction occurrence with significant accuracy. Besides, the parametric study on the liquefaction index curves shows a good relation with the physical behaviour. The total number of misestimated cases was only 7.8% for the proposed method, which is quite low when compared to another commonly used method.

  16. Prediction of monthly rainfall in Victoria, Australia: Clusterwise linear regression approach

    NASA Astrophysics Data System (ADS)

    Bagirov, Adil M.; Mahmood, Arshad; Barton, Andrew

    2017-05-01

    This paper develops the Clusterwise Linear Regression (CLR) technique for prediction of monthly rainfall. The CLR is a combination of clustering and regression techniques. It is formulated as an optimization problem and an incremental algorithm is designed to solve it. The algorithm is applied to predict monthly rainfall in Victoria, Australia using rainfall data with five input meteorological variables over the period of 1889-2014 from eight geographically diverse weather stations. The prediction performance of the CLR method is evaluated by comparing observed and predicted rainfall values using four measures of forecast accuracy. The proposed method is also compared with the CLR using the maximum likelihood framework by the expectation-maximization algorithm, multiple linear regression, artificial neural networks and the support vector machines for regression models using computational results. The results demonstrate that the proposed algorithm outperforms other methods in most locations.

  17. A Comparison Study of Machine Learning Based Algorithms for Fatigue Crack Growth Calculation.

    PubMed

    Wang, Hongxun; Zhang, Weifang; Sun, Fuqiang; Zhang, Wei

    2017-05-18

    The relationships between the fatigue crack growth rate (da/dN) and stress intensity factor range (ΔK) are not always linear even in the Paris region. The stress ratio effects on fatigue crack growth rate are diverse in different materials. However, most existing fatigue crack growth models cannot handle these nonlinearities appropriately. The machine learning method provides a flexible approach to the modeling of fatigue crack growth because of its excellent nonlinear approximation and multivariable learning ability. In this paper, a fatigue crack growth calculation method is proposed based on three different machine learning algorithms (MLAs): extreme learning machine (ELM), radial basis function network (RBFN) and genetic algorithms optimized back propagation network (GABP). The MLA based method is validated using testing data of different materials. The three MLAs are compared with each other as well as the classical two-parameter model (K* approach). The results show that the predictions of MLAs are superior to those of K* approach in accuracy and effectiveness, and the ELM based algorithms show overall the best agreement with the experimental data out of the three MLAs, for its global optimization and extrapolation ability.

  18. Model predictive control of attitude maneuver of a geostationary flexible satellite based on genetic algorithm

    NASA Astrophysics Data System (ADS)

    TayyebTaher, M.; Esmaeilzadeh, S. Majid

    2017-07-01

    This article presents an application of a Model Predictive Controller (MPC) to the attitude control of a geostationary flexible satellite. A SIMO model of the geostationary satellite has been derived using the Lagrange equations. Flexibility is also included in the modelling equations. The state-space equations are expressed in order to simplify the controller. There is no specific tuning rule for finding the best parameters of an MPC controller that fit the desired controller. As an intelligent optimization method, a Genetic Algorithm has been used to optimize the performance of the MPC controller by tuning the controller parameters for minimum rise time, settling time, and overshoot of the target point of the flexible structure and its mode shape amplitudes, to make large attitude maneuvers possible. The model included the geosynchronous orbit environment and geostationary satellite parameters. The simulation results of the flexible satellite with attitude maneuver show the efficiency of the proposed optimization method in comparison with an LQR optimal controller.

  19. Spread of the Tiger: Global Risk of Invasion by the Mosquito Aedes albopictus

    PubMed Central

    BENEDICT, MARK Q.; LEVINE, REBECCA S.; HAWLEY, WILLIAM A.; LOUNIBOS, L. PHILIP

    2008-01-01

    Aedes albopictus, commonly known as the Asian tiger mosquito, is currently the most invasive mosquito in the world. It is of medical importance due to its aggressive daytime human-biting behavior and ability to vector many viruses, including dengue, LaCrosse, and West Nile. Invasions into new areas of its potential range are often initiated through the transportation of eggs via the international trade in used tires. We use a genetic algorithm, Genetic Algorithm for Rule Set Production (GARP), to determine the ecological niche of Ae. albopictus and predict a global ecological risk map for the continued spread of the species. We combine this analysis with risk due to importation of tires from infested countries and their proximity to countries that have already been invaded to develop a list of countries most at risk for future introductions and establishments. Methods used here have potential for predicting risks of future invasions of vectors or pathogens. PMID:17417960

  20. DEVELOPMENT AND PERFORMANCE OF TEXT-MINING ALGORITHMS TO EXTRACT SOCIOECONOMIC STATUS FROM DE-IDENTIFIED ELECTRONIC HEALTH RECORDS.

    PubMed

    Hollister, Brittany M; Restrepo, Nicole A; Farber-Eger, Eric; Crawford, Dana C; Aldrich, Melinda C; Non, Amy

    2017-01-01

    Socioeconomic status (SES) is a fundamental contributor to health, and a key factor underlying racial disparities in disease. However, SES data are rarely included in genetic studies due in part to the difficulty of collecting these data when studies were not originally designed for that purpose. The emergence of large clinic-based biobanks linked to electronic health records (EHRs) provides research access to large patient populations with longitudinal phenotype data captured in structured fields as billing codes, procedure codes, and prescriptions. SES data, however, are often not explicitly recorded in structured fields, but rather recorded in the free text of clinical notes and communications. The content and completeness of these data vary widely by practitioner. To enable gene-environment studies that consider SES as an exposure, we sought to extract SES variables from racial/ethnic minority adult patients (n=9,977) in BioVU, the Vanderbilt University Medical Center biorepository linked to de-identified EHRs. We developed several measures of SES using information available within the de-identified EHR, including broad categories of occupation, education, insurance status, and homelessness. Two hundred patients were randomly selected for manual review to develop a set of seven algorithms for extracting SES information from de-identified EHRs. The algorithms consist of 15 categories of information, with 830 unique search terms. SES data extracted from manual review of 50 randomly selected records were compared to data produced by the algorithm, resulting in positive predictive values of 80.0% (education), 85.4% (occupation), 87.5% (unemployment), 63.6% (retirement), 23.1% (uninsured), 81.8% (Medicaid), and 33.3% (homelessness), suggesting some categories of SES data are easier to extract in this EHR than others. The SES data extraction approach developed here will enable future EHR-based genetic studies to integrate SES information into statistical analyses. Ultimately, incorporation of measures of SES into genetic studies will help elucidate the impact of the social environment on disease risk and outcomes.

  1. A genetic-based algorithm for personalized resistance training

    PubMed Central

    Kiely, J; Suraci, B; Collins, DJ; de Lorenzo, D; Pickering, C; Grimaldi, KA

    2016-01-01

    Association studies have identified dozens of genetic variants linked to training responses and sport-related traits. However, no intervention studies utilizing the idea of personalised training based on an athlete's genetic profile have been conducted. Here we propose an algorithm that allows greater results to be achieved in response to high- or low-intensity resistance training programs by predicting an athlete's potential for the development of power and endurance qualities with a panel of 15 performance-associated gene polymorphisms. To develop and validate such an algorithm, we performed two studies in independent cohorts of male athletes (study 1: athletes from different sports (n = 28); study 2: soccer players (n = 39)). In both studies athletes completed an eight-week high- or low-intensity resistance training program, which either matched or mismatched their individual genotype. Two variables of explosive power and aerobic fitness, as measured by the countermovement jump (CMJ) and aerobic 3-min cycle test (Aero3), were assessed pre and post 8 weeks of resistance training. In study 1, the athletes from the matched groups (i.e. high-intensity trained with power genotype or low-intensity trained with endurance genotype) significantly increased results in CMJ (P = 0.0005) and Aero3 (P = 0.0004). In contrast, athletes from the mismatched group (i.e. high-intensity trained with endurance genotype or low-intensity trained with power genotype) demonstrated non-significant improvements in CMJ (P = 0.175) and less prominent results in Aero3 (P = 0.0134). In study 2, soccer players from the matched group also demonstrated significantly greater (P < 0.0001) performance changes in both tests compared to the mismatched group. Among non- or low responders of both studies, 82% of athletes (both for CMJ and Aero3) were from the mismatched group (P < 0.0001). Our results indicate that matching the individual's genotype with the appropriate training modality leads to more effective resistance training. The developed algorithm may be used to guide individualised resistance-training interventions. PMID:27274104

  2. A link prediction approach to cancer drug sensitivity prediction.

    PubMed

    Turki, Turki; Wei, Zhi

    2017-10-03

    Predicting the response to a drug for cancer disease patients based on genomic information is an important problem in modern clinical oncology. This problem occurs in part because many available drug sensitivity prediction algorithms do not consider better quality cancer cell lines and the adoption of new feature representations; both lead to the accurate prediction of drug responses. By predicting accurate drug responses to cancer, oncologists gain a more complete understanding of the effective treatments for each patient, which is a core goal in precision medicine. In this paper, we model cancer drug sensitivity as a link prediction, which is shown to be an effective technique. We evaluate our proposed link prediction algorithms and compare them with an existing drug sensitivity prediction approach based on clinical trial data. The experimental results based on the clinical trial data show the stability of our link prediction algorithms, which yield the highest area under the ROC curve (AUC) and are statistically significant. We propose a link prediction approach to obtain new feature representation. Compared with an existing approach, the results show that incorporating the new feature representation to the link prediction algorithms has significantly improved the performance.

  3. Learning Instance-Specific Predictive Models

    PubMed Central

    Visweswaran, Shyam; Cooper, Gregory F.

    2013-01-01

    This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including naive Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. PMID:25045325

  4. Problem solving with genetic algorithms and Splicer

    NASA Technical Reports Server (NTRS)

    Bayer, Steven E.; Wang, Lui

    1991-01-01

    Genetic algorithms are highly parallel, adaptive search procedures (i.e., problem-solving methods) loosely based on the processes of population genetics and Darwinian survival of the fittest. Genetic algorithms have proven useful in domains where other optimization techniques perform poorly. The main purpose of the paper is to discuss a NASA-sponsored software development project to develop a general-purpose tool for using genetic algorithms. The tool, called Splicer, can be used to solve a wide variety of optimization problems and is currently available from NASA and COSMIC. This discussion is preceded by an introduction to basic genetic algorithm concepts and a discussion of genetic algorithm applications.
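
    The basic genetic algorithm concepts that this record refers to (a population, selection, crossover, and mutation) can be illustrated with a minimal sketch. It is not Splicer and does not reproduce any tool described in these records; the function name `genetic_algorithm`, its parameters, the bit-string encoding, tournament selection, and elitism are all assumptions chosen for illustration.

```python
import random


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=40,
                      crossover_rate=0.9, mutation_rate=0.02, seed=0):
    """Maximize `fitness` over fixed-length bit strings (illustrative only)."""
    rng = random.Random(seed)
    # Initial population of random bit strings.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        elite = max(pop, key=fitness)
        history.append(fitness(elite))
        next_pop = [elite[:]]  # elitism: carry the best individual over unchanged
        while len(next_pop) < pop_size:
            # Selection: binary tournaments pick the two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            # Crossover: one-point recombination of the parents.
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Mutation: flip each bit with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history


# Usage: the "OneMax" toy problem, where fitness is the number of 1 bits.
best, history = genetic_algorithm(sum)
print(sum(best), history[0])
```

    Because the best individual is copied into each new generation, the recorded best fitness never decreases; real applications would replace the bit-string encoding and the fitness function with problem-specific ones.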

  5. A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm

    PubMed Central

    Chen, Jui-Le; Yang, Chu-Sing

    2013-01-01

    Predicting druggability for a particular disease by integrating biological and computer science technologies has seen success in recent years. Although computer science technologies can reduce the costs of pharmaceutical research, the computation time of structure-based protein-ligand docking prediction remains unsatisfactory. Hence, in this paper, a novel docking prediction algorithm, named the fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate docking prediction. The proposed algorithm works by leveraging two high-performance operators: (1) the novel migration (information exchange) operator is designed specially for cloud-based environments to reduce the computation time; (2) the efficient operator is aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both the computation time and the quality of the end result. PMID:23762864

  6. Data Sufficiency Assessment and Pumping Test Design for Groundwater Prediction Using Decision Theory and Genetic Algorithms

    NASA Astrophysics Data System (ADS)

    McPhee, J.; William, Y. W.

    2005-12-01

    This work presents a methodology for pumping test design based on the reliability requirements of a groundwater model. Reliability requirements take into consideration the application of the model results in groundwater management, expressed in this case as a multiobjective management model. The pumping test design is formulated as a mixed-integer nonlinear programming (MINLP) problem and solved using a combination of genetic algorithm (GA) and gradient-based optimization. Bayesian decision theory provides a formal framework for assessing the influence of parameter uncertainty over the reliability of the proposed pumping test. The proposed methodology is useful for selecting a robust design that will outperform all other candidate designs under most potential 'true' states of the system.

  7. The gradient boosting algorithm and random boosting for genome-assisted evaluation in large data sets.

    PubMed

    González-Recio, O; Jiménez-Montero, J A; Alenda, R

    2013-01-01

    In the next few years, with the advent of high-density single nucleotide polymorphism (SNP) arrays and genome sequencing, genomic evaluation methods will need to deal with a large number of genetic variants and an increasing sample size. The boosting algorithm is a machine-learning technique that may alleviate the drawbacks of dealing with such large data sets. This algorithm combines different predictors in a sequential manner with some shrinkage on them; each predictor is applied consecutively to the residuals from the committee formed by the previous ones to form a final prediction based on a subset of covariates. Here, a detailed description is provided and examples using a toy data set are included. A modification of the algorithm called "random boosting" was proposed to increase predictive ability and decrease computation time of genome-assisted evaluation in large data sets. Random boosting uses a random selection of markers to add a subsequent weak learner to the predictive model. These modifications were applied to a real data set composed of 1,797 bulls genotyped for 39,714 SNP. Deregressed proofs of 4 yield traits and 1 type trait from January 2009 routine evaluations were used as dependent variables. A 2-fold cross-validation scenario was implemented. Sires born before 2005 were used as a training sample (1,576 and 1,562 for production and type traits, respectively), whereas younger sires were used as a testing sample to evaluate predictive ability of the algorithm on yet-to-be-observed phenotypes. Comparison with the original algorithm was provided. The predictive ability of the algorithm was measured as Pearson correlations between observed and predicted responses. Further, estimated bias was computed as the average difference between observed and predicted phenotypes. 
The results showed that the modification of the original boosting algorithm could be run in 1% of the time used with the original algorithm and with negligible differences in accuracy and bias. This modification may be used to speed up the computation of genome-assisted evaluation in large data sets such as those obtained from consortiums. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
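
    The sequential scheme this abstract describes (each weak learner is fitted to the residuals of the current committee, shrunk, and, in random boosting, chosen from a random selection of markers) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the single-marker least-squares weak learner, the function names, and the default values of `n_rounds`, `shrinkage`, and `subset_frac` are all hypothetical. The `pearson` and `bias` helpers follow the two evaluation measures named in the abstract.

```python
import random


def fit_marker(col, residuals):
    """Least-squares fit of the residuals on one centered marker column.

    Returns (column mean, slope, reduction in squared error).
    """
    mean_x = sum(col) / len(col)
    sxx = sum((x - mean_x) ** 2 for x in col)
    if sxx == 0.0:  # a monomorphic marker carries no information
        return mean_x, 0.0, 0.0
    sxr = sum((x - mean_x) * r for x, r in zip(col, residuals))
    slope = sxr / sxx
    return mean_x, slope, slope * sxr


def random_boosting(X, y, n_rounds=200, shrinkage=0.1, subset_frac=0.5, seed=0):
    """Boost single-marker regressions, sampling a random marker subset per round."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(p)]
    intercept = sum(y) / n
    pred = [intercept] * n
    learners = []  # (marker index, column mean, shrunken slope)
    k = max(1, int(subset_frac * p))
    for _ in range(n_rounds):
        # Each new weak learner is fitted to the current residuals.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        # Random boosting: only a random subset of markers competes this round.
        fits = [(j,) + fit_marker(columns[j], residuals)
                for j in rng.sample(range(p), k)]
        j, mean_x, slope, _ = max(fits, key=lambda f: f[3])
        step = shrinkage * slope  # shrinkage damps each learner's contribution
        learners.append((j, mean_x, step))
        pred = [pi + step * (x - mean_x) for pi, x in zip(pred, columns[j])]
    return intercept, learners


def predict(model, row):
    """Sum the intercept and all shrunken single-marker contributions."""
    intercept, learners = model
    return intercept + sum(step * (row[j] - mean_x) for j, mean_x, step in learners)


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5


def bias(observed, predicted):
    """Average difference between observed and predicted values."""
    return sum(o - p for o, p in zip(observed, predicted)) / len(observed)
```

    Setting `subset_frac=1.0` makes every marker compete in every round, which corresponds to ordinary boosting in this sketch; smaller fractions evaluate fewer markers per round, which is where the time saving would come from.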

  8. An accelerated non-Gaussianity based multichannel predictive deconvolution method with the limited supporting region of filters

    NASA Astrophysics Data System (ADS)

    Li, Zhong-xiao; Li, Zhen-chun

    2016-09-01

    The multichannel predictive deconvolution can be conducted in overlapping temporal and spatial data windows to solve the 2D predictive filter for multiple removal. Generally, the 2D predictive filter can better remove multiples at the cost of more computation time compared with the 1D predictive filter. In this paper we first use the cross-correlation strategy to determine the limited supporting region of filters where the coefficients play a major role for multiple removal in the filter coefficient space. To solve the 2D predictive filter, the traditional multichannel predictive deconvolution uses the least squares (LS) algorithm, which requires that primaries and multiples be orthogonal. To relax the orthogonality assumption, the iterative reweighted least squares (IRLS) algorithm and the fast iterative shrinkage thresholding (FIST) algorithm have been used to solve the 2D predictive filter in the multichannel predictive deconvolution with the non-Gaussian maximization (L1 norm minimization) constraint of primaries. The FIST algorithm has been demonstrated as a faster alternative to the IRLS algorithm. In this paper we introduce the FIST algorithm to solve the filter coefficients in the limited supporting region of filters. Compared with the FIST based multichannel predictive deconvolution without the limited supporting region of filters, the proposed method can reduce the computation burden effectively while achieving a similar accuracy. Additionally, the proposed method can better balance multiple removal and primary preservation than the traditional LS based multichannel predictive deconvolution and FIST based single channel predictive deconvolution. Synthetic and field data sets demonstrate the effectiveness of the proposed method.

  9. Genetic algorithms using SISAL parallel programming language

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tejada, S.

    1994-05-06

    Genetic algorithms are a mathematical optimization technique developed by John Holland at the University of Michigan [1]. The SISAL programming language possesses many of the characteristics desired to implement genetic algorithms. SISAL is a deterministic, functional programming language which is inherently parallel. Because SISAL is functional and based on mathematical concepts, genetic algorithms can be efficiently translated into the language. Several of the steps involved in genetic algorithms, such as mutation, crossover, and fitness evaluation, can be parallelized using SISAL. In this paper I will discuss the implementation and performance of parallel genetic algorithms in SISAL.

  10. Performance of Geno-Fuzzy Model on rainfall-runoff predictions in claypan watersheds

    USDA-ARS's Scientific Manuscript database

    Fuzzy logic provides a relatively simple approach to simulate complex hydrological systems while accounting for the uncertainty of environmental variables. The objective of this study was to develop a fuzzy inference system (FIS) with genetic algorithm (GA) optimization for membership functions (MF...

  11. Comparison of Structural Optimization Techniques for a Nuclear Electric Space Vehicle

    NASA Technical Reports Server (NTRS)

    Benford, Andrew

    2003-01-01

    The purpose of this paper is to utilize the optimization method of genetic algorithms (GA) for truss design on a nuclear propulsion vehicle. Genetic algorithms are a guided, random search that mirrors Darwin's theory of natural selection and survival of the fittest. To verify the GA's capabilities, other traditional optimization methods were used for comparison with the results obtained by the GA, first on simple 2-D structures, and eventually on full-scale 3-D truss designs.

  12. Fuzzy Mixed Assembly Line Sequencing and Scheduling Optimization Model Using Multiobjective Dynamic Fuzzy GA

    PubMed Central

    Tahriri, Farzad; Dawal, Siti Zawiah Md; Taha, Zahari

    2014-01-01

    A new multiobjective dynamic fuzzy genetic algorithm is applied to solve a fuzzy mixed-model assembly line sequencing problem in which the primary goals are to minimize the total make-span and minimize the setup number simultaneously. Trapezoidal fuzzy numbers are implemented for variables such as operation and travelling time in order to generate results with higher accuracy and representative of real-case data. An improved genetic algorithm called fuzzy adaptive genetic algorithm (FAGA) is proposed in order to solve this optimization model. In establishing the FAGA, five dynamic fuzzy parameter controllers are devised in which fuzzy expert experience controller (FEEC) is integrated with automatic learning dynamic fuzzy controller (ALDFC) technique. The enhanced algorithm dynamically adjusts the population size, number of generations, tournament candidate, crossover rate, and mutation rate compared with using fixed control parameters. The main idea is to improve the performance and effectiveness of existing GAs by dynamic adjustment and control of the five parameters. Verification and validation of the dynamic fuzzy GA are carried out by developing test-beds and testing using a multiobjective fuzzy mixed production assembly line sequencing optimization problem. The simulation results highlight that the proposed novel optimization algorithm outperforms the standard genetic algorithm in both performance and efficacy on the mixed assembly line sequencing model. PMID:24982962

  13. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning

    PubMed Central

    Sutphin, George L.; Mahoney, J. Matthew; Sheppard, Keith; Walton, David O.; Korstanje, Ron

    2016-01-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species—humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide-spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/. PMID:27812085

  14. WORMHOLE: Novel Least Diverged Ortholog Prediction through Machine Learning.

    PubMed

    Sutphin, George L; Mahoney, J Matthew; Sheppard, Keith; Walton, David O; Korstanje, Ron

    2016-11-01

    The rapid advancement of technology in genomics and targeted genetic manipulation has made comparative biology an increasingly prominent strategy to model human disease processes. Predicting orthology relationships between species is a vital component of comparative biology. Dozens of strategies for predicting orthologs have been developed using combinations of gene and protein sequence, phylogenetic history, and functional interaction with progressively increasing accuracy. A relatively new class of orthology prediction strategies combines aspects of multiple methods into meta-tools, resulting in improved prediction performance. Here we present WORMHOLE, a novel ortholog prediction meta-tool that applies machine learning to integrate 17 distinct ortholog prediction algorithms to identify novel least diverged orthologs (LDOs) between 6 eukaryotic species-humans, mice, zebrafish, fruit flies, nematodes, and budding yeast. Machine learning allows WORMHOLE to intelligently incorporate predictions from a wide-spectrum of strategies in order to form aggregate predictions of LDOs with high confidence. In this study we demonstrate the performance of WORMHOLE across each combination of query and target species. We show that WORMHOLE is particularly adept at improving LDO prediction performance between distantly related species, expanding the pool of LDOs while maintaining low evolutionary distance and a high level of functional relatedness between genes in LDO pairs. We present extensive validation, including cross-validated prediction of PANTHER LDOs and evaluation of evolutionary divergence and functional similarity, and discuss future applications of machine learning in ortholog prediction. A WORMHOLE web tool has been developed and is available at http://wormhole.jax.org/.

  15. Seven-spot ladybird optimization: a novel and efficient metaheuristic algorithm for numerical optimization.

    PubMed

    Wang, Peng; Zhu, Zhouquan; Huang, Shuai

    2013-01-01

    This paper presents a novel biologically inspired metaheuristic algorithm called seven-spot ladybird optimization (SLO). The SLO is inspired by recent discoveries on the foraging behavior of a seven-spot ladybird. In this paper, the performance of the SLO is compared with that of the genetic algorithm, particle swarm optimization, and artificial bee colony algorithms by using five numerical benchmark functions with multimodality. The results show that SLO has the ability to find the best solution with a comparatively small population size and is suitable for solving optimization problems with lower dimensions.

  16. Seven-Spot Ladybird Optimization: A Novel and Efficient Metaheuristic Algorithm for Numerical Optimization

    PubMed Central

    Zhu, Zhouquan

    2013-01-01

    This paper presents a novel biologically inspired metaheuristic algorithm called seven-spot ladybird optimization (SLO). The SLO is inspired by recent discoveries on the foraging behavior of a seven-spot ladybird. In this paper, the performance of the SLO is compared with that of the genetic algorithm, particle swarm optimization, and artificial bee colony algorithms by using five numerical benchmark functions with multimodality. The results show that SLO has the ability to find the best solution with a comparatively small population size and is suitable for solving optimization problems with lower dimensions. PMID:24385879

  17. Data Based Prediction of Blood Glucose Concentrations Using Evolutionary Methods.

    PubMed

    Hidalgo, J Ignacio; Colmenar, J Manuel; Kronberger, Gabriel; Winkler, Stephan M; Garnica, Oscar; Lanchares, Juan

    2017-08-08

    Predicting glucose values on the basis of insulin and food intakes is a difficult task that people with diabetes need to do daily. This is necessary as it is important to maintain glucose levels at appropriate values to avoid not only short-term, but also long-term complications of the illness. Artificial intelligence in general and machine learning techniques in particular have already led to promising results in modeling and predicting glucose concentrations. In this work, several machine learning techniques are used for the modeling and prediction of glucose concentrations using as inputs the values measured by a continuous monitoring glucose system as well as previous and estimated future carbohydrate intakes and insulin injections. In particular, we use the following four techniques: genetic programming, random forests, k-nearest neighbors, and grammatical evolution. We propose two new enhanced modeling algorithms for glucose prediction, namely (i) a variant of grammatical evolution which uses an optimized grammar, and (ii) a variant of tree-based genetic programming which uses a three-compartment model for carbohydrate and insulin dynamics. The predictors were trained and tested using data of ten patients from a public hospital in Spain. We analyze our experimental results using the Clarke error grid metric and see that 90% of the forecasts are correct (i.e., Clarke error categories A and B), but still even the best methods produce 5 to 10% of serious errors (category D) and approximately 0.5% of very serious errors (category E). We also propose an enhanced genetic programming algorithm that incorporates a three-compartment model into symbolic regression models to create smoothed time series of the original carbohydrate and insulin time series.

  18. Control of epileptic seizures in WAG/Rij rats by means of brain-computer interface

    NASA Astrophysics Data System (ADS)

    Makarov, Vladimir V.; Maksimenko, Vladimir A.; van Luijtelaar, Gilles; Lüttjohann, Annika; Hramov, Alexander E.

    2018-02-01

    The main issue of epileptology is the elimination of epileptic events. This can be achieved by a system that predicts the emergence of seizures in conjunction with a system that interferes with the process that leads to the onset of a seizure. The prediction of seizures remains, for the present, unresolved in absence epilepsy, due to the sudden onset of seizures. We developed an algorithm for predicting seizures in real time, evaluated it, and implemented it in an online closed-loop brain stimulation system designed to prevent the spike-wave discharges (SWDs) typical of absence epilepsy in the genetic rat model. The algorithm correctly predicts more than 85% of the seizures, and the rest were successfully detected. Contrary to the old belief that SWDs are unpredictable, the current results show that they can be predicted and that the development of closed-loop systems for predicting and preventing seizures is a feasible step toward interventions that achieve control of and freedom from epileptic seizures.

  19. How long will my mouse live? Machine learning approaches for prediction of mouse life span.

    PubMed

    Swindell, William R; Harper, James M; Miller, Richard A

    2008-09-01

    Prediction of individual life span based on characteristics evaluated at middle-age represents a challenging objective for aging research. In this study, we used machine learning algorithms to construct models that predict life span in a stock of genetically heterogeneous mice. Life-span prediction accuracy of 22 algorithms was evaluated using a cross-validation approach, in which models were trained and tested with distinct subsets of data. Using a combination of body weight and T-cell subset measures evaluated before 2 years of age, we show that the life-span quartile to which an individual mouse belongs can be predicted with an accuracy of 35.3% (+/-0.10%). This result provides a new benchmark for the development of life-span-predictive models, but improvement can be expected through identification of new predictor variables and development of computational approaches. Future work in this direction can provide tools for aging research and will shed light on associations between phenotypic traits and longevity.

  20. Genetic algorithms applied to the scheduling of the Hubble Space Telescope

    NASA Technical Reports Server (NTRS)

    Sponsler, Jeffrey L.

    1989-01-01

    A prototype system employing a genetic algorithm (GA) has been developed to support the scheduling of the Hubble Space Telescope. A non-standard knowledge structure is used and appropriate genetic operators have been created. Several different crossover styles (random point selection, evolving points, and smart point selection) are tested and the best GA is compared with a neural network (NN) based optimizer. The smart crossover operator produces the best results and the GA system is able to evolve complete schedules using it. The GA is not as time-efficient as the NN system and the NN solutions tend to be better.

  1. Demonstrating the suitability of genetic algorithms for driving microbial ecosystems in desirable directions.

    PubMed

    Vandecasteele, Frederik P J; Hess, Thomas F; Crawford, Ronald L

    2007-07-01

    The functioning of natural microbial ecosystems is determined by biotic interactions, which are in turn influenced by abiotic environmental conditions. Direct experimental manipulation of such conditions can be used to purposefully drive ecosystems toward exhibiting desirable functions. When a set of environmental conditions can be manipulated to be present at a discrete number of levels, finding the right combination of conditions to obtain the optimal desired effect becomes a typical combinatorial optimisation problem. Genetic algorithms are a class of robust and flexible search and optimisation techniques from the field of computer science that may be very suitable for such a task. To verify this idea, datasets containing growth levels of the total microbial community of four different natural microbial ecosystems in response to all possible combinations of a set of five chemical supplements were obtained. Subsequently, the ability of a genetic algorithm to search this parameter space for combinations of supplements driving the microbial communities to high levels of growth was compared to that of a random search, a local search, and a hill-climbing algorithm, three intuitive alternative optimisation approaches. The results indicate that a genetic algorithm is very suitable for driving microbial ecosystems in desirable directions, which opens opportunities for both fundamental ecological research and industrial applications.

  2. A Systematic Investigation of Computation Models for Predicting Adverse Drug Reactions (ADRs)

    PubMed Central

    Kuang, Qifan; Wang, MinQi; Li, Rong; Dong, YongCheng; Li, Yizhou; Li, Menglong

    2014-01-01

    Background: Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because of the lack of systematic analysis and comparison of the different computational models, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computation models to obtain general conclusions that can provide useful guidance to construct more effective computational models to predict ADRs. Principal Findings: In the current study, we compare and analyze the performance of existing computational methods for predicting ADRs by implementing and evaluating additional algorithms that were previously used for predicting drug targets. Our results indicated that topological and intrinsic features were complementary to an extent and that the Jaccard coefficient had an important and general effect on the prediction of drug-ADR associations. Comparing the structure of each algorithm showed that the final formulas of these algorithms could all be converted to a linear model in form. Based on this finding, we propose a new algorithm called the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper. Conclusion: Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms. PMID:25180585
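
    To make the role of the Jaccard coefficient concrete, the sketch below computes it between the feature sets of two drugs and uses it as the weight in a simple similarity-weighted vote over known drug-ADR associations. This is a generic illustration and should not be read as the paper's general weighted profile method, whose exact formula is not given in the abstract; the function names, the drug names, and the feature and ADR labels are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two collections (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_profile_score(query_features, known_drugs, adr):
    """Similarity-weighted share of known drugs that are associated with `adr`.

    known_drugs maps a drug name to (feature set, set of known ADRs).
    """
    numerator = denominator = 0.0
    for features, adrs in known_drugs.values():
        weight = jaccard(query_features, features)
        numerator += weight * (1.0 if adr in adrs else 0.0)
        denominator += weight
    return numerator / denominator if denominator else 0.0


# Usage with made-up data: drugs are described by hypothetical target labels.
known = {
    "drugA": ({"t1", "t2"}, {"nausea"}),
    "drugB": ({"t3"}, {"rash"}),
}
print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))  # 2 shared / 4 total = 0.5
print(weighted_profile_score({"t1", "t2"}, known, "nausea"))
```

    The score is linear in the known association indicators, which is the general form the abstract says the compared algorithms can be reduced to.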

  3. Optimal configuration of power grid sources based on optimal particle swarm algorithm

    NASA Astrophysics Data System (ADS)

    Wen, Yuanhua

    2018-04-01

    To optimize the configuration of power grid sources, an improved particle swarm optimization algorithm is proposed. First, the concepts of multi-objective optimization and the Pareto solution set are introduced. Then, the performance of the classical genetic algorithm, the classical particle swarm optimization algorithm, and the improved particle swarm optimization algorithm is analyzed, and each of the three algorithms is simulated. Comparison of the test results shows the superiority of the improved algorithm in convergence and optimization performance, which lays the foundation for a subsequent micro-grid power optimization configuration solution.

  4. Optimal sensor placement for spatial lattice structure based on genetic algorithms

    NASA Astrophysics Data System (ADS)

    Liu, Wei; Gao, Wei-cheng; Sun, Yi; Xu, Min-jian

    2008-10-01

    Optimal sensor placement technique plays a key role in structural health monitoring of spatial lattice structures. This paper considers the problem of locating sensors on a spatial lattice structure with the aim of maximizing the data information so that structural dynamic behavior can be fully characterized. Based on the criterion of optimal sensor placement for modal test, an improved genetic algorithm is introduced to find the optimal placement of sensors. The modal strain energy (MSE) and the modal assurance criterion (MAC) have been taken as the fitness function, respectively, so that three placement designs were produced. The decimal two-dimension array coding method instead of binary coding method is proposed to code the solution. Forced mutation operator is introduced when the identical genes appear via the crossover procedure. A computational simulation of a 12-bay plain truss model has been implemented to demonstrate the feasibility of the three optimal algorithms above. The obtained optimal sensor placements using the improved genetic algorithm are compared with those gained by the existing genetic algorithm using the binary coding method. Further, the comparison criterion based on the mean square error between the finite element method (FEM) mode shapes and the Guyan expansion mode shapes identified by data-driven stochastic subspace identification (SSI-DATA) method is employed to demonstrate the advantage of the different fitness function. The results showed that some innovations in genetic algorithm proposed in this paper can enlarge the genes storage and improve the convergence of the algorithm. More importantly, the three optimal sensor placement methods can all provide the reliable results and identify the vibration characteristics of the 12-bay plain truss model accurately.

  5. A capacitated vehicle routing problem with order available time in e-commerce industry

    NASA Astrophysics Data System (ADS)

    Liu, Ling; Li, Kunpeng; Liu, Zhixue

    2017-03-01

    In this article, a variant of the well-known capacitated vehicle routing problem (CVRP) called the capacitated vehicle routing problem with order available time (CVRPOAT) is considered, which is observed in the operations of the current e-commerce industry. In this problem, the orders are not available for delivery at the beginning of the planning period. CVRPOAT takes all the assumptions of CVRP, except the order available time, which is determined by the precedent order picking and packing stage in the warehouse of the online grocer. The objective is to minimize the sum of vehicle completion times. An efficient tabu search algorithm is presented to tackle the problem. Moreover, a Lagrangian relaxation algorithm is developed to obtain the lower bounds of reasonably sized problems. Based on the test instances derived from benchmark data, the proposed tabu search algorithm is compared with a published related genetic algorithm, as well as the derived lower bounds. Also, the tabu search algorithm is compared with the current operation strategy of the online grocer. Computational results indicate that the gap between the lower bounds and the results of the tabu search algorithm is small and the tabu search algorithm is superior to the genetic algorithm. Moreover, the CVRPOAT formulation together with the tabu search algorithm performs much better than the current operation strategy of the online grocer.

  6. A traveling salesman approach for predicting protein functions.

    PubMed

    Johnson, Olin; Liu, Jing

    2006-10-12

    Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems.

  7. A traveling salesman approach for predicting protein functions

    PubMed Central

    Johnson, Olin; Liu, Jing

    2006-01-01

    Background: Protein-protein interaction information can be used to predict unknown protein functions and to help study biological pathways. Results: Here we present a new approach utilizing the classic Traveling Salesman Problem to study the protein-protein interactions and to predict protein functions in budding yeast Saccharomyces cerevisiae. We apply the global optimization tool from combinatorial optimization algorithms to cluster the yeast proteins based on the global protein interaction information. We then use this clustering information to help us predict protein functions. We use our algorithm together with the direct neighbor algorithm [1] on characterized proteins and compare the prediction accuracy of the two methods. We show our algorithm can produce better predictions than the direct neighbor algorithm, which only considers the immediate neighbors of the query protein. Conclusion: Our method is a promising one to be used as a general tool to predict functions of uncharacterized proteins and a successful sample of using computer science knowledge and algorithms to study biological problems. PMID:17147783

  8. Using genetic algorithms to optimize the analogue method for precipitation prediction in the Swiss Alps

    NASA Astrophysics Data System (ADS)

    Horton, Pascal; Jaboyedoff, Michel; Obled, Charles

    2018-01-01

    Analogue methods provide a statistical precipitation prediction based on synoptic predictors supplied by general circulation models or numerical weather prediction models. The method samples a selection of days in the archives that are similar to the target day to be predicted, and considers their set of corresponding observed precipitation (the predictand) as the conditional distribution for the target day. The relationship between the predictors and predictands relies on some parameters that characterize how and where the similarity between two atmospheric situations is defined. This relationship is usually established by a semi-automatic sequential procedure that has strong limitations: (i) it cannot automatically choose the pressure levels and temporal windows (hour of the day) for a given meteorological variable, (ii) it cannot handle dependencies between parameters, and (iii) it cannot easily handle new degrees of freedom. In this work, a global optimization approach relying on genetic algorithms could optimize all parameters jointly and automatically. The global optimization was applied to some variants of the analogue method for the Rhône catchment in the Swiss Alps. The performance scores increased compared to reference methods, especially for days with high precipitation totals. The resulting parameters were found to be relevant and coherent between the different subregions of the catchment. Moreover, they were obtained automatically and objectively, which reduces the effort that needs to be invested in exploration attempts when adapting the method to a new region or for a new predictand. For example, it obviates the need to assess a large number of combinations of pressure levels and temporal windows of predictor variables that were manually selected beforehand. The optimization could also take into account parameter inter-dependencies. In addition, the approach allowed for new degrees of freedom, such as a possible weighting between pressure levels, and non-overlapping spatial windows.
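
    The sampling step described above can be sketched as follows. This is a minimal illustration only: the abstract does not name the similarity criterion, so Euclidean distance is an assumption, and the function name, predictor values and number of analogues are hypothetical. In the study, parameters of this kind are what the genetic algorithm would tune.

```python
def analogue_forecast(target, archive_predictors, archive_precip, n_analogues):
    """Return the precipitation observed on the archive days most similar to target.

    The returned values stand in for the conditional distribution of the
    predictand on the target day. Similarity is Euclidean distance (assumed).
    """
    def distance(day):
        return sum((t - d) ** 2 for t, d in zip(target, day)) ** 0.5

    ranked = sorted(
        range(len(archive_predictors)),
        key=lambda i: distance(archive_predictors[i]),
    )
    return [archive_precip[i] for i in ranked[:n_analogues]]


# Toy archive: each day is described by two made-up predictor values.
archive = [[1000.0, 5.0], [1010.0, 2.0], [990.0, 8.0], [1001.0, 5.0]]
precip = [12.0, 0.0, 30.0, 10.0]  # observed precipitation on those days

sample = analogue_forecast([1000.0, 5.5], archive, precip, n_analogues=2)
print(sample)
```

    The two nearest archive days here are the first and the fourth, so the empirical distribution for the target day is built from their observed precipitation.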

  9. Ortho Image and DTM Generation with Intelligent Methods

    NASA Astrophysics Data System (ADS)

    Bagheri, H.; Sadeghian, S.

    2013-10-01

    Artificial intelligence algorithms are now being considered in GIS and remote sensing. Genetic algorithms and artificial neural networks are two intelligent methods used to optimize image processing programs such as edge extraction; these algorithms are very useful for solving complex problems. In this paper, the ability and application of genetic algorithms and artificial neural networks in geospatial production processes, such as geometric modelling of satellite images for orthophoto generation and height interpolation in raster Digital Terrain Model production, are discussed. First, the geometric potential of Ikonos-2 and Worldview-2 imagery was tested with rational functions and 2D and 3D polynomials. Comprehensive experiments were also carried out to evaluate the viability of the genetic algorithm for optimizing rational functions and 2D and 3D polynomials. Considering the quality of the Ground Control Points, the accuracy (RMSE) with the genetic algorithm and the 3D polynomial method for the Ikonos-2 Geo image was 0.508 pixel sizes, and the accuracy (RMSE) with the genetic algorithm and the rational function method for the Worldview-2 image was 0.930 pixel sizes. As a further artificial intelligence optimization method, neural networks were used. With a perceptron network applied to the Worldview-2 image, a result of 0.84 pixel sizes was obtained with 4 neurons in the middle layer. The final conclusion was that artificial intelligence algorithms make it possible to optimize the existing models and obtain better results than the usual ones. Finally, artificial intelligence methods, namely genetic algorithms and neural networks, were examined on sample data for optimizing interpolation and for generating Digital Terrain Models. The results were then compared with existing conventional methods; it appeared that these methods have a high capacity for height interpolation, and that using these networks to interpolate and to optimize inverse-distance weighting methods leads to highly accurate height estimates.

  10. Do Staphylococcus epidermidis Genetic Clusters Predict Isolation Sources?

    PubMed Central

    Tolo, Isaiah; Thomas, Jonathan C.; Fischer, Rebecca S. B.; Brown, Eric L.; Gray, Barry M.

    2016-01-01

    Staphylococcus epidermidis is a ubiquitous colonizer of human skin and a common cause of medical device-associated infections. The extent to which the population genetic structure of S. epidermidis distinguishes commensal from pathogenic isolates is unclear. Previously, Bayesian clustering of 437 multilocus sequence types (STs) in the international database revealed a population structure of six genetic clusters (GCs) that may reflect the species' ecology. Here, we first verified the presence of six GCs, including two (GC3 and GC5) with significant admixture, in an updated database of 578 STs. Next, a single nucleotide polymorphism (SNP) assay was developed that accurately assigned 545 (94%) of 578 STs to GCs. Finally, the hypothesis that GCs could distinguish isolation sources was tested by SNP typing and GC assignment of 154 isolates from hospital patients with bacteremia and those with blood culture contaminants and from nonhospital carriage. GC5 was isolated almost exclusively from hospital sources. GC1 and GC6 were isolated from all sources but were overrepresented in isolates from nonhospital and infection sources, respectively. GC2, GC3, and GC4 were relatively rare in this collection. No association was detected between fdh-positive isolates (GC2 and GC4) and nonhospital sources. Using a machine learning algorithm, GCs predicted hospital and nonhospital sources with 80% accuracy and predicted infection and contaminant sources with 45% accuracy, which was comparable to the results seen with a combination of five genetic markers (icaA, IS256, sesD [bhp], mecA, and arginine catabolic mobile element [ACME]). Thus, analysis of population structure with subgenomic data shows the distinction of hospital and nonhospital sources and the near-inseparability of sources within a hospital. PMID:27076664

  11. Optimization lighting layout based on gene density improved genetic algorithm for indoor visible light communications

    NASA Astrophysics Data System (ADS)

    Liu, Huanlin; Wang, Xin; Chen, Yong; Kong, Deqian; Xia, Peijie

    2017-05-01

    For indoor visible light communication systems, the layout of LED lamps affects the uniformity of the received power on the communication plane. In order to find an optimized lighting layout that meets both the lighting needs and communication needs, a gene density genetic algorithm (GDGA) is proposed. In GDGA, a gene indicates a pair of abscissa and ordinate of an LED, and an individual represents an LED layout in the room. The segmented crossover operation and gene mutation strategy based on gene density are put forward to make the received power on the communication plane more uniform and increase the population's diversity. A weighted differences function between individuals is designed as the fitness function of GDGA for reserving the population having the useful LED layout genetic information and ensuring the global convergence of GDGA. Comparing square layout and circular layout, with the optimized layout achieved by the GDGA, the power uniformity increases by 83.3%, 83.1% and 55.4%, respectively. Furthermore, the convergence of GDGA is verified compared with evolutionary algorithm (EA). Experimental results show that GDGA can quickly find an approximation of the optimal layout.

  12. TEMPLE: analysing population genetic variation at transcription factor binding sites.

    PubMed

    Litovchenko, Maria; Laurent, Stefan

    2016-11-01

    Genetic variation occurring at the level of regulatory sequences can affect phenotypes and fitness in natural populations. This variation can be analysed in a population genetic framework to study how genetic drift and selection affect the evolution of these functional elements. However, doing this requires a good understanding of the location and nature of regulatory regions and has long been a major hurdle. The current proliferation of genomewide profiling experiments of transcription factor occupancies greatly improves our ability to identify genomic regions involved in specific DNA-protein interactions. Although software exists for predicting transcription factor binding sites (TFBS), and the effects of genetic variants on TFBS specificity, there are no tools currently available for inferring this information jointly with the genetic variation at TFBS in natural populations. We developed the software Transcription Elements Mapping at the Population LEvel (TEMPLE), which predicts TFBS, evaluates the effects of genetic variants on TFBS specificity and summarizes the genetic variation occurring at TFBS in intraspecific sequence alignments. We demonstrate that TEMPLE's TFBS prediction algorithm gives identical results to PATSER, a software distribution commonly used in the field. We also illustrate the unique features of TEMPLE by analysing TFBS diversity for the TF Senseless (SENS) in one ancestral and one cosmopolitan population of the fruit fly Drosophila melanogaster. TEMPLE can be used to localize TFBS that are characterized by strong genetic differentiation across natural populations. This will be particularly useful for studies aiming to identify adaptive mutations. TEMPLE is a Java-based cross-platform software that easily maps the genetic diversity at predicted TFBSs using a graphical interface, or from the Unix command line. © 2016 John Wiley & Sons Ltd.

  13. Applying Spatial-Temporal Model and Game Theory to Asymmetric Threat Prediction

    DTIC Science & Technology

    2007-06-01

    Genshe Chen, Denis Garagic, Xiaohuan Tan, Dongxu Li, Dan Shen, Mo Wei, Xu Wang, “Team Dynamics and Tactics for Mission Planning,” Proceedings...Cruz, Jr., Genshe Chen, Dongxu Li, and Denis Garagic, “Target Selection in UAV Cooperative Control Under Uncertain Environment: Genetic Algorithm

  14. Many amino acid substitution variants identified in DNA repair genes during human population screenings are predicted to impact protein function

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xi, T; Jones, I M; Mohrenweiser, H W

    2003-11-03

    Over 520 different amino acid substitution variants have been previously identified in the systematic screening of 91 human DNA repair genes for sequence variation. Two algorithms were employed to predict the impact of these amino acid substitutions on protein activity. Sorting Intolerant From Tolerant (SIFT) classified 226 of 508 variants (44%) as "Intolerant". Polymorphism Phenotyping (PolyPhen) classed 165 of 489 amino acid substitutions (34%) as "Probably or Possibly Damaging". Another 9-15% of the variants were classed as "Potentially Intolerant or Damaging". The results from the two algorithms are highly associated, with concordance in predicted impact observed for approximately 62% of the variants. Twenty-one to thirty-one percent of the variant proteins are predicted to exhibit reduced activity by both algorithms. These variants occur at slightly lower individual allele frequency than do the variants classified as "Tolerant" or "Benign". Both algorithms correctly predicted the impact of 26 functionally characterized amino acid substitutions in the APE1 protein on biochemical activity, with one exception. It is concluded that a substantial fraction of the missense variants observed in the general human population are functionally relevant. These variants are expected to be the molecular genetic and biochemical basis for the associations of reduced DNA repair capacity phenotypes with elevated cancer risk.

  15. Prospects for Genomic Selection in Cassava Breeding.

    PubMed

    Wolfe, Marnin D; Del Carpio, Dunia Pino; Alabi, Olumide; Ezenwaka, Lydia C; Ikeogu, Ugochukwu N; Kayondo, Ismail S; Lozano, Roberto; Okeke, Uche G; Ozimati, Alfred A; Williams, Esuma; Egesi, Chiedozie; Kawuki, Robert S; Kulakow, Peter; Rabbi, Ismail Y; Jannink, Jean-Luc

    2017-11-01

    Cassava (Manihot esculenta Crantz) is a clonally propagated staple food crop in the tropics. Genomic selection (GS) has been implemented at three breeding institutions in Africa to reduce cycle times. Initial studies provided promising estimates of predictive abilities. Here, we expand on previous analyses by assessing the accuracy of seven prediction models for seven traits in three prediction scenarios: cross-validation within populations, cross-population prediction and cross-generation prediction. We also evaluated the impact of increasing the training population (TP) size by phenotyping progenies selected either at random or with a genetic algorithm. Cross-validation results were mostly consistent across programs, with nonadditive models predicting 10% better on average. Cross-population accuracy was generally low (mean = 0.18) but prediction of cassava mosaic disease increased up to 57% in one Nigerian population when data from another related population were combined. Accuracy across generations was poorer than within-generation accuracy, as expected, but accuracy for dry matter content and mosaic disease severity should be sufficient for rapid-cycling GS. Selection of a prediction model made some difference across generations, but increasing TP size was more important. With a genetic algorithm, selection of one-third of progeny could achieve an accuracy equivalent to phenotyping all progeny. We are in the early stages of GS for this crop but the results are promising for some traits. General guidelines that are emerging are that TPs need to continue to grow but phenotyping can be done on a cleverly selected subset of individuals, reducing the overall phenotyping burden. Copyright © 2017 Crop Science Society of America.

  16. Prediction accuracies for growth and wood attributes of interior spruce in space using genotyping-by-sequencing.

    PubMed

    Gamal El-Dien, Omnia; Ratcliffe, Blaise; Klápště, Jaroslav; Chen, Charles; Porth, Ilga; El-Kassaby, Yousry A

    2015-05-09

    Genomic selection (GS) in forestry can substantially reduce the length of the breeding cycle and increase gain per unit time through early selection and greater selection intensity, particularly for traits of low heritability and late expression. Affordable next-generation sequencing technologies made it possible to genotype large numbers of trees at a reasonable cost. Genotyping-by-sequencing was used to genotype 1,126 Interior spruce trees representing 25 open-pollinated families planted over three sites in British Columbia, Canada. Four imputation algorithms were compared (mean value (MI), singular value decomposition (SVD), expectation maximization (EM), and a newly derived, family-based k-nearest neighbor (kNN-Fam)). Trees were phenotyped for several yield and wood attributes. Single- and multi-site GS prediction models were developed using the Ridge Regression Best Linear Unbiased Predictor (RR-BLUP) and the Generalized Ridge Regression (GRR) to test different assumptions about trait architecture. Finally, using PCA, multi-trait GS prediction models were developed. The EM and kNN-Fam imputation methods were superior for 30 and 60% missing data, respectively. The RR-BLUP GS prediction model produced better accuracies than the GRR, indicating that the genetic architecture for these traits is complex. GS prediction accuracies for multi-site were high and better than those of single-sites while multi-site predictability produced the lowest accuracies reflecting type-b genetic correlations and deemed unreliable. The incorporation of genomic information in quantitative genetics analyses produced more realistic heritability estimates, as half-sib pedigree tended to inflate the additive genetic variance and subsequently both heritability and gain estimates. Principal component scores as representatives of multi-trait GS prediction models produced surprising results where negatively correlated traits could be concurrently selected for using PCA2 and PCA3. The application of GS to open-pollinated family testing, the simplest form of tree improvement evaluation methods, was proven to be effective. Prediction accuracies obtained for all traits greatly support the integration of GS in tree breeding. While the within-site GS prediction accuracies were high, the results clearly indicate that the ability of single-site GS models to predict other sites is unreliable, supporting the use of a multi-site approach. Principal component scores provided an opportunity for the concurrent selection of traits with different phenotypic optima.

  17. Predicting the Direction of Stock Market Index Movement Using an Optimized Artificial Neural Network Model.

    PubMed

    Qiu, Mingyue; Song, Yu

    2016-01-01

    In the business sector, it has always been a difficult task to predict the exact daily price of the stock market index; hence, there is a great deal of research being conducted regarding the prediction of the direction of stock price index movement. Many factors such as political events, general economic conditions, and traders' expectations may have an influence on the stock market index. There are numerous research studies that use similar indicators to forecast the direction of the stock market index. In this study, we compare two basic types of input variables to predict the direction of the daily stock market index. The main contribution of this study is the ability to predict the direction of the next day's price of the Japanese stock market index by using an optimized artificial neural network (ANN) model. To improve the prediction accuracy of the trend of the stock market index in the future, we optimize the ANN model using genetic algorithms (GA). We demonstrate and verify the predictability of stock price direction by using the hybrid GA-ANN model and then compare the performance with prior studies. Empirical results show that the Type 2 input variables can generate a higher forecast accuracy and that it is possible to enhance the performance of the optimized ANN model by selecting input variables appropriately.

  18. Predicting the Direction of Stock Market Index Movement Using an Optimized Artificial Neural Network Model

    PubMed Central

    Qiu, Mingyue; Song, Yu

    2016-01-01

    In the business sector, it has always been a difficult task to predict the exact daily price of the stock market index; hence, there is a great deal of research being conducted regarding the prediction of the direction of stock price index movement. Many factors such as political events, general economic conditions, and traders’ expectations may have an influence on the stock market index. There are numerous research studies that use similar indicators to forecast the direction of the stock market index. In this study, we compare two basic types of input variables to predict the direction of the daily stock market index. The main contribution of this study is the ability to predict the direction of the next day’s price of the Japanese stock market index by using an optimized artificial neural network (ANN) model. To improve the prediction accuracy of the trend of the stock market index in the future, we optimize the ANN model using genetic algorithms (GA). We demonstrate and verify the predictability of stock price direction by using the hybrid GA-ANN model and then compare the performance with prior studies. Empirical results show that the Type 2 input variables can generate a higher forecast accuracy and that it is possible to enhance the performance of the optimized ANN model by selecting input variables appropriately. PMID:27196055

  19. Network congestion control algorithm based on Actor-Critic reinforcement learning model

    NASA Astrophysics Data System (ADS)

    Xu, Tao; Gong, Lina; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen

    2018-04-01

    Aiming at the network congestion control problem, a congestion control algorithm based on Actor-Critic reinforcement learning model is designed. Through the genetic algorithm in the congestion control strategy, the network congestion problems can be better found and prevented. According to Actor-Critic reinforcement learning, the simulation experiment of network congestion control algorithm is designed. The simulation experiments verify that the AQM controller can predict the dynamic characteristics of the network system. Moreover, the learning strategy is adopted to optimize the network performance, and the dropping probability of packets is adaptively adjusted so as to improve the network performance and avoid congestion. Based on the above finding, it is concluded that the network congestion control algorithm based on Actor-Critic reinforcement learning model can effectively avoid the occurrence of TCP network congestion.

  20. Genetic algorithms and MCML program for recovery of optical properties of homogeneous turbid media

    PubMed Central

    Morales Cruzado, Beatriz; y Montiel, Sergio Vázquez; Atencio, José Alberto Delgado

    2013-01-01

    In this paper, we present and validate a new method for optical properties recovery of turbid media with slab geometry. This method is an iterative method that compares diffuse reflectance and transmittance, measured using integrating spheres, with those obtained using the known algorithm MCML. The search procedure is based on the evolution of a population due to selection of the best individual, i.e., using a genetic algorithm. This new method includes several corrections such as non-linear effects in integrating spheres measurements and loss of light due to the finite size of the sample. As a potential application and proof-of-principle experiment of this new method, we use this new algorithm in the recovery of optical properties of blood samples at different degrees of coagulation. PMID:23504404

  1. Retrieval of Dry Snow Parameters from Radiometric Data Using a Dense Medium Model and Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Tedesco, Marco; Kim, Edward J.

    2005-01-01

    In this paper, GA-based techniques are used to invert the equations of an electromagnetic model based on Dense Medium Radiative Transfer Theory (DMRT) under the Quasi Crystalline Approximation with Coherent Potential to retrieve snow depth, mean grain size and fractional volume from microwave brightness temperatures. The technique is initially tested on both noisy and noise-free simulated data. During this phase, different configurations of genetic algorithm parameters are considered to quantify how their change can affect the algorithm performance. A configuration of GA parameters is then selected and the algorithm is applied to experimental data acquired during the NASA Cold Land Process Experiment. Snow parameters retrieved with the GA-DMRT technique are then compared with snow parameters measured in the field.

  2. Bouc-Wen hysteresis model identification using Modified Firefly Algorithm

    NASA Astrophysics Data System (ADS)

    Zaman, Mohammad Asif; Sikder, Urmita

    2015-12-01

    The parameters of the Bouc-Wen hysteresis model are identified using a Modified Firefly Algorithm. The proposed algorithm uses dynamic process control parameters to improve its performance. The algorithm is used to find the model parameter values that result in the least amount of error between a set of given data points and points obtained from the Bouc-Wen model. The performance of the algorithm is compared with the performance of the conventional Firefly Algorithm, Genetic Algorithm and Differential Evolution algorithm in terms of convergence rate and accuracy. Compared to the other three optimization algorithms, the proposed algorithm is found to have a good convergence rate with a high degree of accuracy in identifying Bouc-Wen model parameters. Finally, the proposed method is used to find the Bouc-Wen model parameters from experimental data. The obtained model is found to be in good agreement with measured data.
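
    The error being minimized can be illustrated with a small sketch. It assumes the standard Bouc-Wen evolution law for the hysteretic variable, dz = A dx - beta |dx| |z|^(n-1) z - gamma dx |z|^n, integrated with a simple forward-Euler step, and a sum-of-squares error; the abstract does not specify the model form, the integration scheme or the error measure, so all of these (and the function names) are assumptions. Only the hysteretic variable z is modelled, not the full restoring force, and n >= 1 is assumed.

```python
import math


def bouc_wen_z(x, A, beta, gamma, n):
    """Hysteretic variable z for a displacement history x (forward Euler)."""
    z = [0.0]
    for k in range(1, len(x)):
        dx = x[k] - x[k - 1]
        zk = z[-1]
        dz = (A * dx
              - beta * abs(dx) * abs(zk) ** (n - 1) * zk
              - gamma * dx * abs(zk) ** n)
        z.append(zk + dz)
    return z


def squared_error(params, x, z_measured):
    """Sum of squared differences between model output and given data points."""
    z_model = bouc_wen_z(x, *params)
    return sum((m - d) ** 2 for m, d in zip(z_model, z_measured))


# Synthetic "measurements" generated from known (hypothetical) parameters.
x = [math.sin(0.05 * k) for k in range(200)]
true_params = (1.0, 0.5, 0.3, 1.0)  # A, beta, gamma, n
z_data = bouc_wen_z(x, *true_params)

print(squared_error(true_params, x, z_data))           # exact parameters
print(squared_error((1.2, 0.5, 0.3, 1.0), x, z_data))  # perturbed A
```

    Any of the optimizers compared in the abstract could, in principle, be pointed at `squared_error` as the objective; the sketch only defines that objective and does not implement a firefly or genetic algorithm.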

  3. Genetic algorithms and genetic programming for multiscale modeling: Applications in materials science and chemistry and advances in scalability

    NASA Astrophysics Data System (ADS)

    Sastry, Kumara Narasimha

    2007-03-01

    Effective and efficient multiscale modeling is essential to advance both the science and synthesis in a wide array of fields such as physics, chemistry, materials science, biology, biotechnology and pharmacology. This study investigates the efficacy and potential of using genetic algorithms for multiscale materials modeling and addresses some of the challenges involved in designing competent algorithms that solve hard problems quickly, reliably and accurately. In particular, this thesis demonstrates the use of genetic algorithms (GAs) and genetic programming (GP) in multiscale modeling with the help of two non-trivial case studies in materials science and chemistry. The first case study explores the utility of genetic programming (GP) in multi-timescaling alloy kinetics simulations. In essence, GP is used to bridge molecular dynamics and kinetic Monte Carlo methods to span orders-of-magnitude in simulation time. Specifically, GP is used to regress symbolically an inline barrier function from a limited set of molecular dynamics simulations to enable kinetic Monte Carlo simulations that span seconds of real time. Results on a non-trivial example of vacancy-assisted migration on a surface of a face-centered cubic (fcc) Copper-Cobalt (CuxCo1-x) alloy show that GP predicts all barriers with 0.1% error from calculations for less than 3% of active configurations, independent of type of potentials used to obtain the learning set of barriers via molecular dynamics. The resulting method enables 2-9 orders-of-magnitude increase in real-time dynamics simulations taking 4-7 orders-of-magnitude less CPU time. The second case study presents the application of multiobjective genetic algorithms (MOGAs) in multiscaling quantum chemistry simulations. Specifically, MOGAs are used to bridge high-level quantum chemistry and semiempirical methods to provide accurate representation of complex molecular excited-state and ground-state behavior. Results on ethylene and benzene, two common building blocks in organic chemistry, indicate that MOGAs produce high-quality semiempirical methods that (1) are stable to small perturbations, (2) yield accurate configuration energies on untested and critical excited states, and (3) yield ab initio quality excited-state dynamics. The proposed method enables simulations of more complex systems to realistic, multi-picosecond timescales, well beyond previous attempts or expectation of human experts, and 2-3 orders-of-magnitude reduction in computational cost. While the two applications use simple evolutionary operators, in order to tackle more complex systems, their scalability and limitations have to be investigated. The second part of the thesis addresses some of the challenges involved with a successful design of genetic algorithms and genetic programming for multiscale modeling. The first issue addressed is the scalability of genetic programming, where facetwise models are built to assess the population size required by GP to ensure adequate supply of raw building blocks and also to ensure accurate decision-making between competing building blocks. This study also presents a design of competent genetic programming, where traditional fixed recombination operators are replaced by building and sampling probabilistic models of promising candidate programs. The proposed scalable GP, called extended compact GP (eCGP), combines the ideas from extended compact genetic algorithm (eCGA) and probabilistic incremental program evolution (PIPE) and adaptively identifies, propagates and exchanges important subsolutions of a search problem. Results show that eCGP scales cubically with problem size on both GP-easy and GP-hard problems. Finally, facetwise models are developed to explore limitations of scalability of MOGAs, where the scalability of multiobjective algorithms in reliably maintaining Pareto-optimal solutions is addressed. The results show that even when the building blocks are accurately identified, massive multimodality of the search problems can easily overwhelm the nicher (diversity preserving operator) and lead to exponential scale-up. Facetwise models are developed, which incorporate the combined effects of model accuracy, decision making, and sub-structure supply, as well as the effect of niching on the population sizing, to predict a limit on the growth rate of a maximum number of sub-structures that can compete in the two objectives to circumvent the failure of the niching method. The results show that if the number of competing building blocks between multiple objectives is less than the proposed limit, multiobjective GAs scale-up polynomially with the problem size on boundedly-difficult problems.

  4. Stochastic model search with binary outcomes for genome-wide association studies.

    PubMed

    Russu, Alberto; Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-06-01

    The spread of case-control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precisions than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model.
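    The precision, recall and F-measure comparison described above can be illustrated with a minimal Python sketch. It assumes the balanced F1 form of the F-measure (the abstract does not say which variant was used), and the function name and SNP labels are hypothetical.

```python
def precision_recall_f1(selected, truly_associated):
    """Score a set of selected predictors against a known set of true ones."""
    selected = set(selected)
    truly_associated = set(truly_associated)
    true_pos = len(selected & truly_associated)
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / len(truly_associated) if truly_associated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure (F1): harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical simulated benchmark: 4 markers selected, 3 truly associated.
p, r, f = precision_recall_f1(
    {"snp1", "snp2", "snp3", "snp4"},
    {"snp1", "snp2", "snp5"},
)
```

    A method that selects fewer spurious markers raises precision, while one that misses true markers lowers recall; the F-measure balances the two.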

  5. Systematic chemical-genetic and chemical-chemical interaction datasets for prediction of compound synergism

    PubMed Central

    Wildenhain, Jan; Spitzer, Michaela; Dolma, Sonam; Jarvik, Nick; White, Rachel; Roy, Marcia; Griffiths, Emma; Bellows, David S.; Wright, Gerard D.; Tyers, Mike

    2016-01-01

    The network structure of biological systems suggests that effective therapeutic intervention may require combinations of agents that act synergistically. However, a dearth of systematic chemical combination datasets has limited the development of predictive algorithms for chemical synergism. Here, we report two large datasets of linked chemical-genetic and chemical-chemical interactions in the budding yeast Saccharomyces cerevisiae. We screened 5,518 unique compounds against 242 diverse yeast gene deletion strains to generate an extended chemical-genetic matrix (CGM) of 492,126 chemical-gene interaction measurements. This CGM dataset contained 1,434 genotype-specific inhibitors, termed cryptagens. We selected 128 structurally diverse cryptagens and tested all pairwise combinations to generate a benchmark dataset of 8,128 pairwise chemical-chemical interaction tests for synergy prediction, termed the cryptagen matrix (CM). An accompanying database resource called ChemGRID was developed to enable analysis, visualisation and downloads of all data. The CGM and CM datasets will facilitate the benchmarking of computational approaches for synergy prediction, as well as chemical structure-activity relationship models for anti-fungal drug discovery. PMID:27874849
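    The size of the cryptagen matrix follows directly from the number of selected compounds: 128 compounds yield C(128, 2) unordered pairs. A short Python check, with variable names chosen here for illustration:

```python
from math import comb

n_cryptagens = 128
# Number of unordered pairs of distinct compounds: 128 * 127 / 2.
n_pairs = comb(n_cryptagens, 2)
print(n_pairs)
```

    The result matches the 8,128 pairwise tests reported for the CM dataset.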

  6. Improved Cost-Base Design of Water Distribution Networks using Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Moradzadeh Azar, Foad; Abghari, Hirad; Taghi Alami, Mohammad; Weijs, Steven

    2010-05-01

    Population growth and the progressive extension of urbanization in different parts of Iran cause an increasing demand for primary needs. Water is the most important natural need for human life. Meeting this need requires the design and construction of water distribution networks, which incur enormous costs on the country's budget. Any reduction in these costs enables more people in society to benefit at the least cost. Therefore, the investments of municipal councils need to maximize benefits or minimize expenditures. To achieve this purpose, engineering design depends on cost optimization techniques. This paper presents optimization models based on a genetic algorithm (GA) to find the minimum design cost of the water distribution network of Mahabad City (North West Iran). By designing two models and comparing the resulting costs, the abilities of the GA were determined. The GA-based model could find optimum pipe diameters to reduce the design costs of the network. Results show that water distribution network design using a genetic algorithm could lead to a reduction of at least 7% in project costs in comparison to the classic model. Keywords: Genetic Algorithm, Optimum Design of Water Distribution Network, Mahabad City, Iran.

  7. Locating Critical Circular and Unconstrained Failure Surface in Slope Stability Analysis with Tailored Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Pasik, Tomasz; van der Meij, Raymond

    2017-12-01

    This article presents an efficient search method for representative circular and unconstrained slip surfaces using a tailored genetic algorithm. Searches for unconstrained slip planes with rigid equilibrium methods are still uncommon in engineering practice, and few publications regarding truly free slip planes exist. The proposed method is an effective procedure resulting from the right combination of initial population type, selection, crossover and mutation method. The procedure needs little computational effort to find the optimum, unconstrained slip plane. The methodology described in this paper is implemented using Mathematica. The implementation, along with further explanations, is fully presented so the results can be reproduced. Sample slope stability calculations are performed for four cases, along with a detailed result interpretation. Two cases are compared with analyses described in earlier publications. The remaining two are practical cases of slope stability analyses of dikes in the Netherlands. These four cases show the benefits of analyzing slope stability with a rigid equilibrium method combined with a genetic algorithm. The paper concludes by describing possibilities and limitations of using the genetic algorithm in the context of the slope stability problem.

  8. Multiple sclerosis: individualized disease susceptibility and therapy response.

    PubMed

    Pravica, Vera; Markovic, Milos; Cupic, Maja; Savic, Emina; Popadic, Dusan; Drulovic, Jelena; Mostarica-Stojkovic, Marija

    2013-02-01

    Multiple sclerosis (MS) is a heterogeneous disease in which diverse genetic, pathological and clinical backgrounds lead to variable therapy response. Accordingly, MS care should be tailored to address disease traits unique to each person. At the core of personalized management is the emergence of new knowledge, enabling optimized treatment and disease-modifying therapies. This overview analyzes the promise of genetic and nongenetic biomarkers in advancing decision-making algorithms to assist diagnosis or in predicting the disease course and therapy response in any given MS patient.

  9. A study on the application of topic models to motif finding algorithms.

    PubMed

    Basha Gutierrez, Josep; Nakai, Kenta

    2016-12-22

    Topic models are statistical algorithms which try to discover the structure of a set of documents according to the abstract topics contained in them. Here we try to apply this approach to the discovery of the structure of the transcription factor binding sites (TFBS) contained in a set of biological sequences, which is a fundamental problem in molecular biology research for the understanding of transcriptional regulation. Here we present two methods that make use of topic models for motif finding. First, we developed an algorithm in which first a set of biological sequences are treated as text documents, and the k-mers contained in them as words, to then build a correlated topic model (CTM) and iteratively reduce its perplexity. We also used the perplexity measurement of CTMs to improve our previous algorithm based on a genetic algorithm and several statistical coefficients. The algorithms were tested with 56 data sets from four different species and compared to 14 other methods by the use of several coefficients both at nucleotide and site level. The results of our first approach showed a performance comparable to the other methods studied, especially at site level and in sensitivity scores, in which it scored better than any of the 14 existing tools. In the case of our previous algorithm, the new approach with the addition of the perplexity measurement clearly outperformed all of the other methods in sensitivity, both at nucleotide and site level, and in overall performance at site level. The statistics obtained show that the performance of a motif finding method based on the use of a CTM is satisfying enough to conclude that the application of topic models is a valid method for developing motif finding algorithms. Moreover, the addition of topic models to a previously developed method dramatically increased its performance, suggesting that this combined algorithm can be a useful tool to successfully predict motifs in different kinds of sets of DNA sequences.

  10. Validation of Clinical Testing for Warfarin Sensitivity

    PubMed Central

    Langley, Michael R.; Booker, Jessica K.; Evans, James P.; McLeod, Howard L.; Weck, Karen E.

    2009-01-01

    Responses to warfarin (Coumadin) anticoagulation therapy are affected by genetic variability in both the CYP2C9 and VKORC1 genes. Validation of pharmacogenetic testing for warfarin responses includes demonstration of analytical validity of testing platforms and of the clinical validity of testing. We compared four platforms for determining the relevant single nucleotide polymorphisms (SNPs) in both CYP2C9 and VKORC1 that are associated with warfarin sensitivity (Third Wave Invader Plus, ParagonDx/Cepheid Smart Cycler, Idaho Technology LightCycler, and AutoGenomics Infiniti). Each method was examined for accuracy, cost, and turnaround time. All genotyping methods demonstrated greater than 95% accuracy for identifying the relevant SNPs (CYP2C9 *2 and *3; VKORC1 −1639 or 1173). The ParagonDx and Idaho Technology assays had the shortest turnaround and hands-on times. The Third Wave assay was readily scalable to higher test volumes but had the longest hands-on time. The AutoGenomics assay interrogated the largest number of SNPs but had the longest turnaround time. Four published warfarin-dosing algorithms (Washington University, UCSF, Louisville, and Newcastle) were compared for accuracy for predicting warfarin dose in a retrospective analysis of a local patient population on long-term, stable warfarin therapy. The predicted doses from both the Washington University and UCSF algorithms demonstrated the best correlation with actual warfarin doses. PMID:19324988

  11. A novel hybrid algorithm for the design of the phase diffractive optical elements for beam shaping

    NASA Astrophysics Data System (ADS)

    Jiang, Wenbo; Wang, Jun; Dong, Xiucheng

    2013-02-01

    In this paper, a novel hybrid algorithm for the design of phase diffractive optical elements (PDOE) is proposed. It combines the genetic algorithm (GA) with the transformable scale BFGS (Broyden, Fletcher, Goldfarb, Shanno) algorithm; a penalty function is used in the definition of the cost function. The novel hybrid algorithm has the global merits of the genetic algorithm as well as the local improvement capabilities of the transformable scale BFGS algorithm. We designed the PDOE using the conventional simulated annealing algorithm and the novel hybrid algorithm. To compare the performance of the two algorithms, three indexes (diffractive efficiency, uniformity error and signal-to-noise ratio) are considered in numerical simulation. The results show that the novel hybrid algorithm has good convergence properties and good stability. As an application example, the PDOE was used for Gaussian beam shaping; high diffractive efficiency, low uniformity error and high signal-to-noise ratio were obtained. The PDOE can be used for high-quality beam shaping in applications such as inertial confinement fusion (ICF), excimer laser lithography, fiber coupling laser diode array, laser welding, etc. It shows wide application value.

  12. Investigation of trunk muscle activities during lifting using a multi-objective optimization-based model and intelligent optimization algorithms.

    PubMed

    Ghiasi, Mohammad Sadegh; Arjmand, Navid; Boroushaki, Mehrdad; Farahmand, Farzam

    2016-03-01

    A six-degree-of-freedom musculoskeletal model of the lumbar spine was developed to predict the activity of trunk muscles during light, moderate and heavy lifting tasks in standing posture. The model was formulated into a multi-objective optimization problem, minimizing the sum of the cubed muscle stresses and maximizing the spinal stability index. Two intelligent optimization algorithms, i.e., the vector evaluated particle swarm optimization (VEPSO) and nondominated sorting genetic algorithm (NSGA), were employed to solve the optimization problem. The optimal solution for each task was then found in the way that the corresponding in vivo intradiscal pressure could be reproduced. Results indicated that both algorithms predicted co-activity in the antagonistic abdominal muscles, as well as an increase in the stability index when going from the light to the heavy task. For all of the light, moderate and heavy tasks, the muscles' activities predictions of the VEPSO and the NSGA were generally consistent and in the same order of the in vivo electromyography data. The proposed methodology is thought to provide improved estimations for muscle activities by considering the spinal stability and incorporating the in vivo intradiscal pressure data.
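    The two-objective formulation above (minimize a muscle-stress cost, maximize a stability index) rests on the notion of Pareto dominance that nondominated sorting methods such as NSGA use. The following Python sketch shows only that dominance filter, not the VEPSO or NSGA algorithms themselves; the candidate values are made up for illustration.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    and strictly better on one. Each candidate is (cost, stability):
    cost is minimized, stability is maximized."""
    cost_a, stab_a = a
    cost_b, stab_b = b
    no_worse = cost_a <= cost_b and stab_a >= stab_b
    strictly_better = cost_a < cost_b or stab_a > stab_b
    return no_worse and strictly_better


def nondominated(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (cost, stability) pairs for five candidate activation patterns.
candidates = [(10.0, 0.2), (12.0, 0.5), (11.0, 0.1), (15.0, 0.5), (9.0, 0.05)]
front = nondominated(candidates)
```

    The surviving candidates form the trade-off front; the abstract then selects from such a front the solution that reproduces the in vivo intradiscal pressure.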

  13. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE PAGES

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.; ...

    2018-01-09

    The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  14. Linking secondary metabolites to gene clusters through genome sequencing of six diverse Aspergillus species

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kjerbolling, Inge; Vesth, Tammi C.; Frisvad, Jens C.

    The fungal genus of Aspergillus is highly interesting, containing everything from industrial cell factories over model organisms to human pathogens. In particular, this group has a prolific production of bioactive secondary metabolites (SMs). In this work, four diverse Aspergillus species (A. campestris, A. novofumigatus, A. ochraceoroseus and A. steynii) have been whole genome PacBio sequenced to provide genetic references in three Aspergillus sections. Additionally, A. taichungensis and A. candidus were sequenced for SM elucidation. Thirteen Aspergillus genomes were analysed with comparative genomics to determine phylogeny and genetic diversity, showing that each new genome contains 15–27% genes not found in other sequenced Aspergilli. In particular, the new species A. novofumigatus was compared to the pathogenic species A. fumigatus. This suggests that A. novofumigatus can produce most of the same allergens, virulence and pathogenicity factors as A. fumigatus, suggesting that A. novofumigatus could be as pathogenic as A. fumigatus. Furthermore, SMs were linked to gene clusters based on biological and chemical knowledge and analysis, genome sequences and predictive algorithms.

  15. Cordova: Web-based management of genetic variation data

    PubMed Central

    Ephraim, Sean S.; Anand, Nikhil; DeLuca, Adam P.; Taylor, Kyle R.; Kolbe, Diana L.; Simpson, Allen C.; Azaiez, Hela; Sloan, Christina M.; Shearer, A. Eliot; Hallier, Andrea R.; Casavant, Thomas L.; Scheetz, Todd E.; Smith, Richard J. H.; Braun, Terry A.

    2014-01-01

    Summary: Cordova is an out-of-the-box solution for building and maintaining an online database of genetic variations integrated with pathogenicity prediction results from popular algorithms. Our primary motivation for developing this system is to aid researchers and clinician–scientists in determining the clinical significance of genetic variations. To achieve this goal, Cordova provides an interface to review and manually or computationally curate genetic variation data as well as share it for clinical diagnostics and the advancement of research. Availability and implementation: Cordova is open source under the MIT license and is freely available for download at https://github.com/clcg/cordova. Contact: sean.ephraim@gmail.com or terry-braun@uiowa.edu PMID:25123904

  16. Identification of unique repeated patterns, location of mutation in DNA finger printing using artificial intelligence technique.

    PubMed

    Mukunthan, B; Nagaveni, N

    2014-01-01

    In genetic engineering, the conventional techniques and algorithms employed by forensic scientists to identify individuals on the basis of their DNA profiles involve complex computational steps and mathematical formulae, and identifying the location of a mutation in a genomic sequence in laboratories is still an exigent task. This novel approach provides the ability to solve problems that do not have an algorithmic solution or whose available solutions are too complex to be found. The perfect blend of bioinformatics and neural network techniques results in an efficient DNA pattern analysis algorithm with utmost prediction accuracy.

  17. Optimal Design of Passive Power Filters Based on Pseudo-parallel Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Li, Pei; Li, Hongbo; Gao, Nannan; Niu, Lin; Guo, Liangfeng; Pei, Ying; Zhang, Yanyan; Xu, Minmin; Chen, Kerui

    2017-05-01

    Economic cost and filter efficiency are taken as the objectives for optimizing the parameters of the passive filter. A method combining a pseudo-parallel genetic algorithm with an adaptive genetic algorithm is adopted in this paper. In the early stages the pseudo-parallel genetic algorithm is used to increase population diversity, and the adaptive genetic algorithm is used in the late stages to reduce the workload. At the same time, the migration rate of the pseudo-parallel genetic algorithm is improved so that it changes adaptively with population diversity. Simulation results show that the filter designed by the proposed method has a better filtering effect at lower economic cost, and can be used in engineering.

  18. Modelling and validation of diffuse reflectance of the adult human head for fNIRS: scalp sub-layers definition

    NASA Astrophysics Data System (ADS)

    Herrera-Vega, Javier; Montero-Hernández, Samuel; Tachtsidis, Ilias; Treviño-Palacios, Carlos G.; Orihuela-Espina, Felipe

    2017-11-01

    Accurate estimation of brain haemodynamics parameters such as cerebral blood flow and volume as well as oxygen consumption, i.e. metabolic rate of oxygen, with functional near infrared spectroscopy (fNIRS) requires precise characterization of light propagation through head tissues. An anatomically realistic forward model of the human adult head with unprecedented detailed specification of the 5 scalp sublayers to account for blood irrigation in the connective tissue layer is introduced. The full model consists of 9 layers, accounts for optical properties ranging from 750 nm to 950 nm and has a voxel size of 0.5 mm. The whole model is validated by comparing the predicted remitted spectra, using Monte Carlo simulations of radiation propagation with 10^8 photons, against continuous wave (CW) broadband fNIRS experimental data. As the true oxy- and deoxy-hemoglobin concentrations during acquisition are unknown, a genetic algorithm searched for the vector of parameters that generates a modelled spectrum that optimally fits the experimental spectrum. Differences between experimental and model-predicted spectra were quantified using the root mean square error (RMSE). RMSE was 0.071 +/- 0.004, 0.108 +/- 0.018 and 0.235 +/- 0.015 at 1, 2 and 3 cm interoptode distance respectively. The parameter vector of absolute concentrations of haemoglobin species in scalp and cortex retrieved with the genetic algorithm was within histologically plausible ranges. The new model's capability to estimate the contribution of the scalp blood flow shall permit incorporating this information into the regularization of the inverse problem for a cleaner reconstruction of brain hemodynamics.
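    The fitting step described above, a genetic algorithm searching for the parameter vector whose modelled spectrum minimizes the RMSE against a measured spectrum, can be sketched in Python as follows. This is only an illustration under loud assumptions: `toy_model` is a hypothetical two-parameter linear stand-in for the Monte Carlo forward model, and the population size, blend crossover, mutation scale and elitism are choices made here, not the authors' settings.

```python
import math
import random


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def toy_model(params, xs):
    """Hypothetical stand-in for the forward model: y = a * x + b."""
    a, b = params
    return [a * x + b for x in xs]


def fit_with_ga(xs, observed, pop_size=30, generations=40, seed=0):
    """Real-coded GA with elitism; returns (best params, best RMSE, initial best RMSE)."""
    rng = random.Random(seed)

    def fitness(params):
        return rmse(observed, toy_model(params, xs))

    population = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(pop_size)]
    initial_best = min(fitness(p) for p in population)
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]      # truncation selection
        children = [population[0]]                 # elitism: keep the best as-is
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            w = rng.random()                       # blend crossover weight
            child = tuple(
                w * g1 + (1 - w) * g2 + rng.gauss(0, 0.1)  # Gaussian mutation
                for g1, g2 in zip(p1, p2)
            )
            children.append(child)
        population = children
    best = min(population, key=fitness)
    return best, fitness(best), initial_best


xs = [0.0, 1.0, 2.0, 3.0]
observed = [1.0, 3.0, 5.0, 7.0]   # generated by a = 2, b = 1
best_params, final_rmse, initial_rmse = fit_with_ga(xs, observed)
```

    Because the best individual is carried over unchanged each generation, the best RMSE can never get worse from one generation to the next; how close it gets to zero depends on the settings assumed here.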

  19. The impact of self-reported ethnicity versus genetic ancestry on phenotypic characteristics of polycystic ovary syndrome (PCOS).

    PubMed

    Louwers, Y V; Lao, O; Fauser, B C J M; Kayser, M; Laven, J S E

    2014-10-01

    It is well established that ethnicity is associated with the phenotype of polycystic ovary syndrome (PCOS). Self-reported ethnicity was shown to be an inaccurate proxy for ethnic origin in other disease traits, and it remains unclear how in PCOS patients self-reported ethnicity compares with a biological proxy such as genetic ancestry. We compared the impact of self-reported ethnicity versus genetic ancestry on PCOS and tested which of these 2 classifications better predicts the variability in phenotypic characteristics of PCOS. A total of 1499 PCOS patients from The Netherlands, comprising 11 self-reported ethnic groups of European, African, American, and Asian descent were genotyped with the Illumina 610K Quad BeadChip and merged with the data genotyped with the Illumina HumanHap650K available for the reference panel collected by the Human Genome Diversity Project (HGDP), in a collaboration with the Centre Etude Polymorphism Humain (CEPH), including 53 populations for ancestry reference. Algorithms for inferring genetic relationships among individuals, including multidimensional scaling and ADMIXTURE, were applied to recover genetic ancestry for each individual. Regression analysis was used to determine the best predictor for the variability in PCOS characteristics. The association between self-reported ethnicity and genetic ancestry was moderate. For amenorrhea, total follicle count, body mass index, SHBG, dehydroepiandrosterone sulfate, and insulin, mainly genetic ancestry clusters ended up in the final models (P values < .004), indicating that they explain a larger proportion of variability of these PCOS characteristics compared with self-reported ethnicity. Especially variability of insulin levels seems predominantly explained by genetic ancestry. 
Self-reported ancestry is not a perfect proxy for genetic ancestry in patients with PCOS, emphasizing that by using genetic ancestry data instead of self-reported ethnicity, PCOS-relevant misclassification can be avoided. Moreover, because genetic ancestry explained a larger proportion of phenotypic variability associated with PCOS than self-reported ethnicity, future studies should focus on genetic ancestry verification of PCOS patients for research questions and treatment as well as preventive strategies in these women.

  20. The alliance relationship analysis of international terrorist organizations with link prediction

    NASA Astrophysics Data System (ADS)

    Fang, Ling; Fang, Haiyang; Tian, Yanfang; Yang, Tinghong; Zhao, Jing

    2017-09-01

    Terrorism is a huge public hazard to the international community. Alliances of terrorist organizations may pose a more serious threat to national security and world peace. Understanding alliances between global terrorist organizations will facilitate more effective anti-terrorism collaboration between governments. Based on publicly available data, this study constructed an alliance network between terrorist organizations and analyzed the alliance relationships with link prediction. We proposed a novel index based on optimal weighted fusion of six similarity indices, in which the optimal weights are calculated by a genetic algorithm. Our experimental results showed that this algorithm could achieve better results on the networks than other algorithms. Using this method, we successfully uncovered 21 real alliances between terrorist organizations from current data. Our experiment shows that this approach to mining alliances between terrorist organizations is effective, and this study is expected to benefit the formation of a more powerful anti-terrorism strategy.
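    The idea of a fused link-prediction index can be sketched in Python as a weighted sum of individual similarity indices. The abstract does not name its six indices, so this sketch assumes two standard ones (common neighbours and Jaccard) purely for illustration, with fixed weights in place of the GA-optimized ones; the graph and all names are hypothetical.

```python
def common_neighbors(adj, u, v):
    """Number of neighbours shared by nodes u and v."""
    return len(adj[u] & adj[v])


def jaccard(adj, u, v):
    """Shared neighbours divided by the size of the combined neighbourhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def fused_score(adj, u, v, weights):
    """Weighted sum of the individual similarity indices for the pair (u, v)."""
    indices = (common_neighbors, jaccard)
    return sum(w * index(adj, u, v) for w, index in zip(weights, indices))


# Hypothetical undirected network stored as adjacency sets.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B"},
}
score = fused_score(adj, "A", "D", weights=(0.5, 0.5))
```

    In the approach described in the abstract, the weight vector would be the quantity a genetic algorithm tunes, with prediction quality on held-out links as the fitness.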

  1. Characterization of human passive muscles for impact loads using genetic algorithm and inverse finite element methods.

    PubMed

    Chawla, A; Mukherjee, S; Karthikeyan, B

    2009-02-01

    The objective of this study is to identify the dynamic material properties of human passive muscle tissues for the strain rates relevant to automobile crashes. A novel methodology involving a genetic algorithm (GA) and the finite element method is implemented to estimate the material parameters by inverse mapping the impact test data. Isolated unconfined impact tests for average strain rates ranging from 136 s^-1 to 262 s^-1 are performed on muscle tissues. Passive muscle tissues are modelled as an isotropic, linear and viscoelastic material using the three-element Zener model available in PAMCRASH(TM) explicit finite element software. In the GA-based identification process, fitness values are calculated by comparing the estimated finite element forces with the measured experimental forces. Linear viscoelastic material parameters (bulk modulus, short term shear modulus and long term shear modulus) are thus identified at strain rates of 136 s^-1, 183 s^-1 and 262 s^-1 for modelling muscles. The optimal parameters extracted in this study are comparable with parameters reported in the literature. Bulk modulus and short term shear modulus are found to be more influential in predicting the stress-strain response than long term shear modulus for the considered strain rates. Variations within the set of parameters identified at different strain rates indicate the need for a new or improved material model capable of capturing the strain rate dependency of passive muscle response with a single set of material parameters for a wide range of strain rates.

  2. Effective search for stable segregation configurations at grain boundaries with data-mining techniques

    NASA Astrophysics Data System (ADS)

    Kiyohara, Shin; Mizoguchi, Teruyasu

    2018-03-01

    Grain boundary segregation of dopants plays a crucial role in materials properties. To investigate the dopant segregation behavior at the grain boundary, an enormous number of combinations have to be considered in the segregation of multiple dopants at the complex grain boundary structures. Here, two data mining techniques, the random-forests regression and the genetic algorithm, were applied to determine stable segregation sites at grain boundaries efficiently. Using the random-forests method, a predictive model was constructed from 2% of the segregation configurations and it has been shown that this model could determine the stable segregation configurations. Furthermore, the genetic algorithm also successfully determined the most stable segregation configuration with great efficiency. We demonstrate that these approaches are quite effective to investigate the dopant segregation behaviors at grain boundaries.

  3. Software tool for data mining and its applications

    NASA Astrophysics Data System (ADS)

    Yang, Jie; Ye, Chenzhou; Chen, Nianyi

    2002-03-01

    A software tool for data mining is introduced, which integrates pattern recognition (PCA, Fisher, clustering, hyperenvelop, regression), artificial intelligence (knowledge representation, decision trees), statistical learning (rough set, support vector machine), and computational intelligence (neural network, genetic algorithm, fuzzy systems). It consists of nine function models: pattern recognition, decision trees, association rule, fuzzy rule, neural network, genetic algorithm, Hyper Envelop, support vector machine, and visualization. The principle and knowledge representation of some function models of data mining are described. The software tool is implemented in Visual C++ under Windows 2000. Nonmonotony in data mining is dealt with by concept hierarchy and layered mining. The software tool has been satisfactorily applied to the prediction of regularities in the formation of ternary intermetallic compounds in alloy systems, and to the diagnosis of brain glioma.

  4. Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies

    PubMed Central

    Manitz, Juliane; Burger, Patricia; Amos, Christopher I.; Chang-Claude, Jenny; Wichmann, Heinz-Erich; Kneib, Thomas; Bickeböller, Heike

    2017-01-01

    The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility. PMID:28785300

  5. Pathway-Based Kernel Boosting for the Analysis of Genome-Wide Association Studies.

    PubMed

    Friedrichs, Stefanie; Manitz, Juliane; Burger, Patricia; Amos, Christopher I; Risch, Angela; Chang-Claude, Jenny; Wichmann, Heinz-Erich; Kneib, Thomas; Bickeböller, Heike; Hofner, Benjamin

    2017-01-01

    The analysis of genome-wide association studies (GWAS) benefits from the investigation of biologically meaningful gene sets, such as gene-interaction networks (pathways). We propose an extension to a successful kernel-based pathway analysis approach by integrating kernel functions into a powerful algorithmic framework for variable selection, to enable investigation of multiple pathways simultaneously. We employ genetic similarity kernels from the logistic kernel machine test (LKMT) as base-learners in a boosting algorithm. A model to explain case-control status is created iteratively by selecting pathways that improve its prediction ability. We evaluated our method in simulation studies adopting 50 pathways for different sample sizes and genetic effect strengths. Additionally, we included an exemplary application of kernel boosting to a rheumatoid arthritis and a lung cancer dataset. Simulations indicate that kernel boosting outperforms the LKMT in certain genetic scenarios. Applications to GWAS data on rheumatoid arthritis and lung cancer resulted in sparse models which were based on pathways interpretable in a clinical sense. Kernel boosting is highly flexible in terms of considered variables and overcomes the problem of multiple testing. Additionally, it enables the prediction of clinical outcomes. Thus, kernel boosting constitutes a new, powerful tool in the analysis of GWAS data and towards the understanding of biological processes involved in disease susceptibility.

  6. Genetic algorithm for TEC seismo-ionospheric anomalies detection around the time of the Solomon (Mw = 8.0) earthquake of 06 February 2013

    NASA Astrophysics Data System (ADS)

    Akhoondzadeh, M.

    2013-08-01

    On 6 February 2013, at 12:12:27 local time (01:12:27 UTC) a seismic event registering Mw 8.0 struck the Solomon Islands, located at the boundaries of the Australian and Pacific tectonic plates. Time series prediction is an important and widely interesting topic in the research of earthquake precursors. This paper describes a new computational intelligence approach to detect the unusual variations of the total electron content (TEC) seismo-ionospheric anomalies induced by the powerful Solomon earthquake using genetic algorithm (GA). The GA detected a considerable number of anomalous occurrences on earthquake day and also 7 and 8 days prior to the earthquake in a period of high geomagnetic activities. In this study, also the detected TEC anomalies using the proposed method are compared to the results dealing with the observed TEC anomalies by applying the mean, median, wavelet, Kalman filter, ARIMA, neural network and support vector machine methods. The accordance in the final results of all eight methods is a convincing indication for the efficiency of the GA method. It indicates that GA can be an appropriate non-parametric tool for anomaly detection in a non linear time series showing the seismo-ionospheric precursors variations.

  7. Improved regulatory element prediction based on tissue-specific local epigenomic signatures

    PubMed Central

    He, Yupeng; Gorkin, David U.; Dickel, Diane E.; Nery, Joseph R.; Castanon, Rosa G.; Lee, Ah Young; Shen, Yin; Visel, Axel; Pennacchio, Len A.; Ren, Bing; Ecker, Joseph R.

    2017-01-01

    Accurate enhancer identification is critical for understanding the spatiotemporal transcriptional regulation during development as well as the functional impact of disease-related noncoding genetic variants. Computational methods have been developed to predict the genomic locations of active enhancers based on histone modifications, but the accuracy and resolution of these methods remain limited. Here, we present an algorithm, regulatory element prediction based on tissue-specific local epigenetic marks (REPTILE), which integrates histone modification and whole-genome cytosine DNA methylation profiles to identify the precise location of enhancers. We tested the ability of REPTILE to identify enhancers previously validated in reporter assays. Compared with existing methods, REPTILE shows consistently superior performance across diverse cell and tissue types, and the enhancer locations are significantly more refined. We show that, by incorporating base-resolution methylation data, REPTILE greatly improves upon current methods for annotation of enhancers across a variety of cell and tissue types. REPTILE is available at https://github.com/yupenghe/REPTILE/. PMID:28193886

  8. A hierarchical clustering methodology for the estimation of toxicity.

    PubMed

    Martin, Todd M; Harten, Paul; Venkatapathy, Raghuraman; Das, Shashikala; Young, Douglas M

    2008-01-01

    A quantitative structure-activity relationship (QSAR) methodology based on hierarchical clustering was developed to predict toxicological endpoints. This methodology utilizes Ward's method to divide a training set into a series of structurally similar clusters. The structural similarity is defined in terms of 2-D physicochemical descriptors (such as connectivity and E-state indices). A genetic algorithm-based technique is used to generate statistically valid QSAR models for each cluster (using the pool of descriptors described above). The toxicity for a given query compound is estimated using the weighted average of the predictions from the closest cluster from each step in the hierarchical clustering assuming that the compound is within the domain of applicability of the cluster. The hierarchical clustering methodology was tested using a Tetrahymena pyriformis acute toxicity data set containing 644 chemicals in the training set and with two prediction sets containing 339 and 110 chemicals. The results from the hierarchical clustering methodology were compared to the results from several different QSAR methodologies.

  9. Charge scheduling of an energy storage system under time-of-use pricing and a demand charge.

    PubMed

    Yoon, Yourim; Kim, Yong-Hyuk

    2014-01-01

    A real-coded genetic algorithm is used to schedule the charging of an energy storage system (ESS), operated in tandem with renewable power by an electricity consumer who is subject to time-of-use pricing and a demand charge. Simulations based on load and generation profiles of typical residential customers show that an ESS scheduled by our algorithm can reduce electricity costs by approximately 17%, compared to a system without an ESS and by 8% compared to a scheduling algorithm based on net power.
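
    A minimal sketch of the kind of objective such a scheduler could minimize is given here. The abstract does not state the cost model, so everything in it is an assumption made for illustration: the function name `electricity_cost`, unit-length time intervals, no payment for exported power, and a demand charge billed on the single highest grid draw. A positive `charge` value means the ESS is charging; a negative one means it is discharging.

```python
from typing import Sequence


def electricity_cost(load: Sequence[float], generation: Sequence[float],
                     charge: Sequence[float], prices: Sequence[float],
                     demand_rate: float) -> float:
    """Time-of-use energy cost plus a demand charge on the peak grid draw.

    Hypothetical cost model: grid draw per interval is load minus renewable
    generation plus ESS charging power, floored at zero (no export revenue).
    """
    grid = [max(l - g + c, 0.0) for l, g, c in zip(load, generation, charge)]
    # Energy component: each interval is billed at its time-of-use price.
    energy_cost = sum(p * e for p, e in zip(prices, grid))
    # Demand component: billed once, on the highest draw in the period.
    return energy_cost + demand_rate * max(grid)


# Two intervals: cheap off-peak, expensive peak.
load, generation, prices = [1.0, 3.0], [0.0, 0.0], [0.1, 0.3]
without_ess = electricity_cost(load, generation, [0.0, 0.0], prices, 2.0)
with_ess = electricity_cost(load, generation, [1.0, -1.0], prices, 2.0)
```

    In this toy case, charging in the cheap interval and discharging in the expensive one flattens the grid draw, which lowers both cost components. A real-coded GA would treat the `charge` vector as the chromosome and this cost as the fitness to minimize, subject to battery capacity limits that the sketch leaves out.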

  10. Charge Scheduling of an Energy Storage System under Time-of-Use Pricing and a Demand Charge

    PubMed Central

    Yoon, Yourim

    2014-01-01

    A real-coded genetic algorithm is used to schedule the charging of an energy storage system (ESS), operated in tandem with renewable power by an electricity consumer who is subject to time-of-use pricing and a demand charge. Simulations based on load and generation profiles of typical residential customers show that an ESS scheduled by our algorithm can reduce electricity costs by approximately 17%, compared to a system without an ESS and by 8% compared to a scheduling algorithm based on net power. PMID:25197720

  11. Using genetic prediction from known complex disease Loci to guide the design of next-generation sequencing experiments.

    PubMed

    Jostins, Luke; Levine, Adam P; Barrett, Jeffrey C

    2013-01-01

    A central focus of complex disease genetics after genome-wide association studies (GWAS) is to identify low frequency and rare risk variants, which may account for an important fraction of disease heritability unexplained by GWAS. A profusion of studies using next-generation sequencing are seeking such risk alleles. We describe how already-known complex trait loci (largely from GWAS) can be used to guide the design of these new studies by selecting cases, controls, or families who are most likely to harbor undiscovered risk alleles. We show that genetic risk prediction can select unrelated cases from large cohorts who are enriched for unknown risk factors, or multiply-affected families that are more likely to harbor high-penetrance risk alleles. We derive the frequency of an undiscovered risk allele in selected cases and controls, and show how this relates to the variance explained by the risk score, the disease prevalence and the population frequency of the risk allele. We also describe a new method for informing the design of sequencing studies using genetic risk prediction in large partially-genotyped families using an extension of the Inside-Outside algorithm for inference on trees. We explore several study design scenarios using both simulated and real data, and show that in many cases genetic risk prediction can provide significant increases in power to detect low-frequency and rare risk alleles. The same approach can also be used to aid discovery of non-genetic risk factors, suggesting possible future utility of genetic risk prediction in conventional epidemiology. Software implementing the methods in this paper is available in the R package Mangrove.

  12. Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung

    A clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improve healthcare quality. An appropriate CDSS can highly elevate patient safety, improve healthcare quality, and increase cost-effectiveness. Support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine a suitable combination of SVM parameters regarding classification performance. Genetic algorithm (GA) can find an optimal solution within an acceptable time, and is faster than a greedy algorithm with an exhaustive searching strategy. By taking advantage of GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which is different from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types from Pap smear. The results show that IGS is better than methods using SVM alone or linear discriminator.

  13. An Effective Hybrid Cuckoo Search Algorithm with Improved Shuffled Frog Leaping Algorithm for 0-1 Knapsack Problems

    PubMed Central

    Wang, Gai-Ge; Feng, Qingjiang; Zhao, Xiang-Jun

    2014-01-01

    An effective hybrid cuckoo search algorithm (CS) with improved shuffled frog-leaping algorithm (ISFLA) is put forward for solving 0-1 knapsack problem. First of all, with the framework of SFLA, an improved frog-leap operator is designed with the effect of the global optimal information on the frog leaping and information exchange between frog individuals combined with genetic mutation with a small probability. Subsequently, in order to improve the convergence speed and enhance the exploitation ability, a novel CS model is proposed with considering the specific advantages of Lévy flights and frog-leap operator. Furthermore, the greedy transform method is used to repair the infeasible solution and optimize the feasible solution. Finally, numerical simulations are carried out on six different types of 0-1 knapsack instances, and the comparative results have shown the effectiveness of the proposed algorithm and its ability to achieve good quality solutions, which outperforms the binary cuckoo search, the binary differential evolution, and the genetic algorithm. PMID:25404940
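
    The repair-and-optimize step mentioned in this abstract can be illustrated with a short sketch. The abstract does not spell out the greedy transform method, so the version here is an assumption: items are ranked by value-to-weight ratio, the least dense selected items are dropped until the capacity constraint holds, and then the densest unselected items that still fit are added. The name `greedy_repair` is hypothetical, and weights are assumed to be strictly positive.

```python
from typing import List, Sequence


def greedy_repair(x: Sequence[int], values: Sequence[float],
                  weights: Sequence[float], capacity: float) -> List[int]:
    """Make a 0-1 knapsack vector feasible, then fill any spare capacity."""
    x = list(x)
    # Rank items by value density (value per unit weight), best first.
    order = sorted(range(len(x)), key=lambda i: values[i] / weights[i],
                   reverse=True)
    load = sum(weights[i] for i in range(len(x)) if x[i])
    # Repair phase: drop the least dense selected items until the load fits.
    for i in reversed(order):
        if load <= capacity:
            break
        if x[i]:
            x[i] = 0
            load -= weights[i]
    # Optimization phase: add the densest unselected items that still fit.
    for i in order:
        if not x[i] and load + weights[i] <= capacity:
            x[i] = 1
            load += weights[i]
    return x


values, weights, capacity = [60, 100, 120], [10, 20, 30], 50
repaired = greedy_repair([1, 1, 1], values, weights, capacity)
filled = greedy_repair([0, 0, 0], values, weights, capacity)
```

    The result is always feasible but not necessarily optimal; in a hybrid metaheuristic the operator only has to return candidate solutions to the feasible region cheaply after each move.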

  14. A Comparative Study of Probability Collectives Based Multi-agent Systems and Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Huang, Chien-Feng; Wolpert, David H.; Bieniawski, Stefan; Strauss, Charles E. M.

    2005-01-01

    We compare Genetic Algorithms (GA's) with Probability Collectives (PC), a new framework for distributed optimization and control. In contrast to GA's, PC-based methods do not update populations of solutions. Instead they update an explicitly parameterized probability distribution p over the space of solutions. That updating of p arises as the optimization of a functional of p. The functional is chosen so that any p that optimizes it should be p peaked about good solutions. The PC approach works in both continuous and discrete problems. It does not suffer from the resolution limitation of the finite bit length encoding of parameters into GA alleles. It also has deep connections with both game theory and statistical physics. We review the PC approach using its motivation as the information theoretic formulation of bounded rationality for multi-agent systems. It is then compared with GA's on a diverse set of problems. To handle high dimensional surfaces, in the PC method investigated here p is restricted to a product distribution. Each distribution in that product is controlled by a separate agent. The test functions were selected for their difficulty using either traditional gradient descent or genetic algorithms. On those functions the PC-based approach significantly outperforms traditional GA's in both rate of descent, trapping in false minima, and long term optimization.

  15. WE-E-17A-06: Assessing the Scale of Tumor Heterogeneity by Complete Hierarchical Segmentation On MRI

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Gensheimer, M; Trister, A; Ermoian, R

    2014-06-15

    Purpose: In many cancers, intratumoral heterogeneity exists in vascular and genetic structure. We developed an algorithm which uses clinical imaging to interrogate different scales of heterogeneity. We hypothesize that heterogeneity of perfusion at large distance scales may correlate with propensity for disease recurrence. We applied the algorithm to initial diagnosis MRI of rhabdomyosarcoma patients to predict recurrence. Methods: The Spatial Heterogeneity Analysis by Recursive Partitioning (SHARP) algorithm recursively segments the tumor image. The tumor is repeatedly subdivided, with each dividing line chosen to maximize signal intensity difference between the two subregions. This process continues to the voxel level, producing segments at multiple scales. Heterogeneity is measured by comparing signal intensity histograms between each segmented region and the adjacent region. We measured the scales of contrast enhancement heterogeneity of the primary tumor in 18 rhabdomyosarcoma patients. Using Cox proportional hazards regression, we explored the influence of heterogeneity parameters on relapse-free survival (RFS). To compare with existing methods, fractal and Haralick texture features were also calculated. Results: The complete segmentation produced by SHARP allows extraction of diverse features, including the amount of heterogeneity at various distance scales, the area of the tumor with the most heterogeneity at each scale, and for a given point in the tumor, the heterogeneity at different scales. 10/18 rhabdomyosarcoma patients suffered disease recurrence. On contrast-enhanced MRI, larger scale of maximum signal intensity heterogeneity, relative to tumor diameter, predicted for shorter RFS (p=0.05). Fractal dimension, fractal fit, and three Haralick features did not predict RFS (p=0.09-0.90). Conclusion: SHARP produces an automatic segmentation of tumor regions and reports the amount of heterogeneity at various distance scales. In rhabdomyosarcoma, RFS was shorter when the primary tumor exhibited larger scale of heterogeneity on contrast-enhanced MRI. If validated on a larger dataset, this imaging biomarker could be useful to help personalize treatment.

  16. How Crossover Speeds up Building Block Assembly in Genetic Algorithms.

    PubMed

    Sudholt, Dirk

    2017-01-01

    We reinvestigate a fundamental question: How effective is crossover in genetic algorithms in combining building blocks of good solutions? Although this has been discussed controversially for decades, we are still lacking a rigorous and intuitive answer. We provide such answers for royal road functions and OneMax, where every bit is a building block. For the latter, we show that using crossover makes every ([Formula: see text]+[Formula: see text]) genetic algorithm at least twice as fast as the fastest evolutionary algorithm using only standard bit mutation, up to small-order terms and for moderate [Formula: see text] and [Formula: see text]. Crossover is beneficial because it can capitalize on mutations that have both beneficial and disruptive effects on building blocks: crossover is able to repair the disruptive effects of mutation in later generations. Compared to mutation-based evolutionary algorithms, this makes multibit mutations more useful. Introducing crossover changes the optimal mutation rate on OneMax from [Formula: see text] to [Formula: see text]. This holds both for uniform crossover and k-point crossover. Experiments and statistical tests confirm that our findings apply to a broad class of building block functions.
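
    For readers unfamiliar with the setting, the sketch here shows OneMax, standard bit mutation, uniform crossover, and a simple elitist GA that combines them. It is an illustration under stated assumptions and not the exact algorithm analysed in the paper: the function names, the uniformly random parent selection, and the truncation survivor selection are all choices made here, and the mutation rate is left as a free parameter because the specific rates are elided in the abstract.

```python
import random
from typing import List


def onemax(bits: List[int]) -> int:
    """OneMax fitness: the number of one-bits, so every bit is a building block."""
    return sum(bits)


def uniform_crossover(a: List[int], b: List[int], rng: random.Random) -> List[int]:
    """Take each position from either parent with probability 1/2."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def bit_mutation(bits: List[int], rate: float, rng: random.Random) -> List[int]:
    """Standard bit mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def ga_onemax(n: int, mu: int, lam: int, rate: float,
              rng: random.Random, max_gens: int = 10_000) -> int:
    """Elitist GA sketch; returns the generation in which all-ones first appears."""
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    for gen in range(max_gens):
        if max(onemax(ind) for ind in pop) == n:
            return gen
        kids = []
        for _ in range(lam):
            # Assumption: parents are drawn uniformly at random from the population.
            p1, p2 = rng.choice(pop), rng.choice(pop)
            kids.append(bit_mutation(uniform_crossover(p1, p2, rng), rate, rng))
        # Keep the mu fittest of parents and offspring (plus selection).
        pop = sorted(pop + kids, key=onemax, reverse=True)[:mu]
    return max_gens


generations = ga_onemax(n=20, mu=4, lam=4, rate=1 / 20, rng=random.Random(0))
```

    Because crossover only recombines bits already present in the parents, a child of two parents that agree on a position inherits that bit unchanged; the disagreeing positions are where crossover can undo a harmful flip introduced by an earlier mutation.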

  17. Utilizing Machine Learning and Automated Performance Metrics to Evaluate Robot-Assisted Radical Prostatectomy Performance and Predict Outcomes.

    PubMed

    Hung, Andrew J; Chen, Jian; Che, Zhengping; Nilanon, Tanachat; Jarc, Anthony; Titus, Micha; Oh, Paul J; Gill, Inderbir S; Liu, Yan

    2018-05-01

    Surgical performance is critical for clinical outcomes. We present a novel machine learning (ML) method of processing automated performance metrics (APMs) to evaluate surgical performance and predict clinical outcomes after robot-assisted radical prostatectomy (RARP). We trained three ML algorithms utilizing APMs directly from robot system data (training material) and hospital length of stay (LOS; training label) (≤2 days and >2 days) from 78 RARP cases, and selected the algorithm with the best performance. The selected algorithm categorized the cases as "Predicted as expected LOS (pExp-LOS)" and "Predicted as extended LOS (pExt-LOS)." We compared postoperative outcomes of the two groups (Kruskal-Wallis/Fisher's exact tests). The algorithm then predicted individual clinical outcomes, which we compared with actual outcomes (Spearman's correlation/Fisher's exact tests). Finally, we identified five most relevant APMs adopted by the algorithm during predicting. The "Random Forest-50" (RF-50) algorithm had the best performance, reaching 87.2% accuracy in predicting LOS (73 cases as "pExp-LOS" and 5 cases as "pExt-LOS"). The "pExp-LOS" cases outperformed the "pExt-LOS" cases in surgery time (3.7 hours vs 4.6 hours, p = 0.007), LOS (2 days vs 4 days, p = 0.02), and Foley duration (9 days vs 14 days, p = 0.02). Patient outcomes predicted by the algorithm had significant association with the "ground truth" in surgery time (p < 0.001, r = 0.73), LOS (p = 0.05, r = 0.52), and Foley duration (p < 0.001, r = 0.45). The five most relevant APMs, adopted by the RF-50 algorithm in predicting, were largely related to camera manipulation. To our knowledge, ours is the first study to show that APMs and ML algorithms may help assess surgical RARP performance and predict clinical outcomes. With further accrual of clinical data (oncologic and functional data), this process will become increasingly relevant and valuable in surgical assessment and training.

  18. Linear time algorithms to construct populations fitting multiple constraint distributions at genomic scales.

    PubMed

    Siragusa, Enrico; Haiminen, Niina; Utro, Filippo; Parida, Laxmi

    2017-10-09

    Computer simulations can be used to study population genetic methods, models and parameters, as well as to predict potential outcomes. For example, in plant populations, predicting the outcome of breeding operations can be studied using simulations. In-silico construction of populations with pre-specified characteristics is an important task in breeding optimization and other population genetic studies. We present two linear time Simulation using Best-fit Algorithms (SimBA) for two classes of problems where each co-fits two distributions: SimBA-LD fits linkage disequilibrium and minimum allele frequency distributions, while SimBA-hap fits founder-haplotype and polyploid allele dosage distributions. An incremental gap-filling version of previously introduced SimBA-LD is here demonstrated to accurately fit the target distributions, allowing efficient large scale simulations. SimBA-hap accuracy and efficiency is demonstrated by simulating tetraploid populations with varying numbers of founder haplotypes; we evaluate both a linear time greedy algorithm and an optimal solution based on mixed-integer programming. SimBA is available on http://researcher.watson.ibm.com/project/5669.

  19. Pharmacogenetics of warfarin: challenges and opportunities

    PubMed Central

    Ta Michael Lee, Ming; Klein, Teri E

    2014-01-01

    Since the introduction in the 1950s, warfarin has become the commonly used oral anticoagulant for the prevention of thromboembolism in patients with deep vein thrombosis, atrial fibrillation or prosthetic heart valve replacement. Warfarin is highly efficacious; however, achieving the desired anticoagulation is difficult because of its narrow therapeutic window and highly variable dose response among individuals. Bleeding is often associated with overdose of warfarin. There is overwhelming evidence that an individual's warfarin maintenance is associated with clinical factors and genetic variations, most notably polymorphisms in cytochrome P450 2C9 and vitamin K epoxide reductase subunit 1. Numerous dose-prediction algorithms incorporating both genetic and clinical factors have been developed and tested clinically. However, results from major clinical trials are not available yet. This review aims to provide an overview of the field of warfarin which includes information about the drug, genetics of warfarin dose requirements, dosing algorithms developed and the challenges for the clinical implementation of warfarin pharmacogenetics. PMID:23657428

  20. QSAR prediction of additive and non-additive mixture toxicities of antibiotics and pesticide.

    PubMed

    Qin, Li-Tang; Chen, Yu-Han; Zhang, Xin; Mo, Ling-Yun; Zeng, Hong-Hu; Liang, Yan-Peng

    2018-05-01

    Antibiotics and pesticides may exist as a mixture in real environment. The combined effect of mixture can either be additive or non-additive (synergism and antagonism). However, no effective predictive approach exists on predicting the synergistic and antagonistic toxicities of mixtures. In this study, we developed a quantitative structure-activity relationship (QSAR) model for the toxicities (half effect concentration, EC50) of 45 binary and multi-component mixtures composed of two antibiotics and four pesticides. The acute toxicities of single compound and mixtures toward Aliivibrio fischeri were tested. A genetic algorithm was used to obtain the optimized model with three theoretical descriptors. Various internal and external validation techniques indicated that the coefficient of determination of 0.9366 and root mean square error of 0.1345 for the QSAR model predicted that 45 mixture toxicities presented additive, synergistic, and antagonistic effects. Compared with the traditional concentration additive and independent action models, the QSAR model exhibited an advantage in predicting mixture toxicity. Thus, the presented approach may be able to fill the gaps in predicting non-additive toxicities of binary and multi-component mixtures. Copyright © 2018 Elsevier Ltd. All rights reserved.

  1. Software For Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Wang, Lui; Bayer, Steve E.

    1992-01-01

    SPLICER computer program is genetic-algorithm software tool used to solve search and optimization problems. Provides underlying framework and structure for building genetic-algorithm application program. Written in Think C.

  2. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    NASA Astrophysics Data System (ADS)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.

  3. Pricing and location decisions in multi-objective facility location problem with M/M/m/k queuing systems

    NASA Astrophysics Data System (ADS)

    Tavakkoli-Moghaddam, Reza; Vazifeh-Noshafagh, Samira; Taleizadeh, Ata Allah; Hajipour, Vahid; Mahmoudi, Amin

    2017-01-01

    This article presents a new multi-objective model for a facility location problem with congestion and pricing policies. This model considers situations in which immobile service facilities are congested by a stochastic demand following M/M/m/k queues. The presented model belongs to the class of mixed-integer nonlinear programming models and NP-hard problems. To solve such a hard model, a new multi-objective optimization algorithm based on a vibration theory, namely multi-objective vibration damping optimization (MOVDO), is developed. In order to tune the algorithms' parameters, the Taguchi approach using a response metric is implemented. The computational results are compared with those of the non-dominated ranking genetic algorithm and non-dominated sorting genetic algorithm. The outputs demonstrate the robustness of the proposed MOVDO in large-sized problems.

  4. A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks

    PubMed Central

    Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M.

    2016-01-01

    This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF–FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model’s performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF–FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF–FFA model can be applied as an efficient technique for the accurate prediction of vertical handover. PMID:27438600

  5. A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks.

    PubMed

    Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M

    2016-01-01

    This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF-FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model's performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF-FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF-FFA model can be applied as an efficient technique for the accurate prediction of vertical handover.
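
    The four accuracy measures named in this abstract have standard textbook definitions, sketched here in plain Python. The function names are chosen for illustration, and whether the authors used exactly these variants (for example, R2 computed as one minus the ratio of residual to total sum of squares) is an assumption. MAPE is undefined when an observed value is zero, and a constant series makes the denominators of R2 and r zero.

```python
import math
from typing import Sequence


def rmse(y: Sequence[float], p: Sequence[float]) -> float:
    """Root mean square error between observed y and predicted p."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, p)) / len(y))


def mape(y: Sequence[float], p: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, p)) / len(y)


def r_squared(y: Sequence[float], p: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, p))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def pearson_r(y: Sequence[float], p: Sequence[float]) -> float:
    """Pearson correlation coefficient between observed and predicted values."""
    my, mp = sum(y) / len(y), sum(p) / len(p)
    cov = sum((a - my) * (b - mp) for a, b in zip(y, p))
    var_y = sum((a - my) ** 2 for a in y)
    var_p = sum((b - mp) ** 2 for b in p)
    return cov / math.sqrt(var_y * var_p)


observed, predicted = [2.0, 4.0], [1.0, 5.0]
scores = (rmse(observed, predicted), mape(observed, predicted),
          r_squared(observed, predicted), pearson_r(observed, predicted))
```

    The toy pair shows why the measures are reported together: the predictions are perfectly correlated with the observations (r = 1) yet explain none of the variance under this definition (R2 = 0), because correlation ignores errors in scale.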

  6. Ground Motion Prediction Model Using Artificial Neural Network

    NASA Astrophysics Data System (ADS)

    Dhanya, J.; Raghukanth, S. T. G.

    2018-03-01

    This article focuses on developing a ground motion prediction equation based on artificial neural network (ANN) technique for shallow crustal earthquakes. A hybrid technique combining genetic algorithm and Levenberg-Marquardt technique is used for training the model. The present model is developed to predict peak ground velocity, and 5% damped spectral acceleration. The input parameters for the prediction are moment magnitude (Mw), closest distance to rupture plane (Rrup), shear wave velocity in the region (Vs30) and focal mechanism (F). A total of 13,552 ground motion records from 288 earthquakes provided by the updated NGA-West2 database released by Pacific Engineering Research Center are utilized to develop the model. The ANN architecture considered for the model consists of 192 unknowns including weights and biases of all the interconnected nodes. The performance of the model is observed to be within the prescribed error limits. In addition, the results from the study are found to be comparable with the existing relations in the global database. The developed model is further demonstrated by estimating site-specific response spectra for Shimla city located in Himalayan region.

  7. QSRR using evolved artificial neural network for 52 common pharmaceuticals and drugs of abuse in hair from UPLC-TOF-MS.

    PubMed

    Noorizadeh, Hadi; Farmany, Abbas; Narimani, Hojat; Noorizadeh, Mehrab

    2013-05-01

    A quantitative structure-retention relationship (QSRR) study based on an artificial neural network (ANN) was carried out for the prediction of the ultra-performance liquid chromatography-Time-of-Flight mass spectrometry (UPLC-TOF-MS) retention time (RT) of a set of 52 pharmaceuticals and drugs of abuse in hair. The genetic algorithm was used as a variable selection tool. A partial least squares (PLS) method was used to select the best descriptors which were used as input neurons in neural network model. For choosing the best predictive model from among comparable models, square correlation coefficient R(2) for the whole set calculated based on leave-group-out predicted values of the training set and model-derived predicted values for the test set compounds is suggested to be a good criterion. Finally, to improve the results, structure-retention relationships were followed by a non-linear approach using artificial neural networks and consequently better results were obtained. This also demonstrates the advantages of ANN. Copyright © 2011 John Wiley & Sons, Ltd.

  8. Including non-additive genetic effects in Bayesian methods for the prediction of genetic values based on genome-wide markers

    PubMed Central

    2011-01-01

    Background: Molecular marker information is a common source to draw inferences about the relationship between genetic and phenotypic variation. Genetic effects are often modelled as additively acting marker allele effects. The true mode of biological action can, of course, be different from this plain assumption. One possibility to better understand the genetic architecture of complex traits is to include intra-locus (dominance) and inter-locus (epistasis) interaction of alleles as well as the additive genetic effects when fitting a model to a trait. Several Bayesian MCMC approaches exist for the genome-wide estimation of genetic effects with high accuracy of genetic value prediction. Including pairwise interaction for thousands of loci would probably go beyond the scope of such a sampling algorithm because then millions of effects are to be estimated simultaneously leading to months of computation time. Alternative solving strategies are required when epistasis is studied. Methods: We extended a fast Bayesian method (fBayesB), which was previously proposed for a purely additive model, to include non-additive effects. The fBayesB approach was used to estimate genetic effects on the basis of simulated datasets. Different scenarios were simulated to study the loss of accuracy of prediction, if epistatic effects were not simulated but modelled and vice versa. Results: If 23 QTL were simulated to cause additive and dominance effects, both fBayesB and a conventional MCMC sampler BayesB yielded similar results in terms of accuracy of genetic value prediction and bias of variance component estimation based on a model including additive and dominance effects. Applying fBayesB to data with epistasis, accuracy could be improved by 5% when all pairwise interactions were modelled as well. The accuracy decreased more than 20% if genetic variation was spread over 230 QTL. In this scenario, accuracy based on modelling only additive and dominance effects was generally superior to that of the complex model including epistatic effects. Conclusions: This simulation study showed that the fBayesB approach is convenient for genetic value prediction. Jointly estimating additive and non-additive effects (especially dominance) has reasonable impact on the accuracy of prediction and the proportion of genetic variation assigned to the additive genetic source. PMID:21867519

  9. Geographic and ecologic distributions of the Anopheles gambiae complex predicted using a genetic algorithm.

    PubMed

    Levine, Rebecca S; Peterson, A Townsend; Benedict, Mark Q

    2004-02-01

    The distribution of the Anopheles gambiae complex of malaria vectors in Africa is uncertain due to under-sampling of vast regions. We use ecologic niche modeling to predict the potential distribution of three members of the complex (A. gambiae, A. arabiensis, and A. quadriannulatus) and demonstrate the statistical significance of the models. Predictions correspond well to previous estimates, but provide detail regarding spatial discontinuities in the distribution of A. gambiae s.s. that are consistent with population genetic studies. Our predictions also identify large areas of Africa where the presence of A. arabiensis is predicted, but few specimens have been obtained, suggesting under-sampling of the species. Finally, we project models developed from African distribution data for the late 1900s into the past and to South America to determine retrospectively whether the deadly 1929 introduction of A. gambiae sensu lato into Brazil was more likely that of A. gambiae sensu stricto or A. arabiensis.

  10. Mathematical Modeling and Optimizing of in Vitro Hormonal Combination for G × N15 Vegetative Rootstock Proliferation Using Artificial Neural Network-Genetic Algorithm (ANN-GA)

    PubMed Central

    Arab, Mohammad M.; Yadollahi, Abbas; Ahmadi, Hamed; Eftekhari, Maliheh; Maleki, Masoud

    2017-01-01

    The efficiency of a hybrid systems method which combined artificial neural networks (ANNs) as a modeling tool and genetic algorithms (GAs) as an optimizing method for input variables used in ANN modeling was assessed. Hence, as a new technique, it was applied for the prediction and optimization of the plant hormones concentrations and combinations for in vitro proliferation of Garnem (G × N15) rootstock as a case study. Optimizing hormones combination was surveyed by modeling the effects of various concentrations of cytokinin–auxin, i.e., BAP, KIN, TDZ, IBA, and NAA combinations (inputs) on four growth parameters (outputs), i.e., micro-shoots number per explant, length of micro-shoots, developed callus weight (CW) and the quality index (QI) of plantlets. Calculation of statistical values such as R2 (coefficient of determination) related to the accuracy of ANN-GA models showed a considerably higher prediction accuracy for ANN models, i.e., micro-shoots number: R2 = 0.81, length of micro-shoots: R2 = 0.87, CW: R2 = 0.88, QI: R2 = 0.87. According to the results, among the input variables, BAP (19.3), KIN (9.64), and IBA (2.63) showed the highest values of variable sensitivity ratio for proliferation rate. The GA showed that media containing 1.02 mg/l BAP in combination with 0.098 mg/l IBA could lead to the optimal proliferation rate (10.53) for G × N15 rootstock. Another objective of the present study was to compare the performance of predicted and optimized cytokinin–auxin combination with the best optimized obtained concentrations of our other experiments. Considering three growth parameters (length of micro-shoots, micro-shoots number, and proliferation rate), the last treatment was found to be superior to the rest of treatments for G × N15 rootstock in vitro multiplication. Very little difference between the ANN predicted and experimental data confirmed high capability of ANN-GA method in predicting new optimized protocols for plant in vitro propagation. 
PMID:29163583

  11. Crohn's Disease: Genetics Update.

    PubMed

    Wang, Ming-Hsi; Picco, Michael F

    2017-09-01

    Since the discovery of the first Crohn's disease (CD) gene NOD2 in 2001, 140 genetic loci have been found in whites using high-throughput genome-wide association studies. Several genes influence the CD subphenotypes and treatment response. With the observations of increasing prevalence in Asia and developing countries and the incomplete explanation of CD variance, other underexplored areas need to be integrated through novel methodologies. Algorithms that incorporate specific genetic risk alleles with other biomarkers will be developed and used to predict CD disease course, complications, and response to specific therapies, allowing precision medicine to become real in CD. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. ADME evaluation in drug discovery. 1. Applications of genetic algorithms to the prediction of blood-brain partitioning of a large set of drugs.

    PubMed

    Hou, Tingjun; Xu, Xiaojie

    2002-12-01

    In this study, the relationships between the brain-blood concentration ratio of 96 structurally diverse compounds with a large number of structurally derived descriptors were investigated. The linear models were based on molecular descriptors that can be calculated for any compound simply from a knowledge of its molecular structure. The linear correlation coefficients of the models were optimized by genetic algorithms (GAs), and the descriptors used in the linear models were automatically selected from 27 structurally derived descriptors. The GA optimizations resulted in a group of linear models with three or four molecular descriptors with good statistical significance. The change of descriptor use as the evolution proceeds demonstrates that the octane/water partition coefficient and the partial negative solvent-accessible surface area multiplied by the negative charge are crucial to brain-blood barrier permeability. Moreover, we found that the predictions using multiple QSPR models from GA optimization gave quite good results in spite of the diversity of structures, which was better than the predictions using the best single model. The predictions for the two external sets with 37 diverse compounds using multiple QSPR models indicate that the best linear models with four descriptors are sufficiently effective for predictive use. Considering the ease of computation of the descriptors, the linear models may be used as general utilities to screen the blood-brain barrier partitioning of drugs in a high-throughput fashion.

  13. Reframed Genome-Scale Metabolic Model to Facilitate Genetic Design and Integration with Expression Data.

    PubMed

    Gu, Deqing; Jian, Xingxing; Zhang, Cheng; Hua, Qiang

    2017-01-01

    Genome-scale metabolic network models (GEMs) have played important roles in the design of genetically engineered strains and helped biologists to decipher metabolism. However, due to the complex gene-reaction relationships that exist in model systems, most algorithms have limited capabilities with respect to directly predicting accurate genetic design for metabolic engineering. In particular, methods that predict reaction knockout strategies leading to overproduction are often impractical in terms of gene manipulations. Recently, we proposed a method named logical transformation of model (LTM) to simplify the gene-reaction associations by introducing intermediate pseudo reactions, which makes it possible to generate genetic design. Here, we propose an alternative method to relieve researchers from deciphering complex gene-reactions by adding pseudo gene controlling reactions. In comparison to LTM, this new method introduces fewer pseudo reactions and generates a much smaller model system named as gModel. We showed that gModel allows two seldom reported applications: identification of minimal genomes and design of minimal cell factories within a modified OptKnock framework. In addition, gModel could be used to integrate expression data directly and improve the performance of the E-Fmin method for predicting fluxes. In conclusion, the model transformation procedure will facilitate genetic research based on GEMs, extending their applications.

  14. Improving the Sensitivity and Positive Predictive Value in a Cystic Fibrosis Newborn Screening Program Using a Repeat Immunoreactive Trypsinogen and Genetic Analysis.

    PubMed

    Sontag, Marci K; Lee, Rachel; Wright, Daniel; Freedenberg, Debra; Sagel, Scott D

    2016-08-01

    To evaluate the performance of a new cystic fibrosis (CF) newborn screening algorithm, comprised of immunoreactive trypsinogen (IRT) in first (24-48 hours of life) and second (7-14 days of life) dried blood spot plus DNA on second dried blood spot, over existing algorithms. A retrospective review of the IRT/IRT/DNA algorithm implemented in Colorado, Wyoming, and Texas. A total of 1 520 079 newborns were screened, 32 557 (2.1%) had abnormal first IRT; 8794 (0.54%) on second. Furthermore, 14 653 mutation analyses were performed; 1391 newborns were referred for diagnostic testing; 274 newborns were diagnosed; and 201/274 (73%) of newborns had 2 mutations on the newborn screening CFTR panel. Sensitivity was 96.2%, compared with sensitivity of 76.1% observed with IRT/IRT (105 ng/mL cut-offs, P < .0001). The ratio of newborns with CF to heterozygote carriers was 1:2.5, and newborns with CF to newborns with CFTR-related metabolic syndrome was 10.8:1. The overall positive predictive value was 20%. The median age of diagnosis was 28, 30, and 39.5 days in the 3 states. IRT/IRT/DNA is more sensitive than IRT/IRT because of lower cut-offs (∼97 percentile or 60 ng/mL); higher cut-offs in IRT/IRT programs (>99 percentile, 105 ng/mL) would not achieve sufficient sensitivity. Carrier identification and identification of newborns with CFTR-related metabolic syndrome is less common in IRT/IRT/DNA compared with IRT/DNA. The time to diagnosis is nominally longer, but diagnosis can be achieved in the neonatal period and opportunities to further improve timeliness have been enacted. IRT/IRT/DNA algorithm should be considered by programs with 2 routine screens. Copyright © 2016 Elsevier Inc. All rights reserved.
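
    The positive predictive value and the two-mutation share quoted in this abstract can be checked from its own counts. The sketch below is only an arithmetic check, and it assumes that PPV here means newborns diagnosed divided by newborns referred for diagnostic testing; the variable names are illustrative.

```python
# Counts quoted in the abstract above.
referred = 1391       # newborns referred for diagnostic testing
diagnosed = 274       # newborns diagnosed with CF
two_mutations = 201   # diagnosed newborns with 2 mutations on the CFTR panel

# Assumption: PPV = diagnosed / referred (true positives over all screen positives).
ppv = diagnosed / referred
share_two_mutations = two_mutations / diagnosed

print(f"PPV: {ppv:.1%}")
print(f"Diagnosed newborns with 2 panel mutations: {share_two_mutations:.1%}")
```

    Under that assumption both figures round to the values reported in the abstract (20% and 73%).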

  15. Ensemble of hybrid genetic algorithm for two-dimensional phase unwrapping

    NASA Astrophysics Data System (ADS)

    Balakrishnan, D.; Quan, C.; Tay, C. J.

    2013-06-01

    Phase unwrapping is the final and trickiest step in any phase retrieval technique. Phase unwrapping by artificial intelligence methods (optimization algorithms) such as the hybrid genetic algorithm, reverse simulated annealing, particle swarm optimization, and minimum cost matching has shown better results than conventional phase unwrapping methods. In this paper, an ensemble of hybrid genetic algorithms with parallel populations is proposed to solve the branch-cut phase unwrapping problem. In a single-population hybrid genetic algorithm, the selection, crossover and mutation operators are applied to obtain a new population in every generation. The parameters and choice of operators affect the performance of the hybrid genetic algorithm. The ensemble makes it possible to use different parameter sets and different choices of operators simultaneously. Each population uses a different set of parameters, and the offspring of each population compete against the offspring of all other populations, which use different sets of parameters. The effectiveness of the proposed algorithm is demonstrated by phase unwrapping examples, and advantages of the proposed method are discussed.
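
    For readers unfamiliar with the operators named in this abstract, the following is a minimal sketch of a plain single-population genetic algorithm with selection, crossover and mutation. It is not the authors' hybrid or ensemble method and does not do phase unwrapping: the toy fitness function (count of 1-bits), tournament selection, one-point crossover, elitism, and all parameter values are assumptions chosen only for illustration.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption): number of 1-bits in the chromosome."""
    return sum(bits)

def tournament(pop, rng, k=3):
    """Selection: pick the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=one_max)

def crossover(a, b, rng):
    """One-point crossover: head of parent a joined to tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(n_bits=20, pop_size=30, generations=40, rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        best = max(pop, key=one_max)
        history.append(one_max(best))
        children = [best]  # elitism: carry the current best over unchanged
        while len(children) < pop_size:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng, rate))
        pop = children
    best = max(pop, key=one_max)
    history.append(one_max(best))
    return best, history

best, history = run_ga()
print(one_max(best), history[0], history[-1])
```

    An ensemble in the sense of the abstract would run several such populations with different parameter sets and let their offspring compete; that step is deliberately left out of this sketch.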

  16. Mobile robot dynamic path planning based on improved genetic algorithm

    NASA Astrophysics Data System (ADS)

    Wang, Yong; Zhou, Heng; Wang, Ying

    2017-08-01

    In a dynamic unknown environment, the dynamic path planning of mobile robots is a difficult problem. In this paper, a dynamic path planning method based on a genetic algorithm is proposed. A reward value model is designed to estimate the probability of dynamic obstacles on the path, and the reward value function is applied to the genetic algorithm. Unique coding techniques reduce the computational complexity of the algorithm. The fitness function of the genetic algorithm fully considers three factors: the security of the path, the shortest distance of the path and the reward value of the path. The simulation results show that the proposed genetic algorithm is efficient in all kinds of complex dynamic environments.

  17. PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast.

    PubMed

    Lai, Fu-Jou; Chang, Hong-Tsun; Wu, Wei-Sheng

    2015-01-01

    Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of lacking sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put a lot of effort to construct it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. 
Allowing users to select eight existing performance indices and 15 existing algorithms for comparison, our web tool benefits researchers who are eager to comprehensively and objectively evaluate the performance of their newly developed algorithm. Thus, our tool greatly expedites the progress in the research of computational identification of cooperative TF pairs.

  18. PCTFPeval: a web tool for benchmarking newly developed algorithms for predicting cooperative transcription factor pairs in yeast

    PubMed Central

    2015-01-01

    Background Computational identification of cooperative transcription factor (TF) pairs helps understand the combinatorial regulation of gene expression in eukaryotic cells. Many advanced algorithms have been proposed to predict cooperative TF pairs in yeast. However, it is still difficult to conduct a comprehensive and objective performance comparison of different algorithms because of lacking sufficient performance indices and adequate overall performance scores. To solve this problem, in our previous study (published in BMC Systems Biology 2014), we adopted/proposed eight performance indices and designed two overall performance scores to compare the performance of 14 existing algorithms for predicting cooperative TF pairs in yeast. Most importantly, our performance comparison framework can be applied to comprehensively and objectively evaluate the performance of a newly developed algorithm. However, to use our framework, researchers have to put a lot of effort to construct it first. To save researchers time and effort, here we develop a web tool to implement our performance comparison framework, featuring fast data processing, a comprehensive performance comparison and an easy-to-use web interface. Results The developed tool is called PCTFPeval (Predicted Cooperative TF Pair evaluator), written in PHP and Python programming languages. The friendly web interface allows users to input a list of predicted cooperative TF pairs from their algorithm and select (i) the compared algorithms among the 15 existing algorithms, (ii) the performance indices among the eight existing indices, and (iii) the overall performance scores from two possible choices. The comprehensive performance comparison results are then generated in tens of seconds and shown as both bar charts and tables. The original comparison results of each compared algorithm and each selected performance index can be downloaded as text files for further analyses. 
Conclusions Allowing users to select eight existing performance indices and 15 existing algorithms for comparison, our web tool benefits researchers who are eager to comprehensively and objectively evaluate the performance of their newly developed algorithm. Thus, our tool greatly expedites the progress in the research of computational identification of cooperative TF pairs. PMID:26677932

  19. Analysis of Genetic Algorithm for Rule-Set Production (GARP) modeling approach for predicting distributions of fleas implicated as vectors of plague, Yersinia pestis, in California.

    PubMed

    Adjemian, Jennifer C Z; Girvetz, Evan H; Beckett, Laurel; Foley, Janet E

    2006-01-01

    More than 20 species of fleas in California are implicated as potential vectors of Yersinia pestis. Extremely limited spatial data exist for plague vectors-a key component to understanding where the greatest risks for human, domestic animal, and wildlife health exist. This study increases the spatial data available for 13 potential plague vectors by using the ecological niche modeling system Genetic Algorithm for Rule-Set Production (GARP) to predict their respective distributions. Because the available sample sizes in our data set varied greatly from one species to another, we also performed an analysis of the robustness of GARP by using the data available for flea Oropsylla montana (Baker) to quantify the effects that sample size and the chosen explanatory variables have on the final species distribution map. GARP effectively modeled the distributions of 13 vector species. Furthermore, our analyses show that all of these modeled ranges are robust, with a sample size of six fleas or greater not significantly impacting the percentage of the in-state area where the flea was predicted to be found, or the testing accuracy of the model. The results of this study will help guide the sampling efforts of future studies focusing on plague vectors.

  20. A hybrid genetic algorithm for resolving closely spaced objects

    NASA Technical Reports Server (NTRS)

    Abbott, R. J.; Lillo, W. E.; Schulenburg, N.

    1995-01-01

    A hybrid genetic algorithm is described for performing the difficult optimization task of resolving closely spaced objects appearing in space based and ground based surveillance data. This application of genetic algorithms is unusual in that it uses a powerful domain-specific operation as a genetic operator. Results of applying the algorithm to real data from telescopic observations of a star field are presented.

  1. Genetic Algorithm Tuned Fuzzy Logic for Gliding Return Trajectories

    NASA Technical Reports Server (NTRS)

    Burchett, Bradley T.

    2003-01-01

    The problem of designing and flying a trajectory for successful recovery of a reusable launch vehicle is tackled using fuzzy logic control with genetic algorithm optimization. The plant is approximated by a simplified three degree of freedom non-linear model. A baseline trajectory design and guidance algorithm consisting of several Mamdani type fuzzy controllers is tuned using a simple genetic algorithm. Preliminary results show that the performance of the overall system improves with genetic algorithm tuning.

  2. Test Scheduling for Core-Based SOCs Using Genetic Algorithm Based Heuristic Approach

    NASA Astrophysics Data System (ADS)

    Giri, Chandan; Sarkar, Soumojit; Chattopadhyay, Santanu

    This paper presents a genetic algorithm (GA) based solution to co-optimize test scheduling and wrapper design for core-based SOCs. Core testing solutions are generated as a set of wrapper configurations, represented as rectangles with width equal to the number of TAM (Test Access Mechanism) channels and height equal to the corresponding testing time. A locally optimal best-fit heuristic based bin packing algorithm has been used to determine the placement of rectangles minimizing the overall test time, whereas the GA has been utilized to generate the sequence of rectangles to be considered for placement. Experimental results on ITC'02 benchmark SOCs show that the proposed method provides better solutions compared to recent works reported in the literature.

  3. Controlling for Frailty in Pharmacoepidemiologic Studies of Older Adults: Validation of an Existing Medicare Claims-based Algorithm.

    PubMed

    Cuthbertson, Carmen C; Kucharska-Newton, Anna; Faurot, Keturah R; Stürmer, Til; Jonsson Funk, Michele; Palta, Priya; Windham, B Gwen; Thai, Sydney; Lund, Jennifer L

    2018-07-01

    Frailty is a geriatric syndrome characterized by weakness and weight loss and is associated with adverse health outcomes. It is often an unmeasured confounder in pharmacoepidemiologic and comparative effectiveness studies using administrative claims data. Among the Atherosclerosis Risk in Communities (ARIC) Study Visit 5 participants (2011-2013; n = 3,146), we conducted a validation study to compare a Medicare claims-based algorithm of dependency in activities of daily living (or dependency) developed as a proxy for frailty with a reference standard measure of phenotypic frailty. We applied the algorithm to the ARIC participants' claims data to generate a predicted probability of dependency. Using the claims-based algorithm, we estimated the C-statistic for predicting phenotypic frailty. We further categorized participants by their predicted probability of dependency (<5%, 5% to <20%, and ≥20%) and estimated associations with difficulties in physical abilities, falls, and mortality. The claims-based algorithm showed good discrimination of phenotypic frailty (C-statistic = 0.71; 95% confidence interval [CI] = 0.67, 0.74). Participants classified with a high predicted probability of dependency (≥20%) had higher prevalence of falls and difficulty in physical ability, and a greater risk of 1-year all-cause mortality (hazard ratio = 5.7 [95% CI = 2.5, 13]) than participants classified with a low predicted probability (<5%). Sensitivity and specificity varied across predicted probability of dependency thresholds. The Medicare claims-based algorithm showed good discrimination of phenotypic frailty and high predictive ability with adverse health outcomes. This algorithm can be used in future Medicare claims analyses to reduce confounding by frailty and improve study validity.
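
    The C-statistic reported in this abstract measures discrimination: the probability that a randomly chosen participant with the outcome receives a higher predicted probability than a randomly chosen participant without it. The sketch below computes it by direct pairwise comparison for a binary outcome. The function name and the toy data are hypothetical and are not taken from the ARIC study.

```python
def c_statistic(scores, outcomes):
    """Concordance for a binary outcome: the fraction of (case, non-case)
    pairs in which the case has the higher score; ties count as one half."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    non_cases = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))

# Hypothetical predicted probabilities and observed outcomes (1 = frail).
toy_scores = [0.02, 0.10, 0.30, 0.04, 0.25, 0.08]
toy_outcomes = [0, 0, 1, 0, 1, 1]
print(c_statistic(toy_scores, toy_outcomes))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported 0.71 should be read.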

  4. Algorithms, complexity, and the sciences

    PubMed Central

    Papadimitriou, Christos

    2014-01-01

    Algorithms, perhaps together with Moore’s law, compose the engine of the information technology revolution, whereas complexity, the antithesis of algorithms, is one of the deepest realms of mathematical investigation. After introducing the basic concepts of algorithms and complexity, and the fundamental complexity classes P (polynomial time) and NP (nondeterministic polynomial time, or search problems), we discuss briefly the P vs. NP problem. We then focus on certain classes between P and NP which capture important phenomena in the social and life sciences, namely the Nash equilibrium and other equilibria in economics and game theory, and certain processes in population genetics and evolution. Finally, an algorithm known as multiplicative weights update (MWU) provides an algorithmic interpretation of the evolution of allele frequencies in a population under sex and weak selection. All three of these equivalences are rife with domain-specific implications: The concept of Nash equilibrium may be less universal, and therefore less compelling, than has been presumed; selection on gene interactions may entail the maintenance of genetic variation for longer periods than selection on single alleles predicts; whereas MWU can be shown to maximize, for each gene, a convex combination of the gene’s cumulative fitness in the population and the entropy of the allele distribution, an insight that may be pertinent to the maintenance of variation in evolution. PMID:25349382
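
    The multiplicative weights update (MWU) rule mentioned in this abstract can be sketched in a few lines. This is the generic textbook form of the rule, not the paper's population-genetics derivation: each option's weight is multiplied by (1 + epsilon × payoff) and the weights are then renormalized. The function name, the value of epsilon, and the payoff numbers are assumptions for illustration; in the abstract's reading, the weights would play the role of allele frequencies of one gene.

```python
def mwu(payoffs_per_round, epsilon=0.1):
    """Generic multiplicative weights update over a fixed set of options.

    payoffs_per_round: list of rounds, each a list with one payoff per option.
    Returns the final normalized weight vector.
    """
    n = len(payoffs_per_round[0])
    weights = [1.0 / n] * n  # start from the uniform distribution
    for payoffs in payoffs_per_round:
        # Boost each weight in proportion to its payoff this round.
        weights = [w * (1.0 + epsilon * u) for w, u in zip(weights, payoffs)]
        total = sum(weights)
        weights = [w / total for w in weights]  # renormalize to sum to 1
    return weights

# Hypothetical payoffs: option 0 is consistently best, option 2 worst.
rounds = [[1.0, 0.5, 0.0]] * 50
final_weights = mwu(rounds)
print(final_weights)
```

    With a small epsilon the weights shift gradually toward the better-performing options, which loosely mirrors the weak-selection setting the abstract refers to.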

  5. Finding Risk Groups by Optimizing Artificial Neural Networks on the Area under the Survival Curve Using Genetic Algorithms.

    PubMed

    Kalderstam, Jonas; Edén, Patrik; Ohlsson, Mattias

    2015-01-01

    We investigate a new method to place patients into risk groups in censored survival data. Properties such as median survival time, and end survival rate, are implicitly improved by optimizing the area under the survival curve. Artificial neural networks (ANN) are trained to either maximize or minimize this area using a genetic algorithm, and combined into an ensemble to predict one of low, intermediate, or high risk groups. Estimated patient risk can influence treatment choices, and is important for study stratification. A common approach is to sort the patients according to a prognostic index and then group them along the quartile limits. The Cox proportional hazards model (Cox) is one example of this approach. Another method of doing risk grouping is recursive partitioning (Rpart), which constructs a decision tree where each branch point maximizes the statistical separation between the groups. ANN, Cox, and Rpart are compared on five publicly available data sets with varying properties. Cross-validation, as well as separate test sets, are used to validate the models. Results on the test sets show comparable performance, except for the smallest data set where Rpart's predicted risk groups turn out to be inverted, an example of crossing survival curves. Cross-validation shows that all three models exhibit crossing of some survival curves on this small data set but that the ANN model manages the best separation of groups in terms of median survival time before such crossings. The conclusion is that optimizing the area under the survival curve is a viable approach to identify risk groups. Training ANNs to optimize this area combines two key strengths from both prognostic indices and Rpart. First, a desired minimum group size can be specified, as for a prognostic index. Second, the ability to utilize non-linear effects among the covariates, which Rpart is also able to do.

  6. Cordova: web-based management of genetic variation data.

    PubMed

    Ephraim, Sean S; Anand, Nikhil; DeLuca, Adam P; Taylor, Kyle R; Kolbe, Diana L; Simpson, Allen C; Azaiez, Hela; Sloan, Christina M; Shearer, A Eliot; Hallier, Andrea R; Casavant, Thomas L; Scheetz, Todd E; Smith, Richard J H; Braun, Terry A

    2014-12-01

    Cordova is an out-of-the-box solution for building and maintaining an online database of genetic variations integrated with pathogenicity prediction results from popular algorithms. Our primary motivation for developing this system is to aid researchers and clinician-scientists in determining the clinical significance of genetic variations. To achieve this goal, Cordova provides an interface to review and manually or computationally curate genetic variation data as well as share it for clinical diagnostics and the advancement of research. Cordova is open source under the MIT license and is freely available for download at https://github.com/clcg/cordova. Published by Oxford University Press. This work is written by US Government employees and is in the public domain in the US.

  7. Prediction of Endocrine System Affectation in Fisher 344 Rats by Food Intake Exposed with Malathion, Applying Naïve Bayes Classifier and Genetic Algorithms

    PubMed Central

    Mora, Juan David Sandino; Hurtado, Darío Amaya; Sandoval, Olga Lucía Ramos

    2016-01-01

    Background: Reported cases of uncontrolled pesticide use, and the effects produced by direct or indirect exposure, represent a high risk to human health. This paper therefore presents the results of developing and running an algorithm that predicts the possible effects on the endocrine system of Fisher 344 (F344) rats caused by ingestion of malathion. Methods: Case studies of F344 rats exposed to malathion were collected from the ToxRefDB database. The experimental data were processed using a Naïve Bayes (NB) machine learning classifier, which was subsequently optimized using genetic algorithms (GAs). The model was run in an application with a graphical user interface programmed in C#. Results: There was a tendency toward greater alterations, with increasing levels, in the parathyroid gland at dosages between 4 and 5 mg/kg/day, in contrast to the thyroid gland at doses between 739 and 868 mg/kg/day. Females showed greater resistance to effects on the endocrine system from ingestion of malathion. Females were more susceptible to alterations in the pituitary gland with exposure times between 3 and 6 months. Conclusions: The NB-based prediction model made it possible to analyze all possible combinations of the studied variables, and its accuracy was improved using GAs. Except for the pituitary gland, females demonstrated better resistance to effects, by increasing levels, on the rest of the endocrine system glands. PMID:27833725

  8. Prediction of Endocrine System Affectation in Fisher 344 Rats by Food Intake Exposed with Malathion, Applying Naïve Bayes Classifier and Genetic Algorithms.

    PubMed

    Mora, Juan David Sandino; Hurtado, Darío Amaya; Sandoval, Olga Lucía Ramos

    2016-01-01

    Reported cases of uncontrolled pesticide use, and the effects produced by direct or indirect exposure, represent a high risk to human health. This paper therefore presents the results of developing and running an algorithm that predicts the possible effects on the endocrine system of Fisher 344 (F344) rats caused by ingestion of malathion. Case studies of F344 rats exposed to malathion were collected from the ToxRefDB database. The experimental data were processed using a Naïve Bayes (NB) machine learning classifier, which was subsequently optimized using genetic algorithms (GAs). The model was run in an application with a graphical user interface programmed in C#. There was a tendency toward greater alterations, with increasing levels, in the parathyroid gland at dosages between 4 and 5 mg/kg/day, in contrast to the thyroid gland at doses between 739 and 868 mg/kg/day. Females showed greater resistance to effects on the endocrine system from ingestion of malathion. Females were more susceptible to alterations in the pituitary gland with exposure times between 3 and 6 months. The NB-based prediction model made it possible to analyze all possible combinations of the studied variables, and its accuracy was improved using GAs. Except for the pituitary gland, females demonstrated better resistance to effects, by increasing levels, on the rest of the endocrine system glands.

  9. [The application of gene expression programming in the diagnosis of heart disease].

    PubMed

    Dai, Wenbin; Zhang, Yuntao; Gao, Xingyu

    2009-02-01

    GEP (Gene expression programming) is a new genetic algorithm, and it has been proved to be excellent in function finding. In this paper, for the purpose of setting up a diagnostic model, GEP is used to deal with the data of heart disease. Eight variables, Sex, Chest pain, Blood pressure, Angina, Peak, Slope, Colored vessels and Thal, are picked out of thirteen variables to form a classified function. This function is used to predict a forecasting set of 100 samples, and the accuracy is 87%. Other algorithms such as SVM (Support vector machine) are applied to the same data and the forecasting results show that GEP is better than other algorithms.

  10. Extending DFT-based genetic algorithms by atom-to-place re-assignment via perturbation theory: A systematic and unbiased approach to structures of mixed-metallic clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Weigend, Florian, E-mail: florian.weigend@kit.edu

    2014-10-07

    Energy surfaces of metal clusters usually show a large variety of local minima. For homo-metallic species the energetically lowest can be found reliably with genetic algorithms, in combination with density functional theory without system-specific parameters. For mixed-metallic clusters this is much more difficult, as for a given arrangement of nuclei one has to find additionally the best of many possibilities of assigning different metal types to the individual positions. In the framework of electronic structure methods this second issue is treatable at comparably low cost at least for elements with similar atomic number by means of first-order perturbation theory, as shown previously [F. Weigend, C. Schrodt, and R. Ahlrichs, J. Chem. Phys. 121, 10380 (2004)]. In the present contribution the extension of a genetic algorithm with the re-assignment of atom types to atom sites is proposed and tested for the search of the global minima of PtHf₁₂ and [LaPb₇Bi₇]⁴⁻. For both cases the (putative) global minimum is reliably found with the extended technique, which is not the case for the “pure” genetic algorithm.

  11. Computational intelligence techniques for biological data mining: An overview

    NASA Astrophysics Data System (ADS)

    Faye, Ibrahima; Iqbal, Muhammad Javed; Said, Abas Md; Samir, Brahim Belhaouari

    2014-10-01

    Computational techniques have been successfully utilized for a highly accurate analysis and modeling of multifaceted and raw biological data gathered from various genome sequencing projects. These techniques are proving much more effective to overcome the limitations of the traditional in-vitro experiments on the constantly increasing sequence data. However, most critical problems that caught the attention of the researchers may include, but not limited to these: accurate structure and function prediction of unknown proteins, protein subcellular localization prediction, finding protein-protein interactions, protein fold recognition, analysis of microarray gene expression data, etc. To solve these problems, various classification and clustering techniques using machine learning have been extensively used in the published literature. These techniques include neural network algorithms, genetic algorithms, fuzzy ARTMAP, K-Means, K-NN, SVM, Rough set classifiers, decision tree and HMM based algorithms. Major difficulties in applying the above algorithms include the limitations found in the previous feature encoding and selection methods while extracting the best features, increasing classification accuracy and decreasing the running time overheads of the learning algorithms. The application of this research would be potentially useful in the drug design and in the diagnosis of some diseases. This paper presents a concise overview of the well-known protein classification techniques.

  12. Algorithme intelligent d'optimisation d'un design structurel de grande envergure

    NASA Astrophysics Data System (ADS)

    Dominique, Stephane

    The implementation of an automated decision support system in the field of design and structural optimisation can give a significant advantage to any industry working on mechanical designs. Indeed, by providing solution ideas to a designer or by upgrading existing design solutions while the designer is not at work, the system may reduce the project cycle time, or allow more time to produce a better design. This thesis presents a new approach to automate a design process based on Case-Based Reasoning (CBR), in combination with a new genetic algorithm named Genetic Algorithm with Territorial core Evolution (GATE). This approach was developed in order to reduce the operating cost of the process. However, as the system implementation cost is quite high, the approach is better suited to large-scale design problems, and particularly to design problems that the designer plans to solve for many different specification sets. First, the CBR process uses a databank filled with every known solution to similar design problems. Then, the closest solutions to the current problem in terms of specifications are selected. After this, during the adaptation phase, an artificial neural network (ANN) interpolates amongst known solutions to produce an additional solution to the current problem using the current specifications as inputs. Each solution produced and selected by the CBR is then used to initialize the population of an island of the genetic algorithm. The algorithm will optimise the solution further during the refinement phase. Using progressive refinement, the algorithm starts using only the most important variables for the problem. Then, as the optimisation progresses, the remaining variables are gradually introduced, layer by layer. The genetic algorithm that is used is a new algorithm specifically created during this thesis to solve optimisation problems from the field of mechanical device structural design. The algorithm is named GATE, and is essentially a real number genetic algorithm that prevents new individuals from being born too close to previously evaluated solutions. The restricted area becomes smaller or larger during the optimisation to allow global or local search when necessary. Also, a new search operator named Substitution Operator is incorporated in GATE. This operator allows an ANN surrogate model to guide the algorithm toward the most promising areas of the design space. The suggested CBR approach and GATE were tested on several simple test problems, as well as on the industrial problem of designing a gas turbine engine rotor's disc. These results are compared to other results obtained for the same problems by many other popular optimisation algorithms, such as (depending on the problem) gradient algorithms, binary genetic algorithm, real number genetic algorithm, genetic algorithm using multiple parents crossovers, differential evolution genetic algorithm, Hooke & Jeeves generalized pattern search method and POINTER from the software I-SIGHT 3.5. Results show that GATE is quite competitive, giving the best results for 5 of the 6 constrained optimisation problems. GATE also provided the best results of all on problems produced by a Maximum Set Gaussian landscape generator. Finally, GATE provided a disc 4.3% lighter than the best other tested algorithm (POINTER) for the gas turbine engine rotor's disc problem. One drawback of GATE is a lesser efficiency for highly multimodal unconstrained problems, for which it gave quite poor results with respect to its implementation cost. To conclude, according to the preliminary results obtained during this thesis, the suggested CBR process, combined with GATE, seems to be a very good candidate to automate and accelerate the structural design of mechanical devices, potentially reducing significantly the cost of industrial preliminary design processes.
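
    The territorial idea in this abstract (rejecting offspring that fall too close to previously evaluated solutions) can be sketched roughly as follows. This is an illustration under stated assumptions, not the thesis's implementation: the helper names `is_outside_territories` and `sample_offspring`, the Euclidean distance, uniform random sampling and the fixed `radius` are all hypothetical choices. The abstract says only that the restricted area shrinks or grows during the optimisation.

```python
import math
import random

def is_outside_territories(candidate, evaluated, radius):
    """True if candidate lies at least `radius` away from every evaluated point."""
    return all(math.dist(candidate, p) >= radius for p in evaluated)

def sample_offspring(evaluated, radius, bounds, rng, max_tries=1000):
    """Draw a real-coded individual that respects all territories (hypothetical helper)."""
    for _ in range(max_tries):
        candidate = [rng.uniform(lo, hi) for lo, hi in bounds]
        if is_outside_territories(candidate, evaluated, radius):
            return candidate
    return None  # no admissible point found at this radius

rng = random.Random(0)
evaluated = [[0.5, 0.5]]  # solutions already evaluated
child = sample_offspring(evaluated, radius=0.2,
                         bounds=[(0.0, 1.0), (0.0, 1.0)], rng=rng)
```

    Shrinking `radius` over time would move such a scheme from global exploration toward local search, which appears to be the role the abstract assigns to the changing restricted area.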

  13. Constructing better classifier ensemble based on weighted accuracy and diversity measure.

    PubMed

    Zeng, Xiaodong; Wong, Derek F; Chao, Lidia S

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented, a novel measure used to evaluate the quality of the classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis; that is, a robust classifier ensemble should not only be accurate but also different from every other member. In fact, accuracy and diversity are mutual restraint factors; that is, an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is achieved by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and two threshold measures that consider only accuracy or diversity, with two heuristic search algorithms, genetic algorithm, and forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to others in most cases.
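
    As a rough illustration of scoring an ensemble by a harmonic mean of accuracy and diversity with two balancing weights, the sketch below uses the standard weighted harmonic mean. The exact WAD formula is not given in the abstract, so the function name, the default weights and the handling of zero scores are assumptions.

```python
def weighted_harmonic_mean(accuracy, diversity, w_acc=0.5, w_div=0.5):
    """Weighted harmonic mean of two scores; returns 0.0 if either score is not positive."""
    if accuracy <= 0.0 or diversity <= 0.0:
        return 0.0
    return (w_acc + w_div) / (w_acc / accuracy + w_div / diversity)

# An accurate but homogeneous ensemble is pulled down by its low diversity.
score = weighted_harmonic_mean(accuracy=1.0, diversity=0.5)
```

    A harmonic mean is dominated by the smaller of the two values, so an ensemble cannot score well by maximising accuracy or diversity alone.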

  14. Constructing Better Classifier Ensemble Based on Weighted Accuracy and Diversity Measure

    PubMed Central

    Chao, Lidia S.

    2014-01-01

    A weighted accuracy and diversity (WAD) method is presented, a novel measure used to evaluate the quality of the classifier ensemble, assisting in the ensemble selection task. The proposed measure is motivated by a commonly accepted hypothesis; that is, a robust classifier ensemble should not only be accurate but also different from every other member. In fact, accuracy and diversity are mutual restraint factors; that is, an ensemble with high accuracy may have low diversity, and an overly diverse ensemble may negatively affect accuracy. This study proposes a method to find the balance between accuracy and diversity that enhances the predictive ability of an ensemble for unknown data. The quality assessment for an ensemble is performed such that the final score is achieved by computing the harmonic mean of accuracy and diversity, where two weight parameters are used to balance them. The measure is compared to two representative measures, Kappa-Error and GenDiv, and two threshold measures that consider only accuracy or diversity, with two heuristic search algorithms, genetic algorithm, and forward hill-climbing algorithm, in ensemble selection tasks performed on 15 UCI benchmark datasets. The empirical results demonstrate that the WAD measure is superior to others in most cases. PMID:24672402

  15. GA(M)E-QSAR: a novel, fully automatic genetic-algorithm-(meta)-ensembles approach for binary classification in ligand-based drug design.

    PubMed

    Pérez-Castillo, Yunierkis; Lazar, Cosmin; Taminau, Jonatan; Froeyen, Mathy; Cabrera-Pérez, Miguel Ángel; Nowé, Ann

    2012-09-24

    Computer-aided drug design has become an important component of the drug discovery process. Despite the advances in this field, there is not a unique modeling approach that can be successfully applied to solve the whole range of problems faced during QSAR modeling. Feature selection and ensemble modeling are active areas of research in ligand-based drug design. Here we introduce the GA(M)E-QSAR algorithm that combines the search and optimization capabilities of Genetic Algorithms with the simplicity of the Adaboost ensemble-based classification algorithm to solve binary classification problems. We also explore the usefulness of Meta-Ensembles trained with Adaboost and Voting schemes to further improve the accuracy, generalization, and robustness of the optimal Adaboost Single Ensemble derived from the Genetic Algorithm optimization. We evaluated the performance of our algorithm using five data sets from the literature and found that it is capable of yielding similar or better classification results to what has been reported for these data sets with a higher enrichment of active compounds relative to the whole actives subset when only the most active chemicals are considered. More important, we compared our methodology with state of the art feature selection and classification approaches and found that it can provide highly accurate, robust, and generalizable models. In the case of the Adaboost Ensembles derived from the Genetic Algorithm search, the final models are quite simple since they consist of a weighted sum of the output of single feature classifiers. Furthermore, the Adaboost scores can be used as ranking criterion to prioritize chemicals for synthesis and biological evaluation after virtual screening experiments.

  16. Learning Intelligent Genetic Algorithms Using Japanese Nonograms

    ERIC Educational Resources Information Center

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Fang, Jia-Cen

    2012-01-01

    An intelligent genetic algorithm (IGA) is proposed to solve Japanese nonograms and is used as a method in a university course to learn evolutionary algorithms. The IGA combines the global exploration capabilities of a canonical genetic algorithm (CGA) with effective condensed encoding, improved fitness function, and modified crossover and…

  17. RNA design using simulated SHAPE data.

    PubMed

    Lotfi, Mohadeseh; Zare-Mirakabad, Fatemeh; Montaseri, Soheila

    2018-05-03

    It has long been established that in addition to being involved in protein translation, RNA plays essential roles in numerous other cellular processes, including gene regulation and DNA replication. Such roles are known to be dictated by higher-order structures of RNA molecules. It is therefore of prime importance to find an RNA sequence that can fold to acquire a particular function that is desirable for use in pharmaceuticals and basic research. The challenge of finding an RNA sequence for a given structure is known as the RNA design problem. Although there are several algorithms to solve this problem, they mainly consider hard constraints, such as minimum free energy, to evaluate the predicted sequences. Recently, SHAPE data has emerged as a new soft constraint for RNA secondary structure prediction. To take advantage of this new experimental constraint, we report here a new method for accurate design of RNA sequences based on their secondary structures using SHAPE data as pseudo-free energy. We then compare our algorithm with four others: INFO-RNA, ERD, MODENA and RNAifold 2.0. Our algorithm precisely predicts 26 out of 29 new sequences for the structures extracted from the Rfam dataset, while the other four algorithms predict no more than 22 out of 29. The proposed algorithm is comparable to the above algorithms on RNA-SSD datasets, where they can predict up to 33 appropriate sequences for RNA secondary structures out of 34.

  18. GAGA: a new algorithm for genomic inference of geographic ancestry reveals fine level population substructure in Europeans.

    PubMed

    Lao, Oscar; Liu, Fan; Wollstein, Andreas; Kayser, Manfred

    2014-02-01

    Attempts to detect genetic population substructure in humans are troubled by the fact that the vast majority of the total amount of observed genetic variation is present within populations rather than between populations. Here we introduce a new algorithm for transforming a genetic distance matrix that reduces the within-population variation considerably. Extensive computer simulations revealed that the transformed matrix captured the genetic population differentiation better than the original one which was based on the T1 statistic. In an empirical genomic data set comprising 2,457 individuals from 23 different European subpopulations, the proportion of individuals that were determined as a genetic neighbour to another individual from the same sampling location increased from 25% with the original matrix to 52% with the transformed matrix. Similarly, the percentage of genetic variation explained between populations by means of Analysis of Molecular Variance (AMOVA) increased from 1.62% to 7.98%. Furthermore, the first two dimensions of a classical multidimensional scaling (MDS) using the transformed matrix explained 15% of the variance, compared to 0.7% obtained with the original matrix. Application of MDS with Mclust, SPA with Mclust, and GemTools algorithms to the same dataset also showed that the transformed matrix gave a better association of the genetic clusters with the sampling locations, and particularly so when it was used in the AMOVA framework with a genetic algorithm. Overall, the new matrix transformation introduced here substantially reduces the within population genetic differentiation, and can be broadly applied to methods such as AMOVA to enhance their sensitivity to reveal population substructure. 
    We herewith provide a publicly available (http://www.erasmusmc.nl/fmb/resources/GAGA) model-free method for improved genetic population substructure detection that can be applied to human as well as any other species data in future studies relevant to evolutionary biology, behavioural ecology, medicine, and forensics.

  19. Intermediate view reconstruction using adaptive disparity search algorithm for real-time 3D processing

    NASA Astrophysics Data System (ADS)

    Bae, Kyung-hoon; Park, Changhan; Kim, Eun-soo

    2008-03-01

    In this paper, intermediate view reconstruction (IVR) using an adaptive disparity search algorithm (ASDA) is proposed for real-time 3-dimensional (3D) processing. The proposed algorithm can reduce the processing time of disparity estimation by selecting an adaptive disparity search range. It can also increase the quality of the 3D imaging. That is, by adaptively predicting the mutual correlation between a stereo image pair, the bandwidth of the stereo input image pair can be compressed to the level of a conventional 2D image, and a predicted image can be effectively reconstructed using a reference image and disparity vectors. Experiments on the stereo sequences 'Pot Plant' and 'IVO' show that the proposed algorithm improves the PSNR of a reconstructed image by about 4.8 dB and reduces the synthesizing time of a reconstructed image to about 7.02 sec, compared with conventional algorithms.
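
    For readers unfamiliar with the quality metric quoted in this abstract, PSNR is conventionally computed as 10·log10(MAX² / MSE). The sketch below assumes 8-bit intensities (`max_value=255`) and flat pixel sequences; the function name is illustrative and is not taken from the paper.

```python
import math

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```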

  20. A Comparative Study of Classification and Regression Algorithms for Modelling Students' Academic Performance

    ERIC Educational Resources Information Center

    Strecht, Pedro; Cruz, Luís; Soares, Carlos; Mendes-Moreira, João; Abreu, Rui

    2015-01-01

    Predicting the success or failure of a student in a course or program is a problem that has recently been addressed using data mining techniques. In this paper we evaluate some of the most popular classification and regression algorithms on this problem. We address two problems: prediction of approval/failure and prediction of grade. The former is…

  1. Simultaneous data pre-processing and SVM classification model selection based on a parallel genetic algorithm applied to spectroscopic data of olive oils.

    PubMed

    Devos, Olivier; Downey, Gerard; Duponchel, Ludovic

    2014-04-01

    Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
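
    McNemar's test, used here to compare classifiers, depends only on the samples where the two models disagree. A minimal sketch follows; the continuity-corrected form is an assumption, since the abstract does not say which variant the authors used, and the function name is hypothetical.

```python
def mcnemar_statistic(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar chi-square for two classifiers on the same samples."""
    triples = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, p in triples if a == t and p != t)  # only model A correct
    c = sum(1 for t, a, p in triples if a != t and p == t)  # only model B correct
    if b + c == 0:
        return 0.0  # the models never disagree
    return (abs(b - c) - 1) ** 2 / (b + c)

y_true = [1] * 10
pred_a = [1] * 10            # model A: all correct
pred_b = [0] * 6 + [1] * 4   # model B: six errors
stat = mcnemar_statistic(y_true, pred_a, pred_b)
```

    The statistic is compared with a chi-square distribution with one degree of freedom, so values above 3.841 indicate a difference at the 5% level.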

  2. The Impact of Genetic and Non-Genetic Factors on Warfarin Dose Prediction in MENA Region: A Systematic Review

    PubMed Central

    2016-01-01

    Background: Warfarin is the most commonly used oral anticoagulant for the treatment and prevention of thromboembolic disorders. Pharmacogenomics studies have shown that variants in CYP2C9 and VKORC1 genes are strongly and consistently associated with warfarin dose variability. Although different populations from the Middle East and North Africa (MENA) region may share the same ancestry, it is still unclear how they compare in the genetic and non-genetic factors affecting their warfarin dosing. Objective: To explore the prevalence of CYP2C9 and VKORC1 variants in MENA, and the effect of these variants along with other non-genetic factors in predicting warfarin dose. Methods: In this systematic review, we included observational cross sectional and cohort studies that enrolled patients on stable warfarin dose and had the genetics and non-genetics factors associated with mean warfarin dose as the primary outcome. We searched PubMed, Medline, Scopus, PharmGKB, PHGKB, Google scholar and reference lists of relevant reviews. Results: We identified 17 studies in eight different populations: Iranian, Israeli, Egyptian, Lebanese, Omani, Kuwaiti, Sudanese and Turkish. Most common genetic variant in all populations was the VKORC1 (-1639G>A), with a minor allele frequency ranging from 30% in Egyptians and up to 52% and 56% in Lebanese and Iranian, respectively. Variants in the CYP2C9 were less common, with the highest MAF for CYP2C9*2 among Iranians (27%). Variants in the VKORC1 and CYP2C9 were the most significant predictors of warfarin dose in all populations. Along with other genetic and non-genetic factors, they explained up to 63% of the dose variability in Omani and Israeli patients. Conclusion: Variants of VKORC1 and CYP2C9 are the strongest predictors of warfarin dose variability among the different populations from MENA. Although many of those populations share the same ancestry and are similar in their warfarin dose predictors, a population specific dosing algorithm is needed for the prospective estimation of warfarin dose. PMID:27992547

  3. The Impact of Genetic and Non-Genetic Factors on Warfarin Dose Prediction in MENA Region: A Systematic Review.

    PubMed

    Bader, Loulia Akram; Elewa, Hazem

    2016-01-01

    Warfarin is the most commonly used oral anticoagulant for the treatment and prevention of thromboembolic disorders. Pharmacogenomics studies have shown that variants in CYP2C9 and VKORC1 genes are strongly and consistently associated with warfarin dose variability. Although different populations from the Middle East and North Africa (MENA) region may share the same ancestry, it is still unclear how they compare in the genetic and non-genetic factors affecting their warfarin dosing. The objective was to explore the prevalence of CYP2C9 and VKORC1 variants in MENA, and the effect of these variants along with other non-genetic factors in predicting warfarin dose. In this systematic review, we included observational cross-sectional and cohort studies that enrolled patients on a stable warfarin dose and had the genetic and non-genetic factors associated with mean warfarin dose as the primary outcome. We searched PubMed, Medline, Scopus, PharmGKB, PHGKB, Google Scholar and reference lists of relevant reviews. We identified 17 studies in eight different populations: Iranian, Israeli, Egyptian, Lebanese, Omani, Kuwaiti, Sudanese and Turkish. The most common genetic variant in all populations was VKORC1 (-1639G>A), with a minor allele frequency (MAF) ranging from 30% in Egyptians up to 52% and 56% in Lebanese and Iranians, respectively. Variants in CYP2C9 were less common, with the highest MAF for CYP2C9*2 among Iranians (27%). Variants in VKORC1 and CYP2C9 were the most significant predictors of warfarin dose in all populations. Along with other genetic and non-genetic factors, they explained up to 63% of the dose variability in Omani and Israeli patients. Variants of VKORC1 and CYP2C9 are the strongest predictors of warfarin dose variability among the different populations from MENA. Although many of those populations share the same ancestry and are similar in their warfarin dose predictors, a population-specific dosing algorithm is needed for the prospective estimation of warfarin dose.

  4. Boiler-turbine control system design using a genetic algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dimeo, R.; Lee, K.Y.

    1995-12-01

    This paper discusses the application of a genetic algorithm to control system design for a boiler-turbine plant. In particular the authors study the ability of the genetic algorithm to develop a proportional-integral (PI) controller and a state feedback controller for a non-linear multi-input/multi-output (MIMO) plant model. The plant model is presented along with a discussion of the inherent difficulties in such controller development. A sketch of the genetic algorithm (GA) is presented and its strategy as a method of control system design is discussed. Results are presented for two different control systems that have been designed with the genetic algorithm.

  5. Generating Stock Trading Rules Using Genetic Network Programming with Flag Nodes and Adjustment of Importance Indexes

    NASA Astrophysics Data System (ADS)

    Mabu, Shingo; Chen, Yan; Hirasawa, Kotaro

    Genetic Network Programming (GNP) is an evolutionary algorithm which represents its solutions using graph structures. Since GNP can create quite compact programs and has an implicit memory function, GNP works well especially in dynamic environments. In addition, a study on creating trading rules on stock markets using GNP with Importance Index (GNP-IMX) has been done. IMX is one of the criteria for decision making. However, the values of IMXs must be determined by our experience/knowledge. Therefore in this paper, IMXs are adjusted appropriately during the stock trading in order to predict the rise and fall of the stocks. Moreover, newly defined flag nodes are introduced to GNP, which can appropriately judge the current situation of the stock prices, and also contribute to the use of many kinds of nodes in the GNP program. In the simulation, programs are evolved using the stock prices of 20 companies. Then the generalization ability is tested and compared with GNP without flag nodes, GNP without IMX adjustment and Buy&Hold.

  6. Biased random key genetic algorithm with insertion and gender selection for capacitated vehicle routing problem with time windows

    NASA Astrophysics Data System (ADS)

    Rochman, Auliya Noor; Prasetyo, Hari; Nugroho, Munajat Tri

    2017-06-01

    The Vehicle Routing Problem (VRP) often occurs when manufacturers need to distribute their product to customers/outlets. The distribution process is typically restricted by the capacity of the vehicle and the working hours at the distributor. This type of VRP is also known as the Capacitated Vehicle Routing Problem with Time Windows (CVRPTW). A Biased Random Key Genetic Algorithm (BRKGA) was designed and coded in MATLAB to solve the CVRPTW case of soft drink distribution. The standard BRKGA was then modified by applying chromosome insertion into the initial population and defining chromosome gender for parents undergoing the crossover operation. The performance of the established algorithms was then compared to a heuristic procedure for solving a soft drink distribution. The findings are: (1) BRKGA with insertion (BRKGA-I) yields a total distribution cost 39% lower than that of the heuristic method; (2) BRKGA with gender selection (BRKGA-GS) could further improve on the performance of the heuristic method. However, BRKGA-GS tends to yield worse results than the standard BRKGA.
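
    The two ingredients that define a generic BRKGA, random-key decoding and elite-biased uniform crossover, can be sketched as below. This is not the authors' MATLAB code: the capacity and time-window handling, the insertion step and the gender selection are not shown, and the function names and the `rho` parameter name are assumptions.

```python
import random

def decode_random_keys(keys):
    """Decode a random-key chromosome into a visiting order (ascending key)."""
    return sorted(range(len(keys)), key=lambda i: keys[i])

def biased_crossover(elite, non_elite, rho, rng):
    """Take each gene from the elite parent with probability rho, else from the other parent."""
    return [e if rng.random() < rho else n for e, n in zip(elite, non_elite)]

rng = random.Random(1)
order = decode_random_keys([0.7, 0.1, 0.4])   # customer indices sorted by key
child = biased_crossover([0.7, 0.1, 0.4], [0.2, 0.9, 0.5], rho=0.7, rng=rng)
```

    Because any vector of keys decodes to a valid permutation, crossover never produces an infeasible visiting order; feasibility with respect to capacity and time windows would still have to be enforced by the decoder.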

  7. Evaluation of an ensemble of genetic models for prediction of a quantitative trait.

    PubMed

    Milton, Jacqueline N; Steinberg, Martin H; Sebastiani, Paola

    2014-01-01

    Many genetic markers have been shown to be associated with common quantitative traits in genome-wide association studies. Typically these associated genetic markers have small to modest effect sizes and individually they explain only a small amount of the variability of the phenotype. In order to build a genetic prediction model without fitting a multiple linear regression model with possibly hundreds of genetic markers as predictors, researchers often summarize the joint effect of risk alleles into a genetic score that is used as a covariate in the genetic prediction model. However, the prediction accuracy can be highly variable and selecting the optimal number of markers to be included in the genetic score is challenging. In this manuscript we present a strategy to build an ensemble of genetic prediction models from data and we show that the ensemble-based method makes the challenge of choosing the number of genetic markers more amenable. Using simulated data with varying heritability and number of genetic markers, we compare the predictive accuracy and inclusion of true positive and false positive markers of a single genetic prediction model and our proposed ensemble method. The results show that the ensemble of genetic models tends to include a larger number of genetic variants than a single genetic model and it is more likely to include all of the true genetic markers. This increased sensitivity is obtained at the price of a lower specificity that appears to minimally affect the predictive accuracy of the ensemble.
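
    The notion of a genetic score used as a single covariate, and of averaging over models built with different numbers of markers, might look roughly like this. The marker names, the dictionary layout, the unweighted risk-allele count and the plain averaging of predictions are all illustrative assumptions; the abstract does not specify how the ensemble combines its members.

```python
def genetic_score(genotype, risk_markers):
    """Unweighted genetic score: total risk-allele count (0, 1 or 2 per marker)."""
    return sum(genotype[m] for m in risk_markers)

def ensemble_prediction(genotype, models):
    """Average the predictions of linear models that each use their own marker subset."""
    preds = [m["intercept"] + m["slope"] * genetic_score(genotype, m["markers"])
             for m in models]
    return sum(preds) / len(preds)

genotype = {"rs1": 2, "rs2": 1, "rs3": 0}   # hypothetical marker IDs
models = [
    {"markers": ["rs1"], "intercept": 0.0, "slope": 1.0},
    {"markers": ["rs1", "rs2"], "intercept": 0.0, "slope": 1.0},
]
prediction = ensemble_prediction(genotype, models)
```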

  8. Predicting Gene Structure Changes Resulting from Genetic Variants via Exon Definition Features.

    PubMed

    Majoros, William H; Holt, Carson; Campbell, Michael S; Ware, Doreen; Yandell, Mark; Reddy, Timothy E

    2018-04-25

    Genetic variation that disrupts gene function by altering gene splicing between individuals can substantially influence traits and disease. In those cases, accurately predicting the effects of genetic variation on splicing can be highly valuable for investigating the mechanisms underlying those traits and diseases. While methods have been developed to generate high quality computational predictions of gene structures in reference genomes, the same methods perform poorly when used to predict the potentially deleterious effects of genetic changes that alter gene splicing between individuals. Underlying that discrepancy in predictive ability are the common assumptions by reference gene finding algorithms that genes are conserved, well-formed, and produce functional proteins. We describe a probabilistic approach for predicting recent changes to gene structure that may or may not conserve function. The model is applicable to both coding and noncoding genes, and can be trained on existing gene annotations without requiring curated examples of aberrant splicing. We apply this model to the problem of predicting altered splicing patterns in the genomes of individual humans, and we demonstrate that performing gene-structure prediction without relying on conserved coding features is feasible. The model predicts an unexpected abundance of variants that create de novo splice sites, an observation supported by both simulations and empirical data from RNA-seq experiments. While these de novo splice variants are commonly misinterpreted by other tools as coding or noncoding variants of little or no effect, we find that in some cases they can have large effects on splicing activity and protein products, and we propose that they may commonly act as cryptic factors in disease. The software is available from geneprediction.org/SGRF. bmajoros@duke.edu. Supplementary information is available at Bioinformatics online.

  9. Method for hyperspectral imagery exploitation and pixel spectral unmixing

    NASA Technical Reports Server (NTRS)

    Lin, Ching-Fang (Inventor)

    2003-01-01

    An efficient hybrid approach to exploiting hyperspectral imagery and unmixing spectral pixels. The approach uses a genetic algorithm to solve for the abundance vector of the first pixel of a hyperspectral image cube. This abundance vector is used as the initial state in a robust filter to derive the abundance estimate for the next pixel. By using a Kalman filter, the abundance estimate for a pixel can be obtained in a single iteration, which is much faster than the genetic algorithm. The output of the robust filter is then fed to the genetic algorithm to derive an accurate abundance estimate for the current pixel. Using the robust filter solution as the starting point speeds up the evolution of the genetic algorithm. After the accurate abundance estimate is obtained, the procedure moves to the next pixel and uses the output of the genetic algorithm as the previous state estimate to derive the abundance estimate for this pixel with the robust filter; the genetic algorithm then again refines the estimate efficiently, starting from the robust filter solution. This iteration continues until all pixels in the hyperspectral image cube have been processed.

  10. A novel clinical decision support system using improved adaptive genetic algorithm for the assessment of fetal well-being.

    PubMed

    Ravindran, Sindhu; Jambek, Asral Bahari; Muthusamy, Hariharan; Neoh, Siew-Chin

    2015-01-01

    A novel clinical decision support system is proposed in this paper for evaluating fetal well-being from the cardiotocogram (CTG) dataset through an Improved Adaptive Genetic Algorithm (IAGA) and Extreme Learning Machine (ELM). IAGA employs a new scaling technique (called sigma scaling) to avoid premature convergence and applies adaptive crossover and mutation techniques with masking concepts to enhance population diversity. This search algorithm also utilizes three different fitness functions (two single-objective fitness functions and one multi-objective fitness function) to assess its performance. The classification results show that a promising classification accuracy of 94% is obtained with an optimal feature subset using IAGA. The classification results are also compared with those of other feature reduction techniques to substantiate its exhaustive search towards the global optimum. In addition, five other benchmark datasets are used to gauge the strength of the proposed IAGA algorithm.
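
    The abstract names sigma scaling but does not give the formula IAGA uses. As a minimal sketch, the block below implements one commonly cited textbook form, in which an individual's expected number of offspring is 1 + (f - mean) / (2 * sigma). The function name `sigma_scale` and the `floor` parameter are assumptions made for illustration and are not taken from the paper.

```python
import statistics


def sigma_scale(fitnesses, floor=0.1):
    """Return sigma-scaled selection weights for a list of raw fitness values.

    Assumed textbook form: 1 + (f - mean) / (2 * sigma), clipped at `floor`
    so that very poor individuals keep a small chance of being selected.
    """
    mean = statistics.fmean(fitnesses)
    sigma = statistics.pstdev(fitnesses)
    if sigma == 0:
        # All individuals are equally fit: give everyone the same weight.
        return [1.0 for _ in fitnesses]
    scaled = []
    for f in fitnesses:
        value = 1.0 + (f - mean) / (2.0 * sigma)
        scaled.append(max(value, floor))
    return scaled


if __name__ == "__main__":
    print(sigma_scale([1.0, 2.0, 3.0]))
```

    Because the weights depend on distance from the mean in units of standard deviation, a single very fit individual cannot dominate selection early in the run, which is the usual argument for this kind of scaling against premature convergence.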

  11. Performance Enhancement of Radial Distributed System with Distributed Generators by Reconfiguration Using Binary Firefly Algorithm

    NASA Astrophysics Data System (ADS)

    Rajalakshmi, N.; Padma Subramanian, D.; Thamizhavel, K.

    2015-03-01

    The extent of real power loss and voltage deviation associated with overloaded feeders in radial distribution system can be reduced by reconfiguration. Reconfiguration is normally achieved by changing the open/closed state of tie/sectionalizing switches. Finding optimal switch combination is a complicated problem as there are many switching combinations possible in a distribution system. Hence optimization techniques are finding greater importance in reducing the complexity of reconfiguration problem. This paper presents the application of firefly algorithm (FA) for optimal reconfiguration of radial distribution system with distributed generators (DG). The algorithm is tested on IEEE 33 bus system installed with DGs and the results are compared with binary genetic algorithm. It is found that binary FA is more effective than binary genetic algorithm in achieving real power loss reduction and improving voltage profile and hence enhancing the performance of radial distribution system. Results are found to be optimum when DGs are added to the test system, which proved the impact of DGs on distribution system.

  12. Joint Power Charging and Routing in Wireless Rechargeable Sensor Networks.

    PubMed

    Jia, Jie; Chen, Jian; Deng, Yansha; Wang, Xingwei; Aghvami, Abdol-Hamid

    2017-10-09

    The development of wireless power transfer (WPT) technology has inspired the transition from traditional battery-based wireless sensor networks (WSNs) towards wireless rechargeable sensor networks (WRSNs). While extensive efforts have been made to improve charging efficiency, little has been done for routing optimization. In this work, we present a joint optimization model to maximize both charging efficiency and routing structure. By analyzing the structure of the optimization model, we first decompose the problem and propose a heuristic algorithm to find the optimal charging efficiency for the predefined routing tree. Furthermore, by coding the many-to-one communication topology as an individual, we further propose to apply a genetic algorithm (GA) for the joint optimization of both routing and charging. The genetic operations, including tree-based recombination and mutation, are proposed to obtain a fast convergence. Our simulation results show that the heuristic algorithm reduces the number of resident locations and the total moving distance. We also show that our proposed algorithm achieves a higher charging efficiency compared with existing algorithms.

  13. Joint Power Charging and Routing in Wireless Rechargeable Sensor Networks

    PubMed Central

    Jia, Jie; Chen, Jian; Deng, Yansha; Wang, Xingwei; Aghvami, Abdol-Hamid

    2017-01-01

    The development of wireless power transfer (WPT) technology has inspired the transition from traditional battery-based wireless sensor networks (WSNs) towards wireless rechargeable sensor networks (WRSNs). While extensive efforts have been made to improve charging efficiency, little has been done for routing optimization. In this work, we present a joint optimization model to maximize both charging efficiency and routing structure. By analyzing the structure of the optimization model, we first decompose the problem and propose a heuristic algorithm to find the optimal charging efficiency for the predefined routing tree. Furthermore, by coding the many-to-one communication topology as an individual, we further propose to apply a genetic algorithm (GA) for the joint optimization of both routing and charging. The genetic operations, including tree-based recombination and mutation, are proposed to obtain a fast convergence. Our simulation results show that the heuristic algorithm reduces the number of resident locations and the total moving distance. We also show that our proposed algorithm achieves a higher charging efficiency compared with existing algorithms. PMID:28991200

  14. Phase Retrieval Using a Genetic Algorithm on the Systematic Image-Based Optical Alignment Testbed

    NASA Technical Reports Server (NTRS)

    Taylor, Jaime R.

    2003-01-01

    NASA's Marshall Space Flight Center's Systematic Image-Based Optical Alignment (SIBOA) Testbed was developed to test phase retrieval algorithms and hardware techniques. Individuals working with the facility developed the idea of implementing phase retrieval by breaking the determination of the tip/tilt of each mirror apart from the piston motion (or translation) of each mirror. Presented in this report is an algorithm that determines the optimal phase correction associated only with the piston motion of the mirrors. A description of the phase retrieval problem is first presented. The Systematic Image-Based Optical Alignment (SIBOA) Testbed is then described. A Discrete Fourier Transform (DFT) is necessary to transfer the incoming wavefront (or estimate of phase error) into the spatial frequency domain to compare it with the image. A method for reducing the DFT to seven scalar/matrix multiplications is presented. A genetic algorithm is then used to search for the phase error. The results of this new algorithm on a test problem are presented.

  15. A Genetic Algorithm Based on Sexual Selection for the Multidimensional 0/1 Knapsack Problems

    NASA Astrophysics Data System (ADS)

    Varnamkhasti, Mohammad Jalali; Lee, Lai Soon

    In this study, a new technique is presented for choosing mate chromosomes during sexual selection in a genetic algorithm. The population is divided into groups of males and females. During sexual selection, the female chromosome is selected by tournament selection, while the male chromosome is selected based on its Hamming distance from the selected female chromosome, its fitness value, or its active genes. Computational experiments are conducted on the proposed technique, and the results are compared with those of selection mechanisms commonly used in the literature for solving multidimensional 0/1 knapsack problems.
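
    The mate-selection step described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it covers only the Hamming-distance criterion for the male (the abstract also mentions fitness value and active genes), it assumes the most distant male is preferred, and the names `hamming`, `tournament` and `select_pair` are hypothetical.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def tournament(group, fitness, k, rng):
    """Pick k random chromosomes from the group and return the fittest one."""
    contestants = rng.sample(group, k)
    return max(contestants, key=fitness)


def select_pair(females, males, fitness, k=2, rng=random):
    """Female by tournament; male by largest Hamming distance (assumption)."""
    female = tournament(females, fitness, k, rng)
    male = max(males, key=lambda m: hamming(m, female))
    return female, male


if __name__ == "__main__":
    females = [[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 1]]
    males = [[1, 1, 1, 0], [0, 0, 0, 0], [1, 0, 1, 1]]
    # Toy fitness: number of ones in the chromosome.
    print(select_pair(females, males, fitness=sum, rng=random.Random(0)))
```

    Choosing a distant mate is one way to keep the two parents dissimilar, which tends to preserve diversity in the offspring; preferring a nearby mate instead would only require replacing `max` with `min`.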

  16. Random search optimization based on genetic algorithm and discriminant function

    NASA Technical Reports Server (NTRS)

    Kiciman, M. O.; Akgul, M.; Erarslanoglu, G.

    1990-01-01

    The general problem of optimization with arbitrary merit and constraint functions, which could be convex, concave, monotonic, or non-monotonic, is treated using stochastic methods. To improve the efficiency of the random search methods, a genetic algorithm for the search phase and a discriminant function for the constraint-control phase were utilized. The validity of the technique is demonstrated by comparing the results to published test problem results. Numerical experimentation indicated that for cases where a quick near optimum solution is desired, a general, user-friendly optimization code can be developed without serious penalties in both total computer time and accuracy.

  17. Potential habitat distribution for the freshwater diatom Didymosphenia geminata in the continental US

    USGS Publications Warehouse

    Kumar, S.; Spaulding, S.A.; Stohlgren, T.J.; Hermann, K.A.; Schmidt, T.S.; Bahls, L.L.

    2009-01-01

    The diatom Didymosphenia geminata is a single-celled alga found in lakes, streams, and rivers. Nuisance blooms of D. geminata affect the diversity, abundance, and productivity of other aquatic organisms. Because D. geminata can be transported by humans on waders and other gear, accurate spatial prediction of habitat suitability is urgently needed for early detection and rapid response, as well as for evaluation of monitoring and control programs. We compared four modeling methods to predict D. geminata's habitat distribution; two methods use presence-absence data (logistic regression and classification and regression tree [CART]), and two involve presence data (maximum entropy model [Maxent] and genetic algorithm for rule-set production [GARP]). Using these methods, we evaluated spatially explicit, bioclimatic and environmental variables as predictors of diatom distribution. The Maxent model provided the most accurate predictions, followed by logistic regression, CART, and GARP. The most suitable habitats were predicted to occur in the western US, in relatively cool sites, and at high elevations with a high base-flow index. The results provide insights into the factors that affect the distribution of D. geminata and a spatial basis for the prediction of nuisance blooms. © The Ecological Society of America.

  18. Stochastic model search with binary outcomes for genome-wide association studies

    PubMed Central

    Malovini, Alberto; Puca, Annibale A; Bellazzi, Riccardo

    2012-01-01

    Objective: The spread of case–control genome-wide association studies (GWASs) has stimulated the development of new variable selection methods and predictive models. We introduce a novel Bayesian model search algorithm, Binary Outcome Stochastic Search (BOSS), which addresses the model selection problem when the number of predictors far exceeds the number of binary responses. Materials and methods: Our method is based on a latent variable model that links the observed outcomes to the underlying genetic variables. A Markov Chain Monte Carlo approach is used for model search and to evaluate the posterior probability of each predictor. Results: BOSS is compared with three established methods (stepwise regression, logistic lasso, and elastic net) in a simulated benchmark. Two real case studies are also investigated: a GWAS on the genetic bases of longevity, and the type 2 diabetes study from the Wellcome Trust Case Control Consortium. Simulations show that BOSS achieves higher precision than the reference methods while preserving good recall rates. In both experimental studies, BOSS successfully detects genetic polymorphisms previously reported to be associated with the analyzed phenotypes. Discussion: BOSS outperforms the other methods in terms of F-measure on simulated data. In the two real studies, BOSS successfully detects biologically relevant features, some of which are missed by univariate analysis and the three reference techniques. Conclusion: The proposed algorithm is an advance in the methodology for model selection with a large number of features. Our simulated and experimental results showed that BOSS proves effective in detecting relevant markers while providing a parsimonious model. PMID:22534080

  19. Optimization of the sources in local hyperthermia using a combined finite element-genetic algorithm method.

    PubMed

    Siauve, N; Nicolas, L; Vollaire, C; Marchal, C

    2004-12-01

    This article describes an optimization process specially designed for local and regional hyperthermia in order to achieve the desired specific absorption rate in the patient. It is based on a genetic algorithm coupled to a finite element formulation. The optimization method is applied to meshes of real human organs assembled from computerized tomography scans. A 3D finite element formulation is used to calculate the electromagnetic field produced in the patient by radiofrequency or microwave sources. Space discretization is performed using incomplete first order edge elements. The sparse complex symmetric matrix equation is solved using a conjugate gradient solver with potential projection preconditioning. The formulation is validated by comparing calculated specific absorption rate distributions in a phantom with temperature measurements. A genetic algorithm is used to optimize the specific absorption rate distribution and to predict the phases and amplitudes of the sources leading to the best focalization. The objective function is defined as the ratio of the specific absorption rate in the tumour to that in healthy tissues. Several constraints, regarding the specific absorption rate in the tumour and the total power in the patient, may be prescribed. Results obtained with two types of applicators (waveguides and annular phased array) are presented and demonstrate the capabilities of the developed optimization process.

  20. Modelling and Optimization Studies on a Novel Lipase Production by Staphylococcus arlettae through Submerged Fermentation

    PubMed Central

    Chauhan, Mamta; Chauhan, Rajinder Singh; Garlapati, Vijay Kumar

    2013-01-01

    Microbial enzymes from extremophilic regions such as hot spring serve as an important source of various stable and valuable industrial enzymes. The present paper encompasses the modeling and optimization approach for production of halophilic, solvent, tolerant, and alkaline lipase from Staphylococcus arlettae through response surface methodology integrated nature inspired genetic algorithm. Response surface model based on central composite design has been developed by considering the individual and interaction effects of fermentation conditions on lipase production through submerged fermentation. The validated input space of response surface model (with R² value of 96.6%) has been utilized for optimization through genetic algorithm. An optimum lipase yield of 6.5 U/mL has been obtained using binary coded genetic algorithm predicted conditions of 9.39% inoculum with the oil concentration of 10.285% in 2.99 hrs using pH of 7.32 at 38.8°C. This outcome could contribute to introducing this extremophilic lipase (halophilic, solvent, and tolerant) to industrial biotechnology sector and will be a probable choice for different food, detergent, chemical, and pharmaceutical industries. The present work also demonstrated the feasibility of statistical design tools integration with computational tools for optimization of fermentation conditions for maximum lipase production. PMID:24455210

  1. Interactive searching of facial image databases

    NASA Astrophysics Data System (ADS)

    Nicholls, Robert A.; Shepherd, John W.; Shepherd, Jean

    1995-09-01

    A set of psychological facial descriptors has been devised to enable computerized searching of criminal photograph albums. The descriptors have been used to encode image databases of up to twelve thousand images. Using a system called FACES, the databases are searched by translating a witness' verbal description into corresponding facial descriptors. Trials of FACES have shown that this coding scheme is more productive and efficient than searching traditional photograph albums. An alternative method of searching the encoded database using a genetic algorithm is currently being tested. The genetic search method does not require the witness to verbalize a description of the target but merely to indicate a degree of similarity between the target and a limited selection of images from the database. The major drawback of FACES is that it requires manual encoding of images. Research is being undertaken to automate the process; however, it will require an algorithm which can predict human descriptive values. Alternatives to human derived coding schemes exist using statistical classifications of images. Since databases encoded using statistical classifiers do not have an obvious direct mapping to human derived descriptors, a search method which does not require the entry of human descriptors is required. A genetic search algorithm is being tested for such a purpose.

  2. A universal deep learning approach for modeling the flow of patients under different severities.

    PubMed

    Jiang, Shancheng; Chin, Kwai-Sang; Tsui, Kwok L

    2018-02-01

    The Accident and Emergency Department (A&ED) is the frontline for providing emergency care in hospitals. Unfortunately, relative A&ED resources have failed to keep up with continuously increasing demand in recent years, which leads to overcrowding in A&ED. Knowing the fluctuation of patient arrival volume in advance is a significant premise to relieve this pressure. Based on this motivation, the objective of this study is to explore an integrated framework with high accuracy for predicting A&ED patient flow under different triage levels, by combining a novel feature selection process with deep neural networks. Administrative data is collected from an actual A&ED and categorized into five groups based on different triage levels. A genetic algorithm (GA)-based feature selection algorithm is improved and implemented as a pre-processing step for this time-series prediction problem, in order to explore key features affecting patient flow. In our improved GA, a fitness-based crossover is proposed to maintain the joint information of multiple features during iterative process, instead of traditional point-based crossover. Deep neural networks (DNN) is employed as the prediction model to utilize their universal adaptability and high flexibility. In the model-training process, the learning algorithm is well-configured based on a parallel stochastic gradient descent algorithm. Two effective regularization strategies are integrated in one DNN framework to avoid overfitting. All introduced hyper-parameters are optimized efficiently by grid-search in one pass. As for feature selection, our improved GA-based feature selection algorithm has outperformed a typical GA and four state-of-the-art feature selection algorithms (mRMR, SAFS, VIFR, and CFR). 
As for the prediction accuracy of the proposed integrated framework, compared with other frequently used statistical models (GLM, seasonal-ARIMA, ARIMAX, and ANN) and modern machine learning models (SVM-RBF, SVM-linear, RF, and R-LASSO), the proposed integrated "DNN-I-GA" framework achieves higher prediction accuracy on both MAPE and RMSE metrics in pairwise comparisons. The contribution of our study is two-fold. Theoretically, the traditional GA-based feature selection process is improved to have fewer hyper-parameters and higher efficiency, and the joint information of multiple features is maintained by the fitness-based crossover operator. The universal property of DNN is further enhanced by merging different regularization strategies. Practically, features selected by our improved GA can be used to uncover an underlying relationship between patient flows and input features. Predictive values are significant indicators of patients' demand and can be used by A&ED managers for resource planning and allocation. The high accuracy achieved by the present framework in different cases enhances the reliability of downstream decision making. Copyright © 2017 Elsevier B.V. All rights reserved.
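
    The two error metrics named above, MAPE and RMSE, have standard definitions. A minimal sketch follows, assuming the usual forms (mean absolute percentage error in percent, and root mean squared error). The function names and the sample numbers are illustrative and do not come from the study.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    total = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(total / len(actual))


if __name__ == "__main__":
    # Hypothetical daily arrival counts and forecasts.
    observed = [100, 200, 150]
    forecast = [110, 180, 150]
    print(mape(observed, forecast), rmse(observed, forecast))
```

    MAPE is scale-free, so it allows comparison across triage levels with very different arrival volumes, whereas RMSE penalizes large individual misses more heavily; reporting both gives a fuller picture than either alone.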

  3. Predictive model for survival in patients with gastric cancer.

    PubMed

    Goshayeshi, Ladan; Hoseini, Benyamin; Yousefli, Zahra; Khooie, Alireza; Etminani, Kobra; Esmaeilzadeh, Abbas; Golabpour, Amin

    2017-12-01

    Gastric cancer is one of the most prevalent cancers in the world. Characterized by poor prognosis, it is a frequent cause of cancer in Iran. The aim of the study was to design a predictive model of survival time for patients suffering from gastric cancer. This was a historical cohort study conducted between 2011 and 2016. The study population comprised 277 patients suffering from gastric cancer. Data were gathered from the Iranian Cancer Registry and the laboratory of Emam Reza Hospital in Mashhad, Iran. Patients or their relatives were interviewed where needed. Missing values were imputed by data mining techniques. Fifteen factors were analyzed. Survival was treated as the dependent variable. The predictive model was then designed by combining a genetic algorithm with logistic regression, using Matlab 2014 software. Of the 277 patients, survival data were available for only 80, and these data were used for designing the predictive model. The mean ± SD of missing values for each patient was 4.43 ± .41. The combined predictive model achieved 72.57% accuracy. Sex, birth year, age at diagnosis time, age at diagnosis time of patients' family, family history of gastric cancer, and family history of other gastrointestinal cancers were six parameters associated with patient survival. The study revealed that imputing missing values by data mining techniques has good accuracy. It also revealed that six parameters extracted by the genetic algorithm affect the survival of patients with gastric cancer. Our combined predictive model, with good accuracy, is appropriate for forecasting the survival of patients suffering from gastric cancer. We therefore suggest that policy makers and specialists apply it to predict patients' survival.

  4. Artificial neural network analysis based on genetic algorithm to predict the performance characteristics of a cross flow cooling tower

    NASA Astrophysics Data System (ADS)

    Wu, Jiasheng; Cao, Lin; Zhang, Guoqiang

    2018-02-01

    Cooling towers are widely used as cooling equipment in air conditioning, and they would have broad application prospects if they could be used in reverse as a heat source under heat pump heating conditions. In view of the complex non-linear relationships among the parameters in the heat and mass transfer process inside the tower, this paper establishes a BP neural network model optimized by a genetic algorithm (GABP neural network model) for the reverse use of a cross flow cooling tower. The model adopts a structure of 6 inputs, 13 hidden nodes and 8 outputs. With this model, a total of 8 main performance parameters were predicted, including the outlet air dry bulb temperature, wet bulb temperature, water temperature, heat, sensible heat ratio, heat absorbing efficiency and Lewis number. Furthermore, the established network model is used to predict the water temperature and heat absorption of the tower at different inlet temperatures. The mean relative errors (MRE) between the BP predicted values and the experimental values are 4.47%, 3.63%, 2.38%, 3.71%, 6.35%, 3.14%, 13.95% and 6.80%, respectively; the MREs between the GABP predicted values and the experimental values are 2.66%, 3.04%, 2.27%, 3.02%, 6.89%, 3.17%, 11.50% and 6.57%, respectively. The results show that the predictions of the GABP network model are better than those of the BP network model, and that the simulation results are basically consistent with the actual situation. The GABP network model can predict the heat and mass transfer performance of the cross flow cooling tower well.

  5. A Rigid Image Registration Based on the Nonsubsampled Contourlet Transform and Genetic Algorithms

    PubMed Central

    Meskine, Fatiha; Chikr El Mezouar, Miloud; Taleb, Nasreddine

    2010-01-01

    Image registration is a fundamental task used in image processing to match two or more images taken at different times, from different sensors or from different viewpoints. The objective is to find in a huge search space of geometric transformations, an acceptable accurate solution in a reasonable time to provide better registered images. Exhaustive search is computationally expensive and the computational cost increases exponentially with the number of transformation parameters and the size of the data set. In this work, we present an efficient image registration algorithm that uses genetic algorithms within a multi-resolution framework based on the Non-Subsampled Contourlet Transform (NSCT). An adaptable genetic algorithm for registration is adopted in order to minimize the search space. This approach is used within a hybrid scheme applying the two techniques fitness sharing and elitism. Two NSCT based methods are proposed for registration. A comparative study is established between these methods and a wavelet based one. Because the NSCT is a shift-invariant multidirectional transform, the second method is adopted for its search speeding up property. Simulation results clearly show that both proposed techniques are really promising methods for image registration compared to the wavelet approach, while the second technique has led to the best performance results of all. Moreover, to demonstrate the effectiveness of these methods, these registration techniques have been successfully applied to register SPOT, IKONOS and Synthetic Aperture Radar (SAR) images. The algorithm has been shown to work perfectly well for multi-temporal satellite images as well, even in the presence of noise. PMID:22163672
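
    The abstract mentions fitness sharing without stating the sharing function used. As a hedged sketch, the block below implements the classic formulation, in which each individual's raw fitness is divided by a niche count built from a triangular sharing function of radius `sigma_share`. The function name, the parameters and the one-dimensional toy data are assumptions for illustration only.

```python
def shared_fitness(population, raw_fitness, distance, sigma_share, alpha=1.0):
    """Divide each raw fitness by its niche count (classic fitness sharing).

    Assumed sharing function: sh(d) = 1 - (d / sigma_share) ** alpha
    for d < sigma_share, and 0 otherwise.
    """
    shared = []
    for i, individual in enumerate(population):
        niche_count = 0.0
        for other in population:
            d = distance(individual, other)
            if d < sigma_share:
                niche_count += 1.0 - (d / sigma_share) ** alpha
        # niche_count >= 1 because every individual is at distance 0 from itself.
        shared.append(raw_fitness[i] / niche_count)
    return shared


if __name__ == "__main__":
    # Toy example: each individual is a single rotation angle in degrees.
    angles = [0.0, 0.1, 5.0]
    fitness = [10.0, 10.0, 10.0]
    print(shared_fitness(angles, fitness, lambda a, b: abs(a - b), sigma_share=1.0))
```

    In the toy run the two crowded candidates near 0 degrees are penalized relative to the isolated one at 5 degrees, which is how sharing discourages the whole population from collapsing onto a single region of the transformation space.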

  6. A rigid image registration based on the nonsubsampled contourlet transform and genetic algorithms.

    PubMed

    Meskine, Fatiha; Chikr El Mezouar, Miloud; Taleb, Nasreddine

    2010-01-01

    Image registration is a fundamental task used in image processing to match two or more images taken at different times, from different sensors or from different viewpoints. The objective is to find in a huge search space of geometric transformations, an acceptable accurate solution in a reasonable time to provide better registered images. Exhaustive search is computationally expensive and the computational cost increases exponentially with the number of transformation parameters and the size of the data set. In this work, we present an efficient image registration algorithm that uses genetic algorithms within a multi-resolution framework based on the Non-Subsampled Contourlet Transform (NSCT). An adaptable genetic algorithm for registration is adopted in order to minimize the search space. This approach is used within a hybrid scheme applying the two techniques fitness sharing and elitism. Two NSCT based methods are proposed for registration. A comparative study is established between these methods and a wavelet based one. Because the NSCT is a shift-invariant multidirectional transform, the second method is adopted for its search speeding up property. Simulation results clearly show that both proposed techniques are really promising methods for image registration compared to the wavelet approach, while the second technique has led to the best performance results of all. Moreover, to demonstrate the effectiveness of these methods, these registration techniques have been successfully applied to register SPOT, IKONOS and Synthetic Aperture Radar (SAR) images. The algorithm has been shown to work perfectly well for multi-temporal satellite images as well, even in the presence of noise.

  7. Estimation and optimization of thermal performance of evacuated tube solar collector system

    NASA Astrophysics Data System (ADS)

    Dikmen, Erkan; Ayaz, Mahir; Ezen, H. Hüseyin; Küçüksille, Ecir U.; Şahin, Arzu Şencan

    2014-05-01

    In this study, artificial neural networks (ANNs) and an adaptive neuro-fuzzy inference system (ANFIS) were used to predict the thermal performance of an evacuated tube solar collector system. Experimental data were used for the training and testing of the networks. The results of the ANN are compared with those of ANFIS using the same data sets. The R² value for the thermal performance values of the collector is 0.811914, which can be considered satisfactory. The results obtained when unknown data were presented to the networks are satisfactory and indicate that the proposed method can successfully be used for the prediction of the thermal performance of evacuated tube solar collectors. In addition, new formulations obtained from the ANN are presented for the calculation of the thermal performance. The advantages of these approaches compared to the conventional methods are speed, simplicity, and the capacity of the network to learn from examples. In addition, a genetic algorithm (GA) was used to maximize the thermal performance of the system. The optimum working conditions of the system were determined by the GA.

  8. Prediction model for prevalence and incidence of advanced age-related macular degeneration based on genetic, demographic, and environmental variables.

    PubMed

    Seddon, Johanna M; Reynolds, Robyn; Maller, Julian; Fagerness, Jesen A; Daly, Mark J; Rosner, Bernard

    2009-05-01

    The joint effects of genetic, ocular, and environmental variables were evaluated and predictive models for prevalence and incidence of AMD were assessed. Participants in the multicenter Age-Related Eye Disease Study (AREDS) were included in a prospective evaluation of 1446 individuals, of which 279 progressed to advanced AMD (geographic atrophy or neovascular disease) and 1167 did not progress during 6.3 years of follow-up. For prevalent AMD, 509 advanced cases were compared with 222 controls. Covariates for the incidence analysis included age, sex, education, smoking, body mass index (BMI), baseline AMD grade, and the AREDS vitamin-mineral treatment assignment. DNA specimens were evaluated for six variants in five genes related to AMD. Unconditional logistic regression analyses were performed for prevalent and incident advanced AMD. An algorithm was developed and receiver operating characteristic curves and C statistics were calculated to assess the predictive ability of risk scores to discriminate progressors from nonprogressors. All genetic polymorphisms were independently related to prevalence of advanced AMD, controlling for genetic factors, smoking, BMI, and AREDS treatment. Multivariate odds ratios (ORs) were 3.5 (95% confidence interval [CI], 1.7-7.1) for CFH Y402H; 3.7 (95% CI, 1.6-8.4) for CFH rs1410996; 25.4 (95% CI, 8.6-75.1) for LOC387715 A69S (ARMS2); 0.3 (95% CI, 0.1-0.7) for C2 E318D; 0.3 (95% CI, 0.1-0.5) for CFB; and 3.6 (95% CI, 1.4-9.4) for C3 R102G, comparing the homozygous risk/protective genotypes to the referent genotypes. For incident AMD, all these variants except CFB were significantly related to progression to advanced AMD, after controlling for baseline AMD grade and other factors, with ORs from 1.8 to 4.0 for presence of two risk alleles and 0.4 for the protective allele. An interaction was seen between CFH402H and treatment, after controlling for all genotypes. 
Smoking was independently related to AMD, with a multiplicative joint effect with genotype on AMD risk. The C statistic for the full model with all variables was 0.831 for progression to advanced AMD. Factors reflective of nature and nurture are independently related to prevalence and incidence of advanced AMD, with excellent predictive power.
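
    The C statistic reported in this abstract can be read as the probability that a randomly chosen progressor receives a higher risk score than a randomly chosen nonprogressor. A minimal Python sketch of that pairwise definition follows. The function name and the toy risk scores are illustrative assumptions, not the study's data or code.

```python
def c_statistic(scores_pos, scores_neg):
    """Fraction of (progressor, nonprogressor) pairs ranked correctly.

    Tied scores count as half a concordant pair.
    """
    concordant = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores, for illustration only.
progressors = [0.9, 0.7, 0.4]
nonprogressors = [0.8, 0.3, 0.2, 0.1]
print(c_statistic(progressors, nonprogressors))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.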

  9. Analysis of Bioactive Amino Acids from Fish Hydrolysates with a New Bioinformatic Intelligent System Approach.

    PubMed

    Elaziz, Mohamed Abd; Hemdan, Ahmed Monem; Hassanien, AboulElla; Oliva, Diego; Xiong, Shengwu

    2017-09-07

    The current economics of the fish protein industry demand rapid, accurate and expressive prediction algorithms at every step of protein production, especially given the challenge of global climate change. Such algorithms help to predict and analyze functional and nutritional quality and, consequently, to control food allergies in hyperallergic patients. It is quite expensive and time-consuming to determine these concentrations by laboratory tests, especially in large-scale projects. Therefore, this paper introduces a new intelligent algorithm using an adaptive neuro-fuzzy inference system based on the whale optimization algorithm. This algorithm is used to predict the concentration levels of bioactive amino acids in fish protein hydrolysates at different times during the year. The whale optimization algorithm is used to determine the optimal parameters of the adaptive neuro-fuzzy inference system. The results of the proposed algorithm are compared with those of other methods and indicate its higher performance.

  10. A novel acenocoumarol pharmacogenomic dosing algorithm for the Greek population of EU-PACT trial.

    PubMed

    Ragia, Georgia; Kolovou, Vana; Kolovou, Genovefa; Konstantinides, Stavros; Maltezos, Efstratios; Tavridou, Anna; Tziakas, Dimitrios; Maitland-van der Zee, Anke H; Manolopoulos, Vangelis G

    2017-01-01

    To generate and validate a pharmacogenomic-guided (PG) dosing algorithm for acenocoumarol in the Greek population. To compare its performance with other PG algorithms developed for the Greek population. A total of 140 Greek patient participants of the EU-PACT trial for acenocoumarol, a randomized clinical trial that prospectively compared the effect of a PG dosing algorithm with a clinical dosing algorithm on the percentage of time within INR therapeutic range, who reached acenocoumarol stable dose were included in the study. CYP2C9 and VKORC1 genotypes, age and weight affected acenocoumarol dose and predicted 53.9% of its variability. EU-PACT PG algorithm overestimated acenocoumarol dose across all different CYP2C9/VKORC1 functional phenotype bins (predicted dose vs stable dose in normal responders 2.31 vs 2.00 mg/day, p = 0.028, in sensitive responders 1.72 vs 1.50 mg/day, p = 0.003, in highly sensitive responders 1.39 vs 1.00 mg/day, p = 0.029). The PG algorithm previously developed for the Greek population overestimated the dose in normal responders (2.51 vs 2.00 mg/day, p < 0.001). Ethnic-specific dosing algorithm is suggested for better prediction of acenocoumarol dosage requirements in patients of Greek origin.

  11. GeneYenta: a phenotype-based rare disease case matching tool based on online dating algorithms for the acceleration of exome interpretation.

    PubMed

    Gottlieb, Michael M; Arenillas, David J; Maithripala, Savanie; Maurer, Zachary D; Tarailo Graovac, Maja; Armstrong, Linlea; Patel, Millan; van Karnebeek, Clara; Wasserman, Wyeth W

    2015-04-01

    Advances in next-generation sequencing (NGS) technologies have helped reveal causal variants for genetic diseases. In order to establish causality, it is often necessary to compare genomes of unrelated individuals with similar disease phenotypes to identify common disrupted genes. When working with cases of rare genetic disorders, finding similar individuals can be extremely difficult. We introduce a web tool, GeneYenta, which facilitates the matchmaking process, allowing clinicians to coordinate detailed comparisons for phenotypically similar cases. Importantly, the system is focused on phenotype annotation, with explicit limitations on highly confidential data that create barriers to participation. The procedure for matching of patient phenotypes, inspired by online dating services, uses an ontology-based semantic case matching algorithm with attribute weighting. We evaluate the capacity of the system using a curated reference data set and 19 clinician entered cases comparing four matching algorithms. We find that the inclusion of clinician weights can augment phenotype matching. © 2015 WILEY PERIODICALS, INC.

  12. Predicting prolonged dose titration in patients starting warfarin.

    PubMed

    Finkelman, Brian S; French, Benjamin; Bershaw, Luanne; Brensinger, Colleen M; Streiff, Michael B; Epstein, Andrew E; Kimmel, Stephen E

    2016-11-01

    Patients initiating warfarin therapy generally experience a dose-titration period of weeks to months, during which time they are at higher risk of both thromboembolic and bleeding events. Accurate prediction of prolonged dose titration could help clinicians determine which patients might be better treated by alternative anticoagulants that, while more costly, do not require dose titration. A prediction model was derived in a prospective cohort of patients starting warfarin (n = 390), using Cox regression, and validated in an external cohort (n = 663) from a later time period. Prolonged dose titration was defined as a dose-titration period >12 weeks. Predictor variables were selected using a modified best subsets algorithm, using leave-one-out cross-validation to reduce overfitting. The final model had five variables: warfarin indication, insurance status, number of doctor's visits in the previous year, smoking status, and heart failure. The area under the ROC curve (AUC) in the derivation cohort was 0.66 (95%CI 0.60, 0.74) using leave-one-out cross-validation, but only 0.59 (95%CI 0.54, 0.64) in the external validation cohort, and varied across clinics. Including genetic factors in the model did not improve the area under the ROC curve (0.59; 95%CI 0.54, 0.65). Relative utility curves indicated that the model was unlikely to provide a clinically meaningful benefit compared with no prediction. Our results suggest that prolonged dose titration cannot be accurately predicted in warfarin patients using traditional clinical, social, and genetic predictors, and that accurate prediction will need to accommodate heterogeneities across clinical sites and over time. Copyright © 2016 John Wiley & Sons, Ltd.

  13. Study of parameter identification using hybrid neural-genetic algorithm in electro-hydraulic servo system

    NASA Astrophysics Data System (ADS)

    Moon, Byung-Young

    2005-12-01

    A hybrid neural-genetic multi-model parameter estimation algorithm was demonstrated. This method can be applied to structured system identification of an electro-hydraulic servo system. The algorithm consists of a recurrent incremental credit assignment (ICRA) neural network and a genetic algorithm. The ICRA neural network evaluates each member of a generation of models, and the genetic algorithm produces a new generation of models. To evaluate the proposed method, an electro-hydraulic servo system was designed and manufactured. An experiment was carried out to evaluate the hybrid neural-genetic multi-model parameter estimation algorithm. As a result, dynamic characteristics were obtained, namely the parameters (mass, damping coefficient, bulk modulus, spring coefficient) that minimize the total square error. The results of this study can be applied to hydraulic systems in industrial fields.

  14. A novel hybrid genetic algorithm to solve the make-to-order sequence-dependent flow-shop scheduling problem

    NASA Astrophysics Data System (ADS)

    Mirabi, Mohammad; Fatemi Ghomi, S. M. T.; Jolai, F.

    2014-04-01

    Flow-shop scheduling problem (FSP) deals with the scheduling of a set of n jobs that visit a set of m machines in the same order. As the FSP is NP-hard, there is no efficient algorithm to reach the optimal solution of the problem. To minimize the holding, delay and setup costs of large permutation flow-shop scheduling problems with sequence-dependent setup times on each machine, this paper develops a novel hybrid genetic algorithm (HGA) with three genetic operators. Proposed HGA applies a modified approach to generate a pool of initial solutions, and also uses an improved heuristic called the iterated swap procedure to improve the initial solutions. We consider the make-to-order production approach that some sequences between jobs are assumed as tabu based on maximum allowable setup cost. In addition, the results are compared to some recently developed heuristics and computational experimental results show that the proposed HGA performs very competitively with respect to accuracy and efficiency of solution.

  15. Mathematical modeling of continuous ethanol fermentation in a membrane bioreactor by pervaporation compared to conventional system: Genetic algorithm.

    PubMed

    Esfahanian, Mehri; Shokuhi Rad, Ali; Khoshhal, Saeed; Najafpour, Ghasem; Asghari, Behnam

    2016-07-01

    In this paper, genetic algorithm was used to investigate mathematical modeling of ethanol fermentation in a continuous conventional bioreactor (CCBR) and a continuous membrane bioreactor (CMBR) by ethanol permselective polydimethylsiloxane (PDMS) membrane. A lab scale CMBR with medium glucose concentration of 100 g L⁻¹ and Saccharomyces cerevisiae microorganism was designed and fabricated. At dilution rate of 0.14 h⁻¹, maximum specific cell growth rate and productivity of 0.27 h⁻¹ and 6.49 g L⁻¹ h⁻¹ were respectively found in CMBR. However, at very high dilution rate, the performance of CMBR was quite similar to conventional fermentation on account of insufficient incubation time. In both systems, genetic algorithm modeling of cell growth, ethanol production and glucose concentration were conducted based on Monod and Moser kinetic models during each retention time at unsteady condition. The results showed that Moser kinetic model was more satisfactory and desirable than Monod model. Copyright © 2016 Elsevier Ltd. All rights reserved.
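
    The two kinetic models compared in this abstract are commonly written as mu = mu_max * S / (K_s + S) for Monod and mu = mu_max * S^n / (K_s + S^n) for Moser, where S is the substrate concentration. The Python sketch below uses those textbook forms. The parameter values are hypothetical placeholders, not the fitted values from the paper, and the abstract does not state the exact parameterization the authors used.

```python
def monod(s, mu_max, k_s):
    """Monod specific growth rate for substrate concentration s."""
    return mu_max * s / (k_s + s)


def moser(s, mu_max, k_s, n):
    """Moser specific growth rate; reduces to Monod when n == 1."""
    return mu_max * s**n / (k_s + s**n)


# Hypothetical parameters, for illustration only.
mu_max, k_s, n = 0.27, 5.0, 1.5
for s in (1.0, 10.0, 100.0):
    print(s, monod(s, mu_max, k_s), moser(s, mu_max, k_s, n))
```

    In a genetic-algorithm fit, a candidate solution would hold the parameters (mu_max, K_s, and n for Moser), and its fitness would be the mismatch between modeled and measured concentrations.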

  16. Genetic evolutionary taboo search for optimal marker placement in infrared patient setup

    NASA Astrophysics Data System (ADS)

    Riboldi, M.; Baroni, G.; Spadea, M. F.; Tagaste, B.; Garibaldi, C.; Cambria, R.; Orecchia, R.; Pedotti, A.

    2007-09-01

    In infrared patient setup adequate selection of the external fiducial configuration is required for compensating inner target displacements (target registration error, TRE). Genetic algorithms (GA) and taboo search (TS) were applied in a newly designed approach to optimal marker placement: the genetic evolutionary taboo search (GETS) algorithm. In the GETS paradigm, multiple solutions are simultaneously tested in a stochastic evolutionary scheme, where taboo-based decision making and adaptive memory guide the optimization process. The GETS algorithm was tested on a group of ten prostate patients, to be compared to standard optimization and to randomly selected configurations. The changes in the optimal marker configuration, when TRE is minimized for OARs, were specifically examined. Optimal GETS configurations ensured a 26.5% mean decrease in the TRE value, versus 19.4% for conventional quasi-Newton optimization. Common features in GETS marker configurations were highlighted in the dataset of ten patients, even when multiple runs of the stochastic algorithm were performed. Including OARs in TRE minimization did not considerably affect the spatial distribution of GETS marker configurations. In conclusion, the GETS algorithm proved to be highly effective in solving the optimal marker placement problem. Further work is needed to embed site-specific deformation models in the optimization process.

  17. Solving deterministic non-linear programming problem using Hopfield artificial neural network and genetic programming techniques

    NASA Astrophysics Data System (ADS)

    Vasant, P.; Ganesan, T.; Elamvazuthi, I.

    2012-11-01

    Fairly reasonable results have been obtained in the past for non-linear engineering problems using optimization techniques such as neural networks, genetic algorithms, and fuzzy logic independently. Increasingly, hybrid techniques are being used to solve non-linear problems to obtain better output. This paper discusses the use of a neuro-genetic hybrid technique to optimize geological structure mapping, which is known as seismic survey. It involves the minimization of an objective function subject to geophysical and operational constraints. In this work, the optimization was initially performed using genetic programming, followed by hybrid neuro-genetic programming approaches. Comparative studies and analysis were then carried out on the optimized results. The results indicate that the hybrid neuro-genetic technique produced better results than the stand-alone genetic programming method.

  18. Prediction based active ramp metering control strategy with mobility and safety assessment

    NASA Astrophysics Data System (ADS)

    Fang, Jie; Tu, Lili

    2018-04-01

    Ramp metering (RM) is one of the most direct and efficient motorway traffic flow management measures for improving traffic conditions. However, owing to the lack of traffic condition prediction, earlier studies did not quantitatively evaluate the impact of the applied RM control on traffic flow dynamics. In this study, an RM control algorithm is presented that adopts a Model Predictive Control (MPC) framework to predict and assess future traffic conditions, taking both the current traffic conditions and the RM-controlled future traffic states into consideration. The designed RM control algorithm aims to optimize network mobility and safety performance. The designed algorithm is evaluated in a field-data-based simulation. Comparing the scenario controlled by the presented algorithm with the uncontrolled scenario showed that the proposed RM control algorithm can effectively relieve congestion in the traffic network with no significant compromise in safety.

  19. Training product unit neural networks with genetic algorithms

    NASA Technical Reports Server (NTRS)

    Janson, D. J.; Frenzel, J. F.; Thelen, D. C.

    1991-01-01

    The training of product neural networks using genetic algorithms is discussed. Two unusual neural network techniques are combined; product units are employed instead of the traditional summing units and genetic algorithms train the network rather than backpropagation. As an example, a neural network is trained to calculate the optimum width of transistors in a CMOS switch. It is shown how local minima affect the performance of a genetic algorithm, and one method of overcoming this is presented.

  20. New Results in Astrodynamics Using Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Coverstone-Carroll, V.; Hartmann, J. W.; Williams, S. N.; Mason, W. J.

    1998-01-01

    Genetic algorithms have gained popularity as an effective procedure for obtaining solutions to traditionally difficult space mission optimization problems. In this paper, a brief survey of the use of genetic algorithms to solve astrodynamics problems is presented and is followed by new results obtained from applying a Pareto genetic algorithm to the optimization of low-thrust interplanetary spacecraft missions.

  1. Optimal Refueling Pattern Search for a CANDU Reactor Using a Genetic Algorithm

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Quang Binh, DO; Gyuhong, ROH; Hangbok, CHOI

    2006-07-01

    This paper presents the results from the application of genetic algorithms to a refueling optimization of a Canada deuterium uranium (CANDU) reactor. This work aims at making a mathematical model of the refueling optimization problem including the objective function and constraints and developing a method based on genetic algorithms to solve the problem. The model of the optimization problem and the proposed method comply with the key features of the refueling strategy of the CANDU reactor which adopts an on-power refueling operation. In this study, a genetic algorithm combined with an elitism strategy was used to automatically search for the refueling patterns. The objective of the optimization was to maximize the discharge burn-up of the refueling bundles, minimize the maximum channel power, or minimize the maximum change in the zone controller unit (ZCU) water levels. A combination of these objectives was also investigated. The constraints include the discharge burn-up, maximum channel power, maximum bundle power, channel power peaking factor and the ZCU water level. A refueling pattern that represents the refueling rate and channels was coded by a one-dimensional binary chromosome, which is a string of binary numbers 0 and 1. A computer program was developed in FORTRAN 90 running on an HP 9000 workstation to conduct the search for the optimal refueling patterns for a CANDU reactor at the equilibrium state. The results showed that it was possible to apply genetic algorithms to automatically search for the refueling channels of the CANDU reactor. The optimal refueling patterns were compared with the solutions obtained from the AUTOREFUEL program and the results were consistent with each other. (authors)
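
    The combination of a one-dimensional binary chromosome with an elitism strategy, as described in this abstract, can be sketched generically in Python. Everything specific below is an assumption made for illustration: the toy fitness (count of 1 bits) stands in for the reactor objectives, and the tournament selection, one-point crossover, and mutation rate are common defaults rather than the operators of the FORTRAN 90 program.

```python
import random


def evolve(fitness, n_bits, pop_size=20, generations=30, p_mut=0.02, seed=0):
    """Maximize `fitness` over bit lists with an elitist genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[0][:]                    # elitism: best chromosome survives unchanged
        best_history.append(fitness(elite))
        children = [elite]
        while len(children) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_bits)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with probability p_mut per gene.
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness), best_history


# Toy objective (hypothetical): number of 1 bits in the chromosome.
best, history = evolve(sum, n_bits=24)
print(sum(best), history[0], history[-1])
```

    Because the elite chromosome is copied unchanged into each new generation, the best fitness recorded per generation never decreases.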

  2. Prediction of Software Reliability using Bio Inspired Soft Computing Techniques.

    PubMed

    Diwaker, Chander; Tomar, Pradeep; Poonia, Ramesh C; Singh, Vijander

    2018-04-10

    Many models have been developed for predicting software reliability. These reliability models are restricted to particular types of methodologies and a restricted number of parameters. A number of techniques and methodologies may be used for reliability prediction. There is a need to focus on the choice of parameters when estimating reliability. The reliability of a system may increase or decrease depending on the parameters selected. Thus there is a need to identify the factors that heavily affect the reliability of the system. At present, reusability is widely used in various areas of research. Reusability is the basis of Component-Based System (CBS). Cost, time and human skill can be saved using Component-Based Software Engineering (CBSE) concepts. CBSE metrics may be used to assess which techniques are more suitable for estimating system reliability. Soft computing is used for small as well as large-scale problems where it is difficult to find accurate results due to uncertainty or randomness. Several possibilities are available for applying soft computing techniques to medicine-related problems. Clinical medicine makes significant use of fuzzy logic and neural network methodology, while basic medical science most frequently and preferably uses neural network-genetic algorithm approaches. Medical scientists have shown considerable interest in using the various soft computing methodologies in genetics, physiology, radiology, cardiology and neurology. CBSE encourages users to reuse past and existing software to make new products, providing quality while saving time, memory space, and money. This paper focuses on the assessment of commonly used soft computing techniques such as Genetic Algorithm (GA), Neural Network (NN), Fuzzy Logic, Support Vector Machine (SVM), Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO), and Artificial Bee Colony (ABC). 
This paper presents the working of soft computing techniques and their assessment for predicting reliability. The parameters considered in estimating and predicting reliability are also discussed. This study can be used in the estimation and prediction of the reliability of various instruments used in medical systems, software engineering, computer engineering and mechanical engineering. These concepts can be applied to both software and hardware to predict reliability using CBSE.

  3. Can human experts predict solubility better than computers?

    PubMed

    Boobier, Samuel; Osbourn, Anne; Mitchell, John B O

    2017-12-13

    In this study, we design and carry out a survey, asking human experts to predict the aqueous solubility of druglike organic compounds. We investigate whether these experts, drawn largely from the pharmaceutical industry and academia, can match or exceed the predictive power of algorithms. Alongside this, we implement 10 typical machine learning algorithms on the same dataset. The best algorithm, a variety of neural network known as a multi-layer perceptron, gave an RMSE of 0.985 log S units and an R² of 0.706. We would not have predicted the relative success of this particular algorithm in advance. We found that the best individual human predictor generated an almost identical prediction quality with an RMSE of 0.942 log S units and an R² of 0.723. The collection of algorithms contained a higher proportion of reasonably good predictors, nine out of ten compared with around half of the humans. We found that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median generated excellent predictivity. While our consensus human predictor achieved very slightly better headline figures on various statistical measures, the difference between it and the consensus machine learning predictor was both small and statistically insignificant. We conclude that human experts can predict the aqueous solubility of druglike molecules essentially equally well as machine learning algorithms. We find that, for either humans or algorithms, combining individual predictions into a consensus predictor by taking their median is a powerful way of benefitting from the wisdom of crowds.
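
    The median consensus predictor described in this abstract, together with the RMSE used to score it, can be sketched in a few lines of Python. The log S values below are invented for illustration and are not taken from the study. The helper names are likewise assumptions.

```python
import math
from statistics import median


def rmse(pred, actual):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def median_consensus(predictions):
    """Per-compound median across predictors (one inner list per predictor)."""
    return [median(column) for column in zip(*predictions)]


# Hypothetical log S values for three compounds and three predictors.
actual = [-2.0, -3.5, -1.2]
predictors = [
    [-2.5, -3.0, -1.0],
    [-1.5, -4.5, -0.5],
    [-2.0, -3.5, -2.5],
]
consensus = median_consensus(predictors)
print(consensus, rmse(consensus, actual))
```

    The median is less sensitive to a single badly wrong predictor than the mean, which is one plausible reason a consensus can beat most of its members. That outcome is not guaranteed for every dataset.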

  4. A 100-Year Review: Methods and impact of genetic selection in dairy cattle-From daughter-dam comparisons to deep learning algorithms.

    PubMed

    Weigel, K A; VanRaden, P M; Norman, H D; Grosu, H

    2017-12-01

    In the early 1900s, breed society herdbooks had been established and milk-recording programs were in their infancy. Farmers wanted to improve the productivity of their cattle, but the foundations of population genetics, quantitative genetics, and animal breeding had not been laid. Early animal breeders struggled to identify genetically superior families using performance records that were influenced by local environmental conditions and herd-specific management practices. Daughter-dam comparisons were used for more than 30 yr and, although genetic progress was minimal, the attention given to performance recording, genetic theory, and statistical methods paid off in future years. Contemporary (herdmate) comparison methods allowed more accurate accounting for environmental factors and genetic progress began to accelerate when these methods were coupled with artificial insemination and progeny testing. Advances in computing facilitated the implementation of mixed linear models that used pedigree and performance data optimally and enabled accurate selection decisions. Sequencing of the bovine genome led to a revolution in dairy cattle breeding, and the pace of scientific discovery and genetic progress accelerated rapidly. Pedigree-based models have given way to whole-genome prediction, and Bayesian regression models and machine learning algorithms have joined mixed linear models in the toolbox of modern animal breeders. Future developments will likely include elucidation of the mechanisms of genetic inheritance and epigenetic modification in key biological pathways, and genomic data will be used with data from on-farm sensors to facilitate precision management on modern dairy farms. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.

  5. Artificial intelligence based modeling and optimization of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) production process by using Azohydromonas lata MTCC 2311 from cane molasses supplemented with volatile fatty acids: a genetic algorithm paradigm.

    PubMed

    Zafar, Mohd; Kumar, Shashi; Kumar, Surendra; Dhiman, Amit K

    2012-01-01

    The present work describes the optimization of medium variables for the production of poly(3-hydroxybutyrate-co-3-hydroxyvalerate) [P(3HB-co-3HV)] by Azohydromonas lata MTCC 2311 using cane molasses supplemented with propionic acid. Genetic algorithm (GA) has been used for the optimization of P(3HB-co-3HV) production through the simulation of artificial neural network (ANN) and response surface methodology (RSM). The predictions by ANN are better than those of RSM and in good agreement with experimental findings. The highest P(3HB-co-3HV) concentration and 3HV content have been reported as 7.35 g/l and 16.84 mol%, respectively by hybrid ANN-GA. Upon validation, 7.20 g/l and 16.30 mol% of P(3HB-co-3HV) concentration and 3HV content have been found in the shake flask, whereas 6.70 g/l and 16.35 mol%, have been observed in a 3 l bioreactor, respectively. The specific growth rate and P(3HB-co-3HV) accumulation rate of 0.29 per h and 0.16 g/lh determined with cane molasses are comparable to those observed on pure substrates. Copyright © 2011 Elsevier Ltd. All rights reserved.

  6. The Comparative Effects of Prediction/Discussion-Based Learning Cycle, Conceptual Change Text, and Traditional Instructions on Student Understanding of Genetics

    NASA Astrophysics Data System (ADS)

    Yilmaz, Diba; Tekkaya, Ceren; Sungur, Semra

    2011-03-01

    The present study examined the comparative effects of a prediction/discussion-based learning cycle, conceptual change text (CCT), and traditional instructions on students' understanding of genetics concepts. A quasi-experimental research design of the pre-test-post-test non-equivalent control group was adopted. The three intact classes, taught by the same science teacher, were randomly assigned as prediction/discussion-based learning cycle class (N = 30), CCT class (N = 25), and traditional class (N = 26). Participants completed the genetics concept test as pre-test, post-test, and delayed post-test to examine the effects of instructional strategies on their genetics understanding and retention. While the dependent variable of this study was students' understanding of genetics, the independent variables were time (Time 1, Time 2, and Time 3) and mode of instruction. The mixed between-within subjects analysis of variance revealed that students in both prediction/discussion-based learning cycle and CCT groups understood the genetics concepts and retained their knowledge significantly better than students in the traditional instruction group.

  7. Nonlinear inversion of potential-field data using a hybrid-encoding genetic algorithm

    USGS Publications Warehouse

    Chen, C.; Xia, J.; Liu, J.; Feng, G.

    2006-01-01

    Using a genetic algorithm to solve an inverse problem of complex nonlinear geophysical equations is advantageous because it does not require computer gradients of models or "good" initial models. The multi-point search of a genetic algorithm makes it easier to find the globally optimal solution while avoiding falling into a local extremum. As is the case in other optimization approaches, the search efficiency for a genetic algorithm is vital in finding desired solutions successfully in a multi-dimensional model space. A binary-encoding genetic algorithm is hardly ever used to resolve an optimization problem such as a simple geophysical inversion with only three unknowns. The encoding mechanism, genetic operators, and population size of the genetic algorithm greatly affect search processes in the evolution. It is clear that improved operators and proper population size promote the convergence. Nevertheless, not all genetic operations perform perfectly while searching under either a uniform binary or a decimal encoding system. With the binary encoding mechanism, the crossover scheme may produce more new individuals than with the decimal encoding. On the other hand, the mutation scheme in a decimal encoding system will create new genes larger in scope than those in the binary encoding. This paper discusses approaches of exploiting the search potential of genetic operations in the two encoding systems and presents an approach with a hybrid-encoding mechanism, multi-point crossover, and dynamic population size for geophysical inversion. We present a method that is based on the routine in which the mutation operation is conducted in the decimal code and multi-point crossover operation in the binary code. The mix-encoding algorithm is called the hybrid-encoding genetic algorithm (HEGA). HEGA provides better genes with a higher probability by a mutation operator and improves genetic algorithms in resolving complicated geophysical inverse problems. 
Another significant result is that final solution is determined by the average model derived from multiple trials instead of one computation due to the randomness in a genetic algorithm procedure. These advantages were demonstrated by synthetic and real-world examples of inversion of potential-field data. © 2005 Elsevier Ltd. All rights reserved.
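
    The central idea of this abstract, multi-point crossover performed on a binary encoding and mutation performed on the decimal (real) value, can be sketched as follows in Python. This is an illustrative reading of the abstract, not the HEGA implementation: the 16-bit resolution, the uniform-reset mutation, and all function names are assumptions.

```python
import random

BITS = 16  # assumed resolution of the binary encoding


def encode(x, lo, hi):
    """Map a real value in [lo, hi] to a BITS-wide bit string."""
    step = round((x - lo) / (hi - lo) * (2**BITS - 1))
    return format(step, f"0{BITS}b")


def decode(bits, lo, hi):
    """Inverse of encode, up to quantization error."""
    return lo + int(bits, 2) / (2**BITS - 1) * (hi - lo)


def multipoint_crossover(a, b, n_points, rng):
    """Exchange alternating segments of two equal-length bit strings."""
    cuts = sorted(rng.sample(range(1, len(a)), n_points))
    child1, child2, swap, prev = [], [], False, 0
    for cut in cuts + [len(a)]:
        seg_a, seg_b = a[prev:cut], b[prev:cut]
        if swap:
            seg_a, seg_b = seg_b, seg_a
        child1.append(seg_a)
        child2.append(seg_b)
        swap, prev = not swap, cut
    return "".join(child1), "".join(child2)


def decimal_mutation(x, lo, hi, rate, rng):
    """With probability `rate`, replace x by a uniform draw from [lo, hi]."""
    return rng.uniform(lo, hi) if rng.random() < rate else x


def hybrid_offspring(x1, x2, lo, hi, rng, n_points=3, rate=0.1):
    """Crossover in the binary code, then mutation in the decimal code."""
    c1, c2 = multipoint_crossover(encode(x1, lo, hi), encode(x2, lo, hi), n_points, rng)
    return tuple(decimal_mutation(decode(c, lo, hi), lo, hi, rate, rng) for c in (c1, c2))


rng = random.Random(1)
print(hybrid_offspring(2.5, 7.5, 0.0, 10.0, rng))
```

    For a model with several unknowns, the same operators would be applied per parameter or to the concatenated bit string. The abstract does not say which layout the authors chose.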

  8. Prediction of apoptosis protein locations with genetic algorithms and support vector machines through a new mode of pseudo amino acid composition.

    PubMed

    Kandaswamy, Krishna Kumar; Pugalenthi, Ganesan; Möller, Steffen; Hartmann, Enno; Kalies, Kai-Uwe; Suganthan, P N; Martinetz, Thomas

    2010-12-01

    Apoptosis is an essential process for controlling tissue homeostasis by regulating a physiological balance between cell proliferation and cell death. The subcellular locations of proteins performing the cell death are determined by mostly independent cellular mechanisms. The regular bioinformatics tools to predict the subcellular locations of such apoptotic proteins often fail. This work proposes a model for the sorting of proteins that are involved in apoptosis, allowing both the prediction of their subcellular locations and the identification of the molecular properties that contribute to it. We report a novel hybrid Genetic Algorithm (GA)/Support Vector Machine (SVM) approach to predict apoptotic protein sequences using 119 sequence-derived properties like frequency of amino acid groups, secondary structure, and physicochemical properties. GA is used for selecting a near-optimal subset of informative features that is most relevant for the classification. Jackknife cross-validation is applied to test the predictive capability of the proposed method on 317 apoptosis proteins. Our method achieved 85.80% accuracy using all 119 features and 89.91% accuracy for 25 features selected by GA. Our models were examined by a test dataset of 98 apoptosis proteins and obtained an overall accuracy of 90.34%. The results show that the proposed approach is promising; it is able to select small subsets of features and still improves the classification accuracy. Our model can contribute to the understanding of programmed cell death and drug discovery. The software and dataset are available at http://www.inb.uni-luebeck.de/tools-demos/apoptosis/GASVM.

  9. Ultrasonic prediction of term birth weight in Hispanic women. Accuracy in an outpatient clinic.

    PubMed

    Nahum, Gerard G; Pham, Krystle Q; McHugh, John P

    2003-01-01

    To investigate the accuracy of ultrasonic fetal biometric algorithms for estimating term fetal weight. Ultrasonographic fetal biometric assessments were made in 74 Hispanic women who delivered at 37-42 weeks of gestation. Measurements were taken of the fetal biparietal diameter, head circumference, abdominal circumference and femur length. Twenty-seven standard fetal biometric algorithms were assessed for their accuracy in predicting fetal weight. Results were compared to those obtained by merely guessing the mean term birth weight in each case. The correlation between ultrasonically predicted and actual birth weights ranged from 0.52 to 0.79. The different ultrasonic algorithms estimated fetal weight to within +/- 8.6-15.0% (+/- 295-520 g) of actual birth weight as compared with +/- 13.6% (+/- 449 g) for guessing the mean birth weight in each case (mean +/- SD). The mean absolute prediction errors for 17 of the ultrasonic equations (63%) were superior to those obtained by guessing the mean birth weight by 3.2-5.0% (96-154 g) (P < .05). Fourteen algorithms (52%) were more accurate for predicting fetal weight to within +/- 15%, and 20 algorithms (74%) were more accurate for predicting fetal weight to within +/- 10% of actual birth weight than simply guessing the mean birth weight (P < .05). Ten ultrasonic equations (37%) showed significant utility for predicting fetal weight > 4,000 g (likelihood ratio > 5.0). Term fetal weight predictions using the majority of sonographic fetal biometric equations are more accurate, by up to 154 g and 5%, than simply guessing the population-specific mean birth weight.

  10. Characterizing the genetic structure of a forensic DNA database using a latent variable approach.

    PubMed

    Kruijver, Maarten

    2016-07-01

    Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach with three applications. First, we use the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three-person mixtures. Second, we study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach. Third, we use the model to provide evidence to support that the Dutch practice of estimating match probabilities using the Balding-Nichols formula with a native Dutch reference database and θ=0.03 is conservative. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
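
    For readers unfamiliar with the Balding-Nichols formula mentioned in the third application, the following sketch computes the single-locus conditional match probability with a coancestry correction θ. The function name and the allele frequencies used in the example are illustrative assumptions; only θ=0.03 comes from the abstract.

```python
def balding_nichols_match_prob(p_a, p_b=None, theta=0.03):
    """Single-locus genotype match probability with coancestry coefficient theta.

    If p_b is None the genotype is homozygous (A/A) with allele frequency p_a;
    otherwise it is heterozygous (A/B) with allele frequencies p_a and p_b.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_b is None:
        return (2 * theta + (1 - theta) * p_a) * (3 * theta + (1 - theta) * p_a) / denom
    return 2 * (theta + (1 - theta) * p_a) * (theta + (1 - theta) * p_b) / denom

# Illustrative allele frequencies (assumed, not taken from any database).
homo_uncorrected = balding_nichols_match_prob(0.1, theta=0.0)       # reduces to p^2
homo_corrected = balding_nichols_match_prob(0.1, theta=0.03)
het_uncorrected = balding_nichols_match_prob(0.1, 0.2, theta=0.0)   # reduces to 2*p_a*p_b
het_corrected = balding_nichols_match_prob(0.1, 0.2, theta=0.03)
```

    With θ=0 the expressions reduce to the Hardy-Weinberg proportions; a positive θ raises the match probability for these allele frequencies, which is the sense in which the correction is described as conservative.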

  11. Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis

    NASA Astrophysics Data System (ADS)

    Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan

    2017-10-01

    This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as a feature selection method for choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper will also discuss the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using parametric statistical significance tests. The evaluation process has statistically shown that the ACO-KNN algorithm is a significant improvement over the baseline algorithms. In addition, the experimental results have shown that ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain a high-quality, optimal feature subset that can represent the actual data in customer review data.
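
    The three evaluation metrics named above have standard definitions, sketched here for a binary sentiment label. The function name and the example labels are illustrative assumptions, not data from the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F-score (F1) for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Illustrative labels: 1 = positive review, 0 = negative review.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

    Here there are three true positives, one false positive and one false negative, so all three metrics equal 0.75.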

  12. Variability in Humoral Immunity to Measles Vaccine: New Developments

    PubMed Central

    Haralambieva, Iana H.; Kennedy, Richard B.; Ovsyannikova, Inna G.; Whitaker, Jennifer A.; Poland, Gregory A.

    2015-01-01

    Despite the existence of an effective measles vaccine, resurgence in measles cases in the United States and across Europe has occurred, including in individuals vaccinated with two doses of the vaccine. Host genetic factors result in inter-individual variation in measles vaccine-induced antibodies, and play a role in vaccine failure. Studies have identified HLA and non-HLA genetic influences that individually or jointly contribute to the observed variability in the humoral response to vaccination among healthy individuals. In this exciting era, new high-dimensional approaches and techniques including vaccinomics, systems biology, GWAS, epitope prediction and sophisticated bioinformatics/statistical algorithms, provide powerful tools to investigate immune response mechanisms to the measles vaccine. These might predict, on an individual basis, outcomes of acquired immunity post measles vaccination. PMID:26602762

  13. Shaping asteroid models using genetic evolution (SAGE)

    NASA Astrophysics Data System (ADS)

    Bartczak, P.; Dudziński, G.

    2018-02-01

    In this work, we present SAGE (shaping asteroid models using genetic evolution), an asteroid modelling algorithm based solely on photometric lightcurve data. It produces non-convex shapes, orientations of the rotation axes and rotational periods of asteroids. The main concept behind a genetic evolution algorithm is to produce random populations of shapes and spin-axis orientations by mutating a seed shape and iterating the process until it converges to a stable global minimum. We tested SAGE on five artificial shapes. We also modelled asteroids 433 Eros and 9 Metis, since ground truth observations for them exist, allowing us to validate the models. We compared the derived shape of Eros with the NEAR Shoemaker model and that of Metis with adaptive optics and stellar occultation observations since other models from various inversion methods were available for Metis.

  14. Global velocity constrained cloud motion prediction for short-term solar forecasting

    NASA Astrophysics Data System (ADS)

    Chen, Yanjun; Li, Wei; Zhang, Chongyang; Hu, Chuanping

    2016-09-01

    Cloud motion is the primary reason for short-term solar power output fluctuation. In this work, a new cloud motion estimation algorithm using a global velocity constraint is proposed. Compared to the most widely used Particle Image Velocimetry (PIV) algorithm, which assumes the homogeneity of motion vectors, the proposed method can capture the accurate motion vector for each cloud block, including both the motional tendency and morphological changes. Specifically, global velocity derived from PIV is first calculated, and then fine-grained cloud motion estimation can be achieved by global velocity based cloud block researching and multi-scale cloud block matching. Experimental results show that the proposed global velocity constrained cloud motion prediction achieves comparable performance to the existing PIV and filtered PIV algorithms, especially in a short prediction horizon.
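
    The idea of constraining per-block matching by a global velocity can be sketched as follows. This is a simplified single-scale illustration, not the authors' method: the global velocity is supplied directly rather than derived from PIV, the matching cost is a plain sum of absolute differences, and the function names, frame size and search radius are assumptions.

```python
def sad(prev, curr, top, left, dy, dx, size):
    """Sum of absolute differences between a block in prev and its shifted position in curr."""
    total = 0
    for r in range(size):
        for c in range(size):
            total += abs(prev[top + r][left + c] - curr[top + r + dy][left + c + dx])
    return total

def match_block(prev, curr, top, left, size, global_v, radius=1):
    """Find the block displacement, searching only within `radius` of the global velocity."""
    h, w = len(curr), len(curr[0])
    gy, gx = global_v
    best, best_cost = None, float("inf")
    for dy in range(gy - radius, gy + radius + 1):
        for dx in range(gx - radius, gx + radius + 1):
            inside = (0 <= top + dy and top + dy + size <= h
                      and 0 <= left + dx and left + dx + size <= w)
            if not inside:
                continue  # skip displacements that leave the frame
            cost = sad(prev, curr, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best

def frame(top, left, n=8, size=2):
    """Synthetic n-by-n frame with one bright square 'cloud' block."""
    img = [[0] * n for _ in range(n)]
    for r in range(top, top + size):
        for c in range(left, left + size):
            img[r][c] = 9
    return img

# The cloud moves by (dy, dx) = (1, 2); the assumed global velocity is (1, 1).
prev, curr = frame(2, 2), frame(3, 4)
motion = match_block(prev, curr, top=2, left=2, size=2, global_v=(1, 1))
```

    Restricting the search window to the neighbourhood of the global velocity keeps the per-block search small while still allowing each block to deviate from the overall motion.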

  15. Warfarin Pharmacogenetics

    PubMed Central

    Johnson, Julie A.; Cavallari, Larisa H.

    2014-01-01

    The cytochrome P450 (CYP) 2C9 and vitamin K epoxide reductase complex 1 (VKORC1) genotypes have been strongly and consistently associated with warfarin dose requirements, and dosing algorithms incorporating genetic and clinical information have been shown to be predictive of stable warfarin dose. However, clinical trials evaluating genotype-guided warfarin dosing produced mixed results, calling into question the utility of this approach. Recent trials used surrogate markers as endpoints rather than clinical endpoints, further complicating translation of the data to clinical practice. The present data do not support genetic testing to guide warfarin dosing, but in the setting where genotype data are available, use of such data in those of European ancestry is reasonable. Outcomes data are expected from an on-going trial, observational studies continue, and more work is needed to define dosing algorithms that incorporate appropriate variants in minority populations; all these will further shape guidelines and recommendations on the clinical utility of genotype-guided warfarin dosing. PMID:25282448

  16. Risk adjustment model of credit life insurance using a genetic algorithm

    NASA Astrophysics Data System (ADS)

    Saputra, A.; Sukono; Rusyaman, E.

    2018-03-01

    In managing the risk of credit life insurance, an insurance company should understand the character of the risks in order to predict future losses. Risk characteristics can be learned from a claim distribution model. There are two standard approaches to designing the distribution model of claims over the insurance period, i.e., the collective risk model and the individual risk model. In the collective risk model, a claim that arises when a risk occurs is called an individual claim, and the accumulation of individual claims during a period of insurance is called an aggregate claim. The aggregate claim model may be formed from a large model and a number of individual claims. The questions addressed are how to measure insurance risk with the premium model approach and whether this approach is appropriate for estimating the potential losses that may occur in the future. To solve this problem, a Genetic Algorithm with Roulette Wheel Selection is used.
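
    Roulette wheel selection, the selection operator named above, picks each individual with probability proportional to its fitness. A minimal sketch follows; the function name and the example fitness values are assumptions, and fitness values are assumed to be non-negative with a positive sum.

```python
import random

def roulette_wheel_select(population, fitnesses, rng=random):
    """Return one individual, chosen with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)          # spin the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick < running:
            return individual
    return population[-1]                 # guard against floating-point round-off

# Illustrative check: "b" has three times the fitness of "a".
rng = random.Random(42)
draws = [roulette_wheel_select(["a", "b"], [1.0, 3.0], rng) for _ in range(10000)]
share_b = draws.count("b") / len(draws)
```

    Over many draws `share_b` should be close to 0.75, reflecting the 1:3 fitness ratio.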

  17. Modeling and optimization of joint quality for laser transmission joint of thermoplastic using an artificial neural network and a genetic algorithm

    NASA Astrophysics Data System (ADS)

    Wang, Xiao; Zhang, Cheng; Li, Pin; Wang, Kai; Hu, Yang; Zhang, Peng; Liu, Huixia

    2012-11-01

    A central composite rotatable experimental design (CCRD) is conducted to design experiments for laser transmission joining of thermoplastic-Polycarbonate (PC). The artificial neural network was used to establish the relationships between laser transmission joining process parameters (the laser power, velocity, clamp pressure, scanning number) and joint strength and joint seam width. The developed mathematical models are tested by the analysis of variance (ANOVA) method to check their adequacy, and the effects of process parameters on the responses and the interaction effects of key process parameters on the quality are analyzed and discussed. Finally, the desirability function coupled with a genetic algorithm is used to carry out the optimization of the joint strength and joint width. The results show that the predicted results of the optimization are in good agreement with the experimental results, so this study provides an effective method to enhance the joint quality.
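
    A desirability function turns several responses into a single score that an optimizer such as a genetic algorithm can maximize. The sketch below uses the common Derringer-style one-sided forms and a geometric mean; the abstract does not state which form, limits or weights the authors used, so the function names, the limits, and the choice to treat seam width as smaller-is-better are all assumptions.

```python
def desirability_larger(y, low, target, r=1.0):
    """Larger-is-better desirability: 0 at or below `low`, 1 at or above `target`."""
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** r

def desirability_smaller(y, target, high, r=1.0):
    """Smaller-is-better desirability: 1 at or below `target`, 0 at or above `high`."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** r

def overall_desirability(ds):
    """Geometric mean of the individual desirabilities."""
    prod = 1.0
    for d in ds:
        prod *= d
    return prod ** (1.0 / len(ds))

# Illustrative values only (units and limits are assumed, not from the paper).
d_strength = desirability_larger(30.0, low=20.0, target=40.0)
d_width = desirability_smaller(1.5, target=1.0, high=2.0)
score = overall_desirability([d_strength, d_width])
```

    Because the geometric mean is zero whenever any single desirability is zero, a candidate that fails one response entirely cannot be rescued by a good value on the other.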

  18. A genetic algorithm-based job scheduling model for big data analytics.

    PubMed

    Lu, Qinghua; Li, Shanshan; Zhang, Weishan; Zhang, Lei

    Big data analytics (BDA) applications are a new category of software applications that process large amounts of data using scalable parallel processing infrastructure to obtain hidden value. Hadoop is the most mature open-source big data analytics framework, which implements the MapReduce programming model to process big data with MapReduce jobs. Big data analytics jobs are often continuous and not mutually separated. The existing work mainly focuses on executing jobs in sequence, which is often inefficient and consumes considerable energy. In this paper, we propose a genetic algorithm-based job scheduling model for big data analytics applications to improve the efficiency of big data analytics. To implement the job scheduling model, we leverage an estimation module to predict the performance of clusters when executing analytics jobs. We have evaluated the proposed job scheduling model in terms of feasibility and accuracy.

  19. Aerodynamic Optimization of a Supersonic Bending Body Projectile by a Vector-Evaluated Genetic Algorithm

    DTIC Science & Technology

    2016-12-01

    Evaluated Genetic Algorithm prepared by Justin L Paul Academy of Applied Science 24 Warren Street Concord, NH 03301 under contract W911SR...Supersonic Bending Body Projectile by a Vector-Evaluated Genetic Algorithm prepared by Justin L Paul Academy of Applied Science 24 Warren Street... Genetic Algorithm 5a. CONTRACT NUMBER W199SR-15-2-001 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Justin L Paul 5d. PROJECT

  20. Stepwise group sparse regression (SGSR): gene-set-based pharmacogenomic predictive models with stepwise selection of functional priors.

    PubMed

    Jang, In Sock; Dienstmann, Rodrigo; Margolin, Adam A; Guinney, Justin

    2015-01-01

    Complex mechanisms involving genomic aberrations in numerous proteins and pathways are believed to be a key cause of many diseases such as cancer. With recent advances in genomics, elucidating the molecular basis of cancer at a patient level is now feasible, and has led to personalized treatment strategies whereby a patient is treated according to his or her genomic profile. However, there is growing recognition that existing treatment modalities are overly simplistic, and do not fully account for the deep genomic complexity associated with sensitivity or resistance to cancer therapies. To overcome these limitations, large-scale pharmacogenomic screens of cancer cell lines--in conjunction with modern statistical learning approaches--have been used to explore the genetic underpinnings of drug response. While these analyses have demonstrated the ability to infer genetic predictors of compound sensitivity, to date most modeling approaches have been data-driven, i.e. they do not explicitly incorporate domain-specific knowledge (priors) in the process of learning a model. While a purely data-driven approach offers an unbiased perspective of the data--and may yield unexpected or novel insights--this strategy introduces challenges for both model interpretability and accuracy. In this study, we propose a novel prior-incorporated sparse regression model in which the choice of informative predictor sets is carried out by knowledge-driven priors (gene sets) in a stepwise fashion. Under regularization in a linear regression model, our algorithm is able to incorporate prior biological knowledge across the predictive variables thereby improving the interpretability of the final model with no loss--and often an improvement--in predictive performance. 
    We evaluate the performance of our algorithm compared to well-known regularization methods such as LASSO, Ridge and Elastic net regression in the Cancer Cell Line Encyclopedia (CCLE) and Genomics of Drug Sensitivity in Cancer (Sanger) pharmacogenomics datasets, demonstrating that incorporation of the biological priors selected by our model confers improved predictability and interpretability, despite far fewer predictors, over existing state-of-the-art methods.
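
    For reference, the three baseline regularization methods named above differ only in the penalty added to the least-squares loss. The standard textbook forms are shown below; the symbols (response y, design matrix X, coefficients β, tuning parameters λ, λ₁, λ₂) are notation assumed here and do not come from the abstract, which gives no formulas for the proposed SGSR model.

```latex
\begin{align*}
\text{Ridge:} \quad & \hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2 \\
\text{LASSO:} \quad & \hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \|\beta\|_1 \\
\text{Elastic net:} \quad & \hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \|\beta\|_2^2
\end{align*}
```

    The L1 term drives individual coefficients exactly to zero, which is what yields sparse models; the L2 term shrinks coefficients without eliminating them.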
