Sample records for algorithm outperforms existing

  1. On accuracy, privacy, and complexity in the identification problem

    NASA Astrophysics Data System (ADS)

    Beekhof, F.; Voloshynovskiy, S.; Koval, O.; Holotyak, T.

    2010-02-01

    This paper presents recent advances in the identification problem taking into account the accuracy, complexity and privacy leak of different decoding algorithms. Using a model of different actors from literature, we show that it is possible to use more accurate decoding algorithms using reliability information without increasing the privacy leak relative to algorithms that only use binary information. Existing algorithms from literature have been modified to take advantage of reliability information, and we show that a proposed branch-and-bound algorithm can outperform existing work, including the enhanced variants.

  2. Searching Information Sources in Networks

    DTIC Science & Technology

    2017-06-14

    SECURITY CLASSIFICATION OF: During the course of this project, we made significant progresses in multiple directions of the information detection...result on information source detection on non-tree networks; (2) The development of information source localization algorithms to detect multiple... information sources. The algorithms have provable performance guarantees and outperform existing algorithms in 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND

  3. Storyline Visualization: A Compelling Way to Understand Patterns over Time and Space

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2017-10-16

    Storyline visualization is a compelling way to understand patterns over time and space. Much effort has been spent developing efficient and aesthetically pleasing layout optimization algorithms. But what if those algorithms are optimizing the wrong things? To answer this question, we conducted a design study with different storyline layout algorithms. We found that users with our new design principles for storyline visualization outperform existing methods.

  4. A reconsideration of negative ratings for network-based recommendation

    NASA Astrophysics Data System (ADS)

    Hu, Liang; Ren, Liang; Lin, Wenbin

    2018-01-01

    Recommendation algorithms based on bipartite networks have become increasingly popular, thanks to their accuracy and flexibility. Currently, many of these methods ignore users' negative ratings. In this work, we propose a method to exploit negative ratings for the network-based inference algorithm. We find that negative ratings play a positive role regardless of sparsity of data sets. Furthermore, we improve the efficiency of our method and compare it with the state-of-the-art algorithms. Experimental results show that the present method outperforms the existing algorithms.

  5. Optimized atom position and coefficient coding for matching pursuit-based image compression.

    PubMed

    Shoa, Alireza; Shirani, Shahram

    2009-12-01

    In this paper, we propose a new encoding algorithm for matching pursuit image coding. We show that coding performance is improved when correlations between atom positions and atom coefficients are both used in encoding. We find the optimum tradeoff between efficient atom position coding and efficient atom coefficient coding and optimize the encoder parameters. Our proposed algorithm outperforms the existing coding algorithms designed for matching pursuit image coding. Additionally, we show that our algorithm results in better rate distortion performance than JPEG 2000 at low bit rates.

  6. Fringe pattern demodulation with a two-dimensional digital phase-locked loop algorithm.

    PubMed

    Gdeisat, Munther A; Burton, David R; Lalor, Michael J

    2002-09-10

    A novel technique called a two-dimensional digital phase-locked loop (DPLL) for fringe pattern demodulation is presented. This algorithm is more suitable for demodulation of fringe patterns with varying phase in two directions than the existing DPLL techniques that assume that the phase of the fringe patterns varies only in one direction. The two-dimensional DPLL technique assumes that the phase of a fringe pattern is continuous in both directions and takes advantage of the phase continuity; consequently, the algorithm has better noise performance than the existing DPLL schemes. The two-dimensional DPLL algorithm is also suitable for demodulation of fringe patterns with low sampling rates, and it outperforms the Fourier fringe analysis technique in this aspect.

  7. A hybrid frame concealment algorithm for H.264/AVC.

    PubMed

    Yan, Bo; Gharavi, Hamid

    2010-01-01

    In packet-based video transmissions, packets loss due to channel errors may result in the loss of the whole video frame. Recently, many error concealment algorithms have been proposed in order to combat channel errors; however, most of the existing algorithms can only deal with the loss of macroblocks and are not able to conceal the whole missing frame. In order to resolve this problem, in this paper, we have proposed a new hybrid motion vector extrapolation (HMVE) algorithm to recover the whole missing frame, and it is able to provide more accurate estimation for the motion vectors of the missing frame than other conventional methods. Simulation results show that it is highly effective and significantly outperforms other existing frame recovery methods.

  8. Electricity Load Forecasting Using Support Vector Regression with Memetic Algorithms

    PubMed Central

    Hu, Zhongyi; Xiong, Tao

    2013-01-01

    Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature. PMID:24459425

  9. Electricity load forecasting using support vector regression with memetic algorithms.

    PubMed

    Hu, Zhongyi; Bao, Yukun; Xiong, Tao

    2013-01-01

    Electricity load forecasting is an important issue that is widely explored and examined in power systems operation literature and commercial transactions in electricity markets literature as well. Among the existing forecasting models, support vector regression (SVR) has gained much attention. Considering the performance of SVR highly depends on its parameters; this study proposed a firefly algorithm (FA) based memetic algorithm (FA-MA) to appropriately determine the parameters of SVR forecasting model. In the proposed FA-MA algorithm, the FA algorithm is applied to explore the solution space, and the pattern search is used to conduct individual learning and thus enhance the exploitation of FA. Experimental results confirm that the proposed FA-MA based SVR model can not only yield more accurate forecasting results than the other four evolutionary algorithms based SVR models and three well-known forecasting models but also outperform the hybrid algorithms in the related existing literature.

  10. Efficient sequential and parallel algorithms for record linkage.

    PubMed

    Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar

    2014-01-01

    Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Our sequential and parallel algorithms have been tested on a real dataset of 1,083,878 records and synthetic datasets ranging in size from 50,000 to 9,000,000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm.

  11. Adaptive Residual Interpolation for Color and Multispectral Image Demosaicking †

    PubMed Central

    Kiku, Daisuke; Okutomi, Masatoshi

    2017-01-01

    Color image demosaicking for the Bayer color filter array is an essential image processing operation for acquiring high-quality color images. Recently, residual interpolation (RI)-based algorithms have demonstrated superior demosaicking performance over conventional color difference interpolation-based algorithms. In this paper, we propose adaptive residual interpolation (ARI) that improves existing RI-based algorithms by adaptively combining two RI-based algorithms and selecting a suitable iteration number at each pixel. These are performed based on a unified criterion that evaluates the validity of an RI-based algorithm. Experimental comparisons using standard color image datasets demonstrate that ARI can improve existing RI-based algorithms by more than 0.6 dB in the color peak signal-to-noise ratio and can outperform state-of-the-art algorithms based on training images. We further extend ARI for a multispectral filter array, in which more than three spectral bands are arrayed, and demonstrate that ARI can achieve state-of-the-art performance also for the task of multispectral image demosaicking. PMID:29194407

  12. Adaptive Residual Interpolation for Color and Multispectral Image Demosaicking.

    PubMed

    Monno, Yusuke; Kiku, Daisuke; Tanaka, Masayuki; Okutomi, Masatoshi

    2017-12-01

    Color image demosaicking for the Bayer color filter array is an essential image processing operation for acquiring high-quality color images. Recently, residual interpolation (RI)-based algorithms have demonstrated superior demosaicking performance over conventional color difference interpolation-based algorithms. In this paper, we propose adaptive residual interpolation (ARI) that improves existing RI-based algorithms by adaptively combining two RI-based algorithms and selecting a suitable iteration number at each pixel. These are performed based on a unified criterion that evaluates the validity of an RI-based algorithm. Experimental comparisons using standard color image datasets demonstrate that ARI can improve existing RI-based algorithms by more than 0.6 dB in the color peak signal-to-noise ratio and can outperform state-of-the-art algorithms based on training images. We further extend ARI for a multispectral filter array, in which more than three spectral bands are arrayed, and demonstrate that ARI can achieve state-of-the-art performance also for the task of multispectral image demosaicking.

  13. Two hybrid compaction algorithms for the layout optimization problem.

    PubMed

    Xiao, Ren-Bin; Xu, Yi-Chun; Amos, Martyn

    2007-01-01

    In this paper we present two new algorithms for the layout optimization problem: this concerns the placement of circular, weighted objects inside a circular container, the two objectives being to minimize imbalance of mass and to minimize the radius of the container. This problem carries real practical significance in industrial applications (such as the design of satellites), as well as being of significant theoretical interest. We present two nature-inspired algorithms for this problem, the first based on simulated annealing, and the second on particle swarm optimization. We compare our algorithms with the existing best-known algorithm, and show that our approaches out-perform it in terms of both solution quality and execution time.

  14. Pre-Scheduled and Self Organized Sleep-Scheduling Algorithms for Efficient K-Coverage in Wireless Sensor Networks

    PubMed Central

    Hwang, I-Shyan

    2017-01-01

    The K-coverage configuration that guarantees coverage of each location by at least K sensors is highly popular and is extensively used to monitor diversified applications in wireless sensor networks. Long network lifetime and high detection quality are the essentials of such K-covered sleep-scheduling algorithms. However, the existing sleep-scheduling algorithms either cause high cost or cannot preserve the detection quality effectively. In this paper, the Pre-Scheduling-based K-coverage Group Scheduling (PSKGS) and Self-Organized K-coverage Scheduling (SKS) algorithms are proposed to settle the problems in the existing sleep-scheduling algorithms. Simulation results show that our pre-scheduled-based KGS approach enhances the detection quality and network lifetime, whereas the self-organized-based SKS algorithm minimizes the computation and communication cost of the nodes and thereby is energy efficient. Besides, SKS outperforms PSKGS in terms of network lifetime and detection quality as it is self-organized. PMID:29257078

  15. Mixed Criticality Scheduling for Industrial Wireless Sensor Networks

    PubMed Central

    Jin, Xi; Xia, Changqing; Xu, Huiting; Wang, Jintao; Zeng, Peng

    2016-01-01

    Wireless sensor networks (WSNs) have been widely used in industrial systems. Their real-time performance and reliability are fundamental to industrial production. Many works have studied the two aspects, but only focus on single criticality WSNs. Mixed criticality requirements exist in many advanced applications in which different data flows have different levels of importance (or criticality). In this paper, first, we propose a scheduling algorithm, which guarantees the real-time performance and reliability requirements of data flows with different levels of criticality. The algorithm supports centralized optimization and adaptive adjustment. It is able to improve both the scheduling performance and flexibility. Then, we provide the schedulability test through rigorous theoretical analysis. We conduct extensive simulations, and the results demonstrate that the proposed scheduling algorithm and analysis significantly outperform existing ones. PMID:27589741

  16. Efficient sequential and parallel algorithms for record linkage

    PubMed Central

    Mamun, Abdullah-Al; Mi, Tian; Aseltine, Robert; Rajasekaran, Sanguthevar

    2014-01-01

    Background and objective Integrating data from multiple sources is a crucial and challenging problem. Even though there exist numerous algorithms for record linkage or deduplication, they suffer from either large time needs or restrictions on the number of datasets that they can integrate. In this paper we report efficient sequential and parallel algorithms for record linkage which handle any number of datasets and outperform previous algorithms. Methods Our algorithms employ hierarchical clustering algorithms as the basis. A key idea that we use is radix sorting on certain attributes to eliminate identical records before any further processing. Another novel idea is to form a graph that links similar records and find the connected components. Results Our sequential and parallel algorithms have been tested on a real dataset of 1 083 878 records and synthetic datasets ranging in size from 50 000 to 9 000 000 records. Our sequential algorithm runs at least two times faster, for any dataset, than the previous best-known algorithm, the two-phase algorithm using faster computation of the edit distance (TPA (FCED)). The speedups obtained by our parallel algorithm are almost linear. For example, we get a speedup of 7.5 with 8 cores (residing in a single node), 14.1 with 16 cores (residing in two nodes), and 26.4 with 32 cores (residing in four nodes). Conclusions We have compared the performance of our sequential algorithm with TPA (FCED) and found that our algorithm outperforms the previous one. The accuracy is the same as that of this previous best-known algorithm. PMID:24154837

  17. Filtered-x generalized mixed norm (FXGMN) algorithm for active noise control

    NASA Astrophysics Data System (ADS)

    Song, Pucha; Zhao, Haiquan

    2018-07-01

    The standard adaptive filtering algorithm with a single error norm exhibits slow convergence rate and poor noise reduction performance under specific environments. To overcome this drawback, a filtered-x generalized mixed norm (FXGMN) algorithm for active noise control (ANC) system is proposed. The FXGMN algorithm is developed by using a convex mixture of lp and lq norms as the cost function that it can be viewed as a generalized version of the most existing adaptive filtering algorithms, and it will reduce to a specific algorithm by choosing certain parameters. Especially, it can be used to solve the ANC under Gaussian and non-Gaussian noise environments (including impulsive noise with symmetric α -stable (SαS) distribution). To further enhance the algorithm performance, namely convergence speed and noise reduction performance, a convex combination of the FXGMN algorithm (C-FXGMN) is presented. Moreover, the computational complexity of the proposed algorithms is analyzed, and a stability condition for the proposed algorithms is provided. Simulation results show that the proposed FXGMN and C-FXGMN algorithms can achieve better convergence speed and higher noise reduction as compared to other existing algorithms under various noise input conditions, and the C-FXGMN algorithm outperforms the FXGMN.

  18. Pose estimation for augmented reality applications using genetic algorithm.

    PubMed

    Yu, Ying Kin; Wong, Kin Hong; Chang, Michael Ming Yuen

    2005-12-01

    This paper describes a genetic algorithm that tackles the pose-estimation problem in computer vision. Our genetic algorithm can find the rotation and translation of an object accurately when the three-dimensional structure of the object is given. In our implementation, each chromosome encodes both the pose and the indexes to the selected point features of the object. Instead of only searching for the pose as in the existing work, our algorithm, at the same time, searches for a set containing the most reliable feature points in the process. This mismatch filtering strategy successfully makes the algorithm more robust under the presence of point mismatches and outliers in the images. Our algorithm has been tested with both synthetic and real data with good results. The accuracy of the recovered pose is compared to the existing algorithms. Our approach outperformed the Lowe's method and the other two genetic algorithms under the presence of point mismatches and outliers. In addition, it has been used to estimate the pose of a real object. It is shown that the proposed method is applicable to augmented reality applications.

  19. An Effective Cache Algorithm for Heterogeneous Storage Systems

    PubMed Central

    Li, Yong; Feng, Dan

    2013-01-01

    Modern storage environment is commonly composed of heterogeneous storage devices. However, traditional cache algorithms exhibit performance degradation in heterogeneous storage systems because they were not designed to work with the diverse performance characteristics. In this paper, we present a new cache algorithm called HCM for heterogeneous storage systems. The HCM algorithm partitions the cache among the disks and adopts an effective scheme to balance the work across the disks. Furthermore, it applies benefit-cost analysis to choose the best allocation of cache block to improve the performance. Conducting simulations with a variety of traces and a wide range of cache size, our experiments show that HCM significantly outperforms the existing state-of-the-art storage-aware cache algorithms. PMID:24453890

  20. Data depth based clustering analysis

    DOE PAGES

    Jeong, Myeong -Hun; Cai, Yaping; Sullivan, Clair J.; ...

    2016-01-01

    Here, this paper proposes a new algorithm for identifying patterns within data, based on data depth. Such a clustering analysis has an enormous potential to discover previously unknown insights from existing data sets. Many clustering algorithms already exist for this purpose. However, most algorithms are not affine invariant. Therefore, they must operate with different parameters after the data sets are rotated, scaled, or translated. Further, most clustering algorithms, based on Euclidean distance, can be sensitive to noises because they have no global perspective. Parameter selection also significantly affects the clustering results of each algorithm. Unlike many existing clustering algorithms, themore » proposed algorithm, called data depth based clustering analysis (DBCA), is able to detect coherent clusters after the data sets are affine transformed without changing a parameter. It is also robust to noises because using data depth can measure centrality and outlyingness of the underlying data. Further, it can generate relatively stable clusters by varying the parameter. The experimental comparison with the leading state-of-the-art alternatives demonstrates that the proposed algorithm outperforms DBSCAN and HDBSCAN in terms of affine invariance, and exceeds or matches the ro-bustness to noises of DBSCAN or HDBSCAN. The robust-ness to parameter selection is also demonstrated through the case study of clustering twitter data.« less

  1. A Polygon Model for Wireless Sensor Network Deployment with Directional Sensing Areas

    PubMed Central

    Wu, Chun-Hsien; Chung, Yeh-Ching

    2009-01-01

    The modeling of the sensing area of a sensor node is essential for the deployment algorithm of wireless sensor networks (WSNs). In this paper, a polygon model is proposed for the sensor node with directional sensing area. In addition, a WSN deployment algorithm is presented with topology control and scoring mechanisms to maintain network connectivity and improve sensing coverage rate. To evaluate the proposed polygon model and WSN deployment algorithm, a simulation is conducted. The simulation results show that the proposed polygon model outperforms the existed disk model and circular sector model in terms of the maximum sensing coverage rate. PMID:22303159

  2. Integrating image quality in 2nu-SVM biometric match score fusion.

    PubMed

    Vatsa, Mayank; Singh, Richa; Noore, Afzel

    2007-10-01

    This paper proposes an intelligent 2nu-support vector machine based match score fusion algorithm to improve the performance of face and iris recognition by integrating the quality of images. The proposed algorithm applies redundant discrete wavelet transform to evaluate the underlying linear and non-linear features present in the image. A composite quality score is computed to determine the extent of smoothness, sharpness, noise, and other pertinent features present in each subband of the image. The match score and the corresponding quality score of an image are fused using 2nu-support vector machine to improve the verification performance. The proposed algorithm is experimentally validated using the FERET face database and the CASIA iris database. The verification performance and statistical evaluation show that the proposed algorithm outperforms existing fusion algorithms.

  3. Pattern-set generation algorithm for the one-dimensional multiple stock sizes cutting stock problem

    NASA Astrophysics Data System (ADS)

    Cui, Yaodong; Cui, Yi-Ping; Zhao, Zhigang

    2015-09-01

    A pattern-set generation algorithm (PSG) for the one-dimensional multiple stock sizes cutting stock problem (1DMSSCSP) is presented. The solution process contains two stages. In the first stage, the PSG solves the residual problems repeatedly to generate the patterns in the pattern set, where each residual problem is solved by the column-generation approach, and each pattern is generated by solving a single large object placement problem. In the second stage, the integer linear programming model of the 1DMSSCSP is solved using a commercial solver, where only the patterns in the pattern set are considered. The computational results of benchmark instances indicate that the PSG outperforms existing heuristic algorithms and rivals the exact algorithm in solution quality.

  4. Accelerated Path-following Iterative Shrinkage Thresholding Algorithm with Application to Semiparametric Graph Estimation

    PubMed Central

    Zhao, Tuo; Liu, Han

    2016-01-01

    We propose an accelerated path-following iterative shrinkage thresholding algorithm (APISTA) for solving high dimensional sparse nonconvex learning problems. The main difference between APISTA and the path-following iterative shrinkage thresholding algorithm (PISTA) is that APISTA exploits an additional coordinate descent subroutine to boost the computational performance. Such a modification, though simple, has profound impact: APISTA not only enjoys the same theoretical guarantee as that of PISTA, i.e., APISTA attains a linear rate of convergence to a unique sparse local optimum with good statistical properties, but also significantly outperforms PISTA in empirical benchmarks. As an application, we apply APISTA to solve a family of nonconvex optimization problems motivated by estimating sparse semiparametric graphical models. APISTA allows us to obtain new statistical recovery results which do not exist in the existing literature. Thorough numerical results are provided to back up our theory. PMID:28133430

  5. Constrained Metric Learning by Permutation Inducing Isometries.

    PubMed

    Bosveld, Joel; Mahmood, Arif; Huynh, Du Q; Noakes, Lyle

    2016-01-01

    The choice of metric critically affects the performance of classification and clustering algorithms. Metric learning algorithms attempt to improve performance, by learning a more appropriate metric. Unfortunately, most of the current algorithms learn a distance function which is not invariant to rigid transformations of images. Therefore, the distances between two images and their rigidly transformed pair may differ, leading to inconsistent classification or clustering results. We propose to constrain the learned metric to be invariant to the geometry preserving transformations of images that induce permutations in the feature space. The constraint that these transformations are isometries of the metric ensures consistent results and improves accuracy. Our second contribution is a dimension reduction technique that is consistent with the isometry constraints. Our third contribution is the formulation of the isometry constrained logistic discriminant metric learning (IC-LDML) algorithm, by incorporating the isometry constraints within the objective function of the LDML algorithm. The proposed algorithm is compared with the existing techniques on the publicly available labeled faces in the wild, viewpoint-invariant pedestrian recognition, and Toy Cars data sets. The IC-LDML algorithm has outperformed existing techniques for the tasks of face recognition, person identification, and object classification by a significant margin.

  6. Active impulsive noise control using maximum correntropy with adaptive kernel size

    NASA Astrophysics Data System (ADS)

    Lu, Lu; Zhao, Haiquan

    2017-03-01

    The active noise control (ANC) based on the principle of superposition is an attractive method to attenuate the noise signals. However, the impulsive noise in the ANC systems will degrade the performance of the controller. In this paper, a filtered-x recursive maximum correntropy (FxRMC) algorithm is proposed based on the maximum correntropy criterion (MCC) to reduce the effect of outliers. The proposed FxRMC algorithm does not requires any priori information of the noise characteristics and outperforms the filtered-x least mean square (FxLMS) algorithm for impulsive noise. Meanwhile, in order to adjust the kernel size of FxRMC algorithm online, a recursive approach is proposed through taking into account the past estimates of error signals over a sliding window. Simulation and experimental results in the context of active impulsive noise control demonstrate that the proposed algorithms achieve much better performance than the existing algorithms in various noise environments.

  7. Fragmenting networks by targeting collective influencers at a mesoscopic level.

    PubMed

    Kobayashi, Teruyoshi; Masuda, Naoki

    2016-11-25

    A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.

  8. Fragmenting networks by targeting collective influencers at a mesoscopic level

    NASA Astrophysics Data System (ADS)

    Kobayashi, Teruyoshi; Masuda, Naoki

    2016-11-01

    A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure.

  9. Fragmenting networks by targeting collective influencers at a mesoscopic level

    PubMed Central

    Kobayashi, Teruyoshi; Masuda, Naoki

    2016-01-01

    A practical approach to protecting networks against epidemic processes such as spreading of infectious diseases, malware, and harmful viral information is to remove some influential nodes beforehand to fragment the network into small components. Because determining the optimal order to remove nodes is a computationally hard problem, various approximate algorithms have been proposed to efficiently fragment networks by sequential node removal. Morone and Makse proposed an algorithm employing the non-backtracking matrix of given networks, which outperforms various existing algorithms. In fact, many empirical networks have community structure, compromising the assumption of local tree-like structure on which the original algorithm is based. We develop an immunization algorithm by synergistically combining the Morone-Makse algorithm and coarse graining of the network in which we regard a community as a supernode. In this way, we aim to identify nodes that connect different communities at a reasonable computational cost. The proposed algorithm works more efficiently than the Morone-Makse and other algorithms on networks with community structure. PMID:27886251

  10. PCA-LBG-based algorithms for VQ codebook generation

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Yang, Po-Yuan

    2015-04-01

    Vector quantisation (VQ) codebooks are generated by combining principal component analysis (PCA) algorithms with Linde-Buzo-Gray (LBG) algorithms. All training vectors are grouped according to the projected values of the principal components. The PCA-LBG-based algorithms include (1) PCA-LBG-Median, which selects the median vector of each group, (2) PCA-LBG-Centroid, which adopts the centroid vector of each group, and (3) PCA-LBG-Random, which randomly selects a vector of each group. The LBG algorithm finds a codebook based on the better vectors sent to an initial codebook by the PCA. The PCA performs an orthogonal transformation to convert a set of potentially correlated variables into a set of variables that are not linearly correlated. Because the orthogonal transformation efficiently distinguishes test image vectors, the proposed PCA-LBG-based algorithm is expected to outperform conventional algorithms in designing VQ codebooks. The experimental results confirm that the proposed PCA-LBG-based algorithms indeed obtain better results compared to existing methods reported in the literature.

  11. A new distributed systems scheduling algorithm: a swarm intelligence approach

    NASA Astrophysics Data System (ADS)

    Haghi Kashani, Mostafa; Sarvizadeh, Raheleh; Jameii, Mahdi

    2011-12-01

    The scheduling problem in distributed systems is known as an NP-complete problem, and methods based on heuristic or metaheuristic search have been proposed to obtain optimal and suboptimal solutions. The task scheduling is a key factor for distributed systems to gain better performance. In this paper, an efficient method based on memetic algorithm is developed to solve the problem of distributed systems scheduling. With regard to load balancing efficiently, Artificial Bee Colony (ABC) has been applied as local search in the proposed memetic algorithm. The proposed method has been compared to existing memetic-Based approach in which Learning Automata method has been used as local search. The results demonstrated that the proposed method outperform the above mentioned method in terms of communication cost.

  12. Underwater image enhancement through depth estimation based on random forest

    NASA Astrophysics Data System (ADS)

    Tai, Shen-Chuan; Tsai, Ting-Chou; Huang, Jyun-Han

    2017-11-01

    Light absorption and scattering in underwater environments can result in low-contrast images with a distinct color cast. This paper proposes a systematic framework for the enhancement of underwater images. Light transmission is estimated using the random forest algorithm. RGB values, luminance, color difference, blurriness, and the dark channel are treated as features in training and estimation. Transmission is calculated using an ensemble machine learning algorithm to deal with a variety of conditions encountered in underwater environments. A color compensation and contrast enhancement algorithm based on depth information was also developed with the aim of improving the visual quality of underwater images. Experimental results demonstrate that the proposed scheme outperforms existing methods with regard to subjective visual quality as well as objective measurements.

  13. Spatio-temporal colour correction of strongly degraded movies

    NASA Astrophysics Data System (ADS)

    Islam, A. B. M. Tariqul; Farup, Ivar

    2011-01-01

    The archives of motion pictures represent an important part of precious cultural heritage. Unfortunately, these cinematography collections are vulnerable to different distortions such as colour fading which is beyond the capability of photochemical restoration process. Spatial colour algorithms-Retinex and ACE provide helpful tool in restoring strongly degraded colour films but, there are some challenges associated with these algorithms. We present an automatic colour correction technique for digital colour restoration of strongly degraded movie material. The method is based upon the existing STRESS algorithm. In order to cope with the problem of highly correlated colour channels, we implemented a preprocessing step in which saturation enhancement is performed in a PCA space. Spatial colour algorithms tend to emphasize all details in the images, including dust and scratches. Surprisingly, we found that the presence of these defects does not affect the behaviour of the colour correction algorithm. Although the STRESS algorithm is already in itself more efficient than traditional spatial colour algorithms, it is still computationally expensive. To speed it up further, we went beyond the spatial domain of the frames and extended the algorithm to the temporal domain. This way, we were able to achieve an 80 percent reduction of the computational time compared to processing every single frame individually. We performed two user experiments and found that the visual quality of the resulting frames was significantly better than with existing methods. Thus, our method outperforms the existing ones in terms of both visual quality and computational efficiency.

  14. An Integrated Method Based on PSO and EDA for the Max-Cut Problem.

    PubMed

    Lin, Geng; Guan, Jian

    2016-01-01

    The max-cut problem is NP-hard combinatorial optimization problem with many real world applications. In this paper, we propose an integrated method based on particle swarm optimization and estimation of distribution algorithm (PSO-EDA) for solving the max-cut problem. The integrated algorithm overcomes the shortcomings of particle swarm optimization and estimation of distribution algorithm. To enhance the performance of the PSO-EDA, a fast local search procedure is applied. In addition, a path relinking procedure is developed to intensify the search. To evaluate the performance of PSO-EDA, extensive experiments were carried out on two sets of benchmark instances with 800 to 20,000 vertices from the literature. Computational results and comparisons show that PSO-EDA significantly outperforms the existing PSO-based and EDA-based algorithms for the max-cut problem. Compared with other best performing algorithms, PSO-EDA is able to find very competitive results in terms of solution quality.

  15. Infrared traffic image enhancement algorithm based on dark channel prior and gamma correction

    NASA Astrophysics Data System (ADS)

    Zheng, Lintao; Shi, Hengliang; Gu, Ming

    2017-07-01

    The infrared traffic image acquired by the intelligent traffic surveillance equipment has low contrast, little hierarchical differences in perceptions of image and the blurred vision effect. Therefore, infrared traffic image enhancement, being an indispensable key step, is applied to nearly all infrared imaging based traffic engineering applications. In this paper, we propose an infrared traffic image enhancement algorithm that is based on dark channel prior and gamma correction. In existing research dark channel prior, known as a famous image dehazing method, here is used to do infrared image enhancement for the first time. Initially, in the proposed algorithm, the original degraded infrared traffic image is transformed with dark channel prior as the initial enhanced result. A further adjustment based on the gamma curve is needed because initial enhanced result has lower brightness. Comprehensive validation experiments reveal that the proposed algorithm outperforms the current state-of-the-art algorithms.

  16. Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy.

    PubMed

    Tian, Yuling; Zhang, Hongxian

    2016-01-01

    For the purposes of information retrieval, users must find highly relevant documents from within a system (and often a quite large one comprised of many individual documents) based on input query. Ranking the documents according to their relevance within the system to meet user needs is a challenging endeavor, and a hot research topic-there already exist several rank-learning methods based on machine learning techniques which can generate ranking functions automatically. This paper proposes a parallel B cell algorithm, RankBCA, for rank learning which utilizes a clonal selection mechanism based on biological immunity. The novel algorithm is compared with traditional rank-learning algorithms through experimentation and shown to outperform the others in respect to accuracy, learning time, and convergence rate; taken together, the experimental results show that the proposed algorithm indeed effectively and rapidly identifies optimal ranking functions.

  17. Research on B Cell Algorithm for Learning to Rank Method Based on Parallel Strategy

    PubMed Central

    Tian, Yuling; Zhang, Hongxian

    2016-01-01

    For the purposes of information retrieval, users must find highly relevant documents from within a system (and often a quite large one comprised of many individual documents) based on input query. Ranking the documents according to their relevance within the system to meet user needs is a challenging endeavor, and a hot research topic–there already exist several rank-learning methods based on machine learning techniques which can generate ranking functions automatically. This paper proposes a parallel B cell algorithm, RankBCA, for rank learning which utilizes a clonal selection mechanism based on biological immunity. The novel algorithm is compared with traditional rank-learning algorithms through experimentation and shown to outperform the others in respect to accuracy, learning time, and convergence rate; taken together, the experimental results show that the proposed algorithm indeed effectively and rapidly identifies optimal ranking functions. PMID:27487242

  18. Performance of a cavity-method-based algorithm for the prize-collecting Steiner tree problem on graphs

    NASA Astrophysics Data System (ADS)

    Biazzo, Indaco; Braunstein, Alfredo; Zecchina, Riccardo

    2012-08-01

    We study the behavior of an algorithm derived from the cavity method for the prize-collecting steiner tree (PCST) problem on graphs. The algorithm is based on the zero temperature limit of the cavity equations and as such is formally simple (a fixed point equation resolved by iteration) and distributed (parallelizable). We provide a detailed comparison with state-of-the-art algorithms on a wide range of existing benchmarks, networks, and random graphs. Specifically, we consider an enhanced derivative of the Goemans-Williamson heuristics and the dhea solver, a branch and cut integer linear programming based approach. The comparison shows that the cavity algorithm outperforms the two algorithms in most large instances both in running time and quality of the solution. Finally we prove a few optimality properties of the solutions provided by our algorithm, including optimality under the two postprocessing procedures defined in the Goemans-Williamson derivative and global optimality in some limit cases.

  19. A Novel Algorithm Combining Finite State Method and Genetic Algorithm for Solving Crude Oil Scheduling Problem

    PubMed Central

    Duan, Qian-Qian; Yang, Gen-Ke; Pan, Chang-Chun

    2014-01-01

    A hybrid optimization algorithm combining finite state method (FSM) and genetic algorithm (GA) is proposed to solve the crude oil scheduling problem. The FSM and GA are combined to take the advantage of each method and compensate deficiencies of individual methods. In the proposed algorithm, the finite state method makes up for the weakness of GA which is poor at local searching ability. The heuristic returned by the FSM can guide the GA algorithm towards good solutions. The idea behind this is that we can generate promising substructure or partial solution by using FSM. Furthermore, the FSM can guarantee that the entire solution space is uniformly covered. Therefore, the combination of the two algorithms has better global performance than the existing GA or FSM which is operated individually. Finally, a real-life crude oil scheduling problem from the literature is used for conducting simulation. The experimental results validate that the proposed method outperforms the state-of-art GA method. PMID:24772031

  20. Protein Sub-Nuclear Localization Based on Effective Fusion Representations and Dimension Reduction Algorithm LDA.

    PubMed

    Wang, Shunfang; Liu, Shuhui

    2015-12-19

    An effective representation of a protein sequence plays a crucial role in protein sub-nuclear localization. The existing representations, such as dipeptide composition (DipC), pseudo-amino acid composition (PseAAC) and position specific scoring matrix (PSSM), are insufficient to represent protein sequence due to their single perspectives. Thus, this paper proposes two fusion feature representations of DipPSSM and PseAAPSSM to integrate PSSM with DipC and PseAAC, respectively. When constructing each fusion representation, we introduce the balance factors to value the importance of its components. The optimal values of the balance factors are sought by genetic algorithm. Due to the high dimensionality of the proposed representations, linear discriminant analysis (LDA) is used to find its important low dimensional structure, which is essential for classification and location prediction. The numerical experiments on two public datasets with KNN classifier and cross-validation tests showed that in terms of the common indexes of sensitivity, specificity, accuracy and MCC, the proposed fusing representations outperform the traditional representations in protein sub-nuclear localization, and the representation treated by LDA outperforms the untreated one.

  1. Protein Sub-Nuclear Localization Based on Effective Fusion Representations and Dimension Reduction Algorithm LDA

    PubMed Central

    Wang, Shunfang; Liu, Shuhui

    2015-01-01

    An effective representation of a protein sequence plays a crucial role in protein sub-nuclear localization. The existing representations, such as dipeptide composition (DipC), pseudo-amino acid composition (PseAAC) and position specific scoring matrix (PSSM), are insufficient to represent protein sequence due to their single perspectives. Thus, this paper proposes two fusion feature representations of DipPSSM and PseAAPSSM to integrate PSSM with DipC and PseAAC, respectively. When constructing each fusion representation, we introduce the balance factors to value the importance of its components. The optimal values of the balance factors are sought by genetic algorithm. Due to the high dimensionality of the proposed representations, linear discriminant analysis (LDA) is used to find its important low dimensional structure, which is essential for classification and location prediction. The numerical experiments on two public datasets with KNN classifier and cross-validation tests showed that in terms of the common indexes of sensitivity, specificity, accuracy and MCC, the proposed fusing representations outperform the traditional representations in protein sub-nuclear localization, and the representation treated by LDA outperforms the untreated one. PMID:26703574

  2. A novel swarm intelligence algorithm for finding DNA motifs.

    PubMed

    Lei, Chengwei; Ruan, Jianhua

    2009-01-01

    Discovering DNA motifs from co-expressed or co-regulated genes is an important step towards deciphering complex gene regulatory networks and understanding gene functions. Despite significant improvement in the last decade, it still remains one of the most challenging problems in computational molecular biology. In this work, we propose a novel motif finding algorithm that finds consensus patterns using a population-based stochastic optimisation technique called Particle Swarm Optimisation (PSO), which has been shown to be effective in optimising difficult multidimensional problems in continuous domains. We propose to use a word dissimilarity graph to remap the neighborhood structure of the solution space of DNA motifs, and propose a modification of the naive PSO algorithm to accommodate discrete variables. In order to improve efficiency, we also propose several strategies for escaping from local optima and for automatically determining the termination criteria. Experimental results on simulated challenge problems show that our method is both more efficient and more accurate than several existing algorithms. Applications to several sets of real promoter sequences also show that our approach is able to detect known transcription factor binding sites, and outperforms two of the most popular existing algorithms.

  3. Photon-efficient super-resolution laser radar

    NASA Astrophysics Data System (ADS)

    Shin, Dongeek; Shapiro, Jeffrey H.; Goyal, Vivek K.

    2017-08-01

    The resolution achieved in photon-efficient active optical range imaging systems can be low due to non-idealities such as propagation through a diffuse scattering medium. We propose a constrained optimization-based frame- work to address extremes in scarcity of photons and blurring by a forward imaging kernel. We provide two algorithms for the resulting inverse problem: a greedy algorithm, inspired by sparse pursuit algorithms; and a convex optimization heuristic that incorporates image total variation regularization. We demonstrate that our framework outperforms existing deconvolution imaging techniques in terms of peak signal-to-noise ratio. Since our proposed method is able to super-resolve depth features using small numbers of photon counts, it can be useful for observing fine-scale phenomena in remote sensing through a scattering medium and through-the-skin biomedical imaging applications.

  4. An Effective Hybrid Routing Algorithm in WSN: Ant Colony Optimization in combination with Hop Count Minimization.

    PubMed

    Jiang, Ailian; Zheng, Lihong

    2018-03-29

    Low cost, high reliability and easy maintenance are key criteria in the design of routing protocols for wireless sensor networks (WSNs). This paper investigates the existing ant colony optimization (ACO)-based WSN routing algorithms and the minimum hop count WSN routing algorithms by reviewing their strengths and weaknesses. We also consider the critical factors of WSNs, such as energy constraint of sensor nodes, network load balancing and dynamic network topology. Then we propose a hybrid routing algorithm that integrates ACO and a minimum hop count scheme. The proposed algorithm is able to find the optimal routing path with minimal total energy consumption and balanced energy consumption on each node. The algorithm has unique superiority in terms of searching for the optimal path, balancing the network load and the network topology maintenance. The WSN model and the proposed algorithm have been implemented using C++. Extensive simulation experimental results have shown that our algorithm outperforms several other WSN routing algorithms on such aspects that include the rate of convergence, the success rate in searching for global optimal solution, and the network lifetime.

  5. An Effective Hybrid Routing Algorithm in WSN: Ant Colony Optimization in combination with Hop Count Minimization

    PubMed Central

    2018-01-01

    Low cost, high reliability and easy maintenance are key criteria in the design of routing protocols for wireless sensor networks (WSNs). This paper investigates the existing ant colony optimization (ACO)-based WSN routing algorithms and the minimum hop count WSN routing algorithms by reviewing their strengths and weaknesses. We also consider the critical factors of WSNs, such as energy constraint of sensor nodes, network load balancing and dynamic network topology. Then we propose a hybrid routing algorithm that integrates ACO and a minimum hop count scheme. The proposed algorithm is able to find the optimal routing path with minimal total energy consumption and balanced energy consumption on each node. The algorithm has unique superiority in terms of searching for the optimal path, balancing the network load and the network topology maintenance. The WSN model and the proposed algorithm have been implemented using C++. Extensive simulation experimental results have shown that our algorithm outperforms several other WSN routing algorithms on such aspects that include the rate of convergence, the success rate in searching for global optimal solution, and the network lifetime. PMID:29596336

  6. Efficiently computing exact geodesic loops within finite steps.

    PubMed

    Xin, Shi-Qing; He, Ying; Fu, Chi-Wing

    2012-06-01

    Closed geodesics, or geodesic loops, are crucial to the study of differential topology and differential geometry. Although the existence and properties of closed geodesics on smooth surfaces have been widely studied in mathematics community, relatively little progress has been made on how to compute them on polygonal surfaces. Most existing algorithms simply consider the mesh as a graph and so the resultant loops are restricted only on mesh edges, which are far from the actual geodesics. This paper is the first to prove the existence and uniqueness of geodesic loop restricted on a closed face sequence; it contributes also with an efficient algorithm to iteratively evolve an initial closed path on a given mesh into an exact geodesic loop within finite steps. Our proposed algorithm takes only an O(k) space complexity and an O(mk) time complexity (experimentally), where m is the number of vertices in the region bounded by the initial loop and the resultant geodesic loop, and k is the average number of edges in the edge sequences that the evolving loop passes through. In contrast to the existing geodesic curvature flow methods which compute an approximate geodesic loop within a predefined threshold, our method is exact and can apply directly to triangular meshes without needing to solve any differential equation with a numerical solver; it can run at interactive speed, e.g., in the order of milliseconds, for a mesh with around 50K vertices, and hence, significantly outperforms existing algorithms. Actually, our algorithm could run at interactive speed even for larger meshes. Besides the complexity of the input mesh, the geometric shape could also affect the number of evolving steps, i.e., the performance. We motivate our algorithm with an interactive shape segmentation example shown later in the paper.

  7. Fuzzy logic-based approach to detecting a passive RFID tag in an outpatient clinic.

    PubMed

    Min, Daiki; Yih, Yuehwern

    2011-06-01

    This study is motivated by the observations on the data collected by radio frequency identification (RFID) readers in a pilot study, which was used to investigate the feasibility of implementing an RFID-based monitoring system in an outpatient eye clinic. The raw RFID data collected from RFID readers contain noise and missing reads, which prevent us from determining the tag location. In this paper, fuzzy logic-based algorithms are proposed to interpret the raw RFID data to extract accurate information. The proposed algorithms determine the location of an RFID tag by evaluating its possibility of presence and absence. To evaluate the performance of the proposed algorithms, numerical experiments are conducted using the data observed in the outpatient eye clinic. Experiments results showed that the proposed algorithms outperform existing static smoothing method in terms of minimizing both false positives and false negatives. Furthermore, the proposed algorithms are applied to a set of simulated data to show the robustness of the proposed algorithms at various levels of RFID reader reliability.

  8. An auto-adaptive optimization approach for targeting nonpoint source pollution control practices.

    PubMed

    Chen, Lei; Wei, Guoyuan; Shen, Zhenyao

    2015-10-21

    To solve computationally intensive and technically complex control of nonpoint source pollution, the traditional genetic algorithm was modified into an auto-adaptive pattern, and a new framework was proposed by integrating this new algorithm with a watershed model and an economic module. Although conceptually simple and comprehensive, the proposed algorithm would search automatically for those Pareto-optimality solutions without a complex calibration of optimization parameters. The model was applied in a case study in a typical watershed of the Three Gorges Reservoir area, China. The results indicated that the evolutionary process of optimization was improved due to the incorporation of auto-adaptive parameters. In addition, the proposed algorithm outperformed the state-of-the-art existing algorithms in terms of convergence ability and computational efficiency. At the same cost level, solutions with greater pollutant reductions could be identified. From a scientific viewpoint, the proposed algorithm could be extended to other watersheds to provide cost-effective configurations of BMPs.

  9. An OMIC biomarker detection algorithm TriVote and its application in methylomic biomarker detection.

    PubMed

    Xu, Cheng; Liu, Jiamei; Yang, Weifeng; Shu, Yayun; Wei, Zhipeng; Zheng, Weiwei; Feng, Xin; Zhou, Fengfeng

    2018-04-01

    Transcriptomic and methylomic patterns represent two major OMIC data sources impacted by both inheritable genetic information and environmental factors, and have been widely used as disease diagnosis and prognosis biomarkers. Modern transcriptomic and methylomic profiling technologies detect the status of tens of thousands or even millions of probing residues in the human genome, and introduce a major computational challenge for the existing feature selection algorithms. This study proposes a three-step feature selection algorithm, TriVote, to detect a subset of transcriptomic or methylomic residues with highly accurate binary classification performance. TriVote outperforms both filter and wrapper feature selection algorithms with both higher classification accuracy and smaller feature number on 17 transcriptomes and two methylomes. Biological functions of the methylome biomarkers detected by TriVote were discussed for their disease associations. An easy-to-use Python package is also released to facilitate the further applications.

  10. Entropy-aware projected Landweber reconstruction for quantized block compressive sensing of aerial imagery

    NASA Astrophysics Data System (ADS)

    Liu, Hao; Li, Kangda; Wang, Bing; Tang, Hainie; Gong, Xiaohui

    2017-01-01

    A quantized block compressive sensing (QBCS) framework, which incorporates the universal measurement, quantization/inverse quantization, entropy coder/decoder, and iterative projected Landweber reconstruction, is summarized. Under the QBCS framework, this paper presents an improved reconstruction algorithm for aerial imagery, QBCS, with entropy-aware projected Landweber (QBCS-EPL), which leverages the full-image sparse transform without Wiener filter and an entropy-aware thresholding model for wavelet-domain image denoising. Through analyzing the functional relation between the soft-thresholding factors and entropy-based bitrates for different quantization methods, the proposed model can effectively remove wavelet-domain noise of bivariate shrinkage and achieve better image reconstruction quality. For the overall performance of QBCS reconstruction, experimental results demonstrate that the proposed QBCS-EPL algorithm significantly outperforms several existing algorithms. With the experiment-driven methodology, the QBCS-EPL algorithm can obtain better reconstruction quality at a relatively moderate computational cost, which makes it more desirable for aerial imagery applications.

  11. Routing design and fleet allocation optimization of freeway service patrol: Improved results using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Sun, Xiuqiao; Wang, Jian

    2018-07-01

    Freeway service patrol (FSP), is considered to be an effective method for incident management and can help transportation agency decision-makers alter existing route coverage and fleet allocation. This paper investigates the FSP problem of patrol routing design and fleet allocation, with the objective of minimizing the overall average incident response time. While the simulated annealing (SA) algorithm and its improvements have been applied to solve this problem, they often become trapped in local optimal solution. Moreover, the issue of searching efficiency remains to be further addressed. In this paper, we employ the genetic algorithm (GA) and SA to solve the FSP problem. To maintain population diversity and avoid premature convergence, niche strategy is incorporated into the traditional genetic algorithm. We also employ elitist strategy to speed up the convergence. Numerical experiments have been conducted with the help of the Sioux Falls network. Results show that the GA slightly outperforms the dual-based greedy (DBG) algorithm, the very large-scale neighborhood searching (VLNS) algorithm, the SA algorithm and the scenario algorithm.

  12. Anomaly Detection in Large Sets of High-Dimensional Symbol Sequences

    NASA Technical Reports Server (NTRS)

    Budalakoti, Suratna; Srivastava, Ashok N.; Akella, Ram; Turkov, Eugene

    2006-01-01

    This paper addresses the problem of detecting and describing anomalies in large sets of high-dimensional symbol sequences. The approach taken uses unsupervised clustering of sequences using the normalized longest common subsequence (LCS) as a similarity measure, followed by detailed analysis of outliers to detect anomalies. As the LCS measure is expensive to compute, the first part of the paper discusses existing algorithms, such as the Hunt-Szymanski algorithm, that have low time-complexity. We then discuss why these algorithms often do not work well in practice and present a new hybrid algorithm for computing the LCS that, in our tests, outperforms the Hunt-Szymanski algorithm by a factor of five. The second part of the paper presents new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence was deemed to be an outlier. The algorithms provide a coherent description to an analyst of the anomalies in the sequence, compared to more normal sequences. The algorithms we present are general and domain-independent, so we discuss applications in related areas such as anomaly detection.

  13. High-confidence assessment of functional impact of human mitochondrial non-synonymous genome variations by APOGEE.

    PubMed

    Castellana, Stefano; Fusilli, Caterina; Mazzoccoli, Gianluigi; Biagini, Tommaso; Capocefalo, Daniele; Carella, Massimo; Vescovi, Angelo Luigi; Mazza, Tommaso

    2017-06-01

    24,189 are all the possible non-synonymous amino acid changes potentially affecting the human mitochondrial DNA. Only a tiny subset was functionally evaluated with certainty so far, while the pathogenicity of the vast majority was only assessed in-silico by software predictors. Since these tools proved to be rather incongruent, we have designed and implemented APOGEE, a machine-learning algorithm that outperforms all existing prediction methods in estimating the harmfulness of mitochondrial non-synonymous genome variations. We provide a detailed description of the underlying algorithm, of the selected and manually curated training and test sets of variants, as well as of its classification ability.

  14. MINE: Module Identification in Networks

    PubMed Central

    2011-01-01

    Background Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans. PMID:21605434

  15. Wavelength converter placement for different RWA algorithms in wavelength-routed all-optical networks

    NASA Astrophysics Data System (ADS)

    Chu, Xiaowen; Li, Bo; Chlamtac, Imrich

    2002-07-01

    Sparse wavelength conversion and appropriate routing and wavelength assignment (RWA) algorithms are the two key factors in improving the blocking performance in wavelength-routed all-optical networks. It has been shown that the optimal placement of a limited number of wavelength converters in an arbitrary mesh network is an NP complete problem. There have been various heuristic algorithms proposed in the literature, in which most of them assume that a static routing and random wavelength assignment RWA algorithm is employed. However, the existing work shows that fixed-alternate routing and dynamic routing RWA algorithms can achieve much better blocking performance. Our study in this paper further demonstrates that the wavelength converter placement and RWA algorithms are closely related in the sense that a well designed wavelength converter placement mechanism for a particular RWA algorithm might not work well with a different RWA algorithm. Therefore, the wavelength converter placement and the RWA have to be considered jointly. The objective of this paper is to investigate the wavelength converter placement problem under fixed-alternate routing algorithm and least-loaded routing algorithm. Under the fixed-alternate routing algorithm, we propose a heuristic algorithm called Minimum Blocking Probability First (MBPF) algorithm for wavelength converter placement. Under the least-loaded routing algorithm, we propose a heuristic converter placement algorithm called Weighted Maximum Segment Length (WMSL) algorithm. The objective of the converter placement algorithm is to minimize the overall blocking probability. Extensive simulation studies have been carried out over three typical mesh networks, including the 14-node NSFNET, 19-node EON and 38-node CTNET. We observe that the proposed algorithms not only outperform existing wavelength converter placement algorithms by a large margin, but they also can achieve almost the same performance comparing with full wavelength conversion under the same RWA algorithm.

  16. Adaptive reference update (ARU) algorithm. A stochastic search algorithm for efficient optimization of multi-drug cocktails

    PubMed Central

    2012-01-01

    Background Multi-target therapeutics has been shown to be effective for treating complex diseases, and currently, it is a common practice to combine multiple drugs to treat such diseases to optimize the therapeutic outcomes. However, considering the huge number of possible ways to mix multiple drugs at different concentrations, it is practically difficult to identify the optimal drug combination through exhaustive testing. Results In this paper, we propose a novel stochastic search algorithm, called the adaptive reference update (ARU) algorithm, that can provide an efficient and systematic way for optimizing multi-drug cocktails. The ARU algorithm iteratively updates the drug combination to improve its response, where the update is made by comparing the response of the current combination with that of a reference combination, based on which the beneficial update direction is predicted. The reference combination is continuously updated based on the drug response values observed in the past, thereby adapting to the underlying drug response function. To demonstrate the effectiveness of the proposed algorithm, we evaluated its performance based on various multi-dimensional drug functions and compared it with existing algorithms. Conclusions Simulation results show that the ARU algorithm significantly outperforms existing stochastic search algorithms, including the Gur Game algorithm. In fact, the ARU algorithm can more effectively identify potent drug combinations and it typically spends fewer iterations for finding effective combinations. Furthermore, the ARU algorithm is robust to random fluctuations and noise in the measured drug response, which makes the algorithm well-suited for practical drug optimization applications. PMID:23134742

  17. Robust continuous clustering

    PubMed Central

    Shah, Sohil Atul

    2017-01-01

    Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank. PMID:28851838

  18. ROBNCA: robust network component analysis for recovering transcription factor activities.

    PubMed

    Noor, Amina; Ahmad, Aitzaz; Serpedin, Erchin; Nounou, Mohamed; Nounou, Hazem

    2013-10-01

    Network component analysis (NCA) is an efficient method of reconstructing the transcription factor activity (TFA), which makes use of the gene expression data and prior information available about transcription factor (TF)-gene regulations. Most of the contemporary algorithms either exhibit the drawback of inconsistency and poor reliability, or suffer from prohibitive computational complexity. In addition, the existing algorithms do not possess the ability to counteract the presence of outliers in the microarray data. Hence, robust and computationally efficient algorithms are needed to enable practical applications. We propose ROBust Network Component Analysis (ROBNCA), a novel iterative algorithm that explicitly models the possible outliers in the microarray data. An attractive feature of the ROBNCA algorithm is the derivation of a closed form solution for estimating the connectivity matrix, which was not available in prior contributions. The ROBNCA algorithm is compared with FastNCA and the non-iterative NCA (NI-NCA). ROBNCA estimates the TF activity profiles as well as the TF-gene control strength matrix with a much higher degree of accuracy than FastNCA and NI-NCA, irrespective of varying noise, correlation and/or amount of outliers in case of synthetic data. The ROBNCA algorithm is also tested on Saccharomyces cerevisiae data and Escherichia coli data, and it is observed to outperform the existing algorithms. The run time of the ROBNCA algorithm is comparable with that of FastNCA, and is hundreds of times faster than NI-NCA. The ROBNCA software is available at http://people.tamu.edu/∼amina/ROBNCA

  19. Incrementing data quality of multi-frequency echograms using the Adaptive Wiener Filter (AWF) denoising algorithm

    NASA Astrophysics Data System (ADS)

    Peña, M.

    2016-10-01

    Achieving acceptable signal-to-noise ratio (SNR) can be difficult when working in sparsely populated waters and/or when species have low scattering such as fluid filled animals. The increasing use of higher frequencies and the study of deeper depths in fisheries acoustics, as well as the use of commercial vessels, is raising the need to employ good denoising algorithms. The use of a lower Sv threshold to remove noise or unwanted targets is not suitable in many cases and increases the relative background noise component in the echogram, demanding more effectiveness from denoising algorithms. The Adaptive Wiener Filter (AWF) denoising algorithm is presented in this study. The technique is based on the AWF commonly used in digital photography and video enhancement. The algorithm firstly increments the quality of the data with a variance-dependent smoothing, before estimating the noise level as the envelope of the Sv minima. The AWF denoising algorithm outperforms existing algorithms in the presence of gaussian, speckle and salt & pepper noise, although impulse noise needs to be previously removed. Cleaned echograms present homogenous echotraces with outlined edges.

  20. Nonrigid Image Registration in Digital Subtraction Angiography Using Multilevel B-Spline

    PubMed Central

    2013-01-01

    We address the problem of motion artifact reduction in digital subtraction angiography (DSA) using image registration techniques. Most of registration algorithms proposed for application in DSA, have been designed for peripheral and cerebral angiography images in which we mainly deal with global rigid motions. These algorithms did not yield good results when applied to coronary angiography images because of complex nonrigid motions that exist in this type of angiography images. Multiresolution and iterative algorithms are proposed to cope with this problem, but these algorithms are associated with high computational cost which makes them not acceptable for real-time clinical applications. In this paper we propose a nonrigid image registration algorithm for coronary angiography images that is significantly faster than multiresolution and iterative blocking methods and outperforms competing algorithms evaluated on the same data sets. This algorithm is based on a sparse set of matched feature point pairs and the elastic registration is performed by means of multilevel B-spline image warping. Experimental results with several clinical data sets demonstrate the effectiveness of our approach. PMID:23971026

  1. A comprehensive performance evaluation on the prediction results of existing cooperative transcription factors identification algorithms.

    PubMed

    Lai, Fu-Jou; Chang, Hong-Tsun; Huang, Yueh-Min; Wu, Wei-Sheng

    2014-01-01

    Eukaryotic transcriptional regulation is known to be highly connected through the networks of cooperative transcription factors (TFs). Measuring the cooperativity of TFs is helpful for understanding the biological relevance of these TFs in regulating genes. The recent advances in computational techniques led to various predictions of cooperative TF pairs in yeast. As each algorithm integrated different data resources and was developed based on different rationales, it possessed its own merit and claimed outperforming others. However, the claim was prone to subjectivity because each algorithm compared with only a few other algorithms and only used a small set of performance indices for comparison. This motivated us to propose a series of indices to objectively evaluate the prediction performance of existing algorithms. And based on the proposed performance indices, we conducted a comprehensive performance evaluation. We collected 14 sets of predicted cooperative TF pairs (PCTFPs) in yeast from 14 existing algorithms in the literature. Using the eight performance indices we adopted/proposed, the cooperativity of each PCTFP was measured and a ranking score according to the mean cooperativity of the set was given to each set of PCTFPs under evaluation for each performance index. It was seen that the ranking scores of a set of PCTFPs vary with different performance indices, implying that an algorithm used in predicting cooperative TF pairs is of strength somewhere but may be of weakness elsewhere. We finally made a comprehensive ranking for these 14 sets. The results showed that Wang J's study obtained the best performance evaluation on the prediction of cooperative TF pairs in yeast. In this study, we adopted/proposed eight performance indices to make a comprehensive performance evaluation on the prediction results of 14 existing cooperative TFs identification algorithms. Most importantly, these proposed indices can be easily applied to measure the performance of new algorithms developed in the future, thus expedite progress in this research field.

  2. Scalable Parallel Density-based Clustering and Applications

    NASA Astrophysics Data System (ADS)

    Patwary, Mostofa Ali

    2014-04-01

    Recently, density-based clustering algorithms (DBSCAN and OPTICS) have gotten significant attention of the scientific community due to their unique capability of discovering arbitrary shaped clusters and eliminating noise data. These algorithms have several applications, which require high performance computing, including finding halos and subhalos (clusters) from massive cosmology data in astrophysics, analyzing satellite images, X-ray crystallography, and anomaly detection. However, parallelization of these algorithms are extremely challenging as they exhibit inherent sequential data access order, unbalanced workload resulting in low parallel efficiency. To break the data access sequentiality and to achieve high parallelism, we develop new parallel algorithms, both for DBSCAN and OPTICS, designed using graph algorithmic techniques. For example, our parallel DBSCAN algorithm exploits the similarities between DBSCAN and computing connected components. Using datasets containing up to a billion floating point numbers, we show that our parallel density-based clustering algorithms significantly outperform the existing algorithms, achieving speedups up to 27.5 on 40 cores on shared memory architecture and speedups up to 5,765 using 8,192 cores on distributed memory architecture. In our experiments, we found that while achieving the scalability, our algorithms produce clustering results with comparable quality to the classical algorithms.

  3. A fast algorithm to compute precise type-2 centroids for real-time control applications.

    PubMed

    Chakraborty, Sumantra; Konar, Amit; Ralescu, Anca; Pal, Nikhil R

    2015-02-01

    An interval type-2 fuzzy set (IT2 FS) is characterized by its upper and lower membership functions containing all possible embedded fuzzy sets, which together is referred to as the footprint of uncertainty (FOU). The FOU results in a span of uncertainty measured in the defuzzified space and is determined by the positional difference of the centroids of all the embedded fuzzy sets taken together. This paper provides a closed-form formula to evaluate the span of uncertainty of an IT2 FS. The closed-form formula offers a precise measurement of the degree of uncertainty in an IT2 FS with a runtime complexity less than that of the classical iterative Karnik-Mendel algorithm and other formulations employing the iterative Newton-Raphson algorithm. This paper also demonstrates a real-time control application using the proposed closed-form formula of centroids with reduced root mean square error and computational overhead than those of the existing methods. Computer simulations for this real-time control application indicate that parallel realization of the IT2 defuzzification outperforms its competitors with respect to maximum overshoot even at high sampling rates. Furthermore, in the presence of measurement noise in system (plant) states, the proposed IT2 FS based scheme outperforms its type-1 counterpart with respect to peak overshoot and root mean square error in plant response.

  4. Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

    NASA Astrophysics Data System (ADS)

    Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; Govind, Niranjan; Yang, Chao

    2017-12-01

    We present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by a modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.

  5. Algorithm aversion: people erroneously avoid algorithms after seeing them err.

    PubMed

    Dietvorst, Berkeley J; Simmons, Joseph P; Massey, Cade

    2015-02-01

    Research shows that evidence-based algorithms more accurately predict the future than do human forecasters. Yet when forecasters are deciding whether to use a human forecaster or a statistical algorithm, they often choose the human forecaster. This phenomenon, which we call algorithm aversion, is costly, and it is important to understand its causes. We show that people are especially averse to algorithmic forecasters after seeing them perform, even when they see them outperform a human forecaster. This is because people more quickly lose confidence in algorithmic than human forecasters after seeing them make the same mistake. In 5 studies, participants either saw an algorithm make forecasts, a human make forecasts, both, or neither. They then decided whether to tie their incentives to the future predictions of the algorithm or the human. Participants who saw the algorithm perform were less confident in it, and less likely to choose it over an inferior human forecaster. This was true even among those who saw the algorithm outperform the human.

  6. Machine Learning for Social Services: A Study of Prenatal Case Management in Illinois.

    PubMed

    Pan, Ian; Nolan, Laura B; Brown, Rashida R; Khan, Romana; van der Boor, Paul; Harris, Daniel G; Ghani, Rayid

    2017-06-01

    To evaluate the positive predictive value of machine learning algorithms for early assessment of adverse birth risk among pregnant women as a means of improving the allocation of social services. We used administrative data for 6457 women collected by the Illinois Department of Human Services from July 2014 to May 2015 to develop a machine learning model for adverse birth prediction and improve upon the existing paper-based risk assessment. We compared different models and determined the strongest predictors of adverse birth outcomes using positive predictive value as the metric for selection. Machine learning algorithms performed similarly, outperforming the current paper-based risk assessment by up to 36%; a refined paper-based assessment outperformed the current assessment by up to 22%. We estimate that these improvements will allow 100 to 170 additional high-risk pregnant women screened for program eligibility each year to receive services that would have otherwise been unobtainable. Our analysis exhibits the potential for machine learning to move government agencies toward a more data-informed approach to evaluating risk and providing social services. Overall, such efforts will improve the efficiency of allocating resource-intensive interventions.

  7. Hierarchical heuristic search using a Gaussian mixture model for UAV coverage planning.

    PubMed

    Lin, Lanny; Goodrich, Michael A

    2014-12-01

    During unmanned aerial vehicle (UAV) search missions, efficient use of UAV flight time requires flight paths that maximize the probability of finding the desired subject. The probability of detecting the desired subject based on UAV sensor information can vary in different search areas due to environment elements like varying vegetation density or lighting conditions, making it likely that the UAV can only partially detect the subject. This adds another dimension of complexity to the already difficult (NP-Hard) problem of finding an optimal search path. We present a new class of algorithms that account for partial detection in the form of a task difficulty map and produce paths that approximate the payoff of optimal solutions. The algorithms use the mode goodness ratio heuristic that uses a Gaussian mixture model to prioritize search subregions. The algorithms search for effective paths through the parameter space at different levels of resolution. We compare the performance of the new algorithms against two published algorithms (Bourgault's algorithm and LHC-GW-CONV algorithm) in simulated searches with three real search and rescue scenarios, and show that the new algorithms outperform existing algorithms significantly and can yield efficient paths that yield payoffs near the optimal.

  8. An effective and efficient compression algorithm for ECG signals with irregular periods.

    PubMed

    Chou, Hsiao-Hsuan; Chen, Ying-Jui; Shiau, Yu-Chien; Kuo, Te-Son

    2006-06-01

    This paper presents an effective and efficient preprocessing algorithm for two-dimensional (2-D) electrocardiogram (ECG) compression to better compress irregular ECG signals by exploiting their inter- and intra-beat correlations. To better reveal the correlation structure, we first convert the ECG signal into a proper 2-D representation, or image. This involves a few steps including QRS detection and alignment, period sorting, and length equalization. The resulting 2-D ECG representation is then ready to be compressed by an appropriate image compression algorithm. We choose the state-of-the-art JPEG2000 for its high efficiency and flexibility. In this way, the proposed algorithm is shown to outperform some existing arts in the literature by simultaneously achieving high compression ratio (CR), low percent root mean squared difference (PRD), low maximum error (MaxErr), and low standard derivation of errors (StdErr). In particular, because the proposed period sorting method rearranges the detected heartbeats into a smoother image that is easier to compress, this algorithm is insensitive to irregular ECG periods. Thus either the irregular ECG signals or the QRS false-detection cases can be better compressed. This is a significant improvement over existing 2-D ECG compression methods. Moreover, this algorithm is not tied exclusively to JPEG2000. It can also be combined with other 2-D preprocessing methods or appropriate codecs to enhance the compression performance in irregular ECG cases.

  9. An Energy-Efficient Game-Theory-Based Spectrum Decision Scheme for Cognitive Radio Sensor Networks

    PubMed Central

    Salim, Shelly; Moh, Sangman

    2016-01-01

    A cognitive radio sensor network (CRSN) is a wireless sensor network in which sensor nodes are equipped with cognitive radio. In this paper, we propose an energy-efficient game-theory-based spectrum decision (EGSD) scheme for CRSNs to prolong the network lifetime. Note that energy efficiency is the most important design consideration in CRSNs because it determines the network lifetime. The central part of the EGSD scheme consists of two spectrum selection algorithms: random selection and game-theory-based selection. The EGSD scheme also includes a clustering algorithm, spectrum characterization with a Markov chain, and cluster member coordination. Our performance study shows that EGSD outperforms the existing popular framework in terms of network lifetime and coordination overhead. PMID:27376290

  10. An Energy-Efficient Game-Theory-Based Spectrum Decision Scheme for Cognitive Radio Sensor Networks.

    PubMed

    Salim, Shelly; Moh, Sangman

    2016-06-30

    A cognitive radio sensor network (CRSN) is a wireless sensor network in which sensor nodes are equipped with cognitive radio. In this paper, we propose an energy-efficient game-theory-based spectrum decision (EGSD) scheme for CRSNs to prolong the network lifetime. Note that energy efficiency is the most important design consideration in CRSNs because it determines the network lifetime. The central part of the EGSD scheme consists of two spectrum selection algorithms: random selection and game-theory-based selection. The EGSD scheme also includes a clustering algorithm, spectrum characterization with a Markov chain, and cluster member coordination. Our performance study shows that EGSD outperforms the existing popular framework in terms of network lifetime and coordination overhead.

  11. Inferring Gene Regulatory Networks by Singular Value Decomposition and Gravitation Field Algorithm

    PubMed Central

    Zheng, Ming; Wu, Jia-nan; Huang, Yan-xin; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

    2012-01-01

    Reconstruction of gene regulatory networks (GRNs) is of utmost interest and has become a challenge computational problem in system biology. However, every existing inference algorithm from gene expression profiles has its own advantages and disadvantages. In particular, the effectiveness and efficiency of every previous algorithm is not high enough. In this work, we proposed a novel inference algorithm from gene expression data based on differential equation model. In this algorithm, two methods were included for inferring GRNs. Before reconstructing GRNs, singular value decomposition method was used to decompose gene expression data, determine the algorithm solution space, and get all candidate solutions of GRNs. In these generated family of candidate solutions, gravitation field algorithm was modified to infer GRNs, used to optimize the criteria of differential equation model, and search the best network structure result. The proposed algorithm is validated on both the simulated scale-free network and real benchmark gene regulatory network in networks database. Both the Bayesian method and the traditional differential equation model were also used to infer GRNs, and the results were used to compare with the proposed algorithm in our work. And genetic algorithm and simulated annealing were also used to evaluate gravitation field algorithm. The cross-validation results confirmed the effectiveness of our algorithm, which outperforms significantly other previous algorithms. PMID:23226565

  12. Text Extraction from Scene Images by Character Appearance and Structure Modeling

    PubMed Central

    Yi, Chucai; Tian, Yingli

    2012-01-01

    In this paper, we propose a novel algorithm to detect text information from natural scene images. Scene text classification and detection are still open research topics. Our proposed algorithm is able to model both character appearance and structure to generate representative and discriminative text descriptors. The contributions of this paper include three aspects: 1) a new character appearance model by a structure correlation algorithm which extracts discriminative appearance features from detected interest points of character samples; 2) a new text descriptor based on structons and correlatons, which model character structure by structure differences among character samples and structure component co-occurrence; and 3) a new text region localization method by combining color decomposition, character contour refinement, and string line alignment to localize character candidates and refine detected text regions. We perform three groups of experiments to evaluate the effectiveness of our proposed algorithm, including text classification, text detection, and character identification. The evaluation results on benchmark datasets demonstrate that our algorithm achieves the state-of-the-art performance on scene text classification and detection, and significantly outperforms the existing algorithms for character identification. PMID:23316111

  13. A threshold-based fixed predictor for JPEG-LS image compression

    NASA Astrophysics Data System (ADS)

    Deng, Lihua; Huang, Zhenghua; Yao, Shoukui

    2018-03-01

    In JPEG-LS, fixed predictor based on median edge detector (MED) only detect horizontal and vertical edges, and thus produces large prediction errors in the locality of diagonal edges. In this paper, we propose a threshold-based edge detection scheme for the fixed predictor. The proposed scheme can detect not only the horizontal and vertical edges, but also diagonal edges. For some certain thresholds, the proposed scheme can be simplified to other existing schemes. So, it can also be regarded as the integration of these existing schemes. For a suitable threshold, the accuracy of horizontal and vertical edges detection is higher than the existing median edge detection in JPEG-LS. Thus, the proposed fixed predictor outperforms the existing JPEG-LS predictors for all images tested, while the complexity of the overall algorithm is maintained at a similar level.

  14. An effective image classification method with the fusion of invariant feature and a new color descriptor

    NASA Astrophysics Data System (ADS)

    Mansourian, Leila; Taufik Abdullah, Muhamad; Nurliyana Abdullah, Lili; Azman, Azreen; Mustaffa, Mas Rina

    2017-02-01

    Pyramid Histogram of Words (PHOW), combined Bag of Visual Words (BoVW) with the spatial pyramid matching (SPM) in order to add location information to extracted features. However, different PHOW extracted from various color spaces, and they did not extract color information individually, that means they discard color information, which is an important characteristic of any image that is motivated by human vision. This article, concatenated PHOW Multi-Scale Dense Scale Invariant Feature Transform (MSDSIFT) histogram and a proposed Color histogram to improve the performance of existing image classification algorithms. Performance evaluation on several datasets proves that the new approach outperforms other existing, state-of-the-art methods.

  15. Robust dynamic myocardial perfusion CT deconvolution using adaptive-weighted tensor total variation regularization

    NASA Astrophysics Data System (ADS)

    Gong, Changfei; Zeng, Dong; Bian, Zhaoying; Huang, Jing; Zhang, Xinyu; Zhang, Hua; Lu, Lijun; Feng, Qianjin; Liang, Zhengrong; Ma, Jianhua

    2016-03-01

    Dynamic myocardial perfusion computed tomography (MPCT) is a promising technique for diagnosis and risk stratification of coronary artery disease by assessing the myocardial perfusion hemodynamic maps (MPHM). Meanwhile, the repeated scanning of the same region results in a relatively large radiation dose to patients potentially. In this work, we present a robust MPCT deconvolution algorithm with adaptive-weighted tensor total variation regularization to estimate residue function accurately under the low-dose context, which is termed `MPD-AwTTV'. More specifically, the AwTTV regularization takes into account the anisotropic edge property of the MPCT images compared with the conventional total variation (TV) regularization, which can mitigate the drawbacks of TV regularization. Subsequently, an effective iterative algorithm was adopted to minimize the associative objective function. Experimental results on a modified XCAT phantom demonstrated that the present MPD-AwTTV algorithm outperforms and is superior to other existing deconvolution algorithms in terms of noise-induced artifacts suppression, edge details preservation and accurate MPHM estimation.

  16. Historical data learning based dynamic LSP routing for overlay IP/MPLS over WDM networks

    NASA Astrophysics Data System (ADS)

    Yu, Xiaojun; Xiao, Gaoxi; Cheng, Tee Hiang

    2013-08-01

    Overlay IP/MPLS over WDM network is a promising network architecture starting to gain wide deployments recently. A desirable feature of such a network is to achieve efficient routing with limited information exchanges between the IP/MPLS and the WDM layers. This paper studies dynamic label switched path (LSP) routing in the overlay IP/MPLS over WDM networks. To enhance network performance while maintaining its simplicity, we propose to learn from the historical data of lightpath setup costs maintained by the IP-layer integrated service provider (ISP) when making routing decisions. Using a novel historical data learning scheme for logical link cost estimation, we develop a new dynamic LSP routing method named Existing Link First (ELF) algorithm. Simulation results show that the proposed algorithm significantly outperforms the existing ones under different traffic loads, with either limited or unlimited numbers of optical ports. Effects of the number of candidate routes, add/drop ratio and the amount of historical data are also evaluated.

  17. Segmentation methodology for automated classification and differentiation of soft tissues in multiband images of high-resolution ultrasonic transmission tomography.

    PubMed

    Jeong, Jeong-Won; Shin, Dae C; Do, Synho; Marmarelis, Vasilis Z

    2006-08-01

    This paper presents a novel segmentation methodology for automated classification and differentiation of soft tissues using multiband data obtained with the newly developed system of high-resolution ultrasonic transmission tomography (HUTT) for imaging biological organs. This methodology extends and combines two existing approaches: the L-level set active contour (AC) segmentation approach and the agglomerative hierarchical kappa-means approach for unsupervised clustering (UC). To prevent the trapping of the current iterative minimization AC algorithm in a local minimum, we introduce a multiresolution approach that applies the level set functions at successively increasing resolutions of the image data. The resulting AC clusters are subsequently rearranged by the UC algorithm that seeks the optimal set of clusters yielding the minimum within-cluster distances in the feature space. The presented results from Monte Carlo simulations and experimental animal-tissue data demonstrate that the proposed methodology outperforms other existing methods without depending on heuristic parameters and provides a reliable means for soft tissue differentiation in HUTT images.

  18. Algorithm for protecting light-trees in survivable mesh wavelength-division-multiplexing networks

    NASA Astrophysics Data System (ADS)

    Luo, Hongbin; Li, Lemin; Yu, Hongfang

    2006-12-01

    Wavelength-division-multiplexing (WDM) technology is expected to facilitate bandwidth-intensive multicast applications such as high-definition television. A single fiber cut in a WDM mesh network, however, can disrupt the dissemination of information to several destinations on a light-tree based multicast session. Thus it is imperative to protect multicast sessions by reserving redundant resources. We propose a novel and efficient algorithm for protecting light-trees in survivable WDM mesh networks. The algorithm is called segment-based protection with sister node first (SSNF), whose basic idea is to protect a light-tree using a set of backup segments with a higher priority to protect the segments from a branch point to its children (sister nodes). The SSNF algorithm differs from the segment protection scheme proposed in the literature in how the segments are identified and protected. Our objective is to minimize the network resources used for protecting each primary light-tree such that the blocking probability can be minimized. To verify the effectiveness of the SSNF algorithm, we conduct extensive simulation experiments. The simulation results demonstrate that the SSNF algorithm outperforms existing algorithms for the same problem.

  19. Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue

    Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less

  20. Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue

    In this article, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less

  1. Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

    DOE PAGES

    Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...

    2017-12-01

    In this article, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. The solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less

  2. Efficient block preconditioned eigensolvers for linear response time-dependent density functional theory

    DOE PAGES

    Vecharynski, Eugene; Brabec, Jiri; Shao, Meiyue; ...

    2017-08-24

    Within this paper, we present two efficient iterative algorithms for solving the linear response eigenvalue problem arising from the time dependent density functional theory. Although the matrix to be diagonalized is nonsymmetric, it has a special structure that can be exploited to save both memory and floating point operations. In particular, the nonsymmetric eigenvalue problem can be transformed into an eigenvalue problem that involves the product of two matrices M and K. We show that, because MK is self-adjoint with respect to the inner product induced by the matrix K, this product eigenvalue problem can be solved efficiently by amore » modified Davidson algorithm and a modified locally optimal block preconditioned conjugate gradient (LOBPCG) algorithm that make use of the K-inner product. Additionally, the solution of the product eigenvalue problem yields one component of the eigenvector associated with the original eigenvalue problem. We show that the other component of the eigenvector can be easily recovered in an inexpensive postprocessing procedure. As a result, the algorithms we present here become more efficient than existing methods that try to approximate both components of the eigenvectors simultaneously. In particular, our numerical experiments demonstrate that the new algorithms presented here consistently outperform the existing state-of-the-art Davidson type solvers by a factor of two in both solution time and storage.« less

  3. Acoustic diagnosis of pulmonary hypertension: automated speech- recognition-inspired classification algorithm outperforms physicians

    NASA Astrophysics Data System (ADS)

    Kaddoura, Tarek; Vadlamudi, Karunakar; Kumar, Shine; Bobhate, Prashant; Guo, Long; Jain, Shreepal; Elgendi, Mohamed; Coe, James Y.; Kim, Daniel; Taylor, Dylan; Tymchak, Wayne; Schuurmans, Dale; Zemp, Roger J.; Adatia, Ian

    2016-09-01

    We hypothesized that an automated speech- recognition-inspired classification algorithm could differentiate between the heart sounds in subjects with and without pulmonary hypertension (PH) and outperform physicians. Heart sounds, electrocardiograms, and mean pulmonary artery pressures (mPAp) were recorded simultaneously. Heart sound recordings were digitized to train and test speech-recognition-inspired classification algorithms. We used mel-frequency cepstral coefficients to extract features from the heart sounds. Gaussian-mixture models classified the features as PH (mPAp ≥ 25 mmHg) or normal (mPAp < 25 mmHg). Physicians blinded to patient data listened to the same heart sound recordings and attempted a diagnosis. We studied 164 subjects: 86 with mPAp ≥ 25 mmHg (mPAp 41 ± 12 mmHg) and 78 with mPAp < 25 mmHg (mPAp 17 ± 5 mmHg) (p  < 0.005). The correct diagnostic rate of the automated speech-recognition-inspired algorithm was 74% compared to 56% by physicians (p = 0.005). The false positive rate for the algorithm was 34% versus 50% (p = 0.04) for clinicians. The false negative rate for the algorithm was 23% and 68% (p = 0.0002) for physicians. We developed an automated speech-recognition-inspired classification algorithm for the acoustic diagnosis of PH that outperforms physicians that could be used to screen for PH and encourage earlier specialist referral.

  4. Prediction of protein-protein interaction network using a multi-objective optimization approach.

    PubMed

    Chowdhury, Archana; Rakshit, Pratyusha; Konar, Amit

    2016-06-01

    Protein-Protein Interactions (PPIs) are very important as they coordinate almost all cellular processes. This paper attempts to formulate PPI prediction problem in a multi-objective optimization framework. The scoring functions for the trial solution deal with simultaneous maximization of functional similarity, strength of the domain interaction profiles, and the number of common neighbors of the proteins predicted to be interacting. The above optimization problem is solved using the proposed Firefly Algorithm with Nondominated Sorting. Experiments undertaken reveal that the proposed PPI prediction technique outperforms existing methods, including gene ontology-based Relative Specific Similarity, multi-domain-based Domain Cohesion Coupling method, domain-based Random Decision Forest method, Bagging with REP Tree, and evolutionary/swarm algorithm-based approaches, with respect to sensitivity, specificity, and F1 score.

  5. BPP: a sequence-based algorithm for branch point prediction.

    PubMed

    Zhang, Qing; Fan, Xiaodan; Wang, Yejun; Sun, Ming-An; Shao, Jianlin; Guo, Dianjing

    2017-10-15

    Although high-throughput sequencing methods have been proposed to identify splicing branch points in the human genome, these methods can only detect a small fraction of the branch points subject to the sequencing depth, experimental cost and the expression level of the mRNA. An accurate computational model for branch point prediction is therefore an ongoing objective in human genome research. We here propose a novel branch point prediction algorithm that utilizes information on the branch point sequence and the polypyrimidine tract. Using experimentally validated data, we demonstrate that our proposed method outperforms existing methods. Availability and implementation: https://github.com/zhqingit/BPP. djguo@cuhk.edu.hk. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  6. Hyperspectral face recognition with spatiospectral information fusion and PLS regression.

    PubMed

    Uzair, Muhammad; Mahmood, Arif; Mian, Ajmal

    2015-03-01

    Hyperspectral imaging offers new opportunities for face recognition via improved discrimination along the spectral dimension. However, it poses new challenges, including low signal-to-noise ratio, interband misalignment, and high data dimensionality. Due to these challenges, the literature on hyperspectral face recognition is not only sparse but is limited to ad hoc dimensionality reduction techniques and lacks comprehensive evaluation. We propose a hyperspectral face recognition algorithm using a spatiospectral covariance for band fusion and partial least square regression for classification. Moreover, we extend 13 existing face recognition techniques, for the first time, to perform hyperspectral face recognition.We formulate hyperspectral face recognition as an image-set classification problem and evaluate the performance of seven state-of-the-art image-set classification techniques. We also test six state-of-the-art grayscale and RGB (color) face recognition algorithms after applying fusion techniques on hyperspectral images. Comparison with the 13 extended and five existing hyperspectral face recognition techniques on three standard data sets show that the proposed algorithm outperforms all by a significant margin. Finally, we perform band selection experiments to find the most discriminative bands in the visible and near infrared response spectrum.

  7. Automatic identification of high impact articles in PubMed to support clinical decision making.

    PubMed

    Bian, Jiantao; Morid, Mohammad Amin; Jonnalagadda, Siddhartha; Luo, Gang; Del Fiol, Guilherme

    2017-09-01

    The practice of evidence-based medicine involves integrating the latest best available evidence into patient care decisions. Yet, critical barriers exist for clinicians' retrieval of evidence that is relevant for a particular patient from primary sources such as randomized controlled trials and meta-analyses. To help address those barriers, we investigated machine learning algorithms that find clinical studies with high clinical impact from PubMed®. Our machine learning algorithms use a variety of features including bibliometric features (e.g., citation count), social media attention, journal impact factors, and citation metadata. The algorithms were developed and evaluated with a gold standard composed of 502 high impact clinical studies that are referenced in 11 clinical evidence-based guidelines on the treatment of various diseases. We tested the following hypotheses: (1) our high impact classifier outperforms a state-of-the-art classifier based on citation metadata and citation terms, and PubMed's® relevance sort algorithm; and (2) the performance of our high impact classifier does not decrease significantly after removing proprietary features such as citation count. The mean top 20 precision of our high impact classifier was 34% versus 11% for the state-of-the-art classifier and 4% for PubMed's® relevance sort (p=0.009); and the performance of our high impact classifier did not decrease significantly after removing proprietary features (mean top 20 precision=34% vs. 36%; p=0.085). The high impact classifier, using features such as bibliometrics, social media attention and MEDLINE® metadata, outperformed previous approaches and is a promising alternative to identifying high impact studies for clinical decision support. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Comparison of crisp and fuzzy character networks in handwritten word recognition

    NASA Technical Reports Server (NTRS)

    Gader, Paul; Mohamed, Magdi; Chiang, Jung-Hsien

    1992-01-01

    Experiments involving handwritten word recognition on words taken from images of handwritten address blocks from the United States Postal Service mailstream are described. The word recognition algorithm relies on the use of neural networks at the character level. The neural networks are trained using crisp and fuzzy desired outputs. The fuzzy outputs were defined using a fuzzy k-nearest neighbor algorithm. The crisp networks slightly outperformed the fuzzy networks at the character level but the fuzzy networks outperformed the crisp networks at the word level.

  9. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  10. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  11. Modern meta-heuristics based on nonlinear physics processes: A review of models and design procedures

    NASA Astrophysics Data System (ADS)

    Salcedo-Sanz, S.

    2016-10-01

    Meta-heuristic algorithms are problem-solving methods which try to find good-enough solutions to very hard optimization problems, at a reasonable computation time, where classical approaches fail, or cannot even been applied. Many existing meta-heuristics approaches are nature-inspired techniques, which work by simulating or modeling different natural processes in a computer. Historically, many of the most successful meta-heuristic approaches have had a biological inspiration, such as evolutionary computation or swarm intelligence paradigms, but in the last few years new approaches based on nonlinear physics processes modeling have been proposed and applied with success. Non-linear physics processes, modeled as optimization algorithms, are able to produce completely new search procedures, with extremely effective exploration capabilities in many cases, which are able to outperform existing optimization approaches. In this paper we review the most important optimization algorithms based on nonlinear physics, how they have been constructed from specific modeling of a real phenomena, and also their novelty in terms of comparison with alternative existing algorithms for optimization. We first review important concepts on optimization problems, search spaces and problems' difficulty. Then, the usefulness of heuristics and meta-heuristics approaches to face hard optimization problems is introduced, and some of the main existing classical versions of these algorithms are reviewed. The mathematical framework of different nonlinear physics processes is then introduced as a preparatory step to review in detail the most important meta-heuristics based on them. A discussion on the novelty of these approaches, their main computational implementation and design issues, and the evaluation of a novel meta-heuristic based on Strange Attractors mutation will be carried out to complete the review of these techniques. We also describe some of the most important application areas, in broad sense, of meta-heuristics, and describe free-accessible software frameworks which can be used to make easier the implementation of these algorithms.

  12. Leveraging disjoint communities for detecting overlapping community structure

    NASA Astrophysics Data System (ADS)

    Chakraborty, Tanmoy

    2015-05-01

    Network communities represent mesoscopic structure for understanding the organization of real-world networks, where nodes often belong to multiple communities and form overlapping community structure in the network. Due to non-triviality in finding the exact boundary of such overlapping communities, this problem has become challenging, and therefore huge effort has been devoted to detect overlapping communities from the network. In this paper, we present PVOC (Permanence based Vertex-replication algorithm for Overlapping Community detection), a two-stage framework to detect overlapping community structure. We build on a novel observation that non-overlapping community structure detected by a standard disjoint community detection algorithm from a network has high resemblance with its actual overlapping community structure, except the overlapping part. Based on this observation, we posit that there is perhaps no need of building yet another overlapping community finding algorithm; but one can efficiently manipulate the output of any existing disjoint community finding algorithm to obtain the required overlapping structure. We propose a new post-processing technique that by combining with any existing disjoint community detection algorithm, can suitably process each vertex using a new vertex-based metric, called permanence, and thereby finds out overlapping candidates with their community memberships. Experimental results on both synthetic and large real-world networks show that PVOC significantly outperforms six state-of-the-art overlapping community detection algorithms in terms of high similarity of the output with the ground-truth structure. Thus our framework not only finds meaningful overlapping communities from the network, but also allows us to put an end to the constant effort of building yet another overlapping community detection algorithm.

  13. Assessment of Chlorophyll-a Algorithms Considering Different Trophic Statuses and Optimal Bands.

    PubMed

    Salem, Salem Ibrahim; Higa, Hiroto; Kim, Hyungjun; Kobayashi, Hiroshi; Oki, Kazuo; Oki, Taikan

    2017-07-31

    Numerous algorithms have been proposed to retrieve chlorophyll- a concentrations in Case 2 waters; however, the retrieval accuracy is far from satisfactory. In this research, seven algorithms are assessed with different band combinations of multispectral and hyperspectral bands using linear (LN), quadratic polynomial (QP) and power (PW) regression approaches, resulting in altogether 43 algorithmic combinations. These algorithms are evaluated by using simulated and measured datasets to understand the strengths and limitations of these algorithms. Two simulated datasets comprising 500,000 reflectance spectra each, both based on wide ranges of inherent optical properties (IOPs), are generated for the calibration and validation stages. Results reveal that the regression approach (i.e., LN, QP, and PW) has more influence on the simulated dataset than on the measured one. The algorithms that incorporated linear regression provide the highest retrieval accuracy for the simulated dataset. Results from simulated datasets reveal that the 3-band (3b) algorithm that incorporate 665-nm and 680-nm bands and band tuning selection approach outperformed other algorithms with root mean square error (RMSE) of 15.87 mg·m -3 , 16.25 mg·m -3 , and 19.05 mg·m -3 , respectively. The spatial distribution of the best performing algorithms, for various combinations of chlorophyll- a (Chla) and non-algal particles (NAP) concentrations, show that the 3b_tuning_QP and 3b_680_QP outperform other algorithms in terms of minimum RMSE frequency of 33.19% and 60.52%, respectively. However, the two algorithms failed to accurately retrieve Chla for many combinations of Chla and NAP, particularly for low Chla and NAP concentrations. In addition, the spatial distribution emphasizes that no single algorithm can provide outstanding accuracy for Chla retrieval and that multi-algorithms should be included to reduce the error. Comparing the results of the measured and simulated datasets reveal that the algorithms that incorporate the 665-nm band outperform other algorithms for measured dataset (RMSE = 36.84 mg·m -3 ), while algorithms that incorporate the band tuning approach provide the highest retrieval accuracy for the simulated dataset (RMSE = 25.05 mg·m -3 ).

  14. Assessment of Chlorophyll-a Algorithms Considering Different Trophic Statuses and Optimal Bands

    PubMed Central

    Higa, Hiroto; Kobayashi, Hiroshi; Oki, Kazuo

    2017-01-01

    Numerous algorithms have been proposed to retrieve chlorophyll-a concentrations in Case 2 waters; however, the retrieval accuracy is far from satisfactory. In this research, seven algorithms are assessed with different band combinations of multispectral and hyperspectral bands using linear (LN), quadratic polynomial (QP) and power (PW) regression approaches, resulting in altogether 43 algorithmic combinations. These algorithms are evaluated by using simulated and measured datasets to understand the strengths and limitations of these algorithms. Two simulated datasets comprising 500,000 reflectance spectra each, both based on wide ranges of inherent optical properties (IOPs), are generated for the calibration and validation stages. Results reveal that the regression approach (i.e., LN, QP, and PW) has more influence on the simulated dataset than on the measured one. The algorithms that incorporated linear regression provide the highest retrieval accuracy for the simulated dataset. Results from simulated datasets reveal that the 3-band (3b) algorithm that incorporate 665-nm and 680-nm bands and band tuning selection approach outperformed other algorithms with root mean square error (RMSE) of 15.87 mg·m−3, 16.25 mg·m−3, and 19.05 mg·m−3, respectively. The spatial distribution of the best performing algorithms, for various combinations of chlorophyll-a (Chla) and non-algal particles (NAP) concentrations, show that the 3b_tuning_QP and 3b_680_QP outperform other algorithms in terms of minimum RMSE frequency of 33.19% and 60.52%, respectively. However, the two algorithms failed to accurately retrieve Chla for many combinations of Chla and NAP, particularly for low Chla and NAP concentrations. In addition, the spatial distribution emphasizes that no single algorithm can provide outstanding accuracy for Chla retrieval and that multi-algorithms should be included to reduce the error. Comparing the results of the measured and simulated datasets reveal that the algorithms that incorporate the 665-nm band outperform other algorithms for measured dataset (RMSE = 36.84 mg·m−3), while algorithms that incorporate the band tuning approach provide the highest retrieval accuracy for the simulated dataset (RMSE = 25.05 mg·m−3). PMID:28758984

  15. An Efficient Supervised Training Algorithm for Multilayer Spiking Neural Networks

    PubMed Central

    Xie, Xiurui; Qu, Hong; Liu, Guisong; Zhang, Malu; Kurths, Jürgen

    2016-01-01

    The spiking neural networks (SNNs) are the third generation of neural networks and perform remarkably well in cognitive tasks such as pattern recognition. The spike emitting and information processing mechanisms found in biological cognitive systems motivate the application of the hierarchical structure and temporal encoding mechanism in spiking neural networks, which have exhibited strong computational capability. However, the hierarchical structure and temporal encoding approach require neurons to process information serially in space and time respectively, which reduce the training efficiency significantly. For training the hierarchical SNNs, most existing methods are based on the traditional back-propagation algorithm, inheriting its drawbacks of the gradient diffusion and the sensitivity on parameters. To keep the powerful computation capability of the hierarchical structure and temporal encoding mechanism, but to overcome the low efficiency of the existing algorithms, a new training algorithm, the Normalized Spiking Error Back Propagation (NSEBP) is proposed in this paper. In the feedforward calculation, the output spike times are calculated by solving the quadratic function in the spike response model instead of detecting postsynaptic voltage states at all time points in traditional algorithms. Besides, in the feedback weight modification, the computational error is propagated to previous layers by the presynaptic spike jitter instead of the gradient decent rule, which realizes the layer-wised training. Furthermore, our algorithm investigates the mathematical relation between the weight variation and voltage error change, which makes the normalization in the weight modification applicable. Adopting these strategies, our algorithm outperforms the traditional SNN multi-layer algorithms in terms of learning efficiency and parameter sensitivity, that are also demonstrated by the comprehensive experimental results in this paper. PMID:27044001

  16. A Comparative Study of Frequent and Maximal Periodic Pattern Mining Algorithms in Spatiotemporal Databases

    NASA Astrophysics Data System (ADS)

    Obulesu, O.; Rama Mohan Reddy, A., Dr; Mahendra, M.

    2017-08-01

    Detecting regular and efficient cyclic models is the demanding activity for data analysts due to unstructured, vigorous and enormous raw information produced from web. Many existing approaches generate large candidate patterns in the occurrence of huge and complex databases. In this work, two novel algorithms are proposed and a comparative examination is performed by considering scalability and performance parameters. The first algorithm is, EFPMA (Extended Regular Model Detection Algorithm) used to find frequent sequential patterns from the spatiotemporal dataset and the second one is, ETMA (Enhanced Tree-based Mining Algorithm) for detecting effective cyclic models with symbolic database representation. EFPMA is an algorithm grows models from both ends (prefixes and suffixes) of detected patterns, which results in faster pattern growth because of less levels of database projection compared to existing approaches such as Prefixspan and SPADE. ETMA uses distinct notions to store and manage transactions data horizontally such as segment, sequence and individual symbols. ETMA exploits a partition-and-conquer method to find maximal patterns by using symbolic notations. Using this algorithm, we can mine cyclic models in full-series sequential patterns including subsection series also. ETMA reduces the memory consumption and makes use of the efficient symbolic operation. Furthermore, ETMA only records time-series instances dynamically, in terms of character, series and section approaches respectively. The extent of the pattern and proving efficiency of the reducing and retrieval techniques from synthetic and actual datasets is a really open & challenging mining problem. These techniques are useful in data streams, traffic risk analysis, medical diagnosis, DNA sequence Mining, Earthquake prediction applications. Extensive investigational outcomes illustrates that the algorithms outperforms well towards efficiency and scalability than ECLAT, STNR and MAFIA approaches.

  17. A hybrid genetic-simulated annealing algorithm for the location-inventory-routing problem considering returns under e-supply chain environment.

    PubMed

    Li, Yanhui; Guo, Hao; Wang, Lin; Fu, Jing

    2013-01-01

    Facility location, inventory control, and vehicle routes scheduling are critical and highly related problems in the design of logistics system for e-business. Meanwhile, the return ratio in Internet sales was significantly higher than in the traditional business. Many of returned merchandise have no quality defects, which can reenter sales channels just after a simple repackaging process. Focusing on the existing problem in e-commerce logistics system, we formulate a location-inventory-routing problem model with no quality defects returns. To solve this NP-hard problem, an effective hybrid genetic simulated annealing algorithm (HGSAA) is proposed. Results of numerical examples show that HGSAA outperforms GA on computing time, optimal solution, and computing stability. The proposed model is very useful to help managers make the right decisions under e-supply chain environment.

  18. Adaptive multi-view clustering based on nonnegative matrix factorization and pairwise co-regularization

    NASA Astrophysics Data System (ADS)

    Zhang, Tianzhen; Wang, Xiumei; Gao, Xinbo

    2018-04-01

    Nowadays, several datasets are demonstrated by multi-view, which usually include shared and complementary information. Multi-view clustering methods integrate the information of multi-view to obtain better clustering results. Nonnegative matrix factorization has become an essential and popular tool in clustering methods because of its interpretation. However, existing nonnegative matrix factorization based multi-view clustering algorithms do not consider the disagreement between views and neglects the fact that different views will have different contributions to the data distribution. In this paper, we propose a new multi-view clustering method, named adaptive multi-view clustering based on nonnegative matrix factorization and pairwise co-regularization. The proposed algorithm can obtain the parts-based representation of multi-view data by nonnegative matrix factorization. Then, pairwise co-regularization is used to measure the disagreement between views. There is only one parameter to auto learning the weight values according to the contribution of each view to data distribution. Experimental results show that the proposed algorithm outperforms several state-of-the-arts algorithms for multi-view clustering.

  19. A gossip based information fusion protocol for distributed frequent itemset mining

    NASA Astrophysics Data System (ADS)

    Sohrabi, Mohammad Karim

    2018-07-01

    The computational complexity, huge memory space requirement, and time-consuming nature of frequent pattern mining process are the most important motivations for distribution and parallelization of this mining process. On the other hand, the emergence of distributed computational and operational environments, which causes the production and maintenance of data on different distributed data sources, makes the parallelization and distribution of the knowledge discovery process inevitable. In this paper, a gossip based distributed itemset mining (GDIM) algorithm is proposed to extract frequent itemsets, which are special types of frequent patterns, in a wireless sensor network environment. In this algorithm, local frequent itemsets of each sensor are extracted using a bit-wise horizontal approach (LHPM) from the nodes which are clustered using a leach-based protocol. Heads of clusters exploit a gossip based protocol in order to communicate each other to find the patterns which their global support is equal to or more than the specified support threshold. Experimental results show that the proposed algorithm outperforms the best existing gossip based algorithm in term of execution time.

  20. Simultaneous Scheduling of Jobs, AGVs and Tools Considering Tool Transfer Times in Multi Machine FMS By SOS Algorithm

    NASA Astrophysics Data System (ADS)

    Sivarami Reddy, N.; Ramamurthy, D. V., Dr.; Prahlada Rao, K., Dr.

    2017-08-01

    This article addresses simultaneous scheduling of machines, AGVs and tools where machines are allowed to share the tools considering transfer times of jobs and tools between machines, to generate best optimal sequences that minimize makespan in a multi-machine Flexible Manufacturing System (FMS). Performance of FMS is expected to improve by effective utilization of its resources, by proper integration and synchronization of their scheduling. Symbiotic Organisms Search (SOS) algorithm is a potent tool which is a better alternative for solving optimization problems like scheduling and proven itself. The proposed SOS algorithm is tested on 22 job sets with makespan as objective for scheduling of machines and tools where machines are allowed to share tools without considering transfer times of jobs and tools and the results are compared with the results of existing methods. The results show that the SOS has outperformed. The same SOS algorithm is used for simultaneous scheduling of machines, AGVs and tools where machines are allowed to share tools considering transfer times of jobs and tools to determine the best optimal sequences that minimize makespan.

  1. A content-boosted collaborative filtering algorithm for personalized training in interpretation of radiological imaging.

    PubMed

    Lin, Hongli; Yang, Xuedong; Wang, Weisheng

    2014-08-01

    Devising a method that can select cases based on the performance levels of trainees and the characteristics of cases is essential for developing a personalized training program in radiology education. In this paper, we propose a novel hybrid prediction algorithm called content-boosted collaborative filtering (CBCF) to predict the difficulty level of each case for each trainee. The CBCF utilizes a content-based filtering (CBF) method to enhance existing trainee-case ratings data and then provides final predictions through a collaborative filtering (CF) algorithm. The CBCF algorithm incorporates the advantages of both CBF and CF, while not inheriting the disadvantages of either. The CBCF method is compared with the pure CBF and pure CF approaches using three datasets. The experimental data are then evaluated in terms of the MAE metric. Our experimental results show that the CBCF outperforms the pure CBF and CF methods by 13.33 and 12.17 %, respectively, in terms of prediction precision. This also suggests that the CBCF can be used in the development of personalized training systems in radiology education.

  2. Indications for MARS-MRI in Patients Treated With Metal-on-Metal Hip Resurfacing Arthroplasty.

    PubMed

    Connelly, James W; Galea, Vincent P; Matuszak, Sean J; Madanat, Rami; Muratoglu, Orhun; Malchau, Henrik

    2018-06-01

    Currently, there are no universally accepted guidelines on when to obtain metal artifact reduction sequence magnetic resonance imaging (MARS-MRI) in metal-on-metal (MoM) hip resurfacing arthroplasty (HRA) patients. Our primary aims were to identify which patient and clinical factors are predictive of adverse local tissue reaction (ALTR) and create an algorithm for indicating MARS-MRI in patients with Articular Surface Replacement (ASR) HRA. The secondary aim was to compare our algorithm to existing guidelines on when to perform MARS-MRI in MoM HRA patients. The study cohort consisted of 182 patients with unilateral ASR HRA from a prospective, multicenter study. Subjects received MARS-MRI at a mean of 7.8 years from surgery, regardless of symptoms. We determined which variables were predictive of ALTR and generated cutoffs for each variable. Finally, we created an algorithm to predict ALTR and indicate MARS-MRI in ASR HRA patients using these cutoffs and compared it to existing guidelines. We found high blood cobalt (Co) (odds ratio = 1.070; P = .011) and high blood chromium (Cr) (odds ratio = 1.162; P = .002) to be significant predictors of ALTR presence. Our algorithm using a blood Co cutoff of 1.15 ppb and a Cr cutoff of 1.09 ppb achieved 96.6% sensitivity and 35.3% specificity in predicting ALTR, which outperformed the existing guidelines. Blood Co and Cr levels are predictive of ALTR in ASR HRA patients. Our algorithm considering blood Co and Cr levels predicts ALTR in ASR HRA patients with higher sensitivity than previously established guidelines. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. Edge-directed inference for microaneurysms detection in digital fundus images

    NASA Astrophysics Data System (ADS)

    Huang, Ke; Yan, Michelle; Aviyente, Selin

    2007-03-01

    Microaneurysms (MAs) detection is a critical step in diabetic retinopathy screening, since MAs are the earliest visible warning of potential future problems. A variety of algorithms have been proposed for MAs detection in mass screening. Different methods have been proposed for MAs detection. The core technology for most of existing methods is based on a directional mathematical morphological operation called "Top-Hat" filter that requires multiple filtering operations at each pixel. Background structure, uneven illumination and noise often cause confusion between MAs and some non-MA structures and limits the applicability of the filter. In this paper, a novel detection framework based on edge directed inference is proposed for MAs detection. The candidate MA regions are first delineated from the edge map of a fundus image. Features measuring shape, brightness and contrast are extracted for each candidate MA region to better exclude false detection from true MAs. Algorithmic analysis and empirical evaluation reveal that the proposed edge directed inference outperforms the "Top-Hat" based algorithm in both detection accuracy and computational speed.

  4. A Hybrid alldifferent-Tabu Search Algorithm for Solving Sudoku Puzzles

    PubMed Central

    Crawford, Broderick; Paredes, Fernando; Norero, Enrique

    2015-01-01

    The Sudoku problem is a well-known logic-based puzzle of combinatorial number-placement. It consists in filling a n 2 × n 2 grid, composed of n columns, n rows, and n subgrids, each one containing distinct integers from 1 to n 2. Such a puzzle belongs to the NP-complete collection of problems, to which there exist diverse exact and approximate methods able to solve it. In this paper, we propose a new hybrid algorithm that smartly combines a classic tabu search procedure with the alldifferent global constraint from the constraint programming world. The alldifferent constraint is known to be efficient for domain filtering in the presence of constraints that must be pairwise different, which are exactly the kind of constraints that Sudokus own. This ability clearly alleviates the work of the tabu search, resulting in a faster and more robust approach for solving Sudokus. We illustrate interesting experimental results where our proposed algorithm outperforms the best results previously reported by hybrids and approximate methods. PMID:26078751

  5. An FPGA-based DS-CDMA multiuser demodulator employing adaptive multistage parallel interference cancellation

    NASA Astrophysics Data System (ADS)

    Li, Xinhua; Song, Zhenyu; Zhan, Yongjie; Wu, Qiongzhi

    2009-12-01

    Since the system capacity is severely limited, reducing the multiple access interfere (MAI) is necessary in the multiuser direct-sequence code division multiple access (DS-CDMA) system which is used in the telecommunication terminals data-transferred link system. In this paper, we adopt an adaptive multistage parallel interference cancellation structure in the demodulator based on the least mean square (LMS) algorithm to eliminate the MAI on the basis of overviewing various of multiuser dectection schemes. Neither a training sequence nor a pilot signal is needed in the proposed scheme, and its implementation complexity can be greatly reduced by a LMS approximate algorithm. The algorithm and its FPGA implementation is then derived. Simulation results of the proposed adaptive PIC can outperform some of the existing interference cancellation methods in AWGN channels. The hardware setup of mutiuser demodulator is described, and the experimental results based on it demonstrate that the simulation results shows large performance gains over the conventional single-user demodulator.

  6. Bounded Kalman filter method for motion-robust, non-contact heart rate estimation

    PubMed Central

    Prakash, Sakthi Kumar Arul; Tucker, Conrad S.

    2018-01-01

    The authors of this work present a real-time measurement of heart rate across different lighting conditions and motion categories. This is an advancement over existing remote Photo Plethysmography (rPPG) methods that require a static, controlled environment for heart rate detection, making them impractical for real-world scenarios wherein a patient may be in motion, or remotely connected to a healthcare provider through telehealth technologies. The algorithm aims to minimize motion artifacts such as blurring and noise due to head movements (uniform, random) by employing i) a blur identification and denoising algorithm for each frame and ii) a bounded Kalman filter technique for motion estimation and feature tracking. A case study is presented that demonstrates the feasibility of the algorithm in non-contact estimation of the pulse rate of subjects performing everyday head and body movements. The method in this paper outperforms state of the art rPPG methods in heart rate detection, as revealed by the benchmarked results. PMID:29552419

  7. MultiNest: Efficient and Robust Bayesian Inference

    NASA Astrophysics Data System (ADS)

    Feroz, F.; Hobson, M. P.; Bridges, M.

    2011-09-01

    We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm is demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla LambdaCDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software is fully parallelized using MPI and includes an interface to CosmoMC. It will also be released as part of the SuperBayeS package, for the analysis of supersymmetric theories of particle physics, at this http URL.

  8. A Hybrid alldifferent-Tabu Search Algorithm for Solving Sudoku Puzzles.

    PubMed

    Soto, Ricardo; Crawford, Broderick; Galleguillos, Cristian; Paredes, Fernando; Norero, Enrique

    2015-01-01

    The Sudoku problem is a well-known logic-based puzzle of combinatorial number-placement. It consists in filling a n(2) × n(2) grid, composed of n columns, n rows, and n subgrids, each one containing distinct integers from 1 to n(2). Such a puzzle belongs to the NP-complete collection of problems, to which there exist diverse exact and approximate methods able to solve it. In this paper, we propose a new hybrid algorithm that smartly combines a classic tabu search procedure with the alldifferent global constraint from the constraint programming world. The alldifferent constraint is known to be efficient for domain filtering in the presence of constraints that must be pairwise different, which are exactly the kind of constraints that Sudokus own. This ability clearly alleviates the work of the tabu search, resulting in a faster and more robust approach for solving Sudokus. We illustrate interesting experimental results where our proposed algorithm outperforms the best results previously reported by hybrids and approximate methods.

  9. A new peak detection algorithm for MALDI mass spectrometry data based on a modified Asymmetric Pseudo-Voigt model.

    PubMed

    Wijetunge, Chalini D; Saeed, Isaam; Boughton, Berin A; Roessner, Ute; Halgamuge, Saman K

    2015-01-01

    Mass Spectrometry (MS) is a ubiquitous analytical tool in biological research and is used to measure the mass-to-charge ratio of bio-molecules. Peak detection is the essential first step in MS data analysis. Precise estimation of peak parameters such as peak summit location and peak area are critical to identify underlying bio-molecules and to estimate their abundances accurately. We propose a new method to detect and quantify peaks in mass spectra. It uses dual-tree complex wavelet transformation along with Stein's unbiased risk estimator for spectra smoothing. Then, a new method, based on the modified Asymmetric Pseudo-Voigt (mAPV) model and hierarchical particle swarm optimization, is used for peak parameter estimation. Using simulated data, we demonstrated the benefit of using the mAPV model over Gaussian, Lorentz and Bi-Gaussian functions for MS peak modelling. The proposed mAPV model achieved the best fitting accuracy for asymmetric peaks, with lower percentage errors in peak summit location estimation, which were 0.17% to 4.46% less than that of the other models. It also outperformed the other models in peak area estimation, delivering lower percentage errors, which were about 0.7% less than its closest competitor - the Bi-Gaussian model. In addition, using data generated from a MALDI-TOF computer model, we showed that the proposed overall algorithm outperformed the existing methods mainly in terms of sensitivity. It achieved a sensitivity of 85%, compared to 77% and 71% of the two benchmark algorithms, continuous wavelet transformation based method and Cromwell respectively. The proposed algorithm is particularly useful for peak detection and parameter estimation in MS data with overlapping peak distributions and asymmetric peaks. The algorithm is implemented using MATLAB and the source code is freely available at http://mapv.sourceforge.net.

  10. A new peak detection algorithm for MALDI mass spectrometry data based on a modified Asymmetric Pseudo-Voigt model

    PubMed Central

    2015-01-01

    Background Mass Spectrometry (MS) is a ubiquitous analytical tool in biological research and is used to measure the mass-to-charge ratio of bio-molecules. Peak detection is the essential first step in MS data analysis. Precise estimation of peak parameters such as peak summit location and peak area are critical to identify underlying bio-molecules and to estimate their abundances accurately. We propose a new method to detect and quantify peaks in mass spectra. It uses dual-tree complex wavelet transformation along with Stein's unbiased risk estimator for spectra smoothing. Then, a new method, based on the modified Asymmetric Pseudo-Voigt (mAPV) model and hierarchical particle swarm optimization, is used for peak parameter estimation. Results Using simulated data, we demonstrated the benefit of using the mAPV model over Gaussian, Lorentz and Bi-Gaussian functions for MS peak modelling. The proposed mAPV model achieved the best fitting accuracy for asymmetric peaks, with lower percentage errors in peak summit location estimation, which were 0.17% to 4.46% less than that of the other models. It also outperformed the other models in peak area estimation, delivering lower percentage errors, which were about 0.7% less than its closest competitor - the Bi-Gaussian model. In addition, using data generated from a MALDI-TOF computer model, we showed that the proposed overall algorithm outperformed the existing methods mainly in terms of sensitivity. It achieved a sensitivity of 85%, compared to 77% and 71% of the two benchmark algorithms, continuous wavelet transformation based method and Cromwell respectively. Conclusions The proposed algorithm is particularly useful for peak detection and parameter estimation in MS data with overlapping peak distributions and asymmetric peaks. The algorithm is implemented using MATLAB and the source code is freely available at http://mapv.sourceforge.net. PMID:26680279

  11. The cost-constrained traveling salesman problem

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sokkappa, P.R.

    1990-10-01

    The Cost-Constrained Traveling Salesman Problem (CCTSP) is a variant of the well-known Traveling Salesman Problem (TSP). In the TSP, the goal is to find a tour of a given set of cities such that the total cost of the tour is minimized. In the CCTSP, each city is given a value, and a fixed cost-constraint is specified. The objective is to find a subtour of the cities that achieves maximum value without exceeding the cost-constraint. Thus, unlike the TSP, the CCTSP requires both selection and sequencing. As a consequence, most results for the TSP cannot be extended to the CCTSP.more » We show that the CCTSP is NP-hard and that no K-approximation algorithm or fully polynomial approximation scheme exists, unless P = NP. We also show that several special cases are polynomially solvable. Algorithms for the CCTSP, which outperform previous methods, are developed in three areas: upper bounding methods, exact algorithms, and heuristics. We found that a bounding strategy based on the knapsack problem performs better, both in speed and in the quality of the bounds, than methods based on the assignment problem. Likewise, we found that a branch-and-bound approach using the knapsack bound was superior to a method based on a common branch-and-bound method for the TSP. In our study of heuristic algorithms, we found that, when selecting modes for inclusion in the subtour, it is important to consider the neighborhood'' of the nodes. A node with low value that brings the subtour near many other nodes may be more desirable than an isolated node of high value. We found two types of repetition to be desirable: repetitions based on randomization in the subtour buildings process, and repetitions encouraging the inclusion of different subsets of the nodes. By varying the number and type of repetitions, we can adjust the computation time required by our method to obtain algorithms that outperform previous methods.« less

  12. Evaluation of dynamically dimensioned search algorithm for optimizing SWAT by altering sampling distributions and searching range

    USDA-ARS?s Scientific Manuscript database

    The primary advantage of Dynamically Dimensioned Search algorithm (DDS) is that it outperforms many other optimization techniques in both convergence speed and the ability in searching for parameter sets that satisfy statistical guidelines while requiring only one algorithm parameter (perturbation f...

  13. WE-FG-207B-05: Iterative Reconstruction Via Prior Image Constrained Total Generalized Variation for Spectral CT

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Niu, S; Zhang, Y; Ma, J

    Purpose: To investigate iterative reconstruction via prior image constrained total generalized variation (PICTGV) for spectral computed tomography (CT) using fewer projections while achieving greater image quality. Methods: The proposed PICTGV method is formulated as an optimization problem, which balances the data fidelity and prior image constrained total generalized variation of reconstructed images in one framework. The PICTGV method is based on structure correlations among images in the energy domain and high-quality images to guide the reconstruction of energy-specific images. In PICTGV method, the high-quality image is reconstructed from all detector-collected X-ray signals and is referred as the broad-spectrum image. Distinctmore » from the existing reconstruction methods applied on the images with first order derivative, the higher order derivative of the images is incorporated into the PICTGV method. An alternating optimization algorithm is used to minimize the PICTGV objective function. We evaluate the performance of PICTGV on noise and artifacts suppressing using phantom studies and compare the method with the conventional filtered back-projection method as well as TGV based method without prior image. Results: On the digital phantom, the proposed method outperforms the existing TGV method in terms of the noise reduction, artifacts suppression, and edge detail preservation. Compared to that obtained by the TGV based method without prior image, the relative root mean square error in the images reconstructed by the proposed method is reduced by over 20%. Conclusion: The authors propose an iterative reconstruction via prior image constrained total generalize variation for spectral CT. Also, we have developed an alternating optimization algorithm and numerically demonstrated the merits of our approach. Results show that the proposed PICTGV method outperforms the TGV method for spectral CT.« less

  14. Efficient Approximation Algorithms for Weighted $b$-Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khan, Arif; Pothen, Alex; Mostofa Ali Patwary, Md.

    2016-01-01

    We describe a half-approximation algorithm, b-Suitor, for computing a b-Matching of maximum weight in a graph with weights on the edges. b-Matching is a generalization of the well-known Matching problem in graphs, where the objective is to choose a subset of M edges in the graph such that at most a specified number b(v) of edges in M are incident on each vertex v. Subject to this restriction we maximize the sum of the weights of the edges in M. We prove that the b-Suitor algorithm computes the same b-Matching as the one obtained by the greedy algorithm for themore » problem. We implement the algorithm on serial and shared-memory parallel processors, and compare its performance against a collection of approximation algorithms that have been proposed for the Matching problem. Our results show that the b-Suitor algorithm outperforms the Greedy and Locally Dominant edge algorithms by one to two orders of magnitude on a serial processor. The b-Suitor algorithm has a high degree of concurrency, and it scales well up to 240 threads on a shared memory multiprocessor. The b-Suitor algorithm outperforms the Locally Dominant edge algorithm by a factor of fourteen on 16 cores of an Intel Xeon multiprocessor.« less

  15. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    It is one of the most important tasks in bioinformatics to identify the regulatory elements in gene sequences. Most of the existing algorithms for identifying regulatory elements are inclined to converge into a local optimum, and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper designs and implements an ACO based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible binding sites of transcription factor from the upstream of co-expressed genes. To accelerate the ants' searching process, a strategy of local optimization is presented to adjust the ants' start positions on the searched sequences. By exploiting the powerful optimization ability of ACO, the algorithm ACRI can not only improve precision of the results, but also achieve a very high speed. Experimental results on real world datasets show that ACRI can outperform other traditional algorithms in the respects of speed and quality of solutions. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Real-time image annotation by manifold-based biased Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Ji, Rongrong; Yao, Hongxun; Wang, Jicheng; Sun, Xiaoshuai; Liu, Xianming

    2008-01-01

    Automatic Linguistic Annotation is a promising solution to bridge the semantic gap in content-based image retrieval. However, two crucial issues are not well addressed in state-of-art annotation algorithms: 1. The Small Sample Size (3S) problem in keyword classifier/model learning; 2. Most of annotation algorithms can not extend to real-time online usage due to their low computational efficiencies. This paper presents a novel Manifold-based Biased Fisher Discriminant Analysis (MBFDA) algorithm to address these two issues by transductive semantic learning and keyword filtering. To address the 3S problem, Co-Training based Manifold learning is adopted for keyword model construction. To achieve real-time annotation, a Bias Fisher Discriminant Analysis (BFDA) based semantic feature reduction algorithm is presented for keyword confidence discrimination and semantic feature reduction. Different from all existing annotation methods, MBFDA views image annotation from a novel Eigen semantic feature (which corresponds to keywords) selection aspect. As demonstrated in experiments, our manifold-based biased Fisher discriminant analysis annotation algorithm outperforms classical and state-of-art annotation methods (1.K-NN Expansion; 2.One-to-All SVM; 3.PWC-SVM) in both computational time and annotation accuracy with a large margin.

  17. Global Contrast Based Salient Region Detection.

    PubMed

    Cheng, Ming-Ming; Mitra, Niloy J; Huang, Xiaolei; Torr, Philip H S; Hu, Shi-Min

    2015-03-01

    Automatic estimation of salient object regions across images, without any prior assumption or knowledge of the contents of the corresponding scenes, enhances many computer vision and computer graphics applications. We introduce a regional contrast based salient object detection algorithm, which simultaneously evaluates global contrast differences and spatial weighted coherence scores. The proposed algorithm is simple, efficient, naturally multi-scale, and produces full-resolution, high-quality saliency maps. These saliency maps are further used to initialize a novel iterative version of GrabCut, namely SaliencyCut, for high quality unsupervised salient object segmentation. We extensively evaluated our algorithm using traditional salient object detection datasets, as well as a more challenging Internet image dataset. Our experimental results demonstrate that our algorithm consistently outperforms 15 existing salient object detection and segmentation methods, yielding higher precision and better recall rates. We also show that our algorithm can be used to efficiently extract salient object masks from Internet images, enabling effective sketch-based image retrieval (SBIR) via simple shape comparisons. Despite such noisy internet images, where the saliency regions are ambiguous, our saliency guided image retrieval achieves a superior retrieval rate compared with state-of-the-art SBIR methods, and additionally provides important target object region information.

  18. Fast clustering algorithm for large ECG data sets based on CS theory in combination with PCA and K-NN methods.

    PubMed

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2014-01-01

    Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for diagnostic and treatment purposes of heart diseases. Clustering and classification of collecting data are essential parts for detecting concealed information of P-QRS-T waves in the long-term ECG recording. Currently used algorithms do have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and load of sampling. These drawbacks motivated us in developing novel optimized clustering algorithm which could easily scan large ECG datasets for establishing low power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods: Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) followed by sorting the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers are applied to the proposed algorithm. We show our algorithm based on PCA features in combination with K-NN classifier shows better performance than other methods. The proposed algorithm outperforms existing algorithms by increasing 11% classification accuracy. In addition, the proposed algorithm illustrates classification accuracy for K-NN and PNN classifiers, and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75% respectively.

  19. Accuracy and robustness evaluation in stereo matching

    NASA Astrophysics Data System (ADS)

    Nguyen, Duc M.; Hanca, Jan; Lu, Shao-Ping; Schelkens, Peter; Munteanu, Adrian

    2016-09-01

    Stereo matching has received a lot of attention from the computer vision community, thanks to its wide range of applications. Despite of the large variety of algorithms that have been proposed so far, it is not trivial to select suitable algorithms for the construction of practical systems. One of the main problems is that many algorithms lack sufficient robustness when employed in various operational conditions. This problem is due to the fact that most of the proposed methods in the literature are usually tested and tuned to perform well on one specific dataset. To alleviate this problem, an extensive evaluation in terms of accuracy and robustness of state-of-the-art stereo matching algorithms is presented. Three datasets (Middlebury, KITTI, and MPEG FTV) representing different operational conditions are employed. Based on the analysis, improvements over existing algorithms have been proposed. The experimental results show that our improved versions of cross-based and cost volume filtering algorithms outperform the original versions with large margins on Middlebury and KITTI datasets. In addition, the latter of the two proposed algorithms ranks itself among the best local stereo matching approaches on the KITTI benchmark. Under evaluations using specific settings for depth-image-based-rendering applications, our improved belief propagation algorithm is less complex than MPEG's FTV depth estimation reference software (DERS), while yielding similar depth estimation performance. Finally, several conclusions on stereo matching algorithms are also presented.

  20. General subspace learning with corrupted training data via graph embedding.

    PubMed

    Bao, Bing-Kun; Liu, Guangcan; Hong, Richang; Yan, Shuicheng; Xu, Changsheng

    2013-11-01

    We address the following subspace learning problem: supposing we are given a set of labeled, corrupted training data points, how to learn the underlying subspace, which contains three components: an intrinsic subspace that captures certain desired properties of a data set, a penalty subspace that fits the undesired properties of the data, and an error container that models the gross corruptions possibly existing in the data. Given a set of data points, these three components can be learned by solving a nuclear norm regularized optimization problem, which is convex and can be efficiently solved in polynomial time. Using the method as a tool, we propose a new discriminant analysis (i.e., supervised subspace learning) algorithm called Corruptions Tolerant Discriminant Analysis (CTDA), in which the intrinsic subspace is used to capture the features with high within-class similarity, the penalty subspace takes the role of modeling the undesired features with high between-class similarity, and the error container takes charge of fitting the possible corruptions in the data. We show that CTDA can well handle the gross corruptions possibly existing in the training data, whereas previous linear discriminant analysis algorithms arguably fail in such a setting. Extensive experiments conducted on two benchmark human face data sets and one object recognition data set show that CTDA outperforms the related algorithms.

  1. Improving personalized link prediction by hybrid diffusion

    NASA Astrophysics Data System (ADS)

    Liu, Jin-Hu; Zhu, Yu-Xiao; Zhou, Tao

    2016-04-01

    Inspired by traditional link prediction and to solve the problem of recommending friends in social networks, we introduce the personalized link prediction in this paper, in which each individual will get equal number of diversiform predictions. While the performances of many classical algorithms are not satisfactory under this framework, thus new algorithms are in urgent need. Motivated by previous researches in other fields, we generalize heat conduction process to the framework of personalized link prediction and find that this method outperforms many classical similarity-based algorithms, especially in the performance of diversity. In addition, we demonstrate that adding one ground node that is supposed to connect all the nodes in the system will greatly benefit the performance of heat conduction. Finally, better hybrid algorithms composed of local random walk and heat conduction have been proposed. Numerical results show that the hybrid algorithms can outperform other algorithms simultaneously in all four adopted metrics: AUC, precision, recall and hamming distance. In a word, this work may shed some light on the in-depth understanding of the effect of physical processes in personalized link prediction.

  2. Optimised analytical models of the dielectric properties of biological tissue.

    PubMed

    Salahuddin, Saqib; Porter, Emily; Krewer, Finn; O' Halloran, Martin

    2017-05-01

    The interaction of electromagnetic fields with the human body is quantified by the dielectric properties of biological tissues. These properties are incorporated into complex numerical simulations using parametric models such as Debye and Cole-Cole, for the computational investigation of electromagnetic wave propagation within the body. These parameters can be acquired through a variety of optimisation algorithms to achieve an accurate fit to measured data sets. A number of different optimisation techniques have been proposed, but these are often limited by the requirement for initial value estimations or by the large overall error (often up to several percentage points). In this work, a novel two-stage genetic algorithm proposed by the authors is applied to optimise the multi-pole Debye parameters for 54 types of human tissues. The performance of the two-stage genetic algorithm has been examined through a comparison with five other existing algorithms. The experimental results demonstrate that the two-stage genetic algorithm produces an accurate fit to a range of experimental data and efficiently out-performs all other optimisation algorithms under consideration. Accurate values of the three-pole Debye models for 54 types of human tissues, over 500 MHz to 20 GHz, are also presented for reference. Copyright © 2017 IPEM. Published by Elsevier Ltd. All rights reserved.

  3. A novel combined SLAM based on RBPF-SLAM and EIF-SLAM for mobile system sensing in a large scale environment.

    PubMed

    He, Bo; Zhang, Shujing; Yan, Tianhong; Zhang, Tao; Liang, Yan; Zhang, Hongjin

    2011-01-01

    Mobile autonomous systems are very important for marine scientific investigation and military applications. Many algorithms have been studied to deal with the computational efficiency problem required for large scale simultaneous localization and mapping (SLAM) and its related accuracy and consistency. Among these methods, submap-based SLAM is a more effective one. By combining the strength of two popular mapping algorithms, the Rao-Blackwellised particle filter (RBPF) and extended information filter (EIF), this paper presents a combined SLAM-an efficient submap-based solution to the SLAM problem in a large scale environment. RBPF-SLAM is used to produce local maps, which are periodically fused into an EIF-SLAM algorithm. RBPF-SLAM can avoid linearization of the robot model during operating and provide a robust data association, while EIF-SLAM can improve the whole computational speed, and avoid the tendency of RBPF-SLAM to be over-confident. In order to further improve the computational speed in a real time environment, a binary-tree-based decision-making strategy is introduced. Simulation experiments show that the proposed combined SLAM algorithm significantly outperforms currently existing algorithms in terms of accuracy and consistency, as well as the computing efficiency. Finally, the combined SLAM algorithm is experimentally validated in a real environment by using the Victoria Park dataset.

  4. Multiplicative noise removal via a learned dictionary.

    PubMed

    Huang, Yu-Mei; Moisan, Lionel; Ng, Michael K; Zeng, Tieyong

    2012-11-01

    Multiplicative noise removal is a challenging image processing problem, and most existing methods are based on the maximum a posteriori formulation and the logarithmic transformation of multiplicative denoising problems into additive denoising problems. Sparse representations of images have shown to be efficient approaches for image recovery. Following this idea, in this paper, we propose to learn a dictionary from the logarithmic transformed image, and then to use it in a variational model built for noise removal. Extensive experimental results suggest that in terms of visual quality, peak signal-to-noise ratio, and mean absolute deviation error, the proposed algorithm outperforms state-of-the-art methods.

  5. Robust Transceiver Design for Multiuser MIMO Downlink with Channel Uncertainties

    NASA Astrophysics Data System (ADS)

    Miao, Wei; Li, Yunzhou; Chen, Xiang; Zhou, Shidong; Wang, Jing

    This letter addresses the problem of robust transceiver design for the multiuser multiple-input-multiple-output (MIMO) downlink where the channel state information at the base station (BS) is imperfect. A stochastic approach which minimizes the expectation of the total mean square error (MSE) of the downlink conditioned on the channel estimates under a total transmit power constraint is adopted. The iterative algorithm reported in [2] is improved to handle the proposed robust optimization problem. Simulation results show that our proposed robust scheme effectively reduces the performance loss due to channel uncertainties and outperforms existing methods, especially when the channel errors of the users are different.

  6. A Hybrid Genetic-Simulated Annealing Algorithm for the Location-Inventory-Routing Problem Considering Returns under E-Supply Chain Environment

    PubMed Central

    Guo, Hao; Fu, Jing

    2013-01-01

    Facility location, inventory control, and vehicle routes scheduling are critical and highly related problems in the design of logistics system for e-business. Meanwhile, the return ratio in Internet sales was significantly higher than in the traditional business. Many of returned merchandise have no quality defects, which can reenter sales channels just after a simple repackaging process. Focusing on the existing problem in e-commerce logistics system, we formulate a location-inventory-routing problem model with no quality defects returns. To solve this NP-hard problem, an effective hybrid genetic simulated annealing algorithm (HGSAA) is proposed. Results of numerical examples show that HGSAA outperforms GA on computing time, optimal solution, and computing stability. The proposed model is very useful to help managers make the right decisions under e-supply chain environment. PMID:24489489

  7. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing.

    PubMed

    Li, Chunhua; Zhao, Pengpeng; Sheng, Victor S; Xian, Xuefeng; Wu, Jian; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost.

  8. A Hybrid Genetic Programming Algorithm for Automated Design of Dispatching Rules.

    PubMed

    Nguyen, Su; Mei, Yi; Xue, Bing; Zhang, Mengjie

    2018-06-04

    Designing effective dispatching rules for production systems is a difficult and timeconsuming task if it is done manually. In the last decade, the growth of computing power, advanced machine learning, and optimisation techniques has made the automated design of dispatching rules possible and automatically discovered rules are competitive or outperform existing rules developed by researchers. Genetic programming is one of the most popular approaches to discovering dispatching rules in the literature, especially for complex production systems. However, the large heuristic search space may restrict genetic programming from finding near optimal dispatching rules. This paper develops a new hybrid genetic programming algorithm for dynamic job shop scheduling based on a new representation, a new local search heuristic, and efficient fitness evaluators. Experiments show that the new method is effective regarding the quality of evolved rules. Moreover, evolved rules are also significantly smaller and contain more relevant attributes.

  9. Accurate and diverse recommendations via eliminating redundant correlations

    NASA Astrophysics Data System (ADS)

    Zhou, Tao; Su, Ri-Qi; Liu, Run-Ran; Jiang, Luo-Luo; Wang, Bing-Hong; Zhang, Yi-Cheng

    2009-12-01

    In this paper, based on a weighted projection of a bipartite user-object network, we introduce a personalized recommendation algorithm, called network-based inference (NBI), which has higher accuracy than the classical algorithm, namely collaborative filtering. In NBI, the correlation resulting from a specific attribute may be repeatedly counted in the cumulative recommendations from different objects. By considering the higher order correlations, we design an improved algorithm that can, to some extent, eliminate the redundant correlations. We test our algorithm on two benchmark data sets, MovieLens and Netflix. Compared with NBI, the algorithmic accuracy, measured by the ranking score, can be further improved by 23 per cent for MovieLens and 22 per cent for Netflix. The present algorithm can even outperform the Latent Dirichlet Allocation algorithm, which requires much longer computational time. Furthermore, most previous studies considered the algorithmic accuracy only; in this paper, we argue that the diversity and popularity, as two significant criteria of algorithmic performance, should also be taken into account. With more or less the same accuracy, an algorithm giving higher diversity and lower popularity is more favorable. Numerical results show that the present algorithm can outperform the standard one simultaneously in all five adopted metrics: lower ranking score and higher precision for accuracy, larger Hamming distance and lower intra-similarity for diversity, as well as smaller average degree for popularity.

  10. Case-Mix for Performance Management: A Risk Algorithm Based on ICD-10-CM.

    PubMed

    Gao, Jian; Moran, Eileen; Almenoff, Peter L

    2018-06-01

    Accurate risk adjustment is the key to a reliable comparison of cost and quality performance among providers and hospitals. However, the existing case-mix algorithms based on age, sex, and diagnoses can only explain up to 50% of the cost variation. More accurate risk adjustment is desired for provider performance assessment and improvement. To develop a case-mix algorithm that hospitals and payers can use to measure and compare cost and quality performance of their providers. All 6,048,895 patients with valid diagnoses and cost recorded in the US Veterans health care system in fiscal year 2016 were included in this study. The dependent variable was total cost at the patient level, and the explanatory variables were age, sex, and comorbidities represented by 762 clinically homogeneous groups, which were created by expanding the 283 categories from Clinical Classifications Software based on ICD-10-CM codes. The split-sample method was used to assess model overfitting and coefficient stability. The predictive power of the algorithms was ascertained by comparing the R, mean absolute percentage error, root mean square error, predictive ratios, and c-statistics. The expansion of the Clinical Classifications Software categories resulted in higher predictive power. The R reached 0.72 and 0.52 for the transformed and raw scale cost, respectively. The case-mix algorithm we developed based on age, sex, and diagnoses outperformed the existing case-mix models reported in the literature. The method developed in this study can be used by other health systems to produce tailored risk models for their specific purpose.

  11. Overlapping communities from dense disjoint and high total degree clusters

    NASA Astrophysics Data System (ADS)

    Zhang, Hongli; Gao, Yang; Zhang, Yue

    2018-04-01

    Community plays an important role in the field of sociology, biology and especially in domains of computer science, where systems are often represented as networks. And community detection is of great importance in the domains. A community is a dense subgraph of the whole graph with more links between its members than between its members to the outside nodes, and nodes in the same community probably share common properties or play similar roles in the graph. Communities overlap when nodes in a graph belong to multiple communities. A vast variety of overlapping community detection methods have been proposed in the literature, and the local expansion method is one of the most successful techniques dealing with large networks. The paper presents a density-based seeding method, in which dense disjoint local clusters are searched and selected as seeds. The proposed method selects a seed by the total degree and density of local clusters utilizing merely local structures of the network. Furthermore, this paper proposes a novel community refining phase via minimizing the conductance of each community, through which the quality of identified communities is largely improved in linear time. Experimental results in synthetic networks show that the proposed seeding method outperforms other seeding methods in the state of the art and the proposed refining method largely enhances the quality of the identified communities. Experimental results in real graphs with ground-truth communities show that the proposed approach outperforms other state of the art overlapping community detection algorithms, in particular, it is more than two orders of magnitude faster than the existing global algorithms with higher quality, and it obtains much more accurate community structure than the current local algorithms without any priori information.

  12. A multilevel ant colony optimization algorithm for classical and isothermic DNA sequencing by hybridization with multiplicity information available.

    PubMed

    Kwarciak, Kamil; Radom, Marcin; Formanowicz, Piotr

    2016-04-01

    The classical sequencing by hybridization takes into account a binary information about sequence composition. A given element from an oligonucleotide library is or is not a part of the target sequence. However, the DNA chip technology has been developed and it enables to receive a partial information about multiplicity of each oligonucleotide the analyzed sequence consist of. Currently, it is not possible to assess the exact data of such type but even partial information should be very useful. Two realistic multiplicity information models are taken into consideration in this paper. The first one, called "one and many" assumes that it is possible to obtain information if a given oligonucleotide occurs in a reconstructed sequence once or more than once. According to the second model, called "one, two and many", one is able to receive from biochemical experiment information if a given oligonucleotide is present in an analyzed sequence once, twice or at least three times. An ant colony optimization algorithm has been implemented to verify the above models and to compare with existing algorithms for sequencing by hybridization which utilize the additional information. The proposed algorithm solves the problem with any kind of hybridization errors. Computational experiment results confirm that using even the partial information about multiplicity leads to increased quality of reconstructed sequences. Moreover, they also show that the more precise model enables to obtain better solutions and the ant colony optimization algorithm outperforms the existing ones. Test data sets and the proposed ant colony optimization algorithm are available on: http://bioserver.cs.put.poznan.pl/download/ACO4mSBH.zip. Copyright © 2016 Elsevier Ltd. All rights reserved.

  13. Distributed Matrix Completion: Applications to Cooperative Positioning in Noisy Environments

    DTIC Science & Technology

    2013-12-11

    positioning, and a gossip version of low-rank approximation were developed. A convex relaxation for positioning in the presence of noise was shown...computing the leading eigenvectors of a large data matrix through gossip algorithms. A new algorithm is proposed that amounts to iteratively multiplying...generalization of gossip algorithms for consensus. The algorithms outperform state-of-the-art methods in a communication-limited scenario. Positioning via

  14. A Hybrid Color Space for Skin Detection Using Genetic Algorithm Heuristic Search and Principal Component Analysis Technique

    PubMed Central

    2015-01-01

    Color is one of the most prominent features of an image and used in many skin and face detection applications. Color space transformation is widely used by researchers to improve face and skin detection performance. Despite the substantial research efforts in this area, choosing a proper color space in terms of skin and face classification performance which can address issues like illumination variations, various camera characteristics and diversity in skin color tones has remained an open issue. This research proposes a new three-dimensional hybrid color space termed SKN by employing the Genetic Algorithm heuristic and Principal Component Analysis to find the optimal representation of human skin color in over seventeen existing color spaces. Genetic Algorithm heuristic is used to find the optimal color component combination setup in terms of skin detection accuracy while the Principal Component Analysis projects the optimal Genetic Algorithm solution to a less complex dimension. Pixel wise skin detection was used to evaluate the performance of the proposed color space. We have employed four classifiers including Random Forest, Naïve Bayes, Support Vector Machine and Multilayer Perceptron in order to generate the human skin color predictive model. The proposed color space was compared to some existing color spaces and shows superior results in terms of pixel-wise skin detection accuracy. Experimental results show that by using Random Forest classifier, the proposed SKN color space obtained an average F-score and True Positive Rate of 0.953 and False Positive Rate of 0.0482 which outperformed the existing color spaces in terms of pixel wise skin detection accuracy. The results also indicate that among the classifiers used in this study, Random Forest is the most suitable classifier for pixel wise skin detection applications. PMID:26267377

  15. CMDR based differential evolution identifies the epistatic interaction in genome-wide association studies.

    PubMed

    Yang, Cheng-Hong; Chuang, Li-Yeh; Lin, Yu-Da

    2017-08-01

    Detecting epistatic interactions in genome-wide association studies (GWAS) is a computational challenge. Such huge numbers of single-nucleotide polymorphism (SNP) combinations limit the some of the powerful algorithms to be applied to detect the potential epistasis in large-scale SNP datasets. We propose a new algorithm which combines the differential evolution (DE) algorithm with a classification based multifactor-dimensionality reduction (CMDR), termed DECMDR. DECMDR uses the CMDR as a fitness measure to evaluate values of solutions in DE process for scanning the potential statistical epistasis in GWAS. The results indicated that DECMDR outperforms the existing algorithms in terms of detection success rate by the large simulation and real data obtained from the Wellcome Trust Case Control Consortium. For running time comparison, DECMDR can efficient to apply the CMDR to detect the significant association between cases and controls amongst all possible SNP combinations in GWAS. DECMDR is freely available at https://goo.gl/p9sLuJ . chuang@isu.edu.tw or e0955767257@yahoo.com.tw. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  16. Adaptive structured dictionary learning for image fusion based on group-sparse-representation

    NASA Astrophysics Data System (ADS)

    Yang, Jiajie; Sun, Bin; Luo, Chengwei; Wu, Yuzhong; Xu, Limei

    2018-04-01

    Dictionary learning is the key process of sparse representation which is one of the most widely used image representation theories in image fusion. The existing dictionary learning method does not use the group structure information and the sparse coefficients well. In this paper, we propose a new adaptive structured dictionary learning algorithm and a l1-norm maximum fusion rule that innovatively utilizes grouped sparse coefficients to merge the images. In the dictionary learning algorithm, we do not need prior knowledge about any group structure of the dictionary. By using the characteristics of the dictionary in expressing the signal, our algorithm can automatically find the desired potential structure information that hidden in the dictionary. The fusion rule takes the physical meaning of the group structure dictionary, and makes activity-level judgement on the structure information when the images are being merged. Therefore, the fused image can retain more significant information. Comparisons have been made with several state-of-the-art dictionary learning methods and fusion rules. The experimental results demonstrate that, the dictionary learning algorithm and the fusion rule both outperform others in terms of several objective evaluation metrics.

  17. A new compound arithmetic crossover-based genetic algorithm for constrained optimisation in enterprise systems

    NASA Astrophysics Data System (ADS)

    Jin, Chenxia; Li, Fachao; Tsang, Eric C. C.; Bulysheva, Larissa; Kataev, Mikhail Yu

    2017-01-01

    In many real industrial applications, the integration of raw data with a methodology can support economically sound decision-making. Furthermore, most of these tasks involve complex optimisation problems. Seeking better solutions is critical. As an intelligent search optimisation algorithm, genetic algorithm (GA) is an important technique for complex system optimisation, but it has internal drawbacks such as low computation efficiency and prematurity. Improving the performance of GA is a vital topic in academic and applications research. In this paper, a new real-coded crossover operator, called compound arithmetic crossover operator (CAC), is proposed. CAC is used in conjunction with a uniform mutation operator to define a new genetic algorithm CAC10-GA. This GA is compared with an existing genetic algorithm (AC10-GA) that comprises an arithmetic crossover operator and a uniform mutation operator. To judge the performance of CAC10-GA, two kinds of analysis are performed. First the analysis of the convergence of CAC10-GA is performed by the Markov chain theory; second, a pair-wise comparison is carried out between CAC10-GA and AC10-GA through two test problems available in the global optimisation literature. The overall comparative study shows that the CAC performs quite well and the CAC10-GA defined outperforms the AC10-GA.

  18. GPS-Free Localization Algorithm for Wireless Sensor Networks

    PubMed Central

    Wang, Lei; Xu, Qingzheng

    2010-01-01

    Localization is one of the most fundamental problems in wireless sensor networks, since the locations of the sensor nodes are critical to both network operations and most application level tasks. A GPS-free localization scheme for wireless sensor networks is presented in this paper. First, we develop a standardized clustering-based approach for the local coordinate system formation wherein a multiplication factor is introduced to regulate the number of master and slave nodes and the degree of connectivity among master nodes. Second, using homogeneous coordinates, we derive a transformation matrix between two Cartesian coordinate systems to efficiently merge them into a global coordinate system and effectively overcome the flip ambiguity problem. The algorithm operates asynchronously without a centralized controller; and does not require that the location of the sensors be known a priori. A set of parameter-setting guidelines for the proposed algorithm is derived based on a probability model and the energy requirements are also investigated. A simulation analysis on a specific numerical example is conducted to validate the mathematical analytical results. We also compare the performance of the proposed algorithm under a variety multiplication factor, node density and node communication radius scenario. Experiments show that our algorithm outperforms existing mechanisms in terms of accuracy and convergence time. PMID:22219694

  19. Detection of Nitrogen Content in Rubber Leaves Using Near-Infrared (NIR) Spectroscopy with Correlation-Based Successive Projections Algorithm (SPA).

    PubMed

    Tang, Rongnian; Chen, Xupeng; Li, Chuang

    2018-05-01

    Near-infrared spectroscopy is an efficient, low-cost technology that has potential as an accurate method in detecting the nitrogen content of natural rubber leaves. Successive projections algorithm (SPA) is a widely used variable selection method for multivariate calibration, which uses projection operations to select a variable subset with minimum multi-collinearity. However, due to the fluctuation of correlation between variables, high collinearity may still exist in non-adjacent variables of subset obtained by basic SPA. Based on analysis to the correlation matrix of the spectra data, this paper proposed a correlation-based SPA (CB-SPA) to apply the successive projections algorithm in regions with consistent correlation. The result shows that CB-SPA can select variable subsets with more valuable variables and less multi-collinearity. Meanwhile, models established by the CB-SPA subset outperform basic SPA subsets in predicting nitrogen content in terms of both cross-validation and external prediction. Moreover, CB-SPA is assured to be more efficient, for the time cost in its selection procedure is one-twelfth that of the basic SPA.

  20. Solving NP-Hard Problems with Physarum-Based Ant Colony System.

    PubMed

    Liu, Yuxin; Gao, Chao; Zhang, Zili; Lu, Yuxiao; Chen, Shi; Liang, Mingxin; Tao, Li

    2017-01-01

    NP-hard problems exist in many real world applications. Ant colony optimization (ACO) algorithms can provide approximate solutions for those NP-hard problems, but the performance of ACO algorithms is significantly reduced due to premature convergence and weak robustness, etc. With these observations in mind, this paper proposes a Physarum-based pheromone matrix optimization strategy in ant colony system (ACS) for solving NP-hard problems such as traveling salesman problem (TSP) and 0/1 knapsack problem (0/1 KP). In the Physarum-inspired mathematical model, one of the unique characteristics is that critical tubes can be reserved in the process of network evolution. The optimized updating strategy employs the unique feature and accelerates the positive feedback process in ACS, which contributes to the quick convergence of the optimal solution. Some experiments were conducted using both benchmark and real datasets. The experimental results show that the optimized ACS outperforms other meta-heuristic algorithms in accuracy and robustness for solving TSPs. Meanwhile, the convergence rate and robustness for solving 0/1 KPs are better than those of classical ACS.

  1. Iterative Nonlocal Total Variation Regularization Method for Image Restoration

    PubMed Central

    Xu, Huanyu; Sun, Quansen; Luo, Nan; Cao, Guo; Xia, Deshen

    2013-01-01

    In this paper, a Bregman iteration based total variation image restoration algorithm is proposed. Based on the Bregman iteration, the algorithm splits the original total variation problem into sub-problems that are easy to solve. Moreover, non-local regularization is introduced into the proposed algorithm, and a method to choose the non-local filter parameter locally and adaptively is proposed. Experiment results show that the proposed algorithms outperform some other regularization methods. PMID:23776560

  2. Algorithms for Brownian first-passage-time estimation

    NASA Astrophysics Data System (ADS)

    Adib, Artur B.

    2009-09-01

    A class of algorithms in discrete space and continuous time for Brownian first-passage-time estimation is considered. A simple algorithm is derived that yields exact mean first-passage times (MFPTs) for linear potentials in one dimension, regardless of the lattice spacing. When applied to nonlinear potentials and/or higher spatial dimensions, numerical evidence suggests that this algorithm yields MFPT estimates that either outperform or rival Langevin-based (discrete time and continuous space) estimates.

  3. Portfolios of quantum algorithms.

    PubMed

    Maurer, S M; Hogg, T; Huberman, B A

    2001-12-17

    Quantum computation holds promise for the solution of many intractable problems. However, since many quantum algorithms are stochastic in nature they can find the solution of hard problems only probabilistically. Thus the efficiency of the algorithms has to be characterized by both the expected time to completion and the associated variance. In order to minimize both the running time and its uncertainty, we show that portfolios of quantum algorithms analogous to those of finance can outperform single algorithms when applied to the NP-complete problems such as 3-satisfiability.

  4. AUC-Maximizing Ensembles through Metalearning.

    PubMed

    LeDell, Erin; van der Laan, Mark J; Petersen, Maya

    2016-05-01

    Area Under the ROC Curve (AUC) is often used to measure the performance of an estimator in binary classification problems. An AUC-maximizing classifier can have significant advantages in cases where ranking correctness is valued or if the outcome is rare. In a Super Learner ensemble, maximization of the AUC can be achieved by the use of an AUC-maximining metalearning algorithm. We discuss an implementation of an AUC-maximization technique that is formulated as a nonlinear optimization problem. We also evaluate the effectiveness of a large number of different nonlinear optimization algorithms to maximize the cross-validated AUC of the ensemble fit. The results provide evidence that AUC-maximizing metalearners can, and often do, out-perform non-AUC-maximizing metalearning methods, with respect to ensemble AUC. The results also demonstrate that as the level of imbalance in the training data increases, the Super Learner ensemble outperforms the top base algorithm by a larger degree.

  5. AUC-Maximizing Ensembles through Metalearning

    PubMed Central

    LeDell, Erin; van der Laan, Mark J.; Peterson, Maya

    2016-01-01

    Area Under the ROC Curve (AUC) is often used to measure the performance of an estimator in binary classification problems. An AUC-maximizing classifier can have significant advantages in cases where ranking correctness is valued or if the outcome is rare. In a Super Learner ensemble, maximization of the AUC can be achieved by the use of an AUC-maximining metalearning algorithm. We discuss an implementation of an AUC-maximization technique that is formulated as a nonlinear optimization problem. We also evaluate the effectiveness of a large number of different nonlinear optimization algorithms to maximize the cross-validated AUC of the ensemble fit. The results provide evidence that AUC-maximizing metalearners can, and often do, out-perform non-AUC-maximizing metalearning methods, with respect to ensemble AUC. The results also demonstrate that as the level of imbalance in the training data increases, the Super Learner ensemble outperforms the top base algorithm by a larger degree. PMID:27227721

  6. Online Adaboost-Based Parameterized Methods for Dynamic Distributed Network Intrusion Detection.

    PubMed

    Hu, Weiming; Gao, Jun; Wang, Yanguo; Wu, Ou; Maybank, Stephen

    2014-01-01

    Current network intrusion detection systems lack adaptability to the frequently changing network environments. Furthermore, intrusion detection in the new distributed architectures is now a major requirement. In this paper, we propose two online Adaboost-based intrusion detection algorithms. In the first algorithm, a traditional online Adaboost process is used where decision stumps are used as weak classifiers. In the second algorithm, an improved online Adaboost process is proposed, and online Gaussian mixture models (GMMs) are used as weak classifiers. We further propose a distributed intrusion detection framework, in which a local parameterized detection model is constructed in each node using the online Adaboost algorithm. A global detection model is constructed in each node by combining the local parametric models using a small number of samples in the node. This combination is achieved using an algorithm based on particle swarm optimization (PSO) and support vector machines. The global model in each node is used to detect intrusions. Experimental results show that the improved online Adaboost process with GMMs obtains a higher detection rate and a lower false alarm rate than the traditional online Adaboost process that uses decision stumps. Both the algorithms outperform existing intrusion detection algorithms. It is also shown that our PSO, and SVM-based algorithm effectively combines the local detection models into the global model in each node; the global model in a node can handle the intrusion types that are found in other nodes, without sharing the samples of these intrusion types.

  7. Connected Component Model for Multi-Object Tracking.

    PubMed

    He, Zhenyu; Li, Xin; You, Xinge; Tao, Dacheng; Tang, Yuan Yan

    2016-08-01

    In multi-object tracking, it is critical to explore the data associations by exploiting the temporal information from a sequence of frames rather than the information from the adjacent two frames. Since straightforwardly obtaining data associations from multi-frames is an NP-hard multi-dimensional assignment (MDA) problem, most existing methods solve this MDA problem by either developing complicated approximate algorithms, or simplifying MDA as a 2D assignment problem based upon the information extracted only from adjacent frames. In this paper, we show that the relation between associations of two observations is the equivalence relation in the data association problem, based on the spatial-temporal constraint that the trajectories of different objects must be disjoint. Therefore, the MDA problem can be equivalently divided into independent subproblems by equivalence partitioning. In contrast to existing works for solving the MDA problem, we develop a connected component model (CCM) by exploiting the constraints of the data association and the equivalence relation on the constraints. Based upon CCM, we can efficiently obtain the global solution of the MDA problem for multi-object tracking by optimizing a sequence of independent data association subproblems. Experiments on challenging public data sets demonstrate that our algorithm outperforms the state-of-the-art approaches.

  8. NetCoffee: a fast and accurate global alignment approach to identify functionally conserved proteins in multiple networks.

    PubMed

    Hu, Jialu; Kehr, Birte; Reinert, Knut

    2014-02-15

    Owing to recent advancements in high-throughput technologies, protein-protein interaction networks of more and more species become available in public databases. The question of how to identify functionally conserved proteins across species attracts a lot of attention in computational biology. Network alignments provide a systematic way to solve this problem. However, most existing alignment tools encounter limitations in tackling this problem. Therefore, the demand for faster and more efficient alignment tools is growing. We present a fast and accurate algorithm, NetCoffee, which allows to find a global alignment of multiple protein-protein interaction networks. NetCoffee searches for a global alignment by maximizing a target function using simulated annealing on a set of weighted bipartite graphs that are constructed using a triplet approach similar to T-Coffee. To assess its performance, NetCoffee was applied to four real datasets. Our results suggest that NetCoffee remedies several limitations of previous algorithms, outperforms all existing alignment tools in terms of speed and nevertheless identifies biologically meaningful alignments. The source code and data are freely available for download under the GNU GPL v3 license at https://code.google.com/p/netcoffee/.

  9. Gaussian Multiscale Aggregation Applied to Segmentation in Hand Biometrics

    PubMed Central

    de Santos Sierra, Alberto; Ávila, Carmen Sánchez; Casanova, Javier Guerra; del Pozo, Gonzalo Bailador

    2011-01-01

    This paper presents an image segmentation algorithm based on Gaussian multiscale aggregation oriented to hand biometric applications. The method is able to isolate the hand from a wide variety of background textures such as carpets, fabric, glass, grass, soil or stones. The evaluation was carried out by using a publicly available synthetic database with 408,000 hand images in different backgrounds, comparing the performance in terms of accuracy and computational cost to two competitive segmentation methods existing in literature, namely Lossy Data Compression (LDC) and Normalized Cuts (NCuts). The results highlight that the proposed method outperforms current competitive segmentation methods with regard to computational cost, time performance, accuracy and memory usage. PMID:22247658

  10. Gaussian multiscale aggregation applied to segmentation in hand biometrics.

    PubMed

    de Santos Sierra, Alberto; Avila, Carmen Sánchez; Casanova, Javier Guerra; del Pozo, Gonzalo Bailador

    2011-01-01

    This paper presents an image segmentation algorithm based on Gaussian multiscale aggregation oriented to hand biometric applications. The method is able to isolate the hand from a wide variety of background textures such as carpets, fabric, glass, grass, soil or stones. The evaluation was carried out by using a publicly available synthetic database with 408,000 hand images in different backgrounds, comparing the performance in terms of accuracy and computational cost to two competitive segmentation methods existing in literature, namely Lossy Data Compression (LDC) and Normalized Cuts (NCuts). The results highlight that the proposed method outperforms current competitive segmentation methods with regard to computational cost, time performance, accuracy and memory usage.

  11. View-Invariant Gait Recognition Through Genetic Template Segmentation

    NASA Astrophysics Data System (ADS)

    Isaac, Ebenezer R. H. P.; Elias, Susan; Rajagopalan, Srinivasan; Easwarakumar, K. S.

    2017-08-01

    Template-based model-free approach provides by far the most successful solution to the gait recognition problem in literature. Recent work discusses how isolating the head and leg portion of the template increase the performance of a gait recognition system making it robust against covariates like clothing and carrying conditions. However, most involve a manual definition of the boundaries. The method we propose, the genetic template segmentation (GTS), employs the genetic algorithm to automate the boundary selection process. This method was tested on the GEI, GEnI and AEI templates. GEI seems to exhibit the best result when segmented with our approach. Experimental results depict that our approach significantly outperforms the existing implementations of view-invariant gait recognition.

  12. Improved quantum backtracking algorithms using effective resistance estimates

    NASA Astrophysics Data System (ADS)

    Jarret, Michael; Wan, Kianna

    2018-02-01

    We investigate quantum backtracking algorithms of the type introduced by Montanaro (Montanaro, arXiv:1509.02374). These algorithms explore trees of unknown structure and in certain settings exponentially outperform their classical counterparts. Some of the previous work focused on obtaining a quantum advantage for trees in which a unique marked vertex is promised to exist. We remove this restriction by recharacterizing the problem in terms of the effective resistance of the search space. In this paper, we present a generalization of one of Montanaro's algorithms to trees containing k marked vertices, where k is not necessarily known a priori. Our approach involves using amplitude estimation to determine a near-optimal weighting of a diffusion operator, which can then be applied to prepare a superposition state with support only on marked vertices and ancestors thereof. By repeatedly sampling this state and updating the input vertex, a marked vertex is reached in a logarithmic number of steps. The algorithm thereby achieves the conjectured bound of O ˜(√{T Rmax }) for finding a single marked vertex and O ˜(k √{T Rmax }) for finding all k marked vertices, where T is an upper bound on the tree size and Rmax is the maximum effective resistance encountered by the algorithm. This constitutes a speedup over Montanaro's original procedure in both the case of finding one and the case of finding multiple marked vertices in an arbitrary tree.

  13. Multispectra CWT-based algorithm (MCWT) in mass spectra for peak extraction.

    PubMed

    Hsueh, Huey-Miin; Kuo, Hsun-Chih; Tsai, Chen-An

    2008-01-01

    An important objective in mass spectrometry (MS) is to identify a set of biomarkers that can be used to potentially distinguish patients between distinct treatments (or conditions) from tens or hundreds of spectra. A common two-step approach involving peak extraction and quantification is employed to identify the features of scientific interest. The selected features are then used for further investigation to understand underlying biological mechanism of individual protein or for development of genomic biomarkers to early diagnosis. However, the use of inadequate or ineffective peak detection and peak alignment algorithms in peak extraction step may lead to a high rate of false positives. Also, it is crucial to reduce the false positive rate in detecting biomarkers from ten or hundreds of spectra. Here a new procedure is introduced for feature extraction in mass spectrometry data that extends the continuous wavelet transform-based (CWT-based) algorithm to multiple spectra. The proposed multispectra CWT-based algorithm (MCWT) not only can perform peak detection for multiple spectra but also carry out peak alignment at the same time. The author' MCWT algorithm constructs a reference, which integrates information of multiple raw spectra, for feature extraction. The algorithm is applied to a SELDI-TOF mass spectra data set provided by CAMDA 2006 with known polypeptide m/z positions. This new approach is easy to implement and it outperforms the existing peak extraction method from the Bioconductor PROcess package.

  14. Human-like machines: Transparency and comprehensibility.

    PubMed

    Patrzyk, Piotr M; Link, Daniela; Marewski, Julian N

    2017-01-01

    Artificial intelligence algorithms seek inspiration from human cognitive systems in areas where humans outperform machines. But on what level should algorithms try to approximate human cognition? We argue that human-like machines should be designed to make decisions in transparent and comprehensible ways, which can be achieved by accurately mirroring human cognitive processes.

  15. FBP and BPF reconstruction methods for circular X-ray tomography with off-center detector.

    PubMed

    Schäfer, Dirk; Grass, Michael; van de Haar, Peter

    2011-07-01

    Circular scanning with an off-center planar detector is an acquisition scheme that allows to save detector area while keeping a large field of view (FOV). Several filtered back-projection (FBP) algorithms have been proposed earlier. The purpose of this work is to present two newly developed back-projection filtration (BPF) variants and evaluate the image quality of these methods compared to the existing state-of-the-art FBP methods. The first new BPF algorithm applies redundancy weighting of overlapping opposite projections before differentiation in a single projection. The second one uses the Katsevich-type differentiation involving two neighboring projections followed by redundancy weighting and back-projection. An averaging scheme is presented to mitigate streak artifacts inherent to circular BPF algorithms along the Hilbert filter lines in the off-center transaxial slices of the reconstructions. The image quality is assessed visually on reconstructed slices of simulated and clinical data. Quantitative evaluation studies are performed with the Forbild head phantom by calculating root-mean-squared-deviations (RMSDs) to the voxelized phantom for different detector overlap settings and by investigating the noise resolution trade-off with a wire phantom in the full detector and off-center scenario. The noise-resolution behavior of all off-center reconstruction methods corresponds to their full detector performance with the best resolution for the FDK based methods with the given imaging geometry. With respect to RMSD and visual inspection, the proposed BPF with Katsevich-type differentiation outperforms all other methods for the smallest chosen detector overlap of about 15 mm. The best FBP method is the algorithm that is also based on the Katsevich-type differentiation and subsequent redundancy weighting. For wider overlap of about 40-50 mm, these two algorithms produce similar results outperforming the other three methods. The clinical case with a detector overlap of about 17 mm confirms these results. The BPF-type reconstructions with Katsevich differentiation are widely independent of the size of the detector overlap and give the best results with respect to RMSD and visual inspection for minimal detector overlap. The increased homogeneity will improve correct assessment of lesions in the entire field of view.

  16. Joint histogram-based cost aggregation for stereo matching.

    PubMed

    Min, Dongbo; Lu, Jiangbo; Do, Minh N

    2013-10-01

    This paper presents a novel method for performing efficient cost aggregation in stereo matching. The cost aggregation problem is reformulated from the perspective of a histogram, giving us the potential to reduce the complexity of the cost aggregation in stereo matching significantly. Differently from previous methods which have tried to reduce the complexity in terms of the size of an image and a matching window, our approach focuses on reducing the computational redundancy that exists among the search range, caused by a repeated filtering for all the hypotheses. Moreover, we also reduce the complexity of the window-based filtering through an efficient sampling scheme inside the matching window. The tradeoff between accuracy and complexity is extensively investigated by varying the parameters used in the proposed method. Experimental results show that the proposed method provides high-quality disparity maps with low complexity and outperforms existing local methods. This paper also provides new insights into complexity-constrained stereo-matching algorithm design.

  17. Analysis of Genome-Wide Association Studies with Multiple Outcomes Using Penalization

    PubMed Central

    Liu, Jin; Huang, Jian; Ma, Shuangge

    2012-01-01

    Genome-wide association studies have been extensively conducted, searching for markers for biologically meaningful outcomes and phenotypes. Penalization methods have been adopted in the analysis of the joint effects of a large number of SNPs (single nucleotide polymorphisms) and marker identification. This study is partly motivated by the analysis of heterogeneous stock mice dataset, in which multiple correlated phenotypes and a large number of SNPs are available. Existing penalization methods designed to analyze a single response variable cannot accommodate the correlation among multiple response variables. With multiple response variables sharing the same set of markers, joint modeling is first employed to accommodate the correlation. The group Lasso approach is adopted to select markers associated with all the outcome variables. An efficient computational algorithm is developed. Simulation study and analysis of the heterogeneous stock mice dataset show that the proposed method can outperform existing penalization methods. PMID:23272092

  18. Hybrid recommendation methods in complex networks.

    PubMed

    Fiasconaro, A; Tumminello, M; Nicosia, V; Latora, V; Mantegna, R N

    2015-07-01

    We propose two recommendation methods, based on the appropriate normalization of already existing similarity measures, and on the convex combination of the recommendation scores derived from similarity between users and between objects. We validate the proposed measures on three data sets, and we compare the performance of our methods to other recommendation systems recently proposed in the literature. We show that the proposed similarity measures allow us to attain an improvement of performances of up to 20% with respect to existing nonparametric methods, and that the accuracy of a recommendation can vary widely from one specific bipartite network to another, which suggests that a careful choice of the most suitable method is highly relevant for an effective recommendation on a given system. Finally, we study how an increasing presence of random links in the network affects the recommendation scores, finding that one of the two recommendation algorithms introduced here can systematically outperform the others in noisy data sets.

  19. Optimizing the Learning Order of Chinese Characters Using a Novel Topological Sort Algorithm

    PubMed Central

    Wang, Jinzhao

    2016-01-01

    We present a novel algorithm for optimizing the order in which Chinese characters are learned, one that incorporates the benefits of learning them in order of usage frequency and in order of their hierarchal structural relationships. We show that our work outperforms previously published orders and algorithms. Our algorithm is applicable to any scheduling task where nodes have intrinsic differences in importance and must be visited in topological order. PMID:27706234

  20. Tail Biting Trellis Representation of Codes: Decoding and Construction

    NASA Technical Reports Server (NTRS)

    Shao. Rose Y.; Lin, Shu; Fossorier, Marc

    1999-01-01

    This paper presents two new iterative algorithms for decoding linear codes based on their tail biting trellises, one is unidirectional and the other is bidirectional. Both algorithms are computationally efficient and achieves virtually optimum error performance with a small number of decoding iterations. They outperform all the previous suboptimal decoding algorithms. The bidirectional algorithm also reduces decoding delay. Also presented in the paper is a method for constructing tail biting trellises for linear block codes.

  1. Hybrid artificial bee colony algorithm for parameter optimization of five-parameter bidirectional reflectance distribution function model.

    PubMed

    Wang, Qianqian; Zhao, Jing; Gong, Yong; Hao, Qun; Peng, Zhong

    2017-11-20

    A hybrid artificial bee colony (ABC) algorithm inspired by the best-so-far solution and bacterial chemotaxis was introduced to optimize the parameters of the five-parameter bidirectional reflectance distribution function (BRDF) model. To verify the performance of the hybrid ABC algorithm, we measured BRDF of three kinds of samples and simulated the undetermined parameters of the five-parameter BRDF model using the hybrid ABC algorithm and the genetic algorithm, respectively. The experimental results demonstrate that the hybrid ABC algorithm outperforms the genetic algorithm in convergence speed, accuracy, and time efficiency under the same conditions.

  2. Complexity of the Quantum Adiabatic Algorithm

    NASA Astrophysics Data System (ADS)

    Hen, Itay

    2013-03-01

    The Quantum Adiabatic Algorithm (QAA) has been proposed as a mechanism for efficiently solving optimization problems on a quantum computer. Since adiabatic computation is analog in nature and does not require the design and use of quantum gates, it can be thought of as a simpler and perhaps more profound method for performing quantum computations that might also be easier to implement experimentally. While these features have generated substantial research in QAA, to date there is still a lack of solid evidence that the algorithm can outperform classical optimization algorihms. Here, we discuss several aspects of the quantum adiabatic algorithm: We analyze the efficiency of the algorithm on several ``hard'' (NP) computational problems. Studying the size dependence of the typical minimum energy gap of the Hamiltonians of these problems using quantum Monte Carlo methods, we find that while for most problems the minimum gap decreases exponentially with the size of the problem, indicating that the QAA is not more efficient than existing classical search algorithms, for other problems there is evidence to suggest that the gap may be polynomial near the phase transition. We also discuss applications of the QAA to ``real life'' problems and how they can be implemented on currently available (albeit prototypical) quantum hardware such as ``D-Wave One'', that impose serious restrictions as to which type of problems may be tested. Finally, we discuss different approaches to find improved implementations of the algorithm such as local adiabatic evolution, adaptive methods, local search in Hamiltonian space and others.

  3. SAR Processing Based On Two-Dimensional Transfer Function

    NASA Technical Reports Server (NTRS)

    Chang, Chi-Yung; Jin, Michael Y.; Curlander, John C.

    1994-01-01

    Exact transfer function, ETF, is two-dimensional transfer function that constitutes basis of improved frequency-domain-convolution algorithm for processing synthetic-aperture-radar, SAR data. ETF incorporates terms that account for Doppler effect of motion of radar relative to scanned ground area and for antenna squint angle. Algorithm based on ETF outperforms others.

  4. Creating Very True Quantum Algorithms for Quantum Energy Based Computing

    NASA Astrophysics Data System (ADS)

    Nagata, Koji; Nakamura, Tadao; Geurdes, Han; Batle, Josep; Abdalla, Soliman; Farouk, Ahmed; Diep, Do Ngoc

    2018-04-01

    An interpretation of quantum mechanics is discussed. It is assumed that quantum is energy. An algorithm by means of the energy interpretation is discussed. An algorithm, based on the energy interpretation, for fast determining a homogeneous linear function f( x) := s. x = s 1 x 1 + s 2 x 2 + ⋯ + s N x N is proposed. Here x = ( x 1, … , x N ), x j ∈ R and the coefficients s = ( s 1, … , s N ), s j ∈ N. Given the interpolation values (f(1), f(2),...,f(N))=ěc {y}, the unknown coefficients s = (s1(ěc {y}),\\dots , sN(ěc {y})) of the linear function shall be determined, simultaneously. The speed of determining the values is shown to outperform the classical case by a factor of N. Our method is based on the generalized Bernstein-Vazirani algorithm to qudit systems. Next, by using M parallel quantum systems, M homogeneous linear functions are determined, simultaneously. The speed of obtaining the set of M homogeneous linear functions is shown to outperform the classical case by a factor of N × M.

  5. Creating Very True Quantum Algorithms for Quantum Energy Based Computing

    NASA Astrophysics Data System (ADS)

    Nagata, Koji; Nakamura, Tadao; Geurdes, Han; Batle, Josep; Abdalla, Soliman; Farouk, Ahmed; Diep, Do Ngoc

    2017-12-01

    An interpretation of quantum mechanics is discussed. It is assumed that quantum is energy. An algorithm by means of the energy interpretation is discussed. An algorithm, based on the energy interpretation, for fast determining a homogeneous linear function f(x) := s.x = s 1 x 1 + s 2 x 2 + ⋯ + s N x N is proposed. Here x = (x 1, … , x N ), x j ∈ R and the coefficients s = (s 1, … , s N ), s j ∈ N. Given the interpolation values (f(1), f(2),...,f(N))=ěc {y}, the unknown coefficients s = (s1(ěc {y}),\\dots , sN(ěc {y})) of the linear function shall be determined, simultaneously. The speed of determining the values is shown to outperform the classical case by a factor of N. Our method is based on the generalized Bernstein-Vazirani algorithm to qudit systems. Next, by using M parallel quantum systems, M homogeneous linear functions are determined, simultaneously. The speed of obtaining the set of M homogeneous linear functions is shown to outperform the classical case by a factor of N × M.

  6. Preconditioned Alternating Projection Algorithms for Maximum a Posteriori ECT Reconstruction

    PubMed Central

    Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng

    2012-01-01

    We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators raised from the convex functions that define the TV-norm and the constrain involved in the problem. The characterization (of the solution) via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove theoretically convergence of the preconditioned alternating projection algorithm. In numerical experiments, performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner outperforms significantly the EM-TV in all aspects including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in the convergence speed while providing comparable image quality. PMID:23271835

  7. A SAT Based Effective Algorithm for the Directed Hamiltonian Cycle Problem

    NASA Astrophysics Data System (ADS)

    Jäger, Gerold; Zhang, Weixiong

    The Hamiltonian cycle problem (HCP) is an important combinatorial problem with applications in many areas. While thorough theoretical and experimental analyses have been made on the HCP in undirected graphs, little is known for the HCP in directed graphs (DHCP). The contribution of this work is an effective algorithm for the DHCP. Our algorithm explores and exploits the close relationship between the DHCP and the Assignment Problem (AP) and utilizes a technique based on Boolean satisfiability (SAT). By combining effective algorithms for the AP and SAT, our algorithm significantly outperforms previous exact DHCP algorithms including an algorithm based on the award-winning Concorde TSP algorithm.

  8. Firefly algorithm with chaos

    NASA Astrophysics Data System (ADS)

    Gandomi, A. H.; Yang, X.-S.; Talatahari, S.; Alavi, A. H.

    2013-01-01

    A recently developed metaheuristic optimization algorithm, firefly algorithm (FA), mimics the social behavior of fireflies based on the flashing and attraction characteristics of fireflies. In the present study, we will introduce chaos into FA so as to increase its global search mobility for robust global optimization. Detailed studies are carried out on benchmark problems with different chaotic maps. Here, 12 different chaotic maps are utilized to tune the attractive movement of the fireflies in the algorithm. The results show that some chaotic FAs can clearly outperform the standard FA.

  9. Deep learning improves prediction of CRISPR-Cpf1 guide RNA activity.

    PubMed

    Kim, Hui Kwon; Min, Seonwoo; Song, Myungjae; Jung, Soobin; Choi, Jae Woo; Kim, Younggwang; Lee, Sangeun; Yoon, Sungroh; Kim, Hyongbum Henry

    2018-03-01

    We present two algorithms to predict the activity of AsCpf1 guide RNAs. Indel frequencies for 15,000 target sequences were used in a deep-learning framework based on a convolutional neural network to train Seq-deepCpf1. We then incorporated chromatin accessibility information to create the better-performing DeepCpf1 algorithm for cell lines for which such information is available and show that both algorithms outperform previous machine learning algorithms on our own and published data sets.

  10. A novel robust Arabic light stemmer

    NASA Astrophysics Data System (ADS)

    Abainia, Kheireddine; Ouamour, Siham; Sayoud, Halim

    2017-05-01

    The stemming is the process of transforming a word into its root or stem, hence, it is considered as a crucial pre-processing step before tackling any task of natural language processing or information retrieval. However, in the case of Arabic language, finding an effective stemming algorithm seems to be quite difficult, since the Arabic language has a specific morphology, which is different from many other languages. Although, there exist several algorithms in literature addressing the Arabic stemming issue, unfortunately, most of them are restricted to a limited number of words, present some confusions between original letters and affixes, and usually employ dictionary of words or patterns. For that purpose, we propose the design and implementation of a novel Arabic light stemmer, which is based on some new rules for stripping prefixes, suffixes and infixes in a smart way. And in our knowledge, it is the first work dealing with Arabic infixes with regards to their irregular rules. The empirical evaluation was conducted on a new Arabic data-set (called ARASTEM), which was conceived and collected from several Arabic discussion forums containing dialectical Arabic and modern pseudo-Arabic languages. Hence, we present a comparative investigation between our new stemmer and other existing stemmers using Paice's parameters, namely: Under Stemming Index (UI), Over Stemming Index (OI) and Stemming Weight (SW). Results show that the proposed Arabic light stemmer maintains consistently high performances and outperforms several existing light stemmers.

  11. New Method of Calculating a Multiplication by using the Generalized Bernstein-Vazirani Algorithm

    NASA Astrophysics Data System (ADS)

    Nagata, Koji; Nakamura, Tadao; Geurdes, Han; Batle, Josep; Abdalla, Soliman; Farouk, Ahmed

    2018-06-01

    We present a new method of more speedily calculating a multiplication by using the generalized Bernstein-Vazirani algorithm and many parallel quantum systems. Given the set of real values a1,a2,a3,\\ldots ,aN and a function g:bf {R}→ {0,1}, we shall determine the following values g(a1),g(a2),g(a3),\\ldots , g(aN) simultaneously. The speed of determining the values is shown to outperform the classical case by a factor of N. Next, we consider it as a number in binary representation; M 1 = ( g( a 1), g( a 2), g( a 3),…, g( a N )). By using M parallel quantum systems, we have M numbers in binary representation, simultaneously. The speed of obtaining the M numbers is shown to outperform the classical case by a factor of M. Finally, we calculate the product; M1× M2× \\cdots × MM. The speed of obtaining the product is shown to outperform the classical case by a factor of N × M.

  12. Detection of Heart Sounds in Children with and without Pulmonary Arterial Hypertension―Daubechies Wavelets Approach

    PubMed Central

    Elgendi, Mohamed; Kumar, Shine; Guo, Long; Rutledge, Jennifer; Coe, James Y.; Zemp, Roger; Schuurmans, Dale; Adatia, Ian

    2015-01-01

    Background Automatic detection of the 1st (S1) and 2nd (S2) heart sounds is difficult, and existing algorithms are imprecise. We sought to develop a wavelet-based algorithm for the detection of S1 and S2 in children with and without pulmonary arterial hypertension (PAH). Method Heart sounds were recorded at the second left intercostal space and the cardiac apex with a digital stethoscope simultaneously with pulmonary arterial pressure (PAP). We developed a Daubechies wavelet algorithm for the automatic detection of S1 and S2 using the wavelet coefficient ‘D 6’ based on power spectral analysis. We compared our algorithm with four other Daubechies wavelet-based algorithms published by Liang, Kumar, Wang, and Zhong. We annotated S1 and S2 from an audiovisual examination of the phonocardiographic tracing by two trained cardiologists and the observation that in all subjects systole was shorter than diastole. Results We studied 22 subjects (9 males and 13 females, median age 6 years, range 0.25–19). Eleven subjects had a mean PAP < 25 mmHg. Eleven subjects had PAH with a mean PAP ≥ 25 mmHg. All subjects had a pulmonary artery wedge pressure ≤ 15 mmHg. The sensitivity (SE) and positive predictivity (+P) of our algorithm were 70% and 68%, respectively. In comparison, the SE and +P of Liang were 59% and 42%, Kumar 19% and 12%, Wang 50% and 45%, and Zhong 43% and 53%, respectively. Our algorithm demonstrated robustness and outperformed the other methods up to a signal-to-noise ratio (SNR) of 10 dB. For all algorithms, detection errors arose from low-amplitude peaks, fast heart rates, low signal-to-noise ratio, and fixed thresholds. Conclusion Our algorithm for the detection of S1 and S2 improves the performance of existing Daubechies-based algorithms and justifies the use of the wavelet coefficient ‘D 6’ through power spectral analysis. Also, the robustness despite ambient noise may improve real world clinical performance. PMID:26629704

  13. Binary Interval Search: a scalable algorithm for counting interval intersections.

    PubMed

    Layer, Ryan M; Skadron, Kevin; Robins, Gabriel; Hall, Ira M; Quinlan, Aaron R

    2013-01-01

    The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. https://github.com/arq5x/bits.

  14. Naturalness preservation image contrast enhancement via histogram modification

    NASA Astrophysics Data System (ADS)

    Tian, Qi-Chong; Cohen, Laurent D.

    2018-04-01

    Contrast enhancement is a technique for enhancing image contrast to obtain better visual quality. Since many existing contrast enhancement algorithms usually produce over-enhanced results, the naturalness preservation is needed to be considered in the framework of image contrast enhancement. This paper proposes a naturalness preservation contrast enhancement method, which adopts the histogram matching to improve the contrast and uses the image quality assessment to automatically select the optimal target histogram. The contrast improvement and the naturalness preservation are both considered in the target histogram, so this method can avoid the over-enhancement problem. In the proposed method, the optimal target histogram is a weighted sum of the original histogram, the uniform histogram, and the Gaussian-shaped histogram. Then the structural metric and the statistical naturalness metric are used to determine the weights of corresponding histograms. At last, the contrast-enhanced image is obtained via matching the optimal target histogram. The experiments demonstrate the proposed method outperforms the compared histogram-based contrast enhancement algorithms.

  15. Particle swarm optimization-based local entropy weighted histogram equalization for infrared image enhancement

    NASA Astrophysics Data System (ADS)

    Wan, Minjie; Gu, Guohua; Qian, Weixian; Ren, Kan; Chen, Qian; Maldague, Xavier

    2018-06-01

    Infrared image enhancement plays a significant role in intelligent urban surveillance systems for smart city applications. Unlike existing methods only exaggerating the global contrast, we propose a particle swam optimization-based local entropy weighted histogram equalization which involves the enhancement of both local details and fore-and background contrast. First of all, a novel local entropy weighted histogram depicting the distribution of detail information is calculated based on a modified hyperbolic tangent function. Then, the histogram is divided into two parts via a threshold maximizing the inter-class variance in order to improve the contrasts of foreground and background, respectively. To avoid over-enhancement and noise amplification, double plateau thresholds of the presented histogram are formulated by means of particle swarm optimization algorithm. Lastly, each sub-image is equalized independently according to the constrained sub-local entropy weighted histogram. Comparative experiments implemented on real infrared images prove that our algorithm outperforms other state-of-the-art methods in terms of both visual and quantized evaluations.

  16. Conic Sampling: An Efficient Method for Solving Linear and Quadratic Programming by Randomly Linking Constraints within the Interior

    PubMed Central

    Serang, Oliver

    2012-01-01

    Linear programming (LP) problems are commonly used in analysis and resource allocation, frequently surfacing as approximations to more difficult problems. Existing approaches to LP have been dominated by a small group of methods, and randomized algorithms have not enjoyed popularity in practice. This paper introduces a novel randomized method of solving LP problems by moving along the facets and within the interior of the polytope along rays randomly sampled from the polyhedral cones defined by the bounding constraints. This conic sampling method is then applied to randomly sampled LPs, and its runtime performance is shown to compare favorably to the simplex and primal affine-scaling algorithms, especially on polytopes with certain characteristics. The conic sampling method is then adapted and applied to solve a certain quadratic program, which compute a projection onto a polytope; the proposed method is shown to outperform the proprietary software Mathematica on large, sparse QP problems constructed from mass spectometry-based proteomics. PMID:22952741

  17. Refining Automatically Extracted Knowledge Bases Using Crowdsourcing

    PubMed Central

    Xian, Xuefeng; Cui, Zhiming

    2017-01-01

    Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work in developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how we can use limited human resources to maximize the quality improvement for a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and do inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts to conduct crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods under a reasonable crowdsourcing cost. PMID:28588611

  18. Image-guided filtering for improving photoacoustic tomographic image reconstruction.

    PubMed

    Awasthi, Navchetan; Kalva, Sandeep Kumar; Pramanik, Manojit; Yalavarthy, Phaneendra K

    2018-06-01

    Several algorithms exist to solve the photoacoustic image reconstruction problem depending on the expected reconstructed image features. These reconstruction algorithms promote typically one feature, such as being smooth or sharp, in the output image. Combining these features using a guided filtering approach was attempted in this work, which requires an input and guiding image. This approach act as a postprocessing step to improve commonly used Tikhonov or total variational regularization method. The result obtained from linear backprojection was used as a guiding image to improve these results. Using both numerical and experimental phantom cases, it was shown that the proposed guided filtering approach was able to improve (as high as 11.23 dB) the signal-to-noise ratio of the reconstructed images with the added advantage being computationally efficient. This approach was compared with state-of-the-art basis pursuit deconvolution as well as standard denoising methods and shown to outperform them. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  19. A Space Affine Matching Approach to fMRI Time Series Analysis.

    PubMed

    Chen, Liang; Zhang, Weishi; Liu, Hongbo; Feng, Shigang; Chen, C L Philip; Wang, Huili

    2016-07-01

    For fMRI time series analysis, an important challenge is to overcome the potential delay between hemodynamic response signal and cognitive stimuli signal, namely the same frequency but different phase (SFDP) problem. In this paper, a novel space affine matching feature is presented by introducing the time domain and frequency domain features. The time domain feature is used to discern different stimuli, while the frequency domain feature to eliminate the delay. And then we propose a space affine matching (SAM) algorithm to match fMRI time series by our affine feature, in which a normal vector is estimated using gradient descent to explore the time series matching optimally. The experimental results illustrate that the SAM algorithm is insensitive to the delay between the hemodynamic response signal and the cognitive stimuli signal. Our approach significantly outperforms GLM method while there exists the delay. The approach can help us solve the SFDP problem in fMRI time series matching and thus of great promise to reveal brain dynamics.

  20. An Improved Heuristic Method for Subgraph Isomorphism Problem

    NASA Astrophysics Data System (ADS)

    Xiang, Yingzhuo; Han, Jiesi; Xu, Haijiang; Guo, Xin

    2017-09-01

    This paper focus on the subgraph isomorphism (SI) problem. We present an improved genetic algorithm, a heuristic method to search the optimal solution. The contribution of this paper is that we design a dedicated crossover algorithm and a new fitness function to measure the evolution process. Experiments show our improved genetic algorithm performs better than other heuristic methods. For a large graph, such as a subgraph of 40 nodes, our algorithm outperforms the traditional tree search algorithms. We find that the performance of our improved genetic algorithm does not decrease as the number of nodes in prototype graphs.

  1. SEQUOIA: significance enhanced network querying through context-sensitive random walk and minimization of network conductance.

    PubMed

    Jeong, Hyundoo; Yoon, Byung-Jun

    2017-03-14

    Network querying algorithms provide computational means to identify conserved network modules in large-scale biological networks that are similar to known functional modules, such as pathways or molecular complexes. Two main challenges for network querying algorithms are the high computational complexity of detecting potential isomorphism between the query and the target graphs and ensuring the biological significance of the query results. In this paper, we propose SEQUOIA, a novel network querying algorithm that effectively addresses these issues by utilizing a context-sensitive random walk (CSRW) model for network comparison and minimizing the network conductance of potential matches in the target network. The CSRW model, inspired by the pair hidden Markov model (pair-HMM) that has been widely used for sequence comparison and alignment, can accurately assess the node-to-node correspondence between different graphs by accounting for node insertions and deletions. The proposed algorithm identifies high-scoring network regions based on the CSRW scores, which are subsequently extended by maximally reducing the network conductance of the identified subnetworks. Performance assessment based on real PPI networks and known molecular complexes show that SEQUOIA outperforms existing methods and clearly enhances the biological significance of the query results. The source code and datasets can be downloaded from http://www.ece.tamu.edu/~bjyoon/SEQUOIA .

  2. A permutation-based non-parametric analysis of CRISPR screen data.

    PubMed

    Jia, Gaoxiang; Wang, Xinlei; Xiao, Guanghua

    2017-07-19

    Clustered regularly-interspaced short palindromic repeats (CRISPR) screens are usually implemented in cultured cells to identify genes with critical functions. Although several methods have been developed or adapted to analyze CRISPR screening data, no single specific algorithm has gained popularity. Thus, rigorous procedures are needed to overcome the shortcomings of existing algorithms. We developed a Permutation-Based Non-Parametric Analysis (PBNPA) algorithm, which computes p-values at the gene level by permuting sgRNA labels, and thus it avoids restrictive distributional assumptions. Although PBNPA is designed to analyze CRISPR data, it can also be applied to analyze genetic screens implemented with siRNAs or shRNAs and drug screens. We compared the performance of PBNPA with competing methods on simulated data as well as on real data. PBNPA outperformed recent methods designed for CRISPR screen analysis, as well as methods used for analyzing other functional genomics screens, in terms of Receiver Operating Characteristics (ROC) curves and False Discovery Rate (FDR) control for simulated data under various settings. Remarkably, the PBNPA algorithm showed better consistency and FDR control on published real data as well. PBNPA yields more consistent and reliable results than its competitors, especially when the data quality is low. R package of PBNPA is available at: https://cran.r-project.org/web/packages/PBNPA/ .

  3. Comparative analysis of chemical similarity methods for modular natural products with a hypothetical structure enumeration algorithm.

    PubMed

    Skinnider, Michael A; Dejong, Chris A; Franczak, Brian C; McNicholas, Paul D; Magarvey, Nathan A

    2017-08-16

    Natural products represent a prominent source of pharmaceutically and industrially important agents. Calculating the chemical similarity of two molecules is a central task in cheminformatics, with applications at multiple stages of the drug discovery pipeline. Quantifying the similarity of natural products is a particularly important problem, as the biological activities of these molecules have been extensively optimized by natural selection. The large and structurally complex scaffolds of natural products distinguish their physical and chemical properties from those of synthetic compounds. However, no analysis of the performance of existing methods for molecular similarity calculation specific to natural products has been reported to date. Here, we present LEMONS, an algorithm for the enumeration of hypothetical modular natural product structures. We leverage this algorithm to conduct a comparative analysis of molecular similarity methods within the unique chemical space occupied by modular natural products using controlled synthetic data, and comprehensively investigate the impact of diverse biosynthetic parameters on similarity search. We additionally investigate a recently described algorithm for natural product retrobiosynthesis and alignment, and find that when rule-based retrobiosynthesis can be applied, this approach outperforms conventional two-dimensional fingerprints, suggesting it may represent a valuable approach for the targeted exploration of natural product chemical space and microbial genome mining. Our open-source algorithm is an extensible method of enumerating hypothetical natural product structures with diverse potential applications in bioinformatics.

  4. Weighted community detection and data clustering using message passing

    NASA Astrophysics Data System (ADS)

    Shi, Cheng; Liu, Yanchen; Zhang, Pan

    2018-03-01

    Grouping objects into clusters based on the similarities or weights between them is one of the most important problems in science and engineering. In this work, by extending message-passing algorithms and spectral algorithms proposed for an unweighted community detection problem, we develop a non-parametric method based on statistical physics, by mapping the problem to the Potts model at the critical temperature of spin-glass transition and applying belief propagation to solve the marginals corresponding to the Boltzmann distribution. Our algorithm is robust to over-fitting and gives a principled way to determine whether there are significant clusters in the data and how many clusters there are. We apply our method to different clustering tasks. In the community detection problem in weighted and directed networks, we show that our algorithm significantly outperforms existing algorithms. In the clustering problem, where the data were generated by mixture models in the sparse regime, we show that our method works all the way down to the theoretical limit of detectability and gives accuracy very close to that of the optimal Bayesian inference. In the semi-supervised clustering problem, our method only needs several labels to work perfectly in classic datasets. Finally, we further develop Thouless-Anderson-Palmer equations which heavily reduce the computation complexity in dense networks but give almost the same performance as belief propagation.

  5. Inverting Monotonic Nonlinearities by Entropy Maximization

    PubMed Central

    López-de-Ipiña Pena, Karmele; Caiafa, Cesar F.

    2016-01-01

    This paper proposes a new method for blind inversion of a monotonic nonlinear map applied to a sum of random variables. Such kinds of mixtures of random variables are found in source separation and Wiener system inversion problems, for example. The importance of our proposed method is based on the fact that it permits to decouple the estimation of the nonlinear part (nonlinear compensation) from the estimation of the linear one (source separation matrix or deconvolution filter), which can be solved by applying any convenient linear algorithm. Our new nonlinear compensation algorithm, the MaxEnt algorithm, generalizes the idea of Gaussianization of the observation by maximizing its entropy instead. We developed two versions of our algorithm based either in a polynomial or a neural network parameterization of the nonlinear function. We provide a sufficient condition on the nonlinear function and the probability distribution that gives a guarantee for the MaxEnt method to succeed compensating the distortion. Through an extensive set of simulations, MaxEnt is compared with existing algorithms for blind approximation of nonlinear maps. Experiments show that MaxEnt is able to successfully compensate monotonic distortions outperforming other methods in terms of the obtained Signal to Noise Ratio in many important cases, for example when the number of variables in a mixture is small. Besides its ability for compensating nonlinearities, MaxEnt is very robust, i.e. showing small variability in the results. PMID:27780261

  6. Inverting Monotonic Nonlinearities by Entropy Maximization.

    PubMed

    Solé-Casals, Jordi; López-de-Ipiña Pena, Karmele; Caiafa, Cesar F

    2016-01-01

    This paper proposes a new method for blind inversion of a monotonic nonlinear map applied to a sum of random variables. Such kinds of mixtures of random variables are found in source separation and Wiener system inversion problems, for example. The importance of our proposed method is based on the fact that it permits to decouple the estimation of the nonlinear part (nonlinear compensation) from the estimation of the linear one (source separation matrix or deconvolution filter), which can be solved by applying any convenient linear algorithm. Our new nonlinear compensation algorithm, the MaxEnt algorithm, generalizes the idea of Gaussianization of the observation by maximizing its entropy instead. We developed two versions of our algorithm based either in a polynomial or a neural network parameterization of the nonlinear function. We provide a sufficient condition on the nonlinear function and the probability distribution that gives a guarantee for the MaxEnt method to succeed compensating the distortion. Through an extensive set of simulations, MaxEnt is compared with existing algorithms for blind approximation of nonlinear maps. Experiments show that MaxEnt is able to successfully compensate monotonic distortions outperforming other methods in terms of the obtained Signal to Noise Ratio in many important cases, for example when the number of variables in a mixture is small. Besides its ability for compensating nonlinearities, MaxEnt is very robust, i.e. showing small variability in the results.

  7. Methods for reducing interference in the Complementary Learning Systems model: oscillating inhibition and autonomous memory rehearsal.

    PubMed

    Norman, Kenneth A; Newman, Ehren L; Perotte, Adler J

    2005-11-01

    The stability-plasticity problem (i.e. how the brain incorporates new information into its model of the world, while at the same time preserving existing knowledge) has been at the forefront of computational memory research for several decades. In this paper, we critically evaluate how well the Complementary Learning Systems theory of hippocampo-cortical interactions addresses the stability-plasticity problem. We identify two major challenges for the model: Finding a learning algorithm for cortex and hippocampus that enacts selective strengthening of weak memories, and selective punishment of competing memories; and preventing catastrophic forgetting in the case of non-stationary environments (i.e. when items are temporarily removed from the training set). We then discuss potential solutions to these problems: First, we describe a recently developed learning algorithm that leverages neural oscillations to find weak parts of memories (so they can be strengthened) and strong competitors (so they can be punished), and we show how this algorithm outperforms other learning algorithms (CPCA Hebbian learning and Leabra at memorizing overlapping patterns. Second, we describe how autonomous re-activation of memories (separately in cortex and hippocampus) during REM sleep, coupled with the oscillating learning algorithm, can reduce the rate of forgetting of input patterns that are no longer present in the environment. We then present a simple demonstration of how this process can prevent catastrophic interference in an AB-AC learning paradigm.

  8. Discovery of phosphorylation motif mixtures in phosphoproteomics data

    PubMed Central

    Ritz, Anna; Shakhnarovich, Gregory; Salomon, Arthur R.; Raphael, Benjamin J.

    2009-01-01

    Motivation: Modification of proteins via phosphorylation is a primary mechanism for signal transduction in cells. Phosphorylation sites on proteins are determined in part through particular patterns, or motifs, present in the amino acid sequence. Results: We describe an algorithm that simultaneously discovers multiple motifs in a set of peptides that were phosphorylated by several different kinases. Such sets of peptides are routinely produced in proteomics experiments.Our motif-finding algorithm uses the principle of minimum description length to determine a mixture of sequence motifs that distinguish a foreground set of phosphopeptides from a background set of unphosphorylated peptides. We show that our algorithm outperforms existing motif-finding algorithms on synthetic datasets consisting of mixtures of known phosphorylation sites. We also derive a motif specificity score that quantifies whether or not the phosphoproteins containing an instance of a motif have a significant number of known interactions. Application of our motif-finding algorithm to recently published human and mouse proteomic studies recovers several known phosphorylation motifs and reveals a number of novel motifs that are enriched for interactions with a particular kinase or phosphatase. Our tools provide a new approach for uncovering the sequence specificities of uncharacterized kinases or phosphatases. Availability: Software is available at http:/cs.brown.edu/people/braphael/software.html. Contact: aritz@cs.brown.edu; braphael@cs.brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:18996944

  9. Resolution-Adaptive Hybrid MIMO Architectures for Millimeter Wave Communications

    NASA Astrophysics Data System (ADS)

    Choi, Jinseok; Evans, Brian L.; Gatherer, Alan

    2017-12-01

    In this paper, we propose a hybrid analog-digital beamforming architecture with resolution-adaptive ADCs for millimeter wave (mmWave) receivers with large antenna arrays. We adopt array response vectors for the analog combiners and derive ADC bit-allocation (BA) solutions in closed form. The BA solutions reveal that the optimal number of ADC bits is logarithmically proportional to the RF chain's signal-to-noise ratio raised to the 1/3 power. Using the solutions, two proposed BA algorithms minimize the mean square quantization error of received analog signals under a total ADC power constraint. Contributions of this paper include 1) ADC bit-allocation algorithms to improve communication performance of a hybrid MIMO receiver, 2) approximation of the capacity with the BA algorithm as a function of channels, and 3) a worst-case analysis of the ergodic rate of the proposed MIMO receiver that quantifies system tradeoffs and serves as the lower bound. Simulation results demonstrate that the BA algorithms outperform a fixed-ADC approach in both spectral and energy efficiency, and validate the capacity and ergodic rate formula. For a power constraint equivalent to that of fixed 4-bit ADCs, the revised BA algorithm makes the quantization error negligible while achieving 22% better energy efficiency. Having negligible quantization error allows existing state-of-the-art digital beamformers to be readily applied to the proposed system.

  10. Predicting biomedical metadata in CEDAR: A study of Gene Expression Omnibus (GEO).

    PubMed

    Panahiazar, Maryam; Dumontier, Michel; Gevaert, Olivier

    2017-08-01

    A crucial and limiting factor in data reuse is the lack of accurate, structured, and complete descriptions of data, known as metadata. Towards improving the quantity and quality of metadata, we propose a novel metadata prediction framework to learn associations from existing metadata that can be used to predict metadata values. We evaluate our framework in the context of experimental metadata from the Gene Expression Omnibus (GEO). We applied four rule mining algorithms to the most common structured metadata elements (sample type, molecular type, platform, label type and organism) from over 1.3million GEO records. We examined the quality of well supported rules from each algorithm and visualized the dependencies among metadata elements. Finally, we evaluated the performance of the algorithms in terms of accuracy, precision, recall, and F-measure. We found that PART is the best algorithm outperforming Apriori, Predictive Apriori, and Decision Table. All algorithms perform significantly better in predicting class values than the majority vote classifier. We found that the performance of the algorithms is related to the dimensionality of the GEO elements. The average performance of all algorithm increases due of the decreasing of dimensionality of the unique values of these elements (2697 platforms, 537 organisms, 454 labels, 9 molecules, and 5 types). Our work suggests that experimental metadata such as present in GEO can be accurately predicted using rule mining algorithms. Our work has implications for both prospective and retrospective augmentation of metadata quality, which are geared towards making data easier to find and reuse. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Active learning for clinical text classification: is it better than random sampling?

    PubMed

    Figueroa, Rosa L; Zeng-Treitler, Qing; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P

    2012-01-01

    This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty.

  12. Active learning for clinical text classification: is it better than random sampling?

    PubMed Central

    Figueroa, Rosa L; Ngo, Long H; Goryachev, Sergey; Wiechmann, Eduardo P

    2012-01-01

    Objective This study explores active learning algorithms as a way to reduce the requirements for large training sets in medical text classification tasks. Design Three existing active learning algorithms (distance-based (DIST), diversity-based (DIV), and a combination of both (CMB)) were used to classify text from five datasets. The performance of these algorithms was compared to that of passive learning on the five datasets. We then conducted a novel investigation of the interaction between dataset characteristics and the performance results. Measurements Classification accuracy and area under receiver operating characteristics (ROC) curves for each algorithm at different sample sizes were generated. The performance of active learning algorithms was compared with that of passive learning using a weighted mean of paired differences. To determine why the performance varies on different datasets, we measured the diversity and uncertainty of each dataset using relative entropy and correlated the results with the performance differences. Results The DIST and CMB algorithms performed better than passive learning. With a statistical significance level set at 0.05, DIST outperformed passive learning in all five datasets, while CMB was found to be better than passive learning in four datasets. We found strong correlations between the dataset diversity and the DIV performance, as well as the dataset uncertainty and the performance of the DIST algorithm. Conclusion For medical text classification, appropriate active learning algorithms can yield performance comparable to that of passive learning with considerably smaller training sets. In particular, our results suggest that DIV performs better on data with higher diversity and DIST on data with lower uncertainty. PMID:22707743

  13. Preconditioned alternating projection algorithms for maximum a posteriori ECT reconstruction

    NASA Astrophysics Data System (ADS)

    Krol, Andrzej; Li, Si; Shen, Lixin; Xu, Yuesheng

    2012-11-01

    We propose a preconditioned alternating projection algorithm (PAPA) for solving the maximum a posteriori (MAP) emission computed tomography (ECT) reconstruction problem. Specifically, we formulate the reconstruction problem as a constrained convex optimization problem with the total variation (TV) regularization. We then characterize the solution of the constrained convex optimization problem and show that it satisfies a system of fixed-point equations defined in terms of two proximity operators raised from the convex functions that define the TV-norm and the constraint involved in the problem. The characterization (of the solution) via the proximity operators that define two projection operators naturally leads to an alternating projection algorithm for finding the solution. For efficient numerical computation, we introduce to the alternating projection algorithm a preconditioning matrix (the EM-preconditioner) for the dense system matrix involved in the optimization problem. We prove theoretically convergence of the PAPA. In numerical experiments, performance of our algorithms, with an appropriately selected preconditioning matrix, is compared with performance of the conventional MAP expectation-maximization (MAP-EM) algorithm with TV regularizer (EM-TV) and that of the recently developed nested EM-TV algorithm for ECT reconstruction. Based on the numerical experiments performed in this work, we observe that the alternating projection algorithm with the EM-preconditioner outperforms significantly the EM-TV in all aspects including the convergence speed, the noise in the reconstructed images and the image quality. It also outperforms the nested EM-TV in the convergence speed while providing comparable image quality.

  14. An artificial bioindicator system for network intrusion detection.

    PubMed

    Blum, Christian; Lozano, José A; Davidson, Pedro Pinacho

    An artificial bioindicator system is developed in order to solve a network intrusion detection problem. The system, inspired by an ecological approach to biological immune systems, evolves a population of agents that learn to survive in their environment. An adaptation process allows the transformation of the agent population into a bioindicator that is capable of reacting to system anomalies. Two characteristics stand out in our proposal. On the one hand, it is able to discover new, previously unseen attacks, and on the other hand, contrary to most of the existing systems for network intrusion detection, it does not need any previous training. We experimentally compare our proposal with three state-of-the-art algorithms and show that it outperforms the competing approaches on widely used benchmark data.

  15. Sub-pattern based multi-manifold discriminant analysis for face recognition

    NASA Astrophysics Data System (ADS)

    Dai, Jiangyan; Guo, Changlu; Zhou, Wei; Shi, Yanjiao; Cong, Lin; Yi, Yugen

    2018-04-01

    In this paper, we present a Sub-pattern based Multi-manifold Discriminant Analysis (SpMMDA) algorithm for face recognition. Unlike existing Multi-manifold Discriminant Analysis (MMDA) approach which is based on holistic information of face image for recognition, SpMMDA operates on sub-images partitioned from the original face image and then extracts the discriminative local feature from the sub-images separately. Moreover, the structure information of different sub-images from the same face image is considered in the proposed method with the aim of further improve the recognition performance. Extensive experiments on three standard face databases (Extended YaleB, CMU PIE and AR) demonstrate that the proposed method is effective and outperforms some other sub-pattern based face recognition methods.

  16. Genetic Algorithms Evolve Optimized Transforms for Signal Processing Applications

    DTIC Science & Technology

    2005-04-01

    coefficient sets describing inverse transforms and matched forward/ inverse transform pairs that consistently outperform wavelets for image compression and reconstruction applications under conditions subject to quantization error.

  17. High-precision approach to localization scheme of visible light communication based on artificial neural networks and modified genetic algorithms

    NASA Astrophysics Data System (ADS)

    Guan, Weipeng; Wu, Yuxiang; Xie, Canyu; Chen, Hao; Cai, Ye; Chen, Yingcong

    2017-10-01

    An indoor positioning algorithm based on visible light communication (VLC) is presented. This algorithm is used to calculate a three-dimensional (3-D) coordinate of an indoor optical wireless environment, which includes sufficient orders of multipath reflections from reflecting surfaces of the room. Leveraging the global optimization ability of the genetic algorithm (GA), an innovative framework for 3-D position estimation based on a modified genetic algorithm is proposed. Unlike other techniques using VLC for positioning, the proposed system can achieve indoor 3-D localization without making assumptions about the height or acquiring the orientation angle of the mobile terminal. Simulation results show that an average localization error of less than 1.02 cm can be achieved. In addition, in most VLC-positioning systems, the effect of reflection is always neglected and its performance is limited by reflection, which makes the results not so accurate for a real scenario and the positioning errors at the corners are relatively larger than other places. So, we take the first-order reflection into consideration and use artificial neural network to match the model of a nonlinear channel. The studies show that under the nonlinear matching of direct and reflected channels the average positioning errors of four corners decrease from 11.94 to 0.95 cm. The employed algorithm is emerged as an effective and practical method for indoor localization and outperform other existing indoor wireless localization approaches.

  18. AUI&GIV: Recommendation with Asymmetric User Influence and Global Importance Value.

    PubMed

    Zhao, Zhi-Lin; Wang, Chang-Dong; Lai, Jian-Huang

    2016-01-01

    The user-based collaborative filtering (CF) algorithm is one of the most popular approaches for making recommendation. Despite its success, the traditional user-based CF algorithm suffers one serious problem that it only measures the influence between two users based on their symmetric similarities calculated by their consumption histories. It means that, for a pair of users, the influences on each other are the same, which however may not be true. Intuitively, an expert may have an impact on a novice user but a novice user may not affect an expert at all. Besides, each user may possess a global importance factor that affects his/her influence to the remaining users. To this end, in this paper, we propose an asymmetric user influence model to measure the directed influence between two users and adopt the PageRank algorithm to calculate the global importance value of each user. And then the directed influence values and the global importance values are integrated to deduce the final influence values between two users. Finally, we use the final influence values to improve the performance of the traditional user-based CF algorithm. Extensive experiments have been conducted, the results of which have confirmed that both the asymmetric user influence model and global importance value play key roles in improving recommendation accuracy, and hence the proposed method significantly outperforms the existing recommendation algorithms, in particular the user-based CF algorithm on the datasets of high rating density.

  19. AUI&GIV: Recommendation with Asymmetric User Influence and Global Importance Value

    PubMed Central

    Zhao, Zhi-Lin; Wang, Chang-Dong; Lai, Jian-Huang

    2016-01-01

    The user-based collaborative filtering (CF) algorithm is one of the most popular approaches for making recommendation. Despite its success, the traditional user-based CF algorithm suffers one serious problem that it only measures the influence between two users based on their symmetric similarities calculated by their consumption histories. It means that, for a pair of users, the influences on each other are the same, which however may not be true. Intuitively, an expert may have an impact on a novice user but a novice user may not affect an expert at all. Besides, each user may possess a global importance factor that affects his/her influence to the remaining users. To this end, in this paper, we propose an asymmetric user influence model to measure the directed influence between two users and adopt the PageRank algorithm to calculate the global importance value of each user. And then the directed influence values and the global importance values are integrated to deduce the final influence values between two users. Finally, we use the final influence values to improve the performance of the traditional user-based CF algorithm. Extensive experiments have been conducted, the results of which have confirmed that both the asymmetric user influence model and global importance value play key roles in improving recommendation accuracy, and hence the proposed method significantly outperforms the existing recommendation algorithms, in particular the user-based CF algorithm on the datasets of high rating density. PMID:26828803

  20. Effective hybrid teaching-learning-based optimization algorithm for balancing two-sided assembly lines with multiple constraints

    NASA Astrophysics Data System (ADS)

    Tang, Qiuhua; Li, Zixiang; Zhang, Liping; Floudas, C. A.; Cao, Xiaojun

    2015-09-01

    Due to the NP-hardness of the two-sided assembly line balancing (TALB) problem, multiple constraints existing in real applications are less studied, especially when one task is involved with several constraints. In this paper, an effective hybrid algorithm is proposed to address the TALB problem with multiple constraints (TALB-MC). Considering the discrete attribute of TALB-MC and the continuous attribute of the standard teaching-learning-based optimization (TLBO) algorithm, the random-keys method is hired in task permutation representation, for the purpose of bridging the gap between them. Subsequently, a special mechanism for handling multiple constraints is developed. In the mechanism, the directions constraint of each task is ensured by the direction check and adjustment. The zoning constraints and the synchronism constraints are satisfied by teasing out the hidden correlations among constraints. The positional constraint is allowed to be violated to some extent in decoding and punished in cost function. Finally, with the TLBO seeking for the global optimum, the variable neighborhood search (VNS) is further hybridized to extend the local search space. The experimental results show that the proposed hybrid algorithm outperforms the late acceptance hill-climbing algorithm (LAHC) for TALB-MC in most cases, especially for large-size problems with multiple constraints, and demonstrates well balance between the exploration and the exploitation. This research proposes an effective and efficient algorithm for solving TALB-MC problem by hybridizing the TLBO and VNS.

  1. Analysis of high-order SNP barcodes in mitochondrial D-loop for chronic dialysis susceptibility.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Da; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2016-10-01

    Positively identifying disease-associated single nucleotide polymorphism (SNP) markers in genome-wide studies entails the complex association analysis of a huge number of SNPs. Such large numbers of SNP barcode (SNP/genotype combinations) continue to pose serious computational challenges, especially for high-dimensional data. We propose a novel exploiting SNP barcode method based on differential evolution, termed IDE (improved differential evolution). IDE uses a "top combination strategy" to improve the ability of differential evolution to explore high-order SNP barcodes in high-dimensional data. We simulate disease data and use real chronic dialysis data to test four global optimization algorithms. In 48 simulated disease models, we show that IDE outperforms existing global optimization algorithms in terms of exploring ability and power to detect the specific SNP/genotype combinations with a maximum difference between cases and controls. In real data, we show that IDE can be used to evaluate the relative effects of each individual SNP on disease susceptibility. IDE generated significant SNP barcode with less computational complexity than the other algorithms, making IDE ideally suited for analysis of high-order SNP barcodes. Copyright © 2016 Elsevier Inc. All rights reserved.

  2. A Hybrid Ant Colony Optimization Algorithm for the Extended Capacitated Arc Routing Problem.

    PubMed

    Li-Ning Xing; Rohlfshagen, P; Ying-Wu Chen; Xin Yao

    2011-08-01

    The capacitated arc routing problem (CARP) is representative of numerous practical applications, and in order to widen its scope, we consider an extended version of this problem that entails both total service time and fixed investment costs. We subsequently propose a hybrid ant colony optimization (ACO) algorithm (HACOA) to solve instances of the extended CARP. This approach is characterized by the exploitation of heuristic information, adaptive parameters, and local optimization techniques: Two kinds of heuristic information, arc cluster information and arc priority information, are obtained continuously from the solutions sampled to guide the subsequent optimization process. The adaptive parameters ease the burden of choosing initial values and facilitate improved and more robust results. Finally, local optimization, based on the two-opt heuristic, is employed to improve the overall performance of the proposed algorithm. The resulting HACOA is tested on four sets of benchmark problems containing a total of 87 instances with up to 140 nodes and 380 arcs. In order to evaluate the effectiveness of the proposed method, some existing capacitated arc routing heuristics are extended to cope with the extended version of this problem; the experimental results indicate that the proposed ACO method outperforms these heuristics.

  3. An IMM-Aided ZUPT Methodology for an INS/DVL Integrated Navigation System.

    PubMed

    Yao, Yiqing; Xu, Xiaosu; Xu, Xiang

    2017-09-05

    Inertial navigation system (INS)/Doppler velocity log (DVL) integration is the most common navigation solution for underwater vehicles. Due to the complex underwater environment, the velocity information provided by DVL always contains some errors. To improve navigation accuracy, zero velocity update (ZUPT) technology is considered, which is an effective algorithm for land vehicles to mitigate the navigation error during the pure INS mode. However, in contrast to ground vehicles, the ZUPT solution cannot be used directly for underwater vehicles because of the existence of the water current. In order to leverage the strengths of the ZUPT method and the INS/DVL solution, an interactive multiple model (IMM)-aided ZUPT methodology for the INS/DVL-integrated underwater navigation system is proposed. Both the INS/DVL and INS/ZUPT models are constructed and operated in parallel, with weights calculated according to their innovations and innovation covariance matrices. Simulations are conducted to evaluate the proposed algorithm. The results indicate that the IMM-aided ZUPT solution outperforms both the INS/DVL solution and the INS/ZUPT solution in the underwater environment, which can properly distinguish between the ZUPT and non-ZUPT conditions. In addition, during DVL outage, the effectiveness of the proposed algorithm is also verified.

  4. Holoentropy enabled-decision tree for automatic classification of diabetic retinopathy using retinal fundus images.

    PubMed

    Mane, Vijay Mahadeo; Jadhav, D V

    2017-05-24

    Diabetic retinopathy (DR) is the most common diabetic eye disease. Doctors are using various test methods to detect DR. But, the availability of test methods and requirements of domain experts pose a new challenge in the automatic detection of DR. In order to fulfill this objective, a variety of algorithms has been developed in the literature. In this paper, we propose a system consisting of a novel sparking process and a holoentropy-based decision tree for automatic classification of DR images to further improve the effectiveness. The sparking process algorithm is developed for automatic segmentation of blood vessels through the estimation of optimal threshold. The holoentropy enabled decision tree is newly developed for automatic classification of retinal images into normal or abnormal using hybrid features which preserve the disease-level patterns even more than the signal level of the feature. The effectiveness of the proposed system is analyzed using standard fundus image databases DIARETDB0 and DIARETDB1 for sensitivity, specificity and accuracy. The proposed system yields sensitivity, specificity and accuracy values of 96.72%, 97.01% and 96.45%, respectively. The experimental result reveals that the proposed technique outperforms the existing algorithms.

  5. Real-time traffic sign detection and recognition

    NASA Astrophysics Data System (ADS)

    Herbschleb, Ernst; de With, Peter H. N.

    2009-01-01

    The continuous growth of imaging databases increasingly requires analysis tools for extraction of features. In this paper, a new architecture for the detection of traffic signs is proposed. The architecture is designed to process a large database with tens of millions of images with a resolution up to 4,800x2,400 pixels. Because of the size of the database, a high reliability as well as a high throughput is required. The novel architecture consists of a three-stage algorithm with multiple steps per stage, combining both color and specific spatial information. The first stage contains an area-limitation step which is performance critical in both the detection rate as the overall processing time. The second stage locates suggestions for traffic signs using recently published feature processing. The third stage contains a validation step to enhance reliability of the algorithm. During this stage, the traffic signs are recognized. Experiments show a convincing detection rate of 99%. With respect to computational speed, the throughput for line-of-sight images of 800×600 pixels is 35 Hz and for panorama images it is 4 Hz. Our novel architecture outperforms existing algorithms, with respect to both detection rate and throughput

  6. MULTINEST: an efficient and robust Bayesian inference tool for cosmology and particle physics

    NASA Astrophysics Data System (ADS)

    Feroz, F.; Hobson, M. P.; Bridges, M.

    2009-10-01

    We present further development and the first public release of our multimodal nested sampling algorithm, called MULTINEST. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson, which itself significantly outperformed existing Markov chain Monte Carlo techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MULTINEST algorithm are demonstrated by application to two toy problems and to a cosmological inference problem focusing on the extension of the vanilla Λ cold dark matter model to include spatial curvature and a varying equation of state for dark energy. The MULTINEST software, which is fully parallelized using MPI and includes an interface to COSMOMC, is available at http://www.mrao.cam.ac.uk/software/multinest/. It will also be released as part of the SUPERBAYES package, for the analysis of supersymmetric theories of particle physics, at http://www.superbayes.org.

  7. Improved HDRG decoders for qudit and non-Abelian quantum error correction

    NASA Astrophysics Data System (ADS)

    Hutter, Adrian; Loss, Daniel; Wootton, James R.

    2015-03-01

    Hard-decision renormalization group (HDRG) decoders are an important class of decoding algorithms for topological quantum error correction. Due to their versatility, they have been used to decode systems with fractal logical operators, color codes, qudit topological codes, and non-Abelian systems. In this work, we develop a method of performing HDRG decoding which combines strengths of existing decoders and further improves upon them. In particular, we increase the minimal number of errors necessary for a logical error in a system of linear size L from \\Theta ({{L}2/3}) to Ω ({{L}1-ε }) for any ε \\gt 0. We apply our algorithm to decoding D({{{Z}}d}) quantum double models and a non-Abelian anyon model with Fibonacci-like fusion rules, and show that it indeed significantly outperforms previous HDRG decoders. Furthermore, we provide the first study of continuous error correction with imperfect syndrome measurements for the D({{{Z}}d}) quantum double models. The parallelized runtime of our algorithm is poly(log L) for the perfect measurement case. In the continuous case with imperfect syndrome measurements, the averaged runtime is O(1) for Abelian systems, while continuous error correction for non-Abelian anyons stays an open problem.

  8. Learning to rank using user clicks and visual features for image retrieval.

    PubMed

    Yu, Jun; Tao, Dacheng; Wang, Meng; Rui, Yong

    2015-04-01

    The inconsistency between textual features and visual contents can cause poor image search results. To solve this problem, click features, which are more reliable than textual information in justifying the relevance between a query and clicked images, are adopted in image ranking model. However, the existing ranking model cannot integrate visual features, which are efficient in refining the click-based search results. In this paper, we propose a novel ranking model based on the learning to rank framework. Visual features and click features are simultaneously utilized to obtain the ranking model. Specifically, the proposed approach is based on large margin structured output learning and the visual consistency is integrated with the click features through a hypergraph regularizer term. In accordance with the fast alternating linearization method, we design a novel algorithm to optimize the objective function. This algorithm alternately minimizes two different approximations of the original objective function by keeping one function unchanged and linearizing the other. We conduct experiments on a large-scale dataset collected from the Microsoft Bing image search engine, and the results demonstrate that the proposed learning to rank models based on visual features and user clicks outperforms state-of-the-art algorithms.

  9. A scalable and practical one-pass clustering algorithm for recommender system

    NASA Astrophysics Data System (ADS)

    Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali

    2015-12-01

    KMeans clustering-based recommendation algorithms have been proposed claiming to increase the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate the incremental updates with the arrival of new data, making them unsuitable for the dynamic environments. From this line of research, a new clustering algorithm called One-Pass is proposed, which is a simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.

  10. Simultaneous Nonrigid Registration, Segmentation, and Tumor Detection in MRI Guided Cervical Cancer Radiation Therapy

    PubMed Central

    Lu, Chao; Chelikani, Sudhakar; Jaffray, David A.; Milosevic, Michael F.; Staib, Lawrence H.; Duncan, James S.

    2013-01-01

    External beam radiation therapy (EBRT) for the treatment of cancer enables accurate placement of radiation dose on the cancerous region. However, the deformation of soft tissue during the course of treatment, such as in cervical cancer, presents significant challenges for the delineation of the target volume and other structures of interest. Furthermore, the presence and regression of pathologies such as tumors may violate registration constraints and cause registration errors. In this paper, automatic segmentation, nonrigid registration and tumor detection in cervical magnetic resonance (MR) data are addressed simultaneously using a unified Bayesian framework. The proposed novel method can generate a tumor probability map while progressively identifying the boundary of an organ of interest based on the achieved nonrigid transformation. The method is able to handle the challenges of significant tumor regression and its effect on surrounding tissues. The new method was compared to various currently existing algorithms on a set of 36 MR data from six patients, each patient has six T2-weighted MR cervical images. The results show that the proposed approach achieves an accuracy comparable to manual segmentation and it significantly outperforms the existing registration algorithms. In addition, the tumor detection result generated by the proposed method has a high agreement with manual delineation by a qualified clinician. PMID:22328178

  11. Complexity of the Quantum Adiabatic Algorithm

    NASA Technical Reports Server (NTRS)

    Hen, Itay

    2013-01-01

    The Quantum Adiabatic Algorithm (QAA) has been proposed as a mechanism for efficiently solving optimization problems on a quantum computer. Since adiabatic computation is analog in nature and does not require the design and use of quantum gates, it can be thought of as a simpler and perhaps more profound method for performing quantum computations that might also be easier to implement experimentally. While these features have generated substantial research in QAA, to date there is still a lack of solid evidence that the algorithm can outperform classical optimization algorithms.

  12. pyJac: Analytical Jacobian generator for chemical kinetics

    NASA Astrophysics Data System (ADS)

    Niemeyer, Kyle E.; Curtis, Nicholas J.; Sung, Chih-Jen

    2017-06-01

    Accurate simulations of combustion phenomena require the use of detailed chemical kinetics in order to capture limit phenomena such as ignition and extinction as well as predict pollutant formation. However, the chemical kinetic models for hydrocarbon fuels of practical interest typically have large numbers of species and reactions and exhibit high levels of mathematical stiffness in the governing differential equations, particularly for larger fuel molecules. In order to integrate the stiff equations governing chemical kinetics, generally reactive-flow simulations rely on implicit algorithms that require frequent Jacobian matrix evaluations. Some in situ and a posteriori computational diagnostics methods also require accurate Jacobian matrices, including computational singular perturbation and chemical explosive mode analysis. Typically, finite differences numerically approximate these, but for larger chemical kinetic models this poses significant computational demands since the number of chemical source term evaluations scales with the square of species count. Furthermore, existing analytical Jacobian tools do not optimize evaluations or support emerging SIMD processors such as GPUs. Here we introduce pyJac, a Python-based open-source program that generates analytical Jacobian matrices for use in chemical kinetics modeling and analysis. In addition to producing the necessary customized source code for evaluating reaction rates (including all modern reaction rate formulations), the chemical source terms, and the Jacobian matrix, pyJac uses an optimized evaluation order to minimize computational and memory operations. As a demonstration, we first establish the correctness of the Jacobian matrices for kinetic models of hydrogen, methane, ethylene, and isopentanol oxidation (number of species ranging 13-360) by showing agreement within 0.001% of matrices obtained via automatic differentiation. We then demonstrate the performance achievable on CPUs and GPUs using pyJac via matrix evaluation timing comparisons; the routines produced by pyJac outperformed first-order finite differences by 3-7.5 times and the existing analytical Jacobian software TChem by 1.1-2.2 times on a single-threaded basis. It is noted that TChem is not thread-safe, while pyJac is easily parallelized, and hence can greatly outperform TChem on multicore CPUs. The Jacobian matrix generator we describe here will be useful for reducing the cost of integrating chemical source terms with implicit algorithms in particular and algorithms that require an accurate Jacobian matrix in general. Furthermore, the open-source release of the program and Python-based implementation will enable wide adoption.

  13. Clustering algorithm for determining community structure in large networks

    NASA Astrophysics Data System (ADS)

    Pujol, Josep M.; Béjar, Javier; Delgado, Jordi

    2006-07-01

    We propose an algorithm to find the community structure in complex networks based on the combination of spectral analysis and modularity optimization. The clustering produced by our algorithm is as accurate as the best algorithms on the literature of modularity optimization; however, the main asset of the algorithm is its efficiency. The best match for our algorithm is Newman’s fast algorithm, which is the reference algorithm for clustering in large networks due to its efficiency. When both algorithms are compared, our algorithm outperforms the fast algorithm both in efficiency and accuracy of the clustering, in terms of modularity. Thus, the results suggest that the proposed algorithm is a good choice to analyze the community structure of medium and large networks in the range of tens and hundreds of thousand vertices.

  14. Hybrid Pareto artificial bee colony algorithm for multi-objective single machine group scheduling problem with sequence-dependent setup times and learning effects.

    PubMed

    Yue, Lei; Guan, Zailin; Saif, Ullah; Zhang, Fei; Wang, Hao

    2016-01-01

    Group scheduling is significant for efficient and cost effective production system. However, there exist setup times between the groups, which require to decrease it by sequencing groups in an efficient way. Current research is focused on a sequence dependent group scheduling problem with an aim to minimize the makespan in addition to minimize the total weighted tardiness simultaneously. In most of the production scheduling problems, the processing time of jobs is assumed as fixed. However, the actual processing time of jobs may be reduced due to "learning effect". The integration of sequence dependent group scheduling problem with learning effects has been rarely considered in literature. Therefore, current research considers a single machine group scheduling problem with sequence dependent setup times and learning effects simultaneously. A novel hybrid Pareto artificial bee colony algorithm (HPABC) with some steps of genetic algorithm is proposed for current problem to get Pareto solutions. Furthermore, five different sizes of test problems (small, small medium, medium, large medium, large) are tested using proposed HPABC. Taguchi method is used to tune the effective parameters of the proposed HPABC for each problem category. The performance of HPABC is compared with three famous multi objective optimization algorithms, improved strength Pareto evolutionary algorithm (SPEA2), non-dominated sorting genetic algorithm II (NSGAII) and particle swarm optimization algorithm (PSO). Results indicate that HPABC outperforms SPEA2, NSGAII and PSO and gives better Pareto optimal solutions in terms of diversity and quality for almost all the instances of the different sizes of problems.

  15. On the improvement of neural cryptography using erroneous transmitted information with error prediction.

    PubMed

    Allam, Ahmed M; Abbas, Hazem M

    2010-12-01

    Neural cryptography deals with the problem of "key exchange" between two neural networks using the mutual learning concept. The two networks exchange their outputs (in bits) and the key between the two communicating parties is eventually represented in the final learned weights, when the two networks are said to be synchronized. Security of neural synchronization is put at risk if an attacker is capable of synchronizing with any of the two parties during the training process. Therefore, diminishing the probability of such a threat improves the reliability of exchanging the output bits through a public channel. The synchronization with feedback algorithm is one of the existing algorithms that enhances the security of neural cryptography. This paper proposes three new algorithms to enhance the mutual learning process. They mainly depend on disrupting the attacker confidence in the exchanged outputs and input patterns during training. The first algorithm is called "Do not Trust My Partner" (DTMP), which relies on one party sending erroneous output bits, with the other party being capable of predicting and correcting this error. The second algorithm is called "Synchronization with Common Secret Feedback" (SCSFB), where inputs are kept partially secret and the attacker has to train its network on input patterns that are different from the training sets used by the communicating parties. The third algorithm is a hybrid technique combining the features of the DTMP and SCSFB. The proposed approaches are shown to outperform the synchronization with feedback algorithm in the time needed for the parties to synchronize.

  16. Multifractal detrending moving-average cross-correlation analysis

    NASA Astrophysics Data System (ADS)

    Jiang, Zhi-Qiang; Zhou, Wei-Xing

    2011-07-01

    There are a number of situations in which several signals are simultaneously recorded in complex systems, which exhibit long-term power-law cross correlations. The multifractal detrended cross-correlation analysis (MFDCCA) approaches can be used to quantify such cross correlations, such as the MFDCCA based on the detrended fluctuation analysis (MFXDFA) method. We develop in this work a class of MFDCCA algorithms based on the detrending moving-average analysis, called MFXDMA. The performances of the proposed MFXDMA algorithms are compared with the MFXDFA method by extensive numerical experiments on pairs of time series generated from bivariate fractional Brownian motions, two-component autoregressive fractionally integrated moving-average processes, and binomial measures, which have theoretical expressions of the multifractal nature. In all cases, the scaling exponents hxy extracted from the MFXDMA and MFXDFA algorithms are very close to the theoretical values. For bivariate fractional Brownian motions, the scaling exponent of the cross correlation is independent of the cross-correlation coefficient between two time series, and the MFXDFA and centered MFXDMA algorithms have comparative performances, which outperform the forward and backward MFXDMA algorithms. For two-component autoregressive fractionally integrated moving-average processes, we also find that the MFXDFA and centered MFXDMA algorithms have comparative performances, while the forward and backward MFXDMA algorithms perform slightly worse. For binomial measures, the forward MFXDMA algorithm exhibits the best performance, the centered MFXDMA algorithms performs worst, and the backward MFXDMA algorithm outperforms the MFXDFA algorithm when the moment order q<0 and underperforms when q>0. We apply these algorithms to the return time series of two stock market indexes and to their volatilities. For the returns, the centered MFXDMA algorithm gives the best estimates of hxy(q) since its hxy(2) is closest to 0.5, as expected, and the MFXDFA algorithm has the second best performance. For the volatilities, the forward and backward MFXDMA algorithms give similar results, while the centered MFXDMA and the MFXDFA algorithms fail to extract rational multifractal nature.

  17. Recognition of pornographic web pages by classifying texts and images.

    PubMed

    Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve

    2007-06-01

    With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.

  18. Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shi, Xuanhua; Luo, Xuan; Liang, Junling

    GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, the consistent asynchronous computing requires locking or atomic operations, leading to significant penalties/overheads when implemented on GPUs. As such, coloring algorithm is adopted to separate the vertices with potential updating conflicts, guaranteeing the consistency/correctness of the parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of a large number of colors generally required for processing a large-scale graph with billions of vertices. We propose a light-weightmore » asynchronous processing framework called Frog with a preprocessing/hybrid coloring model. The fundamental idea is based on Pareto principle (or 80-20 rule) about coloring algorithms as we observed through masses of realworld graph coloring cases. We find that a majority of vertices (about 80%) are colored with only a few colors, such that they can be read and updated in a very high degree of parallelism without violating the sequential consistency. Accordingly, our solution separates the processing of the vertices based on the distribution of colors. In this work, we mainly answer three questions: (1) how to partition the vertices in a sparse graph with maximized parallelism, (2) how to process large-scale graphs that cannot fit into GPU memory, and (3) how to reduce the overhead of data transfers on PCIe while processing each partition. We conduct experiments on real-world data (Amazon, DBLP, YouTube, RoadNet-CA, WikiTalk and Twitter) to evaluate our approach and make comparisons with well-known non-preprocessed (such as Totem, Medusa, MapGraph and Gunrock) and preprocessed (Cusha) approaches, by testing four classical algorithms (BFS, PageRank, SSSP and CC). On all the tested applications and datasets, Frog is able to significantly outperform existing GPU-based graph processing systems except Gunrock and MapGraph. MapGraph gets better performance than Frog when running BFS on RoadNet-CA. The comparison between Gunrock and Frog is inconclusive. Frog can outperform Gunrock more than 1.04X when running PageRank and SSSP, while the advantage of Frog is not obvious when running BFS and CC on some datasets especially for RoadNet-CA.« less

  19. FBP and BPF reconstruction methods for circular X-ray tomography with off-center detector

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schaefer, Dirk; Grass, Michael; Haar, Peter van de

    2011-05-15

    Purpose: Circular scanning with an off-center planar detector is an acquisition scheme that allows to save detector area while keeping a large field of view (FOV). Several filtered back-projection (FBP) algorithms have been proposed earlier. The purpose of this work is to present two newly developed back-projection filtration (BPF) variants and evaluate the image quality of these methods compared to the existing state-of-the-art FBP methods. Methods: The first new BPF algorithm applies redundancy weighting of overlapping opposite projections before differentiation in a single projection. The second one uses the Katsevich-type differentiation involving two neighboring projections followed by redundancy weighting andmore » back-projection. An averaging scheme is presented to mitigate streak artifacts inherent to circular BPF algorithms along the Hilbert filter lines in the off-center transaxial slices of the reconstructions. The image quality is assessed visually on reconstructed slices of simulated and clinical data. Quantitative evaluation studies are performed with the Forbild head phantom by calculating root-mean-squared-deviations (RMSDs) to the voxelized phantom for different detector overlap settings and by investigating the noise resolution trade-off with a wire phantom in the full detector and off-center scenario. Results: The noise-resolution behavior of all off-center reconstruction methods corresponds to their full detector performance with the best resolution for the FDK based methods with the given imaging geometry. With respect to RMSD and visual inspection, the proposed BPF with Katsevich-type differentiation outperforms all other methods for the smallest chosen detector overlap of about 15 mm. The best FBP method is the algorithm that is also based on the Katsevich-type differentiation and subsequent redundancy weighting. For wider overlap of about 40-50 mm, these two algorithms produce similar results outperforming the other three methods. The clinical case with a detector overlap of about 17 mm confirms these results. Conclusions: The BPF-type reconstructions with Katsevich differentiation are widely independent of the size of the detector overlap and give the best results with respect to RMSD and visual inspection for minimal detector overlap. The increased homogeneity will improve correct assessment of lesions in the entire field of view.« less

  20. Approximation algorithms for the min-power symmetric connectivity problem

    NASA Astrophysics Data System (ADS)

    Plotnikov, Roman; Erzin, Adil; Mladenovic, Nenad

    2016-10-01

    We consider the NP-hard problem of synthesis of optimal spanning communication subgraph in a given arbitrary simple edge-weighted graph. This problem occurs in the wireless networks while minimizing the total transmission power consumptions. We propose several new heuristics based on the variable neighborhood search metaheuristic for the approximation solution of the problem. We have performed a numerical experiment where all proposed algorithms have been executed on the randomly generated test samples. For these instances, on average, our algorithms outperform the previously known heuristics.

  1. Statistical Relational Learning to Predict Primary Myocardial Infarction from Electronic Health Records

    PubMed Central

    Weiss, Jeremy C; Page, David; Peissig, Peggy L; Natarajan, Sriraam; McCarty, Catherine

    2013-01-01

    Electronic health records (EHRs) are an emerging relational domain with large potential to improve clinical outcomes. We apply two statistical relational learning (SRL) algorithms to the task of predicting primary myocardial infarction. We show that one SRL algorithm, relational functional gradient boosting, outperforms propositional learners particularly in the medically-relevant high recall region. We observe that both SRL algorithms predict outcomes better than their propositional analogs and suggest how our methods can augment current epidemiological practices. PMID:25360347

  2. Linear-time general decoding algorithm for the surface code

    NASA Astrophysics Data System (ADS)

    Darmawan, Andrew S.; Poulin, David

    2018-05-01

    A quantum error correcting protocol can be substantially improved by taking into account features of the physical noise process. We present an efficient decoder for the surface code which can account for general noise features, including coherences and correlations. We demonstrate that the decoder significantly outperforms the conventional matching algorithm on a variety of noise models, including non-Pauli noise and spatially correlated noise. The algorithm is based on an approximate calculation of the logical channel using a tensor-network description of the noisy state.

  3. Interior search algorithm (ISA): a novel approach for global optimization.

    PubMed

    Gandomi, Amir H

    2014-07-01

    This paper presents the interior search algorithm (ISA) as a novel method for solving optimization tasks. The proposed ISA is inspired by interior design and decoration. The algorithm is different from other metaheuristic algorithms and provides new insight for global optimization. The proposed method is verified using some benchmark mathematical and engineering problems commonly used in the area of optimization. ISA results are further compared with well-known optimization algorithms. The results show that the ISA is efficiently capable of solving optimization problems. The proposed algorithm can outperform the other well-known algorithms. Further, the proposed algorithm is very simple and it only has one parameter to tune. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.

  4. A Hyper-Heuristic Ensemble Method for Static Job-Shop Scheduling.

    PubMed

    Hart, Emma; Sim, Kevin

    2016-01-01

    We describe a new hyper-heuristic method NELLI-GP for solving job-shop scheduling problems (JSSP) that evolves an ensemble of heuristics. The ensemble adopts a divide-and-conquer approach in which each heuristic solves a unique subset of the instance set considered. NELLI-GP extends an existing ensemble method called NELLI by introducing a novel heuristic generator that evolves heuristics composed of linear sequences of dispatching rules: each rule is represented using a tree structure and is itself evolved. Following a training period, the ensemble is shown to outperform both existing dispatching rules and a standard genetic programming algorithm on a large set of new test instances. In addition, it obtains superior results on a set of 210 benchmark problems from the literature when compared to two state-of-the-art hyper-heuristic approaches. Further analysis of the relationship between heuristics in the evolved ensemble and the instances each solves provides new insights into features that might describe similar instances.

  5. RootGraph: a graphic optimization tool for automated image analysis of plant roots

    PubMed Central

    Cai, Jinhai; Zeng, Zhanghui; Connor, Jason N.; Huang, Chun Yuan; Melino, Vanessa; Kumar, Pankaj; Miklavcic, Stanley J.

    2015-01-01

    This paper outlines a numerical scheme for accurate, detailed, and high-throughput image analysis of plant roots. In contrast to existing root image analysis tools that focus on root system-average traits, a novel, fully automated and robust approach for the detailed characterization of root traits, based on a graph optimization process is presented. The scheme, firstly, distinguishes primary roots from lateral roots and, secondly, quantifies a broad spectrum of root traits for each identified primary and lateral root. Thirdly, it associates lateral roots and their properties with the specific primary root from which the laterals emerge. The performance of this approach was evaluated through comparisons with other automated and semi-automated software solutions as well as against results based on manual measurements. The comparisons and subsequent application of the algorithm to an array of experimental data demonstrate that this method outperforms existing methods in terms of accuracy, robustness, and the ability to process root images under high-throughput conditions. PMID:26224880

  6. An adaptive large neighborhood search heuristic for Two-Echelon Vehicle Routing Problems arising in city logistics

    PubMed Central

    Hemmelmayr, Vera C.; Cordeau, Jean-François; Crainic, Teodor Gabriel

    2012-01-01

    In this paper, we propose an adaptive large neighborhood search heuristic for the Two-Echelon Vehicle Routing Problem (2E-VRP) and the Location Routing Problem (LRP). The 2E-VRP arises in two-level transportation systems such as those encountered in the context of city logistics. In such systems, freight arrives at a major terminal and is shipped through intermediate satellite facilities to the final customers. The LRP can be seen as a special case of the 2E-VRP in which vehicle routing is performed only at the second level. We have developed new neighborhood search operators by exploiting the structure of the two problem classes considered and have also adapted existing operators from the literature. The operators are used in a hierarchical scheme reflecting the multi-level nature of the problem. Computational experiments conducted on several sets of instances from the literature show that our algorithm outperforms existing solution methods for the 2E-VRP and achieves excellent results on the LRP. PMID:23483764

  7. Multiuser receiver for DS-CDMA signals in multipath channels: an enhanced multisurface method.

    PubMed

    Mahendra, Chetan; Puthusserypady, Sadasivan

    2006-11-01

    This paper deals with the problem of multiuser detection in direct-sequence code-division multiple-access (DS-CDMA) systems in multipath environments. The existing multiuser detectors can be divided into two categories: (1) low-complexity poor-performance linear detectors and (2) high-complexity good-performance nonlinear detectors. In particular, in channels where the orthogonality of the code sequences is destroyed by multipath, detectors with linear complexity perform much worse than the nonlinear detectors. In this paper, we propose an enhanced multisurface method (EMSM) for multiuser detection in multipath channels. EMSM is an intermediate piecewise linear detection scheme with a run-time complexity linear in the number of users. Its bit error rate performance is compared with existing linear detectors, a nonlinear radial basis function detector trained by the new support vector learning algorithm, and Verdu's optimal detector. Simulations in multipath channels, for both synchronous and asynchronous cases, indicate that it always outperforms all other linear detectors, performing nearly as well as nonlinear detectors.

  8. An adaptive large neighborhood search heuristic for Two-Echelon Vehicle Routing Problems arising in city logistics.

    PubMed

    Hemmelmayr, Vera C; Cordeau, Jean-François; Crainic, Teodor Gabriel

    2012-12-01

    In this paper, we propose an adaptive large neighborhood search heuristic for the Two-Echelon Vehicle Routing Problem (2E-VRP) and the Location Routing Problem (LRP). The 2E-VRP arises in two-level transportation systems such as those encountered in the context of city logistics. In such systems, freight arrives at a major terminal and is shipped through intermediate satellite facilities to the final customers. The LRP can be seen as a special case of the 2E-VRP in which vehicle routing is performed only at the second level. We have developed new neighborhood search operators by exploiting the structure of the two problem classes considered and have also adapted existing operators from the literature. The operators are used in a hierarchical scheme reflecting the multi-level nature of the problem. Computational experiments conducted on several sets of instances from the literature show that our algorithm outperforms existing solution methods for the 2E-VRP and achieves excellent results on the LRP.

  9. Application of XGBoost algorithm in hourly PM2.5 concentration prediction

    NASA Astrophysics Data System (ADS)

    Pan, Bingyue

    2018-02-01

    In view of prediction techniques of hourly PM2.5 concentration in China, this paper applied the XGBoost(Extreme Gradient Boosting) algorithm to predict hourly PM2.5 concentration. The monitoring data of air quality in Tianjin city was analyzed by using XGBoost algorithm. The prediction performance of the XGBoost method is evaluated by comparing observed and predicted PM2.5 concentration using three measures of forecast accuracy. The XGBoost method is also compared with the random forest algorithm, multiple linear regression, decision tree regression and support vector machines for regression models using computational results. The results demonstrate that the XGBoost algorithm outperforms other data mining methods.

  10. The finite state projection algorithm for the solution of the chemical master equation.

    PubMed

    Munsky, Brian; Khammash, Mustafa

    2006-01-28

    This article introduces the finite state projection (FSP) method for use in the stochastic analysis of chemically reacting systems. One can describe the chemical populations of such systems with probability density vectors that evolve according to a set of linear ordinary differential equations known as the chemical master equation (CME). Unlike Monte Carlo methods such as the stochastic simulation algorithm (SSA) or tau leaping, the FSP directly solves or approximates the solution of the CME. If the CME describes a system that has a finite number of distinct population vectors, the FSP method provides an exact analytical solution. When an infinite or extremely large number of population variations is possible, the state space can be truncated, and the FSP method provides a certificate of accuracy for how closely the truncated space approximation matches the true solution. The proposed FSP algorithm systematically increases the projection space in order to meet prespecified tolerance in the total probability density error. For any system in which a sufficiently accurate FSP exists, the FSP algorithm is shown to converge in a finite number of steps. The FSP is utilized to solve two examples taken from the field of systems biology, and comparisons are made between the FSP, the SSA, and tau leaping algorithms. In both examples, the FSP outperforms the SSA in terms of accuracy as well as computational efficiency. Furthermore, due to very small molecular counts in these particular examples, the FSP also performs far more effectively than tau leaping methods.

  11. FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm.

    PubMed

    Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

    2016-01-01

    Two-locus model is a typical significant disease model to be identified in genome-wide association study (GWAS). Due to intensive computational burden and diversity of disease models, existing methods have drawbacks on low detection power, high computation cost, and preference for some types of disease models. In this study, two scoring functions (Bayesian network based K2-score and Gini-score) are used for characterizing two SNP locus as a candidate model, the two criteria are adopted simultaneously for improving identification power and tackling the preference problem to disease models. Harmony search algorithm (HSA) is improved for quickly finding the most likely candidate models among all two-locus models, in which a local search algorithm with two-dimensional tabu table is presented to avoid repeatedly evaluating some disease models that have strong marginal effect. Finally G-test statistic is used to further test the candidate models. We investigate our method named FHSA-SED on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) which have been developed recently based on swarm intelligent search algorithm. The results of simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may be also associated with disease in AMD dataset.

  12. FHSA-SED: Two-Locus Model Detection for Genome-Wide Association Study with Harmony Search Algorithm

    PubMed Central

    Tuo, Shouheng; Zhang, Junying; Yuan, Xiguo; Zhang, Yuanyuan; Liu, Zhaowen

    2016-01-01

    Motivation Two-locus model is a typical significant disease model to be identified in genome-wide association study (GWAS). Due to intensive computational burden and diversity of disease models, existing methods have drawbacks on low detection power, high computation cost, and preference for some types of disease models. Method In this study, two scoring functions (Bayesian network based K2-score and Gini-score) are used for characterizing two SNP locus as a candidate model, the two criteria are adopted simultaneously for improving identification power and tackling the preference problem to disease models. Harmony search algorithm (HSA) is improved for quickly finding the most likely candidate models among all two-locus models, in which a local search algorithm with two-dimensional tabu table is presented to avoid repeatedly evaluating some disease models that have strong marginal effect. Finally G-test statistic is used to further test the candidate models. Results We investigate our method named FHSA-SED on 82 simulated datasets and a real AMD dataset, and compare it with two typical methods (MACOED and CSE) which have been developed recently based on swarm intelligent search algorithm. The results of simulation experiments indicate that our method outperforms the two compared algorithms in terms of detection power, computation time, evaluation times, sensitivity (TPR), specificity (SPC), positive predictive value (PPV) and accuracy (ACC). Our method has identified two SNPs (rs3775652 and rs10511467) that may be also associated with disease in AMD dataset. PMID:27014873

  13. Quality Scalability Aware Watermarking for Visual Content.

    PubMed

    Bhowmik, Deepayan; Abhayaratne, Charith

    2016-11-01

    Scalable coding-based content adaptation poses serious challenges to traditional watermarking algorithms, which do not consider the scalable coding structure and hence cannot guarantee correct watermark extraction in media consumption chain. In this paper, we propose a novel concept of scalable blind watermarking that ensures more robust watermark extraction at various compression ratios while not effecting the visual quality of host media. The proposed algorithm generates scalable and robust watermarked image code-stream that allows the user to constrain embedding distortion for target content adaptations. The watermarked image code-stream consists of hierarchically nested joint distortion-robustness coding atoms. The code-stream is generated by proposing a new wavelet domain blind watermarking algorithm guided by a quantization based binary tree. The code-stream can be truncated at any distortion-robustness atom to generate the watermarked image with the desired distortion-robustness requirements. A blind extractor is capable of extracting watermark data from the watermarked images. The algorithm is further extended to incorporate a bit-plane discarding-based quantization model used in scalable coding-based content adaptation, e.g., JPEG2000. This improves the robustness against quality scalability of JPEG2000 compression. The simulation results verify the feasibility of the proposed concept, its applications, and its improved robustness against quality scalable content adaptation. Our proposed algorithm also outperforms existing methods showing 35% improvement. In terms of robustness to quality scalable video content adaptation using Motion JPEG2000 and wavelet-based scalable video coding, the proposed method shows major improvement for video watermarking.

  14. Using OpenMP vs. Threading Building Blocks for Medical Imaging on Multi-cores

    NASA Astrophysics Data System (ADS)

    Kegel, Philipp; Schellmann, Maraike; Gorlatch, Sergei

    We compare two parallel programming approaches for multi-core systems: the well-known OpenMP and the recently introduced Threading Building Blocks (TBB) library by Intel®. The comparison is made using the parallelization of a real-world numerical algorithm for medical imaging. We develop several parallel implementations, and compare them w.r.t. programming effort, programming style and abstraction, and runtime performance. We show that TBB requires a considerable program re-design, whereas with OpenMP simple compiler directives are sufficient. While TBB appears to be less appropriate for parallelizing existing implementations, it fosters a good programming style and higher abstraction level for newly developed parallel programs. Our experimental measurements on a dual quad-core system demonstrate that OpenMP slightly outperforms TBB in our implementation.

  15. Metaphor Identification in Large Texts Corpora

    PubMed Central

    Neuman, Yair; Assaf, Dan; Cohen, Yohai; Last, Mark; Argamon, Shlomo; Howard, Newton; Frieder, Ophir

    2013-01-01

    Identifying metaphorical language-use (e.g., sweet child) is one of the challenges facing natural language processing. This paper describes three novel algorithms for automatic metaphor identification. The algorithms are variations of the same core algorithm. We evaluate the algorithms on two corpora of Reuters and the New York Times articles. The paper presents the most comprehensive study of metaphor identification in terms of scope of metaphorical phrases and annotated corpora size. Algorithms’ performance in identifying linguistic phrases as metaphorical or literal has been compared to human judgment. Overall, the algorithms outperform the state-of-the-art algorithm with 71% precision and 27% averaged improvement in prediction over the base-rate of metaphors in the corpus. PMID:23658625

  16. Ion trace detection algorithm to extract pure ion chromatograms to improve untargeted peak detection quality for liquid chromatography/time-of-flight mass spectrometry-based metabolomics data.

    PubMed

    Wang, San-Yuan; Kuo, Ching-Hua; Tseng, Yufeng J

    2015-03-03

    Able to detect known and unknown metabolites, untargeted metabolomics has shown great potential in identifying novel biomarkers. However, elucidating all possible liquid chromatography/time-of-flight mass spectrometry (LC/TOF-MS) ion signals in a complex biological sample remains challenging since many ions are not the products of metabolites. Methods of reducing ions not related to metabolites or simply directly detecting metabolite related (pure) ions are important. In this work, we describe PITracer, a novel algorithm that accurately detects the pure ions of a LC/TOF-MS profile to extract pure ion chromatograms and detect chromatographic peaks. PITracer estimates the relative mass difference tolerance of ions and calibrates the mass over charge (m/z) values for peak detection algorithms with an additional option to further mass correction with respect to a user-specified metabolite. PITracer was evaluated using two data sets containing 373 human metabolite standards, including 5 saturated standards considered to be split peaks resultant from huge m/z fluctuation, and 12 urine samples spiked with 50 forensic drugs of varying concentrations. Analysis of these data sets show that PITracer correctly outperformed existing state-of-art algorithm and extracted the pure ion chromatograms of the 5 saturated standards without generating split peaks and detected the forensic drugs with high recall, precision, and F-score and small mass error.

  17. SimBA: simulation algorithm to fit extant-population distributions.

    PubMed

    Parida, Laxmi; Haiminen, Niina

    2015-03-14

    Simulation of populations with specified characteristics such as allele frequencies, linkage disequilibrium etc., is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation is critical to the quality of the output of the applications that use them, accurate algorithms are required to provide a strong foundation to the methods in these studies. In this paper we present SimBA (Simulation using Best-fit Algorithm) a non-generative approach, based on a combination of stochastic techniques and discrete methods. We optimize a hill climbing algorithm and extend the framework to include multiple subpopulation structures. Additionally, we show that SimBA is very sensitive to the input specifications, i.e., very similar but distinct input characteristics result in distinct outputs with high fidelity to the specified distributions. This property of the simulation is not explicitly modeled or studied by previous methods. We show that SimBA outperforms the existing population simulation methods, both in terms of accuracy as well as time-efficiency. Not only does it construct populations that meet the input specifications more stringently than other published methods, SimBA is also easy to use. It does not require explicit parameter adaptations or calibrations. Also, it can work with input specified as distributions, without an exemplar matrix or population as required by some methods. SimBA is available at http://researcher.ibm.com/project/5669 .

  18. Automated detection of extended sources in radio maps: progress from the SCORPIO survey

    NASA Astrophysics Data System (ADS)

    Riggi, S.; Ingallinera, A.; Leto, P.; Cavallaro, F.; Bufano, F.; Schillirò, F.; Trigilio, C.; Umana, G.; Buemi, C. S.; Norris, R. P.

    2016-08-01

    Automated source extraction and parametrization represents a crucial challenge for the next-generation radio interferometer surveys, such as those performed with the Square Kilometre Array (SKA) and its precursors. In this paper, we present a new algorithm, called CAESAR (Compact And Extended Source Automated Recognition), to detect and parametrize extended sources in radio interferometric maps. It is based on a pre-filtering stage, allowing image denoising, compact source suppression and enhancement of diffuse emission, followed by an adaptive superpixel clustering stage for final source segmentation. A parametrization stage provides source flux information and a wide range of morphology estimators for post-processing analysis. We developed CAESAR in a modular software library, also including different methods for local background estimation and image filtering, along with alternative algorithms for both compact and diffuse source extraction. The method was applied to real radio continuum data collected at the Australian Telescope Compact Array (ATCA) within the SCORPIO project, a pathfinder of the Evolutionary Map of the Universe (EMU) survey at the Australian Square Kilometre Array Pathfinder (ASKAP). The source reconstruction capabilities were studied over different test fields in the presence of compact sources, imaging artefacts and diffuse emission from the Galactic plane and compared with existing algorithms. When compared to a human-driven analysis, the designed algorithm was found capable of detecting known target sources and regions of diffuse emission, outperforming alternative approaches over the considered fields.

  19. Dynamic Inertia Weight Binary Bat Algorithm with Neighborhood Search

    PubMed Central

    2017-01-01

    Binary bat algorithm (BBA) is a binary version of the bat algorithm (BA). It has been proven that BBA is competitive compared to other binary heuristic algorithms. Since the update processes of velocity in the algorithm are consistent with BA, in some cases, this algorithm also faces the premature convergence problem. This paper proposes an improved binary bat algorithm (IBBA) to solve this problem. To evaluate the performance of IBBA, standard benchmark functions and zero-one knapsack problems have been employed. The numeric results obtained by benchmark functions experiment prove that the proposed approach greatly outperforms the original BBA and binary particle swarm optimization (BPSO). Compared with several other heuristic algorithms on zero-one knapsack problems, it also verifies that the proposed algorithm is more able to avoid local minima. PMID:28634487

  20. Dynamic Inertia Weight Binary Bat Algorithm with Neighborhood Search.

    PubMed

    Huang, Xingwang; Zeng, Xuewen; Han, Rui

    2017-01-01

    Binary bat algorithm (BBA) is a binary version of the bat algorithm (BA). It has been proven that BBA is competitive compared to other binary heuristic algorithms. Since the update processes of velocity in the algorithm are consistent with BA, in some cases, this algorithm also faces the premature convergence problem. This paper proposes an improved binary bat algorithm (IBBA) to solve this problem. To evaluate the performance of IBBA, standard benchmark functions and zero-one knapsack problems have been employed. The numeric results obtained by benchmark functions experiment prove that the proposed approach greatly outperforms the original BBA and binary particle swarm optimization (BPSO). Compared with several other heuristic algorithms on zero-one knapsack problems, it also verifies that the proposed algorithm is more able to avoid local minima.

  1. Developing a Shuffled Complex-Self Adaptive Hybrid Evolution (SC-SAHEL) Framework for Water Resources Management and Water-Energy System Optimization

    NASA Astrophysics Data System (ADS)

    Rahnamay Naeini, M.; Sadegh, M.; AghaKouchak, A.; Hsu, K. L.; Sorooshian, S.; Yang, T.

    2017-12-01

    Meta-Heuristic optimization algorithms have gained a great deal of attention in a wide variety of fields. Simplicity and flexibility of these algorithms, along with their robustness, make them attractive tools for solving optimization problems. Different optimization methods, however, hold algorithm-specific strengths and limitations. Performance of each individual algorithm obeys the "No-Free-Lunch" theorem, which means a single algorithm cannot consistently outperform all possible optimization problems over a variety of problems. From users' perspective, it is a tedious process to compare, validate, and select the best-performing algorithm for a specific problem or a set of test cases. In this study, we introduce a new hybrid optimization framework, entitled Shuffled Complex-Self Adaptive Hybrid EvoLution (SC-SAHEL), which combines the strengths of different evolutionary algorithms (EAs) in a parallel computing scheme, and allows users to select the most suitable algorithm tailored to the problem at hand. The concept of SC-SAHEL is to execute different EAs as separate parallel search cores, and let all participating EAs to compete during the course of the search. The newly developed SC-SAHEL algorithm is designed to automatically select, the best performing algorithm for the given optimization problem. This algorithm is rigorously effective in finding the global optimum for several strenuous benchmark test functions, and computationally efficient as compared to individual EAs. We benchmark the proposed SC-SAHEL algorithm over 29 conceptual test functions, and two real-world case studies - one hydropower reservoir model and one hydrological model (SAC-SMA). Results show that the proposed framework outperforms individual EAs in an absolute majority of the test problems, and can provide competitive results to the fittest EA algorithm with more comprehensive information during the search. The proposed framework is also flexible for merging additional EAs, boundary-handling techniques, and sampling schemes, and has good potential to be used in Water-Energy system optimal operation and management.

  2. Discovering protein complexes in protein interaction networks via exploring the weak ties effect

    PubMed Central

    2012-01-01

    Background Studying protein complexes is very important in biological processes since it helps reveal the structure-functionality relationships in biological networks and much attention has been paid to accurately predict protein complexes from the increasing amount of protein-protein interaction (PPI) data. Most of the available algorithms are based on the assumption that dense subgraphs correspond to complexes, failing to take into account the inherence organization within protein complex and the roles of edges. Thus, there is a critical need to investigate the possibility of discovering protein complexes using the topological information hidden in edges. Results To provide an investigation of the roles of edges in PPI networks, we show that the edges connecting less similar vertices in topology are more significant in maintaining the global connectivity, indicating the weak ties phenomenon in PPI networks. We further demonstrate that there is a negative relation between the weak tie strength and the topological similarity. By using the bridges, a reliable virtual network is constructed, in which each maximal clique corresponds to the core of a complex. By this notion, the detection of the protein complexes is transformed into a classic all-clique problem. A novel core-attachment based method is developed, which detects the cores and attachments, respectively. A comprehensive comparison among the existing algorithms and our algorithm has been made by comparing the predicted complexes against benchmark complexes. Conclusions We proved that the weak tie effect exists in the PPI network and demonstrated that the density is insufficient to characterize the topological structure of protein complexes. Furthermore, the experimental results on the yeast PPI network show that the proposed method outperforms the state-of-the-art algorithms. The analysis of detected modules by the present algorithm suggests that most of these modules have well biological significance in context of complexes, suggesting that the roles of edges are critical in discovering protein complexes. PMID:23046740

  3. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling.

    PubMed

    Wang, Sheng; Sun, Siqi; Xu, Jinbo

    2016-09-01

    Deep Convolutional Neural Networks (DCNN) has shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC.

  4. AUC-Maximized Deep Convolutional Neural Fields for Protein Sequence Labeling

    PubMed Central

    Wang, Sheng; Sun, Siqi

    2017-01-01

    Deep Convolutional Neural Networks (DCNN) has shown excellent performance in a variety of machine learning tasks. This paper presents Deep Convolutional Neural Fields (DeepCNF), an integration of DCNN with Conditional Random Field (CRF), for sequence labeling with an imbalanced label distribution. The widely-used training methods, such as maximum-likelihood and maximum labelwise accuracy, do not work well on imbalanced data. To handle this, we present a new training algorithm called maximum-AUC for DeepCNF. That is, we train DeepCNF by directly maximizing the empirical Area Under the ROC Curve (AUC), which is an unbiased measurement for imbalanced data. To fulfill this, we formulate AUC in a pairwise ranking framework, approximate it by a polynomial function and then apply a gradient-based procedure to optimize it. Our experimental results confirm that maximum-AUC greatly outperforms the other two training methods on 8-state secondary structure prediction and disorder prediction since their label distributions are highly imbalanced and also has similar performance as the other two training methods on solvent accessibility prediction, which has three equally-distributed labels. Furthermore, our experimental results show that our AUC-trained DeepCNF models greatly outperform existing popular predictors of these three tasks. The data and software related to this paper are available at https://github.com/realbigws/DeepCNF_AUC. PMID:28884168

  5. Binary Interval Search: a scalable algorithm for counting interval intersections

    PubMed Central

    Layer, Ryan M.; Skadron, Kevin; Robins, Gabriel; Hall, Ira M.; Quinlan, Aaron R.

    2013-01-01

    Motivation: The comparison of diverse genomic datasets is fundamental to understand genome biology. Researchers must explore many large datasets of genome intervals (e.g. genes, sequence alignments) to place their experimental results in a broader context and to make new discoveries. Relationships between genomic datasets are typically measured by identifying intervals that intersect, that is, they overlap and thus share a common genome interval. Given the continued advances in DNA sequencing technologies, efficient methods for measuring statistically significant relationships between many sets of genomic features are crucial for future discovery. Results: We introduce the Binary Interval Search (BITS) algorithm, a novel and scalable approach to interval set intersection. We demonstrate that BITS outperforms existing methods at counting interval intersections. Moreover, we show that BITS is intrinsically suited to parallel computing architectures, such as graphics processing units by illustrating its utility for efficient Monte Carlo simulations measuring the significance of relationships between sets of genomic intervals. Availability: https://github.com/arq5x/bits. Contact: arq5x@virginia.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23129298

  6. Period Estimation for Sparsely-sampled Quasi-periodic Light Curves Applied to Miras

    NASA Astrophysics Data System (ADS)

    He, Shiyuan; Yuan, Wenlong; Huang, Jianhua Z.; Long, James; Macri, Lucas M.

    2016-12-01

    We develop a nonlinear semi-parametric Gaussian process model to estimate periods of Miras with sparsely sampled light curves. The model uses a sinusoidal basis for the periodic variation and a Gaussian process for the stochastic changes. We use maximum likelihood to estimate the period and the parameters of the Gaussian process, while integrating out the effects of other nuisance parameters in the model with respect to a suitable prior distribution obtained from earlier studies. Since the likelihood is highly multimodal for period, we implement a hybrid method that applies the quasi-Newton algorithm for Gaussian process parameters and search the period/frequency parameter space over a dense grid. A large-scale, high-fidelity simulation is conducted to mimic the sampling quality of Mira light curves obtained by the M33 Synoptic Stellar Survey. The simulated data set is publicly available and can serve as a testbed for future evaluation of different period estimation methods. The semi-parametric model outperforms an existing algorithm on this simulated test data set as measured by period recovery rate and quality of the resulting period-luminosity relations.

  7. Performance of Case-Based Reasoning Retrieval Using Classification Based on Associations versus Jcolibri and FreeCBR: A Further Validation Study

    NASA Astrophysics Data System (ADS)

    Aljuboori, Ahmed S.; Coenen, Frans; Nsaif, Mohammed; Parsons, David J.

    2018-05-01

    Case-Based Reasoning (CBR) plays a major role in expert system research. However, a critical problem can be met when a CBR system retrieves incorrect cases. Class Association Rules (CARs) have been utilized to offer a potential solution in a previous work. The aim of this paper was to perform further validation of Case-Based Reasoning using a Classification based on Association Rules (CBRAR) to enhance the performance of Similarity Based Retrieval (SBR). The CBRAR strategy uses a classed frequent pattern tree algorithm (FP-CAR) in order to disambiguate wrongly retrieved cases in CBR. The research reported in this paper makes contributions to both fields of CBR and Association Rules Mining (ARM) in that full target cases can be extracted from the FP-CAR algorithm without invoking P-trees and union operations. The dataset used in this paper provided more efficient results when the SBR retrieves unrelated answers. The accuracy of the proposed CBRAR system outperforms the results obtained by existing CBR tools such as Jcolibri and FreeCBR.

  8. HomoTarget: a new algorithm for prediction of microRNA targets in Homo sapiens.

    PubMed

    Ahmadi, Hamed; Ahmadi, Ali; Azimzadeh-Jamalkandi, Sadegh; Shoorehdeli, Mahdi Aliyari; Salehzadeh-Yazdi, Ali; Bidkhori, Gholamreza; Masoudi-Nejad, Ali

    2013-02-01

    MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target-genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, combined pattern recognition neural network (PRNN) and principle component analysis (PCA) architecture has been proposed in order to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the mirBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates. Copyright © 2012 Elsevier Inc. All rights reserved.

  9. Semi-supervised prediction of gene regulatory networks using machine learning algorithms.

    PubMed

    Patel, Nihir; Wang, Jason T L

    2015-10-01

    Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods for GRN prediction by utilizing two machine learning algorithms, namely, support vector machines (SVM) and random forests (RF). The semi-supervised methods make use of unlabelled data for training. We investigated inductive and transductive learning approaches, both of which adopt an iterative procedure to obtain reliable negative training data from the unlabelled data. We then applied our semi-supervised methods to gene expression data of Escherichia coli and Saccharomyces cerevisiae, and evaluated the performance of our methods using the expression data. Our analysis indicated that the transductive learning approach outperformed the inductive learning approach for both organisms. However, there was no conclusive difference identified in the performance of SVM and RF. Experimental results also showed that the proposed semi-supervised methods performed better than existing supervised methods for both organisms.

  10. Evolutionary and Neural Computing Based Decision Support System for Disease Diagnosis from Clinical Data Sets in Medical Practice.

    PubMed

    Sudha, M

    2017-09-27

    As a recent trend, various computational intelligence and machine learning approaches have been used for mining inferences hidden in the large clinical databases to assist the clinician in strategic decision making. In any target data the irrelevant information may be detrimental, causing confusion for the mining algorithm and degrades the prediction outcome. To address this issue, this study attempts to identify an intelligent approach to assist disease diagnostic procedure using an optimal set of attributes instead of all attributes present in the clinical data set. In this proposed Application Specific Intelligent Computing (ASIC) decision support system, a rough set based genetic algorithm is employed in pre-processing phase and a back propagation neural network is applied in training and testing phase. ASIC has two phases, the first phase handles outliers, noisy data, and missing values to obtain a qualitative target data to generate appropriate attribute reduct sets from the input data using rough computing based genetic algorithm centred on a relative fitness function measure. The succeeding phase of this system involves both training and testing of back propagation neural network classifier on the selected reducts. The model performance is evaluated with widely adopted existing classifiers. The proposed ASIC system for clinical decision support has been tested with breast cancer, fertility diagnosis and heart disease data set from the University of California at Irvine (UCI) machine learning repository. The proposed system outperformed the existing approaches attaining the accuracy rate of 95.33%, 97.61%, and 93.04% for breast cancer, fertility issue and heart disease diagnosis.

  11. A Review of Depth and Normal Fusion Algorithms

    PubMed Central

    Štolc, Svorad; Pock, Thomas

    2018-01-01

    Geometric surface information such as depth maps and surface normals can be acquired by various methods such as stereo light fields, shape from shading and photometric stereo techniques. We compare several algorithms which deal with the combination of depth with surface normal information in order to reconstruct a refined depth map. The reasons for performance differences are examined from the perspective of alternative formulations of surface normals for depth reconstruction. We review and analyze methods in a systematic way. Based on our findings, we introduce a new generalized fusion method, which is formulated as a least squares problem and outperforms previous methods in the depth error domain by introducing a novel normal weighting that performs closer to the geodesic distance measure. Furthermore, a novel method is introduced based on Total Generalized Variation (TGV) which further outperforms previous approaches in terms of the geodesic normal distance error and maintains comparable quality in the depth error domain. PMID:29389903

  12. NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

    PubMed

    Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

    2014-01-01

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.

  13. Interior tomography in microscopic CT with image reconstruction constrained by full field of view scan at low spatial resolution

    NASA Astrophysics Data System (ADS)

    Luo, Shouhua; Shen, Tao; Sun, Yi; Li, Jing; Li, Guang; Tang, Xiangyang

    2018-04-01

    In high resolution (microscopic) CT applications, the scan field of view should cover the entire specimen or sample to allow complete data acquisition and image reconstruction. However, truncation may occur in projection data and results in artifacts in reconstructed images. In this study, we propose a low resolution image constrained reconstruction algorithm (LRICR) for interior tomography in microscopic CT at high resolution. In general, the multi-resolution acquisition based methods can be employed to solve the data truncation problem if the project data acquired at low resolution are utilized to fill up the truncated projection data acquired at high resolution. However, most existing methods place quite strict restrictions on the data acquisition geometry, which greatly limits their utility in practice. In the proposed LRICR algorithm, full and partial data acquisition (scan) at low and high resolutions, respectively, are carried out. Using the image reconstructed from sparse projection data acquired at low resolution as the prior, a microscopic image at high resolution is reconstructed from the truncated projection data acquired at high resolution. Two synthesized digital phantoms, a raw bamboo culm and a specimen of mouse femur, were utilized to evaluate and verify performance of the proposed LRICR algorithm. Compared with the conventional TV minimization based algorithm and the multi-resolution scout-reconstruction algorithm, the proposed LRICR algorithm shows significant improvement in reduction of the artifacts caused by data truncation, providing a practical solution for high quality and reliable interior tomography in microscopic CT applications. The proposed LRICR algorithm outperforms the multi-resolution scout-reconstruction method and the TV minimization based reconstruction for interior tomography in microscopic CT.

  14. A hybrid algorithm for speckle noise reduction of ultrasound images.

    PubMed

    Singh, Karamjeet; Ranade, Sukhjeet Kaur; Singh, Chandan

    2017-09-01

    Medical images are contaminated by multiplicative speckle noise which significantly reduce the contrast of ultrasound images and creates a negative effect on various image interpretation tasks. In this paper, we proposed a hybrid denoising approach which collaborate the both local and nonlocal information in an efficient manner. The proposed hybrid algorithm consist of three stages in which at first stage the use of local statistics in the form of guided filter is used to reduce the effect of speckle noise initially. Then, an improved speckle reducing bilateral filter (SRBF) is developed to further reduce the speckle noise from the medical images. Finally, to reconstruct the diffused edges we have used the efficient post-processing technique which jointly considered the advantages of both bilateral and nonlocal mean (NLM) filter for the attenuation of speckle noise efficiently. The performance of proposed hybrid algorithm is evaluated on synthetic, simulated and real ultrasound images. The experiments conducted on various test images demonstrate that our proposed hybrid approach outperforms the various traditional speckle reduction approaches included recently proposed NLM and optimized Bayesian-based NLM. The results of various quantitative, qualitative measures and by visual inspection of denoise synthetic and real ultrasound images demonstrate that the proposed hybrid algorithm have strong denoising capability and able to preserve the fine image details such as edge of a lesion better than previously developed methods for speckle noise reduction. The denoising and edge preserving capability of hybrid algorithm is far better than existing traditional and recently proposed speckle reduction (SR) filters. The success of proposed algorithm would help in building the lay foundation for inventing the hybrid algorithms for denoising of ultrasound images. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. The effects of automated artifact removal algorithms on electroencephalography-based Alzheimer's disease diagnosis

    PubMed Central

    Cassani, Raymundo; Falk, Tiago H.; Fraga, Francisco J.; Kanda, Paulo A. M.; Anghinah, Renato

    2014-01-01

    Over the last decade, electroencephalography (EEG) has emerged as a reliable tool for the diagnosis of cortical disorders such as Alzheimer's disease (AD). EEG signals, however, are susceptible to several artifacts, such as ocular, muscular, movement, and environmental. To overcome this limitation, existing diagnostic systems commonly depend on experienced clinicians to manually select artifact-free epochs from the collected multi-channel EEG data. Manual selection, however, is a tedious and time-consuming process, rendering the diagnostic system “semi-automated.” Notwithstanding, a number of EEG artifact removal algorithms have been proposed in the literature. The (dis)advantages of using such algorithms in automated AD diagnostic systems, however, have not been documented; this paper aims to fill this gap. Here, we investigate the effects of three state-of-the-art automated artifact removal (AAR) algorithms (both alone and in combination with each other) on AD diagnostic systems based on four different classes of EEG features, namely, spectral, amplitude modulation rate of change, coherence, and phase. The three AAR algorithms tested are statistical artifact rejection (SAR), blind source separation based on second order blind identification and canonical correlation analysis (BSS-SOBI-CCA), and wavelet enhanced independent component analysis (wICA). Experimental results based on 20-channel resting-awake EEG data collected from 59 participants (20 patients with mild AD, 15 with moderate-to-severe AD, and 24 age-matched healthy controls) showed the wICA algorithm alone outperforming other enhancement algorithm combinations across three tasks: diagnosis (control vs. mild vs. moderate), early detection (control vs. mild), and disease progression (mild vs. moderate), thus opening the doors for fully-automated systems that can assist clinicians with early detection of AD, as well as disease severity progression assessment. PMID:24723886

  16. Phylogenetic Copy-Number Factorization of Multiple Tumor Samples.

    PubMed

    Zaccaria, Simone; El-Kebir, Mohammed; Klau, Gunnar W; Raphael, Benjamin J

    2018-04-16

    Cancer is an evolutionary process driven by somatic mutations. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the many types of mutations in cancer and the fact that nearly all cancer sequencing is of a bulk tumor, measuring a superposition of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy-number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy-number data from multiple samples of a tumor. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that either perform deconvolution/factorization of mixed tumor samples or build phylogenetic trees assuming homogeneous tumor samples. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher resolution view of copy-number evolution of this cancer than published analyses.

  17. Identifying Dynamic Protein Complexes Based on Gene Expression Profiles and PPI Networks

    PubMed Central

    Li, Min; Chen, Weijie; Wang, Jianxin; Pan, Yi

    2014-01-01

    Identification of protein complexes from protein-protein interaction networks has become a key problem for understanding cellular life in postgenomic era. Many computational methods have been proposed for identifying protein complexes. Up to now, the existing computational methods are mostly applied on static PPI networks. However, proteins and their interactions are dynamic in reality. Identifying dynamic protein complexes is more meaningful and challenging. In this paper, a novel algorithm, named DPC, is proposed to identify dynamic protein complexes by integrating PPI data and gene expression profiles. According to Core-Attachment assumption, these proteins which are always active in the molecular cycle are regarded as core proteins. The protein-complex cores are identified from these always active proteins by detecting dense subgraphs. Final protein complexes are extended from the protein-complex cores by adding attachments based on a topological character of “closeness” and dynamic meaning. The protein complexes produced by our algorithm DPC contain two parts: static core expressed in all the molecular cycle and dynamic attachments short-lived. The proposed algorithm DPC was applied on the data of Saccharomyces cerevisiae and the experimental results show that DPC outperforms CMC, MCL, SPICi, HC-PIN, COACH, and Core-Attachment based on the validation of matching with known complexes and hF-measures. PMID:24963481

  18. A Class of Prediction-Correction Methods for Time-Varying Convex Optimization

    NASA Astrophysics Data System (ADS)

    Simonetto, Andrea; Mokhtari, Aryan; Koppel, Alec; Leus, Geert; Ribeiro, Alejandro

    2016-09-01

    This paper considers unconstrained convex optimization problems with time-varying objective functions. We propose algorithms with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and correction steps, while sampling the problem data at a constant rate of $1/h$, where $h$ is the length of the sampling interval. The prediction step is derived by analyzing the iso-residual dynamics of the optimality conditions. The correction step adjusts for the distance between the current prediction and the optimizer at each time step, and consists either of one or multiple gradient steps or Newton steps, which respectively correspond to the gradient trajectory tracking (GTT) or Newton trajectory tracking (NTT) algorithms. Under suitable conditions, we establish that the asymptotic error incurred by both proposed methods behaves as $O(h^2)$, and in some cases as $O(h^4)$, which outperforms the state-of-the-art error bound of $O(h)$ for correction-only methods in the gradient-correction step. Moreover, when the characteristics of the objective function variation are not available, we propose approximate gradient and Newton tracking algorithms (AGT and ANT, respectively) that still attain these asymptotical error bounds. Numerical simulations demonstrate the practical utility of the proposed methods and that they improve upon existing techniques by several orders of magnitude.

  19. Fast vessel segmentation in retinal images using multi-scale enhancement and second-order local entropy

    NASA Astrophysics Data System (ADS)

    Yu, H.; Barriga, S.; Agurto, C.; Zamora, G.; Bauman, W.; Soliz, P.

    2012-03-01

    Retinal vasculature is one of the most important anatomical structures in digital retinal photographs. Accurate segmentation of retinal blood vessels is an essential task in automated analysis of retinopathy. This paper presents a new and effective vessel segmentation algorithm that features computational simplicity and fast implementation. This method uses morphological pre-processing to decrease the disturbance of bright structures and lesions before vessel extraction. Next, a vessel probability map is generated by computing the eigenvalues of the second derivatives of Gaussian filtered image at multiple scales. Then, the second order local entropy thresholding is applied to segment the vessel map. Lastly, a rule-based decision step, which measures the geometric shape difference between vessels and lesions is applied to reduce false positives. The algorithm is evaluated on the low-resolution DRIVE and STARE databases and the publicly available high-resolution image database from Friedrich-Alexander University Erlangen-Nuremberg, Germany). The proposed method achieved comparable performance to state of the art unsupervised vessel segmentation methods with a competitive faster speed on the DRIVE and STARE databases. For the high resolution fundus image database, the proposed algorithm outperforms an existing approach both on performance and speed. The efficiency and robustness make the blood vessel segmentation method described here suitable for broad application in automated analysis of retinal images.

  20. Sparse Learning with Stochastic Composite Optimization.

    PubMed

    Zhang, Weizhong; Zhang, Lijun; Jin, Zhongming; Jin, Rong; Cai, Deng; Li, Xuelong; Liang, Ronghua; He, Xiaofei

    2017-06-01

    In this paper, we study Stochastic Composite Optimization (SCO) for sparse learning that aims to learn a sparse solution from a composite function. Most of the recent SCO algorithms have already reached the optimal expected convergence rate O(1/λT), but they often fail to deliver sparse solutions at the end either due to the limited sparsity regularization during stochastic optimization (SO) or due to the limitation in online-to-batch conversion. Even when the objective function is strongly convex, their high probability bounds can only attain O(√{log(1/δ)/T}) with δ is the failure probability, which is much worse than the expected convergence rate. To address these limitations, we propose a simple yet effective two-phase Stochastic Composite Optimization scheme by adding a novel powerful sparse online-to-batch conversion to the general Stochastic Optimization algorithms. We further develop three concrete algorithms, OptimalSL, LastSL and AverageSL, directly under our scheme to prove the effectiveness of the proposed scheme. Both the theoretical analysis and the experiment results show that our methods can really outperform the existing methods at the ability of sparse learning and at the meantime we can improve the high probability bound to approximately O(log(log(T)/δ)/λT).

  1. Sparse generalized linear model with L0 approximation for feature selection and prediction with big omics data.

    PubMed

    Liu, Zhenqiu; Sun, Fengzhu; McGovern, Dermot P

    2017-01-01

    Feature selection and prediction are the most important tasks for big data mining. The common strategies for feature selection in big data mining are L 1 , SCAD and MC+. However, none of the existing algorithms optimizes L 0 , which penalizes the number of nonzero features directly. In this paper, we develop a novel sparse generalized linear model (GLM) with L 0 approximation for feature selection and prediction with big omics data. The proposed approach approximate the L 0 optimization directly. Even though the original L 0 problem is non-convex, the problem is approximated by sequential convex optimizations with the proposed algorithm. The proposed method is easy to implement with only several lines of code. Novel adaptive ridge algorithms ( L 0 ADRIDGE) for L 0 penalized GLM with ultra high dimensional big data are developed. The proposed approach outperforms the other cutting edge regularization methods including SCAD and MC+ in simulations. When it is applied to integrated analysis of mRNA, microRNA, and methylation data from TCGA ovarian cancer, multilevel gene signatures associated with suboptimal debulking are identified simultaneously. The biological significance and potential clinical importance of those genes are further explored. The developed Software L 0 ADRIDGE in MATLAB is available at https://github.com/liuzqx/L0adridge.

  2. An IMM-Aided ZUPT Methodology for an INS/DVL Integrated Navigation System

    PubMed Central

    Yao, Yiqing

    2017-01-01

    Inertial navigation system (INS)/Doppler velocity log (DVL) integration is the most common navigation solution for underwater vehicles. Due to the complex underwater environment, the velocity information provided by DVL always contains some errors. To improve navigation accuracy, zero velocity update (ZUPT) technology is considered, which is an effective algorithm for land vehicles to mitigate the navigation error during the pure INS mode. However, in contrast to ground vehicles, the ZUPT solution cannot be used directly for underwater vehicles because of the existence of the water current. In order to leverage the strengths of the ZUPT method and the INS/DVL solution, an interactive multiple model (IMM)-aided ZUPT methodology for the INS/DVL-integrated underwater navigation system is proposed. Both the INS/DVL and INS/ZUPT models are constructed and operated in parallel, with weights calculated according to their innovations and innovation covariance matrices. Simulations are conducted to evaluate the proposed algorithm. The results indicate that the IMM-aided ZUPT solution outperforms both the INS/DVL solution and the INS/ZUPT solution in the underwater environment, which can properly distinguish between the ZUPT and non-ZUPT conditions. In addition, during DVL outage, the effectiveness of the proposed algorithm is also verified. PMID:28872602

  3. Oversimplifying quantum factoring.

    PubMed

    Smolin, John A; Smith, Graeme; Vargo, Alexander

    2013-07-11

    Shor's quantum factoring algorithm exponentially outperforms known classical methods. Previous experimental implementations have used simplifications dependent on knowing the factors in advance. However, as we show here, all composite numbers admit simplification of the algorithm to a circuit equivalent to flipping coins. The difficulty of a particular experiment therefore depends on the level of simplification chosen, not the size of the number factored. Valid implementations should not make use of the answer sought.

  4. Optimizing Algorithm Choice for Metaproteomics: Comparing X!Tandem and Proteome Discoverer for Soil Proteomes

    NASA Astrophysics Data System (ADS)

    Diaz, K. S.; Kim, E. H.; Jones, R. M.; de Leon, K. C.; Woodcroft, B. J.; Tyson, G. W.; Rich, V. I.

    2014-12-01

    The growing field of metaproteomics links microbial communities to their expressed functions by using mass spectrometry methods to characterize community proteins. Comparison of mass spectrometry protein search algorithms and their biases is crucial for maximizing the quality and amount of protein identifications in mass spectral data. Available algorithms employ different approaches when mapping mass spectra to peptides against a database. We compared mass spectra from four microbial proteomes derived from high-organic content soils searched with two search algorithms: 1) Sequest HT as packaged within Proteome Discoverer (v.1.4) and 2) X!Tandem as packaged in TransProteomicPipeline (v.4.7.1). Searches used matched metagenomes, and results were filtered to allow identification of high probability proteins. There was little overlap in proteins identified by both algorithms, on average just ~24% of the total. However, when adjusted for spectral abundance, the overlap improved to ~70%. Proteome Discoverer generally outperformed X!Tandem, identifying an average of 12.5% more proteins than X!Tandem, with X!Tandem identifying more proteins only in the first two proteomes. For spectrally-adjusted results, the algorithms were similar, with X!Tandem marginally outperforming Proteome Discoverer by an average of ~4%. We then assessed differences in heat shock proteins (HSP) identification by the two algorithms by BLASTing identified proteins against the Heat Shock Protein Information Resource, because HSP hits typically account for the majority signal in proteomes, due to extraction protocols. Total HSP identifications for each of the 4 proteomes were approximately ~15%, ~11%, ~17%, and ~19%, with ~14% for total HSPs with redundancies removed. Of the ~15% average of proteins from the 4 proteomes identified as HSPs, ~10% of proteins and spectra were identified by both algorithms. On average, Proteome Discoverer identified ~9% more HSPs than X!Tandem.

  5. Student beats the teacher: deep neural networks for lateral ventricles segmentation in brain MR

    NASA Astrophysics Data System (ADS)

    Ghafoorian, Mohsen; Teuwen, Jonas; Manniesing, Rashindra; Leeuw, Frank-Erik d.; van Ginneken, Bram; Karssemeijer, Nico; Platel, Bram

    2018-03-01

    Ventricular volume and its progression are known to be linked to several brain diseases such as dementia and schizophrenia. Therefore accurate measurement of ventricle volume is vital for longitudinal studies on these disorders, making automated ventricle segmentation algorithms desirable. In the past few years, deep neural networks have shown to outperform the classical models in many imaging domains. However, the success of deep networks is dependent on manually labeled data sets, which are expensive to acquire especially for higher dimensional data in the medical domain. In this work, we show that deep neural networks can be trained on muchcheaper-to-acquire pseudo-labels (e.g., generated by other automated less accurate methods) and still produce more accurate segmentations compared to the quality of the labels. To show this, we use noisy segmentation labels generated by a conventional region growing algorithm to train a deep network for lateral ventricle segmentation. Then on a large manually annotated test set, we show that the network significantly outperforms the conventional region growing algorithm which was used to produce the training labels for the network. Our experiments report a Dice Similarity Coefficient (DSC) of 0.874 for the trained network compared to 0.754 for the conventional region growing algorithm (p < 0.001).

  6. Predicting overlapping protein complexes from weighted protein interaction graphs by gradually expanding dense neighborhoods.

    PubMed

    Dimitrakopoulos, Christos; Theofilatos, Konstantinos; Pegkas, Andreas; Likothanassis, Spiros; Mavroudi, Seferina

    2016-07-01

    Proteins are vital biological molecules driving many fundamental cellular processes. They rarely act alone, but form interacting groups called protein complexes. The study of protein complexes is a key goal in systems biology. Recently, large protein-protein interaction (PPI) datasets have been published and a plethora of computational methods that provide new ideas for the prediction of protein complexes have been implemented. However, most of the methods suffer from two major limitations: First, they do not account for proteins participating in multiple functions and second, they are unable to handle weighted PPI graphs. Moreover, the problem remains open as existing algorithms and tools are insufficient in terms of predictive metrics. In the present paper, we propose gradually expanding neighborhoods with adjustment (GENA), a new algorithm that gradually expands neighborhoods in a graph starting from highly informative "seed" nodes. GENA considers proteins as multifunctional molecules allowing them to participate in more than one protein complex. In addition, GENA accepts weighted PPI graphs by using a weighted evaluation function for each cluster. In experiments with datasets from Saccharomyces cerevisiae and human, GENA outperformed Markov clustering, restricted neighborhood search and clustering with overlapping neighborhood expansion, three state-of-the-art methods for computationally predicting protein complexes. Seven PPI networks and seven evaluation datasets were used in total. GENA outperformed existing methods in 16 out of 18 experiments achieving an average improvement of 5.5% when the maximum matching ratio metric was used. Our method was able to discover functionally homogeneous protein clusters and uncover important network modules in a Parkinson expression dataset. When used on the human networks, around 47% of the detected clusters were enriched in gene ontology (GO) terms with depth higher than five in the GO hierarchy. In the present manuscript, we introduce a new method for the computational prediction of protein complexes by making the realistic assumption that proteins participate in multiple protein complexes and cellular functions. Our method can detect accurate and functionally homogeneous clusters. Copyright © 2016 Elsevier B.V. All rights reserved.

  7. Literature-based condition-specific miRNA-mRNA target prediction.

    PubMed

    Oh, Minsik; Rhee, Sungmin; Moon, Ji Hwan; Chae, Heejoon; Lee, Sunwon; Kang, Jaewoo; Kim, Sun

    2017-01-01

    miRNAs are small non-coding RNAs that regulate gene expression by binding to the 3'-UTR of genes. Many recent studies have reported that miRNAs play important biological roles by regulating specific mRNAs or genes. Many sequence-based target prediction algorithms have been developed to predict miRNA targets. However, these methods are not designed for condition-specific target predictions and produce many false positives; thus, expression-based target prediction algorithms have been developed for condition-specific target predictions. A typical strategy to utilize expression data is to leverage the negative control roles of miRNAs on genes. To control false positives, a stringent cutoff value is typically set, but in this case, these methods tend to reject many true target relationships, i.e., false negatives. To overcome these limitations, additional information should be utilized. The literature is probably the best resource that we can utilize. Recent literature mining systems compile millions of articles with experiments designed for specific biological questions, and the systems provide a function to search for specific information. To utilize the literature information, we used a literature mining system, BEST, that automatically extracts information from the literature in PubMed and that allows the user to perform searches of the literature with any English words. By integrating omics data analysis methods and BEST, we developed Context-MMIA, a miRNA-mRNA target prediction method that combines expression data analysis results and the literature information extracted based on the user-specified context. In the pathway enrichment analysis using genes included in the top 200 miRNA-targets, Context-MMIA outperformed the four existing target prediction methods that we tested. In another test on whether prediction methods can re-produce experimentally validated target relationships, Context-MMIA outperformed the four existing target prediction methods. In summary, Context-MMIA allows the user to specify a context of the experimental data to predict miRNA targets, and we believe that Context-MMIA is very useful for predicting condition-specific miRNA targets.

  8. Optimized extreme learning machine for urban land cover classification using hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Su, Hongjun; Tian, Shufang; Cai, Yue; Sheng, Yehua; Chen, Chen; Najafian, Maryam

    2017-12-01

    This work presents a new urban land cover classification framework using the firefly algorithm (FA) optimized extreme learning machine (ELM). FA is adopted to optimize the regularization coefficient C and Gaussian kernel σ for kernel ELM. Additionally, effectiveness of spectral features derived from an FA-based band selection algorithm is studied for the proposed classification task. Three sets of hyperspectral databases were recorded using different sensors, namely HYDICE, HyMap, and AVIRIS. Our study shows that the proposed method outperforms traditional classification algorithms such as SVM and reduces computational cost significantly.

  9. Production scheduling and rescheduling with genetic algorithms.

    PubMed

    Bierwirth, C; Mattfeld, D C

    1999-01-01

    A general model for job shop scheduling is described which applies to static, dynamic and non-deterministic production environments. Next, a Genetic Algorithm is presented which solves the job shop scheduling problem. This algorithm is tested in a dynamic environment under different workload situations. Thereby, a highly efficient decoding procedure is proposed which strongly improves the quality of schedules. Finally, this technique is tested for scheduling and rescheduling in a non-deterministic environment. It is shown by experiment that conventional methods of production control are clearly outperformed at reasonable run-time costs.

  10. Ensemble-based prediction of RNA secondary structures.

    PubMed

    Aghaeepour, Nima; Hoos, Holger H

    2013-04-24

    Accurate structure prediction methods play an important role for the understanding of RNA function. Energy-based, pseudoknot-free secondary structure prediction is one of the most widely used and versatile approaches, and improved methods for this task have received much attention over the past five years. Despite the impressive progress that as been achieved in this area, existing evaluations of the prediction accuracy achieved by various algorithms do not provide a comprehensive, statistically sound assessment. Furthermore, while there is increasing evidence that no prediction algorithm consistently outperforms all others, no work has been done to exploit the complementary strengths of multiple approaches. In this work, we present two contributions to the area of RNA secondary structure prediction. Firstly, we use state-of-the-art, resampling-based statistical methods together with a previously published and increasingly widely used dataset of high-quality RNA structures to conduct a comprehensive evaluation of existing RNA secondary structure prediction procedures. The results from this evaluation clarify the performance relationship between ten well-known existing energy-based pseudoknot-free RNA secondary structure prediction methods and clearly demonstrate the progress that has been achieved in recent years. Secondly, we introduce AveRNA, a generic and powerful method for combining a set of existing secondary structure prediction procedures into an ensemble-based method that achieves significantly higher prediction accuracies than obtained from any of its component procedures. Our new, ensemble-based method, AveRNA, improves the state of the art for energy-based, pseudoknot-free RNA secondary structure prediction by exploiting the complementary strengths of multiple existing prediction procedures, as demonstrated using a state-of-the-art statistical resampling approach. In addition, AveRNA allows an intuitive and effective control of the trade-off between false negative and false positive base pair predictions. Finally, AveRNA can make use of arbitrary sets of secondary structure prediction procedures and can therefore be used to leverage improvements in prediction accuracy offered by algorithms and energy models developed in the future. Our data, MATLAB software and a web-based version of AveRNA are publicly available at http://www.cs.ubc.ca/labs/beta/Software/AveRNA.

  11. An Improved Iris Recognition Algorithm Based on Hybrid Feature and ELM

    NASA Astrophysics Data System (ADS)

    Wang, Juan

    2018-03-01

    The iris image is easily polluted by noise and uneven light. This paper proposed an improved extreme learning machine (ELM) based iris recognition algorithm with hybrid feature. 2D-Gabor filters and GLCM is employed to generate a multi-granularity hybrid feature vector. 2D-Gabor filter and GLCM feature work for capturing low-intermediate frequency and high frequency texture information, respectively. Finally, we utilize extreme learning machine for iris recognition. Experimental results reveal our proposed ELM based multi-granularity iris recognition algorithm (ELM-MGIR) has higher accuracy of 99.86%, and lower EER of 0.12% under the premise of real-time performance. The proposed ELM-MGIR algorithm outperforms other mainstream iris recognition algorithms.

  12. Bayesian CP Factorization of Incomplete Tensors with Automatic Rank Determination.

    PubMed

    Zhao, Qibin; Zhang, Liqing; Cichocki, Andrzej

    2015-09-01

    CANDECOMP/PARAFAC (CP) tensor factorization of incomplete data is a powerful technique for tensor completion through explicitly capturing the multilinear latent factors. The existing CP algorithms require the tensor rank to be manually specified, however, the determination of tensor rank remains a challenging problem especially for CP rank . In addition, existing approaches do not take into account uncertainty information of latent factors, as well as missing entries. To address these issues, we formulate CP factorization using a hierarchical probabilistic model and employ a fully Bayesian treatment by incorporating a sparsity-inducing prior over multiple latent factors and the appropriate hyperpriors over all hyperparameters, resulting in automatic rank determination. To learn the model, we develop an efficient deterministic Bayesian inference algorithm, which scales linearly with data size. Our method is characterized as a tuning parameter-free approach, which can effectively infer underlying multilinear factors with a low-rank constraint, while also providing predictive distributions over missing entries. Extensive simulations on synthetic data illustrate the intrinsic capability of our method to recover the ground-truth of CP rank and prevent the overfitting problem, even when a large amount of entries are missing. Moreover, the results from real-world applications, including image inpainting and facial image synthesis, demonstrate that our method outperforms state-of-the-art approaches for both tensor factorization and tensor completion in terms of predictive performance.

  13. Chaotic Teaching-Learning-Based Optimization with Lévy Flight for Global Numerical Optimization.

    PubMed

    He, Xiangzhu; Huang, Jida; Rao, Yunqing; Gao, Liang

    2016-01-01

    Recently, teaching-learning-based optimization (TLBO), as one of the emerging nature-inspired heuristic algorithms, has attracted increasing attention. In order to enhance its convergence rate and prevent it from getting stuck in local optima, a novel metaheuristic has been developed in this paper, where particular characteristics of the chaos mechanism and Lévy flight are introduced to the basic framework of TLBO. The new algorithm is tested on several large-scale nonlinear benchmark functions with different characteristics and compared with other methods. Experimental results show that the proposed algorithm outperforms other algorithms and achieves a satisfactory improvement over TLBO.

  14. Parallel processors and nonlinear structural dynamics algorithms and software

    NASA Technical Reports Server (NTRS)

    Belytschko, Ted; Gilbertsen, Noreen D.; Neal, Mark O.; Plaskacz, Edward J.

    1989-01-01

    The adaptation of a finite element program with explicit time integration to a massively parallel SIMD (single instruction multiple data) computer, the CONNECTION Machine is described. The adaptation required the development of a new algorithm, called the exchange algorithm, in which all nodal variables are allocated to the element with an exchange of nodal forces at each time step. The architectural and C* programming language features of the CONNECTION Machine are also summarized. Various alternate data structures and associated algorithms for nonlinear finite element analysis are discussed and compared. Results are presented which demonstrate that the CONNECTION Machine is capable of outperforming the CRAY XMP/14.

  15. featsel: A framework for benchmarking of feature selection algorithms and cost functions

    NASA Astrophysics Data System (ADS)

    Reis, Marcelo S.; Estrela, Gustavo; Ferreira, Carlos Eduardo; Barrera, Junior

    In this paper, we introduce featsel, a framework for benchmarking of feature selection algorithms and cost functions. This framework allows the user to deal with the search space as a Boolean lattice and has its core coded in C++ for computational efficiency purposes. Moreover, featsel includes Perl scripts to add new algorithms and/or cost functions, generate random instances, plot graphs and organize results into tables. Besides, this framework already comes with dozens of algorithms and cost functions for benchmarking experiments. We also provide illustrative examples, in which featsel outperforms the popular Weka workbench in feature selection procedures on data sets from the UCI Machine Learning Repository.

  16. Interaction sorting method for molecular dynamics on multi-core SIMD CPU architecture.

    PubMed

    Matvienko, Sergey; Alemasov, Nikolay; Fomin, Eduard

    2015-02-01

    Molecular dynamics (MD) is widely used in computational biology for studying binding mechanisms of molecules, molecular transport, conformational transitions, protein folding, etc. The method is computationally expensive; thus, the demand for the development of novel, much more efficient algorithms is still high. Therefore, the new algorithm designed in 2007 and called interaction sorting (IS) clearly attracted interest, as it outperformed the most efficient MD algorithms. In this work, a new IS modification is proposed which allows the algorithm to utilize SIMD processor instructions. This paper shows that the improvement provides an additional gain in performance, 9% to 45% in comparison to the original IS method.

  17. A modified genetic algorithm with fuzzy roulette wheel selection for job-shop scheduling problems

    NASA Astrophysics Data System (ADS)

    Thammano, Arit; Teekeng, Wannaporn

    2015-05-01

    The job-shop scheduling problem is one of the most difficult production planning problems. Since it is in the NP-hard class, a recent trend in solving the job-shop scheduling problem is shifting towards the use of heuristic and metaheuristic algorithms. This paper proposes a novel metaheuristic algorithm, which is a modification of the genetic algorithm. This proposed algorithm introduces two new concepts to the standard genetic algorithm: (1) fuzzy roulette wheel selection and (2) the mutation operation with tabu list. The proposed algorithm has been evaluated and compared with several state-of-the-art algorithms in the literature. The experimental results on 53 JSSPs show that the proposed algorithm is very effective in solving the combinatorial optimization problems. It outperforms all state-of-the-art algorithms on all benchmark problems in terms of the ability to achieve the optimal solution and the computational time.

  18. Sorting on STAR. [CDC computer algorithm timing comparison

    NASA Technical Reports Server (NTRS)

    Stone, H. S.

    1978-01-01

    Timing comparisons are given for three sorting algorithms written for the CDC STAR computer. One algorithm is Hoare's (1962) Quicksort, which is the fastest or nearly the fastest sorting algorithm for most computers. A second algorithm is a vector version of Quicksort that takes advantage of the STAR's vector operations. The third algorithm is an adaptation of Batcher's (1968) sorting algorithm, which makes especially good use of vector operations but has a complexity of N(log N)-squared as compared with a complexity of N log N for the Quicksort algorithms. In spite of its worse complexity, Batcher's sorting algorithm is competitive with the serial version of Quicksort for vectors up to the largest that can be treated by STAR. Vector Quicksort outperforms the other two algorithms and is generally preferred. These results indicate that unusual instruction sets can introduce biases in program execution time that counter results predicted by worst-case asymptotic complexity analysis.

  19. Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance

    NASA Astrophysics Data System (ADS)

    Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi

    2017-11-01

    K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we presented a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing Hamming distance between testing sample and each feature vector in the training set. Taking advantage of this method, we realized a good analog for classical KNN algorithm by setting a distance threshold value t to select k - n e a r e s t neighbors. As a result, QKNN achieves O( n 3) performance which is only relevant to the dimension of feature vectors and high classification accuracy, outperforms Llyod's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).

  20. An Efficient Augmented Lagrangian Method with Applications to Total Variation Minimization

    DTIC Science & Technology

    2012-08-17

    the classic augmented Lagrangian multiplier method, we propose, analyze and test an algorithm for solving a class of equality-constrained non-smooth...method, we propose, analyze and test an algorithm for solving a class of equality-constrained non-smooth optimization problems (chie y but not...significantly outperforming several state-of-the-art solvers on most tested problems. The resulting MATLAB solver, called TVAL3, has been posted online [23]. 2

  1. Virtual fringe projection system with nonparallel illumination based on iteration

    NASA Astrophysics Data System (ADS)

    Zhou, Duo; Wang, Zhangying; Gao, Nan; Zhang, Zonghua; Jiang, Xiangqian

    2017-06-01

    Fringe projection profilometry has been widely applied in many fields. To set up an ideal measuring system, a virtual fringe projection technique has been studied to assist in the design of hardware configurations. However, existing virtual fringe projection systems use parallel illumination and have a fixed optical framework. This paper presents a virtual fringe projection system with nonparallel illumination. Using an iterative method to calculate intersection points between rays and reference planes or object surfaces, the proposed system can simulate projected fringe patterns and captured images. A new explicit calibration method has been presented to validate the precision of the system. Simulated results indicate that the proposed iterative method outperforms previous systems. Our virtual system can be applied to error analysis, algorithm optimization, and help operators to find ideal system parameter settings for actual measurements.

  2. A dictionary learning approach for Poisson image deblurring.

    PubMed

    Ma, Liyan; Moisan, Lionel; Yu, Jian; Zeng, Tieyong

    2013-07-01

    The restoration of images corrupted by blur and Poisson noise is a key issue in medical and biological image processing. While most existing methods are based on variational models, generally derived from a maximum a posteriori (MAP) formulation, recently sparse representations of images have shown to be efficient approaches for image recovery. Following this idea, we propose in this paper a model containing three terms: a patch-based sparse representation prior over a learned dictionary, the pixel-based total variation regularization term and a data-fidelity term capturing the statistics of Poisson noise. The resulting optimization problem can be solved by an alternating minimization technique combined with variable splitting. Extensive experimental results suggest that in terms of visual quality, peak signal-to-noise ratio value and the method noise, the proposed algorithm outperforms state-of-the-art methods.

  3. Fast Image Restoration for Spatially Varying Defocus Blur of Imaging Sensor

    PubMed Central

    Cheong, Hejin; Chae, Eunjung; Lee, Eunsung; Jo, Gwanghyun; Paik, Joonki

    2015-01-01

    This paper presents a fast adaptive image restoration method for removing spatially varying out-of-focus blur of a general imaging sensor. After estimating the parameters of space-variant point-spread-function (PSF) using the derivative in each uniformly blurred region, the proposed method performs spatially adaptive image restoration by selecting the optimal restoration filter according to the estimated blur parameters. Each restoration filter is implemented in the form of a combination of multiple FIR filters, which guarantees the fast image restoration without the need of iterative or recursive processing. Experimental results show that the proposed method outperforms existing space-invariant restoration methods in the sense of both objective and subjective performance measures. The proposed algorithm can be employed to a wide area of image restoration applications, such as mobile imaging devices, robot vision, and satellite image processing. PMID:25569760

  4. SANDPUMA: ensemble predictions of nonribosomal peptide chemistry reveal biosynthetic diversity across Actinobacteria.

    PubMed

    Chevrette, Marc G; Aicheler, Fabian; Kohlbacher, Oliver; Currie, Cameron R; Medema, Marnix H

    2017-10-15

    Nonribosomally synthesized peptides (NRPs) are natural products with widespread applications in medicine and biotechnology. Many algorithms have been developed to predict the substrate specificities of nonribosomal peptide synthetase adenylation (A) domains from DNA sequences, which enables prioritization and dereplication, and integration with other data types in discovery efforts. However, insufficient training data and a lack of clarity regarding prediction quality have impeded optimal use. Here, we introduce prediCAT, a new phylogenetics-inspired algorithm, which quantitatively estimates the degree of predictability of each A-domain. We then systematically benchmarked all algorithms on a newly gathered, independent test set of 434 A-domain sequences, showing that active-site-motif-based algorithms outperform whole-domain-based methods. Subsequently, we developed SANDPUMA, a powerful ensemble algorithm, based on newly trained versions of all high-performing algorithms, which significantly outperforms individual methods. Finally, we deployed SANDPUMA in a systematic investigation of 7635 Actinobacteria genomes, suggesting that NRP chemical diversity is much higher than previously estimated. SANDPUMA has been integrated into the widely used antiSMASH biosynthetic gene cluster analysis pipeline and is also available as an open-source, standalone tool. SANDPUMA is freely available at https://bitbucket.org/chevrm/sandpuma and as a docker image at https://hub.docker.com/r/chevrm/sandpuma/ under the GNU Public License 3 (GPL3). chevrette@wisc.edu or marnix.medema@wur.nl. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  5. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    PubMed

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    The dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time costing, while it is relatively easy to acquire the unlabeled or undetermined instances. Therefore, the semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture feature, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  6. Change detection of medical images using dictionary learning techniques and principal component analysis.

    PubMed

    Nika, Varvara; Babyn, Paul; Zhu, Hongmei

    2014-07-01

    Automatic change detection methods for identifying the changes of serial MR images taken at different times are of great interest to radiologists. The majority of existing change detection methods in medical imaging, and those of brain images in particular, include many preprocessing steps and rely mostly on statistical analysis of magnetic resonance imaging (MRI) scans. Although most methods utilize registration software, tissue classification remains a difficult and overwhelming task. Recently, dictionary learning techniques are being used in many areas of image processing, such as image surveillance, face recognition, remote sensing, and medical imaging. We present an improved version of the EigenBlockCD algorithm, named the EigenBlockCD-2. The EigenBlockCD-2 algorithm performs an initial global registration and identifies the changes between serial MR images of the brain. Blocks of pixels from a baseline scan are used to train local dictionaries to detect changes in the follow-up scan. We use PCA to reduce the dimensionality of the local dictionaries and the redundancy of data. Choosing the appropriate distance measure significantly affects the performance of our algorithm. We examine the differences between [Formula: see text] and [Formula: see text] norms as two possible similarity measures in the improved EigenBlockCD-2 algorithm. We show the advantages of the [Formula: see text] norm over the [Formula: see text] norm both theoretically and numerically. We also demonstrate the performance of the new EigenBlockCD-2 algorithm for detecting changes of MR images and compare our results with those provided in the recent literature. Experimental results with both simulated and real MRI scans show that our improved EigenBlockCD-2 algorithm outperforms the previous methods. It detects clinical changes while ignoring the changes due to the patient's position and other acquisition artifacts.

  7. Urban Growth Modeling Using Cellular Automata with Multi-Temporal Remote Sensing Images Calibrated by the Artificial Bee Colony Optimization Algorithm.

    PubMed

    Naghibi, Fereydoun; Delavar, Mahmoud Reza; Pijanowski, Bryan

    2016-12-14

    Cellular Automata (CA) is one of the most common techniques used to simulate the urbanization process. CA-based urban models use transition rules to deliver spatial patterns of urban growth and urban dynamics over time. Determining the optimum transition rules of the CA is a critical step because of the heterogeneity and nonlinearities existing among urban growth driving forces. Recently, new CA models integrated with optimization methods based on swarm intelligence algorithms were proposed to overcome this drawback. The Artificial Bee Colony (ABC) algorithm is an advanced meta-heuristic swarm intelligence-based algorithm. Here, we propose a novel CA-based urban change model that uses the ABC algorithm to extract optimum transition rules. We applied the proposed ABC-CA model to simulate future urban growth in Urmia (Iran) with multi-temporal Landsat images from 1997, 2006 and 2015. Validation of the simulation results was made through statistical methods such as overall accuracy, the figure of merit and total operating characteristics (TOC). Additionally, we calibrated the CA model by ant colony optimization (ACO) to assess the performance of our proposed model versus similar swarm intelligence algorithm methods. We showed that the overall accuracy and the figure of merit of the ABC-CA model are 90.1% and 51.7%, which are 2.9% and 8.8% higher than those of the ACO-CA model, respectively. Moreover, the allocation disagreement of the simulation results for the ABC-CA model is 9.9%, which is 2.9% less than that of the ACO-CA model. Finally, the ABC-CA model also outperforms the ACO-CA model with fewer quantity and allocation errors and slightly more hits.

  8. Urban Growth Modeling Using Cellular Automata with Multi-Temporal Remote Sensing Images Calibrated by the Artificial Bee Colony Optimization Algorithm

    PubMed Central

    Naghibi, Fereydoun; Delavar, Mahmoud Reza; Pijanowski, Bryan

    2016-01-01

    Cellular Automata (CA) is one of the most common techniques used to simulate the urbanization process. CA-based urban models use transition rules to deliver spatial patterns of urban growth and urban dynamics over time. Determining the optimum transition rules of the CA is a critical step because of the heterogeneity and nonlinearities existing among urban growth driving forces. Recently, new CA models integrated with optimization methods based on swarm intelligence algorithms were proposed to overcome this drawback. The Artificial Bee Colony (ABC) algorithm is an advanced meta-heuristic swarm intelligence-based algorithm. Here, we propose a novel CA-based urban change model that uses the ABC algorithm to extract optimum transition rules. We applied the proposed ABC-CA model to simulate future urban growth in Urmia (Iran) with multi-temporal Landsat images from 1997, 2006 and 2015. Validation of the simulation results was made through statistical methods such as overall accuracy, the figure of merit and total operating characteristics (TOC). Additionally, we calibrated the CA model by ant colony optimization (ACO) to assess the performance of our proposed model versus similar swarm intelligence algorithm methods. We showed that the overall accuracy and the figure of merit of the ABC-CA model are 90.1% and 51.7%, which are 2.9% and 8.8% higher than those of the ACO-CA model, respectively. Moreover, the allocation disagreement of the simulation results for the ABC-CA model is 9.9%, which is 2.9% less than that of the ACO-CA model. Finally, the ABC-CA model also outperforms the ACO-CA model with fewer quantity and allocation errors and slightly more hits. PMID:27983633

  9. Optimal Integration of Departures and Arrivals in Terminal Airspace

    NASA Technical Reports Server (NTRS)

    Xue, Min; Zelinski, Shannon Jean

    2013-01-01

    Coordination of operations with spatially and temporally shared resources, such as route segments, fixes, and runways, improves the efficiency of terminal airspace management. Problems in this category are, in general, computationally difficult compared to conventional scheduling problems. This paper presents a fast time algorithm formulation using a non-dominated sorting genetic algorithm (NSGA). It was first applied to a test problem introduced in existing literature. An experiment with a test problem showed that new methods can solve the 20 aircraft problem in fast time with a 65% or 440 second delay reduction using shared departure fixes. In order to test its application in a more realistic and complicated problem, the NSGA algorithm was applied to a problem in LAX terminal airspace, where interactions between 28% of LAX arrivals and 10% of LAX departures are resolved by spatial separation in current operations, which may introduce unnecessary delays. In this work, three types of separations - spatial, temporal, and hybrid separations - were formulated using the new algorithm. The hybrid separation combines both temporal and spatial separations. Results showed that although temporal separation achieved less delay than spatial separation with a small uncertainty buffer, spatial separation outperformed temporal separation when the uncertainty buffer was increased. Hybrid separation introduced much less delay than both spatial and temporal approaches. For a total of 15 interacting departures and arrivals, when compared to spatial separation, the delay reduction of hybrid separation varied between 11% or 3.1 minutes and 64% or 10.7 minutes corresponding to an uncertainty buffer from 0 to 60 seconds. Furthermore, as a comparison with the NSGA algorithm, a First-Come-First-Serve based heuristic method was implemented for the hybrid separation. Experiments showed that the results from the NSGA algorithm have 9% to 42% less delay than the heuristic method with varied uncertainty buffer sizes.

  10. A study on the application of topic models to motif finding algorithms.

    PubMed

    Basha Gutierrez, Josep; Nakai, Kenta

    2016-12-22

    Topic models are statistical algorithms which try to discover the structure of a set of documents according to the abstract topics contained in them. Here we try to apply this approach to the discovery of the structure of the transcription factor binding sites (TFBS) contained in a set of biological sequences, which is a fundamental problem in molecular biology research for the understanding of transcriptional regulation. Here we present two methods that make use of topic models for motif finding. First, we developed an algorithm in which first a set of biological sequences are treated as text documents, and the k-mers contained in them as words, to then build a correlated topic model (CTM) and iteratively reduce its perplexity. We also used the perplexity measurement of CTMs to improve our previous algorithm based on a genetic algorithm and several statistical coefficients. The algorithms were tested with 56 data sets from four different species and compared to 14 other methods by the use of several coefficients both at nucleotide and site level. The results of our first approach showed a performance comparable to the other methods studied, especially at site level and in sensitivity scores, in which it scored better than any of the 14 existing tools. In the case of our previous algorithm, the new approach with the addition of the perplexity measurement clearly outperformed all of the other methods in sensitivity, both at nucleotide and site level, and in overall performance at site level. The statistics obtained show that the performance of a motif finding method based on the use of a CTM is satisfying enough to conclude that the application of topic models is a valid method for developing motif finding algorithms. Moreover, the addition of topic models to a previously developed method dramatically increased its performance, suggesting that this combined algorithm can be a useful tool to successfully predict motifs in different kinds of sets of DNA sequences.

  11. Dynamics and control of diseases in networks with community structure.

    PubMed

    Salathé, Marcel; Jones, James H

    2010-04-08

    The dynamics of infectious diseases spread via direct person-to-person transmission (such as influenza, smallpox, HIV/AIDS, etc.) depends on the underlying host contact network. Human contact networks exhibit strong community structure. Understanding how such community structure affects epidemics may provide insights for preventing the spread of disease between communities by changing the structure of the contact network through pharmaceutical or non-pharmaceutical interventions. We use empirical and simulated networks to investigate the spread of disease in networks with community structure. We find that community structure has a major impact on disease dynamics, and we show that in networks with strong community structure, immunization interventions targeted at individuals bridging communities are more effective than those simply targeting highly connected individuals. Because the structure of relevant contact networks is generally not known, and vaccine supply is often limited, there is great need for efficient vaccination algorithms that do not require full knowledge of the network. We developed an algorithm that acts only on locally available network information and is able to quickly identify targets for successful immunization intervention. The algorithm generally outperforms existing algorithms when vaccine supply is limited, particularly in networks with strong community structure. Understanding the spread of infectious diseases and designing optimal control strategies is a major goal of public health. Social networks show marked patterns of community structure, and our results, based on empirical and simulated data, demonstrate that community structure strongly affects disease dynamics. These results have implications for the design of control strategies.

  12. An Energy Aware Adaptive Sampling Algorithm for Energy Harvesting WSN with Energy Hungry Sensors.

    PubMed

    Srbinovski, Bruno; Magno, Michele; Edwards-Murphy, Fiona; Pakrashi, Vikram; Popovici, Emanuel

    2016-03-28

    Wireless sensor nodes have a limited power budget, though they are often expected to be functional in the field once deployed for extended periods of time. Therefore, minimization of energy consumption and energy harvesting technology in Wireless Sensor Networks (WSN) are key tools for maximizing network lifetime, and achieving self-sustainability. This paper proposes an energy aware Adaptive Sampling Algorithm (ASA) for WSN with power hungry sensors and harvesting capabilities, an energy management technique that can be implemented on any WSN platform with enough processing power to execute the proposed algorithm. An existing state-of-the-art ASA developed for wireless sensor networks with power hungry sensors is optimized and enhanced to adapt the sampling frequency according to the available energy of the node. The proposed algorithm is evaluated using two in-field testbeds that are supplied by two different energy harvesting sources (solar and wind). Simulation and comparison between the state-of-the-art ASA and the proposed energy aware ASA (EASA) in terms of energy durability are carried out using in-field measured harvested energy (using both wind and solar sources) and power hungry sensors (ultrasonic wind sensor and gas sensors). The simulation results demonstrate that using ASA in combination with an energy aware function on the nodes can drastically increase the lifetime of a WSN node and enable self-sustainability. In fact, the proposed EASA in conjunction with energy harvesting capability can lead towards perpetual WSN operation and significantly outperform the state-of-the-art ASA.

  13. Recognizing Age-Separated Face Images: Humans and Machines

    PubMed Central

    Yadav, Daksha; Singh, Richa; Vatsa, Mayank; Noore, Afzel

    2014-01-01

    Humans utilize facial appearance, gender, expression, aging pattern, and other ancillary information to recognize individuals. It is interesting to observe how humans perceive facial age. Analyzing these properties can help in understanding the phenomenon of facial aging and incorporating the findings can help in designing effective algorithms. Such a study has two components - facial age estimation and age-separated face recognition. Age estimation involves predicting the age of an individual given his/her facial image. On the other hand, age-separated face recognition consists of recognizing an individual given his/her age-separated images. In this research, we investigate which facial cues are utilized by humans for estimating the age of people belonging to various age groups along with analyzing the effect of one's gender, age, and ethnicity on age estimation skills. We also analyze how various facial regions such as binocular and mouth regions influence age estimation and recognition capabilities. Finally, we propose an age-invariant face recognition algorithm that incorporates the knowledge learned from these observations. Key observations of our research are: (1) the age group of newborns and toddlers is easiest to estimate, (2) gender and ethnicity do not affect the judgment of age group estimation, (3) face as a global feature, is essential to achieve good performance in age-separated face recognition, and (4) the proposed algorithm yields improved recognition performance compared to existing algorithms and also outperforms a commercial system in the young image as probe scenario. PMID:25474200

  14. Recognizing age-separated face images: humans and machines.

    PubMed

    Yadav, Daksha; Singh, Richa; Vatsa, Mayank; Noore, Afzel

    2014-01-01

    Humans utilize facial appearance, gender, expression, aging pattern, and other ancillary information to recognize individuals. It is interesting to observe how humans perceive facial age. Analyzing these properties can help in understanding the phenomenon of facial aging and incorporating the findings can help in designing effective algorithms. Such a study has two components--facial age estimation and age-separated face recognition. Age estimation involves predicting the age of an individual given his/her facial image. On the other hand, age-separated face recognition consists of recognizing an individual given his/her age-separated images. In this research, we investigate which facial cues are utilized by humans for estimating the age of people belonging to various age groups along with analyzing the effect of one's gender, age, and ethnicity on age estimation skills. We also analyze how various facial regions such as binocular and mouth regions influence age estimation and recognition capabilities. Finally, we propose an age-invariant face recognition algorithm that incorporates the knowledge learned from these observations. Key observations of our research are: (1) the age group of newborns and toddlers is easiest to estimate, (2) gender and ethnicity do not affect the judgment of age group estimation, (3) face as a global feature, is essential to achieve good performance in age-separated face recognition, and (4) the proposed algorithm yields improved recognition performance compared to existing algorithms and also outperforms a commercial system in the young image as probe scenario.

  15. Evolutionary Approach for Relative Gene Expression Algorithms

    PubMed Central

    Czajkowski, Marcin

    2014-01-01

    A Relative Expression Analysis (RXA) uses ordering relationships in a small collection of genes and is successfully applied to classiffication using microarray data. As checking all possible subsets of genes is computationally infeasible, the RXA algorithms require feature selection and multiple restrictive assumptions. Our main contribution is a specialized evolutionary algorithm (EA) for top-scoring pairs called EvoTSP which allows finding more advanced gene relations. We managed to unify the major variants of relative expression algorithms through EA and introduce weights to the top-scoring pairs. Experimental validation of EvoTSP on public available microarray datasets showed that the proposed solution significantly outperforms in terms of accuracy other relative expression algorithms and allows exploring much larger solution space. PMID:24790574

  16. Optimal signal constellation design for ultra-high-speed optical transport in the presence of nonlinear phase noise.

    PubMed

    Liu, Tao; Djordjevic, Ivan B

    2014-12-29

    In this paper, we first describe an optimal signal constellation design algorithm suitable for the coherent optical channels dominated by the linear phase noise. Then, we modify this algorithm to be suitable for the nonlinear phase noise dominated channels. In optimization procedure, the proposed algorithm uses the cumulative log-likelihood function instead of the Euclidian distance. Further, an LDPC coded modulation scheme is proposed to be used in combination with signal constellations obtained by proposed algorithm. Monte Carlo simulations indicate that the LDPC-coded modulation schemes employing the new constellation sets, obtained by our new signal constellation design algorithm, outperform corresponding QAM constellations significantly in terms of transmission distance and have better nonlinearity tolerance.

  17. Reduced projection angles for binary tomography with particle aggregation.

    PubMed

    Al-Rifaie, Mohammad Majid; Blackwell, Tim

    This paper extends particle aggregate reconstruction technique (PART), a reconstruction algorithm for binary tomography based on the movement of particles. PART supposes that pixel values are particles, and that particles diffuse through the image, staying together in regions of uniform pixel value known as aggregates. In this work, a variation of this algorithm is proposed and a focus is placed on reducing the number of projections and whether this impacts the reconstruction of images. The algorithm is tested on three phantoms of varying sizes and numbers of forward projections and compared to filtered back projection, a random search algorithm and to SART, a standard algebraic reconstruction method. It is shown that the proposed algorithm outperforms the aforementioned algorithms on small numbers of projections. This potentially makes the algorithm attractive in scenarios where collecting less projection data are inevitable.

  18. A total variation diminishing finite difference algorithm for sonic boom propagation models

    NASA Technical Reports Server (NTRS)

    Sparrow, Victor W.

    1993-01-01

    It is difficult to accurately model the rise phases of sonic boom waveforms with traditional finite difference algorithms because of finite difference phase dispersion. This paper introduces the concept of a total variation diminishing (TVD) finite difference method as a tool for accurately modeling the rise phases of sonic booms. A standard second order finite difference algorithm and its TVD modified counterpart are both applied to the one-way propagation of a square pulse. The TVD method clearly outperforms the non-TVD method, showing great potential as a new computational tool in the analysis of sonic boom propagation.

  19. A hybrid metaheuristic for closest string problem.

    PubMed

    Mousavi, Sayyed Rasoul

    2011-01-01

    The Closest String Problem (CSP) is an optimisation problem, which is to obtain a string with the minimum distance from a number of given strings. In this paper, a new metaheuristic algorithm is investigated for the problem, whose main feature is relatively high speed in obtaining good solutions, which is essential when the input size is large. The proposed algorithm is compared with four recent algorithms suggested for the problem, outperforming them in more than 98% of the cases. It is also remarkably faster than all of them, running within 1 s in most of the experimental cases.

  20. Why Do Chinese Students Out-Perform Those from the West? Do Approaches to Learning Contribute to the Explanation?

    ERIC Educational Resources Information Center

    Kember, David

    2016-01-01

    One of the major current issues in education is the question of why Chinese and East Asian students are outperforming those from Western countries. Research into the approaches to learning of Chinese students revealed the existence of intermediate approaches, combining memorising and understanding, which were distinct from rote learning. At the…

  1. Nonuniform update for sparse target recovery in fluorescence molecular tomography accelerated by ordered subsets.

    PubMed

    Zhu, Dianwen; Li, Changqing

    2014-12-01

    Fluorescence molecular tomography (FMT) is a promising imaging modality and has been actively studied in the past two decades since it can locate the specific tumor position three-dimensionally in small animals. However, it remains a challenging task to obtain fast, robust and accurate reconstruction of fluorescent probe distribution in small animals due to the large computational burden, the noisy measurement and the ill-posed nature of the inverse problem. In this paper we propose a nonuniform preconditioning method in combination with L (1) regularization and ordered subsets technique (NUMOS) to take care of the different updating needs at different pixels, to enhance sparsity and suppress noise, and to further boost convergence of approximate solutions for fluorescence molecular tomography. Using both simulated data and phantom experiment, we found that the proposed nonuniform updating method outperforms its popular uniform counterpart by obtaining a more localized, less noisy, more accurate image. The computational cost was greatly reduced as well. The ordered subset (OS) technique provided additional 5 times and 3 times speed enhancements for simulation and phantom experiments, respectively, without degrading image qualities. When compared with the popular L (1) algorithms such as iterative soft-thresholding algorithm (ISTA) and Fast iterative soft-thresholding algorithm (FISTA) algorithms, NUMOS also outperforms them by obtaining a better image in much shorter period of time.

  2. The global Minmax k-means algorithm.

    PubMed

    Wang, Xiaoyan; Bai, Yanping

    2016-01-01

    The global k -means algorithm is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure from suitable initial positions, and employs k -means to minimize the sum of the intra-cluster variances. However the global k -means algorithm sometimes results singleton clusters and the initial positions sometimes are bad, after a bad initialization, poor local optimal can be easily obtained by k -means algorithm. In this paper, we modified the global k -means algorithm to eliminate the singleton clusters at first, and then we apply MinMax k -means clustering error method to global k -means algorithm to overcome the effect of bad initialization, proposed the global Minmax k -means algorithm. The proposed clustering method is tested on some popular data sets and compared to the k -means algorithm, the global k -means algorithm and the MinMax k -means algorithm. The experiment results show our proposed algorithm outperforms other algorithms mentioned in the paper.

  3. Deep-Learning-Based Drug-Target Interaction Prediction.

    PubMed

    Wen, Ming; Zhang, Zhimin; Niu, Shaoyu; Sha, Haozhi; Yang, Ruihan; Yun, Yonghuan; Lu, Hongmei

    2017-04-07

    Identifying interactions between known drugs and targets is a major challenge in drug repositioning. In silico prediction of drug-target interaction (DTI) can speed up the expensive and time-consuming experimental work by providing the most potent DTIs. In silico prediction of DTI can also provide insights about the potential drug-drug interaction and promote the exploration of drug side effects. Traditionally, the performance of DTI prediction depends heavily on the descriptors used to represent the drugs and the target proteins. In this paper, to accurately predict new DTIs between approved drugs and targets without separating the targets into different classes, we developed a deep-learning-based algorithmic framework named DeepDTIs. It first abstracts representations from raw input descriptors using unsupervised pretraining and then applies known label pairs of interaction to build a classification model. Compared with other methods, it is found that DeepDTIs reaches or outperforms other state-of-the-art methods. The DeepDTIs can be further used to predict whether a new drug targets to some existing targets or whether a new target interacts with some existing drugs.

  4. A new improved artificial bee colony algorithm for ship hull form optimization

    NASA Astrophysics Data System (ADS)

    Huang, Fuxin; Wang, Lijue; Yang, Chi

    2016-04-01

    The artificial bee colony (ABC) algorithm is a relatively new swarm intelligence-based optimization algorithm. Its simplicity of implementation, relatively few parameter settings and promising optimization capability make it widely used in different fields. However, it has problems of slow convergence due to its solution search equation. Here, a new solution search equation based on a combination of the elite solution pool and the block perturbation scheme is proposed to improve the performance of the algorithm. In addition, two different solution search equations are used by employed bees and onlooker bees to balance the exploration and exploitation of the algorithm. The developed algorithm is validated by a set of well-known numerical benchmark functions. It is then applied to optimize two ship hull forms with minimum resistance. The tested results show that the proposed new improved ABC algorithm can outperform the ABC algorithm in most of the tested problems.

  5. Wavelet tree structure based speckle noise removal for optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Yuan, Xin; Liu, Xuan; Liu, Yang

    2018-02-01

    We report a new speckle noise removal algorithm in optical coherence tomography (OCT). Though wavelet domain thresholding algorithms have demonstrated superior advantages in suppressing noise magnitude and preserving image sharpness in OCT, the wavelet tree structure has not been investigated in previous applications. In this work, we propose an adaptive wavelet thresholding algorithm via exploiting the tree structure in wavelet coefficients to remove the speckle noise in OCT images. The threshold for each wavelet band is adaptively selected following a special rule to retain the structure of the image across different wavelet layers. Our results demonstrate that the proposed algorithm outperforms conventional wavelet thresholding, with significant advantages in preserving image features.

  6. Improved adaptive splitting and selection: the hybrid training method of a classifier based on a feature space partitioning.

    PubMed

    Jackowski, Konrad; Krawczyk, Bartosz; Woźniak, Michał

    2014-05-01

    Currently, methods of combined classification are the focus of intense research. A properly designed group of combined classifiers exploiting knowledge gathered in a pool of elementary classifiers can successfully outperform a single classifier. There are two essential issues to consider when creating combined classifiers: how to establish the most comprehensive pool and how to design a fusion model that allows for taking full advantage of the collected knowledge. In this work, we address the issues and propose an AdaSS+, training algorithm dedicated for the compound classifier system that effectively exploits local specialization of the elementary classifiers. An effective training procedure consists of two phases. The first phase detects the classifier competencies and adjusts the respective fusion parameters. The second phase boosts classification accuracy by elevating the degree of local specialization. The quality of the proposed algorithms are evaluated on the basis of a wide range of computer experiments that show that AdaSS+ can outperform the original method and several reference classifiers.

  7. An Exact Algorithm to Compute the Double-Cut-and-Join Distance for Genomes with Duplicate Genes.

    PubMed

    Shao, Mingfu; Lin, Yu; Moret, Bernard M E

    2015-05-01

    Computing the edit distance between two genomes is a basic problem in the study of genome evolution. The double-cut-and-join (DCJ) model has formed the basis for most algorithmic research on rearrangements over the last few years. The edit distance under the DCJ model can be computed in linear time for genomes without duplicate genes, while the problem becomes NP-hard in the presence of duplicate genes. In this article, we propose an integer linear programming (ILP) formulation to compute the DCJ distance between two genomes with duplicate genes. We also provide an efficient preprocessing approach to simplify the ILP formulation while preserving optimality. Comparison on simulated genomes demonstrates that our method outperforms MSOAR in computing the edit distance, especially when the genomes contain long duplicated segments. We also apply our method to assign orthologous gene pairs among human, mouse, and rat genomes, where once again our method outperforms MSOAR.

  8. Improving cerebellar segmentation with statistical fusion

    NASA Astrophysics Data System (ADS)

    Plassard, Andrew J.; Yang, Zhen; Prince, Jerry L.; Claassen, Daniel O.; Landman, Bennett A.

    2016-03-01

    The cerebellum is a somatotopically organized central component of the central nervous system well known to be involved with motor coordination and increasingly recognized roles in cognition and planning. Recent work in multiatlas labeling has created methods that offer the potential for fully automated 3-D parcellation of the cerebellar lobules and vermis (which are organizationally equivalent to cortical gray matter areas). This work explores the trade offs of using different statistical fusion techniques and post hoc optimizations in two datasets with distinct imaging protocols. We offer a novel fusion technique by extending the ideas of the Selective and Iterative Method for Performance Level Estimation (SIMPLE) to a patch-based performance model. We demonstrate the effectiveness of our algorithm, Non- Local SIMPLE, for segmentation of a mixed population of healthy subjects and patients with severe cerebellar anatomy. Under the first imaging protocol, we show that Non-Local SIMPLE outperforms previous gold-standard segmentation techniques. In the second imaging protocol, we show that Non-Local SIMPLE outperforms previous gold standard techniques but is outperformed by a non-locally weighted vote with the deeper population of atlases available. This work advances the state of the art in open source cerebellar segmentation algorithms and offers the opportunity for routinely including cerebellar segmentation in magnetic resonance imaging studies that acquire whole brain T1-weighted volumes with approximately 1 mm isotropic resolution.

  9. Optimal Fungal Space Searching Algorithms.

    PubMed

    Asenova, Elitsa; Lin, Hsin-Yu; Fu, Eileen; Nicolau, Dan V; Nicolau, Dan V

    2016-10-01

    Previous experiments have shown that fungi use an efficient natural algorithm for searching the space available for their growth in micro-confined networks, e.g., mazes. This natural "master" algorithm, which comprises two "slave" sub-algorithms, i.e., collision-induced branching and directional memory, has been shown to be more efficient than alternatives, with one, or the other, or both sub-algorithms turned off. In contrast, the present contribution compares the performance of the fungal natural algorithm against several standard artificial homologues. It was found that the space-searching fungal algorithm consistently outperforms uninformed algorithms, such as Depth-First-Search (DFS). Furthermore, while the natural algorithm is inferior to informed ones, such as A*, this under-performance does not importantly increase with the increase of the size of the maze. These findings suggest that a systematic effort of harvesting the natural space searching algorithms used by microorganisms is warranted and possibly overdue. These natural algorithms, if efficient, can be reverse-engineered for graph and tree search strategies.

  10. Real time network traffic monitoring for wireless local area networks based on compressed sensing

    NASA Astrophysics Data System (ADS)

    Balouchestani, Mohammadreza

    2017-05-01

    A wireless local area network (WLAN) is an important type of wireless networks which connotes different wireless nodes in a local area network. WLANs suffer from important problems such as network load balancing, large amount of energy, and load of sampling. This paper presents a new networking traffic approach based on Compressed Sensing (CS) for improving the quality of WLANs. The proposed architecture allows reducing Data Delay Probability (DDP) to 15%, which is a good record for WLANs. The proposed architecture is increased Data Throughput (DT) to 22 % and Signal to Noise (S/N) ratio to 17 %, which provide a good background for establishing high qualified local area networks. This architecture enables continuous data acquisition and compression of WLAN's signals that are suitable for a variety of other wireless networking applications. At the transmitter side of each wireless node, an analog-CS framework is applied at the sensing step before analog to digital converter in order to generate the compressed version of the input signal. At the receiver side of wireless node, a reconstruction algorithm is applied in order to reconstruct the original signals from the compressed signals with high probability and enough accuracy. The proposed algorithm out-performs existing algorithms by achieving a good level of Quality of Service (QoS). This ability allows reducing 15 % of Bit Error Rate (BER) at each wireless node.

  11. Incorporating World Knowledge to Document Clustering via Heterogeneous Information Networks.

    PubMed

    Wang, Chenguang; Song, Yangqiu; El-Kishky, Ahmed; Roth, Dan; Zhang, Ming; Han, Jiawei

    2015-08-01

    One of the key obstacles in making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. We consider the framework to use the world knowledge as indirect supervision. World knowledge is general-purpose knowledge, which is not designed for any specific domain. Then the key challenges are how to adapt the world knowledge to domains and how to represent it for learning. In this paper, we provide an example of using world knowledge for domain dependent document clustering. We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network. Then we propose a clustering algorithm that can cluster multiple types and incorporate the sub-type information as constraints. In the experiments, we use two existing knowledge bases as our sources of world knowledge. One is Freebase, which is collaboratively collected knowledge about entities and their organizations. The other is YAGO2, a knowledge base automatically extracted from Wikipedia and maps knowledge to the linguistic knowledge base, Word-Net. Experimental results on two text benchmark datasets (20newsgroups and RCV1) show that incorporating world knowledge as indirect supervision can significantly outperform the state-of-the-art clustering algorithms as well as clustering algorithms enhanced with world knowledge features.

  12. A priori mesh grading for the numerical calculation of the head-related transfer functions

    PubMed Central

    Ziegelwanger, Harald; Kreuzer, Wolfgang; Majdak, Piotr

    2017-01-01

    Head-related transfer functions (HRTFs) describe the directional filtering of the incoming sound caused by the morphology of a listener’s head and pinnae. When an accurate model of a listener’s morphology exists, HRTFs can be calculated numerically with the boundary element method (BEM). However, the general recommendation to model the head and pinnae with at least six elements per wavelength renders the BEM as a time-consuming procedure when calculating HRTFs for the full audible frequency range. In this study, a mesh preprocessing algorithm is proposed, viz., a priori mesh grading, which reduces the computational costs in the HRTF calculation process significantly. The mesh grading algorithm deliberately violates the recommendation of at least six elements per wavelength in certain regions of the head and pinnae and varies the size of elements gradually according to an a priori defined grading function. The evaluation of the algorithm involved HRTFs calculated for various geometric objects including meshes of three human listeners and various grading functions. The numerical accuracy and the predicted sound-localization performance of calculated HRTFs were analyzed. A-priori mesh grading appeared to be suitable for the numerical calculation of HRTFs in the full audible frequency range and outperformed uniform meshes in terms of numerical errors, perception based predictions of sound-localization performance, and computational costs. PMID:28239186

  13. Incorporating World Knowledge to Document Clustering via Heterogeneous Information Networks

    PubMed Central

    Wang, Chenguang; Song, Yangqiu; El-Kishky, Ahmed; Roth, Dan; Zhang, Ming; Han, Jiawei

    2015-01-01

    One of the key obstacles in making learning protocols realistic in applications is the need to supervise them, a costly process that often requires hiring domain experts. We consider the framework to use the world knowledge as indirect supervision. World knowledge is general-purpose knowledge, which is not designed for any specific domain. Then the key challenges are how to adapt the world knowledge to domains and how to represent it for learning. In this paper, we provide an example of using world knowledge for domain dependent document clustering. We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network. Then we propose a clustering algorithm that can cluster multiple types and incorporate the sub-type information as constraints. In the experiments, we use two existing knowledge bases as our sources of world knowledge. One is Freebase, which is collaboratively collected knowledge about entities and their organizations. The other is YAGO2, a knowledge base automatically extracted from Wikipedia and maps knowledge to the linguistic knowledge base, Word-Net. Experimental results on two text benchmark datasets (20newsgroups and RCV1) show that incorporating world knowledge as indirect supervision can significantly outperform the state-of-the-art clustering algorithms as well as clustering algorithms enhanced with world knowledge features. PMID:26705504

  14. Traveling salesman problems with PageRank Distance on complex networks reveal community structure

    NASA Astrophysics Data System (ADS)

    Jiang, Zhongzhou; Liu, Jing; Wang, Shuai

    2016-12-01

    In this paper, we propose a new algorithm for community detection problems (CDPs) based on traveling salesman problems (TSPs), labeled as TSP-CDA. Since TSPs need to find a tour with minimum cost, cities close to each other are usually clustered in the tour. This inspired us to model CDPs as TSPs by taking each vertex as a city. Then, in the final tour, the vertices in the same community tend to cluster together, and the community structure can be obtained by cutting the tour into a couple of paths. There are two challenges. The first is to define a suitable distance between each pair of vertices which can reflect the probability that they belong to the same community. The second is to design a suitable strategy to cut the final tour into paths which can form communities. In TSP-CDA, we deal with these two challenges by defining a PageRank Distance and an automatic threshold-based cutting strategy. The PageRank Distance is designed with the intrinsic properties of CDPs in mind, and can be calculated efficiently. In the experiments, benchmark networks with 1000-10,000 nodes and varying structures are used to test the performance of TSP-CDA. A comparison is also made between TSP-CDA and two well-established community detection algorithms. The results show that TSP-CDA can find accurate community structure efficiently and outperforms the two existing algorithms.

  15. MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering

    PubMed Central

    Kim, Eun-Youn; Kim, Seon-Young; Ashlock, Daniel; Nam, Dougu

    2009-01-01

    Background Uncovering subtypes of disease from microarray samples has important clinical implications such as survival time and sensitivity of individual patients to specific therapies. Unsupervised clustering methods have been used to classify this type of data. However, most existing methods focus on clusters with compact shapes and do not reflect the geometric complexity of the high dimensional microarray clusters, which limits their performance. Results We present a cluster-number-based ensemble clustering algorithm, called MULTI-K, for microarray sample classification, which demonstrates remarkable accuracy. The method amalgamates multiple k-means runs by varying the number of clusters and identifies clusters that manifest the most robust co-memberships of elements. In addition to the original algorithm, we newly devised the entropy-plot to control the separation of singletons or small clusters. MULTI-K, unlike the simple k-means or other widely used methods, was able to capture clusters with complex and high-dimensional structures accurately. MULTI-K outperformed other methods including a recently developed ensemble clustering algorithm in tests with five simulated and eight real gene-expression data sets. Conclusion The geometric complexity of clusters should be taken into account for accurate classification of microarray data, and ensemble clustering applied to the number of clusters tackles the problem very well. The C++ code and the data sets tested are available from the authors. PMID:19698124

  16. Efficient convex-elastic net algorithm to solve the Euclidean traveling salesman problem.

    PubMed

    Al-Mulhem, M; Al-Maghrabi, T

    1998-01-01

    This paper describes a hybrid algorithm that combines an adaptive-type neural network algorithm and a nondeterministic iterative algorithm to solve the Euclidean traveling salesman problem (E-TSP). It begins with a brief introduction to the TSP and the E-TSP. Then, it presents the proposed algorithm with its two major components: the convex-elastic net (CEN) algorithm and the nondeterministic iterative improvement (NII) algorithm. These two algorithms are combined into the efficient convex-elastic net (ECEN) algorithm. The CEN algorithm integrates the convex-hull property and elastic net algorithm to generate an initial tour for the E-TSP. The NII algorithm uses two rearrangement operators to improve the initial tour given by the CEN algorithm. The paper presents simulation results for two instances of E-TSP: randomly generated tours and tours for well-known problems in the literature. Experimental results are given to show that the proposed algorithm ran find the nearly optimal solution for the E-TSP that outperform many similar algorithms reported in the literature. The paper concludes with the advantages of the new algorithm and possible extensions.

  17. Differential Topic Models.

    PubMed

    Chen, Changyou; Buntine, Wray; Ding, Nan; Xie, Lexing; Du, Lan

    2015-02-01

    In applications we may want to compare different document collections: they could have shared content but also different and unique aspects in particular collections. This task has been called comparative text mining or cross-collection modeling. We present a differential topic model for this application that models both topic differences and similarities. For this we use hierarchical Bayesian nonparametric models. Moreover, we found it was important to properly model power-law phenomena in topic-word distributions and thus we used the full Pitman-Yor process rather than just a Dirichlet process. Furthermore, we propose the transformed Pitman-Yor process (TPYP) to incorporate prior knowledge such as vocabulary variations in different collections into the model. To deal with the non-conjugate issue between model prior and likelihood in the TPYP, we thus propose an efficient sampling algorithm using a data augmentation technique based on the multinomial theorem. Experimental results show the model discovers interesting aspects of different collections. We also show the proposed MCMC based algorithm achieves a dramatically reduced test perplexity compared to some existing topic models. Finally, we show our model outperforms the state-of-the-art for document classification/ideology prediction on a number of text collections.

  18. Collaborative Filtering Recommendation on Users' Interest Sequences.

    PubMed

    Cheng, Weijie; Yin, Guisheng; Dong, Yuxin; Dong, Hongbin; Zhang, Wansong

    2016-01-01

    As an important factor for improving recommendations, time information has been introduced to model users' dynamic preferences in many papers. However, the sequence of users' behaviour is rarely studied in recommender systems. Due to the users' unique behavior evolution patterns and personalized interest transitions among items, users' similarity in sequential dimension should be introduced to further distinguish users' preferences and interests. In this paper, we propose a new collaborative filtering recommendation method based on users' interest sequences (IS) that rank users' ratings or other online behaviors according to the timestamps when they occurred. This method extracts the semantics hidden in the interest sequences by the length of users' longest common sub-IS (LCSIS) and the count of users' total common sub-IS (ACSIS). Then, these semantics are utilized to obtain users' IS-based similarities and, further, to refine the similarities acquired from traditional collaborative filtering approaches. With these updated similarities, transition characteristics and dynamic evolution patterns of users' preferences are considered. Our new proposed method was compared with state-of-the-art time-aware collaborative filtering algorithms on datasets MovieLens, Flixster and Ciao. The experimental results validate that the proposed recommendation method is effective and outperforms several existing algorithms in the accuracy of rating prediction.

  19. Collaborative Filtering Recommendation on Users’ Interest Sequences

    PubMed Central

    Cheng, Weijie; Yin, Guisheng; Dong, Yuxin; Dong, Hongbin; Zhang, Wansong

    2016-01-01

    As an important factor for improving recommendations, time information has been introduced to model users’ dynamic preferences in many papers. However, the sequence of users’ behaviour is rarely studied in recommender systems. Due to the users’ unique behavior evolution patterns and personalized interest transitions among items, users’ similarity in sequential dimension should be introduced to further distinguish users’ preferences and interests. In this paper, we propose a new collaborative filtering recommendation method based on users’ interest sequences (IS) that rank users’ ratings or other online behaviors according to the timestamps when they occurred. This method extracts the semantics hidden in the interest sequences by the length of users’ longest common sub-IS (LCSIS) and the count of users’ total common sub-IS (ACSIS). Then, these semantics are utilized to obtain users’ IS-based similarities and, further, to refine the similarities acquired from traditional collaborative filtering approaches. With these updated similarities, transition characteristics and dynamic evolution patterns of users’ preferences are considered. Our new proposed method was compared with state-of-the-art time-aware collaborative filtering algorithms on datasets MovieLens, Flixster and Ciao. The experimental results validate that the proposed recommendation method is effective and outperforms several existing algorithms in the accuracy of rating prediction. PMID:27195787

  20. Natural image statistics and low-complexity feature selection.

    PubMed

    Vasconcelos, Manuela; Vasconcelos, Nuno

    2009-02-01

    Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.

  1. A comparison of foveated acquisition and tracking performance relative to uniform resolution approaches

    NASA Astrophysics Data System (ADS)

    Dubuque, Shaun; Coffman, Thayne; McCarley, Paul; Bovik, A. C.; Thomas, C. William

    2009-05-01

    Foveated imaging has been explored for compression and tele-presence, but gaps exist in the study of foveated imaging applied to acquisition and tracking systems. Results are presented from two sets of experiments comparing simple foveated and uniform resolution targeting (acquisition and tracking) algorithms. The first experiments measure acquisition performance when locating Gabor wavelet targets in noise, with fovea placement driven by a mutual information measure. The foveated approach is shown to have lower detection delay than a notional uniform resolution approach when using video that consumes equivalent bandwidth. The second experiments compare the accuracy of target position estimates from foveated and uniform resolution tracking algorithms. A technique is developed to select foveation parameters that minimize error in Kalman filter state estimates. Foveated tracking is shown to consistently outperform uniform resolution tracking on an abstract multiple target task when using video that consumes equivalent bandwidth. Performance is also compared to uniform resolution processing without bandwidth limitations. In both experiments, superior performance is achieved at a given bandwidth by foveated processing because limited resources are allocated intelligently to maximize operational performance. These findings indicate the potential for operational performance improvements over uniform resolution systems in both acquisition and tracking tasks.

  2. Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach.

    PubMed

    Ng, Clara; Hauptman, Ruth; Zhang, Yinliang; Bourne, Philip E; Xie, Lei

    2014-01-01

    The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.

  3. Ubiquitous human upper-limb motion estimation using wearable sensors.

    PubMed

    Zhang, Zhi-Qiang; Wong, Wai-Choong; Wu, Jian-Kang

    2011-07-01

    Human motion capture technologies have been widely used in a wide spectrum of applications, including interactive game and learning, animation, film special effects, health care, navigation, and so on. The existing human motion capture techniques, which use structured multiple high-resolution cameras in a dedicated studio, are complicated and expensive. With the rapid development of microsensors-on-chip, human motion capture using wearable microsensors has become an active research topic. Because of the agility in movement, upper-limb motion estimation has been regarded as the most difficult problem in human motion capture. In this paper, we take the upper limb as our research subject and propose a novel ubiquitous upper-limb motion estimation algorithm, which concentrates on modeling the relationship between upper-arm movement and forearm movement. A link structure with 5 degrees of freedom (DOF) is proposed to model the human upper-limb skeleton structure. Parameters are defined according to Denavit-Hartenberg convention, forward kinematics equations are derived, and an unscented Kalman filter is deployed to estimate the defined parameters. The experimental results have shown that the proposed upper-limb motion capture and analysis algorithm outperforms other fusion methods and provides accurate results in comparison to the BTS optical motion tracker.

  4. Robust Adaptive Beamforming with Sensor Position Errors Using Weighted Subspace Fitting-Based Covariance Matrix Reconstruction.

    PubMed

    Chen, Peng; Yang, Yixin; Wang, Yong; Ma, Yuanliang

    2018-05-08

    When sensor position errors exist, the performance of recently proposed interference-plus-noise covariance matrix (INCM)-based adaptive beamformers may be severely degraded. In this paper, we propose a weighted subspace fitting-based INCM reconstruction algorithm to overcome sensor displacement for linear arrays. By estimating the rough signal directions, we construct a novel possible mismatched steering vector (SV) set. We analyze the proximity of the signal subspace from the sample covariance matrix (SCM) and the space spanned by the possible mismatched SV set. After solving an iterative optimization problem, we reconstruct the INCM using the estimated sensor position errors. Then we estimate the SV of the desired signal by solving an optimization problem with the reconstructed INCM. The main advantage of the proposed algorithm is its robustness against SV mismatches dominated by unknown sensor position errors. Numerical examples show that even if the position errors are up to half of the assumed sensor spacing, the output signal-to-interference-plus-noise ratio is only reduced by 4 dB. Beam patterns plotted using experiment data show that the interference suppression capability of the proposed beamformer outperforms other tested beamformers.

  5. PERIOD ESTIMATION FOR SPARSELY SAMPLED QUASI-PERIODIC LIGHT CURVES APPLIED TO MIRAS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    He, Shiyuan; Huang, Jianhua Z.; Long, James

    2016-12-01

    We develop a nonlinear semi-parametric Gaussian process model to estimate periods of Miras with sparsely sampled light curves. The model uses a sinusoidal basis for the periodic variation and a Gaussian process for the stochastic changes. We use maximum likelihood to estimate the period and the parameters of the Gaussian process, while integrating out the effects of other nuisance parameters in the model with respect to a suitable prior distribution obtained from earlier studies. Since the likelihood is highly multimodal for period, we implement a hybrid method that applies the quasi-Newton algorithm for Gaussian process parameters and search the period/frequencymore » parameter space over a dense grid. A large-scale, high-fidelity simulation is conducted to mimic the sampling quality of Mira light curves obtained by the M33 Synoptic Stellar Survey. The simulated data set is publicly available and can serve as a testbed for future evaluation of different period estimation methods. The semi-parametric model outperforms an existing algorithm on this simulated test data set as measured by period recovery rate and quality of the resulting period–luminosity relations.« less

  6. An Effective Massive Sensor Network Data Access Scheme Based on Topology Control for the Internet of Things.

    PubMed

    Yi, Meng; Chen, Qingkui; Xiong, Neal N

    2016-11-03

    This paper considers the distributed access and control problem of massive wireless sensor networks' data access center for the Internet of Things, which is an extension of wireless sensor networks and an element of its topology structure. In the context of the arrival of massive service access requests at a virtual data center, this paper designs a massive sensing data access and control mechanism to improve the access efficiency of service requests and makes full use of the available resources at the data access center for the Internet of things. Firstly, this paper proposes a synergistically distributed buffer access model, which separates the information of resource and location. Secondly, the paper divides the service access requests into multiple virtual groups based on their characteristics and locations using an optimized self-organizing feature map neural network. Furthermore, this paper designs an optimal scheduling algorithm of group migration based on the combination scheme between the artificial bee colony algorithm and chaos searching theory. Finally, the experimental results demonstrate that this mechanism outperforms the existing schemes in terms of enhancing the accessibility of service requests effectively, reducing network delay, and has higher load balancing capacity and higher resource utility rate.

  7. Objective Quality Assessment for Color-to-Gray Image Conversion.

    PubMed

    Ma, Kede; Zhao, Tiesong; Zeng, Kai; Wang, Zhou

    2015-12-01

    Color-to-gray (C2G) image conversion is the process of transforming a color image into a grayscale one. Despite its wide usage in real-world applications, little work has been dedicated to compare the performance of C2G conversion algorithms. Subjective evaluation is reliable but is also inconvenient and time consuming. Here, we make one of the first attempts to develop an objective quality model that automatically predicts the perceived quality of C2G converted images. Inspired by the philosophy of the structural similarity index, we propose a C2G structural similarity (C2G-SSIM) index, which evaluates the luminance, contrast, and structure similarities between the reference color image and the C2G converted image. The three components are then combined depending on image type to yield an overall quality measure. Experimental results show that the proposed C2G-SSIM index has close agreement with subjective rankings and significantly outperforms existing objective quality metrics for C2G conversion. To explore the potentials of C2G-SSIM, we further demonstrate its use in two applications: 1) automatic parameter tuning for C2G conversion algorithms and 2) adaptive fusion of C2G converted images.

  8. Social Milieu Oriented Routing: A New Dimension to Enhance Network Security in WSNs.

    PubMed

    Liu, Lianggui; Chen, Li; Jia, Huiling

    2016-02-19

    In large-scale wireless sensor networks (WSNs), in order to enhance network security, it is crucial for a trustor node to perform social milieu oriented routing to a target a trustee node to carry out trust evaluation. This challenging social milieu oriented routing with more than one end-to-end Quality of Trust (QoT) constraint has proved to be NP-complete. Heuristic algorithms with polynomial and pseudo-polynomial-time complexities are often used to deal with this challenging problem. However, existing solutions cannot guarantee the efficiency of searching; that is, they can hardly avoid obtaining partial optimal solutions during a searching process. Quantum annealing (QA) uses delocalization and tunneling to avoid falling into local minima without sacrificing execution time. This has been proven a promising way to many optimization problems in recently published literatures. In this paper, for the first time, with the help of a novel approach, that is, configuration path-integral Monte Carlo (CPIMC) simulations, a QA-based optimal social trust path (QA_OSTP) selection algorithm is applied to the extraction of the optimal social trust path in large-scale WSNs. Extensive experiments have been conducted, and the experiment results demonstrate that QA_OSTP outperforms its heuristic opponents.

  9. Implementation of rigorous renormalization group method for ground space and low-energy states of local Hamiltonians

    NASA Astrophysics Data System (ADS)

    Roberts, Brenden; Vidick, Thomas; Motrunich, Olexei I.

    2017-12-01

    The success of polynomial-time tensor network methods for computing ground states of certain quantum local Hamiltonians has recently been given a sound theoretical basis by Arad et al. [Math. Phys. 356, 65 (2017), 10.1007/s00220-017-2973-z]. The convergence proof, however, relies on "rigorous renormalization group" (RRG) techniques which differ fundamentally from existing algorithms. We introduce a practical adaptation of the RRG procedure which, while no longer theoretically guaranteed to converge, finds matrix product state ansatz approximations to the ground spaces and low-lying excited spectra of local Hamiltonians in realistic situations. In contrast to other schemes, RRG does not utilize variational methods on tensor networks. Rather, it operates on subsets of the system Hilbert space by constructing approximations to the global ground space in a treelike manner. We evaluate the algorithm numerically, finding similar performance to density matrix renormalization group (DMRG) in the case of a gapped nondegenerate Hamiltonian. Even in challenging situations of criticality, large ground-state degeneracy, or long-range entanglement, RRG remains able to identify candidate states having large overlap with ground and low-energy eigenstates, outperforming DMRG in some cases.

  10. Reinforcement Learning in Distributed Domains: Beyond Team Games

    NASA Technical Reports Server (NTRS)

    Wolpert, David H.; Sill, Joseph; Turner, Kagan

    2000-01-01

    Distributed search algorithms are crucial in dealing with large optimization problems, particularly when a centralized approach is not only impractical but infeasible. Many machine learning concepts have been applied to search algorithms in order to improve their effectiveness. In this article we present an algorithm that blends Reinforcement Learning (RL) and hill climbing directly, by using the RL signal to guide the exploration step of a hill climbing algorithm. We apply this algorithm to the domain of a constellations of communication satellites where the goal is to minimize the loss of importance weighted data. We introduce the concept of 'ghost' traffic, where correctly setting this traffic induces the satellites to act to optimize the world utility. Our results indicated that the bi-utility search introduced in this paper outperforms both traditional hill climbing algorithms and distributed RL approaches such as team games.

  11. Hierarchical Dirichlet process model for gene expression clustering

    PubMed Central

    2013-01-01

    Clustering is an important data processing tool for interpreting microarray data and genomic network inference. In this article, we propose a clustering algorithm based on the hierarchical Dirichlet processes (HDP). The HDP clustering introduces a hierarchical structure in the statistical model which captures the hierarchical features prevalent in biological data such as the gene express data. We develop a Gibbs sampling algorithm based on the Chinese restaurant metaphor for the HDP clustering. We apply the proposed HDP algorithm to both regulatory network segmentation and gene expression clustering. The HDP algorithm is shown to outperform several popular clustering algorithms by revealing the underlying hierarchical structure of the data. For the yeast cell cycle data, we compare the HDP result to the standard result and show that the HDP algorithm provides more information and reduces the unnecessary clustering fragments. PMID:23587447

  12. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems

    PubMed Central

    Cao, Leilei; Xu, Lihong; Goodman, Erik D.

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared. PMID:27293421

  13. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems.

    PubMed

    Cao, Leilei; Xu, Lihong; Goodman, Erik D

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared.

  14. Enhancing Sparsity by Reweighted l(1) Minimization

    DTIC Science & Technology

    2008-07-01

    recovery depends on the sparsity level k. The dashed curves represent a reweighted ℓ1 algorithm that outperforms the traditional unweighted ℓ1...approach (solid curve ). (a) Performance after 4 reweighting iterations as a function of ǫ. (b) Performance with fixed ǫ = 0.1 as a function of the number of...signal recovery (declared when ‖x0 − x‖ℓ∞ ≤ 10−3) for the unweighted ℓ1 algorithm as a function of the sparsity level k. The dashed curves represent the

  15. Machine Learning Algorithms Outperform Conventional Regression Models in Predicting Development of Hepatocellular Carcinoma

    PubMed Central

    Singal, Amit G.; Mukherjee, Ashin; Elmunzer, B. Joseph; Higgins, Peter DR; Lok, Anna S.; Zhu, Ji; Marrero, Jorge A; Waljee, Akbar K

    2015-01-01

    Background Predictive models for hepatocellular carcinoma (HCC) have been limited by modest accuracy and lack of validation. Machine learning algorithms offer a novel methodology, which may improve HCC risk prognostication among patients with cirrhosis. Our study's aim was to develop and compare predictive models for HCC development among cirrhotic patients, using conventional regression analysis and machine learning algorithms. Methods We enrolled 442 patients with Child A or B cirrhosis at the University of Michigan between January 2004 and September 2006 (UM cohort) and prospectively followed them until HCC development, liver transplantation, death, or study termination. Regression analysis and machine learning algorithms were used to construct predictive models for HCC development, which were tested on an independent validation cohort from the Hepatitis C Antiviral Long-term Treatment against Cirrhosis (HALT-C) Trial. Both models were also compared to the previously published HALT-C model. Discrimination was assessed using receiver operating characteristic curve analysis and diagnostic accuracy was assessed with net reclassification improvement and integrated discrimination improvement statistics. Results After a median follow-up of 3.5 years, 41 patients developed HCC. The UM regression model had a c-statistic of 0.61 (95%CI 0.56-0.67), whereas the machine learning algorithm had a c-statistic of 0.64 (95%CI 0.60–0.69) in the validation cohort. The machine learning algorithm had significantly better diagnostic accuracy as assessed by net reclassification improvement (p<0.001) and integrated discrimination improvement (p=0.04). The HALT-C model had a c-statistic of 0.60 (95%CI 0.50-0.70) in the validation cohort and was outperformed by the machine learning algorithm (p=0.047). Conclusion Machine learning algorithms improve the accuracy of risk stratifying patients with cirrhosis and can be used to accurately identify patients at high-risk for developing HCC. PMID:24169273

  16. Joint penalized-likelihood reconstruction of time-activity curves and regions-of-interest from projection data in brain PET

    NASA Astrophysics Data System (ADS)

    Krestyannikov, E.; Tohka, J.; Ruotsalainen, U.

    2008-06-01

    This paper presents a novel statistical approach for joint estimation of regions-of-interest (ROIs) and the corresponding time-activity curves (TACs) from dynamic positron emission tomography (PET) brain projection data. It is based on optimizing the joint objective function that consists of a data log-likelihood term and two penalty terms reflecting the available a priori information about the human brain anatomy. The developed local optimization strategy iteratively updates both the ROI and TAC parameters and is guaranteed to monotonically increase the objective function. The quantitative evaluation of the algorithm is performed with numerically and Monte Carlo-simulated dynamic PET brain data of the 11C-Raclopride and 18F-FDG tracers. The results demonstrate that the method outperforms the existing sequential ROI quantification approaches in terms of accuracy, and can noticeably reduce the errors in TACs arising due to the finite spatial resolution and ROI delineation.

  17. Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.

    PubMed

    Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui

    2018-02-01

    In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most of the multilabel classification methods focus only on the inherent correlations existing among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method aiming to capture the correlations among multiple concepts by leveraging hypergraph that is proved to be beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by many multilabel learning algorithms. To better show the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.

  18. T-IDBA: A de novo Iterative de Bruijn Graph Assembler for Transcriptome

    NASA Astrophysics Data System (ADS)

    Peng, Yu; Leung, Henry C. M.; Yiu, S. M.; Chin, Francis Y. L.

    RNA-seq data produced by next-generation sequencing technology is a useful tool for analyzing transcriptomes. However, existing de novo transcriptome assemblers do not fully utilize the properties of transcriptomes and may result in short contigs because of the splicing nature (shared exons) of the genes. We propose the T-IDBA algorithm to reconstruct expressed isoforms without reference genome. By using pair-end information to solve the problem of long repeats in different genes and branching in the same gene due to alternative splicing, the graph can be decomposed into small components, each corresponds to a gene. The most possible isoforms with sufficient support from the pair-end reads will be found heuristically. In practice, our de novo transcriptome assembler, T-IDBA, outperforms Abyss substantially in terms of sensitivity and precision for both simulated and real data. T-IDBA is available at http://www.cs.hku.hk/~alse/tidba/

  19. Information Filtering via Heterogeneous Diffusion in Online Bipartite Networks

    PubMed Central

    Zhang, Fu-Guo; Zeng, An

    2015-01-01

    The rapid expansion of Internet brings us overwhelming online information, which is impossible for an individual to go through all of it. Therefore, recommender systems were created to help people dig through this abundance of information. In networks composed by users and objects, recommender algorithms based on diffusion have been proven to be one of the best performing methods. Previous works considered the diffusion process from user to object, and from object to user to be equivalent. We show in this work that it is not the case and we improve the quality of the recommendation by taking into account the asymmetrical nature of this process. We apply this idea to modify the state-of-the-art recommendation methods. The simulation results show that the new methods can outperform these existing methods in both recommendation accuracy and diversity. Finally, this modification is checked to be able to improve the recommendation in a realistic case. PMID:26125631

  20. A Generalized Mixture Framework for Multi-label Classification

    PubMed Central

    Hong, Charmgil; Batal, Iyad; Hauskrecht, Milos

    2015-01-01

    We develop a novel probabilistic ensemble framework for multi-label classification that is based on the mixtures-of-experts architecture. In this framework, we combine multi-label classification models in the classifier chains family that decompose the class posterior distribution P(Y1, …, Yd|X) using a product of posterior distributions over components of the output space. Our approach captures different input–output and output–output relations that tend to change across data. As a result, we can recover a rich set of dependency relations among inputs and outputs that a single multi-label classification model cannot capture due to its modeling simplifications. We develop and present algorithms for learning the mixtures-of-experts models from data and for performing multi-label predictions on unseen data instances. Experiments on multiple benchmark datasets demonstrate that our approach achieves highly competitive results and outperforms the existing state-of-the-art multi-label classification methods. PMID:26613069

  1. Information Filtering via Heterogeneous Diffusion in Online Bipartite Networks.

    PubMed

    Zhang, Fu-Guo; Zeng, An

    2015-01-01

    The rapid expansion of Internet brings us overwhelming online information, which is impossible for an individual to go through all of it. Therefore, recommender systems were created to help people dig through this abundance of information. In networks composed by users and objects, recommender algorithms based on diffusion have been proven to be one of the best performing methods. Previous works considered the diffusion process from user to object, and from object to user to be equivalent. We show in this work that it is not the case and we improve the quality of the recommendation by taking into account the asymmetrical nature of this process. We apply this idea to modify the state-of-the-art recommendation methods. The simulation results show that the new methods can outperform these existing methods in both recommendation accuracy and diversity. Finally, this modification is checked to be able to improve the recommendation in a realistic case.

  2. State-space adjustment of radar rainfall and skill score evaluation of stochastic volume forecasts in urban drainage systems.

    PubMed

    Löwe, Roland; Mikkelsen, Peter Steen; Rasmussen, Michael R; Madsen, Henrik

    2013-01-01

    Merging of radar rainfall data with rain gauge measurements is a common approach to overcome problems in deriving rain intensities from radar measurements. We extend an existing approach for adjustment of C-band radar data using state-space models and use the resulting rainfall intensities as input for forecasting outflow from two catchments in the Copenhagen area. Stochastic grey-box models are applied to create the runoff forecasts, providing us with not only a point forecast but also a quantification of the forecast uncertainty. Evaluating the results, we can show that using the adjusted radar data improves runoff forecasts compared with using the original radar data and that rain gauge measurements as forecast input are also outperformed. Combining the data merging approach with short-term rainfall forecasting algorithms may result in further improved runoff forecasts that can be used in real time control.

  3. Consistently Sampled Correlation Filters with Space Anisotropic Regularization for Visual Tracking

    PubMed Central

    Shi, Guokai; Xu, Tingfa; Luo, Jiqiang; Li, Yuankun

    2017-01-01

    Most existing correlation filter-based tracking algorithms, which use fixed patches and cyclic shifts as training and detection measures, assume that the training samples are reliable and ignore the inconsistencies between training samples and detection samples. We propose to construct and study a consistently sampled correlation filter with space anisotropic regularization (CSSAR) to solve these two problems simultaneously. Our approach constructs a spatiotemporally consistent sample strategy to alleviate the redundancies in training samples caused by the cyclical shifts, eliminate the inconsistencies between training samples and detection samples, and introduce space anisotropic regularization to constrain the correlation filter for alleviating drift caused by occlusion. Moreover, an optimization strategy based on the Gauss-Seidel method was developed for obtaining robust and efficient online learning. Both qualitative and quantitative evaluations demonstrate that our tracker outperforms state-of-the-art trackers in object tracking benchmarks (OTBs). PMID:29231876

  4. Clustering for Binary Data Sets by Using Genetic Algorithm-Incremental K-means

    NASA Astrophysics Data System (ADS)

    Saharan, S.; Baragona, R.; Nor, M. E.; Salleh, R. M.; Asrah, N. M.

    2018-04-01

    This research was initially driven by the lack of clustering algorithms that specifically focus in binary data. To overcome this gap in knowledge, a promising technique for analysing this type of data became the main subject in this research, namely Genetic Algorithms (GA). For the purpose of this research, GA was combined with the Incremental K-means (IKM) algorithm to cluster the binary data streams. In GAIKM, the objective function was based on a few sufficient statistics that may be easily and quickly calculated on binary numbers. The implementation of IKM will give an advantage in terms of fast convergence. The results show that GAIKM is an efficient and effective new clustering algorithm compared to the clustering algorithms and to the IKM itself. In conclusion, the GAIKM outperformed other clustering algorithms such as GCUK, IKM, Scalable K-means (SKM) and K-means clustering and paves the way for future research involving missing data and outliers.

  5. Combining Biomarkers Linearly and Nonlinearly for Classification Using the Area Under the ROC Curve

    PubMed Central

    Fong, Youyi; Yin, Shuxin; Huang, Ying

    2016-01-01

    In biomedical studies, it is often of interest to classify/predict a subject’s disease status based on a variety of biomarker measurements. A commonly used classification criterion is based on AUC - Area under the Receiver Operating Characteristic Curve. Many methods have been proposed to optimize approximated empirical AUC criteria, but there are two limitations to the existing methods. First, most methods are only designed to find the best linear combination of biomarkers, which may not perform well when there is strong nonlinearity in the data. Second, many existing linear combination methods use gradient-based algorithms to find the best marker combination, which often result in sub-optimal local solutions. In this paper, we address these two problems by proposing a new kernel-based AUC optimization method called Ramp AUC (RAUC). This method approximates the empirical AUC loss function with a ramp function, and finds the best combination by a difference of convex functions algorithm. We show that as a linear combination method, RAUC leads to a consistent and asymptotically normal estimator of the linear marker combination when the data is generated from a semiparametric generalized linear model, just as the Smoothed AUC method (SAUC). Through simulation studies and real data examples, we demonstrate that RAUC out-performs SAUC in finding the best linear marker combinations, and can successfully capture nonlinear pattern in the data to achieve better classification performance. We illustrate our method with a dataset from a recent HIV vaccine trial. PMID:27058981

  6. Multiple-variable neighbourhood search for the single-machine total weighted tardiness problem

    NASA Astrophysics Data System (ADS)

    Chung, Tsui-Ping; Fu, Qunjie; Liao, Ching-Jong; Liu, Yi-Ting

    2017-07-01

    The single-machine total weighted tardiness (SMTWT) problem is a typical discrete combinatorial optimization problem in the scheduling literature. This problem has been proved to be NP hard and thus provides a challenging area for metaheuristics, especially the variable neighbourhood search algorithm. In this article, a multiple variable neighbourhood search (m-VNS) algorithm with multiple neighbourhood structures is proposed to solve the problem. Special mechanisms named matching and strengthening operations are employed in the algorithm, which has an auto-revising local search procedure to explore the solution space beyond local optimality. Two aspects, searching direction and searching depth, are considered, and neighbourhood structures are systematically exchanged. Experimental results show that the proposed m-VNS algorithm outperforms all the compared algorithms in solving the SMTWT problem.

  7. A junction-tree based learning algorithm to optimize network wide traffic control: A coordinated multi-agent framework

    DOE PAGES

    Zhu, Feng; Aziz, H. M. Abdul; Qian, Xinwu; ...

    2015-01-31

    Our study develops a novel reinforcement learning algorithm for the challenging coordinated signal control problem. Traffic signals are modeled as intelligent agents interacting with the stochastic traffic environment. The model is built on the framework of coordinated reinforcement learning. The Junction Tree Algorithm (JTA) based reinforcement learning is proposed to obtain an exact inference of the best joint actions for all the coordinated intersections. Moreover, the algorithm is implemented and tested with a network containing 18 signalized intersections in VISSIM. Finally, our results show that the JTA based algorithm outperforms independent learning (Q-learning), real-time adaptive learning, and fixed timing plansmore » in terms of average delay, number of stops, and vehicular emissions at the network level.« less

  8. Bi-Force: large-scale bicluster editing and its application to gene expression data biclustering

    PubMed Central

    Sun, Peng; Speicher, Nora K.; Röttger, Richard; Guo, Jiong; Baumbach, Jan

    2014-01-01

    Abstract The explosion of the biological data has dramatically reformed today's biological research. The need to integrate and analyze high-dimensional biological data on a large scale is driving the development of novel bioinformatics approaches. Biclustering, also known as ‘simultaneous clustering’ or ‘co-clustering’, has been successfully utilized to discover local patterns in gene expression data and similar biomedical data types. Here, we contribute a new heuristic: ‘Bi-Force’. It is based on the weighted bicluster editing model, to perform biclustering on arbitrary sets of biological entities, given any kind of pairwise similarities. We first evaluated the power of Bi-Force to solve dedicated bicluster editing problems by comparing Bi-Force with two existing algorithms in the BiCluE software package. We then followed a biclustering evaluation protocol in a recent review paper from Eren et al. (2013) (A comparative analysis of biclustering algorithms for gene expressiondata. Brief. Bioinform., 14:279–292.) and compared Bi-Force against eight existing tools: FABIA, QUBIC, Cheng and Church, Plaid, BiMax, Spectral, xMOTIFs and ISA. To this end, a suite of synthetic datasets as well as nine large gene expression datasets from Gene Expression Omnibus were analyzed. All resulting biclusters were subsequently investigated by Gene Ontology enrichment analysis to evaluate their biological relevance. The distinct theoretical foundation of Bi-Force (bicluster editing) is more powerful than strict biclustering. We thus outperformed existing tools with Bi-Force at least when following the evaluation protocols from Eren et al. Bi-Force is implemented in Java and integrated into the open source software package of BiCluE. The software as well as all used datasets are publicly available at http://biclue.mpi-inf.mpg.de. PMID:24682815

  9. Supervised detection of exoplanets in high-contrast imaging sequences

    NASA Astrophysics Data System (ADS)

    Gomez Gonzalez, C. A.; Absil, O.; Van Droogenbroeck, M.

    2018-06-01

    Context. Post-processing algorithms play a key role in pushing the detection limits of high-contrast imaging (HCI) instruments. State-of-the-art image processing approaches for HCI enable the production of science-ready images relying on unsupervised learning techniques, such as low-rank approximations, for generating a model point spread function (PSF) and subtracting the residual starlight and speckle noise. Aims: In order to maximize the detection rate of HCI instruments and survey campaigns, advanced algorithms with higher sensitivities to faint companions are needed, especially for the speckle-dominated innermost region of the images. Methods: We propose a reformulation of the exoplanet detection task (for ADI sequences) that builds on well-established machine learning techniques to take HCI post-processing from an unsupervised to a supervised learning context. In this new framework, we present algorithmic solutions using two different discriminative models: SODIRF (random forests) and SODINN (neural networks). We test these algorithms on real ADI datasets from VLT/NACO and VLT/SPHERE HCI instruments. We then assess their performances by injecting fake companions and using receiver operating characteristic analysis. This is done in comparison with state-of-the-art ADI algorithms, such as ADI principal component analysis (ADI-PCA). Results: This study shows the improved sensitivity versus specificity trade-off of the proposed supervised detection approach. At the diffraction limit, SODINN improves the true positive rate by a factor ranging from 2 to 10 (depending on the dataset and angular separation) with respect to ADI-PCA when working at the same false-positive level. Conclusions: The proposed supervised detection framework outperforms state-of-the-art techniques in the task of discriminating planet signal from speckles. In addition, it offers the possibility of re-processing existing HCI databases to maximize their scientific return and potentially improve the demographics of directly imaged exoplanets.

  10. Long-term power generation expansion planning with short-term demand response: Model, algorithms, implementation, and electricity policies

    NASA Astrophysics Data System (ADS)

    Lohmann, Timo

    Electric sector models are powerful tools that guide policy makers and stakeholders. Long-term power generation expansion planning models are a prominent example and determine a capacity expansion for an existing power system over a long planning horizon. With the changes in the power industry away from monopolies and regulation, the focus of these models has shifted to competing electric companies maximizing their profit in a deregulated electricity market. In recent years, consumers have started to participate in demand response programs, actively influencing electricity load and price in the power system. We introduce a model that features investment and retirement decisions over a long planning horizon of more than 20 years, as well as an hourly representation of day-ahead electricity markets in which sellers of electricity face buyers. This combination makes our model both unique and challenging to solve. Decomposition algorithms, and especially Benders decomposition, can exploit the model structure. We present a novel method that can be seen as an alternative to generalized Benders decomposition and relies on dynamic linear overestimation. We prove its finite convergence and present computational results, demonstrating its superiority over traditional approaches. In certain special cases of our model, all necessary solution values in the decomposition algorithms can be directly calculated and solving mathematical programming problems becomes entirely obsolete. This leads to highly efficient algorithms that drastically outperform their programming problem-based counterparts. Furthermore, we discuss the implementation of all tailored algorithms and the challenges from a modeling software developer's standpoint, providing an insider's look into the modeling language GAMS. Finally, we apply our model to the Texas power system and design two electricity policies motivated by the U.S. Environment Protection Agency's recently proposed CO2 emissions targets for the power sector.

  11. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space.

    PubMed

    Akbar, Shahid; Hayat, Maqsood; Iqbal, Muhammad; Jan, Mian Ahmad

    2017-06-01

    Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies such as, chemotherapy and radiation, are highly expensive, susceptible to errors and ineffective techniques. These conventional techniques induce severe side-effects on human cells. Due to perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for identification of anticancer peptides. In this paper, evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated, using three different discrete feature representation methods, i.e., amphiphilic Pseudo amino acid composition, g-Gap dipeptide composition, and Reduce amino acid alphabet composition. The performance of the extracted feature spaces are investigated separately and then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined together, using optimized genetic algorithm and simple majority technique in order to enhance the true classification rate. It is observed that genetic algorithm-based ensemble classification outperforms than individual classifiers as well as simple majority voting base ensemble. The performance of genetic algorithm-based ensemble classification is highly reported on hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers. Copyright © 2017 Elsevier B.V. All rights reserved.

  12. Computer vision-based automated peak picking applied to protein NMR spectra.

    PubMed

    Klukowski, Piotr; Walczak, Michal J; Gonczarek, Adam; Boudet, Julien; Wider, Gerhard

    2015-09-15

    A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but there are still major deficiencies/flaws that often prevent complete and error free peak picking of biological macromolecule spectra. The major challenges of automated peak picking algorithms is both the distinction of artifacts from real peaks particularly from those with irregular shapes and also picking peaks in spectral regions with overlapping resonances which are very hard to resolve by existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a 'blind' algorithm. We present a novel approach using computer vision (CV) methodology which could be better adapted to the problem of peak recognition. After suitable 'training' we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra. CV-Peak Picker is available upon request from the authors. gsw@mol.biol.ethz.ch; michal.walczak@mol.biol.ethz.ch; adam.gonczarek@pwr.edu.pl Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  13. Energy-Efficient Deadline-Aware Data-Gathering Scheme Using Multiple Mobile Data Collectors.

    PubMed

    Dasgupta, Rumpa; Yoon, Seokhoon

    2017-04-01

    In wireless sensor networks, the data collected by sensors are usually forwarded to the sink through multi-hop forwarding. However, multi-hop forwarding can be inefficient due to the energy hole problem and high communications overhead. Moreover, when the monitored area is large and the number of sensors is small, sensors cannot send the data via multi-hop forwarding due to the lack of network connectivity. In order to address those problems of multi-hop forwarding, in this paper, we consider a data collection scheme that uses mobile data collectors (MDCs), which visit sensors and collect data from them. Due to the recent breakthroughs in wireless power transfer technology, MDCs can also be used to recharge the sensors to keep them from draining their energy. In MDC-based data-gathering schemes, a big challenge is how to find the MDCs' traveling paths in a balanced way, such that their energy consumption is minimized and the packet-delay constraint is satisfied. Therefore, in this paper, we aim at finding the MDCs' paths, taking energy efficiency and delay constraints into account. We first define an optimization problem, named the delay-constrained energy minimization (DCEM) problem, to find the paths for MDCs. An integer linear programming problem is formulated to find the optimal solution. We also propose a two-phase path-selection algorithm to efficiently solve the DCEM problem. Simulations are performed to compare the performance of the proposed algorithms with two heuristics algorithms for the vehicle routing problem under various scenarios. The simulation results show that the proposed algorithms can outperform existing algorithms in terms of energy efficiency and packet delay.

  14. Performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches in VQ codebook generation for image compression

    NASA Astrophysics Data System (ADS)

    Tsai, Jinn-Tsong; Chou, Ping-Yi; Chou, Jyh-Horng

    2015-11-01

    The aim of this study is to generate vector quantisation (VQ) codebooks by integrating principle component analysis (PCA) algorithm, Linde-Buzo-Gray (LBG) algorithm, and evolutionary algorithms (EAs). The EAs include genetic algorithm (GA), particle swarm optimisation (PSO), honey bee mating optimisation (HBMO), and firefly algorithm (FF). The study is to provide performance comparisons between PCA-EA-LBG and PCA-LBG-EA approaches. The PCA-EA-LBG approaches contain PCA-GA-LBG, PCA-PSO-LBG, PCA-HBMO-LBG, and PCA-FF-LBG, while the PCA-LBG-EA approaches contain PCA-LBG, PCA-LBG-GA, PCA-LBG-PSO, PCA-LBG-HBMO, and PCA-LBG-FF. All training vectors of test images are grouped according to PCA. The PCA-EA-LBG used the vectors grouped by PCA as initial individuals, and the best solution gained by the EAs was given for LBG to discover a codebook. The PCA-LBG approach is to use the PCA to select vectors as initial individuals for LBG to find a codebook. The PCA-LBG-EA used the final result of PCA-LBG as an initial individual for EAs to find a codebook. The search schemes in PCA-EA-LBG first used global search and then applied local search skill, while in PCA-LBG-EA first used local search and then employed global search skill. The results verify that the PCA-EA-LBG indeed gain superior results compared to the PCA-LBG-EA, because the PCA-EA-LBG explores a global area to find a solution, and then exploits a better one from the local area of the solution. Furthermore the proposed PCA-EA-LBG approaches in designing VQ codebooks outperform existing approaches shown in the literature.

  15. Paroxysmal atrial fibrillation prediction based on HRV analysis and non-dominated sorting genetic algorithm III.

    PubMed

    Boon, K H; Khalil-Hani, M; Malarvili, M B

    2018-01-01

    This paper presents a method that able to predict the paroxysmal atrial fibrillation (PAF). The method uses shorter heart rate variability (HRV) signals when compared to existing methods, and achieves good prediction accuracy. PAF is a common cardiac arrhythmia that increases the health risk of a patient, and the development of an accurate predictor of the onset of PAF is clinical important because it increases the possibility to electrically stabilize and prevent the onset of atrial arrhythmias with different pacing techniques. We propose a multi-objective optimization algorithm based on the non-dominated sorting genetic algorithm III for optimizing the baseline PAF prediction system, that consists of the stages of pre-processing, HRV feature extraction, and support vector machine (SVM) model. The pre-processing stage comprises of heart rate correction, interpolation, and signal detrending. After that, time-domain, frequency-domain, non-linear HRV features are extracted from the pre-processed data in feature extraction stage. Then, these features are used as input to the SVM for predicting the PAF event. The proposed optimization algorithm is used to optimize the parameters and settings of various HRV feature extraction algorithms, select the best feature subsets, and tune the SVM parameters simultaneously for maximum prediction performance. The proposed method achieves an accuracy rate of 87.7%, which significantly outperforms most of the previous works. This accuracy rate is achieved even with the HRV signal length being reduced from the typical 30 min to just 5 min (a reduction of 83%). Furthermore, another significant result is the sensitivity rate, which is considered more important that other performance metrics in this paper, can be improved with the trade-off of lower specificity. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Energy-Efficient Deadline-Aware Data-Gathering Scheme Using Multiple Mobile Data Collectors

    PubMed Central

    Dasgupta, Rumpa; Yoon, Seokhoon

    2017-01-01

    In wireless sensor networks, the data collected by sensors are usually forwarded to the sink through multi-hop forwarding. However, multi-hop forwarding can be inefficient due to the energy hole problem and high communications overhead. Moreover, when the monitored area is large and the number of sensors is small, sensors cannot send the data via multi-hop forwarding due to the lack of network connectivity. In order to address those problems of multi-hop forwarding, in this paper, we consider a data collection scheme that uses mobile data collectors (MDCs), which visit sensors and collect data from them. Due to the recent breakthroughs in wireless power transfer technology, MDCs can also be used to recharge the sensors to keep them from draining their energy. In MDC-based data-gathering schemes, a big challenge is how to find the MDCs’ traveling paths in a balanced way, such that their energy consumption is minimized and the packet-delay constraint is satisfied. Therefore, in this paper, we aim at finding the MDCs’ paths, taking energy efficiency and delay constraints into account. We first define an optimization problem, named the delay-constrained energy minimization (DCEM) problem, to find the paths for MDCs. An integer linear programming problem is formulated to find the optimal solution. We also propose a two-phase path-selection algorithm to efficiently solve the DCEM problem. Simulations are performed to compare the performance of the proposed algorithms with two heuristics algorithms for the vehicle routing problem under various scenarios. The simulation results show that the proposed algorithms can outperform existing algorithms in terms of energy efficiency and packet delay. PMID:28368300

  17. Do Algorithms Homogenize Students' Achievements in Secondary School Better than Teachers' Tracking Decisions?

    ERIC Educational Resources Information Center

    Klapproth, Florian

    2015-01-01

    Two objectives guided this research. First, this study examined how well teachers' tracking decisions contribute to the homogenization of their students' achievements. Second, the study explored whether teachers' tracking decisions would be outperformed in homogenizing the students' achievements by statistical models of tracking decisions. These…

  18. Online ranking by projecting.

    PubMed

    Crammer, Koby; Singer, Yoram

    2005-01-01

    We discuss the problem of ranking instances. In our framework, each instance is associated with a rank or a rating, which is an integer in 1 to k. Our goal is to find a rank-prediction rule that assigns each instance a rank that is as close as possible to the instance's true rank. We discuss a group of closely related online algorithms, analyze their performance in the mistake-bound model, and prove their correctness. We describe two sets of experiments, with synthetic data and with the EachMovie data set for collaborative filtering. In the experiments we performed, our algorithms outperform online algorithms for regression and classification applied to ranking.

  19. Community Detection on the GPU

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Naim, Md; Manne, Fredrik; Halappanavar, Mahantesh

    We present and evaluate a new GPU algorithm based on the Louvain method for community detection. Our algorithm is the first for this problem that parallelizes the access to individual edges. In this way we can fine tune the load balance when processing networks with nodes of highly varying degrees. This is achieved by scaling the number of threads assigned to each node according to its degree. Extensive experiments show that we obtain speedups up to a factor of 270 compared to the sequential algorithm. The algorithm consistently outperforms other recent shared memory implementations and is only one order ofmore » magnitude slower than the current fastest parallel Louvain method running on a Blue Gene/Q supercomputer using more than 500K threads.« less

  20. Image compression using quad-tree coding with morphological dilation

    NASA Astrophysics Data System (ADS)

    Wu, Jiaji; Jiang, Weiwei; Jiao, Licheng; Wang, Lei

    2007-11-01

    In this paper, we propose a new algorithm which integrates morphological dilation operation to quad-tree coding, the purpose of doing this is to compensate each other's drawback by using quad-tree coding and morphological dilation operation respectively. New algorithm can not only quickly find the seed significant coefficient of dilation but also break the limit of block boundary of quad-tree coding. We also make a full use of both within-subband and cross-subband correlation to avoid the expensive cost of representing insignificant coefficients. Experimental results show that our algorithm outperforms SPECK and SPIHT. Without using any arithmetic coding, our algorithm can achieve good performance with low computational cost and it's more suitable to mobile devices or scenarios with a strict real-time requirement.

  1. Adaptive object tracking via both positive and negative models matching

    NASA Astrophysics Data System (ADS)

    Li, Shaomei; Gao, Chao; Wang, Yawen

    2015-03-01

    To improve tracking drift which often occurs in adaptive tracking, an algorithm based on the fusion of tracking and detection is proposed in this paper. Firstly, object tracking is posed as abinary classification problem and is modeled by partial least squares (PLS) analysis. Secondly, tracking object frame by frame via particle filtering. Thirdly, validating the tracking reliability based on both positive and negative models matching. Finally, relocating the object based on SIFT features matching and voting when drift occurs. Object appearance model is updated at the same time. The algorithm can not only sense tracking drift but also relocate the object whenever needed. Experimental results demonstrate that this algorithm outperforms state-of-the-art algorithms on many challenging sequences.

  2. A consensus algorithm for approximate string matching and its application to QRS complex detection

    NASA Astrophysics Data System (ADS)

    Alba, Alfonso; Mendez, Martin O.; Rubio-Rincon, Miguel E.; Arce-Santana, Edgar R.

    2016-08-01

    In this paper, a novel algorithm for approximate string matching (ASM) is proposed. The novelty resides in the fact that, unlike most other methods, the proposed algorithm is not based on the Hamming or Levenshtein distances, but instead computes a score for each symbol in the search text based on a consensus measure. Those symbols with sufficiently high scores will likely correspond to approximate instances of the pattern string. To demonstrate the usefulness of the proposed method, it has been applied to the detection of QRS complexes in electrocardiographic signals with competitive results when compared against the classic Pan-Tompkins (PT) algorithm. The proposed method outperformed PT in 72% of the test cases, with no extra computational cost.

  3. Demultiplexing based on frequency-domain joint decision MMA for MDM system

    NASA Astrophysics Data System (ADS)

    Caili, Gong; Li, Li; Guijun, Hu

    2016-06-01

    In this paper, we propose a demultiplexing method based on frequency-domain joint decision multi-modulus algorithm (FD-JDMMA) for mode division multiplexing (MDM) system. The performance of FD-JDMMA is compared with frequency-domain multi-modulus algorithm (FD-MMA) and frequency-domain least mean square (FD-LMS) algorithm. The simulation results show that FD-JDMMA outperforms FD-MMA in terms of BER and convergence speed in the cases of mQAM (m=4, 16 and 64) formats. And it is also demonstrated that FD-JDMMA achieves better BER performance and converges faster than FD-LMS in the cases of 16QAM and 64QAM. Furthermore, FD-JDMMA maintains similar computational complexity as the both equalization algorithms.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shamis, Pavel; Graham, Richard L; Gorentla Venkata, Manjunath

    The scalability and performance of collective communication operations limit the scalability and performance of many scientific applications. This paper presents two new blocking and nonblocking Broadcast algorithms for communicators with arbitrary communication topology, and studies their performance. These algorithms benefit from increased concurrency and a reduced memory footprint, making them suitable for use on large-scale systems. Measuring small, medium, and large data Broadcasts on a Cray-XT5, using 24,576 MPI processes, the Cheetah algorithms outperform the native MPI on that system by 51%, 69%, and 9%, respectively, at the same process count. These results demonstrate an algorithmic approach to the implementationmore » of the important class of collective communications, which is high performing, scalable, and also uses resources in a scalable manner.« less

  5. Defining an essence of structure determining residue contacts in proteins.

    PubMed

    Sathyapriya, R; Duarte, Jose M; Stehr, Henning; Filippis, Ioannis; Lappe, Michael

    2009-12-01

    The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this "structural essence" has remained elusive so far: no algorithmic strategy has been devised to-date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Ca RMSD). It is not only of theoretical interest (i.e., for design of advanced statistical potentials) to identify the number and nature of essential native contacts-such a subset of spatial constraints is very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test-set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, series of random subsets (ranging from 10% to 90% of native contacts) are generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed "cone-peeling" that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 A Ca RMSD with as little as 8% of the native contacts (Ca-Ca and Cb-Cb). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy. This "structural essence" opens new avenues in the fields of structure prediction, empirical potentials and docking.

  6. Defining an Essence of Structure Determining Residue Contacts in Proteins

    PubMed Central

    Sathyapriya, R.; Duarte, Jose M.; Stehr, Henning; Filippis, Ioannis; Lappe, Michael

    2009-01-01

    The network of native non-covalent residue contacts determines the three-dimensional structure of a protein. However, not all contacts are of equal structural significance, and little knowledge exists about a minimal, yet sufficient, subset required to define the global features of a protein. Characterisation of this “structural essence” has remained elusive so far: no algorithmic strategy has been devised to-date that could outperform a random selection in terms of 3D reconstruction accuracy (measured as the Ca RMSD). It is not only of theoretical interest (i.e., for design of advanced statistical potentials) to identify the number and nature of essential native contacts—such a subset of spatial constraints is very useful in a number of novel experimental methods (like EPR) which rely heavily on constraint-based protein modelling. To derive accurate three-dimensional models from distance constraints, we implemented a reconstruction pipeline using distance geometry. We selected a test-set of 12 protein structures from the four major SCOP fold classes and performed our reconstruction analysis. As a reference set, series of random subsets (ranging from 10% to 90% of native contacts) are generated for each protein, and the reconstruction accuracy is computed for each subset. We have developed a rational strategy, termed “cone-peeling” that combines sequence features and network descriptors to select minimal subsets that outperform the reference sets. We present, for the first time, a rational strategy to derive a structural essence of residue contacts and provide an estimate of the size of this minimal subset. Our algorithm computes sparse subsets capable of determining the tertiary structure at approximately 4.8 Å Ca RMSD with as little as 8% of the native contacts (Ca-Ca and Cb-Cb). At the same time, a randomly chosen subset of native contacts needs about twice as many contacts to reach the same level of accuracy. This “structural essence” opens new avenues in the fields of structure prediction, empirical potentials and docking. PMID:19997489

  7. Developing EHR-driven heart failure risk prediction models using CPXR(Log) with the probabilistic loss function.

    PubMed

    Taslimitehrani, Vahid; Dong, Guozhu; Pereira, Naveen L; Panahiazar, Maryam; Pathak, Jyotishman

    2016-04-01

    Computerized survival prediction in healthcare identifying the risk of disease mortality, helps healthcare providers to effectively manage their patients by providing appropriate treatment options. In this study, we propose to apply a classification algorithm, Contrast Pattern Aided Logistic Regression (CPXR(Log)) with the probabilistic loss function, to develop and validate prognostic risk models to predict 1, 2, and 5year survival in heart failure (HF) using data from electronic health records (EHRs) at Mayo Clinic. The CPXR(Log) constructs a pattern aided logistic regression model defined by several patterns and corresponding local logistic regression models. One of the models generated by CPXR(Log) achieved an AUC and accuracy of 0.94 and 0.91, respectively, and significantly outperformed prognostic models reported in prior studies. Data extracted from EHRs allowed incorporation of patient co-morbidities into our models which helped improve the performance of the CPXR(Log) models (15.9% AUC improvement), although did not improve the accuracy of the models built by other classifiers. We also propose a probabilistic loss function to determine the large error and small error instances. The new loss function used in the algorithm outperforms other functions used in the previous studies by 1% improvement in the AUC. This study revealed that using EHR data to build prediction models can be very challenging using existing classification methods due to the high dimensionality and complexity of EHR data. The risk models developed by CPXR(Log) also reveal that HF is a highly heterogeneous disease, i.e., different subgroups of HF patients require different types of considerations with their diagnosis and treatment. Our risk models provided two valuable insights for application of predictive modeling techniques in biomedicine: Logistic risk models often make systematic prediction errors, and it is prudent to use subgroup based prediction models such as those given by CPXR(Log) when investigating heterogeneous diseases. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Multiple-Beam Detection of Fast Transient Radio Sources

    NASA Technical Reports Server (NTRS)

    Thompson, David R.; Wagstaff, Kiri L.; Majid, Walid A.

    2011-01-01

    A method has been designed for using multiple independent stations to discriminate fast transient radio sources from local anomalies, such as antenna noise or radio frequency interference (RFI). This can improve the sensitivity of incoherent detection for geographically separated stations such as the very long baseline array (VLBA), the future square kilometer array (SKA), or any other coincident observations by multiple separated receivers. The transients are short, broadband pulses of radio energy, often just a few milliseconds long, emitted by a variety of exotic astronomical phenomena. They generally represent rare, high-energy events making them of great scientific value. For RFI-robust adaptive detection of transients, using multiple stations, a family of algorithms has been developed. The technique exploits the fact that the separated stations constitute statistically independent samples of the target. This can be used to adaptively ignore RFI events for superior sensitivity. If the antenna signals are independent and identically distributed (IID), then RFI events are simply outlier data points that can be removed through robust estimation such as a trimmed or Winsorized estimator. The alternative "trimmed" estimator is considered, which excises the strongest n signals from the list of short-beamed intensities. Because local RFI is independent at each antenna, this interference is unlikely to occur at many antennas on the same step. Trimming the strongest signals provides robustness to RFI that can theoretically outperform even the detection performance of the same number of antennas at a single site. This algorithm requires sorting the signals at each time step and dispersion measure, an operation that is computationally tractable for existing array sizes. An alternative uses the various stations to form an ensemble estimate of the conditional density function (CDF) evaluated at each time step. Both methods outperform standard detection strategies on a test sequence of VLBA data, and both are efficient enough for deployment in real-time, online transient detection applications.

  9. SU-C-207B-02: Maximal Noise Reduction Filter with Anatomical Structures Preservation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Maitree, R; Guzman, G; Chundury, A

    Purpose: All medical images contain noise, which can result in an undesirable appearance and can reduce the visibility of anatomical details. There are varieties of techniques utilized to reduce noise such as increasing the image acquisition time and using post-processing noise reduction algorithms. However, these techniques are increasing the imaging time and cost or reducing tissue contrast and effective spatial resolution which are useful diagnosis information. The three main focuses in this study are: 1) to develop a novel approach that can adaptively and maximally reduce noise while preserving valuable details of anatomical structures, 2) to evaluate the effectiveness ofmore » available noise reduction algorithms in comparison to the proposed algorithm, and 3) to demonstrate that the proposed noise reduction approach can be used clinically. Methods: To achieve a maximal noise reduction without destroying the anatomical details, the proposed approach automatically estimated the local image noise strength levels and detected the anatomical structures, i.e. tissue boundaries. Such information was used to adaptively adjust strength of the noise reduction filter. The proposed algorithm was tested on 34 repeating swine head datasets and 54 patients MRI and CT images. The performance was quantitatively evaluated by image quality metrics and manually validated for clinical usages by two radiation oncologists and one radiologist. Results: Qualitative measurements on repeated swine head images demonstrated that the proposed algorithm efficiently removed noise while preserving the structures and tissues boundaries. In comparisons, the proposed algorithm obtained competitive noise reduction performance and outperformed other filters in preserving anatomical structures. Assessments from the manual validation indicate that the proposed noise reduction algorithm is quite adequate for some clinical usages. Conclusion: According to both clinical evaluation (human expert ranking) and qualitative assessment, the proposed approach has superior noise reduction and anatomical structures preservation capabilities over existing noise removal methods. Senior Author Dr. Deshan Yang received research funding form ViewRay and Varian.« less

  10. NIMEFI: Gene Regulatory Network Inference using Multiple Ensemble Feature Importance Algorithms

    PubMed Central

    Ruyssinck, Joeri; Huynh-Thu, Vân Anh; Geurts, Pierre; Dhaene, Tom; Demeester, Piet; Saeys, Yvan

    2014-01-01

    One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available. PMID:24667482

  11. Unimolecular Reaction Pathways of a γ-Ketohydroperoxide from Combined Application of Automated Reaction Discovery Methods

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grambow, Colin A.; Jamal, Adeel; Li, Yi -Pei

    Ketohydroperoxides are important in liquid-phase autoxidation and in gas-phase partial oxidation and pre-ignition chemistry, but because of their low concentration, instability, and various analytical chemistry limitations, it has been challenging to experimentally determine their reactivity, and only a few pathways are known. In the present work, 75 elementary-step unimolecular reactions of the simplest γ-ketohydroperoxide, 3-hydroperoxypropanal, were discovered by a combination of density functional theory with several automated transition-state search algorithms: the Berny algorithm coupled with the freezing string method, single- and double-ended growing string methods, the heuristic KinBot algorithm, and the single-component artificial force induced reaction method (SC-AFIR). The presentmore » joint approach significantly outperforms previous manual and automated transition-state searches – 68 of the reactions of γ-ketohydroperoxide discovered here were previously unknown and completely unexpected. All of the methods found the lowest-energy transition state, which corresponds to the first step of the Korcek mechanism, but each algorithm except for SC-AFIR detected several reactions not found by any of the other methods. We show that the low-barrier chemical reactions involve promising new chemistry that may be relevant in atmospheric and combustion systems. Our study highlights the complexity of chemical space exploration and the advantage of combined application of several approaches. Altogether, the present work demonstrates both the power and the weaknesses of existing fully automated approaches for reaction discovery which suggest possible directions for further method development and assessment in order to enable reliable discovery of all important reactions of any specified reactant(s).« less

  12. Unimolecular Reaction Pathways of a γ-Ketohydroperoxide from Combined Application of Automated Reaction Discovery Methods

    DOE PAGES

    Grambow, Colin A.; Jamal, Adeel; Li, Yi -Pei; ...

    2017-12-22

    Ketohydroperoxides are important in liquid-phase autoxidation and in gas-phase partial oxidation and pre-ignition chemistry, but because of their low concentration, instability, and various analytical chemistry limitations, it has been challenging to experimentally determine their reactivity, and only a few pathways are known. In the present work, 75 elementary-step unimolecular reactions of the simplest γ-ketohydroperoxide, 3-hydroperoxypropanal, were discovered by a combination of density functional theory with several automated transition-state search algorithms: the Berny algorithm coupled with the freezing string method, single- and double-ended growing string methods, the heuristic KinBot algorithm, and the single-component artificial force induced reaction method (SC-AFIR). The presentmore » joint approach significantly outperforms previous manual and automated transition-state searches – 68 of the reactions of γ-ketohydroperoxide discovered here were previously unknown and completely unexpected. All of the methods found the lowest-energy transition state, which corresponds to the first step of the Korcek mechanism, but each algorithm except for SC-AFIR detected several reactions not found by any of the other methods. We show that the low-barrier chemical reactions involve promising new chemistry that may be relevant in atmospheric and combustion systems. Our study highlights the complexity of chemical space exploration and the advantage of combined application of several approaches. Altogether, the present work demonstrates both the power and the weaknesses of existing fully automated approaches for reaction discovery which suggest possible directions for further method development and assessment in order to enable reliable discovery of all important reactions of any specified reactant(s).« less

  13. Vertical decomposition with Genetic Algorithm for Multiple Sequence Alignment

    PubMed Central

    2011-01-01

    Background Many Bioinformatics studies begin with a multiple sequence alignment as the foundation for their research. This is because multiple sequence alignment can be a useful technique for studying molecular evolution and analyzing sequence structure relationships. Results In this paper, we have proposed a Vertical Decomposition with Genetic Algorithm (VDGA) for Multiple Sequence Alignment (MSA). In VDGA, we divide the sequences vertically into two or more subsequences, and then solve them individually using a guide tree approach. Finally, we combine all the subsequences to generate a new multiple sequence alignment. This technique is applied on the solutions of the initial generation and of each child generation within VDGA. We have used two mechanisms to generate an initial population in this research: the first mechanism is to generate guide trees with randomly selected sequences and the second is shuffling the sequences inside such trees. Two different genetic operators have been implemented with VDGA. To test the performance of our algorithm, we have compared it with existing well-known methods, namely PRRP, CLUSTALX, DIALIGN, HMMT, SB_PIMA, ML_PIMA, MULTALIGN, and PILEUP8, and also other methods, based on Genetic Algorithms (GA), such as SAGA, MSA-GA and RBT-GA, by solving a number of benchmark datasets from BAliBase 2.0. Conclusions The experimental results showed that the VDGA with three vertical divisions was the most successful variant for most of the test cases in comparison to other divisions considered with VDGA. The experimental results also confirmed that VDGA outperformed the other methods considered in this research. PMID:21867510

  14. An Energy Aware Adaptive Sampling Algorithm for Energy Harvesting WSN with Energy Hungry Sensors

    PubMed Central

    Srbinovski, Bruno; Magno, Michele; Edwards-Murphy, Fiona; Pakrashi, Vikram; Popovici, Emanuel

    2016-01-01

    Wireless sensor nodes have a limited power budget, though they are often expected to be functional in the field once deployed for extended periods of time. Therefore, minimization of energy consumption and energy harvesting technology in Wireless Sensor Networks (WSN) are key tools for maximizing network lifetime, and achieving self-sustainability. This paper proposes an energy aware Adaptive Sampling Algorithm (ASA) for WSN with power hungry sensors and harvesting capabilities, an energy management technique that can be implemented on any WSN platform with enough processing power to execute the proposed algorithm. An existing state-of-the-art ASA developed for wireless sensor networks with power hungry sensors is optimized and enhanced to adapt the sampling frequency according to the available energy of the node. The proposed algorithm is evaluated using two in-field testbeds that are supplied by two different energy harvesting sources (solar and wind). Simulation and comparison between the state-of-the-art ASA and the proposed energy aware ASA (EASA) in terms of energy durability are carried out using in-field measured harvested energy (using both wind and solar sources) and power hungry sensors (ultrasonic wind sensor and gas sensors). The simulation results demonstrate that using ASA in combination with an energy aware function on the nodes can drastically increase the lifetime of a WSN node and enable self-sustainability. In fact, the proposed EASA in conjunction with energy harvesting capability can lead towards perpetual WSN operation and significantly outperform the state-of-the-art ASA. PMID:27043559

  15. Determining the semantic similarities among Gene Ontology terms.

    PubMed

    Taha, Kamal

    2013-05-01

    We present in this paper novel techniques that determine the semantic relationships among GeneOntology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between user application and GO database. Given a set S of GO terms, GoSE would return another set S' of GO terms, where each term in S' is semantically related to each term in S. Most current research is focused on determining the semantic similarities among GO ontology terms based solely on their IDs and proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms, whose existence in the GO graph structure is dependent on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm employing these techniques for determining the semantic similarities among GO terms.We evaluated GoSE experimentally and compared it with three existing methods. The results of measuring the semantic similarities among genes in KEGG and Pfam pathways retrieved from the DBGET and Sanger Pfam databases, respectively, have shown that our method outperforms the other three methods in recall and precision.

  16. A study of active learning methods for named entity recognition in clinical text.

    PubMed

    Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu; Denny, Joshua C; Xu, Hua

    2015-12-01

    Named entity recognition (NER), a sequential labeling task, is one of the fundamental tasks for building clinical natural language processing (NLP) systems. Machine learning (ML) based approaches can achieve good performance, but they often require large amounts of annotated samples, which are expensive to build due to the requirement of domain experts in annotation. Active learning (AL), a sample selection approach integrated with supervised ML, aims to minimize the annotation cost while maximizing the performance of ML-based models. In this study, our goal was to develop and evaluate both existing and new AL methods for a clinical NER task to identify concepts of medical problems, treatments, and lab tests from the clinical notes. Using the annotated NER corpus from the 2010 i2b2/VA NLP challenge that contained 349 clinical documents with 20,423 unique sentences, we simulated AL experiments using a number of existing and novel algorithms in three different categories including uncertainty-based, diversity-based, and baseline sampling strategies. They were compared with the passive learning that uses random sampling. Learning curves that plot performance of the NER model against the estimated annotation cost (based on number of sentences or words in the training set) were generated to evaluate different active learning and the passive learning methods and the area under the learning curve (ALC) score was computed. Based on the learning curves of F-measure vs. number of sentences, uncertainty sampling algorithms outperformed all other methods in ALC. Most diversity-based methods also performed better than random sampling in ALC. To achieve an F-measure of 0.80, the best method based on uncertainty sampling could save 66% annotations in sentences, as compared to random sampling. For the learning curves of F-measure vs. number of words, uncertainty sampling methods again outperformed all other methods in ALC. To achieve 0.80 in F-measure, in comparison to random sampling, the best uncertainty based method saved 42% annotations in words. But the best diversity based method reduced only 7% annotation effort. In the simulated setting, AL methods, particularly uncertainty-sampling based approaches, seemed to significantly save annotation cost for the clinical NER task. The actual benefit of active learning in clinical NER should be further evaluated in a real-time setting. Copyright © 2015 Elsevier Inc. All rights reserved.

  17. A hybrid artificial bee colony algorithm for numerical function optimization

    NASA Astrophysics Data System (ADS)

    Alqattan, Zakaria N.; Abdullah, Rosni

    2015-02-01

    Artificial Bee Colony (ABC) algorithm is one of the swarm intelligence algorithms; it has been introduced by Karaboga in 2005. It is a meta-heuristic optimization search algorithm inspired from the intelligent foraging behavior of the honey bees in nature. Its unique search process made it as one of the most competitive algorithm with some other search algorithms in the area of optimization, such as Genetic algorithm (GA) and Particle Swarm Optimization (PSO). However, the ABC performance of the local search process and the bee movement or the solution improvement equation still has some weaknesses. The ABC is good in avoiding trapping at the local optimum but it spends its time searching around unpromising random selected solutions. Inspired by the PSO, we propose a Hybrid Particle-movement ABC algorithm called HPABC, which adapts the particle movement process to improve the exploration of the original ABC algorithm. Numerical benchmark functions were used in order to experimentally test the HPABC algorithm. The results illustrate that the HPABC algorithm can outperform the ABC algorithm in most of the experiments (75% better in accuracy and over 3 times faster).

  18. Capacity improvement using simulation optimization approaches: A case study in the thermotechnology industry

    NASA Astrophysics Data System (ADS)

    Yelkenci Köse, Simge; Demir, Leyla; Tunalı, Semra; Türsel Eliiyi, Deniz

    2015-02-01

    In manufacturing systems, optimal buffer allocation has a considerable impact on capacity improvement. This study presents a simulation optimization procedure to solve the buffer allocation problem in a heat exchanger production plant so as to improve the capacity of the system. For optimization, three metaheuristic-based search algorithms, i.e. a binary-genetic algorithm (B-GA), a binary-simulated annealing algorithm (B-SA) and a binary-tabu search algorithm (B-TS), are proposed. These algorithms are integrated with the simulation model of the production line. The simulation model, which captures the stochastic and dynamic nature of the production line, is used as an evaluation function for the proposed metaheuristics. The experimental study with benchmark problem instances from the literature and the real-life problem show that the proposed B-TS algorithm outperforms B-GA and B-SA in terms of solution quality.

  19. The multinomial simulation algorithm for discrete stochastic simulation of reaction-diffusion systems.

    PubMed

    Lampoudi, Sotiria; Gillespie, Dan T; Petzold, Linda R

    2009-03-07

    The Inhomogeneous Stochastic Simulation Algorithm (ISSA) is a variant of the stochastic simulation algorithm in which the spatially inhomogeneous volume of the system is divided into homogeneous subvolumes, and the chemical reactions in those subvolumes are augmented by diffusive transfers of molecules between adjacent subvolumes. The ISSA can be prohibitively slow when the system is such that diffusive transfers occur much more frequently than chemical reactions. In this paper we present the Multinomial Simulation Algorithm (MSA), which is designed to, on the one hand, outperform the ISSA when diffusive transfer events outnumber reaction events, and on the other, to handle small reactant populations with greater accuracy than deterministic-stochastic hybrid algorithms. The MSA treats reactions in the usual ISSA fashion, but uses appropriately conditioned binomial random variables for representing the net numbers of molecules diffusing from any given subvolume to a neighbor within a prescribed distance. Simulation results illustrate the benefits of the algorithm.

  20. New Enhanced Artificial Bee Colony (JA-ABC5) Algorithm with Application for Reactive Power Optimization

    PubMed Central

    2015-01-01

    The standard artificial bee colony (ABC) algorithm involves exploration and exploitation processes which need to be balanced for enhanced performance. This paper proposes a new modified ABC algorithm named JA-ABC5 to enhance convergence speed and improve the ability to reach the global optimum by balancing exploration and exploitation processes. New stages have been proposed at the earlier stages of the algorithm to increase the exploitation process. Besides that, modified mutation equations have also been introduced in the employed and onlooker-bees phases to balance the two processes. The performance of JA-ABC5 has been analyzed on 27 commonly used benchmark functions and tested to optimize the reactive power optimization problem. The performance results have clearly shown that the newly proposed algorithm has outperformed other compared algorithms in terms of convergence speed and global optimum achievement. PMID:25879054

  1. New enhanced artificial bee colony (JA-ABC5) algorithm with application for reactive power optimization.

    PubMed

    Sulaiman, Noorazliza; Mohamad-Saleh, Junita; Abro, Abdul Ghani

    2015-01-01

    The standard artificial bee colony (ABC) algorithm involves exploration and exploitation processes which need to be balanced for enhanced performance. This paper proposes a new modified ABC algorithm named JA-ABC5 to enhance convergence speed and improve the ability to reach the global optimum by balancing exploration and exploitation processes. New stages have been proposed at the earlier stages of the algorithm to increase the exploitation process. Besides that, modified mutation equations have also been introduced in the employed and onlooker-bees phases to balance the two processes. The performance of JA-ABC5 has been analyzed on 27 commonly used benchmark functions and tested to optimize the reactive power optimization problem. The performance results have clearly shown that the newly proposed algorithm has outperformed other compared algorithms in terms of convergence speed and global optimum achievement.

  2. Enhancing artificial bee colony algorithm with self-adaptive searching strategy and artificial immune network operators for global optimization.

    PubMed

    Chen, Tinggui; Xiao, Renbin

    2014-01-01

    Artificial bee colony (ABC) algorithm, inspired by the intelligent foraging behavior of honey bees, was proposed by Karaboga. It has been shown to be superior to some conventional intelligent algorithms such as genetic algorithm (GA), artificial colony optimization (ACO), and particle swarm optimization (PSO). However, the ABC still has some limitations. For example, ABC can easily get trapped in the local optimum when handing in functions that have a narrow curving valley, a high eccentric ellipse, or complex multimodal functions. As a result, we proposed an enhanced ABC algorithm called EABC by introducing self-adaptive searching strategy and artificial immune network operators to improve the exploitation and exploration. The simulation results tested on a suite of unimodal or multimodal benchmark functions illustrate that the EABC algorithm outperforms ACO, PSO, and the basic ABC in most of the experiments.

  3. Blind Channel Equalization Using Constrained Generalized Pattern Search Optimization and Reinitialization Strategy

    NASA Astrophysics Data System (ADS)

    Zaouche, Abdelouahib; Dayoub, Iyad; Rouvaen, Jean Michel; Tatkeu, Charles

    2008-12-01

    We propose a global convergence baud-spaced blind equalization method in this paper. This method is based on the application of both generalized pattern optimization and channel surfing reinitialization. The potentially used unimodal cost function relies on higher- order statistics, and its optimization is achieved using a pattern search algorithm. Since the convergence to the global minimum is not unconditionally warranted, we make use of channel surfing reinitialization (CSR) strategy to find the right global minimum. The proposed algorithm is analyzed, and simulation results using a severe frequency selective propagation channel are given. Detailed comparisons with constant modulus algorithm (CMA) are highlighted. The proposed algorithm performances are evaluated in terms of intersymbol interference, normalized received signal constellations, and root mean square error vector magnitude. In case of nonconstant modulus input signals, our algorithm outperforms significantly CMA algorithm with full channel surfing reinitialization strategy. However, comparable performances are obtained for constant modulus signals.

  4. A High Performance Cloud-Based Protein-Ligand Docking Prediction Algorithm

    PubMed Central

    Chen, Jui-Le; Yang, Chu-Sing

    2013-01-01

    The potential of predicting druggability for a particular disease by integrating biological and computer science technologies has witnessed success in recent years. Although the computer science technologies can be used to reduce the costs of the pharmaceutical research, the computation time of the structure-based protein-ligand docking prediction is still unsatisfied until now. Hence, in this paper, a novel docking prediction algorithm, named fast cloud-based protein-ligand docking prediction algorithm (FCPLDPA), is presented to accelerate the docking prediction algorithm. The proposed algorithm works by leveraging two high-performance operators: (1) the novel migration (information exchange) operator is designed specially for cloud-based environments to reduce the computation time; (2) the efficient operator is aimed at filtering out the worst search directions. Our simulation results illustrate that the proposed method outperforms the other docking algorithms compared in this paper in terms of both the computation time and the quality of the end result. PMID:23762864

  5. Enhancing Artificial Bee Colony Algorithm with Self-Adaptive Searching Strategy and Artificial Immune Network Operators for Global Optimization

    PubMed Central

    Chen, Tinggui; Xiao, Renbin

    2014-01-01

    Artificial bee colony (ABC) algorithm, inspired by the intelligent foraging behavior of honey bees, was proposed by Karaboga. It has been shown to be superior to some conventional intelligent algorithms such as genetic algorithm (GA), artificial colony optimization (ACO), and particle swarm optimization (PSO). However, the ABC still has some limitations. For example, ABC can easily get trapped in the local optimum when handing in functions that have a narrow curving valley, a high eccentric ellipse, or complex multimodal functions. As a result, we proposed an enhanced ABC algorithm called EABC by introducing self-adaptive searching strategy and artificial immune network operators to improve the exploitation and exploration. The simulation results tested on a suite of unimodal or multimodal benchmark functions illustrate that the EABC algorithm outperforms ACO, PSO, and the basic ABC in most of the experiments. PMID:24772023

  6. Nested Conjugate Gradient Algorithm with Nested Preconditioning for Non-linear Image Restoration.

    PubMed

    Skariah, Deepak G; Arigovindan, Muthuvel

    2017-06-19

    We develop a novel optimization algorithm, which we call Nested Non-Linear Conjugate Gradient algorithm (NNCG), for image restoration based on quadratic data fitting and smooth non-quadratic regularization. The algorithm is constructed as a nesting of two conjugate gradient (CG) iterations. The outer iteration is constructed as a preconditioned non-linear CG algorithm; the preconditioning is performed by the inner CG iteration that is linear. The inner CG iteration, which performs preconditioning for outer CG iteration, itself is accelerated by an another FFT based non-iterative preconditioner. We prove that the method converges to a stationary point for both convex and non-convex regularization functionals. We demonstrate experimentally that proposed method outperforms the well-known majorization-minimization method used for convex regularization, and a non-convex inertial-proximal method for non-convex regularization functional.

  7. Receiver Diversity Combining Using Evolutionary Algorithms in Rayleigh Fading Channel

    PubMed Central

    Akbari, Mohsen; Manesh, Mohsen Riahi

    2014-01-01

    In diversity combining at the receiver, the output signal-to-noise ratio (SNR) is often maximized by using the maximal ratio combining (MRC) provided that the channel is perfectly estimated at the receiver. However, channel estimation is rarely perfect in practice, which results in deteriorating the system performance. In this paper, an imperialistic competitive algorithm (ICA) is proposed and compared with two other evolutionary based algorithms, namely, particle swarm optimization (PSO) and genetic algorithm (GA), for diversity combining of signals travelling across the imperfect channels. The proposed algorithm adjusts the combiner weights of the received signal components in such a way that maximizes the SNR and minimizes the bit error rate (BER). The results indicate that the proposed method eliminates the need of channel estimation and can outperform the conventional diversity combining methods. PMID:25045725

  8. Adaptive image coding based on cubic-spline interpolation

    NASA Astrophysics Data System (ADS)

    Jiang, Jian-Xing; Hong, Shao-Hua; Lin, Tsung-Ching; Wang, Lin; Truong, Trieu-Kien

    2014-09-01

    It has been investigated that at low bit rates, downsampling prior to coding and upsampling after decoding can achieve better compression performance than standard coding algorithms, e.g., JPEG and H. 264/AVC. However, at high bit rates, the sampling-based schemes generate more distortion. Additionally, the maximum bit rate for the sampling-based scheme to outperform the standard algorithm is image-dependent. In this paper, a practical adaptive image coding algorithm based on the cubic-spline interpolation (CSI) is proposed. This proposed algorithm adaptively selects the image coding method from CSI-based modified JPEG and standard JPEG under a given target bit rate utilizing the so called ρ-domain analysis. The experimental results indicate that compared with the standard JPEG, the proposed algorithm can show better performance at low bit rates and maintain the same performance at high bit rates.

  9. Comparison of l₁-Norm SVR and Sparse Coding Algorithms for Linear Regression.

    PubMed

    Zhang, Qingtian; Hu, Xiaolin; Zhang, Bo

    2015-08-01

    Support vector regression (SVR) is a popular function estimation technique based on Vapnik's concept of support vector machine. Among many variants, the l1-norm SVR is known to be good at selecting useful features when the features are redundant. Sparse coding (SC) is a technique widely used in many areas and a number of efficient algorithms are available. Both l1-norm SVR and SC can be used for linear regression. In this brief, the close connection between the l1-norm SVR and SC is revealed and some typical algorithms are compared for linear regression. The results show that the SC algorithms outperform the Newton linear programming algorithm, an efficient l1-norm SVR algorithm, in efficiency. The algorithms are then used to design the radial basis function (RBF) neural networks. Experiments on some benchmark data sets demonstrate the high efficiency of the SC algorithms. In particular, one of the SC algorithms, the orthogonal matching pursuit is two orders of magnitude faster than a well-known RBF network designing algorithm, the orthogonal least squares algorithm.

  10. Efficient Record Linkage Algorithms Using Complete Linkage Clustering.

    PubMed

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Data from different agencies share data of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low-accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy consuming reasonable run times.

  11. Efficient Record Linkage Algorithms Using Complete Linkage Clustering

    PubMed Central

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Data from different agencies share data of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low-accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy consuming reasonable run times. PMID:27124604

  12. A tabu search evalutionary algorithm for multiobjective optimization: Application to a bi-criterion aircraft structural reliability problem

    NASA Astrophysics Data System (ADS)

    Long, Kim Chenming

    Real-world engineering optimization problems often require the consideration of multiple conflicting and noncommensurate objectives, subject to nonconvex constraint regions in a high-dimensional decision space. Further challenges occur for combinatorial multiobjective problems in which the decision variables are not continuous. Traditional multiobjective optimization methods of operations research, such as weighting and epsilon constraint methods, are ill-suited to solving these complex, multiobjective problems. This has given rise to the application of a wide range of metaheuristic optimization algorithms, such as evolutionary, particle swarm, simulated annealing, and ant colony methods, to multiobjective optimization. Several multiobjective evolutionary algorithms have been developed, including the strength Pareto evolutionary algorithm (SPEA) and the non-dominated sorting genetic algorithm (NSGA), for determining the Pareto-optimal set of non-dominated solutions. Although numerous researchers have developed a wide range of multiobjective optimization algorithms, there is a continuing need to construct computationally efficient algorithms with an improved ability to converge to globally non-dominated solutions along the Pareto-optimal front for complex, large-scale, multiobjective engineering optimization problems. This is particularly important when the multiple objective functions and constraints of the real-world system cannot be expressed in explicit mathematical representations. This research presents a novel metaheuristic evolutionary algorithm for complex multiobjective optimization problems, which combines the metaheuristic tabu search algorithm with the evolutionary algorithm (TSEA), as embodied in genetic algorithms. TSEA is successfully applied to bicriteria (i.e., structural reliability and retrofit cost) optimization of the aircraft tail structure fatigue life, which increases its reliability by prolonging fatigue life. A comparison for this application of the proposed algorithm, TSEA, with several state-of-the-art multiobjective optimization algorithms reveals that TSEA outperforms these algorithms by providing retrofit solutions with greater reliability for the same costs (i.e., closer to the Pareto-optimal front) after the algorithms are executed for the same number of generations. This research also demonstrates that TSEA competes with and, in some situations, outperforms state-of-the-art multiobjective optimization algorithms such as NSGA II and SPEA 2 when applied to classic bicriteria test problems in the technical literature and other complex, sizable real-world applications. The successful implementation of TSEA contributes to the safety of aeronautical structures by providing a systematic way to guide aircraft structural retrofitting efforts, as well as a potentially useful algorithm for a wide range of multiobjective optimization problems in engineering and other fields.

  13. Parallel Splash Belief Propagation

    DTIC Science & Technology

    2010-08-01

    s/ ROGER J. DZIEGIEL, Jr. MICHAEL J. WESSING, Deputy Chief Work Unit Manager For... Management and Budget, Paperwork Reduction Project (0704-0188) Washington, DC 20503. PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. 1. REPORT DATE...Propagation algorithm outperforms synchronous, round-robin, wild-fire ( Ranganathan et al., 2007), and residual (Elidan et al., 2006) belief propagation

  14. Quantum Algorithms for Scientific Computing and Approximate Optimization

    NASA Astrophysics Data System (ADS)

    Hadfield, Stuart Andrew

    Diversity and inclusion has been a concern for the physics community for nearly 50 years. Despite significant efforts including the American Physical Society (APS) Conferences for Undergraduate Women in Physics (CUWiP) and the APS Bridge Program, women, African Americans, and Hispanics continue to be substantially underrepresented in the physics profession. Similar efforts within the field of engineering, whose students make up the majority of students in the introductory calculus-based physics courses, have also met with limited success. With the introduction of research-based instruments such as the Force Concept Inventory (FCI), the Force and Motion Conceptual Evaluation (FMCE), and the Conceptual Survey of Electricity and Magnetism (CSEM), differences in performance by gender began to be reported. Researchers have yet to come to an agreement as to why these "gender gaps" exist in the conceptual inventories that are widely used in physics education research and/or how to reduce the gaps. The "gender gap" has been extensively studied; on average, for the mechanics conceptual inventories, male students outperform female students by 13% on the pretest and by 12% post instruction. While much of the gender gap research has been geared toward the mechanics conceptual inventories, there have been few studies exploring the gender gap in the electricity and magnetism conceptual inventories. Overall, male students outperform female students by 3.7% on the pretest and 8.5% on the post-test; however, these studies have much more variation including one study showing female students outperforming male students on the CSEM. Many factors have been proposed that may influence the gender gap, from differences in background and preparation to various psychological and sociocultural effects. A parallel but largely disconnected set of research has identified gender biased questions within the FCI. This research has produced sporadic results and has only been performed on the FCI. The work performed in this manuscript will seek to synthesize these strands and use large datasets and deep demographic data to understand the persistent differences in male and female performance.

  15. Evaluation of multilayer perceptron algorithms for an analysis of network flow data

    NASA Astrophysics Data System (ADS)

    Bieniasz, Jedrzej; Rawski, Mariusz; Skowron, Krzysztof; Trzepiński, Mateusz

    2016-09-01

    The volume of exchanged information through IP networks is larger than ever and still growing. It creates a space for both benign and malicious activities. The second one raises awareness on security network devices, as well as network infrastructure and a system as a whole. One of the basic tools to prevent cyber attacks is Network Instrusion Detection System (NIDS). NIDS could be realized as a signature-based detector or an anomaly-based one. In the last few years the emphasis has been placed on the latter type, because of the possibility of applying smart and intelligent solutions. An ideal NIDS of next generation should be composed of self-learning algorithms that could react on known and unknown malicious network activities respectively. In this paper we evaluated a machine learning approach for detection of anomalies in IP network data represented as NetFlow records. We considered Multilayer Perceptron (MLP) as the classifier and we used two types of learning algorithms - Backpropagation (BP) and Particle Swarm Optimization (PSO). This paper includes a comprehensive survey on determining the most optimal MLP learning algorithm for the classification problem in application to network flow data. The performance, training time and convergence of BP and PSO methods were compared. The results show that PSO algorithm implemented by the authors outperformed other solutions if accuracy of classifications is considered. The major disadvantage of PSO is training time, which could be not acceptable for larger data sets or in real network applications. At the end we compared some key findings with the results from the other papers to show that in all cases results from this study outperformed them.

  16. A ℓ2, 1 norm regularized multi-kernel learning for false positive reduction in Lung nodule CAD.

    PubMed

    Cao, Peng; Liu, Xiaoli; Zhang, Jian; Li, Wei; Zhao, Dazhe; Huang, Min; Zaiane, Osmar

    2017-03-01

    The aim of this paper is to describe a novel algorithm for False Positive Reduction in lung nodule Computer Aided Detection(CAD). In this paper, we describes a new CT lung CAD method which aims to detect solid nodules. Specially, we proposed a multi-kernel classifier with a ℓ 2, 1 norm regularizer for heterogeneous feature fusion and selection from the feature subset level, and designed two efficient strategies to optimize the parameters of kernel weights in non-smooth ℓ 2, 1 regularized multiple kernel learning algorithm. The first optimization algorithm adapts a proximal gradient method for solving the ℓ 2, 1 norm of kernel weights, and use an accelerated method based on FISTA; the second one employs an iterative scheme based on an approximate gradient descent method. The results demonstrates that the FISTA-style accelerated proximal descent method is efficient for the ℓ 2, 1 norm formulation of multiple kernel learning with the theoretical guarantee of the convergence rate. Moreover, the experimental results demonstrate the effectiveness of the proposed methods in terms of Geometric mean (G-mean) and Area under the ROC curve (AUC), and significantly outperforms the competing methods. The proposed approach exhibits some remarkable advantages both in heterogeneous feature subsets fusion and classification phases. Compared with the fusion strategies of feature-level and decision level, the proposed ℓ 2, 1 norm multi-kernel learning algorithm is able to accurately fuse the complementary and heterogeneous feature sets, and automatically prune the irrelevant and redundant feature subsets to form a more discriminative feature set, leading a promising classification performance. Moreover, the proposed algorithm consistently outperforms the comparable classification approaches in the literature. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  17. A novel joint-processing adaptive nonlinear equalizer using a modular recurrent neural network for chaotic communication systems.

    PubMed

    Zhao, Haiquan; Zeng, Xiangping; Zhang, Jiashu; Liu, Yangguang; Wang, Xiaomin; Li, Tianrui

    2011-01-01

    To eliminate nonlinear channel distortion in chaotic communication systems, a novel joint-processing adaptive nonlinear equalizer based on a pipelined recurrent neural network (JPRNN) is proposed, using a modified real-time recurrent learning (RTRL) algorithm. Furthermore, an adaptive amplitude RTRL algorithm is adopted to overcome the deteriorating effect introduced by the nesting process. Computer simulations illustrate that the proposed equalizer outperforms the pipelined recurrent neural network (PRNN) and recurrent neural network (RNN) equalizers. Copyright © 2010 Elsevier Ltd. All rights reserved.

  18. Edge directed image interpolation with Bamberger pyramids

    NASA Astrophysics Data System (ADS)

    Rosiles, Jose Gerardo

    2005-08-01

    Image interpolation is a standard feature in digital image editing software, digital camera systems and printers. Classical methods for resizing produce blurred images with unacceptable quality. Bamberger Pyramids and filter banks have been successfully used for texture and image analysis. They provide excellent multiresolution and directional selectivity. In this paper we present an edge-directed image interpolation algorithm which takes advantage of the simultaneous spatial-directional edge localization at the subband level. The proposed algorithm outperform classical schemes like bilinear and bicubic schemes from the visual and numerical point of views.

  19. GPU Acceleration of DSP for Communication Receivers.

    PubMed

    Gunther, Jake; Gunther, Hyrum; Moon, Todd

    2017-09-01

    Graphics processing unit (GPU) implementations of signal processing algorithms can outperform CPU-based implementations. This paper describes the GPU implementation of several algorithms encountered in a wide range of high-data rate communication receivers including filters, multirate filters, numerically controlled oscillators, and multi-stage digital down converters. These structures are tested by processing the 20 MHz wide FM radio band (88-108 MHz). Two receiver structures are explored: a single channel receiver and a filter bank channelizer. Both run in real time on NVIDIA GeForce GTX 1080 graphics card.

  20. Seismic noise attenuation using an online subspace tracking algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Yatong; Li, Shuhua; Zhang, Dong; Chen, Yangkang

    2018-02-01

    We propose a new low-rank based noise attenuation method using an efficient algorithm for tracking subspaces from highly corrupted seismic observations. The subspace tracking algorithm requires only basic linear algebraic manipulations. The algorithm is derived by analysing incremental gradient descent on the Grassmannian manifold of subspaces. When the multidimensional seismic data are mapped to a low-rank space, the subspace tracking algorithm can be directly applied to the input low-rank matrix to estimate the useful signals. Since the subspace tracking algorithm is an online algorithm, it is more robust to random noise than traditional truncated singular value decomposition (TSVD) based subspace tracking algorithm. Compared with the state-of-the-art algorithms, the proposed denoising method can obtain better performance. More specifically, the proposed method outperforms the TSVD-based singular spectrum analysis method in causing less residual noise and also in saving half of the computational cost. Several synthetic and field data examples with different levels of complexities demonstrate the effectiveness and robustness of the presented algorithm in rejecting different types of noise including random noise, spiky noise, blending noise, and coherent noise.

  1. Artificial Bee Colony Optimization of Capping Potentials for Hybrid Quantum Mechanical/Molecular Mechanical Calculations.

    PubMed

    Schiffmann, Christoph; Sebastiani, Daniel

    2011-05-10

    We present an algorithmic extension of a numerical optimization scheme for analytic capping potentials for use in mixed quantum-classical (quantum mechanical/molecular mechanical, QM/MM) ab initio calculations. Our goal is to minimize bond-cleavage-induced perturbations in the electronic structure, measured by means of a suitable penalty functional. The optimization algorithm-a variant of the artificial bee colony (ABC) algorithm, which relies on swarm intelligence-couples deterministic (downhill gradient) and stochastic elements to avoid local minimum trapping. The ABC algorithm outperforms the conventional downhill gradient approach, if the penalty hypersurface exhibits wiggles that prevent a straight minimization pathway. We characterize the optimized capping potentials by computing NMR chemical shifts. This approach will increase the accuracy of QM/MM calculations of complex biomolecules.

  2. Geometry-driven distributed compression of the plenoptic function: performance bounds and constructive algorithms.

    PubMed

    Gehrig, Nicolas; Dragotti, Pier Luigi

    2009-03-01

    In this paper, we study the sampling and the distributed compression of the data acquired by a camera sensor network. The effective design of these sampling and compression schemes requires, however, the understanding of the structure of the acquired data. To this end, we show that the a priori knowledge of the configuration of the camera sensor network can lead to an effective estimation of such structure and to the design of effective distributed compression algorithms. For idealized scenarios, we derive the fundamental performance bounds of a camera sensor network and clarify the connection between sampling and distributed compression. We then present a distributed compression algorithm that takes advantage of the structure of the data and that outperforms independent compression algorithms on real multiview images.

  3. A new collaborative recommendation approach based on users clustering using artificial bee colony algorithm.

    PubMed

    Ju, Chunhua; Xu, Chonghuan

    2013-01-01

    Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on K-means clustering algorithm. In the process of clustering, we use artificial bee colony (ABC) algorithm to overcome the local optimal problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on users clustering algorithm outperforms many other recommendation methods.

  4. Deep Marginalized Sparse Denoising Auto-Encoder for Image Denoising

    NASA Astrophysics Data System (ADS)

    Ma, Hongqiang; Ma, Shiping; Xu, Yuelei; Zhu, Mingming

    2018-01-01

    Stacked Sparse Denoising Auto-Encoder (SSDA) has been successfully applied to image denoising. As a deep network, the SSDA network with powerful data feature learning ability is superior to the traditional image denoising algorithms. However, the algorithm has high computational complexity and slow convergence rate in the training. To address this limitation, we present a method of image denoising based on Deep Marginalized Sparse Denoising Auto-Encoder (DMSDA). The loss function of Sparse Denoising Auto-Encoder is marginalized so that it satisfies both sparseness and marginality. The experimental results show that the proposed algorithm can not only outperform SSDA in the convergence speed and training time, but also has better denoising performance than the current excellent denoising algorithms, including both the subjective and objective evaluation of image denoising.

  5. Study of genetic direct search algorithms for function optimization

    NASA Technical Reports Server (NTRS)

    Zeigler, B. P.

    1974-01-01

    The results are presented of a study to determine the performance of genetic direct search algorithms in solving function optimization problems arising in the optimal and adaptive control areas. The findings indicate that: (1) genetic algorithms can outperform standard algorithms in multimodal and/or noisy optimization situations, but suffer from lack of gradient exploitation facilities when gradient information can be utilized to guide the search. (2) For large populations, or low dimensional function spaces, mutation is a sufficient operator. However for small populations or high dimensional functions, crossover applied in about equal frequency with mutation is an optimum combination. (3) Complexity, in terms of storage space and running time, is significantly increased when population size is increased or the inversion operator, or the second level adaptation routine is added to the basic structure.

  6. Shape-driven 3D segmentation using spherical wavelets.

    PubMed

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen

    2006-01-01

    This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details.

  7. Minimized-Laplacian residual interpolation for color image demosaicking

    NASA Astrophysics Data System (ADS)

    Kiku, Daisuke; Monno, Yusuke; Tanaka, Masayuki; Okutomi, Masatoshi

    2014-03-01

    A color difference interpolation technique is widely used for color image demosaicking. In this paper, we propose a minimized-laplacian residual interpolation (MLRI) as an alternative to the color difference interpolation, where the residuals are differences between observed and tentatively estimated pixel values. In the MLRI, we estimate the tentative pixel values by minimizing the Laplacian energies of the residuals. This residual image transfor- mation allows us to interpolate more easily than the standard color difference transformation. We incorporate the proposed MLRI into the gradient based threshold free (GBTF) algorithm, which is one of current state-of- the-art demosaicking algorithms. Experimental results demonstrate that our proposed demosaicking algorithm can outperform the state-of-the-art algorithms for the 30 images of the IMAX and the Kodak datasets.

  8. A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm

    PubMed Central

    Ju, Chunhua

    2013-01-01

    Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on K-means clustering algorithm. In the process of clustering, we use artificial bee colony (ABC) algorithm to overcome the local optimal problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on users clustering algorithm outperforms many other recommendation methods. PMID:24381525

  9. Model-based sensor-less wavefront aberration correction in optical coherence tomography.

    PubMed

    Verstraete, Hans R G W; Wahls, Sander; Kalkman, Jeroen; Verhaegen, Michel

    2015-12-15

    Several sensor-less wavefront aberration correction methods that correct nonlinear wavefront aberrations by maximizing the optical coherence tomography (OCT) signal are tested on an OCT setup. A conventional coordinate search method is compared to two model-based optimization methods. The first model-based method takes advantage of the well-known optimization algorithm (NEWUOA) and utilizes a quadratic model. The second model-based method (DONE) is new and utilizes a random multidimensional Fourier-basis expansion. The model-based algorithms achieve lower wavefront errors with up to ten times fewer measurements. Furthermore, the newly proposed DONE method outperforms the NEWUOA method significantly. The DONE algorithm is tested on OCT images and shows a significantly improved image quality.

  10. LTI system order reduction approach based on asymptotical equivalence and the Co-operation of biology-related algorithms

    NASA Astrophysics Data System (ADS)

    Ryzhikov, I. S.; Semenkin, E. S.; Akhmedova, Sh A.

    2017-02-01

    A novel order reduction method for linear time invariant systems is described. The method is based on reducing the initial problem to an optimization one, using the proposed model representation, and solving the problem with an efficient optimization algorithm. The proposed method of determining the model allows all the parameters of the model with lower order to be identified and by definition, provides the model with the required steady-state. As a powerful optimization tool, the meta-heuristic Co-Operation of Biology-Related Algorithms was used. Experimental results proved that the proposed approach outperforms other approaches and that the reduced order model achieves a high level of accuracy.

  11. A Swarm Optimization Genetic Algorithm Based on Quantum-Behaved Particle Swarm Optimization.

    PubMed

    Sun, Tao; Xu, Ming-Hai

    2017-01-01

    Quantum-behaved particle swarm optimization (QPSO) algorithm is a variant of the traditional particle swarm optimization (PSO). The QPSO that was originally developed for continuous search spaces outperforms the traditional PSO in search ability. This paper analyzes the main factors that impact the search ability of QPSO and converts the particle movement formula to the mutation condition by introducing the rejection region, thus proposing a new binary algorithm, named swarm optimization genetic algorithm (SOGA), because it is more like genetic algorithm (GA) than PSO in form. SOGA has crossover and mutation operator as GA but does not need to set the crossover and mutation probability, so it has fewer parameters to control. The proposed algorithm was tested with several nonlinear high-dimension functions in the binary search space, and the results were compared with those from BPSO, BQPSO, and GA. The experimental results show that SOGA is distinctly superior to the other three algorithms in terms of solution accuracy and convergence.

  12. Image reconstruction algorithms for electrical capacitance tomography based on ROF model using new numerical techniques

    NASA Astrophysics Data System (ADS)

    Chen, Jiaoxuan; Zhang, Maomao; Liu, Yinyan; Chen, Jiaoliao; Li, Yi

    2017-03-01

    Electrical capacitance tomography (ECT) is a promising technique applied in many fields. However, the solutions for ECT are not unique and highly sensitive to the measurement noise. To remain a good shape of reconstructed object and endure a noisy data, a Rudin-Osher-Fatemi (ROF) model with total variation regularization is applied to image reconstruction in ECT. Two numerical methods, which are simplified augmented Lagrangian (SAL) and accelerated alternating direction method of multipliers (AADMM), are innovatively introduced to try to solve the above mentioned problems in ECT. The effect of the parameters and the number of iterations for different algorithms, and the noise level in capacitance data are discussed. Both simulation and experimental tests were carried out to validate the feasibility of the proposed algorithms, compared to the Landweber iteration (LI) algorithm. The results show that the SAL and AADMM algorithms can handle a high level of noise and the AADMM algorithm outperforms other algorithms in identifying the object from its background.

  13. "ON ALGEBRAIC DECODING OF Q-ARY REED-MULLER AND PRODUCT REED-SOLOMON CODES"

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    SANTHI, NANDAKISHORE

    We consider a list decoding algorithm recently proposed by Pellikaan-Wu for q-ary Reed-Muller codes RM{sub q}({ell}, m, n) of length n {le} q{sup m} when {ell} {le} q. A simple and easily accessible correctness proof is given which shows that this algorithm achieves a relative error-correction radius of {tau} {le} (1-{radical}{ell}q{sup m-1}/n). This is an improvement over the proof using one-point Algebraic-Geometric decoding method given in. The described algorithm can be adapted to decode product Reed-Solomon codes. We then propose a new low complexity recursive aJgebraic decoding algorithm for product Reed-Solomon codes and Reed-Muller codes. This algorithm achieves a relativemore » error correction radius of {tau} {le} {Pi}{sub i=1}{sup m} (1 - {radical}k{sub i}/q). This algorithm is then proved to outperform the Pellikaan-Wu algorithm in both complexity and error correction radius over a wide range of code rates.« less

  14. Multi-test decision tree and its application to microarray data classification.

    PubMed

    Czajkowski, Marcin; Grześ, Marek; Kretowski, Marek

    2014-05-01

    The desirable property of tools used to investigate biological data is easy to understand models and predictive decisions. Decision trees are particularly promising in this regard due to their comprehensible nature that resembles the hierarchical process of human decision making. However, existing algorithms for learning decision trees have tendency to underfit gene expression data. The main aim of this work is to improve the performance and stability of decision trees with only a small increase in their complexity. We propose a multi-test decision tree (MTDT); our main contribution is the application of several univariate tests in each non-terminal node of the decision tree. We also search for alternative, lower-ranked features in order to obtain more stable and reliable predictions. Experimental validation was performed on several real-life gene expression datasets. Comparison results with eight classifiers show that MTDT has a statistically significantly higher accuracy than popular decision tree classifiers, and it was highly competitive with ensemble learning algorithms. The proposed solution managed to outperform its baseline algorithm on 14 datasets by an average 6%. A study performed on one of the datasets showed that the discovered genes used in the MTDT classification model are supported by biological evidence in the literature. This paper introduces a new type of decision tree which is more suitable for solving biological problems. MTDTs are relatively easy to analyze and much more powerful in modeling high dimensional microarray data than their popular counterparts. Copyright © 2014 Elsevier B.V. All rights reserved.

  15. Fusion of multichannel local and global structural cues for photo aesthetics evaluation.

    PubMed

    Luming Zhang; Yue Gao; Zimmermann, Roger; Qi Tian; Xuelong Li

    2014-03-01

    Photo aesthetic quality evaluation is a fundamental yet under addressed task in computer vision and image processing fields. Conventional approaches are frustrated by the following two drawbacks. First, both the local and global spatial arrangements of image regions play an important role in photo aesthetics. However, existing rules, e.g., visual balance, heuristically define which spatial distribution among the salient regions of a photo is aesthetically pleasing. Second, it is difficult to adjust visual cues from multiple channels automatically in photo aesthetics assessment. To solve these problems, we propose a new photo aesthetics evaluation framework, focusing on learning the image descriptors that characterize local and global structural aesthetics from multiple visual channels. In particular, to describe the spatial structure of the image local regions, we construct graphlets small-sized connected graphs by connecting spatially adjacent atomic regions. Since spatially adjacent graphlets distribute closely in their feature space, we project them onto a manifold and subsequently propose an embedding algorithm. The embedding algorithm encodes the photo global spatial layout into graphlets. Simultaneously, the importance of graphlets from multiple visual channels are dynamically adjusted. Finally, these post-embedding graphlets are integrated for photo aesthetics evaluation using a probabilistic model. Experimental results show that: 1) the visualized graphlets explicitly capture the aesthetically arranged atomic regions; 2) the proposed approach generalizes and improves four prominent aesthetic rules; and 3) our approach significantly outperforms state-of-the-art algorithms in photo aesthetics prediction.

  16. Density-Aware Clustering Based on Aggregated Heat Kernel and Its Transformation

    DOE PAGES

    Huang, Hao; Yoo, Shinjae; Yu, Dantong; ...

    2015-06-01

    Current spectral clustering algorithms suffer from the sensitivity to existing noise, and parameter scaling, and may not be aware of different density distributions across clusters. If these problems are left untreated, the consequent clustering results cannot accurately represent true data patterns, in particular, for complex real world datasets with heterogeneous densities. This paper aims to solve these problems by proposing a diffusion-based Aggregated Heat Kernel (AHK) to improve the clustering stability, and a Local Density Affinity Transformation (LDAT) to correct the bias originating from different cluster densities. AHK statistically\\ models the heat diffusion traces along the entire time scale, somore » it ensures robustness during clustering process, while LDAT probabilistically reveals local density of each instance and suppresses the local density bias in the affinity matrix. Our proposed framework integrates these two techniques systematically. As a result, not only does it provide an advanced noise-resisting and density-aware spectral mapping to the original dataset, but also demonstrates the stability during the processing of tuning the scaling parameter (which usually controls the range of neighborhood). Furthermore, our framework works well with the majority of similarity kernels, which ensures its applicability to many types of data and problem domains. The systematic experiments on different applications show that our proposed algorithms outperform state-of-the-art clustering algorithms for the data with heterogeneous density distributions, and achieve robust clustering performance with respect to tuning the scaling parameter and handling various levels and types of noise.« less

  17. A novel variable selection approach that iteratively optimizes variable space using weighted binary matrix sampling.

    PubMed

    Deng, Bai-chuan; Yun, Yong-huan; Liang, Yi-zeng; Yi, Lun-zhao

    2014-10-07

    In this study, a new optimization algorithm called the Variable Iterative Space Shrinkage Approach (VISSA) that is based on the idea of model population analysis (MPA) is proposed for variable selection. Unlike most of the existing optimization methods for variable selection, VISSA statistically evaluates the performance of variable space in each step of optimization. Weighted binary matrix sampling (WBMS) is proposed to generate sub-models that span the variable subspace. Two rules are highlighted during the optimization procedure. First, the variable space shrinks in each step. Second, the new variable space outperforms the previous one. The second rule, which is rarely satisfied in most of the existing methods, is the core of the VISSA strategy. Compared with some promising variable selection methods such as competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variable elimination (MCUVE) and iteratively retaining informative variables (IRIV), VISSA showed better prediction ability for the calibration of NIR data. In addition, VISSA is user-friendly; only a few insensitive parameters are needed, and the program terminates automatically without any additional conditions. The Matlab codes for implementing VISSA are freely available on the website: https://sourceforge.net/projects/multivariateanalysis/files/VISSA/.

  18. Active link selection for efficient semi-supervised community detection

    NASA Astrophysics Data System (ADS)

    Yang, Liang; Jin, Di; Wang, Xiao; Cao, Xiaochun

    2015-03-01

    Several semi-supervised community detection algorithms have been proposed recently to improve the performance of traditional topology-based methods. However, most of them focus on how to integrate supervised information with topology information; few of them pay attention to which information is critical for performance improvement. This leads to large amounts of demand for supervised information, which is expensive or difficult to obtain in most fields. For this problem we propose an active link selection framework, that is we actively select the most uncertain and informative links for human labeling for the efficient utilization of the supervised information. We also disconnect the most likely inter-community edges to further improve the efficiency. Our main idea is that, by connecting uncertain nodes to their community hubs and disconnecting the inter-community edges, one can sharpen the block structure of adjacency matrix more efficiently than randomly labeling links as the existing methods did. Experiments on both synthetic and real networks demonstrate that our new approach significantly outperforms the existing methods in terms of the efficiency of using supervised information. It needs ~13% of the supervised information to achieve a performance similar to that of the original semi-supervised approaches.

  19. Active link selection for efficient semi-supervised community detection

    PubMed Central

    Yang, Liang; Jin, Di; Wang, Xiao; Cao, Xiaochun

    2015-01-01

    Several semi-supervised community detection algorithms have been proposed recently to improve the performance of traditional topology-based methods. However, most of them focus on how to integrate supervised information with topology information; few of them pay attention to which information is critical for performance improvement. This leads to large amounts of demand for supervised information, which is expensive or difficult to obtain in most fields. For this problem we propose an active link selection framework, that is we actively select the most uncertain and informative links for human labeling for the efficient utilization of the supervised information. We also disconnect the most likely inter-community edges to further improve the efficiency. Our main idea is that, by connecting uncertain nodes to their community hubs and disconnecting the inter-community edges, one can sharpen the block structure of adjacency matrix more efficiently than randomly labeling links as the existing methods did. Experiments on both synthetic and real networks demonstrate that our new approach significantly outperforms the existing methods in terms of the efficiency of using supervised information. It needs ~13% of the supervised information to achieve a performance similar to that of the original semi-supervised approaches. PMID:25761385

  20. Improving detection of copy-number variation by simultaneous bias correction and read-depth segmentation.

    PubMed

    Szatkiewicz, Jin P; Wang, WeiBo; Sullivan, Patrick F; Wang, Wei; Sun, Wei

    2013-02-01

    Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth-based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth-based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.

  1. A Novel Extreme Learning Machine Classification Model for e-Nose Application Based on the Multiple Kernel Approach.

    PubMed

    Jian, Yulin; Huang, Daoyu; Yan, Jia; Lu, Kun; Huang, Ying; Wen, Tailai; Zeng, Tanyue; Zhong, Shijie; Xie, Qilong

    2017-06-19

    A novel classification model, named the quantum-behaved particle swarm optimization (QPSO)-based weighted multiple kernel extreme learning machine (QWMK-ELM), is proposed in this paper. Experimental validation is carried out with two different electronic nose (e-nose) datasets. Being different from the existing multiple kernel extreme learning machine (MK-ELM) algorithms, the combination coefficients of base kernels are regarded as external parameters of single-hidden layer feedforward neural networks (SLFNs). The combination coefficients of base kernels, the model parameters of each base kernel, and the regularization parameter are optimized by QPSO simultaneously before implementing the kernel extreme learning machine (KELM) with the composite kernel function. Four types of common single kernel functions (Gaussian kernel, polynomial kernel, sigmoid kernel, and wavelet kernel) are utilized to constitute different composite kernel functions. Moreover, the method is also compared with other existing classification methods: extreme learning machine (ELM), kernel extreme learning machine (KELM), k-nearest neighbors (KNN), support vector machine (SVM), multi-layer perceptron (MLP), radical basis function neural network (RBFNN), and probabilistic neural network (PNN). The results have demonstrated that the proposed QWMK-ELM outperforms the aforementioned methods, not only in precision, but also in efficiency for gas classification.

  2. Evaluation of prostate segmentation algorithms for MRI: the PROMISE12 challenge

    PubMed Central

    Litjens, Geert; Toth, Robert; van de Ven, Wendy; Hoeks, Caroline; Kerkstra, Sjoerd; van Ginneken, Bram; Vincent, Graham; Guillard, Gwenael; Birbeck, Neil; Zhang, Jindang; Strand, Robin; Malmberg, Filip; Ou, Yangming; Davatzikos, Christos; Kirschner, Matthias; Jung, Florian; Yuan, Jing; Qiu, Wu; Gao, Qinquan; Edwards, Philip “Eddie”; Maan, Bianca; van der Heijden, Ferdinand; Ghose, Soumya; Mitra, Jhimli; Dowling, Jason; Barratt, Dean; Huisman, Henkjan; Madabhushi, Anant

    2014-01-01

    Prostate MRI image segmentation has been an area of intense research due to the increased use of MRI as a modality for the clinical workup of prostate cancer. Segmentation is useful for various tasks, e.g. to accurately localize prostate boundaries for radiotherapy or to initialize multi-modal registration algorithms. In the past, it has been difficult for research groups to evaluate prostate segmentation algorithms on multi-center, multi-vendor and multi-protocol data. Especially because we are dealing with MR images, image appearance, resolution and the presence of artifacts are affected by differences in scanners and/or protocols, which in turn can have a large influence on algorithm accuracy. The Prostate MR Image Segmentation (PROMISE12) challenge was setup to allow a fair and meaningful comparison of segmentation methods on the basis of performance and robustness. In this work we will discuss the initial results of the online PROMISE12 challenge, and the results obtained in the live challenge workshop hosted by the MICCAI2012 conference. In the challenge, 100 prostate MR cases from 4 different centers were included, with differences in scanner manufacturer, field strength and protocol. A total of 11 teams from academic research groups and industry participated. Algorithms showed a wide variety in methods and implementation, including active appearance models, atlas registration and level sets. Evaluation was performed using boundary and volume based metrics which were combined into a single score relating the metrics to human expert performance. The winners of the challenge where the algorithms by teams Imorphics and ScrAutoProstate, with scores of 85.72 and 84.29 overall. Both algorithms where significantly better than all other algorithms in the challenge (p < 0.05) and had an efficient implementation with a run time of 8 minutes and 3 second per case respectively. Overall, active appearance model based approaches seemed to outperform other approaches like multi-atlas registration, both on accuracy and computation time. Although average algorithm performance was good to excellent and the Imorphics algorithm outperformed the second observer on average, we showed that algorithm combination might lead to further improvement, indicating that optimal performance for prostate segmentation is not yet obtained. All results are available online at http://promise12.grand-challenge.org/. PMID:24418598

  3. Macula segmentation and fovea localization employing image processing and heuristic based clustering for automated retinal screening.

    PubMed

    R, GeethaRamani; Balasubramanian, Lakshmi

    2018-07-01

    Macula segmentation and fovea localization is one of the primary tasks in retinal analysis as they are responsible for detailed vision. Existing approaches required segmentation of retinal structures viz. optic disc and blood vessels for this purpose. This work avoids knowledge of other retinal structures and attempts data mining techniques to segment macula. Unsupervised clustering algorithm is exploited for this purpose. Selection of initial cluster centres has a great impact on performance of clustering algorithms. A heuristic based clustering in which initial centres are selected based on measures defining statistical distribution of data is incorporated in the proposed methodology. The initial phase of proposed framework includes image cropping, green channel extraction, contrast enhancement and application of mathematical closing. Then, the pre-processed image is subjected to heuristic based clustering yielding a binary map. The binary image is post-processed to eliminate unwanted components. Finally, the component which possessed the minimum intensity is finalized as macula and its centre constitutes the fovea. The proposed approach outperforms existing works by reporting that 100%,of HRF, 100% of DRIVE, 96.92% of DIARETDB0, 97.75% of DIARETDB1, 98.81% of HEI-MED, 90% of STARE and 99.33% of MESSIDOR images satisfy the 1R criterion, a standard adopted for evaluating performance of macula and fovea identification. The proposed system thus helps the ophthalmologists in identifying the macula thereby facilitating to identify if any abnormality is present within the macula region. Copyright © 2018 Elsevier B.V. All rights reserved.

  4. Machine learning with naturally labeled data for identifying abbreviation definitions.

    PubMed

    Yeganova, Lana; Comeau, Donald C; Wilbur, W John

    2011-06-09

    The rapid growth of biomedical literature requires accurate text analysis and text processing tools. Detecting abbreviations and identifying their definitions is an important component of such tools. Most existing approaches for the abbreviation definition identification task employ rule-based methods. While achieving high precision, rule-based methods are limited to the rules defined and fail to capture many uncommon definition patterns. Supervised learning techniques, which offer more flexibility in detecting abbreviation definitions, have also been applied to the problem. However, they require manually labeled training data. In this work, we develop a machine learning algorithm for abbreviation definition identification in text which makes use of what we term naturally labeled data. Positive training examples are naturally occurring potential abbreviation-definition pairs in text. Negative training examples are generated by randomly mixing potential abbreviations with unrelated potential definitions. The machine learner is trained to distinguish between these two sets of examples. Then, the learned feature weights are used to identify the abbreviation full form. This approach does not require manually labeled training data. We evaluate the performance of our algorithm on the Ab3P, BIOADI and Medstract corpora. Our system demonstrated results that compare favourably to the existing Ab3P and BIOADI systems. We achieve an F-measure of 91.36% on Ab3P corpus, and an F-measure of 87.13% on BIOADI corpus which are superior to the results reported by Ab3P and BIOADI systems. Moreover, we outperform these systems in terms of recall, which is one of our goals.

  5. Deep learning with domain adaptation for accelerated projection-reconstruction MR.

    PubMed

    Han, Yoseob; Yoo, Jaejun; Kim, Hak Hee; Shin, Hee Jung; Sung, Kyunghyun; Ye, Jong Chul

    2018-09-01

    The radial k-space trajectory is a well-established sampling trajectory used in conjunction with magnetic resonance imaging. However, the radial k-space trajectory requires a large number of radial lines for high-resolution reconstruction. Increasing the number of radial lines causes longer acquisition time, making it more difficult for routine clinical use. On the other hand, if we reduce the number of radial lines, streaking artifact patterns are unavoidable. To solve this problem, we propose a novel deep learning approach with domain adaptation to restore high-resolution MR images from under-sampled k-space data. The proposed deep network removes the streaking artifacts from the artifact corrupted images. To address the situation given the limited available data, we propose a domain adaptation scheme that employs a pre-trained network using a large number of X-ray computed tomography (CT) or synthesized radial MR datasets, which is then fine-tuned with only a few radial MR datasets. The proposed method outperforms existing compressed sensing algorithms, such as the total variation and PR-FOCUSS methods. In addition, the calculation time is several orders of magnitude faster than the total variation and PR-FOCUSS methods. Moreover, we found that pre-training using CT or MR data from similar organ data is more important than pre-training using data from the same modality for different organ. We demonstrate the possibility of a domain-adaptation when only a limited amount of MR data is available. The proposed method surpasses the existing compressed sensing algorithms in terms of the image quality and computation time. © 2018 International Society for Magnetic Resonance in Medicine.

  6. Positive-unlabeled learning for disease gene identification

    PubMed Central

    Yang, Peng; Li, Xiao-Li; Mei, Jian-Ping; Kwoh, Chee-Keong; Ng, See-Kiong

    2012-01-01

    Background: Identifying disease genes from human genome is an important but challenging task in biomedical research. Machine learning methods can be applied to discover new disease genes based on the known ones. Existing machine learning methods typically use the known disease genes as the positive training set P and the unknown genes as the negative training set N (non-disease gene set does not exist) to build classifiers to identify new disease genes from the unknown genes. However, such kind of classifiers is actually built from a noisy negative set N as there can be unknown disease genes in N itself. As a result, the classifiers do not perform as well as they could be. Result: Instead of treating the unknown genes as negative examples in N, we treat them as an unlabeled set U. We design a novel positive-unlabeled (PU) learning algorithm PUDI (PU learning for disease gene identification) to build a classifier using P and U. We first partition U into four sets, namely, reliable negative set RN, likely positive set LP, likely negative set LN and weak negative set WN. The weighted support vector machines are then used to build a multi-level classifier based on the four training sets and positive training set P to identify disease genes. Our experimental results demonstrate that our proposed PUDI algorithm outperformed the existing methods significantly. Conclusion: The proposed PUDI algorithm is able to identify disease genes more accurately by treating the unknown data more appropriately as unlabeled set U instead of negative set N. Given that many machine learning problems in biomedical research do involve positive and unlabeled data instead of negative data, it is possible that the machine learning methods for these problems can be further improved by adopting PU learning methods, as we have done here for disease gene identification. Availability and implementation: The executable program and data are available at http://www1.i2r.a-star.edu.sg/∼xlli/PUDI/PUDI.html. Contact: xlli@i2r.a-star.edu.sg or yang0293@e.ntu.edu.sg Supplementary information: Supplementary Data are available at Bioinformatics online. PMID:22923290

  7. Advanced time integration algorithms for dislocation dynamics simulations of work hardening

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sills, Ryan B.; Aghaei, Amin; Cai, Wei

    Efficient time integration is a necessity for dislocation dynamics simulations of work hardening to achieve experimentally relevant strains. In this work, an efficient time integration scheme using a high order explicit method with time step subcycling and a newly-developed collision detection algorithm are evaluated. First, time integrator performance is examined for an annihilating Frank–Read source, showing the effects of dislocation line collision. The integrator with subcycling is found to significantly out-perform other integration schemes. The performance of the time integration and collision detection algorithms is then tested in a work hardening simulation. The new algorithms show a 100-fold speed-up relativemore » to traditional schemes. As a result, subcycling is shown to improve efficiency significantly while maintaining an accurate solution, and the new collision algorithm allows an arbitrarily large time step size without missing collisions.« less

  8. Optimizing the Shunting Schedule of Electric Multiple Units Depot Using an Enhanced Particle Swarm Optimization Algorithm

    PubMed Central

    Jin, Junchen

    2016-01-01

    The shunting schedule of electric multiple units depot (SSED) is one of the essential plans for high-speed train maintenance activities. This paper presents a 0-1 programming model to address the problem of determining an optimal SSED through automatic computing. The objective of the model is to minimize the number of shunting movements and the constraints include track occupation conflicts, shunting routes conflicts, time durations of maintenance processes, and shunting running time. An enhanced particle swarm optimization (EPSO) algorithm is proposed to solve the optimization problem. Finally, an empirical study from Shanghai South EMU Depot is carried out to illustrate the model and EPSO algorithm. The optimization results indicate that the proposed method is valid for the SSED problem and that the EPSO algorithm outperforms the traditional PSO algorithm on the aspect of optimality. PMID:27436998

  9. Advanced time integration algorithms for dislocation dynamics simulations of work hardening

    DOE PAGES

    Sills, Ryan B.; Aghaei, Amin; Cai, Wei

    2016-04-25

    Efficient time integration is a necessity for dislocation dynamics simulations of work hardening to achieve experimentally relevant strains. In this work, an efficient time integration scheme using a high order explicit method with time step subcycling and a newly-developed collision detection algorithm are evaluated. First, time integrator performance is examined for an annihilating Frank–Read source, showing the effects of dislocation line collision. The integrator with subcycling is found to significantly out-perform other integration schemes. The performance of the time integration and collision detection algorithms is then tested in a work hardening simulation. The new algorithms show a 100-fold speed-up relativemore » to traditional schemes. As a result, subcycling is shown to improve efficiency significantly while maintaining an accurate solution, and the new collision algorithm allows an arbitrarily large time step size without missing collisions.« less

  10. Operating Quantum States in Single Magnetic Molecules: Implementation of Grover's Quantum Algorithm.

    PubMed

    Godfrin, C; Ferhat, A; Ballou, R; Klyatskaya, S; Ruben, M; Wernsdorfer, W; Balestro, F

    2017-11-03

    Quantum algorithms use the principles of quantum mechanics, such as, for example, quantum superposition, in order to solve particular problems outperforming standard computation. They are developed for cryptography, searching, optimization, simulation, and solving large systems of linear equations. Here, we implement Grover's quantum algorithm, proposed to find an element in an unsorted list, using a single nuclear 3/2 spin carried by a Tb ion sitting in a single molecular magnet transistor. The coherent manipulation of this multilevel quantum system (qudit) is achieved by means of electric fields only. Grover's search algorithm is implemented by constructing a quantum database via a multilevel Hadamard gate. The Grover sequence then allows us to select each state. The presented method is of universal character and can be implemented in any multilevel quantum system with nonequal spaced energy levels, opening the way to novel quantum search algorithms.

  11. Linear antenna array optimization using flower pollination algorithm.

    PubMed

    Saxena, Prerna; Kothari, Ashwin

    2016-01-01

    Flower pollination algorithm (FPA) is a new nature-inspired evolutionary algorithm used to solve multi-objective optimization problems. The aim of this paper is to introduce FPA to the electromagnetics and antenna community for the optimization of linear antenna arrays. FPA is applied for the first time to linear array so as to obtain optimized antenna positions in order to achieve an array pattern with minimum side lobe level along with placement of deep nulls in desired directions. Various design examples are presented that illustrate the use of FPA for linear antenna array optimization, and subsequently the results are validated by benchmarking along with results obtained using other state-of-the-art, nature-inspired evolutionary algorithms such as particle swarm optimization, ant colony optimization and cat swarm optimization. The results suggest that in most cases, FPA outperforms the other evolutionary algorithms and at times it yields a similar performance.

  12. Magnified gradient function with deterministic weight modification in adaptive learning.

    PubMed

    Ng, Sin-Chun; Cheung, Chi-Chung; Leung, Shu-Hung

    2004-11-01

    This paper presents two novel approaches, backpropagation (BP) with magnified gradient function (MGFPROP) and deterministic weight modification (DWM), to speed up the convergence rate and improve the global convergence capability of the standard BP learning algorithm. The purpose of MGFPROP is to increase the convergence rate by magnifying the gradient function of the activation function, while the main objective of DWM is to reduce the system error by changing the weights of a multilayered feedforward neural network in a deterministic way. Simulation results show that the performance of the above two approaches is better than BP and other modified BP algorithms for a number of learning problems. Moreover, the integration of the above two approaches forming a new algorithm called MDPROP, can further improve the performance of MGFPROP and DWM. From our simulation results, the MDPROP algorithm always outperforms BP and other modified BP algorithms in terms of convergence rate and global convergence capability.

  13. A joint tracking method for NSCC based on WLS algorithm

    NASA Astrophysics Data System (ADS)

    Luo, Ruidan; Xu, Ying; Yuan, Hong

    2017-12-01

    Navigation signal based on compound carrier (NSCC), has the flexible multi-carrier scheme and various scheme parameters configuration, which enables it to possess significant efficiency of navigation augmentation in terms of spectral efficiency, tracking accuracy, multipath mitigation capability and anti-jamming reduction compared with legacy navigation signals. Meanwhile, the typical scheme characteristics can provide auxiliary information for signal synchronism algorithm design. This paper, based on the characteristics of NSCC, proposed a kind of joint tracking method utilizing Weighted Least Square (WLS) algorithm. In this method, the LS algorithm is employed to jointly estimate each sub-carrier frequency shift with the frequency-Doppler linear relationship, by utilizing the known sub-carrier frequency. Besides, the weighting matrix is set adaptively according to the sub-carrier power to ensure the estimation accuracy. Both the theory analysis and simulation results illustrate that the tracking accuracy and sensitivity of this method outperforms the single-carrier algorithm with lower SNR.

  14. Science Concierge: A Fast Content-Based Recommendation System for Scientific Publications.

    PubMed

    Achakulvisut, Titipat; Acuna, Daniel E; Ruangrong, Tulakan; Kording, Konrad

    2016-01-01

    Finding relevant publications is important for scientists who have to cope with exponentially increasing numbers of scholarly material. Algorithms can help with this task as they help for music, movie, and product recommendations. However, we know little about the performance of these algorithms with scholarly material. Here, we develop an algorithm, and an accompanying Python library, that implements a recommendation system based on the content of articles. Design principles are to adapt to new content, provide near-real time suggestions, and be open source. We tested the library on 15K posters from the Society of Neuroscience Conference 2015. Human curated topics are used to cross validate parameters in the algorithm and produce a similarity metric that maximally correlates with human judgments. We show that our algorithm significantly outperformed suggestions based on keywords. The work presented here promises to make the exploration of scholarly material faster and more accurate.

  15. Fuzzy Sarsa with Focussed Replacing Eligibility Traces for Robust and Accurate Control

    NASA Astrophysics Data System (ADS)

    Kamdem, Sylvain; Ohki, Hidehiro; Sueda, Naomichi

    Several methods of reinforcement learning in continuous state and action spaces that utilize fuzzy logic have been proposed in recent years. This paper introduces Fuzzy Sarsa(λ), an on-policy algorithm for fuzzy learning that relies on a novel way of computing replacing eligibility traces to accelerate the policy evaluation. It is tested against several temporal difference learning algorithms: Sarsa(λ), Fuzzy Q(λ), an earlier fuzzy version of Sarsa and an actor-critic algorithm. We perform detailed evaluations on two benchmark problems : a maze domain and the cart pole. Results of various tests highlight the strengths and weaknesses of these algorithms and show that Fuzzy Sarsa(λ) outperforms all other algorithms tested for a larger granularity of design and under noisy conditions. It is a highly competitive method of learning in realistic noisy domains where a denser fuzzy design over the state space is needed for a more precise control.

  16. Automatic Regionalization Algorithm for Distributed State Estimation in Power Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Dexin; Yang, Liuqing; Florita, Anthony

    The deregulation of the power system and the incorporation of generation from renewable energy sources recessitates faster state estimation in the smart grid. Distributed state estimation (DSE) has become a promising and scalable solution to this urgent demand. In this paper, we investigate the regionalization algorithms for the power system, a necessary step before distributed state estimation can be performed. To the best of the authors' knowledge, this is the first investigation on automatic regionalization (AR). We propose three spectral clustering based AR algorithms. Simulations show that our proposed algorithms outperform the two investigated manual regionalization cases. With the helpmore » of AR algorithms, we also show how the number of regions impacts the accuracy and convergence speed of the DSE and conclude that the number of regions needs to be chosen carefully to improve the convergence speed of DSEs.« less

  17. Automatic Regionalization Algorithm for Distributed State Estimation in Power Systems: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang, Dexin; Yang, Liuqing; Florita, Anthony

    The deregulation of the power system and the incorporation of generation from renewable energy sources recessitates faster state estimation in the smart grid. Distributed state estimation (DSE) has become a promising and scalable solution to this urgent demand. In this paper, we investigate the regionalization algorithms for the power system, a necessary step before distributed state estimation can be performed. To the best of the authors' knowledge, this is the first investigation on automatic regionalization (AR). We propose three spectral clustering based AR algorithms. Simulations show that our proposed algorithms outperform the two investigated manual regionalization cases. With the helpmore » of AR algorithms, we also show how the number of regions impacts the accuracy and convergence speed of the DSE and conclude that the number of regions needs to be chosen carefully to improve the convergence speed of DSEs.« less

  18. Science Concierge: A Fast Content-Based Recommendation System for Scientific Publications

    PubMed Central

    Achakulvisut, Titipat; Acuna, Daniel E.; Ruangrong, Tulakan; Kording, Konrad

    2016-01-01

    Finding relevant publications is important for scientists who have to cope with exponentially increasing numbers of scholarly material. Algorithms can help with this task as they help for music, movie, and product recommendations. However, we know little about the performance of these algorithms with scholarly material. Here, we develop an algorithm, and an accompanying Python library, that implements a recommendation system based on the content of articles. Design principles are to adapt to new content, provide near-real time suggestions, and be open source. We tested the library on 15K posters from the Society of Neuroscience Conference 2015. Human curated topics are used to cross validate parameters in the algorithm and produce a similarity metric that maximally correlates with human judgments. We show that our algorithm significantly outperformed suggestions based on keywords. The work presented here promises to make the exploration of scholarly material faster and more accurate. PMID:27383424

  19. Operating Quantum States in Single Magnetic Molecules: Implementation of Grover's Quantum Algorithm

    NASA Astrophysics Data System (ADS)

    Godfrin, C.; Ferhat, A.; Ballou, R.; Klyatskaya, S.; Ruben, M.; Wernsdorfer, W.; Balestro, F.

    2017-11-01

    Quantum algorithms use the principles of quantum mechanics, such as, for example, quantum superposition, in order to solve particular problems outperforming standard computation. They are developed for cryptography, searching, optimization, simulation, and solving large systems of linear equations. Here, we implement Grover's quantum algorithm, proposed to find an element in an unsorted list, using a single nuclear 3 /2 spin carried by a Tb ion sitting in a single molecular magnet transistor. The coherent manipulation of this multilevel quantum system (qudit) is achieved by means of electric fields only. Grover's search algorithm is implemented by constructing a quantum database via a multilevel Hadamard gate. The Grover sequence then allows us to select each state. The presented method is of universal character and can be implemented in any multilevel quantum system with nonequal spaced energy levels, opening the way to novel quantum search algorithms.

  20. Hierarchical image segmentation via recursive superpixel with adaptive regularity

    NASA Astrophysics Data System (ADS)

    Nakamura, Kensuke; Hong, Byung-Woo

    2017-11-01

    A fast and accurate segmentation algorithm in a hierarchical way based on a recursive superpixel technique is presented. We propose a superpixel energy formulation in which the trade-off between data fidelity and regularization is dynamically determined based on the local residual in the energy optimization procedure. We also present an energy optimization algorithm that allows a pixel to be shared by multiple regions to improve the accuracy and appropriate the number of segments. The qualitative and quantitative evaluations demonstrate that our algorithm, combining the proposed energy and optimization, outperforms the conventional k-means algorithm by up to 29.10% in F-measure. We also perform comparative analysis with state-of-the-art algorithms in the hierarchical segmentation. Our algorithm yields smooth regions throughout the hierarchy as opposed to the others that include insignificant details. Our algorithm overtakes the other algorithms in terms of balance between accuracy and computational time. Specifically, our method runs 36.48% faster than the region-merging approach, which is the fastest of the comparing algorithms, while achieving a comparable accuracy.

  1. Prediction of fatigue-related driver performance from EEG data by deep Riemannian model.

    PubMed

    Hajinoroozi, Mehdi; Jianqiu Zhang; Yufei Huang

    2017-07-01

    Prediction of the drivers' drowsy and alert states is important for safety purposes. The prediction of drivers' drowsy and alert states from electroencephalography (EEG) using shallow and deep Riemannian methods is presented. For shallow Riemannian methods, the minimum distance to Riemannian mean (mdm) and Log-Euclidian metric are investigated, where it is shown that Log-Euclidian metric outperforms the mdm algorithm. In addition the SPDNet, a deep Riemannian model, that takes the EEG covariance matrix as the input is investigated. It is shown that SPDNet outperforms all tested shallow and deep classification methods. Performance of SPDNet is 6.02% and 2.86% higher than the best performance by the conventional Euclidian classifiers and shallow Riemannian models, respectively.

  2. Improving HybrID: How to best combine indirect and direct encoding in evolutionary algorithms.

    PubMed

    Helms, Lucas; Clune, Jeff

    2017-01-01

    Many challenging engineering problems are regular, meaning solutions to one part of a problem can be reused to solve other parts. Evolutionary algorithms with indirect encoding perform better on regular problems because they reuse genomic information to create regular phenotypes. However, on problems that are mostly regular, but contain some irregularities, which describes most real-world problems, indirect encodings struggle to handle the irregularities, hurting performance. Direct encodings are better at producing irregular phenotypes, but cannot exploit regularity. An algorithm called HybrID combines the best of both: it first evolves with indirect encoding to exploit problem regularity, then switches to direct encoding to handle problem irregularity. While HybrID has been shown to outperform both indirect and direct encoding, its initial implementation required the manual specification of when to switch from indirect to direct encoding. In this paper, we test two new methods to improve HybrID by eliminating the need to manually specify this parameter. Auto-Switch-HybrID automatically switches from indirect to direct encoding when fitness stagnates. Offset-HybrID simultaneously evolves an indirect encoding with directly encoded offsets, eliminating the need to switch. We compare the original HybrID to these alternatives on three different problems with adjustable regularity. The results show that both Auto-Switch-HybrID and Offset-HybrID outperform the original HybrID on different types of problems, and thus offer more tools for researchers to solve challenging problems. The Offset-HybrID algorithm is particularly interesting because it suggests a path forward for automatically and simultaneously combining the best traits of indirect and direct encoding.

  3. Non-invasive Fetal ECG Signal Quality Assessment for Multichannel Heart Rate Estimation.

    PubMed

    Andreotti, Fernando; Graser, Felix; Malberg, Hagen; Zaunseder, Sebastian

    2017-12-01

    The noninvasive fetal ECG (NI-FECG) from abdominal recordings offers novel prospects for prenatal monitoring. However, NI-FECG signals are corrupted by various nonstationary noise sources, making the processing of abdominal recordings a challenging task. In this paper, we present an online approach that dynamically assess the quality of NI-FECG to improve fetal heart rate (FHR) estimation. Using a naive Bayes classifier, state-of-the-art and novel signal quality indices (SQIs), and an existing adaptive Kalman filter, FHR estimation was improved. For the purpose of training and validating the proposed methods, a large annotated private clinical dataset was used. The suggested classification scheme demonstrated an accuracy of Krippendorff's alpha in determining the overall quality of NI-FECG signals. The proposed Kalman filter outperformed alternative methods for FHR estimation achieving accuracy. The proposed algorithm was able to reliably reflect changes of signal quality and can be used in improving FHR estimation. NI-ECG signal quality estimation and multichannel information fusion are largely unexplored topics. Based on previous works, multichannel FHR estimation is a field that could strongly benefit from such methods. The developed SQI algorithms as well as resulting classifier were made available under a GNU GPL open-source license and contributed to the FECGSYN toolbox.

  4. A Statistically Representative Atlas for Mapping Neuronal Circuits in the Drosophila Adult Brain.

    PubMed

    Arganda-Carreras, Ignacio; Manoliu, Tudor; Mazuras, Nicolas; Schulze, Florian; Iglesias, Juan E; Bühler, Katja; Jenett, Arnim; Rouyer, François; Andrey, Philippe

    2018-01-01

    Imaging the expression patterns of reporter constructs is a powerful tool to dissect the neuronal circuits of perception and behavior in the adult brain of Drosophila , one of the major models for studying brain functions. To date, several Drosophila brain templates and digital atlases have been built to automatically analyze and compare collections of expression pattern images. However, there has been no systematic comparison of performances between alternative atlasing strategies and registration algorithms. Here, we objectively evaluated the performance of different strategies for building adult Drosophila brain templates and atlases. In addition, we used state-of-the-art registration algorithms to generate a new group-wise inter-sex atlas. Our results highlight the benefit of statistical atlases over individual ones and show that the newly proposed inter-sex atlas outperformed existing solutions for automated registration and annotation of expression patterns. Over 3,000 images from the Janelia Farm FlyLight collection were registered using the proposed strategy. These registered expression patterns can be searched and compared with a new version of the BrainBaseWeb system and BrainGazer software. We illustrate the validity of our methodology and brain atlas with registration-based predictions of expression patterns in a subset of clock neurons. The described registration framework should benefit to brain studies in Drosophila and other insect species.

  5. Cyber-Physical System Security With Deceptive Virtual Hosts for Industrial Control Networks

    DOE PAGES

    Vollmer, Todd; Manic, Milos

    2014-05-01

    A challenge facing industrial control network administrators is protecting the typically large number of connected assets for which they are responsible. These cyber devices may be tightly coupled with the physical processes they control and human induced failures risk dire real-world consequences. Dynamic virtual honeypots are effective tools for observing and attracting network intruder activity. This paper presents a design and implementation for self-configuring honeypots that passively examine control system network traffic and actively adapt to the observed environment. In contrast to prior work in the field, six tools were analyzed for suitability of network entity information gathering. Ettercap, anmore » established network security tool not commonly used in this capacity, outperformed the other tools and was chosen for implementation. Utilizing Ettercap XML output, a novel four-step algorithm was developed for autonomous creation and update of a Honeyd configuration. This algorithm was tested on an existing small campus grid and sensor network by execution of a collaborative usage scenario. Automatically created virtual hosts were deployed in concert with an anomaly behavior (AB) system in an attack scenario. Virtual hosts were automatically configured with unique emulated network stack behaviors for 92% of the targeted devices. The AB system alerted on 100% of the monitored emulated devices.« less

  6. A Spatial Division Clustering Method and Low Dimensional Feature Extraction Technique Based Indoor Positioning System

    PubMed Central

    Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao

    2014-01-01

    Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins, therefore clustering methods are widely applied as a solution. However, traditional clustering methods in positioning systems can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect. PMID:24451470

  7. Bilinear Factor Matrix Norm Minimization for Robust PCA: Algorithms and Applications.

    PubMed

    Shang, Fanhua; Cheng, James; Liu, Yuanyuan; Luo, Zhi-Quan; Lin, Zhouchen

    2017-09-04

    The heavy-tailed distributions of corrupted outliers and singular values of all channels in low-level vision have proven effective priors for many applications such as background modeling, photometric stereo and image alignment. And they can be well modeled by a hyper-Laplacian. However, the use of such distributions generally leads to challenging non-convex, non-smooth and non-Lipschitz problems, and makes existing algorithms very slow for large-scale applications. Together with the analytic solutions to Lp-norm minimization with two specific values of p, i.e., p=1/2 and p=2/3, we propose two novel bilinear factor matrix norm minimization models for robust principal component analysis. We first define the double nuclear norm and Frobenius/nuclear hybrid norm penalties, and then prove that they are in essence the Schatten-1/2 and 2/3 quasi-norms, respectively, which lead to much more tractable and scalable Lipschitz optimization problems. Our experimental analysis shows that both our methods yield more accurate solutions than original Schatten quasi-norm minimization, even when the number of observations is very limited. Finally, we apply our penalties to various low-level vision problems, e.g. moving object detection, image alignment and inpainting, and show that our methods usually outperform the state-of-the-art methods.

  8. Plant microRNA-Target Interaction Identification Model Based on the Integration of Prediction Tools and Support Vector Machine

    PubMed Central

    Meng, Jun; Shi, Lin; Luan, Yushi

    2014-01-01

    Background Confident identification of microRNA-target interactions is significant for studying the function of microRNA (miRNA). Although some computational miRNA target prediction methods have been proposed for plants, results of various methods tend to be inconsistent and usually lead to more false positive. To address these issues, we developed an integrated model for identifying plant miRNA–target interactions. Results Three online miRNA target prediction toolkits and machine learning algorithms were integrated to identify and analyze Arabidopsis thaliana miRNA-target interactions. Principle component analysis (PCA) feature extraction and self-training technology were introduced to improve the performance. Results showed that the proposed model outperformed the previously existing methods. The results were validated by using degradome sequencing supported Arabidopsis thaliana miRNA-target interactions. The proposed model constructed on Arabidopsis thaliana was run over Oryza sativa and Vitis vinifera to demonstrate that our model is effective for other plant species. Conclusions The integrated model of online predictors and local PCA-SVM classifier gained credible and high quality miRNA-target interactions. The supervised learning algorithm of PCA-SVM classifier was employed in plant miRNA target identification for the first time. Its performance can be substantially improved if more experimentally proved training samples are provided. PMID:25051153

  9. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction

    PubMed Central

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian

    2017-01-01

    Abstract Motivation: Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. Results: We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Availability and Implementation: Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Contact: deane@stats.ox.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:28453681

  10. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction.

    PubMed

    Marks, Claire; Nowak, Jaroslaw; Klostermann, Stefan; Georges, Guy; Dunbar, James; Shi, Jiye; Kelm, Sebastian; Deane, Charlotte M

    2017-05-01

    Loops are often vital for protein function, however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. deane@stats.ox.ac.uk. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press.

  11. A Hybrid Parallel Strategy Based on String Graph Theory to Improve De Novo DNA Assembly on the TianHe-2 Supercomputer.

    PubMed

    Zhang, Feng; Liao, Xiangke; Peng, Shaoliang; Cui, Yingbo; Wang, Bingqiang; Zhu, Xiaoqian; Liu, Jie

    2016-06-01

    ' The de novo assembly of DNA sequences is increasingly important for biological researches in the genomic era. After more than one decade since the Human Genome Project, some challenges still exist and new solutions are being explored to improve de novo assembly of genomes. String graph assembler (SGA), based on the string graph theory, is a new method/tool developed to address the challenges. In this paper, based on an in-depth analysis of SGA we prove that the SGA-based sequence de novo assembly is an NP-complete problem. According to our analysis, SGA outperforms other similar methods/tools in memory consumption, but costs much more time, of which 60-70 % is spent on the index construction. Upon this analysis, we introduce a hybrid parallel optimization algorithm and implement this algorithm in the TianHe-2's parallel framework. Simulations are performed with different datasets. For data of small size the optimized solution is 3.06 times faster than before, and for data of middle size it's 1.60 times. The results demonstrate an evident performance improvement, with the linear scalability for parallel FM-index construction. This results thus contribute significantly to improving the efficiency of de novo assembly of DNA sequences.

  12. An Effective Massive Sensor Network Data Access Scheme Based on Topology Control for the Internet of Things

    PubMed Central

    Yi, Meng; Chen, Qingkui; Xiong, Neal N.

    2016-01-01

    This paper considers the distributed access and control problem of massive wireless sensor networks’ data access center for the Internet of Things, which is an extension of wireless sensor networks and an element of its topology structure. In the context of the arrival of massive service access requests at a virtual data center, this paper designs a massive sensing data access and control mechanism to improve the access efficiency of service requests and makes full use of the available resources at the data access center for the Internet of things. Firstly, this paper proposes a synergistically distributed buffer access model, which separates the information of resource and location. Secondly, the paper divides the service access requests into multiple virtual groups based on their characteristics and locations using an optimized self-organizing feature map neural network. Furthermore, this paper designs an optimal scheduling algorithm of group migration based on the combination scheme between the artificial bee colony algorithm and chaos searching theory. Finally, the experimental results demonstrate that this mechanism outperforms the existing schemes in terms of enhancing the accessibility of service requests effectively, reducing network delay, and has higher load balancing capacity and higher resource utility rate. PMID:27827878

  13. A computational intelligent approach to multi-factor analysis of violent crime information system

    NASA Astrophysics Data System (ADS)

    Liu, Hongbo; Yang, Chao; Zhang, Meng; McLoone, Seán; Sun, Yeqing

    2017-02-01

    Various scientific studies have explored the causes of violent behaviour from different perspectives, with psychological tests, in particular, applied to the analysis of crime factors. The relationship between bi-factors has also been extensively studied including the link between age and crime. In reality, many factors interact to contribute to criminal behaviour and as such there is a need to have a greater level of insight into its complex nature. In this article we analyse violent crime information systems containing data on psychological, environmental and genetic factors. Our approach combines elements of rough set theory with fuzzy logic and particle swarm optimisation to yield an algorithm and methodology that can effectively extract multi-knowledge from information systems. The experimental results show that our approach outperforms alternative genetic algorithm and dynamic reduct-based techniques for reduct identification and has the added advantage of identifying multiple reducts and hence multi-knowledge (rules). Identified rules are consistent with classical statistical analysis of violent crime data and also reveal new insights into the interaction between several factors. As such, the results are helpful in improving our understanding of the factors contributing to violent crime and in highlighting the existence of hidden and intangible relationships between crime factors.

  14. A Universal 3D Voxel Descriptor for Solid-State Material Informatics with Deep Convolutional Neural Networks.

    PubMed

    Kajita, Seiji; Ohba, Nobuko; Jinnouchi, Ryosuke; Asahi, Ryoji

    2017-12-05

    Material informatics (MI) is a promising approach to liberate us from the time-consuming Edisonian (trial and error) process for material discoveries, driven by machine-learning algorithms. Several descriptors, which are encoded material features to feed computers, were proposed in the last few decades. Especially to solid systems, however, their insufficient representations of three dimensionality of field quantities such as electron distributions and local potentials have critically hindered broad and practical successes of the solid-state MI. We develop a simple, generic 3D voxel descriptor that compacts any field quantities, in such a suitable way to implement convolutional neural networks (CNNs). We examine the 3D voxel descriptor encoded from the electron distribution by a regression test with 680 oxides data. The present scheme outperforms other existing descriptors in the prediction of Hartree energies that are significantly relevant to the long-wavelength distribution of the valence electrons. The results indicate that this scheme can forecast any functionals of field quantities just by learning sufficient amount of data, if there is an explicit correlation between the target properties and field quantities. This 3D descriptor opens a way to import prominent CNNs-based algorithms of supervised, semi-supervised and reinforcement learnings into the solid-state MI.

  15. Fully Automated Segmentation of Fluid/Cyst Regions in Optical Coherence Tomography Images With Diabetic Macular Edema Using Neutrosophic Sets and Graph Algorithms.

    PubMed

    Rashno, Abdolreza; Koozekanani, Dara D; Drayna, Paul M; Nazari, Behzad; Sadri, Saeed; Rabbani, Hossein; Parhi, Keshab K

    2018-05-01

    This paper presents a fully automated algorithm to segment fluid-associated (fluid-filled) and cyst regions in optical coherence tomography (OCT) retina images of subjects with diabetic macular edema. The OCT image is segmented using a novel neutrosophic transformation and a graph-based shortest path method. In neutrosophic domain, an image is transformed into three sets: (true), (indeterminate) that represents noise, and (false). This paper makes four key contributions. First, a new method is introduced to compute the indeterminacy set , and a new -correction operation is introduced to compute the set in neutrosophic domain. Second, a graph shortest-path method is applied in neutrosophic domain to segment the inner limiting membrane and the retinal pigment epithelium as regions of interest (ROI) and outer plexiform layer and inner segment myeloid as middle layers using a novel definition of the edge weights . Third, a new cost function for cluster-based fluid/cyst segmentation in ROI is presented which also includes a novel approach in estimating the number of clusters in an automated manner. Fourth, the final fluid regions are achieved by ignoring very small regions and the regions between middle layers. The proposed method is evaluated using two publicly available datasets: Duke, Optima, and a third local dataset from the UMN clinic which is available online. The proposed algorithm outperforms the previously proposed Duke algorithm by 8% with respect to the dice coefficient and by 5% with respect to precision on the Duke dataset, while achieving about the same sensitivity. Also, the proposed algorithm outperforms a prior method for Optima dataset by 6%, 22%, and 23% with respect to the dice coefficient, sensitivity, and precision, respectively. Finally, the proposed algorithm also achieves sensitivity of 67.3%, 88.8%, and 76.7%, for the Duke, Optima, and the university of minnesota (UMN) datasets, respectively.

  16. HUGO: Hierarchical mUlti-reference Genome cOmpression for aligned reads

    PubMed Central

    Li, Pinghao; Jiang, Xiaoqian; Wang, Shuang; Kim, Jihoon; Xiong, Hongkai; Ohno-Machado, Lucila

    2014-01-01

    Background and objective Short-read sequencing is becoming the standard of practice for the study of structural variants associated with disease. However, with the growth of sequence data largely surpassing reasonable storage capability, the biomedical community is challenged with the management, transfer, archiving, and storage of sequence data. Methods We developed Hierarchical mUlti-reference Genome cOmpression (HUGO), a novel compression algorithm for aligned reads in the sorted Sequence Alignment/Map (SAM) format. We first aligned short reads against a reference genome and stored exactly mapped reads for compression. For the inexact mapped or unmapped reads, we realigned them against different reference genomes using an adaptive scheme by gradually shortening the read length. Regarding the base quality value, we offer lossy and lossless compression mechanisms. The lossy compression mechanism for the base quality values uses k-means clustering, where a user can adjust the balance between decompression quality and compression rate. The lossless compression can be produced by setting k (the number of clusters) to the number of different quality values. Results The proposed method produced a compression ratio in the range 0.5–0.65, which corresponds to 35–50% storage savings based on experimental datasets. The proposed approach achieved 15% more storage savings over CRAM and comparable compression ratio with Samcomp (CRAM and Samcomp are two of the state-of-the-art genome compression algorithms). The software is freely available at https://sourceforge.net/projects/hierachicaldnac/with a General Public License (GPL) license. Limitation Our method requires having different reference genomes and prolongs the execution time for additional alignments. Conclusions The proposed multi-reference-based compression algorithm for aligned reads outperforms existing single-reference based algorithms. PMID:24368726

  17. CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data.

    PubMed

    Fidaner, Işık Barış; Cankorur-Cetinkaya, Ayca; Dikicioglu, Duygu; Kirdar, Betul; Cemgil, Ali Taylan; Oliver, Stephen G

    2016-02-01

    Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2, which applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology. It applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The latter has been developed to address the needs of both specialist and non-specialist users. We use three diverse test cases to demonstrate the flexibility of the proposed strategy. In all cases, CLUSTERnGO not only outperformed existing algorithms in assigning unique GO term enrichments to the identified clusters, but also revealed novel insights regarding the biological systems examined, which were not uncovered in the original publications. The C++ and QT source codes, the GUI applications for Windows, OS X and Linux operating systems and user manual are freely available for download under the GNU GPL v3 license at http://www.cmpe.boun.edu.tr/content/CnG. sgo24@cam.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  18. Optimization-based scatter estimation using primary modulation for computed tomography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Yi; Ma, Jingchen; Zhao, Jun, E-mail: junzhao

    Purpose: Scatter reduces the image quality in computed tomography (CT), but scatter correction remains a challenge. A previously proposed primary modulation method simultaneously obtains the primary and scatter in a single scan. However, separating the scatter and primary in primary modulation is challenging because it is an underdetermined problem. In this study, an optimization-based scatter estimation (OSE) algorithm is proposed to estimate and correct scatter. Methods: In the concept of primary modulation, the primary is modulated, but the scatter remains smooth by inserting a modulator between the x-ray source and the object. In the proposed algorithm, an objective function ismore » designed for separating the scatter and primary. Prior knowledge is incorporated in the optimization-based framework to improve the accuracy of the estimation: (1) the primary is always positive; (2) the primary is locally smooth and the scatter is smooth; (3) the location of penumbra can be determined; and (4) the scatter-contaminated data provide knowledge about which part is smooth. Results: The simulation study shows that the edge-preserving weighting in OSE improves the estimation accuracy near the object boundary. Simulation study also demonstrates that OSE outperforms the two existing primary modulation algorithms for most regions of interest in terms of the CT number accuracy and noise. The proposed method was tested on a clinical cone beam CT, demonstrating that OSE corrects the scatter even when the modulator is not accurately registered. Conclusions: The proposed OSE algorithm improves the robustness and accuracy in scatter estimation and correction. This method is promising for scatter correction of various kinds of x-ray imaging modalities, such as x-ray radiography, cone beam CT, and the fourth-generation CT.« less

  19. CoGI: Towards Compressing Genomes as an Image.

    PubMed

    Xie, Xiaojing; Zhou, Shuigeng; Guan, Jihong

    2015-01-01

    Genomic science is now facing an explosive increase of data thanks to the fast development of sequencing technology. This situation poses serious challenges to genomic data storage and transferring. It is desirable to compress data to reduce storage and transferring cost, and thus to boost data distribution and utilization efficiency. Up to now, a number of algorithms / tools have been developed for compressing genomic sequences. Unlike the existing algorithms, most of which treat genomes as one-dimensional text strings and compress them based on dictionaries or probability models, this paper proposes a novel approach called CoGI (the abbreviation of Compressing Genomes as an Image) for genome compression, which transforms the genomic sequences to a two-dimensional binary image (or bitmap), then applies a rectangular partition coding algorithm to compress the binary image. CoGI can be used as either a reference-based compressor or a reference-free compressor. For the former, we develop two entropy-based algorithms to select a proper reference genome. Performance evaluation is conducted on various genomes. Experimental results show that the reference-based CoGI significantly outperforms two state-of-the-art reference-based genome compressors GReEn and RLZ-opt in both compression ratio and compression efficiency. It also achieves comparable compression ratio but two orders of magnitude higher compression efficiency in comparison with XM--one state-of-the-art reference-free genome compressor. Furthermore, our approach performs much better than Gzip--a general-purpose and widely-used compressor, in both compression speed and compression ratio. So, CoGI can serve as an effective and practical genome compressor. The source code and other related documents of CoGI are available at: http://admis.fudan.edu.cn/projects/cogi.htm.

  20. Gene regulatory network inference from multifactorial perturbation data using both regression and correlation analyses.

    PubMed

    Xiong, Jie; Zhou, Tong

    2012-01-01

    An important problem in systems biology is to reconstruct gene regulatory networks (GRNs) from experimental data and other a priori information. The DREAM project offers some types of experimental data, such as knockout data, knockdown data, time series data, etc. Among them, multifactorial perturbation data are easier and less expensive to obtain than other types of experimental data and are thus more common in practice. In this article, a new algorithm is presented for the inference of GRNs using the DREAM4 multifactorial perturbation data. The GRN inference problem among [Formula: see text] genes is decomposed into [Formula: see text] different regression problems. In each of the regression problems, the expression level of a target gene is predicted solely from the expression level of a potential regulation gene. For different potential regulation genes, different weights for a specific target gene are constructed by using the sum of squared residuals and the Pearson correlation coefficient. Then these weights are normalized to reflect effort differences of regulating distinct genes. By appropriately choosing the parameters of the power law, we constructe a 0-1 integer programming problem. By solving this problem, direct regulation genes for an arbitrary gene can be estimated. And, the normalized weight of a gene is modified, on the basis of the estimation results about the existence of direct regulations to it. These normalized and modified weights are used in queuing the possibility of the existence of a corresponding direct regulation. Computation results with the DREAM4 In Silico Size 100 Multifactorial subchallenge show that estimation performances of the suggested algorithm can even outperform the best team. Using the real data provided by the DREAM5 Network Inference Challenge, estimation performances can be ranked third. Furthermore, the high precision of the obtained most reliable predictions shows the suggested algorithm may be helpful in guiding biological experiment designs.

  1. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing.

    PubMed

    Lim, Hansaim; Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; He, Di; Zhuang, Luke; Meng, Patrick; Xie, Lei

    2016-10-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousands chemicals against 20 thousands proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidences. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP.

  2. Large-Scale Off-Target Identification Using Fast and Accurate Dual Regularized One-Class Collaborative Filtering and Its Application to Drug Repurposing

    PubMed Central

    Poleksic, Aleksandar; Yao, Yuan; Tong, Hanghang; Meng, Patrick; Xie, Lei

    2016-01-01

    Target-based screening is one of the major approaches in drug discovery. Besides the intended target, unexpected drug off-target interactions often occur, and many of them have not been recognized and characterized. The off-target interactions can be responsible for either therapeutic or side effects. Thus, identifying the genome-wide off-targets of lead compounds or existing drugs will be critical for designing effective and safe drugs, and providing new opportunities for drug repurposing. Although many computational methods have been developed to predict drug-target interactions, they are either less accurate than the one that we are proposing here or computationally too intensive, thereby limiting their capability for large-scale off-target identification. In addition, the performances of most machine learning based algorithms have been mainly evaluated to predict off-target interactions in the same gene family for hundreds of chemicals. It is not clear how these algorithms perform in terms of detecting off-targets across gene families on a proteome scale. Here, we are presenting a fast and accurate off-target prediction method, REMAP, which is based on a dual regularized one-class collaborative filtering algorithm, to explore continuous chemical space, protein space, and their interactome on a large scale. When tested in a reliable, extensive, and cross-gene family benchmark, REMAP outperforms the state-of-the-art methods. Furthermore, REMAP is highly scalable. It can screen a dataset of 200 thousands chemicals against 20 thousands proteins within 2 hours. Using the reconstructed genome-wide target profile as the fingerprint of a chemical compound, we predicted that seven FDA-approved drugs can be repurposed as novel anti-cancer therapies. The anti-cancer activity of six of them is supported by experimental evidences. Thus, REMAP is a valuable addition to the existing in silico toolbox for drug target identification, drug repurposing, phenotypic screening, and side effect prediction. The software and benchmark are available at https://github.com/hansaimlim/REMAP. PMID:27716836

  3. Investigations of image fusion

    NASA Astrophysics Data System (ADS)

    Zhang, Zhong

    1999-12-01

    The objective of image fusion is to combine information from multiple images of the same scene. The result of image fusion is a single image which is more suitable for the purpose of human visual perception or further image processing tasks. In this thesis, a region-based fusion algorithm using the wavelet transform is proposed. The identification of important features in each image, such as edges and regions of interest, are used to guide the fusion process. The idea of multiscale grouping is also introduced and a generic image fusion framework based on multiscale decomposition is studied. The framework includes all of the existing multiscale-decomposition- based fusion approaches we found in the literature which did not assume a statistical model for the source images. Comparisons indicate that our framework includes some new approaches which outperform the existing approaches for the cases we consider. Registration must precede our fusion algorithms. So we proposed a hybrid scheme which uses both feature-based and intensity-based methods. The idea of robust estimation of optical flow from time- varying images is employed with a coarse-to-fine multi- resolution approach and feature-based registration to overcome some of the limitations of the intensity-based schemes. Experiments show that this approach is robust and efficient. Assessing image fusion performance in a real application is a complicated issue. In this dissertation, a mixture probability density function model is used in conjunction with the Expectation- Maximization algorithm to model histograms of edge intensity. Some new techniques are proposed for estimating the quality of a noisy image of a natural scene. Such quality measures can be used to guide the fusion. Finally, we study fusion of images obtained from several copies of a new type of camera developed for video surveillance. Our techniques increase the capability and reliability of the surveillance system and provide an easy way to obtain 3-D information of objects in the space monitored by the system.

  4. Fringe pattern demodulation with a two-frame digital phase-locked loop algorithm.

    PubMed

    Gdeisat, Munther A; Burton, David R; Lalor, Michael J

    2002-09-10

    A novel technique called a two-frame digital phase-locked loop for fringe pattern demodulation is presented. In this scheme, two fringe patterns with different spatial carrier frequencies are grabbed for an object. A digital phase-locked loop algorithm tracks and demodulates the phase difference between both fringe patterns by employing the wrapped phase components of one of the fringe patterns as a reference to demodulate the second fringe pattern. The desired phase information can be extracted from the demodulated phase difference. We tested the algorithm experimentally using real fringe patterns. The technique is shown to be suitable for noncontact measurement of objects with rapid surface variations, and it outperforms the Fourier fringe analysis technique in this aspect. Phase maps produced withthis algorithm are noisy in comparison with phase maps generated with the Fourier fringe analysis technique.

  5. RCQ-GA: RDF Chain Query Optimization Using Genetic Algorithms

    NASA Astrophysics Data System (ADS)

    Hogenboom, Alexander; Milea, Viorel; Frasincar, Flavius; Kaymak, Uzay

    The application of Semantic Web technologies in an Electronic Commerce environment implies a need for good support tools. Fast query engines are needed for efficient querying of large amounts of data, usually represented using RDF. We focus on optimizing a special class of SPARQL queries, the so-called RDF chain queries. For this purpose, we devise a genetic algorithm called RCQ-GA that determines the order in which joins need to be performed for an efficient evaluation of RDF chain queries. The approach is benchmarked against a two-phase optimization algorithm, previously proposed in literature. The more complex a query is, the more RCQ-GA outperforms the benchmark in solution quality, execution time needed, and consistency of solution quality. When the algorithms are constrained by a time limit, the overall performance of RCQ-GA compared to the benchmark further improves.

  6. Design and analysis of a model predictive controller for active queue management.

    PubMed

    Wang, Ping; Chen, Hong; Yang, Xiaoping; Ma, Yan

    2012-01-01

    Model predictive (MP) control as a novel active queue management (AQM) algorithm in dynamic computer networks is proposed. According to the predicted future queue length in the data buffer, early packets at the router are dropped reasonably by the MPAQM controller so that the queue length reaches the desired value with minimal tracking error. The drop probability is obtained by optimizing the network performance. Further, randomized algorithms are applied to analyze the robustness of MPAQM successfully, and also to provide the stability domain of systems with uncertain network parameters. The performances of MPAQM are evaluated through a series of simulations in NS2. The simulation results show that the MPAQM algorithm outperforms RED, PI, and REM algorithms in terms of stability, disturbance rejection, and robustness. Copyright © 2011 ISA. Published by Elsevier Ltd. All rights reserved.

  7. Insight into efficient image registration techniques and the demons algorithm.

    PubMed

    Vercauteren, Tom; Pennec, Xavier; Malis, Ezio; Perchant, Aymeric; Ayache, Nicholas

    2007-01-01

    As image registration becomes more and more central to many biomedical imaging applications, the efficiency of the algorithms becomes a key issue. Image registration is classically performed by optimizing a similarity criterion over a given spatial transformation space. Even if this problem is considered as almost solved for linear registration, we show in this paper that some tools that have recently been developed in the field of vision-based robot control can outperform classical solutions. The adequacy of these tools for linear image registration leads us to revisit non-linear registration and allows us to provide interesting theoretical roots to the different variants of Thirion's demons algorithm. This analysis predicts a theoretical advantage to the symmetric forces variant of the demons algorithm. We show that, on controlled experiments, this advantage is confirmed, and yields a faster convergence.

  8. Personalized recommendation based on unbiased consistence

    NASA Astrophysics Data System (ADS)

    Zhu, Xuzhen; Tian, Hui; Zhang, Ping; Hu, Zheng; Zhou, Tao

    2015-08-01

    Recently, in physical dynamics, mass-diffusion-based recommendation algorithms on bipartite network provide an efficient solution by automatically pushing possible relevant items to users according to their past preferences. However, traditional mass-diffusion-based algorithms just focus on unidirectional mass diffusion from objects having been collected to those which should be recommended, resulting in a biased causal similarity estimation and not-so-good performance. In this letter, we argue that in many cases, a user's interests are stable, and thus bidirectional mass diffusion abilities, no matter originated from objects having been collected or from those which should be recommended, should be consistently powerful, showing unbiased consistence. We further propose a consistence-based mass diffusion algorithm via bidirectional diffusion against biased causality, outperforming the state-of-the-art recommendation algorithms in disparate real data sets, including Netflix, MovieLens, Amazon and Rate Your Music.

  9. Shape-Driven 3D Segmentation Using Spherical Wavelets

    PubMed Central

    Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen

    2013-01-01

    This paper presents a novel active surface segmentation algorithm using a multiscale shape representation and prior. We define a parametric model of a surface using spherical wavelet functions and learn a prior probability distribution over the wavelet coefficients to model shape variations at different scales and spatial locations in a training set. Based on this representation, we derive a parametric active surface evolution using the multiscale prior coefficients as parameters for our optimization procedure to naturally include the prior in the segmentation framework. Additionally, the optimization method can be applied in a coarse-to-fine manner. We apply our algorithm to the segmentation of brain caudate nucleus, of interest in the study of schizophrenia. Our validation shows our algorithm is computationally efficient and outperforms the Active Shape Model algorithm by capturing finer shape details. PMID:17354875

  10. Active mask segmentation of fluorescence microscope images.

    PubMed

    Srinivasa, Gowri; Fickus, Matthew C; Guo, Yusong; Linstedt, Adam D; Kovacević, Jelena

    2009-08-01

    We propose a new active mask algorithm for the segmentation of fluorescence microscope images of punctate patterns. It combines the (a) flexibility offered by active-contour methods, (b) speed offered by multiresolution methods, (c) smoothing offered by multiscale methods, and (d) statistical modeling offered by region-growing methods into a fast and accurate segmentation tool. The framework moves from the idea of the "contour" to that of "inside and outside," or masks, allowing for easy multidimensional segmentation. It adapts to the topology of the image through the use of multiple masks. The algorithm is almost invariant under initialization, allowing for random initialization, and uses a few easily tunable parameters. Experiments show that the active mask algorithm matches the ground truth well and outperforms the algorithm widely used in fluorescence microscopy, seeded watershed, both qualitatively, as well as quantitatively.

  11. Two-Swim Operators in the Modified Bacterial Foraging Algorithm for the Optimal Synthesis of Four-Bar Mechanisms

    PubMed Central

    Hernández-Ocaña, Betania; Pozos-Parra, Ma. Del Pilar; Mezura-Montes, Efrén; Portilla-Flores, Edgar Alfredo; Vega-Alvarado, Eduardo; Calva-Yáñez, Maria Bárbara

    2016-01-01

    This paper presents two-swim operators to be added to the chemotaxis process of the modified bacterial foraging optimization algorithm to solve three instances of the synthesis of four-bar planar mechanisms. One swim favors exploration while the second one promotes fine movements in the neighborhood of each bacterium. The combined effect of the new operators looks to increase the production of better solutions during the search. As a consequence, the ability of the algorithm to escape from local optimum solutions is enhanced. The algorithm is tested through four experiments and its results are compared against two BFOA-based algorithms and also against a differential evolution algorithm designed for mechanical design problems. The overall results indicate that the proposed algorithm outperforms other BFOA-based approaches and finds highly competitive mechanisms, with a single set of parameter values and with less evaluations in the first synthesis problem, with respect to those mechanisms obtained by the differential evolution algorithm, which needed a parameter fine-tuning process for each optimization problem. PMID:27057156

  12. Two-Swim Operators in the Modified Bacterial Foraging Algorithm for the Optimal Synthesis of Four-Bar Mechanisms.

    PubMed

    Hernández-Ocaña, Betania; Pozos-Parra, Ma Del Pilar; Mezura-Montes, Efrén; Portilla-Flores, Edgar Alfredo; Vega-Alvarado, Eduardo; Calva-Yáñez, Maria Bárbara

    2016-01-01

    This paper presents two-swim operators to be added to the chemotaxis process of the modified bacterial foraging optimization algorithm to solve three instances of the synthesis of four-bar planar mechanisms. One swim favors exploration while the second one promotes fine movements in the neighborhood of each bacterium. The combined effect of the new operators looks to increase the production of better solutions during the search. As a consequence, the ability of the algorithm to escape from local optimum solutions is enhanced. The algorithm is tested through four experiments and its results are compared against two BFOA-based algorithms and also against a differential evolution algorithm designed for mechanical design problems. The overall results indicate that the proposed algorithm outperforms other BFOA-based approaches and finds highly competitive mechanisms, with a single set of parameter values and with less evaluations in the first synthesis problem, with respect to those mechanisms obtained by the differential evolution algorithm, which needed a parameter fine-tuning process for each optimization problem.

  13. MAFsnp: A Multi-Sample Accurate and Flexible SNP Caller Using Next-Generation Sequencing Data

    PubMed Central

    Hu, Jiyuan; Li, Tengfei; Xiu, Zidi; Zhang, Hong

    2015-01-01

    Most existing statistical methods developed for calling single nucleotide polymorphisms (SNPs) using next-generation sequencing (NGS) data are based on Bayesian frameworks, and there does not exist any SNP caller that produces p-values for calling SNPs in a frequentist framework. To fill in this gap, we develop a new method MAFsnp, a Multiple-sample based Accurate and Flexible algorithm for calling SNPs with NGS data. MAFsnp is based on an estimated likelihood ratio test (eLRT) statistic. In practical situation, the involved parameter is very close to the boundary of the parametric space, so the standard large sample property is not suitable to evaluate the finite-sample distribution of the eLRT statistic. Observing that the distribution of the test statistic is a mixture of zero and a continuous part, we propose to model the test statistic with a novel two-parameter mixture distribution. Once the parameters in the mixture distribution are estimated, p-values can be easily calculated for detecting SNPs, and the multiple-testing corrected p-values can be used to control false discovery rate (FDR) at any pre-specified level. With simulated data, MAFsnp is shown to have much better control of FDR than the existing SNP callers. Through the application to two real datasets, MAFsnp is also shown to outperform the existing SNP callers in terms of calling accuracy. An R package “MAFsnp” implementing the new SNP caller is freely available at http://homepage.fudan.edu.cn/zhangh/softwares/. PMID:26309201

  14. A comparison of three-dimensional nonequilibrium solution algorithms applied to hypersonic flows with stiff chemical source terms

    NASA Technical Reports Server (NTRS)

    Palmer, Grant; Venkatapathy, Ethiraj

    1993-01-01

    Three solution algorithms, explicit underrelaxation, point implicit, and lower upper symmetric Gauss-Seidel (LUSGS), are used to compute nonequilibrium flow around the Apollo 4 return capsule at 62 km altitude. By varying the Mach number, the efficiency and robustness of the solution algorithms were tested for different levels of chemical stiffness. The performance of the solution algorithms degraded as the Mach number and stiffness of the flow increased. At Mach 15, 23, and 30, the LUSGS method produces an eight order of magnitude drop in the L2 norm of the energy residual in 1/3 to 1/2 the Cray C-90 computer time as compared to the point implicit and explicit under-relaxation methods. The explicit under-relaxation algorithm experienced convergence difficulties at Mach 23 and above. At Mach 40 the performance of the LUSGS algorithm deteriorates to the point it is out-performed by the point implicit method. The effects of the viscous terms are investigated. Grid dependency questions are explored.

  15. Effects of high-order correlations on personalized recommendations for bipartite networks

    NASA Astrophysics Data System (ADS)

    Liu, Jian-Guo; Zhou, Tao; Che, Hong-An; Wang, Bing-Hong; Zhang, Yi-Cheng

    2010-02-01

    In this paper, we introduce a modified collaborative filtering (MCF) algorithm, which has remarkably higher accuracy than the standard collaborative filtering. In the MCF, instead of the cosine similarity index, the user-user correlations are obtained by a diffusion process. Furthermore, by considering the second-order correlations, we design an effective algorithm that depresses the influence of mainstream preferences. Simulation results show that the algorithmic accuracy, measured by the average ranking score, is further improved by 20.45% and 33.25% in the optimal cases of MovieLens and Netflix data. More importantly, the optimal value λ depends approximately monotonously on the sparsity of the training set. Given a real system, we could estimate the optimal parameter according to the data sparsity, which makes this algorithm easy to be applied. In addition, two significant criteria of algorithmic performance, diversity and popularity, are also taken into account. Numerical results show that as the sparsity increases, the algorithm considering the second-order correlation can outperform the MCF simultaneously in all three criteria.

  16. Sort-Mid tasks scheduling algorithm in grid computing.

    PubMed

    Reda, Naglaa M; Tawfik, A; Marzok, Mohamed A; Khamis, Soheir M

    2015-11-01

    Scheduling tasks on heterogeneous resources distributed over a grid computing system is an NP-complete problem. The main aim for several researchers is to develop variant scheduling algorithms for achieving optimality, and they have shown a good performance for tasks scheduling regarding resources selection. However, using of the full power of resources is still a challenge. In this paper, a new heuristic algorithm called Sort-Mid is proposed. It aims to maximizing the utilization and minimizing the makespan. The new strategy of Sort-Mid algorithm is to find appropriate resources. The base step is to get the average value via sorting list of completion time of each task. Then, the maximum average is obtained. Finally, the task has the maximum average is allocated to the machine that has the minimum completion time. The allocated task is deleted and then, these steps are repeated until all tasks are allocated. Experimental tests show that the proposed algorithm outperforms almost other algorithms in terms of resources utilization and makespan.

  17. An Effective Hybrid Cuckoo Search Algorithm with Improved Shuffled Frog Leaping Algorithm for 0-1 Knapsack Problems

    PubMed Central

    Wang, Gai-Ge; Feng, Qingjiang; Zhao, Xiang-Jun

    2014-01-01

    An effective hybrid cuckoo search algorithm (CS) with improved shuffled frog-leaping algorithm (ISFLA) is put forward for solving 0-1 knapsack problem. First of all, with the framework of SFLA, an improved frog-leap operator is designed with the effect of the global optimal information on the frog leaping and information exchange between frog individuals combined with genetic mutation with a small probability. Subsequently, in order to improve the convergence speed and enhance the exploitation ability, a novel CS model is proposed with considering the specific advantages of Lévy flights and frog-leap operator. Furthermore, the greedy transform method is used to repair the infeasible solution and optimize the feasible solution. Finally, numerical simulations are carried out on six different types of 0-1 knapsack instances, and the comparative results have shown the effectiveness of the proposed algorithm and its ability to achieve good quality solutions, which outperforms the binary cuckoo search, the binary differential evolution, and the genetic algorithm. PMID:25404940

  18. Sort-Mid tasks scheduling algorithm in grid computing

    PubMed Central

    Reda, Naglaa M.; Tawfik, A.; Marzok, Mohamed A.; Khamis, Soheir M.

    2014-01-01

    Scheduling tasks on heterogeneous resources distributed over a grid computing system is an NP-complete problem. The main aim for several researchers is to develop variant scheduling algorithms for achieving optimality, and they have shown a good performance for tasks scheduling regarding resources selection. However, using of the full power of resources is still a challenge. In this paper, a new heuristic algorithm called Sort-Mid is proposed. It aims to maximizing the utilization and minimizing the makespan. The new strategy of Sort-Mid algorithm is to find appropriate resources. The base step is to get the average value via sorting list of completion time of each task. Then, the maximum average is obtained. Finally, the task has the maximum average is allocated to the machine that has the minimum completion time. The allocated task is deleted and then, these steps are repeated until all tasks are allocated. Experimental tests show that the proposed algorithm outperforms almost other algorithms in terms of resources utilization and makespan. PMID:26644937

  19. Monte Carlo sampling in diffusive dynamical systems

    NASA Astrophysics Data System (ADS)

    Tapias, Diego; Sanders, David P.; Altmann, Eduardo G.

    2018-05-01

    We introduce a Monte Carlo algorithm to efficiently compute transport properties of chaotic dynamical systems. Our method exploits the importance sampling technique that favors trajectories in the tail of the distribution of displacements, where deviations from a diffusive process are most prominent. We search for initial conditions using a proposal that correlates states in the Markov chain constructed via a Metropolis-Hastings algorithm. We show that our method outperforms the direct sampling method and also Metropolis-Hastings methods with alternative proposals. We test our general method through numerical simulations in 1D (box-map) and 2D (Lorentz gas) systems.

  20. Parallelized seeded region growing using CUDA.

    PubMed

    Park, Seongjin; Lee, Jeongjin; Lee, Hyunna; Shin, Juneseuk; Seo, Jinwook; Lee, Kyoung Ho; Shin, Yeong-Gil; Kim, Bohyoung

    2014-01-01

    This paper presents a novel method for parallelizing the seeded region growing (SRG) algorithm using Compute Unified Device Architecture (CUDA) technology, with intention to overcome the theoretical weakness of SRG algorithm of its computation time being directly proportional to the size of a segmented region. The segmentation performance of the proposed CUDA-based SRG is compared with SRG implementations on single-core CPUs, quad-core CPUs, and shader language programming, using synthetic datasets and 20 body CT scans. Based on the experimental results, the CUDA-based SRG outperforms the other three implementations, advocating that it can substantially assist the segmentation during massive CT screening tests.

  1. Improved Adaptive LSB Steganography Based on Chaos and Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Yu, Lifang; Zhao, Yao; Ni, Rongrong; Li, Ting

    2010-12-01

    We propose a novel steganographic method in JPEG images with high performance. Firstly, we propose improved adaptive LSB steganography, which can achieve high capacity while preserving the first-order statistics. Secondly, in order to minimize visual degradation of the stego image, we shuffle bits-order of the message based on chaos whose parameters are selected by the genetic algorithm. Shuffling message's bits-order provides us with a new way to improve the performance of steganography. Experimental results show that our method outperforms classical steganographic methods in image quality, while preserving characteristics of histogram and providing high capacity.

  2. Quantum machine learning.

    PubMed

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-13

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  3. Quantum machine learning

    NASA Astrophysics Data System (ADS)

    Biamonte, Jacob; Wittek, Peter; Pancotti, Nicola; Rebentrost, Patrick; Wiebe, Nathan; Lloyd, Seth

    2017-09-01

    Fuelled by increasing computer power and algorithmic advances, machine learning techniques have become powerful tools for finding patterns in data. Quantum systems produce atypical patterns that classical systems are thought not to produce efficiently, so it is reasonable to postulate that quantum computers may outperform classical computers on machine learning tasks. The field of quantum machine learning explores how to devise and implement quantum software that could enable machine learning that is faster than that of classical computers. Recent work has produced quantum algorithms that could act as the building blocks of machine learning programs, but the hardware and software challenges are still considerable.

  4. Hybrid glowworm swarm optimization for task scheduling in the cloud environment

    NASA Astrophysics Data System (ADS)

    Zhou, Jing; Dong, Shoubin

    2018-06-01

    In recent years many heuristic algorithms have been proposed to solve task scheduling problems in the cloud environment owing to their optimization capability. This article proposes a hybrid glowworm swarm optimization (HGSO) based on glowworm swarm optimization (GSO), which uses a technique of evolutionary computation, a strategy of quantum behaviour based on the principle of neighbourhood, offspring production and random walk, to achieve more efficient scheduling with reasonable scheduling costs. The proposed HGSO reduces the redundant computation and the dependence on the initialization of GSO, accelerates the convergence and more easily escapes from local optima. The conducted experiments and statistical analysis showed that in most cases the proposed HGSO algorithm outperformed previous heuristic algorithms to deal with independent tasks.

  5. Enhancing the performance of MOEAs: an experimental presentation of a new fitness guided mutation operator

    NASA Astrophysics Data System (ADS)

    Liagkouras, K.; Metaxiotis, K.

    2017-01-01

    Multi-objective evolutionary algorithms (MOEAs) are currently a dynamic field of research that has attracted considerable attention. Mutation operators have been utilized by MOEAs as variation mechanisms. In particular, polynomial mutation (PLM) is one of the most popular variation mechanisms and has been utilized by many well-known MOEAs. In this paper, we revisit the PLM operator and we propose a fitness-guided version of the PLM. Experimental results obtained by non-dominated sorting genetic algorithm II and strength Pareto evolutionary algorithm 2 show that the proposed fitness-guided mutation operator outperforms the classical PLM operator, based on different performance metrics that evaluate both the proximity of the solutions to the Pareto front and their dispersion on it.

  6. SLIDER: a generic metaheuristic for the discovery of correlated motifs in protein-protein interaction networks.

    PubMed

    Boyen, Peter; Van Dyck, Dries; Neven, Frank; van Ham, Roeland C H J; van Dijk, Aalt D J

    2011-01-01

    Correlated motif mining (cmm) is the problem of finding overrepresented pairs of patterns, called motifs, in sequences of interacting proteins. Algorithmic solutions for cmm thereby provide a computational method for predicting binding sites for protein interaction. In this paper, we adopt a motif-driven approach where the support of candidate motif pairs is evaluated in the network. We experimentally establish the superiority of the Chi-square-based support measure over other support measures. Furthermore, we obtain that cmm is an np-hard problem for a large class of support measures (including Chi-square) and reformulate the search for correlated motifs as a combinatorial optimization problem. We then present the generic metaheuristic slider which uses steepest ascent with a neighborhood function based on sliding motifs and employs the Chi-square-based support measure. We show that slider outperforms existing motif-driven cmm methods and scales to large protein-protein interaction networks. The slider-implementation and the data used in the experiments are available on http://bioinformatics.uhasselt.be.

  7. Influence Function Learning in Information Diffusion Networks.

    PubMed

    Du, Nan; Liang, Yingyu; Balcan, Maria-Florina; Song, Le

    2014-06-01

    Can we learn the influence of a set of people in a social network from cascades of information diffusion? This question is often addressed by a two-stage approach: first learn a diffusion model, and then calculate the influence based on the learned model. Thus, the success of this approach relies heavily on the correctness of the diffusion model which is hard to verify for real world data. In this paper, we exploit the insight that the influence functions in many diffusion models are coverage functions, and propose a novel parameterization of such functions using a convex combination of random basis functions. Moreover, we propose an efficient maximum likelihood based algorithm to learn such functions directly from cascade data, and hence bypass the need to specify a particular diffusion model in advance. We provide both theoretical and empirical analysis for our approach, showing that the proposed approach can provably learn the influence function with low sample complexity, be robust to the unknown diffusion models, and significantly outperform existing approaches in both synthetic and real world data.

  8. Dual-stage deep learning framework for pigment epithelium detachment segmentation in polypoidal choroidal vasculopathy

    PubMed Central

    Xu, Yupeng; Yan, Ke; Kim, Jinman; Wang, Xiuying; Li, Changyang; Su, Li; Yu, Suqin; Xu, Xun; Feng, Dagan David

    2017-01-01

    Worldwide, polypoidal choroidal vasculopathy (PCV) is a common vision-threatening exudative maculopathy, and pigment epithelium detachment (PED) is an important clinical characteristic. Thus, precise and efficient PED segmentation is necessary for PCV clinical diagnosis and treatment. We propose a dual-stage learning framework via deep neural networks (DNN) for automated PED segmentation in PCV patients to avoid issues associated with manual PED segmentation (subjectivity, manual segmentation errors, and high time consumption).The optical coherence tomography scans of fifty patients were quantitatively evaluated with different algorithms and clinicians. Dual-stage DNN outperformed existing PED segmentation methods for all segmentation accuracy parameters, including true positive volume fraction (85.74 ± 8.69%), dice similarity coefficient (85.69 ± 8.08%), positive predictive value (86.02 ± 8.99%) and false positive volume fraction (0.38 ± 0.18%). Dual-stage DNN achieves accurate PED quantitative information, works with multiple types of PEDs and agrees well with manual delineation, suggesting that it is a potential automated assistant for PCV management. PMID:28966847

  9. Dual-stage deep learning framework for pigment epithelium detachment segmentation in polypoidal choroidal vasculopathy.

    PubMed

    Xu, Yupeng; Yan, Ke; Kim, Jinman; Wang, Xiuying; Li, Changyang; Su, Li; Yu, Suqin; Xu, Xun; Feng, Dagan David

    2017-09-01

    Worldwide, polypoidal choroidal vasculopathy (PCV) is a common vision-threatening exudative maculopathy, and pigment epithelium detachment (PED) is an important clinical characteristic. Thus, precise and efficient PED segmentation is necessary for PCV clinical diagnosis and treatment. We propose a dual-stage learning framework via deep neural networks (DNN) for automated PED segmentation in PCV patients to avoid issues associated with manual PED segmentation (subjectivity, manual segmentation errors, and high time consumption).The optical coherence tomography scans of fifty patients were quantitatively evaluated with different algorithms and clinicians. Dual-stage DNN outperformed existing PED segmentation methods for all segmentation accuracy parameters, including true positive volume fraction (85.74 ± 8.69%), dice similarity coefficient (85.69 ± 8.08%), positive predictive value (86.02 ± 8.99%) and false positive volume fraction (0.38 ± 0.18%). Dual-stage DNN achieves accurate PED quantitative information, works with multiple types of PEDs and agrees well with manual delineation, suggesting that it is a potential automated assistant for PCV management.

  10. Clustering Categorical Data Using Community Detection Techniques

    PubMed Central

    2017-01-01

    With the advent of the k-modes algorithm, the toolbox for clustering categorical data has an efficient tool that scales linearly in the number of data items. However, random initialization of cluster centers in k-modes makes it hard to reach a good clustering without resorting to many trials. Recently proposed methods for better initialization are deterministic and reduce the clustering cost considerably. A variety of initialization methods differ in how the heuristics chooses the set of initial centers. In this paper, we address the clustering problem for categorical data from the perspective of community detection. Instead of initializing k modes and running several iterations, our scheme, CD-Clustering, builds an unweighted graph and detects highly cohesive groups of nodes using a fast community detection technique. The top-k detected communities by size will define the k modes. Evaluation on ten real categorical datasets shows that our method outperforms the existing initialization methods for k-modes in terms of accuracy, precision, and recall in most of the cases. PMID:29430249

  11. Transition probabilities for general birth-death processes with applications in ecology, genetics, and evolution

    PubMed Central

    Crawford, Forrest W.; Suchard, Marc A.

    2011-01-01

    A birth-death process is a continuous-time Markov chain that counts the number of particles in a system over time. In the general process with n current particles, a new particle is born with instantaneous rate λn and a particle dies with instantaneous rate μn. Currently no robust and efficient method exists to evaluate the finite-time transition probabilities in a general birth-death process with arbitrary birth and death rates. In this paper, we first revisit the theory of continued fractions to obtain expressions for the Laplace transforms of these transition probabilities and make explicit an important derivation connecting transition probabilities and continued fractions. We then develop an efficient algorithm for computing these probabilities that analyzes the error associated with approximations in the method. We demonstrate that this error-controlled method agrees with known solutions and outperforms previous approaches to computing these probabilities. Finally, we apply our novel method to several important problems in ecology, evolution, and genetics. PMID:21984359

  12. Fourier-Mellin moment-based intertwining map for image encryption

    NASA Astrophysics Data System (ADS)

    Kaur, Manjit; Kumar, Vijay

    2018-03-01

    In this paper, a robust image encryption technique that utilizes Fourier-Mellin moments and intertwining logistic map is proposed. Fourier-Mellin moment-based intertwining logistic map has been designed to overcome the issue of low sensitivity of an input image. Multi-objective Non-Dominated Sorting Genetic Algorithm (NSGA-II) based on Reinforcement Learning (MNSGA-RL) has been used to optimize the required parameters of intertwining logistic map. Fourier-Mellin moments are used to make the secret keys more secure. Thereafter, permutation and diffusion operations are carried out on input image using secret keys. The performance of proposed image encryption technique has been evaluated on five well-known benchmark images and also compared with seven well-known existing encryption techniques. The experimental results reveal that the proposed technique outperforms others in terms of entropy, correlation analysis, a unified average changing intensity and the number of changing pixel rate. The simulation results reveal that the proposed technique provides high level of security and robustness against various types of attacks.

  13. Calibrated Multivariate Regression with Application to Neural Semantic Basis Discovery.

    PubMed

    Liu, Han; Wang, Lie; Zhao, Tuo

    2015-08-01

    We propose a calibrated multivariate regression method named CMR for fitting high dimensional multivariate regression models. Compared with existing methods, CMR calibrates regularization for each regression task with respect to its noise level so that it simultaneously attains improved finite-sample performance and tuning insensitiveness. Theoretically, we provide sufficient conditions under which CMR achieves the optimal rate of convergence in parameter estimation. Computationally, we propose an efficient smoothed proximal gradient algorithm with a worst-case numerical rate of convergence O (1/ ϵ ), where ϵ is a pre-specified accuracy of the objective function value. We conduct thorough numerical simulations to illustrate that CMR consistently outperforms other high dimensional multivariate regression methods. We also apply CMR to solve a brain activity prediction problem and find that it is as competitive as a handcrafted model created by human experts. The R package camel implementing the proposed method is available on the Comprehensive R Archive Network http://cran.r-project.org/web/packages/camel/.

  14. Robustness study of the pseudo open-loop controller for multiconjugate adaptive optics.

    PubMed

    Piatrou, Piotr; Gilles, Luc

    2005-02-20

    Robustness of the recently proposed "pseudo open-loop control" algorithm against various system errors has been investigated for the representative example of the Gemini-South 8-m telescope multiconjugate adaptive-optics system. The existing model to represent the adaptive-optics system with pseudo open-loop control has been modified to account for misalignments, noise and calibration errors in deformable mirrors, and wave-front sensors. Comparison with the conventional least-squares control model has been done. We show with the aid of both transfer-function pole-placement analysis and Monte Carlo simulations that POLC remains remarkably stable and robust against very large levels of system errors and outperforms in this respect least-squares control. Approximate stability margins as well as performance metrics such as Strehl ratios and rms wave-front residuals averaged over a 1-arc min field of view have been computed for different types and levels of system errors to quantify the expected performance degradation.

  15. e-DMDAV: A new privacy preserving algorithm for wearable enterprise information systems

    NASA Astrophysics Data System (ADS)

    Zhang, Zhenjiang; Wang, Xiaoni; Uden, Lorna; Zhang, Peng; Zhao, Yingsi

    2018-04-01

    Wearable devices have been widely used in many fields to improve the quality of people's lives. More and more data on individuals and businesses are collected by statistical organizations though those devices. Almost all of this data holds confidential information. Statistical Disclosure Control (SDC) seeks to protect statistical data in such a way that it can be released without giving away confidential information that can be linked to specific individuals or entities. The MDAV (Maximum Distance to Average Vector) algorithm is an efficient micro-aggregation algorithm belonging to SDC. However, the MDAV algorithm cannot survive homogeneity and background knowledge attacks because it was designed for static numerical data. This paper proposes a systematic dynamic-updating anonymity algorithm based on MDAV called the e-DMDAV algorithm. This algorithm introduces a new parameter and a table to ensure that the k records in one cluster with the range of the distinct values in each cluster is no less than e for numerical and non-numerical datasets. This new algorithm has been evaluated and compared with the MDAV algorithm. The simulation results show that the new algorithm outperforms MDAV in terms of minimizing distortion and disclosure risk with a similar computational cost.

  16. Efficient parallel and out of core algorithms for constructing large bi-directed de Bruijn graphs.

    PubMed

    Kundeti, Vamsi K; Rajasekaran, Sanguthevar; Dinh, Hieu; Vaughn, Matthew; Thapar, Vishal

    2010-11-15

    Assembling genomic sequences from a set of overlapping reads is one of the most fundamental problems in computational biology. Algorithms addressing the assembly problem fall into two broad categories - based on the data structures which they employ. The first class uses an overlap/string graph and the second type uses a de Bruijn graph. However with the recent advances in short read sequencing technology, de Bruijn graph based algorithms seem to play a vital role in practice. Efficient algorithms for building these massive de Bruijn graphs are very essential in large sequencing projects based on short reads. In an earlier work, an O(n/p) time parallel algorithm has been given for this problem. Here n is the size of the input and p is the number of processors. This algorithm enumerates all possible bi-directed edges which can overlap with a node and ends up generating Θ(nΣ) messages (Σ being the size of the alphabet). In this paper we present a Θ(n/p) time parallel algorithm with a communication complexity that is equal to that of parallel sorting and is not sensitive to Σ. The generality of our algorithm makes it very easy to extend it even to the out-of-core model and in this case it has an optimal I/O complexity of Θ(nlog(n/B)Blog(M/B)) (M being the main memory size and B being the size of the disk block). We demonstrate the scalability of our parallel algorithm on a SGI/Altix computer. A comparison of our algorithm with the previous approaches reveals that our algorithm is faster--both asymptotically and practically. We demonstrate the scalability of our sequential out-of-core algorithm by comparing it with the algorithm used by VELVET to build the bi-directed de Bruijn graph. Our experiments reveal that our algorithm can build the graph with a constant amount of memory, which clearly outperforms VELVET. We also provide efficient algorithms for the bi-directed chain compaction problem. The bi-directed de Bruijn graph is a fundamental data structure for any sequence assembly program based on Eulerian approach. Our algorithms for constructing Bi-directed de Bruijn graphs are efficient in parallel and out of core settings. These algorithms can be used in building large scale bi-directed de Bruijn graphs. Furthermore, our algorithms do not employ any all-to-all communications in a parallel setting and perform better than the prior algorithms. Finally our out-of-core algorithm is extremely memory efficient and can replace the existing graph construction algorithm in VELVET.

  17. Using Collective Intelligence to Route Internet Traffic

    NASA Technical Reports Server (NTRS)

    Wolpert, David H.; Tumer, Kagan; Frank, Jeremy

    1998-01-01

    A Collective Intelligence (COIN) is a community of interacting reinforcement learning (RL) algorithms designed so that their collective behavior maximizes a global utility function. We introduce the theory of COINs, then present experiments using that theory to design COINs to control internet traffic routing. These experiments indicate that COINs outperform previous RL-based systems for such routing that have previously been investigated.

  18. Algorithmic Management for Improving Collective Productivity in Crowdsourcing.

    PubMed

    Yu, Han; Miao, Chunyan; Chen, Yiqiang; Fauvel, Simon; Li, Xiaoming; Lesser, Victor R

    2017-10-02

    Crowdsourcing systems are complex not only because of the huge number of potential strategies for assigning workers to tasks, but also due to the dynamic characteristics associated with workers. Maximizing social welfare in such situations is known to be NP-hard. To address these fundamental challenges, we propose the surprise-minimization-value-maximization (SMVM) approach. By analysing typical crowdsourcing system dynamics, we established a simple and novel worker desirability index (WDI) jointly considering the effect of each worker's reputation, workload and motivation to work on collective productivity. Through evaluating workers' WDI values, SMVM influences individual workers in real time about courses of action which can benefit the workers and lead to high collective productivity. Solutions can be produced in polynomial time and are proven to be asymptotically bounded by a theoretical optimal solution. High resolution simulations based on a real-world dataset demonstrate that SMVM significantly outperforms state-of-the-art approaches. A large-scale 3-year empirical study involving 1,144 participants in over 9,000 sessions shows that SMVM outperforms human task delegation decisions over 80% of the time under common workload conditions. The approach and results can help engineer highly scalable data-driven algorithmic management decision support systems for crowdsourcing.

  19. Bearing Fault Diagnosis Based on Statistical Locally Linear Embedding

    PubMed Central

    Wang, Xiang; Zheng, Yuan; Zhao, Zhenzhou; Wang, Jinping

    2015-01-01

    Fault diagnosis is essentially a kind of pattern recognition. The measured signal samples usually distribute on nonlinear low-dimensional manifolds embedded in the high-dimensional signal space, so how to implement feature extraction, dimensionality reduction and improve recognition performance is a crucial task. In this paper a novel machinery fault diagnosis approach based on a statistical locally linear embedding (S-LLE) algorithm which is an extension of LLE by exploiting the fault class label information is proposed. The fault diagnosis approach first extracts the intrinsic manifold features from the high-dimensional feature vectors which are obtained from vibration signals that feature extraction by time-domain, frequency-domain and empirical mode decomposition (EMD), and then translates the complex mode space into a salient low-dimensional feature space by the manifold learning algorithm S-LLE, which outperforms other feature reduction methods such as PCA, LDA and LLE. Finally in the feature reduction space pattern classification and fault diagnosis by classifier are carried out easily and rapidly. Rolling bearing fault signals are used to validate the proposed fault diagnosis approach. The results indicate that the proposed approach obviously improves the classification performance of fault pattern recognition and outperforms the other traditional approaches. PMID:26153771

  20. Risk-Seeking Versus Risk-Avoiding Investments in Noisy Periodic Environments

    NASA Astrophysics Data System (ADS)

    Navarro-Barrientos, J. Emeterio; Walter, Frank E.; Schweitzer, Frank

    We study the performance of various agent strategies in an artificial investment scenario. Agents are equipped with a budget, x(t), and at each time step invest a particular fraction, q(t), of their budget. The return on investment (RoI), r(t), is characterized by a periodic function with different types and levels of noise. Risk-avoiding agents choose their fraction q(t) proportional to the expected positive RoI, while risk-seeking agents always choose a maximum value qmax if they predict the RoI to be positive ("everything on red"). In addition to these different strategies, agents have different capabilities to predict the future r(t), dependent on their internal complexity. Here, we compare "zero-intelligent" agents using technical analysis (such as moving least squares) with agents using reinforcement learning or genetic algorithms to predict r(t). The performance of agents is measured by their average budget growth after a certain number of time steps. We present results of extensive computer simulations, which show that, for our given artificial environment, (i) the risk-seeking strategy outperforms the risk-avoiding one, and (ii) the genetic algorithm was able to find this optimal strategy itself, and thus outperforms other prediction approaches considered.

Top