Sample records for greedy algorithm identifies

  1. Greedy Algorithms for Nonnegativity-Constrained Simultaneous Sparse Recovery

    PubMed Central

    Kim, Daeun; Haldar, Justin P.

    2016-01-01

    This work proposes a family of greedy algorithms to jointly reconstruct a set of vectors that are (i) nonnegative and (ii) simultaneously sparse with a shared support set. The proposed algorithms generalize previous approaches that were designed to impose these constraints individually. Similar to previous greedy algorithms for sparse recovery, the proposed algorithms iteratively identify promising support indices. In contrast to previous approaches, the support index selection procedure has been adapted to prioritize indices that are consistent with both the nonnegativity and shared support constraints. Empirical results demonstrate for the first time that the combined use of simultaneous sparsity and nonnegativity constraints can substantially improve recovery performance relative to existing greedy algorithms that impose less signal structure. PMID:26973368

  2. Synthesis of Greedy Algorithms Using Dominance Relations

    NASA Technical Reports Server (NTRS)

    Nedunuri, Srinivas; Smith, Douglas R.; Cook, William R.

    2010-01-01

    Greedy algorithms exploit problem structure and constraints to achieve linear-time performance. Yet there is still no completely satisfactory way of constructing greedy algorithms. For example, the Greedy Algorithm of Edmonds depends upon translating a problem into an algebraic structure called a matroid, but the existence of such a translation can be as hard to determine as the existence of a greedy algorithm itself. An alternative characterization of greedy algorithms is in terms of dominance relations, a well-known algorithmic technique used to prune search spaces. We demonstrate a process by which dominance relations can be methodically derived for a number of greedy algorithms, including activity selection and prefix-free codes. By incorporating our approach into an existing framework for algorithm synthesis, we demonstrate that it could be the basis for an effective engineering method for greedy algorithms. We also compare our approach with other characterizations of greedy algorithms.
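
    Activity selection, one of the problems named in this abstract, has a textbook greedy solution: sort by finish time and keep every activity that starts no earlier than the last one kept. The sketch below illustrates that classical rule only; it is not the synthesis process the authors describe, and the function name `select_activities` is chosen here for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, finish)


def select_activities(activities: List[Interval]) -> List[Interval]:
    """Return a maximum-size set of pairwise compatible activities."""
    chosen: List[Interval] = []
    last_finish = float("-inf")
    # Greedy choice: always consider the activity that finishes earliest.
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen


if __name__ == "__main__":
    demo = [(1, 4), (3, 5), (0, 6), (5, 7), (8, 11), (12, 16)]
    print(select_activities(demo))
```

    The earliest-finish rule works because any optimal schedule can be rewritten to start with the earliest-finishing activity without losing an activity, which is the kind of dominance argument the abstract refers to.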

  3. A genetic algorithm for replica server placement

    NASA Astrophysics Data System (ADS)

    Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

    2012-01-01

    Modern distribution systems use replication to reduce the communication delay experienced by their clients. Several techniques have been developed for web server replica placement. One earlier approach is the Greedy algorithm proposed by Qiu et al., which requires knowledge of the network topology. In this paper, we first introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with the Greedy algorithm proposed by Qiu et al. and with the Optimum algorithm. We found that our approach can achieve better results than the Greedy algorithm proposed by Qiu et al., but its computational time is longer than that of the Greedy algorithm.

  4. A genetic algorithm for replica server placement

    NASA Astrophysics Data System (ADS)

    Eslami, Ghazaleh; Toroghi Haghighat, Abolfazl

    2011-12-01

    Modern distribution systems use replication to reduce the communication delay experienced by their clients. Several techniques have been developed for web server replica placement. One earlier approach is the Greedy algorithm proposed by Qiu et al., which requires knowledge of the network topology. In this paper, we first introduce a genetic algorithm for web server replica placement. Second, we compare our algorithm with the Greedy algorithm proposed by Qiu et al. and with the Optimum algorithm. We found that our approach can achieve better results than the Greedy algorithm proposed by Qiu et al., but its computational time is longer than that of the Greedy algorithm.

  5. Efficient and accurate Greedy Search Methods for mining functional modules in protein interaction networks.

    PubMed

    He, Jieyue; Li, Chaojun; Ye, Baoliu; Zhong, Wei

    2012-06-25

    Most computational algorithms focus on detecting highly connected subgraphs in PPI networks as protein complexes but ignore their inherent organization. Furthermore, many of these algorithms are computationally expensive. However, recent analysis indicates that experimentally detected protein complexes generally contain core/attachment structures. In this paper, a Greedy Search Method based on Core-Attachment structure (GSM-CA) is proposed. The GSM-CA method detects densely connected regions in large protein-protein interaction networks based on the edge weight and two criteria for determining core nodes and attachment nodes. The GSM-CA method improves the prediction accuracy compared to other similar module detection approaches; however, it is computationally expensive. Many module detection approaches are based on traditional hierarchical methods, which are also computationally inefficient because the hierarchical tree structure produced by these approaches cannot provide adequate information to identify whether a network belongs to a module structure or not. In order to speed up the computational process, the Greedy Search Method based on Fast Clustering (GSM-FC) is proposed in this work. The edge-weight-based GSM-FC method uses a greedy procedure to traverse all edges just once to separate the network into a suitable set of modules. The proposed methods are applied to the protein interaction network of S. cerevisiae. Experimental results indicate that many significant functional modules are detected, most of which match the known complexes. Results also demonstrate that the GSM-FC algorithm is faster and more accurate than other competing algorithms. Based on the new edge weight definition, the proposed algorithm takes advantage of the greedy search procedure to separate the network into a suitable set of modules. Experimental analysis shows that the identified modules are statistically significant. The algorithm can reduce the computational time significantly while keeping high prediction accuracy.

  6. A noniterative greedy algorithm for multiframe point correspondence.

    PubMed

    Shafique, Khurram; Shah, Mubarak

    2005-01-01

    This paper presents a framework for finding point correspondences in monocular image sequences over multiple frames. The general problem of multiframe point correspondence is NP-hard for three or more frames. A polynomial time algorithm for a restriction of this problem is presented and is used as the basis of the proposed greedy algorithm for the general problem. The greedy nature of the proposed algorithm allows it to be used in real-time systems for tracking and surveillance, etc. In addition, the proposed algorithm deals with the problems of occlusion, missed detections, and false positives by using a single noniterative greedy optimization scheme and, hence, reduces the complexity of the overall algorithm as compared to most existing approaches where multiple heuristics are used for the same purpose. While most greedy algorithms for point tracking do not allow for entry and exit of the points from the scene, this is not a limitation for the proposed algorithm. Experiments with real and synthetic data over a wide range of scenarios and system parameters are presented to validate the claims about the performance of the proposed algorithm.

  7. An Optimal Schedule for Urban Road Network Repair Based on the Greedy Algorithm

    PubMed Central

    Lu, Guangquan; Xiong, Ying; Wang, Yunpeng

    2016-01-01

    Scheduling the recovery of urban road networks after damage caused by rainstorms, snow, and other bad weather conditions, traffic incidents, and other daily events is essential. However, limited studies have been conducted to investigate this problem. We fill this research gap by proposing an optimal schedule for urban road network repair with limited repair resources based on the greedy algorithm. Critical links are given priority in repair according to the basic concept of the greedy algorithm. In this study, we define the critical link for the current network as the link whose restoration yields the minimum ratio of the system-wide travel time of the current network to that of the worst network. We re-evaluate the importance of damaged links after each repair process is completed. That is, the critical link ranking changes along with the repair process because of the interaction among links. We repair the most critical link for the specific network state based on the greedy algorithm to obtain the optimal schedule. The algorithm can still quickly obtain an optimal schedule even if the scale of the road network is large because the greedy algorithm can reduce computational complexity. We prove in theory that the greedy algorithm obtains the optimal solution to the problem. The algorithm is also demonstrated on the Sioux Falls network. The problem discussed in this paper is highly significant in dealing with urban road network restoration. PMID:27768732

  8. GreedyMAX-type Algorithms for the Maximum Independent Set Problem

    NASA Astrophysics Data System (ADS)

    Borowiecki, Piotr; Göring, Frank

    A maximum independent set problem for a simple graph G = (V,E) is to find the largest subset of pairwise nonadjacent vertices. The problem is known to be NP-hard and it is also hard to approximate. Within this article we introduce a non-negative integer valued function p defined on the vertex set V(G) and called a potential function of a graph G, while P(G) = max_{v ∈ V(G)} p(v) is called a potential of G. For any graph P(G) ≤ Δ(G), where Δ(G) is the maximum degree of G. Moreover, Δ(G) - P(G) may be arbitrarily large. A potential of a vertex lets us get a closer insight into the properties of its neighborhood which leads to the definition of the family of GreedyMAX-type algorithms having the classical GreedyMAX algorithm as their origin. We establish a lower bound 1/(P + 1) for the performance ratio of GreedyMAX-type algorithms which favorably compares with the bound 1/(Δ + 1) known to hold for GreedyMAX. The cardinality of an independent set generated by any GreedyMAX-type algorithm is at least $\sum_{v \in V(G)} (p(v)+1)^{-1}$, which strengthens the bounds of Turán and Caro-Wei stated in terms of vertex degrees.
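
    The abstract does not define the potential function p, so the sketch below covers only the classical GreedyMAX algorithm that the family starts from: delete a vertex of maximum degree until no edges remain, and return the surviving vertices. It also computes the degree-based Caro-Wei quantity that the abstract says the new bound strengthens. The function names are illustrative.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # vertex -> set of neighbours


def greedy_max(adj: Graph) -> Set[int]:
    """Classical GreedyMAX: remove max-degree vertices until edgeless."""
    graph = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    while any(graph.values()):  # some vertex still has a neighbour
        worst = max(graph, key=lambda v: len(graph[v]))
        for u in graph.pop(worst):
            graph[u].discard(worst)
    return set(graph)  # remaining vertices are pairwise nonadjacent


def caro_wei_bound(adj: Graph) -> float:
    """Sum over vertices of 1 / (deg(v) + 1)."""
    return sum(1.0 / (len(nbrs) + 1) for nbrs in adj.values())


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
    print(greedy_max(path), caro_wei_bound(path))
```

    A GreedyMAX-type variant would presumably change how the vertex to delete is chosen, using p(v) in place of or alongside the degree; that selection rule is not given in the abstract and is not attempted here.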

  9. Scheduling Algorithm for Mission Planning and Logistics Evaluation (SAMPLE). Volume 3: The GREEDY algorithm

    NASA Technical Reports Server (NTRS)

    Dupnick, E.; Wiggins, D.

    1980-01-01

    The functional specifications, functional design and flow, and the program logic of the GREEDY computer program are described. The GREEDY program is a submodule of the Scheduling Algorithm for Mission Planning and Logistics Evaluation (SAMPLE) program and has been designed as a continuation of the shuttle Mission Payloads (MPLS) program. The MPLS uses input payload data to form a set of feasible payload combinations; from these, GREEDY selects a subset of combinations (a traffic model) so all payloads can be included without redundancy. The program also provides the user a tutorial option so that he can choose an alternate traffic model in case a particular traffic model is unacceptable.

  10. Scheduling algorithm for mission planning and logistics evaluation users' guide

    NASA Technical Reports Server (NTRS)

    Chang, H.; Williams, J. M.

    1976-01-01

    The scheduling algorithm for mission planning and logistics evaluation (SAMPLE) program is a mission planning tool composed of three subsystems: the mission payloads subsystem (MPLS), which generates a list of feasible combinations from a payload model for a given calendar year; GREEDY, which is a heuristic model used to find the best traffic model; and the operations simulation and resources scheduling subsystem (OSARS), which determines traffic model feasibility for available resources. SAMPLE provides the user with options to allow the execution of MPLS, GREEDY, GREEDY-OSARS, or MPLS-GREEDY-OSARS.

  11. Robust Planning for Effects-Based Operations

    DTIC Science & Technology

    2006-06-01

    Algorithm ... 2.6 Robust Optimization Literature ... 2.6.1 Protecting Against... Model Formulation ... 3.1.5 Deterministic EBO Model Example and Performance ... 3.1.6 Greedy Algorithm ... 4.1.9 Conclusions on Robust EBO Model Performance ... 4.2 Greedy Algorithm versus EBO Models

  12. Greedy algorithms in disordered systems

    NASA Astrophysics Data System (ADS)

    Duxbury, P. M.; Dobrin, R.

    1999-08-01

    We discuss search, minimal path and minimal spanning tree algorithms and their applications to disordered systems. Greedy algorithms solve these problems exactly, and are related to extremal dynamics in physics. Minimal cost path (Dijkstra) and minimal cost spanning tree (Prim) algorithms provide extremal dynamics for a polymer in a random medium (the KPZ universality class) and invasion percolation (without trapping) respectively.
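
    As a concrete instance of the greedy spanning tree construction mentioned here, the following is a minimal sketch of Prim's algorithm using a binary heap. Growing the tree by always taking the cheapest edge on its boundary is the same rule as invading the cheapest boundary bond in invasion percolation. The graph encoding and the name `prim_mst` are assumptions made for illustration.

```python
import heapq
from typing import Dict, List, Tuple

WeightedGraph = Dict[str, Dict[str, float]]  # node -> {neighbour: edge cost}


def prim_mst(graph: WeightedGraph, root: str) -> List[Tuple[str, str, float]]:
    """Return the edges (u, v, cost) of a minimum spanning tree grown from root."""
    visited = {root}
    frontier = [(cost, root, v) for v, cost in graph[root].items()]
    heapq.heapify(frontier)
    tree: List[Tuple[str, str, float]] = []
    while frontier and len(visited) < len(graph):
        cost, u, v = heapq.heappop(frontier)  # cheapest edge leaving the tree
        if v in visited:
            continue  # stale entry: both endpoints are already in the tree
        visited.add(v)
        tree.append((u, v, cost))
        for nxt, c in graph[v].items():
            if nxt not in visited:
                heapq.heappush(frontier, (c, v, nxt))
    return tree


if __name__ == "__main__":
    g = {
        "a": {"b": 1, "c": 4},
        "b": {"a": 1, "c": 2, "d": 5},
        "c": {"a": 4, "b": 2, "d": 3},
        "d": {"b": 5, "c": 3},
    }
    print(prim_mst(g, "a"))
```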

  13. Efficient greedy algorithms for economic manpower shift planning

    NASA Astrophysics Data System (ADS)

    Nearchou, A. C.; Giannikos, I. C.; Lagodimos, A. G.

    2015-01-01

    Consideration is given to the economic manpower shift planning (EMSP) problem, an NP-hard capacity planning problem appearing in various industrial settings including the packing stage of production in process industries and maintenance operations. EMSP aims to determine the manpower needed in each available workday shift of a given planning horizon so as to complete a set of independent jobs at minimum cost. Three greedy heuristics are presented for the EMSP solution. These practically constitute adaptations of an existing algorithm for a simplified version of EMSP which had shown excellent performance in terms of solution quality and speed. Experimentation shows that the new algorithms perform very well in comparison to the results obtained by both the CPLEX optimizer and an existing metaheuristic. Statistical analysis is deployed to rank the algorithms in terms of their solution quality and to identify the effects that critical planning factors may have on their relative efficiency.

  14. Uncovering the community structure in signed social networks based on greedy optimization

    NASA Astrophysics Data System (ADS)

    Chen, Yan; Yan, Jiaqi; Yang, Yu; Chen, Junhua

    2017-05-01

    Signed relationships have recently been adopted as a formalism in many complex systems. The relations among the entities in these systems are complicated and varied and cannot be represented by positive links alone, so signed networks have become increasingly common in the study of social networks, where community structure is significant. In this paper, to identify communities in signed networks, we employ a new greedy algorithm that takes the signs and the density of the links into account. The main idea of the algorithm is the initial procedure of signed modularity and the corresponding update rules. Specifically, we employ the “Asymmetric and Constrained Belief Evolution” procedure to evaluate the optimal number of communities. The experimental results show that the algorithm performs well. More specifically, the proposed algorithm is very efficient for medium-sized networks, both dense and sparse.

  15. The Best m-Term Approximation and Greedy Algorithms

    DTIC Science & Technology

    1997-01-01

    in the paper [DKT]. For a given basis we define the Greedy Algorithm $G^p$ as follows. Let $f = \sum_I c_I(f)\,\psi_I$ and $c_I(f,p) := \|c_I(f)\,\psi_I\|_p$. Then... [DKT] R. A. DeVore, S. V. Konyagin, and V. V. Temlyakov, Hyperbolic Wavelet Approximation, to appear. [DL] R. DeVore, G. Lorentz

  16. Greedy Gossip With Eavesdropping

    NASA Astrophysics Data System (ADS)

    Ustebay, Deniz; Oreshkin, Boris N.; Coates, Mark J.; Rabbat, Michael G.

    2010-07-01

    This paper presents greedy gossip with eavesdropping (GGE), a novel randomized gossip algorithm for distributed computation of the average consensus problem. In gossip algorithms, nodes in the network randomly communicate with their neighbors and exchange information iteratively. The algorithms are simple and decentralized, making them attractive for wireless network applications. In general, gossip algorithms are robust to unreliable wireless conditions and time varying network topologies. In this paper we introduce GGE and demonstrate that greedy updates lead to rapid convergence. We do not require nodes to have any location information. Instead, greedy updates are made possible by exploiting the broadcast nature of wireless communications. During the operation of GGE, when a node decides to gossip, instead of choosing one of its neighbors at random, it makes a greedy selection, choosing the node which has the value most different from its own. In order to make this selection, nodes need to know their neighbors' values. Therefore, we assume that all transmissions are wireless broadcasts and nodes keep track of their neighbors' values by eavesdropping on their communications. We show that the convergence of GGE is guaranteed for connected network topologies. We also study the rates of convergence and illustrate, through theoretical bounds and numerical simulations, that GGE consistently outperforms randomized gossip and performs comparably to geographic gossip on moderate-sized random geometric graph topologies.
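
    The greedy selection step described above can be sketched in a few lines. This is a toy, centralized simulation under simplifying assumptions: every node is assumed to know its neighbours' current values exactly (standing in for eavesdropping on broadcasts), and the activated node is drawn uniformly at random. The name `gge_round` is illustrative.

```python
import random
from typing import Dict, List


def gge_round(values: List[float], neighbors: Dict[int, List[int]],
              rng: random.Random) -> None:
    """One update: a random node averages with its most different neighbour."""
    i = rng.randrange(len(values))
    # Greedy choice: the neighbour whose value differs most from node i's.
    j = max(neighbors[i], key=lambda k: abs(values[k] - values[i]))
    values[i] = values[j] = (values[i] + values[j]) / 2.0


if __name__ == "__main__":
    rng = random.Random(0)
    vals = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
    ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
    for _ in range(2000):
        gge_round(vals, ring, rng)
    print(vals)
```

    Each pairwise average leaves the network-wide sum unchanged, so the values can only converge to the true average, which is the consensus property the abstract relies on.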

  17. A greedy algorithm for species selection in dimension reduction of combustion chemistry

    NASA Astrophysics Data System (ADS)

    Hiremath, Varun; Ren, Zhuyin; Pope, Stephen B.

    2010-09-01

    Computational calculations of combustion problems involving large numbers of species and reactions with a detailed description of the chemistry can be very expensive. Numerous dimension reduction techniques have been developed in the past to reduce the computational cost. In this paper, we consider the rate controlled constrained-equilibrium (RCCE) dimension reduction method, in which a set of constrained species is specified. For a given number of constrained species, the 'optimal' set of constrained species is that which minimizes the dimension reduction error. The direct determination of the optimal set is computationally infeasible, and instead we present a greedy algorithm which aims at determining a 'good' set of constrained species; that is, one leading to near-minimal dimension reduction error. The partially-stirred reactor (PaSR) involving methane premixed combustion with chemistry described by the GRI-Mech 1.2 mechanism containing 31 species is used to test the algorithm. Results on dimension reduction errors for different sets of constrained species are presented to assess the effectiveness of the greedy algorithm. It is shown that the first four constrained species selected using the proposed greedy algorithm produce lower dimension reduction error than constraints on the major species: CH4, O2, CO2 and H2O. It is also shown that the first ten constrained species selected using the proposed greedy algorithm produce a non-increasing dimension reduction error with every additional constrained species; and produce the lowest dimension reduction error in many cases tested over a wide range of equivalence ratios, pressures and initial temperatures.
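
    The abstract does not spell out the selection rule, so the following is only a generic greedy forward selection sketch: at each step, add the candidate that gives the lowest error when combined with those already chosen. The `error` callable stands in for the RCCE dimension reduction error, and `toy_error` with its numbers is entirely made up for illustration.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_forward_selection(candidates: Sequence[str], k: int,
                             error: Callable[[FrozenSet[str]], float]) -> List[str]:
    """Pick up to k items, one at a time, each minimizing error of the set so far."""
    chosen: List[str] = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        # Re-evaluate every remaining candidate against the current selection.
        best = min(remaining, key=lambda c: error(frozenset(chosen + [c])))
        chosen.append(best)
        remaining.remove(best)
    return chosen


def toy_error(selected: FrozenSet[str]) -> float:
    """Hypothetical error model: each item removes a fixed share of the error."""
    explained = {"a": 0.5, "b": 0.3, "c": 0.15}
    return 1.0 - sum(explained.get(s, 0.0) for s in selected)


if __name__ == "__main__":
    print(greedy_forward_selection(["c", "a", "b", "d"], 2, toy_error))
```

    Selecting k items this way needs on the order of k times the number of candidates error evaluations, rather than one evaluation per possible subset, which is why a greedy search is feasible where direct determination of the optimal set is not.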

  18. A Subspace Pursuit–based Iterative Greedy Hierarchical Solution to the Neuromagnetic Inverse Problem

    PubMed Central

    Babadi, Behtash; Obregon-Henao, Gabriel; Lamus, Camilo; Hämäläinen, Matti S.; Brown, Emery N.; Purdon, Patrick L.

    2013-01-01

    Magnetoencephalography (MEG) is an important non-invasive method for studying activity within the human brain. Source localization methods can be used to estimate spatiotemporal activity from MEG measurements with high temporal resolution, but the spatial resolution of these estimates is poor due to the ill-posed nature of the MEG inverse problem. Recent developments in source localization methodology have emphasized temporal as well as spatial constraints to improve source localization accuracy, but these methods can be computationally intense. Solutions emphasizing spatial sparsity hold tremendous promise, since the underlying neurophysiological processes generating MEG signals are often sparse in nature, whether in the form of focal sources, or distributed sources representing large-scale functional networks. Recent developments in the theory of compressed sensing (CS) provide a rigorous framework to estimate signals with sparse structure. In particular, a class of CS algorithms referred to as greedy pursuit algorithms can provide both high recovery accuracy and low computational complexity. Greedy pursuit algorithms are difficult to apply directly to the MEG inverse problem because of the high-dimensional structure of the MEG source space and the high spatial correlation in MEG measurements. In this paper, we develop a novel greedy pursuit algorithm for sparse MEG source localization that overcomes these fundamental problems. This algorithm, which we refer to as the Subspace Pursuit-based Iterative Greedy Hierarchical (SPIGH) inverse solution, exhibits very low computational complexity while achieving very high localization accuracy. We evaluate the performance of the proposed algorithm using comprehensive simulations, as well as the analysis of human MEG data during spontaneous brain activity and somatosensory stimuli. 
These studies reveal substantial performance gains provided by the SPIGH algorithm in terms of computational complexity, localization accuracy, and robustness. PMID:24055554

  19. On Stable Marriages and Greedy Matchings

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Manne, Fredrik; Naim, Md; Lerring, Hakon

    2016-12-11

    Research on stable marriage problems has a long and mathematically rigorous history, while that of exploiting greedy matchings in combinatorial scientific computing is a younger and less developed research field. In this paper we consider the relationships between these two areas. In particular we show that several problems related to computing greedy matchings can be formulated as stable marriage problems and as a consequence several recently proposed algorithms for computing greedy matchings are in fact special cases of well known algorithms for the stable marriage problem. However, in terms of implementations and practical scalable solutions on modern hardware, the greedy matching community has made considerable progress. We show that due to the strong relationship between these two fields many of these results are also applicable for solving stable marriage problems.

  20. Evaluation of a Didactic Method for the Active Learning of Greedy Algorithms

    ERIC Educational Resources Information Center

    Esteban-Sánchez, Natalia; Pizarro, Celeste; Velázquez-Iturbide, J. Ángel

    2014-01-01

    An evaluation of the educational effectiveness of a didactic method for the active learning of greedy algorithms is presented. The didactic method sets students structured-inquiry challenges to be addressed with a specific experimental method, supported by the interactive system GreedEx. This didactic method has been refined over several years of…

  1. Approximation algorithms for a genetic diagnostics problem.

    PubMed

    Kosaraju, S R; Schäffer, A A; Biesecker, L G

    1998-01-01

    We define and study a combinatorial problem called WEIGHTED DIAGNOSTIC COVER (WDC) that models the use of a laboratory technique called genotyping in the diagnosis of an important class of chromosomal aberrations. An optimal solution to WDC would enable us to define a genetic assay that maximizes the diagnostic power for a specified cost of laboratory work. We develop approximation algorithms for WDC by making use of the well-known problem SET COVER for which the greedy heuristic has been extensively studied. We prove worst-case performance bounds on the greedy heuristic for WDC and for another heuristic we call directional greedy. We implemented both heuristics. We also implemented a local search heuristic that takes the solutions obtained by greedy and dir-greedy and applies swaps until they are locally optimal. We report their performance on a real data set that is representative of the options that a clinical geneticist faces for the real diagnostic problem. Many open problems related to WDC remain, both of theoretical interest and practical importance.
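
    For reference, the greedy heuristic for plain, unweighted SET COVER that this work builds on can be sketched as follows. It is not the WDC algorithm or the directional greedy variant, whose details are not given in the abstract; the function name and the toy instance are illustrative.

```python
from typing import Dict, List, Set


def greedy_set_cover(universe: Set[int], subsets: Dict[str, Set[int]]) -> List[str]:
    """Repeatedly pick the subset covering the most still-uncovered elements."""
    uncovered = set(universe)
    picked: List[str] = []
    while uncovered:
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gain = subsets[best] & uncovered
        if not gain:
            raise ValueError("subsets do not cover the universe")
        picked.append(best)
        uncovered -= gain
    return picked


if __name__ == "__main__":
    sets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
    print(greedy_set_cover({1, 2, 3, 4, 5}, sets))  # prints ['A', 'D']
```

    A weighted version would typically rank subsets by newly covered elements per unit cost instead of by raw count.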

  2. Fast algorithm of adaptive Fourier series

    NASA Astrophysics Data System (ADS)

    Gao, You; Ku, Min; Qian, Tao

    2018-05-01

    Adaptive Fourier decomposition (AFD, precisely 1-D AFD or Core-AFD) originated with the goal of positive frequency representations of signals. It achieved the goal and at the same time offered fast decompositions of signals. There then arose several types of AFDs. AFD merged with the greedy algorithm idea, and in particular, motivated the so-called pre-orthogonal greedy algorithm (Pre-OGA) that was proven to be the most efficient greedy algorithm. The cost of the advantages of the AFD type decompositions is, however, the high computational complexity due to the involvement of maximal selections of the dictionary parameters. The present paper offers one formulation of the 1-D AFD algorithm by building the FFT algorithm into it. Accordingly, the algorithm complexity is reduced, from the original $\mathcal{O}(M N^2)$ to $\mathcal{O}(M N\log_2 N)$, where $N$ denotes the number of the discretization points on the unit circle and $M$ denotes the number of points in $[0,1)$. This greatly enhances the applicability of AFD. Experiments are carried out to show the high efficiency of the proposed algorithm.

  3. Maximizing phylogenetic diversity in biodiversity conservation: Greedy solutions to the Noah's Ark problem.

    PubMed

    Hartmann, Klaas; Steel, Mike

    2006-08-01

    The Noah's Ark Problem (NAP) is a comprehensive cost-effectiveness methodology for biodiversity conservation that was introduced by Weitzman (1998) and utilizes the phylogenetic tree containing the taxa of interest to assess biodiversity. Given a set of taxa, each of which has a particular survival probability that can be increased at some cost, the NAP seeks to allocate limited funds to conserving these taxa so that the future expected biodiversity is maximized. Finding optimal solutions using this framework is a computationally difficult problem to which a simple and efficient "greedy" algorithm has been proposed in the literature and applied to conservation problems. We show that, although algorithms of this type cannot produce optimal solutions for the general NAP, there are two restricted scenarios of the NAP for which a greedy algorithm is guaranteed to produce optimal solutions. The first scenario requires the taxa to have equal conservation cost; the second scenario requires an ultrametric tree. The NAP assumes a linear relationship between the funding allocated to conservation of a taxon and the increased survival probability of that taxon. This relationship is briefly investigated and one variation is suggested that can also be solved using a greedy algorithm.

  4. Detection of Cheating by Decimation Algorithm

    NASA Astrophysics Data System (ADS)

    Yamanaka, Shogo; Ohzeki, Masayuki; Decelle, Aurélien

    2015-02-01

    We extend item response theory to study the case of "cheating students" for a set of exams, trying to detect them by applying a greedy algorithm of inference. This extended model is closely related to Boltzmann machine learning. In this paper we aim to infer the correct biases and interactions of our model by considering a relatively small number of sets of training data. Nevertheless, the greedy algorithm that we employed in the present study exhibits good performance with a small amount of training data. The key point is the sparseness of the interactions in our problem in the context of Boltzmann machine learning: the existence of cheating students is expected to be very rare (possibly even in the real world). We compare a standard approach to inferring the sparse interactions in Boltzmann machine learning to our greedy algorithm and we find the latter to be superior in several aspects.

  5. Impact of heuristics in clustering large biological networks.

    PubMed

    Shafin, Md Kishwar; Kabir, Kazi Lutful; Ridwan, Iffatur; Anannya, Tasmiah Tamzid; Karim, Rashid Saadman; Hoque, Mohammad Mozammel; Rahman, M Sohel

    2015-12-01

    Traditional clustering algorithms often exhibit poor performance for large networks. On the contrary, greedy algorithms are found to be relatively efficient while uncovering functional modules from large biological networks. The quality of the clusters produced by these greedy techniques largely depends on the underlying heuristics employed. Different heuristics based on different attributes and properties perform differently in terms of the quality of the clusters produced. This motivates us to design new heuristics for clustering large networks. In this paper, we have proposed two new heuristics and analyzed the performance thereof after incorporating those with three different combinations in a recently celebrated greedy clustering algorithm named SPICi. We have extensively analyzed the effectiveness of these new variants. The results are found to be promising. Copyright © 2015 Elsevier Ltd. All rights reserved.

  6. Effective Iterated Greedy Algorithm for Flow-Shop Scheduling Problems with Time lags

    NASA Astrophysics Data System (ADS)

    ZHAO, Ning; YE, Song; LI, Kaidian; CHEN, Siyu

    2017-05-01

    The flow shop scheduling problem with time lags is a practical scheduling problem that has attracted many studies. Research has concentrated on the permutation problem (PFSP with time lags), while the non-permutation problem (non-PFSP with time lags) seems to be neglected. With the aim of minimizing the makespan while satisfying time lag constraints, efficient algorithms for the PFSP and non-PFSP problems are proposed: an iterated greedy algorithm for permutation (IGTLP) and an iterated greedy algorithm for non-permutation (IGTLNP). The proposed algorithms are verified using well-known simple and complex instances of permutation and non-permutation problems with various time lag ranges. The permutation results indicate that the proposed IGTLP can reach a near-optimal solution within nearly 11% of the computational time of the traditional GA approach. The non-permutation results indicate that the proposed IG can reach nearly the same solution within less than 1% of the computational time of the traditional GA approach. The proposed research combines PFSP and non-PFSP together with minimal and maximal time lag consideration, which provides an interesting viewpoint for industrial implementation.

  7. A Simulation of Readiness-Based Sparing Policies

    DTIC Science & Technology

    2017-06-01

    variant of a greedy heuristic algorithm to set stock levels and estimate overall WS availability. Our discrete event simulation is then used to test the...available in the optimization tools. 14. SUBJECT TERMS readiness-based sparing, discrete event simulation, optimization, multi-indenture...

  8. A distributed geo-routing algorithm for wireless sensor networks.

    PubMed

    Joshi, Gyanendra Prasad; Kim, Sung Won

    2009-01-01

    Geographic wireless sensor networks use position information for greedy routing. Greedy routing works well in dense networks, whereas in sparse networks it may fail and require a recovery algorithm. Recovery algorithms help the packet to get out of the communication void. However, these algorithms are generally costly for resource constrained position-based wireless sensor networks (WSNs). In this paper, we propose a void avoidance algorithm (VAA), a novel idea based on upgrading virtual distance. VAA allows wireless sensor nodes to remove all stuck nodes by transforming the routing graph and forwarding packets using only greedy routing. In VAA, the stuck node upgrades distance unless it finds a next hop node that is closer to the destination than it is. VAA guarantees packet delivery if there is a topologically valid path. Further, it is completely distributed, immediately responds to node failure or topology changes and does not require planarization of the network. NS-2 is used to evaluate the performance and correctness of VAA and we compare its performance to other protocols. Simulations show our proposed algorithm consumes less energy, has an efficient path and substantially less control overheads.
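
    The failure mode that motivates VAA can be seen in a minimal sketch of plain greedy geographic forwarding: each node hands the packet to the neighbour closest to the destination and gives up when no neighbour is closer than itself. This is not VAA, whose virtual distance update is not detailed in the abstract; node names, coordinates, and `greedy_route` are illustrative.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


def greedy_route(pos: Dict[str, Point], links: Dict[str, List[str]],
                 src: str, dst: str) -> Optional[List[str]]:
    """Return the greedy path from src to dst, or None if stuck at a void."""
    path = [src]
    current = src
    while current != dst:
        # Greedy choice: the neighbour geographically closest to the destination.
        best = min(links[current],
                   key=lambda n: math.dist(pos[n], pos[dst]), default=None)
        if best is None or math.dist(pos[best], pos[dst]) >= math.dist(pos[current], pos[dst]):
            return None  # no neighbour makes progress: communication void
        path.append(best)
        current = best
    return path


if __name__ == "__main__":
    pos = {"s": (0, 0), "v": (1, 0), "w": (1, 3), "t": (4, 0)}
    links = {"s": ["v"], "v": ["s", "w"], "w": ["v", "t"], "t": ["w"]}
    print(greedy_route(pos, links, "s", "t"))  # prints None
```

    In the demo topology a valid path s, v, w, t exists, but node v is stuck because its only onward neighbour w is farther from t than v is. This is the situation in which a recovery or void avoidance scheme is needed.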

  9. GSNFS: Gene subnetwork biomarker identification of lung cancer expression data.

    PubMed

    Doungpan, Narumol; Engchuan, Worrawat; Chan, Jonathan H; Meechai, Asawin

    2016-12-05

    Gene expression has been used to identify disease gene biomarkers, but there are ongoing challenges. Single gene or gene-set biomarkers are inadequate to provide sufficient understanding of complex disease mechanisms and the relationship among those genes. Network-based methods have thus been considered for inferring the interaction within a group of genes to further study the disease mechanism. Recently, the Gene-Network-based Feature Set (GNFS), which is capable of handling case-control and multiclass expression for gene biomarker identification, has been proposed, partly taking network topology into account. However, its performance relies on a greedy search for building subnetworks and thus requires further improvement. In this work, we establish a new approach named Gene Sub-Network-based Feature Selection (GSNFS) by implementing the GNFS framework with two proposed searching and scoring algorithms, namely gene-set-based (GS) search and parent-node-based (PN) search, to identify subnetworks. An additional dataset is used to validate the results. The two proposed searching algorithms of the GSNFS method for subnetwork expansion are concerned with the degree of connectivity and the scoring scheme for building subnetworks and their topology. For each iteration of expansion, the neighbour genes of a current subnetwork whose expression data improve the overall subnetwork score are recruited. While the GS search calculates the subnetwork score using an activity score of a current subnetwork and the gene expression values of its neighbours, the PN search uses the expression value of the corresponding parent of each neighbour gene. Four lung cancer expression datasets were used for subnetwork identification. In addition, the use of pathway data and protein-protein interaction as network data to consider the interaction among significant genes is discussed.
Classification was performed to compare the performance of the identified gene subnetworks with three subnetwork identification algorithms. The two searching algorithms resulted in better classification and gene/gene-set agreement compared to the original greedy search of the GNFS method. The identified lung cancer subnetwork using the proposed searching algorithm resulted in an improvement of the cross-dataset validation and an increase in the consistency of findings between two independent datasets. The homogeneity measurement of the datasets was conducted to assess dataset compatibility in cross-dataset validation. The lung cancer dataset with higher homogeneity showed a better result when using the GS search while the dataset with low homogeneity showed a better result when using the PN search. The 10-fold cross-dataset validation on the independent lung cancer datasets showed higher classification performance of the proposed algorithms when compared with the greedy search in the original GNFS method. The proposed searching algorithms provide a higher number of genes in the subnetwork expansion step than the greedy algorithm. As a result, the performance of the subnetworks identified from the GSNFS method was improved in terms of classification performance and gene/gene-set level agreement depending on the homogeneity of the datasets used in the analysis. Some common genes obtained from the four datasets using different searching algorithms are genes known to play a role in lung cancer. The improvement of classification performance and the gene/gene-set level agreement, and the biological relevance indicated the effectiveness of the GSNFS method for gene subnetwork identification using expression data.

  10. TH-CD-209-01: A Greedy Reassignment Algorithm for the PBS Minimum Monitor Unit Constraint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Y; Kooy, H; Craft, D

    2016-06-15

    Purpose: To investigate a Greedy Reassignment algorithm in order to mitigate the effects of low weight spots in proton pencil beam scanning (PBS) treatment plans. Methods: To convert a plan from the treatment planning system (TPS) to a deliverable plan, post-processing methods can be used to adjust the spot maps to meet the minimum MU constraint. Existing methods include deleting low weight spots (Cut method) or rounding spots with weight above/below half the limit up/down to the limit/zero (Round method). An alternative method called Greedy Reassignment was developed in this work in which the lowest weight spot in the field was removed and its weight reassigned equally among its nearest neighbors. The process was repeated with the next lowest weight spot until all spots in the field were above the MU constraint. The algorithm performance was evaluated using plans collected from 190 patients (496 fields) treated at our facility. The evaluation criteria were the γ-index pass rate comparing the pre-processed and post-processed dose distributions. A planning metric was further developed to predict the impact of post-processing on treatment plans for various treatment planning, machine, and dose tolerance parameters. Results: For fields with a gamma pass rate of 90±1%, the metric has a standard deviation equal to 18% of the centroid value. This showed that the metric and γ-index pass rate are correlated for the Greedy Reassignment algorithm. Using a 3rd order polynomial fit to the data, the Greedy Reassignment method had 1.8 times better metric at 90% pass rate compared to other post-processing methods. Conclusion: We showed that the Greedy Reassignment method yields deliverable plans that are closest to the optimized-without-MU-constraint plan from the TPS. The metric developed in this work could help design the minimum MU threshold with the goal of keeping the γ-index pass rate above an acceptable value.
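
    The reassignment loop described in the Methods above can be sketched as follows. This is only an illustration under stated assumptions: spots are points in a 2D plane, "nearest neighbors" means the k closest remaining spots by Euclidean distance (k is a hypothetical parameter, since the abstract does not say how many neighbors are used), and a spot is treated as acceptable when its weight is at or above the minimum MU.

```python
import math


def greedy_reassign(spots, min_mu, k=2):
    """Remove the lowest-weight spot and share its weight among its neighbors.

    spots maps (x, y) -> weight in MU. Returns a new dict in which every
    remaining spot has weight >= min_mu.
    """
    spots = dict(spots)  # work on a copy
    while spots:
        pos = min(spots, key=spots.get)  # lowest-weight spot in the field
        if spots[pos] >= min_mu:
            break  # every remaining spot already satisfies the constraint
        weight = spots.pop(pos)
        if spots:
            nearest = sorted(spots, key=lambda p: math.dist(p, pos))[:k]
            for p in nearest:
                spots[p] += weight / len(nearest)
        # If no spot is left, the weight is dropped (edge case of this sketch).
    return spots


field = {(0, 0): 0.2, (1, 0): 1.0, (2, 0): 0.5, (5, 0): 3.0}
deliverable = greedy_reassign(field, min_mu=1.0)
```

    In this toy field the spots at (0, 0) and (2, 0) are removed in turn, and the total weight of the field is preserved because each removed weight is handed to the remaining spots.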

  11. Greedy Sparse Approaches for Homological Coverage in Location Unaware Sensor Networks

    DTIC Science & Technology

    2017-12-08

    GlobalSIP); 2013 Dec; Austin, TX. p. 595–598. 33. Farah C, Schwaner F, Abedi A, Worboys M. Distributed homology algorithm to detect topological events...ARL-TR-8235•DEC 2017 US Army Research Laboratory Greedy Sparse Approaches for Homological Coverage in Location-Unaware Sensor Networks by Terrence J Moore

  12. Reducing a congestion with introduce the greedy algorithm on traffic light control

    NASA Astrophysics Data System (ADS)

    Catur Siswipraptini, Puji; Hendro Martono, Wisnu; Hartanti, Dian

    2018-03-01

    Vehicle density causes congestion at every junction in the city of Jakarta because traffic light timing is static or manual, so the queue length at each junction is uncertain. This research aims to design a sensor-based traffic system that detects vehicle queue length in order to optimize the duration of the green light. Queue length is detected with infrared sensors placed on each approach to the intersection, and a greedy algorithm then adjusts the green light duration for the approach that needs it; the traffic light control program based on the greedy algorithm is stored on an Arduino Mega 2560 microcontroller. The system extends the green light duration for a long vehicle queue and shortens it at approaches where the queue is not too dense. The design was then built as a scale model (a simple simulator) of the actual intersection and tested. The infrared sensors are placed within 10 cm of each other on each approach of the scale model and serve as queue detectors. In tests on the scale model, a longer queue obtained a longer green light time, which addresses the problem of long vehicle queues. The greedy algorithm adds 2 seconds of green light on approaches whose queue reaches at least sensor level three and shortens the time on approaches whose queue is below sensor level three.
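
    The timing rule reported above might be sketched as follows. Only the 2-second extension and the level-three threshold come from the abstract; the base green time, the amount by which a short queue is shortened, and the choice to serve the longest queue first are assumptions made for this illustration, and the code is written in Python rather than as Arduino firmware.

```python
def plan_cycle(queue_levels, base_green=10.0, step=2.0, threshold=3):
    """Return (approach, green_seconds) pairs for one signal cycle.

    queue_levels maps an approach name to the highest infrared sensor level
    currently triggered by its queue.
    """
    # Greedy choice (assumed): serve the approach with the longest queue first.
    order = sorted(queue_levels, key=queue_levels.get, reverse=True)
    plan = []
    for approach in order:
        if queue_levels[approach] >= threshold:
            green = base_green + step  # long queue: extend the green light
        else:
            green = base_green - step  # short queue: shorten the green light
        plan.append((approach, green))
    return plan


cycle = plan_cycle({"north": 1, "east": 4, "south": 3})
```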

  13. Electromagnetic interference-aware transmission scheduling and power control for dynamic wireless access in hospital environments.

    PubMed

    Phunchongharn, Phond; Hossain, Ekram; Camorlinga, Sergio

    2011-11-01

    We study the multiple access problem for e-Health applications (referred to as secondary users) coexisting with medical devices (referred to as primary or protected users) in a hospital environment. In particular, we focus on transmission scheduling and power control of secondary users in multiple spatial reuse time-division multiple access (STDMA) networks. The objective is to maximize the spectrum utilization of secondary users and minimize their power consumption subject to the electromagnetic interference (EMI) constraints for active and passive medical devices and minimum throughput guarantee for secondary users. The multiple access problem is formulated as a dual objective optimization problem which is shown to be NP-complete. We propose a joint scheduling and power control algorithm based on a greedy approach to solve the problem with much lower computational complexity. To this end, an enhanced greedy algorithm is proposed to improve the performance of the greedy algorithm by finding the optimal sequence of secondary users for scheduling. Using extensive simulations, the tradeoff in performance in terms of spectrum utilization, energy consumption, and computational complexity is evaluated for both the algorithms.

  14. Improvements on a privacy-protection algorithm for DNA sequences with generalization lattices.

    PubMed

    Li, Guang; Wang, Yadong; Su, Xiaohong

    2012-10-01

    When developing personal DNA databases, there must be an appropriate guarantee of anonymity, which means that the data cannot be related back to individuals. DNA lattice anonymization (DNALA) is a successful method for making personal DNA sequences anonymous. However, it uses time-consuming multiple sequence alignment and a low-accuracy greedy clustering algorithm. Furthermore, DNALA is not an online algorithm, and so it cannot quickly return results when the database is updated. This study improves the DNALA method. Specifically, we replaced the multiple sequence alignment in DNALA with global pairwise sequence alignment to save time, and we designed a hybrid clustering algorithm comprised of a maximum weight matching (MWM)-based algorithm and an online algorithm. The MWM-based algorithm is more accurate than the greedy algorithm in DNALA and has the same time complexity. The online algorithm can process data quickly when the database is updated.

  15. Approximating the 0-1 Multiple Knapsack Problem with Agent Decomposition and Market Negotiation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smolinski, B.

    The 0-1 multiple knapsack problem appears in many domains from financial portfolio management to cargo ship stowing. Methods for solving it range from approximate algorithms, such as greedy algorithms, to exact algorithms, such as branch and bound. Approximate algorithms have no bounds on how poorly they perform and exact algorithms can suffer from exponential time and space complexities with large data sets. This paper introduces a market model based on agent decomposition and market auctions for approximating the 0-1 multiple knapsack problem, and an algorithm that implements the model (M(x)). M(x) traverses the solution space rather than getting caught in a local maximum, overcoming an inherent problem of many greedy algorithms. The use of agents ensures that infeasible solutions are not considered while traversing the solution space and that traversal of the solution space is not just random, but is also directed. M(x) is compared to a branch and bound algorithm (BB) and a simple greedy algorithm with a random shuffle (G(x)). The results suggest that M(x) is a good algorithm for approximating the 0-1 Multiple Knapsack problem. M(x) almost always found solutions that were close to optimal in a fraction of the time it took BB to run and with much less memory on large test data sets. M(x) usually performed better than G(x) on hard problems with correlated data.
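
    To illustrate why a greedy method can get caught in a local maximum on this problem, here is a minimal sketch of one common greedy heuristic: take items in order of decreasing value-to-weight ratio and put each into the first knapsack that still has room. This is neither M(x) nor the shuffled G(x) of the paper; the function name and the toy instance are assumptions made for the example.

```python
def greedy_multiple_knapsack(items, capacities):
    """Greedy heuristic for the 0-1 multiple knapsack problem.

    items is a list of (value, weight) pairs with positive weights;
    capacities is a list of knapsack capacities.
    Returns (assignment, total_value), where assignment[i] is the knapsack
    index of item i, or None if the item was left out.
    """
    remaining = list(capacities)
    assignment = [None] * len(items)
    total = 0
    # Consider the densest items (value per unit of weight) first.
    order = sorted(range(len(items)),
                   key=lambda i: items[i][0] / items[i][1],
                   reverse=True)
    for i in order:
        value, weight = items[i]
        for k, cap in enumerate(remaining):
            if weight <= cap:  # first knapsack with enough room
                remaining[k] -= weight
                assignment[i] = k
                total += value
                break
    return assignment, total


items = [(60, 10), (100, 20), (120, 30)]
assignment, total = greedy_multiple_knapsack(items, capacities=[30, 20])
```

    On this instance the heuristic packs the first two items into the first knapsack for a total value of 160, whereas placing the third item in the first knapsack and the second item in the second knapsack would give 220. The greedy choice is never revisited, which is the weakness the abstract refers to.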

  16. ScaffoldScaffolder: solving contig orientation via bidirected to directed graph reduction.

    PubMed

    Bodily, Paul M; Fujimoto, M Stanley; Snell, Quinn; Ventura, Dan; Clement, Mark J

    2016-01-01

    The contig orientation problem, which we formally define as the MAX-DIR problem, has at times been addressed cursorily and at times using various heuristics. In setting forth a linear-time reduction from the MAX-CUT problem to the MAX-DIR problem, we prove the latter is NP-complete. We compare the relative performance of a novel greedy approach with several other heuristic solutions. Our results suggest that our greedy heuristic algorithm not only works well but also outperforms the other algorithms due to the nature of scaffold graphs. Our results also demonstrate a novel method for identifying inverted repeats and inversion variants, both of which contradict the basic single-orientation assumption. Such inversions have previously been noted as being difficult to detect and are directly involved in the genetic mechanisms of several diseases. http://bioresearch.byu.edu/scaffoldscaffolder. paulmbodily@gmail.com Supplementary data are available at Bioinformatics online.

  17. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems

    PubMed Central

    Cao, Leilei; Xu, Lihong; Goodman, Erik D.

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared. PMID:27293421

  18. A Guiding Evolutionary Algorithm with Greedy Strategy for Global Optimization Problems.

    PubMed

    Cao, Leilei; Xu, Lihong; Goodman, Erik D

    2016-01-01

    A Guiding Evolutionary Algorithm (GEA) with greedy strategy for global optimization problems is proposed. Inspired by Particle Swarm Optimization, the Genetic Algorithm, and the Bat Algorithm, the GEA was designed to retain some advantages of each method while avoiding some disadvantages. In contrast to the usual Genetic Algorithm, each individual in GEA is crossed with the current global best one instead of a randomly selected individual. The current best individual served as a guide to attract offspring to its region of genotype space. Mutation was added to offspring according to a dynamic mutation probability. To increase the capability of exploitation, a local search mechanism was applied to new individuals according to a dynamic probability of local search. Experimental results show that GEA outperformed the other three typical global optimization algorithms with which it was compared.

  19. RMP: Reduced-set matching pursuit approach for efficient compressed sensing signal reconstruction.

    PubMed

    Abdel-Sayed, Michael M; Khattab, Ahmed; Abu-Elyazeed, Mohamed F

    2016-11-01

    Compressed sensing enables the acquisition of sparse signals at a rate that is much lower than the Nyquist rate. Compressed sensing initially adopted [Formula: see text] minimization for signal reconstruction, which is computationally expensive. Several greedy recovery algorithms have been recently proposed for signal reconstruction at a lower computational complexity compared to the optimal [Formula: see text] minimization, while maintaining a good reconstruction accuracy. In this paper, the Reduced-set Matching Pursuit (RMP) greedy recovery algorithm is proposed for compressed sensing. Unlike existing approaches, which select either too many or too few values per iteration, RMP aims at selecting a sufficient number of correlation values per iteration, which improves both the reconstruction time and error. Furthermore, RMP prunes the estimated signal, and hence, excludes the incorrectly selected values. The RMP algorithm achieves a higher reconstruction accuracy at a significantly lower computational complexity compared to existing greedy recovery algorithms. It is even superior to [Formula: see text] minimization in terms of the normalized time-error product, a new metric introduced to measure the trade-off between the reconstruction time and error. The superior performance of RMP is illustrated with both noiseless and noisy samples.
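
    As background for the correlation-based selection mentioned above, here is a minimal sketch of plain matching pursuit, which selects a single atom per iteration. It is not RMP: there is no reduced-set selection and no pruning step. The function name is hypothetical, atoms are assumed to have unit norm, and pure Python lists are used to keep the example dependency-free.

```python
def matching_pursuit(y, atoms, n_iter):
    """Plain matching pursuit with one atom selected per iteration.

    y is the measurement vector; atoms is a list of unit-norm vectors of the
    same length. Returns (coeffs, residual).
    """
    residual = [float(v) for v in y]
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlation of the current residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        # Greedy step: pick the atom with the largest absolute correlation.
        j = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        coeffs[j] += corr[j]
        # Remove the selected atom's contribution from the residual.
        residual = [r - corr[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual


basis = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
coeffs, residual = matching_pursuit((0, 3, -1), basis, n_iter=2)
```

    With an orthonormal dictionary, as in this toy case, two iterations recover the two nonzero coefficients exactly. With correlated atoms the residual generally shrinks more slowly, which is part of what motivates selecting several values per iteration.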

  20. Improving performances of suboptimal greedy iterative biclustering heuristics via localization.

    PubMed

    Erten, Cesim; Sözdinler, Melih

    2010-10-15

    Biclustering gene expression data is the problem of extracting submatrices of genes and conditions exhibiting significant correlation across both the rows and the columns of a data matrix of expression values. Even the simplest versions of the problem are computationally hard. Most of the proposed solutions therefore employ greedy iterative heuristics that locally optimize a suitably assigned scoring function. We provide a fast and simple pre-processing algorithm called localization that reorders the rows and columns of the input data matrix in such a way as to group correlated entries in small local neighborhoods within the matrix. The proposed localization algorithm takes its roots from effective use of graph-theoretical methods applied to problems exhibiting a similar structure to that of biclustering. In order to evaluate the effectiveness of the localization pre-processing algorithm, we focus on three representative greedy iterative heuristic methods. We show how the localization pre-processing can be incorporated into each representative algorithm to improve biclustering performance. Furthermore, we propose a simple biclustering algorithm, Random Extraction After Localization (REAL), that randomly extracts submatrices from the localization pre-processed data matrix, eliminates those with low similarity scores, and provides the rest as correlated structures representing biclusters. We compare the proposed localization pre-processing with another pre-processing alternative, non-negative matrix factorization. We show that our fast and simple localization procedure provides similar or even better results than the computationally heavy matrix factorization pre-processing with regard to H-value tests. 
We next demonstrate that the performances of the three representative greedy iterative heuristic methods improve with localization pre-processing when biological correlations in the form of functional enrichment and PPI verification constitute the main performance criteria. The fact that the random extraction method based on localization REAL performs better than the representative greedy heuristic methods under same criteria also confirms the effectiveness of the suggested pre-processing method. Supplementary material including code implementations in LEDA C++ library, experimental data, and the results are available at http://code.google.com/p/biclustering/ cesim@khas.edu.tr; melihsozdinler@boun.edu.tr Supplementary data are available at Bioinformatics online.

  1. Algorithms for selecting informative marker panels for population assignment.

    PubMed

    Rosenberg, Noah A

    2005-11-01

    Given a set of potential source populations, genotypes of an individual of unknown origin at a collection of markers can be used to predict the correct source population of the individual. For improved efficiency, informative markers can be chosen from a larger set of markers to maximize the accuracy of this prediction. However, selecting the loci that are individually most informative does not necessarily produce the optimal panel. Here, using genotypes from eight species--carp, cat, chicken, dog, fly, grayling, human, and maize--this univariate accumulation procedure is compared to new multivariate "greedy" and "maximin" algorithms for choosing marker panels. The procedures generally suggest similar panels, although the greedy method often recommends inclusion of loci that are not chosen by the other algorithms. In seven of the eight species, when applied to five or more markers, all methods achieve at least 94% assignment accuracy on simulated individuals, with one species--dog--producing this level of accuracy with only three markers, and the eighth species--human--requiring approximately 13-16 markers. The new algorithms produce substantial improvements over use of randomly selected markers; where differences among the methods are noticeable, the greedy algorithm leads to slightly higher probabilities of correct assignment. Although none of the approaches necessarily chooses the panel with optimal performance, the algorithms all likely select panels with performance near enough to the maximum that they all are suitable for practical use.
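
    The difference between choosing markers that are individually most informative and choosing them greedily can be sketched as follows. This is only an illustration: the paper scores panels by assignment accuracy, whereas the toy score below counts how many population pairs a panel can tell apart, and the marker names and the data are invented for the example.

```python
def univariate_panel(markers, score, k):
    """Pick the k markers with the best individual scores."""
    return sorted(markers, key=lambda m: score([m]), reverse=True)[:k]


def greedy_panel(markers, score, k):
    """Forward selection: add the marker that most improves the panel score."""
    panel = []
    for _ in range(k):
        remaining = [m for m in markers if m not in panel]
        if not remaining:
            break
        best = max(remaining, key=lambda m: score(panel + [m]))
        panel.append(best)
    return panel


# Hypothetical data: the population pairs (numbered 1 to 6) each marker separates.
separates = {"m1": {1, 2, 3, 4}, "m2": {1, 2, 3}, "m3": {5, 6}}


def coverage(panel):
    """Toy panel score: number of population pairs separated by the panel."""
    covered = set()
    for m in panel:
        covered |= separates[m]
    return len(covered)


uni = univariate_panel(list(separates), coverage, k=2)
greedy = greedy_panel(list(separates), coverage, k=2)
```

    Here the univariate panel is m1 and m2, which are largely redundant and separate only four pairs, while the greedy panel is m1 and m3, which together separate all six. This mirrors the abstract's point that the individually most informative loci do not necessarily form the best panel.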

  2. Performance improvement of multi-class detection using greedy algorithm for Viola-Jones cascade selection

    NASA Astrophysics Data System (ADS)

    Tereshin, Alexander A.; Usilin, Sergey A.; Arlazarov, Vladimir V.

    2018-04-01

    This paper studies the problem of multi-class object detection in a video stream with Viola-Jones cascades. An adaptive algorithm for selecting a Viola-Jones cascade, based on the greedy choice strategy for solving the N-armed bandit problem, is proposed. The efficiency of the algorithm is shown on the problem of detecting and recognizing bank card logos in a video stream. The proposed algorithm can be effectively used in document localization and identification, recognition of road scene elements, localization and tracking of lengthy objects, and for solving other problems of rigid object detection in heterogeneous data flows. The computational efficiency of the algorithm makes it possible to use it both on personal computers and on mobile devices based on processors with low power consumption.
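
    A greedy choice strategy for the N-armed bandit problem can be sketched as below, using the common epsilon-greedy variant. The mapping to the paper is an assumption: each arm would stand for one Viola-Jones cascade and the reward would be 1 when that cascade detects an object in the current frame and 0 otherwise. The class name and parameters are hypothetical and are not taken from the paper.

```python
import random


class EpsilonGreedy:
    """Epsilon-greedy selection among n_arms options."""

    def __init__(self, n_arms, epsilon=0.1, seed=0):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # how often each arm was tried
        self.values = [0.0] * n_arms  # running mean reward of each arm
        self.rng = random.Random(seed)

    def select(self):
        """Explore with probability epsilon, otherwise pick the best arm so far."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.counts))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, arm, reward):
        """Incrementally update the mean reward of the arm that was tried."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]


bandit = EpsilonGreedy(n_arms=3, epsilon=0.0)  # epsilon = 0: purely greedy
bandit.update(1, 1.0)  # pretend cascade 1 detected an object in this frame
choice = bandit.select()
```

    With epsilon set to 0 the selector always runs the cascade with the highest observed detection rate; a small positive epsilon lets it occasionally retry the other cascades when the objects in the stream change.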

  3. Deep greedy learning under thermal variability in full diurnal cycles

    NASA Astrophysics Data System (ADS)

    Rauss, Patrick; Rosario, Dalton

    2017-08-01

    We study the generalization and scalability behavior of a deep belief network (DBN) applied to a challenging long-wave infrared hyperspectral dataset, consisting of radiance from several manmade and natural materials within a fixed site located 500 m from an observation tower. The collections cover multiple full diurnal cycles and include different atmospheric conditions. Using complementary priors, a DBN uses a greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The greedy algorithm initializes a slower learning procedure, which fine-tunes the weights, using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of spectral data and their labels, despite significant data variability between and within classes due to environmental and temperature variation occurring within and between full diurnal cycles. We argue, however, that more questions than answers are raised regarding the generalization capacity of these deep nets through experiments aimed at investigating their training and augmented learning behavior.

  4. Determining coding CpG islands by identifying regions significant for pattern statistics on Markov chains.

    PubMed

    Singer, Meromit; Engström, Alexander; Schönhuth, Alexander; Pachter, Lior

    2011-09-23

    Recent experimental and computational work confirms that CpGs can be unmethylated inside coding exons, thereby showing that codons may be subjected to both genomic and epigenomic constraint. It is therefore of interest to identify coding CpG islands (CCGIs) that are regions inside exons enriched for CpGs. The difficulty in identifying such islands is that coding exons exhibit sequence biases determined by codon usage and constraints that must be taken into account. We present a method for finding CCGIs that showcases a novel approach we have developed for identifying regions of interest that are significant (with respect to a Markov chain) for the counts of any pattern. Our method begins with the exact computation of tail probabilities for the number of CpGs in all regions contained in coding exons, and then applies a greedy algorithm for selecting islands from among the regions. We show that the greedy algorithm provably optimizes a biologically motivated criterion for selecting islands while controlling the false discovery rate. We applied this approach to the human genome (hg18) and annotated CpG islands in coding exons. The statistical criterion we apply to evaluating islands reduces the number of false positives in existing annotations, while our approach to defining islands reveals significant numbers of undiscovered CCGIs in coding exons. Many of these appear to be examples of functional epigenetic specialization in coding exons.

  5. Optimal stabilization of Boolean networks through collective influence

    NASA Astrophysics Data System (ADS)

    Wang, Jiannan; Pei, Sen; Wei, Wei; Feng, Xiangnan; Zheng, Zhiming

    2018-03-01

    Boolean networks have attracted much attention due to their wide applications in describing dynamics of biological systems. During past decades, much effort has been invested in unveiling how network structure and update rules affect the stability of Boolean networks. In this paper, we aim to identify and control a minimal set of influential nodes that is capable of stabilizing an unstable Boolean network. For locally treelike Boolean networks with biased truth tables, we propose a greedy algorithm to identify influential nodes in Boolean networks by minimizing the largest eigenvalue of a modified nonbacktracking matrix. We test the performance of the proposed collective influence algorithm on four different networks. Results show that the collective influence algorithm can stabilize each network with a smaller set of nodes compared with other heuristic algorithms. Our work provides a new insight into the mechanism that determines the stability of Boolean networks, which may find applications in identifying virulence genes that lead to serious diseases.

  6. Fast Solution in Sparse LDA for Binary Classification

    NASA Technical Reports Server (NTRS)

    Moghaddam, Baback

    2010-01-01

    An algorithm that performs sparse linear discriminant analysis (Sparse-LDA) finds near-optimal solutions in far less time than the prior art when specialized to binary classification (of 2 classes). Sparse-LDA is a type of feature- or variable-selection problem with numerous applications in statistics, machine learning, computer vision, computational finance, operations research, and bio-informatics. Because of their combinatorial nature, feature- or variable-selection problems are NP-hard or computationally intractable in cases involving more than 30 variables or features. Therefore, one typically seeks approximate solutions by means of greedy search algorithms. The prior Sparse-LDA algorithm was a greedy algorithm that considered the best variable or feature to add to or delete from its subsets in order to maximally discriminate between multiple classes of data. The present algorithm is designed for the special but prevalent case of 2-class or binary classification (e.g. 1 vs. 0, functioning vs. malfunctioning, or change versus no change). The present algorithm provides near-optimal solutions on large real-world datasets having hundreds or even thousands of variables or features (e.g. selecting the fewest wavelength bands in a hyperspectral sensor to do terrain classification) and does so in typical computation times of minutes as compared to days or weeks as taken by the prior art. Sparse LDA requires solving generalized eigenvalue problems for a large number of variable subsets (represented by the submatrices of the input within-class and between-class covariance matrices). In the general (full-rank) case, the amount of computation scales at least cubically with the number of variables and thus the size of the problems that can be solved is limited accordingly. However, in binary classification, the principal eigenvalues can be found using a special analytic formula, without resorting to costly iterative techniques. 
The present algorithm exploits this analytic form along with the inherent sequential nature of greedy search itself. Together this enables the use of highly-efficient partitioned-matrix-inverse techniques that result in large speedups of computation in both the forward-selection and backward-elimination stages of greedy algorithms in general.

  7. Aveiro method in reproducing kernel Hilbert spaces under complete dictionary

    NASA Astrophysics Data System (ADS)

    Mai, Weixiong; Qian, Tao

    2017-12-01

    The Aveiro Method is a sparse representation method in reproducing kernel Hilbert spaces (RKHS) that gives orthogonal projections in linear combinations of reproducing kernels over uniqueness sets. It suffers, however, from the need to determine uniqueness sets in the underlying RKHS. In fact, in general spaces, uniqueness sets are not easy to identify, let alone the convergence speed of the Aveiro Method. To avoid those difficulties we propose a new Aveiro Method based on a dictionary and the matching pursuit idea. In fact, we do more: the new Aveiro Method is related to the recently proposed Pre-Orthogonal Greedy Algorithm (P-OGA), involving completion of a given dictionary. The new method is called Aveiro Method Under Complete Dictionary (AMUCD). The complete dictionary consists of all directional derivatives of the underlying reproducing kernels. We show that, under the boundary vanishing condition, which is available for the classical Hardy and Paley-Wiener spaces, the complete dictionary enables an efficient expansion of any given element in the Hilbert space. The proposed method reveals new and advanced aspects in both the Aveiro Method and the greedy algorithm.

  8. Inferring consistent functional interaction patterns from natural stimulus FMRI data

    PubMed Central

    Sun, Jiehuan; Hu, Xintao; Huang, Xiu; Liu, Yang; Li, Kaiming; Li, Xiang; Han, Junwei; Guo, Lei

    2014-01-01

    There has been increasing interest in how the human brain responds to natural stimulus such as video watching in the neuroimaging field. Along this direction, this paper presents our effort in inferring consistent and reproducible functional interaction patterns under natural stimulus of video watching among known functional brain regions identified by task-based fMRI. Then, we applied and compared four statistical approaches, including Bayesian network modeling with searching algorithms: greedy equivalence search (GES), Peter and Clark (PC) analysis, independent multiple greedy equivalence search (IMaGES), and the commonly used Granger causality analysis (GCA), to infer consistent and reproducible functional interaction patterns among these brain regions. It is interesting that a number of reliable and consistent functional interaction patterns were identified by the GES, PC and IMaGES algorithms in different participating subjects when they watched multiple video shots of the same semantic category. These interaction patterns are meaningful given current neuroscience knowledge and are reasonably reproducible across different brains and video shots. In particular, these consistent functional interaction patterns are supported by structural connections derived from diffusion tensor imaging (DTI) data, suggesting the structural underpinnings of consistent functional interactions. Our work demonstrates that specific consistent patterns of functional interactions among relevant brain regions might reflect the brain's fundamental mechanisms of online processing and comprehension of video messages. PMID:22440644

  9. Wireless Sensor Network Metrics for Real-Time Systems

    DTIC Science & Technology

    2009-05-20

    to compute the probability of end-to-end packet delivery as a function of latency, the expected radio energy consumption on the nodes from relaying... schedules for WSNs. Particularly, we focus on the impact scheduling has on path diversity, using short repeating schedules and Greedy Maximal Matching...a greedy algorithm for constructing a mesh routing topology. Finally, we study the implications of using distributed scheduling schemes to generate

  10. Minimizing the Total Service Time of Discrete Dynamic Berth Allocation Problem by an Iterated Greedy Heuristic

    PubMed Central

    2014-01-01

    Berth allocation is the forefront operation performed when ships arrive at a port and is a critical task in container port optimization. Minimizing the time ships spend at berths constitutes an important objective of berth allocation problems. This study focuses on the discrete dynamic berth allocation problem (discrete DBAP), which aims to minimize total service time, and proposes an iterated greedy (IG) algorithm to solve it. The proposed IG algorithm is tested on three benchmark problem sets. Experimental results show that the proposed IG algorithm can obtain optimal solutions for all test instances of the first and second problem sets and outperforms the best-known solutions for 35 out of 90 test instances of the third problem set. PMID:25295295

  11. A Modified Distributed Bees Algorithm for Multi-Sensor Task Allocation.

    PubMed

    Tkach, Itshak; Jevtić, Aleksandar; Nof, Shimon Y; Edan, Yael

    2018-03-02

    Multi-sensor systems can play an important role in monitoring tasks and detecting targets. However, real-time allocation of heterogeneous sensors to dynamic targets/tasks that are unknown a priori in their locations and priorities is a challenge. This paper presents a Modified Distributed Bees Algorithm (MDBA) that is developed to allocate stationary heterogeneous sensors to upcoming unknown tasks using a decentralized, swarm intelligence approach to minimize the task detection times. Sensors are allocated to tasks based on sensors' performance, tasks' priorities, and the distances of the sensors from the locations where the tasks are being executed. The algorithm was compared to a Distributed Bees Algorithm (DBA), a Bees System, and two common multi-sensor algorithms, market-based and greedy-based algorithms, which were fitted for the specific task. Simulation analyses revealed that MDBA achieved statistically significant improved performance by 7% with respect to DBA as the second-best algorithm, and by 19% with respect to Greedy algorithm, which was the worst, thus indicating its fitness to provide solutions for heterogeneous multi-sensor systems.

  12. A Modified Distributed Bees Algorithm for Multi-Sensor Task Allocation

    PubMed Central

    Nof, Shimon Y.; Edan, Yael

    2018-01-01

    Multi-sensor systems can play an important role in monitoring tasks and detecting targets. However, real-time allocation of heterogeneous sensors to dynamic targets/tasks that are unknown a priori in their locations and priorities is a challenge. This paper presents a Modified Distributed Bees Algorithm (MDBA) that is developed to allocate stationary heterogeneous sensors to upcoming unknown tasks using a decentralized, swarm intelligence approach to minimize the task detection times. Sensors are allocated to tasks based on sensors’ performance, tasks’ priorities, and the distances of the sensors from the locations where the tasks are being executed. The algorithm was compared to a Distributed Bees Algorithm (DBA), a Bees System, and two common multi-sensor algorithms, market-based and greedy-based algorithms, which were fitted for the specific task. Simulation analyses revealed that MDBA achieved statistically significant improved performance by 7% with respect to DBA as the second-best algorithm, and by 19% with respect to Greedy algorithm, which was the worst, thus indicating its fitness to provide solutions for heterogeneous multi-sensor systems. PMID:29498683

  13. Influencing Busy People in a Social Network

    PubMed Central

    Sarkar, Kaushik; Sundaram, Hari

    2016-01-01

    We identify influential early adopters in a social network, where individuals are resource constrained, to maximize the spread of multiple, costly behaviors. A solution to this problem is especially important for viral marketing. The problem of maximizing influence in a social network is challenging since it is computationally intractable. We make three contributions. First, we propose a new model of collective behavior that incorporates individual intent, knowledge of neighbors actions and resource constraints. Second, we show that the multiple behavior influence maximization is NP-hard. Furthermore, we show that the problem is submodular, implying the existence of a greedy solution that approximates the optimal solution to within a constant. However, since the greedy algorithm is expensive for large networks, we propose efficient heuristics to identify the influential individuals, including heuristics to assign behaviors to the different early adopters. We test our approach on synthetic and real-world topologies with excellent results. We evaluate the effectiveness under three metrics: unique number of participants, total number of active behaviors and network resource utilization. Our heuristics produce 15-51% increase in expected resource utilization over the naïve approach. PMID:27711127
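    The greedy approximation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes a simplified coverage objective in which each candidate seed reaches a fixed set of people, which is one standard example of a monotone submodular function. The names `greedy_max_coverage`, `reach`, and `k` are hypothetical. For such objectives under a cardinality limit, the classical result is that greedy selection achieves at least a (1 - 1/e) fraction of the optimum; the constant for the paper's own model is not stated in the abstract.

```python
def greedy_max_coverage(reach, k):
    """Pick up to k seeds, each time adding the seed that covers the most new people.

    reach: dict mapping a candidate seed to the set of people it reaches
           (a simplified stand-in for an influence model).
    """
    chosen, covered = [], set()
    candidates = set(reach)
    for _ in range(k):
        # Marginal gain of a seed = number of not-yet-covered people it reaches.
        best = max(sorted(candidates),
                   key=lambda s: len(reach[s] - covered),
                   default=None)
        if best is None or not (reach[best] - covered):
            break  # nothing left that adds coverage
        chosen.append(best)
        covered |= reach[best]
        candidates.remove(best)
    return chosen, covered


if __name__ == "__main__":
    reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 2}}
    print(greedy_max_coverage(reach, 2))
```

    Each round re-evaluates every remaining candidate, which is why the abstract describes plain greedy selection as expensive on large networks.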

  14. Influencing Busy People in a Social Network.

    PubMed

    Sarkar, Kaushik; Sundaram, Hari

    2016-01-01

    We identify influential early adopters in a social network, where individuals are resource constrained, to maximize the spread of multiple, costly behaviors. A solution to this problem is especially important for viral marketing. The problem of maximizing influence in a social network is challenging since it is computationally intractable. We make three contributions. First, we propose a new model of collective behavior that incorporates individual intent, knowledge of neighbors actions and resource constraints. Second, we show that the multiple behavior influence maximization is NP-hard. Furthermore, we show that the problem is submodular, implying the existence of a greedy solution that approximates the optimal solution to within a constant. However, since the greedy algorithm is expensive for large networks, we propose efficient heuristics to identify the influential individuals, including heuristics to assign behaviors to the different early adopters. We test our approach on synthetic and real-world topologies with excellent results. We evaluate the effectiveness under three metrics: unique number of participants, total number of active behaviors and network resource utilization. Our heuristics produce 15-51% increase in expected resource utilization over the naïve approach.

  15. Efficient Approximation Algorithms for Weighted $b$-Matching

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Khan, Arif; Pothen, Alex; Mostofa Ali Patwary, Md.

    2016-01-01

    We describe a half-approximation algorithm, b-Suitor, for computing a b-Matching of maximum weight in a graph with weights on the edges. b-Matching is a generalization of the well-known Matching problem in graphs, where the objective is to choose a subset of M edges in the graph such that at most a specified number b(v) of edges in M are incident on each vertex v. Subject to this restriction we maximize the sum of the weights of the edges in M. We prove that the b-Suitor algorithm computes the same b-Matching as the one obtained by the greedy algorithm for the problem. We implement the algorithm on serial and shared-memory parallel processors, and compare its performance against a collection of approximation algorithms that have been proposed for the Matching problem. Our results show that the b-Suitor algorithm outperforms the Greedy and Locally Dominant edge algorithms by one to two orders of magnitude on a serial processor. The b-Suitor algorithm has a high degree of concurrency, and it scales well up to 240 threads on a shared memory multiprocessor. The b-Suitor algorithm outperforms the Locally Dominant edge algorithm by a factor of fourteen on 16 cores of an Intel Xeon multiprocessor.
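    For reference, the greedy baseline that the abstract compares against can be sketched as follows. This is the textbook greedy procedure for weighted b-Matching (scan edges by decreasing weight and keep an edge while both endpoints have spare capacity), not the b-Suitor algorithm itself. The function name and the edge representation are assumptions made for illustration.

```python
def greedy_b_matching(edges, b):
    """Greedy weighted b-Matching.

    edges: list of (weight, u, v) tuples
    b:     dict mapping each vertex to the maximum number of matched
           edges allowed to be incident on it
    """
    remaining = dict(b)  # spare capacity per vertex
    matching = []
    # Heaviest edges first; sorted() is stable, so ties keep input order.
    for w, u, v in sorted(edges, key=lambda e: -e[0]):
        if u != v and remaining.get(u, 0) > 0 and remaining.get(v, 0) > 0:
            matching.append((w, u, v))
            remaining[u] -= 1
            remaining[v] -= 1
    return matching


if __name__ == "__main__":
    edges = [(5, "a", "b"), (4, "b", "c"), (3, "c", "d"), (2, "a", "c")]
    print(greedy_b_matching(edges, {"a": 1, "b": 1, "c": 1, "d": 1}))
    print(greedy_b_matching(edges, {"a": 1, "b": 2, "c": 2, "d": 1}))
```

    The global sort makes this version inherently sequential, which is consistent with the abstract's emphasis on an algorithm that produces the same matching with more concurrency.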

  16. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    PubMed Central

    2012-01-01

    Background: Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results: We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions: Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170

  17. Improving multivariate Horner schemes with Monte Carlo tree search

    NASA Astrophysics Data System (ADS)

    Kuipers, J.; Plaat, A.; Vermaseren, J. A. M.; van den Herik, H. J.

    2013-11-01

    Optimizing the cost of evaluating a polynomial is a classic problem in computer science. For polynomials in one variable, Horner's method provides a scheme for producing a computationally efficient form. For multivariate polynomials it is possible to generalize Horner's method, but this leaves freedom in the order of the variables. Traditionally, greedy schemes like most-occurring variable first are used. This simple textbook algorithm has given remarkably efficient results. Finding better algorithms has proved difficult. In trying to improve upon the greedy scheme we have implemented Monte Carlo tree search, a recent search method from the field of artificial intelligence. This results in better Horner schemes and reduces the cost of evaluating polynomials, sometimes by factors up to two.
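    The greedy "most-occurring variable first" scheme mentioned above can be sketched in a few lines. This is an illustrative reading, not the authors' implementation: it assumes a polynomial stored as a dict from exponent tuples to coefficients, and it repeatedly factors out the leading variable via p = x*q + r. The names `occurrence_order` and `horner_eval` are hypothetical.

```python
from collections import Counter


def occurrence_order(poly, nvars):
    """Order variable indices by how many terms they appear in, most frequent first."""
    counts = Counter()
    for exps in poly:
        for v in range(nvars):
            if exps[v] > 0:
                counts[v] += 1
    return sorted(range(nvars), key=lambda v: (-counts[v], v))


def horner_eval(poly, order, point):
    """Evaluate poly at point by factoring out variables in the given order.

    poly:  dict mapping exponent tuples to coefficients
    order: list of variable indices (a permutation of all variables)
    """
    if not poly:
        return 0
    if not order:
        # Every variable has been factored out, so only constants remain.
        return sum(poly.values())
    x = order[0]
    with_x, without_x = {}, {}
    for exps, coeff in poly.items():
        if exps[x] > 0:
            # Divide the term by x once: p = x * q + r.
            reduced = exps[:x] + (exps[x] - 1,) + exps[x + 1:]
            with_x[reduced] = with_x.get(reduced, 0) + coeff
        else:
            without_x[exps] = coeff
    return (point[x] * horner_eval(with_x, order, point)
            + horner_eval(without_x, order[1:], point))


if __name__ == "__main__":
    # x^2*y + 3*x*y + 2*y + 5, with x as variable 0 and y as variable 1
    poly = {(2, 1): 1, (1, 1): 3, (0, 1): 2, (0, 0): 5}
    order = occurrence_order(poly, 2)
    print(order, horner_eval(poly, order, (2, 3)))
```

    The variable order is the only free choice in this scheme, which is the freedom the abstract says is explored with Monte Carlo tree search instead of the greedy rule.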

  18. An ILP based memetic algorithm for finding minimum positive influence dominating sets in social networks

    NASA Astrophysics Data System (ADS)

    Lin, Geng; Guan, Jian; Feng, Huibin

    2018-06-01

    The positive influence dominating set problem is a variant of the minimum dominating set problem, and has lots of applications in social networks. It is NP-hard, and receives more and more attention. Various methods have been proposed to solve the positive influence dominating set problem. However, most of the existing work focused on greedy algorithms, and the solution quality needs to be improved. In this paper, we formulate the minimum positive influence dominating set problem as an integer linear programming (ILP), and propose an ILP based memetic algorithm (ILPMA) for solving the problem. The ILPMA integrates a greedy randomized adaptive construction procedure, a crossover operator, a repair operator, and a tabu search procedure. The performance of ILPMA is validated on nine real-world social networks with nodes up to 36,692. The results show that ILPMA significantly improves the solution quality, and is robust.

  19. Scaling Up Coordinate Descent Algorithms for Large ℓ1 Regularization Problems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scherrer, Chad; Halappanavar, Mahantesh; Tewari, Ambuj

    2012-07-03

    We present a generic framework for parallel coordinate descent (CD) algorithms that has as special cases the original sequential algorithms of Cyclic CD and Stochastic CD, as well as the recent parallel Shotgun algorithm of Bradley et al. We introduce two novel parallel algorithms that are also special cases---Thread-Greedy CD and Coloring-Based CD---and give performance measurements for an OpenMP implementation of these.

  20. Greedy bases in rank 2 quantum cluster algebras

    PubMed Central

    Lee, Kyungyong; Li, Li; Rupel, Dylan; Zelevinsky, Andrei

    2014-01-01

    We identify a quantum lift of the greedy basis for rank 2 coefficient-free cluster algebras. Our main result is that our construction does not depend on the choice of initial cluster, that it builds all cluster monomials, and that it produces bar-invariant elements. We also present several conjectures related to this quantum greedy basis and the triangular basis of Berenstein and Zelevinsky. PMID:24982182

  1. Information-optimal genome assembly via sparse read-overlap graphs.

    PubMed

    Shomorony, Ilan; Kim, Samuel H; Courtade, Thomas A; Tse, David N C

    2016-09-01

    In the context of third-generation long-read sequencing technologies, read-overlap-based approaches are expected to play a central role in the assembly step. A fundamental challenge in assembling from a read-overlap graph is that the true sequence corresponds to a Hamiltonian path on the graph, and, under most formulations, the assembly problem becomes NP-hard, restricting practical approaches to heuristics. In this work, we avoid this seemingly fundamental barrier by first setting the computational complexity issue aside, and seeking an algorithm that targets information limits. In particular, we consider a basic feasibility question: when does the set of reads contain enough information to allow unambiguous reconstruction of the true sequence? Based on insights from this information feasibility question, we present an algorithm, the Not-So-Greedy algorithm, to construct a sparse read-overlap graph. Unlike most other assembly algorithms, Not-So-Greedy comes with a performance guarantee: whenever information feasibility conditions are satisfied, the algorithm reduces the assembly problem to an Eulerian path problem on the resulting graph, which can thus be solved in linear time. In practice, this theoretical guarantee translates into assemblies of higher quality. Evaluations on both simulated reads from real genomes and a PacBio Escherichia coli K12 dataset demonstrate that Not-So-Greedy compares favorably with standard string graph approaches in terms of accuracy of the resulting read-overlap graph and contig N50. Availability: github.com/samhykim/nsg. Contact: courtade@eecs.berkeley.edu or dntse@stanford.edu. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.

  2. Single-Pass Serial Scheduling Heuristic for Eglin AFB Range Services Division Schedule

    DTIC Science & Technology

    2009-06-01

    scheduling tool for this RCPSP. Research on a schedule improvement metaheuristic and coding of the complete algorithm is required before it can be...a schedule better by applying metaheuristic improvement algorithms to a feasible schedule after it is created. 2.5.1. Greedy Algorithm The...next available position, the algorithm will not utilize all the available range time and manpower. An improvement metaheuristic is required to

  3. Distributed learning automata-based algorithm for community detection in complex networks

    NASA Astrophysics Data System (ADS)

    Khomami, Mohammad Mehdi Daliri; Rezvanian, Alireza; Meybodi, Mohammad Reza

    2016-03-01

    Community structure is an important and universal topological property of many complex networks such as social and information networks. The detection of communities of a network is a significant technique for understanding the structure and function of networks. In this paper, we propose an algorithm based on distributed learning automata for community detection (DLACD) in complex networks. In the proposed algorithm, each vertex of the network is equipped with a learning automaton. According to the cooperation among the network of learning automata and the updating of the action probabilities of each automaton, the algorithm interactively tries to identify high-density local communities. The performance of the proposed algorithm is investigated through a number of simulations on popular synthetic and real networks. Experimental results in comparison with popular community detection algorithms such as walk trap, Danon greedy optimization, Fuzzy community detection, Multi-resolution community detection and label propagation demonstrated the superiority of DLACD in terms of modularity, NMI, performance, min-max-cut and coverage.

  4. Sniffer Channel Selection for Monitoring Wireless LANs

    NASA Astrophysics Data System (ADS)

    Song, Yuan; Chen, Xian; Kim, Yoo-Ah; Wang, Bing; Chen, Guanling

    Wireless sniffers are often used to monitor APs in wireless LANs (WLANs) for network management, fault detection, traffic characterization, and optimizing deployment. It is cost effective to deploy single-radio sniffers that can monitor multiple nearby APs. However, since nearby APs often operate on orthogonal channels, a sniffer needs to switch among multiple channels to monitor its nearby APs. In this paper, we formulate and solve two optimization problems on sniffer channel selection. Both problems require that each AP be monitored by at least one sniffer. In addition, one optimization problem requires minimizing the maximum number of channels that a sniffer listens to, and the other requires minimizing the total number of channels that the sniffers listen to. We propose a novel LP-relaxation based algorithm, and two simple greedy heuristics for the above two optimization problems. Through simulation, we demonstrate that all the algorithms are effective in achieving their optimization goals, and the LP-based algorithm outperforms the greedy heuristics.

  5. Biclustering of gene expression data using reactive greedy randomized adaptive search procedure.

    PubMed

    Dharan, Smitha; Nair, Achuthsankar S

    2009-01-30

    Biclustering algorithms belong to a distinct class of clustering algorithms that perform simultaneous clustering of both rows and columns of the gene expression matrix and can be a very useful analysis tool when some genes have multiple functions and experimental conditions are diverse. Cheng and Church introduced a measure called the mean squared residue score to evaluate the quality of a bicluster, and it has become one of the most popular measures to search for biclusters. In this paper, we review basic concepts of the metaheuristic Greedy Randomized Adaptive Search Procedure (GRASP), namely its construction and local search phases, and propose a new method, a variant of GRASP called Reactive Greedy Randomized Adaptive Search Procedure (Reactive GRASP), to detect significant biclusters from large microarray datasets. The method has two major steps. First, high quality bicluster seeds are generated by means of k-means clustering. In the second step, these seeds are grown using the Reactive GRASP, in which the basic parameter that defines the restrictiveness of the candidate list is self-adjusted, depending on the quality of the solutions found previously. We performed statistical and biological validations of the biclusters obtained and evaluated the method against the results of basic GRASP as well as the classic work of Cheng and Church. The experimental results indicate that the Reactive GRASP approach outperforms the basic GRASP algorithm and the Cheng and Church approach. The Reactive GRASP approach for the detection of significant biclusters is robust and does not require calibration efforts.

  6. Survey of gene splicing algorithms based on reads.

    PubMed

    Si, Xiuhua; Wang, Qian; Zhang, Lei; Wu, Ruo; Ma, Jiquan

    2017-11-02

    Gene splicing is the process of assembling a large number of unordered short sequence fragments to the original genome sequence as accurately as possible. Several popular splicing algorithms based on reads are reviewed in this article, including reference genome algorithms and de novo splicing algorithms (Greedy-extension, Overlap-Layout-Consensus graph, De Bruijn graph). We also discuss a new splicing method based on the MapReduce strategy and Hadoop. By comparing these algorithms, some conclusions are drawn and some suggestions on gene splicing research are made.

  7. FindGDPs: fast identification of primers for labeling microbial transcriptomes for DNA microarray analysis

    PubMed Central

    Blick, Robert J.; Revel, Andrew T.; Hansen, Eric J.

    2008-01-01

    Summary: FindGDPs is a program that uses a greedy algorithm to quickly identify a set of genome-directed primers that specifically anneal to all of the open reading frames in a genome and that do not exhibit full-length complementarity to the members of another user-supplied set of nucleotide sequences. Availability: The program code is distributed under the GNU General Public License at http://www8.utsouthwestern.edu/utsw/cda/dept131456/files/159331.html. Contact: eric.hansen@utsouthwestern.edu. PMID:15593406

  8. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

    PubMed Central

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java. PMID:15701762

  9. Initialization and Restart in Stochastic Local Search: Computing a Most Probable Explanation in Bayesian Networks

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Wilkins, David C.; Roth, Dan

    2010-01-01

    For hard computational problems, stochastic local search has proven to be a competitive approach to finding optimal or approximately optimal problem solutions. Two key research questions for stochastic local search algorithms are: Which algorithms are effective for initialization? When should the search process be restarted? In the present work we investigate these research questions in the context of approximate computation of most probable explanations (MPEs) in Bayesian networks (BNs). We introduce a novel approach, based on the Viterbi algorithm, to explanation initialization in BNs. While the Viterbi algorithm works on sequences and trees, our approach works on BNs with arbitrary topologies. We also give a novel formalization of stochastic local search, with focus on initialization and restart, using probability theory and mixture models. Experimentally, we apply our methods to the problem of MPE computation, using a stochastic local search algorithm known as Stochastic Greedy Search. By carefully optimizing both initialization and restart, we reduce the MPE search time for application BNs by several orders of magnitude compared to using uniform at random initialization without restart. On several BNs from applications, the performance of Stochastic Greedy Search is competitive with clique tree clustering, a state-of-the-art exact algorithm used for MPE computation in BNs.

  10. Feature Clustering for Accelerating Parallel Coordinate Descent

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh

    2012-12-06

    We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.

  11. Smart Phase Tuning in Microwave Photonic Integrated Circuits Toward Automated Frequency Multiplication by Design

    NASA Astrophysics Data System (ADS)

    Nabavi, N.

    2018-07-01

    The author investigates the monitoring methods for fine adjustment of the previously proposed on-chip architecture for frequency multiplication and translation of harmonics by design. Digital signal processing (DSP) algorithms are utilized to create an optimized microwave photonic integrated circuit functionality toward automated frequency multiplication. The implemented DSP algorithms are formed on discrete Fourier transform and optimization-based algorithms (Greedy and gradient-based algorithms), which are analytically derived and numerically compared based on the accuracy and speed of convergence criteria.

  12. WFIRST: Exoplanet Target Selection and Scheduling with Greedy Optimization

    NASA Astrophysics Data System (ADS)

    Keithly, Dean; Garrett, Daniel; Delacroix, Christian; Savransky, Dmitry

    2018-01-01

    We present target selection and scheduling algorithms for missions with direct imaging of exoplanets, and the Wide Field Infrared Survey Telescope (WFIRST) in particular, which will be equipped with a coronagraphic instrument (CGI). Optimal scheduling of CGI targets can maximize the expected value of directly imaged exoplanets (completeness). Using target completeness as a reward metric and integration time plus overhead time as a cost metric, we can maximize the sum completeness for a mission with a fixed duration. We optimize over these metrics to create a list of target stars using a greedy optimization algorithm based off altruistic yield optimization (AYO) under ideal conditions. We simulate full missions using EXOSIMS by observing targets in this list for their predetermined integration times. In this poster, we report the theoretical maximum sum completeness, mean number of detected exoplanets from Monte Carlo simulations, and the ideal expected value of the simulated missions.

  13. Improving recovery of ECG signal with deterministic guarantees using split signal for multiple supports of matching pursuit (SS-MSMP) algorithm.

    PubMed

    Tawfic, Israa Shaker; Kayhan, Sema Koc

    2017-02-01

    Compressed sensing (CS) is a new field used for signal acquisition and sensor design that has greatly reduced the cost of acquiring sparse signals. In this paper, a new greedy pursuit algorithm, SS-MSMP (Split Signal for Multiple Support of Matching Pursuit), is introduced to improve the performance of greedy algorithms, and theoretical analyses are given. SS-MSMP is suggested for sparse data acquisition, in order to reconstruct analog and efficient signals via a small set of general measurements. This paper proposes a new fast method which depends on a study of the behavior of the support indices through picking the best estimation of the correlation between the residual and the measurement matrix. The term multiple supports originates from the algorithm: in each iteration, the best support indices are picked based on maximum quality created by discovering correlation for a particular length of support. The new algorithm builds on our previously derived halting condition for Least Support Orthogonal Matching Pursuit (LS-OMP) for clear and noisy signals. For better reconstructed results, the SS-MSMP algorithm provides the recovery of the support set for long signals such as signals used in WBAN. Numerical experiments demonstrate that the new suggested algorithm performs well compared to existing algorithms in terms of many factors used for reconstruction performance. Copyright © 2016 Elsevier Ireland Ltd.
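    As background for the greedy pursuit family discussed above, the following is a minimal sketch of plain matching pursuit. It is neither SS-MSMP nor LS-OMP, and it omits the least-squares step of orthogonal matching pursuit. It assumes that the dictionary atoms are unit-norm vectors given as Python lists. The function name and the stopping tolerance are assumptions made for illustration.

```python
def matching_pursuit(signal, atoms, n_iter, tol=1e-12):
    """Plain matching pursuit over a dictionary of unit-norm atoms.

    Each iteration picks the atom most correlated with the current
    residual, records its coefficient, and subtracts its contribution.
    Returns (coeffs, residual), where coeffs maps atom index to coefficient.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(n_iter):
        # Inner product of the residual with every atom.
        corr = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        if abs(corr[k]) < tol:
            break  # residual is (numerically) orthogonal to all atoms
        coeffs[k] = coeffs.get(k, 0.0) + corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


if __name__ == "__main__":
    atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(matching_pursuit([3.0, 0.0, -2.0], atoms, n_iter=2))
```

    The selected indices form the support estimate; pursuit variants differ mainly in how many indices they pick per iteration and in when they stop.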

  14. Biclustering of gene expression data using reactive greedy randomized adaptive search procedure

    PubMed Central

    Dharan, Smitha; Nair, Achuthsankar S

    2009-01-01

    Background: Biclustering algorithms belong to a distinct class of clustering algorithms that perform simultaneous clustering of both rows and columns of the gene expression matrix and can be a very useful analysis tool when some genes have multiple functions and experimental conditions are diverse. Cheng and Church introduced a measure called the mean squared residue score to evaluate the quality of a bicluster, and it has become one of the most popular measures to search for biclusters. In this paper, we review basic concepts of the metaheuristic Greedy Randomized Adaptive Search Procedure (GRASP), namely its construction and local search phases, and propose a new method, a variant of GRASP called Reactive Greedy Randomized Adaptive Search Procedure (Reactive GRASP), to detect significant biclusters from large microarray datasets. The method has two major steps. First, high quality bicluster seeds are generated by means of k-means clustering. In the second step, these seeds are grown using the Reactive GRASP, in which the basic parameter that defines the restrictiveness of the candidate list is self-adjusted, depending on the quality of the solutions found previously. Results: We performed statistical and biological validations of the biclusters obtained and evaluated the method against the results of basic GRASP as well as the classic work of Cheng and Church. The experimental results indicate that the Reactive GRASP approach outperforms the basic GRASP algorithm and the Cheng and Church approach. Conclusion: The Reactive GRASP approach for the detection of significant biclusters is robust and does not require calibration efforts. PMID:19208127

  15. An Improved Hybrid Encoding Cuckoo Search Algorithm for 0-1 Knapsack Problems

    PubMed Central

    Feng, Yanhong; Jia, Ke; He, Yichao

    2014-01-01

    Cuckoo search (CS) is a new robust swarm intelligence method that is based on the brood parasitism of some cuckoo species. In this paper, an improved hybrid encoding cuckoo search algorithm (ICS) with a greedy strategy is put forward for solving 0-1 knapsack problems. First, to solve binary optimization problems with ICS, the cuckoo search over a continuous space is transformed, based on the idea of individual hybrid encoding, into a synchronous evolution search over a discrete space. Subsequently, the concept of confidence interval (CI) is introduced; on this basis, a new position update is designed and genetic mutation with a small probability is introduced. The former enables the population to move towards the global best solution rapidly in every generation, and the latter can effectively prevent the ICS from becoming trapped in a local optimum. Furthermore, the greedy transform method is used to repair infeasible solutions and optimize feasible solutions. Experiments with a large number of KP instances show the effectiveness of the proposed algorithm and its ability to achieve good quality solutions. PMID:24527026
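
    The repair-then-optimize idea can be sketched as follows. This is a common value-to-weight-ratio greedy repair for 0-1 knapsack solutions, shown under the assumption that all weights are positive; it is not claimed to match the paper's greedy transform method in detail, and the function name is hypothetical.

```python
def greedy_repair(bits, values, weights, capacity):
    """Make a 0-1 knapsack solution feasible, then greedily fill spare capacity."""
    # Item indices sorted by value density, best first (weights assumed > 0).
    order = sorted(range(len(bits)), key=lambda i: values[i] / weights[i], reverse=True)
    chosen = list(bits)
    load = sum(w for b, w in zip(chosen, weights) if b)
    # Repair phase: drop the least dense selected items until the load fits.
    for i in reversed(order):
        if load <= capacity:
            break
        if chosen[i]:
            chosen[i] = 0
            load -= weights[i]
    # Optimize phase: add the densest unselected items that still fit.
    for i in order:
        if not chosen[i] and load + weights[i] <= capacity:
            chosen[i] = 1
            load += weights[i]
    return chosen


repaired = greedy_repair([1, 1, 1], [60, 100, 120], [10, 20, 30], 50)
```

    In a population-based search, a routine like this would typically be applied to each candidate after its position update, so that every evaluated individual is feasible.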

  16. Fractal dimension of interfaces in Edwards-Anderson spin glasses for up to six space dimensions.

    PubMed

    Wang, Wenlong; Moore, M A; Katzgraber, Helmut G

    2018-03-01

    The fractal dimension of domain walls produced by changing the boundary conditions from periodic to antiperiodic in one spatial direction is studied using both the strong-disorder renormalization group algorithm and the greedy algorithm for the Edwards-Anderson Ising spin-glass model for up to six space dimensions. We find that for five or fewer space dimensions, the fractal dimension is lower than the space dimension. This means that interfaces are not space filling, thus implying that replica symmetry breaking is absent in space dimensions fewer than six. However, the fractal dimension approaches the space dimension in six dimensions, indicating that replica symmetry breaking occurs above six dimensions. In two space dimensions, the strong-disorder renormalization group results for the fractal dimension are in good agreement with essentially exact numerical results, but the small difference is significant. We discuss the origin of this close agreement. For the greedy algorithm there is analytical expectation that the fractal dimension is equal to the space dimension in six dimensions and our numerical results are consistent with this expectation.

  17. A statistical-based scheduling algorithm in automated data path synthesis

    NASA Technical Reports Server (NTRS)

    Jeon, Byung Wook; Lursinsap, Chidchanok

    1992-01-01

    In this paper, we propose a new heuristic scheduling algorithm based on the statistical analysis of the cumulative frequency distribution of operations among control steps. It has a tendency of escaping from local minima and therefore reaching a globally optimal solution. The presented algorithm considers the real world constraints such as chained operations, multicycle operations, and pipelined data paths. The result of the experiment shows that it gives optimal solutions, even though it is greedy in nature.

  18. Assessment of metal ion concentration in water with structured feature selection.

    PubMed

    Naula, Pekka; Airola, Antti; Pihlasalo, Sari; Montoya Perez, Ileana; Salakoski, Tapio; Pahikkala, Tapio

    2017-10-01

    We propose a cost-effective system for the determination of metal ion concentration in water, addressing a central issue in water resources management. The system combines novel luminometric label array technology with a machine learning algorithm that selects a minimal number of array reagents (modulators) and liquid sample dilutions that enable accurate quantification. The algorithm is able to identify the optimal modulators and sample dilutions, leading to cost reductions since less manual labour and fewer resources are needed. Inferring the ion detector involves a unique type of structured feature selection problem, which we formalize in this paper. We propose a novel Cartesian greedy forward feature selection algorithm for solving the problem. The novel algorithm was evaluated in the concentration assessment of five metal ions and the performance was compared to two known feature selection approaches. The results demonstrate that the proposed system can assist in lowering the costs with minimal loss in accuracy. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Community-aware task allocation for social networked multiagent systems.

    PubMed

    Wang, Wanyuan; Jiang, Yichuan

    2014-09-01

    In this paper, we propose a novel community-aware task allocation model for social networked multiagent systems (SN-MASs), where the agents' cooperation domain is constrained in community and each agent can negotiate only with its intracommunity member agents. Under such community-aware scenarios, we prove that it remains NP-hard to maximize system overall profit. To solve this problem effectively, we present a heuristic algorithm that is composed of three phases: 1) task selection: select the desirable task to be allocated preferentially; 2) allocation to community: allocate the selected task to communities based on a significant task-first heuristic; and 3) allocation to agent: negotiate resources for the selected task based on a nonoverlap agent-first and breadth-first resource negotiation mechanism. Through the theoretical analyses and experiments, the advantages of our presented heuristic algorithm and community-aware task allocation model are validated. 1) Our presented heuristic algorithm performs very closely to the benchmark exponential brute-force optimal algorithm and the network flow-based greedy algorithm in terms of system overall profit in small-scale applications. Moreover, in the large-scale applications, the presented heuristic algorithm achieves approximately the same overall system profit, but significantly reduces the computational load compared with the greedy algorithm. 2) Our presented community-aware task allocation model reduces the system communication cost compared with the previous global-aware task allocation model and improves the system overall profit greatly compared with the previous local neighbor-aware task allocation model.

  20. Scalable Iterative Classification for Sanitizing Large-Scale Datasets

    PubMed Central

    Li, Bo; Vorobeychik, Yevgeniy; Li, Muqun; Malin, Bradley

    2017-01-01

    Cheap ubiquitous computing enables the collection of massive amounts of personal data in a wide variety of domains. Many organizations aim to share such data while obscuring features that could disclose personally identifiable information. Much of this data exhibits weak structure (e.g., text), such that machine learning approaches have been developed to detect and remove identifiers from it. While learning is never perfect, and relying on such approaches to sanitize data can leak sensitive information, a small risk is often acceptable. Our goal is to balance the value of published data and the risk of an adversary discovering leaked identifiers. We model data sanitization as a game between 1) a publisher who chooses a set of classifiers to apply to data and publishes only instances predicted as non-sensitive and 2) an attacker who combines machine learning and manual inspection to uncover leaked identifying information. We introduce a fast iterative greedy algorithm for the publisher that ensures a low utility for a resource-limited adversary. Moreover, using five text data sets we illustrate that our algorithm leaves virtually no automatically identifiable sensitive instances for a state-of-the-art learning algorithm, while sharing over 93% of the original data, and completes after at most 5 iterations. PMID:28943741

  1. Greedy algorithms for diffuse optical tomography reconstruction

    NASA Astrophysics Data System (ADS)

    Dileep, B. P. V.; Das, Tapan; Dutta, Pranab K.

    2018-03-01

    Diffuse optical tomography (DOT) is a noninvasive imaging modality that reconstructs the optical parameters of a highly scattering medium. However, the inverse problem of DOT is ill-posed and highly nonlinear due to the zig-zag propagation of photons that diffuse through the cross section of tissue. The conventional DOT imaging methods iteratively compute the solution of the forward diffusion equation solver, which makes the problem computationally expensive. Also, these methods fail when the geometry is complex. Recently, the theory of compressive sensing (CS) has received considerable attention because of its efficient use in biomedical imaging applications. The objective of this paper is to solve a given DOT inverse problem using the compressive sensing framework. Various greedy algorithms, such as orthogonal matching pursuit (OMP), compressive sampling matching pursuit (CoSaMP), stagewise orthogonal matching pursuit (StOMP), regularized orthogonal matching pursuit (ROMP), and simultaneous orthogonal matching pursuit (S-OMP), have been studied to reconstruct the change in the absorption parameter, i.e., Δα, from the boundary data. The greedy algorithms have also been validated experimentally on a paraffin wax rectangular phantom through a well-designed experimental setup. For comparison, we have also studied conventional DOT methods such as the least squares method and truncated singular value decomposition (TSVD). One of the main features of this work is the use of fewer source-detector pairs, which can facilitate the use of DOT in routine screening applications. Performance metrics such as mean square error (MSE), normalized mean square error (NMSE), structural similarity index (SSIM), and peak signal to noise ratio (PSNR) have been used to evaluate the performance of the algorithms mentioned in this paper. Extensive simulation results confirm that CS-based DOT reconstruction outperforms the conventional DOT imaging methods in terms of computational efficiency. The main advantage of this study is that the forward diffusion equation solver need not be repeatedly solved.

  2. A Target Coverage Scheduling Scheme Based on Genetic Algorithms in Directional Sensor Networks

    PubMed Central

    Gil, Joon-Min; Han, Youn-Hee

    2011-01-01

    As a promising tool for monitoring the physical world, directional sensor networks (DSNs) consisting of a large number of directional sensors are attracting increasing attention. As directional sensors in DSNs have limited battery power and restricted angles of sensing range, maximizing the network lifetime while monitoring all the targets in a given area remains a challenge. A major technique to conserve the energy of directional sensors is to use a node wake-up scheduling protocol by which some sensors remain active to provide sensing services, while the others are inactive to conserve their energy. In this paper, we first address a Maximum Set Covers for DSNs (MSCD) problem, which is known to be NP-complete, and present a greedy algorithm-based target coverage scheduling scheme that can solve this problem by heuristics. This scheme is used as a baseline for comparison. We then propose a target coverage scheduling scheme based on a genetic algorithm that can find the optimal cover sets to extend the network lifetime while monitoring all targets by the evolutionary global search technique. To verify and evaluate these schemes, we conducted simulations and showed that the schemes can contribute to extending the network lifetime. Simulation results indicated that the genetic algorithm-based scheduling scheme had better performance than the greedy algorithm-based scheme in terms of maximizing network lifetime. PMID:22319387

  3. Iterated greedy algorithms to minimize the total family flow time for job-shop scheduling with job families and sequence-dependent set-ups

    NASA Astrophysics Data System (ADS)

    Kim, Ji-Su; Park, Jung-Hyeon; Lee, Dong-Ho

    2017-10-01

    This study addresses a variant of job-shop scheduling in which jobs are grouped into job families, but they are processed individually. The problem can be found in various industrial systems, especially in reprocessing shops of remanufacturing systems. If the reprocessing shop is a job-shop type and has the component-matching requirements, it can be regarded as a job shop with job families since the components of a product constitute a job family. In particular, sequence-dependent set-ups in which set-up time depends on the job just completed and the next job to be processed are also considered. The objective is to minimize the total family flow time, i.e. the maximum among the completion times of the jobs within a job family. A mixed-integer programming model is developed and two iterated greedy algorithms with different local search methods are proposed. Computational experiments were conducted on modified benchmark instances and the results are reported.
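
    The objective can be made concrete with a small sketch. It assumes every job is available at time zero, so that a family's flow time equals the latest completion time among its jobs, and that the total is the sum of these values over all families. The function and variable names are illustrative and not taken from the paper.

```python
def total_family_flow_time(completion_times, family_of):
    """Sum, over job families, of the latest completion time within each family.

    completion_times: dict mapping job -> completion time in a given schedule.
    family_of:        dict mapping job -> family identifier.
    Assumes all jobs are released at time 0 (an assumption of this sketch).
    """
    family_finish = {}
    for job, finish in completion_times.items():
        family = family_of[job]
        # A family is complete only when its last job is complete.
        family_finish[family] = max(family_finish.get(family, 0), finish)
    return sum(family_finish.values())


completion = {"a1": 4, "a2": 9, "b1": 6, "b2": 3}
families = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
objective = total_family_flow_time(completion, families)  # 9 + 6 = 15
```

    An iterated greedy algorithm would evaluate this quantity for each candidate schedule produced by its destruction and construction steps.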

  4. Extraction of process zones and low-dimensional attractive subspaces in stochastic fracture mechanics

    PubMed Central

    Kerfriden, P.; Schmidt, K.M.; Rabczuk, T.; Bordas, S.P.A.

    2013-01-01

    We propose to identify process zones in heterogeneous materials by tailored statistical tools. The process zone is redefined as the part of the structure where the random process cannot be correctly approximated in a low-dimensional deterministic space. Such a low-dimensional space is obtained by a spectral analysis performed on pre-computed solution samples. A greedy algorithm is proposed to identify both process zone and low-dimensional representative subspace for the solution in the complementary region. In addition to the novelty of the tools proposed in this paper for the analysis of localised phenomena, we show that the reduced space generated by the method is a valid basis for the construction of a reduced order model. PMID:27069423

  5. Optical network unit placement in Fiber-Wireless (FiWi) access network by Moth-Flame optimization algorithm

    NASA Astrophysics Data System (ADS)

    Singh, Puja; Prakash, Shashi

    2017-07-01

    Hybrid wireless-optical broadband access network (WOBAN) or Fiber-Wireless (FiWi) is the integration of a wireless access network and an optical network. This hybrid multi-domain network adopts the advantages of the wireless and optical domains and serves the demand of technology-savvy users. FiWi exhibits the properties of cost effectiveness, robustness, flexibility, high capacity, and reliability, and is self-organized. The Optical Network Unit (ONU) placement problem in FiWi contributes to simplifying the network design and enhances performance in terms of cost efficiency and increased throughput. Several individual-based algorithms, such as Simulated Annealing (SA), Tabu Search, etc., have been suggested for ONU placement, but these algorithms suffer from premature convergence (trapping in a local optimum). The present research work undertakes the deployment of FiWi and proposes a novel nature-inspired heuristic paradigm called the Moth-Flame optimization (MFO) algorithm for the placement of multiple optical network units. MFO is a population-based algorithm. Population-based algorithms are better at avoiding local optima. The simulation results are compared with the existing Greedy and Simulated Annealing algorithms to optimize the position of ONUs. To the best of our knowledge, the MFO algorithm has been used for the first time in this domain; moreover, it has been able to provide very promising and competitive results. The performance of the MFO algorithm has been analyzed by varying the 'b' parameter. The MFO algorithm results in faster convergence than the existing strategies of Greedy and SA and returns a lower value of the overall cost function. The results also exhibit the dependence of the objective function on the distribution of wireless users.

  6. Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

    PubMed

    Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

    2017-04-01

    As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only cure to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depends on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. To diagnose Chronic Kidney Disease, two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, a classifier subset evaluator with a greedy stepwise search engine and a wrapper subset evaluator with the Best First search engine were used. In the filter approach, a correlation feature selection subset evaluator with a greedy stepwise search engine and a filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier using the filtered subset evaluator with the Best First search engine feature selection method has a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease compared to the other selected methods.

  7. Functional Data Approximation on Bounded Domains using Polygonal Finite Elements.

    PubMed

    Cao, Juan; Xiao, Yanyang; Chen, Zhonggui; Wang, Wenping; Bajaj, Chandrajit

    2018-07-01

    We construct and analyze piecewise approximations of functional data on arbitrary 2D bounded domains using generalized barycentric finite elements, and particularly quadratic serendipity elements for planar polygons. We compare approximation qualities (precision/convergence) of these partition-of-unity finite elements through numerical experiments, using Wachspress coordinates, natural neighbor coordinates, Poisson coordinates, mean value coordinates, and quadratic serendipity bases over polygonal meshes on the domain. For a convex n -sided polygon, the quadratic serendipity elements have 2 n basis functions, associated in a Lagrange-like fashion to each vertex and each edge midpoint, rather than the usual n ( n + 1)/2 basis functions to achieve quadratic convergence. Two greedy algorithms are proposed to generate Voronoi meshes for adaptive functional/scattered data approximations. Experimental results show space/accuracy advantages for these quadratic serendipity finite elements on polygonal domains versus traditional finite elements over simplicial meshes. Polygonal meshes and parameter coefficients of the quadratic serendipity finite elements obtained by our greedy algorithms can be further refined using an L 2 -optimization to improve the piecewise functional approximation. We conduct several experiments to demonstrate the efficacy of our algorithm for modeling features/discontinuities in functional data/image approximation.

  8. Interactive outlining: an improved approach using active contours

    NASA Astrophysics Data System (ADS)

    Daneels, Dirk; van Campenhout, David; Niblack, Carlton W.; Equitz, Will; Barber, Ron; Fierens, Freddy

    1993-04-01

    The purpose of our work is to outline objects on images in an interactive environment. We use an improved method based on energy minimizing active contours or `snakes.' Kass et al., proposed a variational technique; Amini used dynamic programming; and Williams and Shah introduced a fast, greedy algorithm. We combine the advantages of the latter two methods in a two-stage algorithm. The first stage is a greedy procedure that provides fast initial convergence. It is enhanced with a cost term that extends over a large number of points to avoid oscillations. The second stage, when accuracy becomes important, uses dynamic programming. This step is accelerated by the use of alternating search neighborhoods and by dropping stable points from the iterations. We have also added several features for user interaction. First, the user can define points of high confidence. Mathematically, this results in an extra cost term and, in that way, the robustness in difficult areas (e.g., noisy edges, sharp corners) is improved. We also give the user the possibility of incremental contour tracking, thus providing feedback on the refinement process. The algorithm has been tested on numerous photographic clip art images and extensive tests on medical images are in progress.

  9. [The study of medical supplies automation replenishment algorithm in hospital on medical supplies supplying chain].

    PubMed

    Sheng, Xi

    2012-07-01

    This thesis studies an automated replenishment algorithm for hospitals within the medical supplies supply chain. The mathematical model and algorithm for automated replenishment of medical supplies are designed with reference to practical hospital data, applying inventory theory, a greedy algorithm, and a partition algorithm. The automated replenishment algorithm is shown to compute the distribution amounts of medical supplies automatically and to optimize the distribution scheme. It is concluded that the inventory-theory model and algorithm, if applied to the circulation of medical supplies, could provide theoretical and technological support for automated replenishment of hospital medical supplies within the supply chain.

  10. Heuristic algorithms for solving of the tool routing problem for CNC cutting machines

    NASA Astrophysics Data System (ADS)

    Chentsov, P. A.; Petunin, A. A.; Sesekin, A. N.; Shipacheva, E. N.; Sholohov, A. E.

    2015-11-01

    The article is devoted to the problem of minimizing the path of the cutting tool for shape cutting machines. This problem can be interpreted as a generalized traveling salesman problem. An earlier version of the dynamic programming method was developed to solve this problem. Unfortunately, this method can process no more than thirty circuits. In this regard, the task of constructing a quasi-optimal route becomes relevant. In this paper we propose several quasi-optimal greedy algorithms. A comparison of the results of the exact and approximate algorithms is given.
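
    The abstract does not describe the proposed greedy algorithms, so the following is only a sketch of the simplest greedy rule for a generalized traveling salesman formulation: from the current tool position, move to the nearest candidate entry point on any circuit not yet visited. The representation of each circuit as a list of candidate entry points, the assumption that the tool leaves a circuit at the point where it entered, and the function name are all hypothetical.

```python
import math


def greedy_tool_route(start, contours):
    """Nearest-neighbor route over contours, each given as candidate (x, y) entry points.

    Returns the visiting order as (contour index, entry point) pairs and the
    total idle travel length. Cutting along the contour itself and the return
    move to the start are ignored in this sketch.
    """
    position = start
    remaining = set(range(len(contours)))
    route, length = [], 0.0
    while remaining:
        # Greedy choice: closest entry point among all unvisited contours.
        dist, idx, point = min(
            (math.dist(position, p), c, p)
            for c in remaining
            for p in contours[c]
        )
        route.append((idx, point))
        length += dist
        position = point
        remaining.remove(idx)
    return route, length


contours = [[(5, 0), (0, 5)], [(1, 0)], [(1, 3)]]
route, length = greedy_tool_route((0, 0), contours)
```

    Such a rule runs in time polynomial in the number of entry points, in contrast to the exact dynamic programming approach, at the price of no optimality guarantee.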

  11. Using a Card Trick to Teach Discrete Mathematics

    ERIC Educational Resources Information Center

    Simonson, Shai; Holm, Tara S.

    2003-01-01

    We present a card trick that can be used to review or teach a variety of topics in discrete mathematics. We address many subjects, including permutations, combinations, functions, graphs, depth first search, the pigeonhole principle, greedy algorithms, and concepts from number theory. Moreover, the trick motivates the use of computers in…

  12. The In-Transit Vigilant Covering Tour Problem of Routing Unmanned Ground Vehicles

    DTIC Science & Technology

    2012-08-01

    of vertices in both vertex sets V and W, rather than exclusively in the vertex set V. A metaheuristic algorithm which follows the Greedy Randomized...window (VRPTW) approach, with the application of Java-encoded metaheuristic, was used [O’Rourke et al., 2001] for the dynamic routing of UAVs. Harder et...minimize both the two conflicting objectives; tour length and the coverage distance via a multi-objective evolutionary algorithm. This approach avoids a

  13. A greedy, graph-based algorithm for the alignment of multiple homologous gene lists.

    PubMed

    Fostier, Jan; Proost, Sebastian; Dhoedt, Bart; Saeys, Yvan; Demeester, Piet; Van de Peer, Yves; Vandepoele, Klaas

    2011-03-15

    Many comparative genomics studies rely on the correct identification of homologous genomic regions using accurate alignment tools. In such case, the alphabet of the input sequences consists of complete genes, rather than nucleotides or amino acids. As optimal multiple sequence alignment is computationally impractical, a progressive alignment strategy is often employed. However, such an approach is susceptible to the propagation of alignment errors in early pairwise alignment steps, especially when dealing with strongly diverged genomic regions. In this article, we present a novel accurate and efficient greedy, graph-based algorithm for the alignment of multiple homologous genomic segments, represented as ordered gene lists. Based on provable properties of the graph structure, several heuristics are developed to resolve local alignment conflicts that occur due to gene duplication and/or rearrangement events on the different genomic segments. The performance of the algorithm is assessed by comparing the alignment results of homologous genomic segments in Arabidopsis thaliana to those obtained by using both a progressive alignment method and an earlier graph-based implementation. Especially for datasets that contain strongly diverged segments, the proposed method achieves a substantially higher alignment accuracy, and proves to be sufficiently fast for large datasets including a few dozens of eukaryotic genomes. http://bioinformatics.psb.ugent.be/software. The algorithm is implemented as a part of the i-ADHoRe 3.0 package.

  14. Large-Scale Dynamic Observation Planning for Unmanned Surface Vessels

    DTIC Science & Technology

    2007-06-01

    programming language. In addition, the useful development software NetBeans IDE is free and makes the use of Java very user-friendly. 92...3. We implemented the greedy and 3PAA algorithms in Java using the NetBeans IDE version 5.5. 4. The test datasets were generated in MATLAB. 5

  15. Context-Sensitive Grammar Transform: Compression and Pattern Matching

    NASA Astrophysics Data System (ADS)

    Maruyama, Shirou; Tanaka, Youhei; Sakamoto, Hiroshi; Takeda, Masayuki

    A framework of context-sensitive grammar transform for speeding up compressed pattern matching (CPM) is proposed. A greedy compression algorithm with the transform model is presented as well as a Knuth-Morris-Pratt (KMP)-type compressed pattern matching algorithm. The compression ratio is a match for gzip and Re-Pair, and the search speed of our CPM algorithm is almost twice as fast as the KMP-type CPM algorithm on Byte-Pair-Encoding by Shibata et al.[18], and in the case of short patterns, faster than the Boyer-Moore-Horspool algorithm with the stopper encoding by Rautio et al.[14], which is regarded as one of the best combinations that allows a practically fast search.

  16. Greedy data transportation scheme with hard packet deadlines for wireless ad hoc networks.

    PubMed

    Lee, HyungJune

    2014-01-01

    We present a greedy data transportation scheme with hard packet deadlines in ad hoc sensor networks of stationary nodes and multiple mobile nodes with scheduled trajectory path and arrival time. In the proposed routing strategy, each stationary ad hoc node en route decides whether to relay to a shortest-path stationary node toward the destination or to a passing-by mobile node that will carry the packet closer to the destination. We aim to utilize mobile nodes to minimize the total routing cost as long as the selected route can satisfy the end-to-end packet deadline. We evaluate our proposed routing algorithm in terms of routing cost, packet delivery ratio, packet delivery time, and usability of mobile nodes based on network level simulations. Simulation results show that our proposed algorithm fully exploits the time remaining until the packet deadline, turning it into networking benefits by reducing the overall routing cost and improving packet delivery performance. Also, we demonstrate that the routing scheme guarantees packet delivery with hard deadlines, contributing to QoS improvement in various network services.

  17. Greedy Data Transportation Scheme with Hard Packet Deadlines for Wireless Ad Hoc Networks

    PubMed Central

    Lee, HyungJune

    2014-01-01

    We present a greedy data transportation scheme with hard packet deadlines in ad hoc sensor networks of stationary nodes and multiple mobile nodes with scheduled trajectory path and arrival time. In the proposed routing strategy, each stationary ad hoc node en route decides whether to relay to a shortest-path stationary node toward the destination or to a passing-by mobile node that will carry the packet closer to the destination. We aim to utilize mobile nodes to minimize the total routing cost as long as the selected route can satisfy the end-to-end packet deadline. We evaluate our proposed routing algorithm in terms of routing cost, packet delivery ratio, packet delivery time, and usability of mobile nodes based on network level simulations. Simulation results show that our proposed algorithm fully exploits the time remaining until the packet deadline, turning it into networking benefits by reducing the overall routing cost and improving packet delivery performance. Also, we demonstrate that the routing scheme guarantees packet delivery with hard deadlines, contributing to QoS improvement in various network services. PMID:25258736

  18. Spatial cluster detection using dynamic programming.

    PubMed

    Sverchkov, Yuriy; Jiang, Xia; Cooper, Gregory F

    2012-03-25

    The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP) estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of Influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on-par with baseline methods in the task of Bayesian model averaging. We conclude that the dynamic programming algorithm performs on-par with other available methods for spatial cluster detection and point to its low computational cost and extendability as advantages in favor of further research and use of the algorithm.

  19. Spatial cluster detection using dynamic programming

    PubMed Central

    2012-01-01

    Background: The task of spatial cluster detection involves finding spatial regions where some property deviates from the norm or the expected value. In a probabilistic setting this task can be expressed as finding a region where some event is significantly more likely than usual. Spatial cluster detection is of interest in fields such as biosurveillance, mining of astronomical data, military surveillance, and analysis of fMRI images. In almost all such applications we are interested both in the question of whether a cluster exists in the data, and if it exists, we are interested in finding the most accurate characterization of the cluster. Methods: We present a general dynamic programming algorithm for grid-based spatial cluster detection. The algorithm can be used for both Bayesian maximum a-posteriori (MAP) estimation of the most likely spatial distribution of clusters and Bayesian model averaging over a large space of spatial cluster distributions to compute the posterior probability of an unusual spatial clustering. The algorithm is explained and evaluated in the context of a biosurveillance application, specifically the detection and identification of influenza outbreaks based on emergency department visits. A relatively simple underlying model is constructed for the purpose of evaluating the algorithm, and the algorithm is evaluated using the model and semi-synthetic test data. Results: When compared to baseline methods, tests indicate that the new algorithm can improve MAP estimates under certain conditions: the greedy algorithm we compared our method to was found to be more sensitive to smaller outbreaks, while as the size of the outbreaks increases, in terms of area affected and proportion of individuals affected, our method overtakes the greedy algorithm in spatial precision and recall. The new algorithm performs on par with baseline methods in the task of Bayesian model averaging. Conclusions: We conclude that the dynamic programming algorithm performs on par with other available methods for spatial cluster detection and point to its low computational cost and extendability as advantages in favor of further research and use of the algorithm. PMID:22443103

  20. Text Summarization Model based on Maximum Coverage Problem and its Variant

    NASA Astrophysics Data System (ADS)

    Takamura, Hiroya; Okumura, Manabu

    We discuss text summarization in terms of the maximum coverage problem and its variant. To solve the optimization problem, we applied several decoding algorithms, including ones never before used in this summarization formulation, such as a greedy algorithm with a performance guarantee, a randomized algorithm, and a branch-and-bound method. We conducted comparative experiments. On the basis of the experimental results, we also augment the summarization model so that it takes into account the relevance to the document cluster. Through experiments, we showed that the augmented model is at least comparable to the best-performing method of DUC'04.
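
To illustrate the kind of greedy decoder this abstract refers to, here is a minimal sketch of greedy maximum coverage. It is not the authors' decoder: it assumes a simple cardinality budget (a fixed number of sentences) rather than a summary-length budget, and the function name `greedy_max_coverage` and the toy concept sets are hypothetical. For this cardinality-constrained, unweighted form, the greedy rule is known to cover at least a (1 - 1/e) fraction of what an optimal selection covers.

```python
def greedy_max_coverage(sentences, budget):
    """Pick up to `budget` sentences, each time taking the one that covers
    the most not-yet-covered concepts (ties go to the lowest index)."""
    covered = set()
    chosen = []
    for _ in range(budget):
        best_idx, best_gain = None, 0
        for idx, concepts in enumerate(sentences):
            if idx in chosen:
                continue
            gain = len(concepts - covered)  # marginal coverage of this sentence
            if gain > best_gain:
                best_idx, best_gain = idx, gain
        if best_idx is None:  # no remaining sentence adds coverage
            break
        chosen.append(best_idx)
        covered |= sentences[best_idx]
    return chosen, covered


# Hypothetical toy data: each sentence is the set of concepts it mentions.
sentences = [
    {"greedy", "algorithm", "coverage"},
    {"summary", "coverage"},
    {"greedy", "summary", "budget", "length"},
    {"algorithm"},
]
chosen, covered = greedy_max_coverage(sentences, budget=2)
print(chosen, sorted(covered))
```

With a length budget instead of a sentence count, the plain rule above no longer carries the same guarantee; a cost-aware variant would be needed.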

  1. Discrete Particle Swarm Optimization Routing Protocol for Wireless Sensor Networks with Multiple Mobile Sinks.

    PubMed

    Yang, Jin; Liu, Fagui; Cao, Jianneng; Wang, Liangming

    2016-07-14

    Mobile sinks can achieve load-balancing and energy-consumption balancing across wireless sensor networks (WSNs). However, the frequent change of the paths between source nodes and the sinks caused by sink mobility introduces significant overhead in terms of energy and packet delays. To enhance network performance of WSNs with mobile sinks (MWSNs), we present an efficient routing strategy, which is formulated as an optimization problem and employs the particle swarm optimization algorithm (PSO) to build the optimal routing paths. However, the conventional PSO is insufficient to solve discrete routing optimization problems. Therefore, a novel greedy discrete particle swarm optimization with memory (GMDPSO) is put forward to address this problem. In GMDPSO, the particle position and velocity of traditional PSO are redefined for the discrete MWSN scenario. The particle updating rule is also reconsidered based on the subnetwork topology of MWSNs. In addition, by improving greedy forwarding routing, a greedy search strategy is designed to drive particles to find a better position quickly. Furthermore, the search history is memorized to accelerate convergence. Simulation results demonstrate that our new protocol significantly improves robustness and adapts to rapid topological changes with multiple mobile sinks, while efficiently reducing the communication overhead and the energy consumption.

  2. A comparison of 12 algorithms for matching on the propensity score.

    PubMed

    Austin, Peter C

    2014-03-15

    Propensity-score matching is increasingly being used to reduce the confounding that can occur in observational studies examining the effects of treatments or interventions on outcomes. We used Monte Carlo simulations to examine the following algorithms for forming matched pairs of treated and untreated subjects: optimal matching, greedy nearest neighbor matching without replacement, and greedy nearest neighbor matching without replacement within specified caliper widths. For each of the latter two algorithms, we examined four different sub-algorithms defined by the order in which treated subjects were selected for matching to an untreated subject: lowest to highest propensity score, highest to lowest propensity score, best match first, and random order. We also examined matching with replacement. We found that (i) nearest neighbor matching induced the same balance in baseline covariates as did optimal matching; (ii) when at least some of the covariates were continuous, caliper matching tended to induce balance on baseline covariates that was at least as good as the other algorithms; (iii) caliper matching tended to result in estimates of treatment effect with less bias compared with optimal and nearest neighbor matching; (iv) optimal and nearest neighbor matching resulted in estimates of treatment effect with negligibly less variability than did caliper matching; (v) caliper matching had amongst the best performance when assessed using mean squared error; (vi) the order in which treated subjects were selected for matching had at most a modest effect on estimation; and (vii) matching with replacement did not have superior performance compared with caliper matching without replacement. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
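
The greedy matching family described in this abstract can be sketched as follows. This is an illustrative sketch only, not the simulation code from the study: the function name `greedy_caliper_match`, the subject identifiers, the scores, and the caliper expressed directly on the propensity-score scale are all assumptions made for the example. The order of the `treated` list plays the role of the selection order that defines the sub-algorithms.

```python
def greedy_caliper_match(treated, untreated, caliper):
    """Greedy nearest-neighbor matching without replacement.

    `treated` is a list of (id, score) pairs processed in the given order;
    `untreated` maps id -> score. A pair is formed only if the closest
    remaining untreated subject lies within `caliper` of the treated score.
    """
    available = dict(untreated)  # copy, consumed as matches are made
    pairs = []
    for t_id, t_score in treated:
        if not available:
            break
        u_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[u_id] - t_score) <= caliper:
            pairs.append((t_id, u_id))
            del available[u_id]  # without replacement
    return pairs


# Hypothetical propensity scores.
treated = [("t1", 0.30), ("t2", 0.32), ("t3", 0.90)]
untreated = {"u1": 0.31, "u2": 0.34, "u3": 0.60}

low_to_high = greedy_caliper_match(treated, untreated, caliper=0.05)
high_to_low = greedy_caliper_match(treated[::-1], untreated, caliper=0.05)
print(low_to_high)
print(high_to_low)
```

On this toy data the two orders produce different pairings, and `t3` stays unmatched in both because no untreated subject falls inside its caliper. Setting `caliper` to `float("inf")` gives plain nearest-neighbor matching without a caliper.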

  3. A comparison of 12 algorithms for matching on the propensity score

    PubMed Central

    Austin, Peter C

    2014-01-01

    Propensity-score matching is increasingly being used to reduce the confounding that can occur in observational studies examining the effects of treatments or interventions on outcomes. We used Monte Carlo simulations to examine the following algorithms for forming matched pairs of treated and untreated subjects: optimal matching, greedy nearest neighbor matching without replacement, and greedy nearest neighbor matching without replacement within specified caliper widths. For each of the latter two algorithms, we examined four different sub-algorithms defined by the order in which treated subjects were selected for matching to an untreated subject: lowest to highest propensity score, highest to lowest propensity score, best match first, and random order. We also examined matching with replacement. We found that (i) nearest neighbor matching induced the same balance in baseline covariates as did optimal matching; (ii) when at least some of the covariates were continuous, caliper matching tended to induce balance on baseline covariates that was at least as good as the other algorithms; (iii) caliper matching tended to result in estimates of treatment effect with less bias compared with optimal and nearest neighbor matching; (iv) optimal and nearest neighbor matching resulted in estimates of treatment effect with negligibly less variability than did caliper matching; (v) caliper matching had amongst the best performance when assessed using mean squared error; (vi) the order in which treated subjects were selected for matching had at most a modest effect on estimation; and (vii) matching with replacement did not have superior performance compared with caliper matching without replacement. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:24123228

  4. Team formation and breakup in multiagent systems

    NASA Astrophysics Data System (ADS)

    Rao, Venkatesh Guru

    The goal of this dissertation is to pose and solve problems involving team formation and breakup in two specific multiagent domains: formation travel and space-based interferometric observatories. The methodology employed comprises elements drawn from control theory, scheduling theory and artificial intelligence (AI). The original contribution of the work comprises three elements. The first contribution, the partitioned state-space approach, is a technique for formulating and solving coordinated motion problems using calculus of variations techniques. The approach is applied to obtain optimal two-agent formation travel trajectories on graphs. The second contribution is the class of MixTeam algorithms, a class of team dispatchers that extends classical dispatching by accommodating team formation and breakup and exploration/exploitation learning. The algorithms are applied to observation scheduling and constellation geometry design for interferometric space telescopes. The use of feedback control for team scheduling is also demonstrated with these algorithms. The third contribution is the analysis of the optimality properties of greedy, or myopic, decision-making for a simple class of team dispatching problems. This analysis represents a first step towards the complete analysis of complex team schedulers such as the MixTeam algorithms. The contributions represent an extension to the literature on team dynamics in control theory. The broad conclusions that emerge from this research are that greedy or myopic decision-making strategies for teams perform well when specific parameters in the domain are weakly affected by an agent's actions, and that intelligent systems require a closer integration of domain knowledge in decision-making functions.

  5. Chemotaxis can provide biological organisms with good solutions to the travelling salesman problem.

    PubMed

    Reynolds, A M

    2011-05-01

    The ability to find good solutions to the traveling salesman problem can benefit some biological organisms. Bacterial infection would, for instance, be eradicated most promptly if cells of the immune system minimized the total distance they traveled when moving between bacteria. Similarly, foragers would maximize their net energy gain if the distance that they traveled between multiple dispersed prey items was minimized. The traveling salesman problem is one of the most intensively studied problems in combinatorial optimization. There are no efficient algorithms for even solving the problem approximately (within a guaranteed constant factor from the optimum) because the problem is nondeterministic polynomial time complete. The best approximate algorithms can typically find solutions within 1%-2% of the optimal, but these are computationally intensive and cannot be implemented by biological organisms. Biological organisms could, in principle, implement the less efficient greedy nearest-neighbor algorithm, i.e., always move to the nearest surviving target. Implementation of this strategy does, however, require quite sophisticated cognitive abilities and prior knowledge of the target locations. Here, with the aid of numerical simulations, it is shown that biological organisms can simply use chemotaxis to solve, or at worst provide good solutions (comparable to those found by the greedy algorithm) to, the traveling salesman problem when the targets are sources of a chemoattractant and are modest in number (n < 10). This applies to neutrophils and macrophages in microbial defense and to some predators.
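
The greedy nearest-neighbor rule mentioned in this abstract ("always move to the nearest surviving target") can be sketched in a few lines. This is a generic illustration, not the simulation used in the paper: the function names, the planar coordinates, and the choice of an open path that does not return to the start are assumptions for the example. It needs Python 3.8 or later for `math.dist`.

```python
import math


def nearest_neighbor_tour(points, start=0):
    """Visit every target once, always moving to the nearest unvisited one.
    Ties are broken by the lower index."""
    unvisited = set(range(len(points))) - {start}
    tour = [start]
    while unvisited:
        here = points[tour[-1]]
        nxt = min(unvisited, key=lambda i: (math.dist(here, points[i]), i))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def path_length(points, tour):
    """Total length of the open path that follows `tour`."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(tour, tour[1:]))


# Hypothetical target locations in the plane.
points = [(0, 0), (3, 0), (1, 0), (6, 0), (1, 1)]
tour = nearest_neighbor_tour(points)
length = path_length(points, tour)
print(tour, round(length, 3))
```

Note that each step requires the distance to every surviving target, which is the prior knowledge of target locations that the abstract points out.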

  6. A Particle Swarm Optimization-Based Approach with Local Search for Predicting Protein Folding.

    PubMed

    Yang, Cheng-Hong; Lin, Yu-Shiun; Chuang, Li-Yeh; Chang, Hsueh-Wei

    2017-10-01

    The hydrophobic-polar (HP) model is commonly used for predicting protein folding structures and hydrophobic interactions. This study developed a particle swarm optimization (PSO)-based algorithm combined with local search algorithms; specifically, the high exploration PSO (HEPSO) algorithm (which can execute global search processes) was combined with three local search algorithms (hill-climbing algorithm, greedy algorithm, and Tabu table), yielding the proposed HE-L-PSO algorithm. By using 20 known protein structures, we evaluated the performance of the HE-L-PSO algorithm in predicting protein folding in the HP model. The proposed HE-L-PSO algorithm exhibited favorable performance in predicting both short and long amino acid sequences with high reproducibility and stability, compared with seven reported algorithms. The HE-L-PSO algorithm yielded optimal solutions for all predicted protein folding structures. All HE-L-PSO-predicted protein folding structures possessed a hydrophobic core that is similar to normal protein folding.

  7. Biased and greedy random walks on two-dimensional lattices with quenched randomness: The greedy ant within a disordered environment

    NASA Astrophysics Data System (ADS)

    Mitran, T. L.; Melchert, O.; Hartmann, A. K.

    2013-12-01

    The main characteristics of biased greedy random walks (BGRWs) on two-dimensional lattices with real-valued quenched disorder on the lattice edges are studied. Here the disorder allows for negative edge weights. In previous studies, considering the negative-weight percolation (NWP) problem, this was shown to change the universality class of the existing, static percolation transition. In the presented study, four different types of BGRWs and an algorithm based on the ant colony optimization heuristic were considered. Regarding the BGRWs, the precise configurations of the lattice walks constructed during the numerical simulations were influenced by two parameters: a disorder parameter ρ that controls the amount of negative edge weights on the lattice and a bias strength B that governs the drift of the walkers along a certain lattice direction. The random walks are “greedy” in the sense that the local optimal choice of the walker is to preferentially traverse edges with a negative weight (associated with a net gain of “energy” for the walker). Here, the pivotal observable is the probability that, after termination, a lattice walk exhibits a total negative weight, which is here considered as percolating. The behavior of this observable as function of ρ for different bias strengths B is put under scrutiny. Upon tuning ρ, the probability to find such a feasible lattice walk increases from zero to 1. This is the key feature of the percolation transition in the NWP model. Here, we address the question how well the transition point ρc, resulting from numerically exact and “static” simulations in terms of the NWP model, can be resolved using simple dynamic algorithms that have only local information available, one of the basic questions in the physics of glassy systems.

  8. Mobile transporter path planning

    NASA Technical Reports Server (NTRS)

    Baffes, Paul; Wang, Lui

    1990-01-01

    The use of a genetic algorithm (GA) for solving the mobile transporter path planning problem is investigated. The mobile transporter is a traveling robotic vehicle proposed for the space station which must be able to reach any point of the structure autonomously. Elements of the genetic algorithm are explored in both a theoretical and experimental sense. Specifically, double crossover, greedy crossover, and tournament selection techniques are examined. Additionally, the use of local optimization techniques working in concert with the GA is also explored. Recent developments in genetic algorithm theory are shown to be particularly effective in a path planning problem domain, though problem areas can be cited which require more research.

  9. Privacy Protection on Multiple Sensitive Attributes

    NASA Astrophysics Data System (ADS)

    Li, Zhen; Ye, Xiaojun

    In recent years, a privacy model called k-anonymity has gained popularity in microdata release. As microdata may contain multiple sensitive attributes about an individual, the protection of multiple sensitive attributes has become an important problem. Unlike existing models for a single sensitive attribute, the extra associations among multiple sensitive attributes should be investigated. Two kinds of disclosure scenarios may arise because of logical associations. Q&S Diversity is checked to prevent these disclosure risks, with an α Requirement definition used to ensure the diversity requirement. Finally, a two-step greedy generalization algorithm is used to process the multiple sensitive attributes, dealing with quasi-identifiers and sensitive attributes respectively. We reduce the overall distortion as measured by Masking SA.

  10. Generation of referring expressions: assessing the Incremental Algorithm.

    PubMed

    van Deemter, Kees; Gatt, Albert; van der Sluis, Ielka; Power, Richard

    2012-07-01

    A substantial amount of recent work in natural language generation has focused on the generation of "one-shot" referring expressions whose only aim is to identify a target referent. Dale and Reiter's Incremental Algorithm (IA) is often thought to be the best algorithm for maximizing the similarity to referring expressions produced by people. We test this hypothesis by eliciting referring expressions from human subjects and computing the similarity between the expressions elicited and the ones generated by algorithms. It turns out that the success of the IA depends substantially on the "preference order" (PO) employed by the IA, particularly in complex domains. While some POs cause the IA to produce referring expressions that are very similar to expressions produced by human subjects, others cause the IA to perform worse than its main competitors; moreover, it turns out to be difficult to predict the success of a PO on the basis of existing psycholinguistic findings or frequencies in corpora. We also examine the computational complexity of the algorithms in question and argue that there are no compelling reasons for preferring the IA over some of its main competitors on these grounds. We conclude that future research on the generation of referring expressions should explore alternatives to the IA, focusing on algorithms, inspired by the Greedy Algorithm, which do not work with a fixed PO. Copyright © 2011 Cognitive Science Society, Inc.

  11. A tight upper bound for quadratic knapsack problems in grid-based wind farm layout optimization

    NASA Astrophysics Data System (ADS)

    Quan, Ning; Kim, Harrison M.

    2018-03-01

    The 0-1 quadratic knapsack problem (QKP) in wind farm layout optimization models possible turbine locations as nodes, and power loss due to wake effects between pairs of turbines as edges in a complete graph. The goal is to select up to a certain number of turbine locations such that the sum of selected node and edge coefficients is maximized. Finding the optimal solution to the QKP is difficult in general, but it is possible to obtain a tight upper bound on the QKP's optimal value which facilitates the use of heuristics to solve QKPs by giving a good estimate of the optimality gap of any feasible solution. This article applies an upper bound method that is especially well-suited to QKPs in wind farm layout optimization due to certain features of the formulation that reduce the computational complexity of calculating the upper bound. The usefulness of the upper bound was demonstrated by assessing the performance of the greedy algorithm for solving QKPs in wind farm layout optimization. The results show that the greedy algorithm produces good solutions within 4% of the optimal value for small to medium sized problems considered in this article.
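
As a rough illustration of a greedy heuristic for this kind of quadratic knapsack problem, the sketch below repeatedly adds the location with the largest marginal gain (its node coefficient plus its edge coefficients to locations already selected) and stops at the size limit or when no addition improves the objective. The abstract does not specify the exact greedy rule used in the article, so this rule, the function name `greedy_qkp`, and the coefficients are assumptions for the example.

```python
def greedy_qkp(node_value, pair_value, max_items):
    """Greedy selection for a 0-1 quadratic knapsack with a cardinality limit.

    node_value[i]    : coefficient gained by selecting location i
    pair_value[i][j] : coefficient for selecting both i and j (symmetric;
                       negative values model wake losses)
    """
    selected, total = [], 0
    while len(selected) < max_items:
        best, best_gain = None, 0
        for i in range(len(node_value)):
            if i in selected:
                continue
            # Marginal change in the objective if location i is added now.
            gain = node_value[i] + sum(pair_value[i][j] for j in selected)
            if gain > best_gain:
                best, best_gain = i, gain
        if best is None:  # no remaining location improves the objective
            break
        selected.append(best)
        total += best_gain
    return selected, total


# Hypothetical coefficients for four candidate turbine locations.
node_value = [10, 9, 8, 7]
pair_value = [
    [0, -6, -1, -2],
    [-6, 0, -3, -1],
    [-1, -3, 0, -5],
    [-2, -1, -5, 0],
]
selected, total = greedy_qkp(node_value, pair_value, max_items=3)
print(selected, total)
```

Here the heuristic stops after two locations even though three are allowed, because adding either remaining location would not increase the objective. Comparing `total` against an upper bound, as the abstract describes, is what gives an estimate of the optimality gap.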

  12. An iterative network partition algorithm for accurate identification of dense network modules

    PubMed Central

    Sun, Siqi; Dong, Xinran; Fu, Yao; Tian, Weidong

    2012-01-01

    A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks. PMID:22121225

  13. Discrete Particle Swarm Optimization Routing Protocol for Wireless Sensor Networks with Multiple Mobile Sinks

    PubMed Central

    Yang, Jin; Liu, Fagui; Cao, Jianneng; Wang, Liangming

    2016-01-01

    Mobile sinks can achieve load-balancing and energy-consumption balancing across wireless sensor networks (WSNs). However, the frequent change of the paths between source nodes and the sinks caused by sink mobility introduces significant overhead in terms of energy and packet delays. To enhance network performance of WSNs with mobile sinks (MWSNs), we present an efficient routing strategy, which is formulated as an optimization problem and employs the particle swarm optimization algorithm (PSO) to build the optimal routing paths. However, the conventional PSO is insufficient to solve discrete routing optimization problems. Therefore, a novel greedy discrete particle swarm optimization with memory (GMDPSO) is put forward to address this problem. In GMDPSO, the particle position and velocity of traditional PSO are redefined for the discrete MWSN scenario. The particle updating rule is also reconsidered based on the subnetwork topology of MWSNs. In addition, by improving greedy forwarding routing, a greedy search strategy is designed to drive particles to find a better position quickly. Furthermore, the search history is memorized to accelerate convergence. Simulation results demonstrate that our new protocol significantly improves robustness and adapts to rapid topological changes with multiple mobile sinks, while efficiently reducing the communication overhead and the energy consumption. PMID:27428971

  14. Diffusive behavior of a greedy traveling salesman.

    PubMed

    Lipowski, Adam; Lipowska, Dorota

    2011-06-01

    Using Monte Carlo simulations we examine the diffusive properties of the greedy algorithm in the d-dimensional traveling salesman problem. Our results show that for d=3 and 4 the average squared distance from the origin ⟨r²⟩ is proportional to the number of steps t. In the d=2 case such a scaling is modified with some logarithmic corrections, which might suggest that d=2 is the critical dimension of the problem. The distribution of lengths also shows marked differences between d=2 and d>2 versions. A simple strategy adopted by the salesman might resemble strategies chosen by some foraging and hunting animals, for which anomalous diffusive behavior has recently been reported and interpreted in terms of Lévy flights. Our results suggest that broad and Lévy-like distributions in such systems might appear due to dimension-dependent properties of a search space.

  15. Simulated annealing algorithm for solving chambering student-case assignment problem

    NASA Astrophysics Data System (ADS)

    Ghazali, Saadiah; Abdul-Rahman, Syariza

    2015-12-01

    The project assignment problem is a popular practical problem. The challenge of solving it rises whenever the complexity related to preferences, the presence of real-world constraints, and the problem size increase. This study focuses on solving a chambering student-case assignment problem, which is classified as a project assignment problem, by using a simulated annealing algorithm. The project assignment problem is considered a hard combinatorial optimization problem, and solving it using a metaheuristic approach is an advantage because it can return a good solution in a reasonable time. The problem of assigning chambering students to cases has never been addressed in the literature before. For the proposed problem, it is essential for law graduates to read in chambers before they are qualified to become legal counselors. Thus, assigning chambering students to cases is critically needed, especially when many preferences are involved. Hence, this study presents a preliminary study of the proposed project assignment problem. The objective of the study is to minimize the total completion time for all students in solving the given cases. This study employed a minimum cost greedy heuristic to construct a feasible initial solution. The search then proceeds with a simulated annealing algorithm to further improve solution quality. Analysis of the results shows that the proposed simulated annealing algorithm greatly improved the solution constructed by the minimum cost greedy heuristic. Hence, this research has demonstrated the advantages of solving the project assignment problem by using metaheuristic techniques.

  16. Infrastructure system restoration planning using evolutionary algorithms

    USGS Publications Warehouse

    Corns, Steven; Long, Suzanna K.; Shoberg, Thomas G.

    2016-01-01

    This paper presents an evolutionary algorithm to address restoration issues for supply chain interdependent critical infrastructure. Rapid restoration of infrastructure after a large-scale disaster is necessary to sustaining a nation's economy and security, but such long-term restoration has not been investigated as thoroughly as initial rescue and recovery efforts. A model of the Greater Saint Louis Missouri area was created and a disaster scenario simulated. An evolutionary algorithm is used to determine the order in which the bridges should be repaired based on indirect costs. Solutions were evaluated based on the reduction of indirect costs and the restoration of transportation capacity. When compared to a greedy algorithm, the evolutionary algorithm solution reduced indirect costs by approximately 12.4% by restoring automotive travel routes for workers and re-establishing the flow of commodities across the three rivers in the Saint Louis area.

  17. A model-based spike sorting algorithm for removing correlation artifacts in multi-neuron recordings.

    PubMed

    Pillow, Jonathan W; Shlens, Jonathon; Chichilnisky, E J; Simoncelli, Eero P

    2013-01-01

    We examine the problem of estimating the spike trains of multiple neurons from voltage traces recorded on one or more extracellular electrodes. Traditional spike-sorting methods rely on thresholding or clustering of recorded signals to identify spikes. While these methods can detect a large fraction of the spikes from a recording, they generally fail to identify synchronous or near-synchronous spikes: cases in which multiple spikes overlap. Here we investigate the geometry of failures in traditional sorting algorithms, and document the prevalence of such errors in multi-electrode recordings from primate retina. We then develop a method for multi-neuron spike sorting using a model that explicitly accounts for the superposition of spike waveforms. We model the recorded voltage traces as a linear combination of spike waveforms plus a stochastic background component of correlated Gaussian noise. Combining this measurement model with a Bernoulli prior over binary spike trains yields a posterior distribution for spikes given the recorded data. We introduce a greedy algorithm to maximize this posterior that we call "binary pursuit". The algorithm allows modest variability in spike waveforms and recovers spike times with higher precision than the voltage sampling rate. This method substantially corrects cross-correlation artifacts that arise with conventional methods, and substantially outperforms clustering methods on both real and simulated data. Finally, we develop diagnostic tools that can be used to assess errors in spike sorting in the absence of ground truth.

  18. A Model-Based Spike Sorting Algorithm for Removing Correlation Artifacts in Multi-Neuron Recordings

    PubMed Central

    Chichilnisky, E. J.; Simoncelli, Eero P.

    2013-01-01

    We examine the problem of estimating the spike trains of multiple neurons from voltage traces recorded on one or more extracellular electrodes. Traditional spike-sorting methods rely on thresholding or clustering of recorded signals to identify spikes. While these methods can detect a large fraction of the spikes from a recording, they generally fail to identify synchronous or near-synchronous spikes: cases in which multiple spikes overlap. Here we investigate the geometry of failures in traditional sorting algorithms, and document the prevalence of such errors in multi-electrode recordings from primate retina. We then develop a method for multi-neuron spike sorting using a model that explicitly accounts for the superposition of spike waveforms. We model the recorded voltage traces as a linear combination of spike waveforms plus a stochastic background component of correlated Gaussian noise. Combining this measurement model with a Bernoulli prior over binary spike trains yields a posterior distribution for spikes given the recorded data. We introduce a greedy algorithm to maximize this posterior that we call “binary pursuit”. The algorithm allows modest variability in spike waveforms and recovers spike times with higher precision than the voltage sampling rate. This method substantially corrects cross-correlation artifacts that arise with conventional methods, and substantially outperforms clustering methods on both real and simulated data. Finally, we develop diagnostic tools that can be used to assess errors in spike sorting in the absence of ground truth. PMID:23671583

  19. Research on Multirobot Pursuit Task Allocation Algorithm Based on Emotional Cooperation Factor

    PubMed Central

    Fang, Baofu; Chen, Lu; Wang, Hao; Dai, Shuanglu; Zhong, Qiubo

    2014-01-01

    Multirobot task allocation is a hot issue in the field of robot research. A new emotional model is used with the self-interested robot, which gives a new way to measure self-interested robots' individual cooperative willingness in the problem of multirobot task allocation. An emotional cooperation factor is introduced into the self-interested robot; it is updated based on emotional attenuation and external stimuli. Then a multirobot pursuit task allocation algorithm based on the emotional cooperation factor is proposed. Combined with a two-step auction algorithm that recruits team leaders and team collaborators, the algorithm sets up pursuit teams and finally uses certain strategies to complete the pursuit task. To verify the effectiveness of this algorithm, comparative experiments were carried out against the instantaneous greedy optimal auction algorithm; the results show that the total pursuit time and total team revenue can be optimized by using this algorithm. PMID:25152925

  20. Research on multirobot pursuit task allocation algorithm based on emotional cooperation factor.

    PubMed

    Fang, Baofu; Chen, Lu; Wang, Hao; Dai, Shuanglu; Zhong, Qiubo

    2014-01-01

    Multirobot task allocation is a hot issue in the field of robot research. A new emotional model is used with the self-interested robot, which gives a new way to measure self-interested robots' individual cooperative willingness in the problem of multirobot task allocation. An emotional cooperation factor is introduced into the self-interested robot; it is updated based on emotional attenuation and external stimuli. Then a multirobot pursuit task allocation algorithm based on the emotional cooperation factor is proposed. Combined with a two-step auction algorithm that recruits team leaders and team collaborators, the algorithm sets up pursuit teams and finally uses certain strategies to complete the pursuit task. To verify the effectiveness of this algorithm, comparative experiments were carried out against the instantaneous greedy optimal auction algorithm; the results show that the total pursuit time and total team revenue can be optimized by using this algorithm.

  1. Spatiotemporal Local-Remote Senor Fusion (ST-LRSF) for Cooperative Vehicle Positioning.

    PubMed

    Jeong, Han-You; Nguyen, Hoa-Hung; Bhawiyuga, Adhitya

    2018-04-04

    Vehicle positioning plays an important role in the design of protocols, algorithms, and applications in intelligent transport systems. In this paper, we present a new framework of spatiotemporal local-remote sensor fusion (ST-LRSF) that cooperatively improves the accuracy of absolute vehicle positioning based on two state estimates of a vehicle in the vicinity: a local sensing estimate, measured by the on-board exteroceptive sensors, and a remote sensing estimate, received from neighbor vehicles via vehicle-to-everything communications. Given both estimates of vehicle state, the ST-LRSF scheme identifies the set of vehicles in the vicinity, determines the reference vehicle state, proposes a spatiotemporal dissimilarity metric between two reference vehicle states, and presents a greedy algorithm to compute a minimal weighted matching (MWM) between them. Given the outcome of MWM, the theoretical position uncertainty of the proposed refinement algorithm is proven to be inversely proportional to the square root of the matching size. To further reduce the positioning uncertainty, we also develop an extended Kalman filter model with the refined position of ST-LRSF as one of the measurement inputs. The numerical results demonstrate that the proposed ST-LRSF framework can achieve high positioning accuracy for many different scenarios of cooperative vehicle positioning.
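
One common way to compute a low-weight matching greedily, shown here only as an illustration of the idea, is to sort all candidate pairs by dissimilarity and accept a pair whenever both of its endpoints are still free. The abstract does not describe the paper's procedure in detail, so this rule, the function name `greedy_matching`, the `max_cost` cutoff, and the dissimilarity values are assumptions for the example.

```python
def greedy_matching(cost, max_cost):
    """Greedy bipartite matching on a dissimilarity matrix.

    cost[i][j] is the dissimilarity between local estimate i and remote
    estimate j. Pairs are accepted in order of increasing cost, skipping
    any pair whose row or column is already matched or whose cost exceeds
    `max_cost`.
    """
    candidates = sorted(
        (c, i, j)
        for i, row in enumerate(cost)
        for j, c in enumerate(row)
        if c <= max_cost
    )
    used_rows, used_cols, matching = set(), set(), []
    for c, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            matching.append((i, j))
            used_rows.add(i)
            used_cols.add(j)
    return matching


# Hypothetical dissimilarities between 3 local and 3 remote state estimates.
cost = [
    [1.0, 4.0, 9.0],
    [2.0, 1.5, 8.0],
    [7.0, 6.0, 12.0],
]
matching = greedy_matching(cost, max_cost=10.0)
print(matching)
```

In this toy case the third local estimate stays unmatched, since its only free partner exceeds the cutoff. A greedy matching of this kind is fast but is not guaranteed to minimize the total weight.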

  2. A Framework for an Automated Compilation System for Reconfigurable Architectures

    DTIC Science & Technology

    1997-03-01

    HDLs, Hardware C requires the designer to be thoroughly familiar with digital hardware design. 48 Vahid, Gong, and Gajski focus on the partitioning...of hardware used. Vahid, Gong, and Gajski suggest that the greedy approach used by Gupta and De Micheli is easily trapped in local minimums [46:216...iterative algorithm. To overcome this limitation, the Vahid, Gong, and Gajski suggest a binary constraint partitioning approach. The partitioning

  3. Cascade phenomenon against subsequent failures in complex networks

    NASA Astrophysics Data System (ADS)

    Jiang, Zhong-Yuan; Liu, Zhi-Quan; He, Xuan; Ma, Jian-Feng

    2018-06-01

    Cascade phenomenon may lead to catastrophic disasters which extremely imperil the network safety or security in various complex systems such as communication networks, power grids, social networks and so on. In some flow-based networks, the load of failed nodes can be redistributed locally to their neighboring nodes to maximally preserve the traffic oscillations or large-scale cascading failures. However, in such a local flow redistribution model, a small set of key nodes attacked subsequently can result in network collapse. It is therefore a critical problem to effectively find the set of key nodes in the network. To the best of our knowledge, this work is the first to study this problem comprehensively. We first introduce the extra capacity for every node to put up with flow fluctuations from neighbors, and two extra capacity distributions, a degree-based distribution and an average distribution, are employed. Four heuristic key node discovery methods, namely High-Degree-First (HDF), Low-Degree-First (LDF), Random and Greedy Algorithms (GA), are presented. Extensive simulations are carried out in both scale-free networks and random networks. The results show that the greedy algorithm can efficiently find the set of key nodes in both scale-free and random networks. Our work studies network robustness against cascading failures from a very novel perspective, and the methods and results are very useful for network robustness evaluations and protections.

  4. Greedy feature selection for glycan chromatography data with the generalized Dirichlet distribution

    PubMed Central

    2013-01-01

    Background: Glycoproteins are involved in a diverse range of biochemical and biological processes. Changes in protein glycosylation are believed to occur in many diseases, particularly during cancer initiation and progression. The identification of biomarkers for human disease states is becoming increasingly important, as early detection is key to improving survival and recovery rates. To this end, the serum glycome has been proposed as a potential source of biomarkers for different types of cancers. High-throughput hydrophilic interaction liquid chromatography (HILIC) technology for glycan analysis allows for the detailed quantification of the glycan content in human serum. However, the experimental data from this analysis is compositional by nature. Compositional data are subject to a constant-sum constraint, which restricts the sample space to a simplex. Statistical analysis of glycan chromatography datasets should account for their unusual mathematical properties. As the volume of glycan HILIC data being produced increases, there is a considerable need for a framework to support appropriate statistical analysis. Proposed here is a methodology for feature selection in compositional data. The principal objective is to provide a template for the analysis of glycan chromatography data that may be used to identify potential glycan biomarkers. Results: A greedy search algorithm, based on the generalized Dirichlet distribution, is carried out over the feature space to search for the set of “grouping variables” that best discriminate between known group structures in the data, modelling the compositional variables using beta distributions. The algorithm is applied to two glycan chromatography datasets. Statistical classification methods are used to test the ability of the selected features to differentiate between known groups in the data. Two well-known methods are used for comparison: correlation-based feature selection (CFS) and recursive partitioning (rpart). CFS is a feature selection method, while recursive partitioning is a learning tree algorithm that has been used for feature selection in the past. Conclusions: The proposed feature selection method performs well for both glycan chromatography datasets. It is computationally slower, but results in a lower misclassification rate and a higher sensitivity rate than both correlation-based feature selection and the classification tree method. PMID:23651459

  5. Multimodal Hierarchical Dirichlet Process-Based Active Perception by a Robot

    PubMed Central

    Taniguchi, Tadahiro; Yoshino, Ryo; Takano, Toshiaki

    2018-01-01

    In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when the time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an active perception for MHDP method that uses the information gain (IG) maximization criterion and lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that the criterion is equivalent to a minimization of the expected Kullback–Leibler divergence between a final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive a Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that the IG has submodular and non-decreasing properties as a set function because of the structure of the graphical model of the MHDP. Therefore, the IG maximization problem is reduced to a submodular maximization problem. This means that greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted an experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allow it to recognize target objects quickly and accurately. The numerical experiment using the synthetic data shows that the proposed method can work appropriately even when the number of actions is large and a set of target objects involves objects categorized into multiple classes. 
The results support our theoretical outcomes. PMID:29872389
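
    The lazy greedy idea referred to in this abstract can be sketched as follows. This is a generic illustration for a monotone submodular set function, not the authors' MHDP implementation: a toy coverage function stands in for the Monte Carlo information-gain estimate, and the action names and the `lazy_greedy` signature are hypothetical.

```python
import heapq
from typing import Callable, FrozenSet, Hashable, Iterable, List


def lazy_greedy(
    items: Iterable[Hashable],
    value: Callable[[FrozenSet], float],
    budget: int,
) -> List[Hashable]:
    """Pick up to `budget` items, re-evaluating marginal gains only when needed.

    Assumes `value` is monotone and submodular, so a gain computed for an
    earlier (smaller) selection is an upper bound on the current gain.
    """
    selected: List[Hashable] = []
    current = value(frozenset())
    # Heap entries: (negated gain, tie-breaker, item, selection size when computed).
    heap = [(-(value(frozenset([x])) - current), i, x, 0)
            for i, x in enumerate(items)]
    heapq.heapify(heap)
    while heap and len(selected) < budget:
        neg_gain, i, x, stamp = heapq.heappop(heap)
        if stamp == len(selected):
            # The gain is fresh and tops every stale upper bound: take it.
            if -neg_gain <= 0:
                break  # nothing left improves the objective
            selected.append(x)
            current += -neg_gain
        else:
            # Stale bound: recompute against the current selection, push back.
            gain = value(frozenset(selected) | {x}) - current
            heapq.heappush(heap, (-gain, i, x, len(selected)))
    return selected


# Toy stand-in for information gain: number of distinct cues an action set reveals.
coverage = {"look": {1, 2, 3}, "grasp": {3, 4}, "shake": {4, 5, 6, 7}, "knock": {1, 2}}


def covered(actions: FrozenSet) -> float:
    return float(len(set().union(*(coverage[a] for a in actions))))


chosen = lazy_greedy(coverage, covered, budget=2)
print(chosen)
```

    The saving over plain greedy comes from skipping re-evaluation of items whose stale upper bound is already below a freshly computed gain; the selected set is the same as plain greedy would return under the submodularity assumption.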

  6. Multimodal Hierarchical Dirichlet Process-Based Active Perception by a Robot.

    PubMed

    Taniguchi, Tadahiro; Yoshino, Ryo; Takano, Toshiaki

    2018-01-01

    In this paper, we propose an active perception method for recognizing object categories based on the multimodal hierarchical Dirichlet process (MHDP). The MHDP enables a robot to form object categories using multimodal information, e.g., visual, auditory, and haptic information, which can be observed by performing actions on an object. However, performing many actions on a target object requires a long time. In a real-time scenario, i.e., when the time is limited, the robot has to determine the set of actions that is most effective for recognizing a target object. We propose an active perception for MHDP method that uses the information gain (IG) maximization criterion and lazy greedy algorithm. We show that the IG maximization criterion is optimal in the sense that the criterion is equivalent to a minimization of the expected Kullback-Leibler divergence between a final recognition state and the recognition state after the next set of actions. However, a straightforward calculation of IG is practically impossible. Therefore, we derive a Monte Carlo approximation method for IG by making use of a property of the MHDP. We also show that the IG has submodular and non-decreasing properties as a set function because of the structure of the graphical model of the MHDP. Therefore, the IG maximization problem is reduced to a submodular maximization problem. This means that greedy and lazy greedy algorithms are effective and have a theoretical justification for their performance. We conducted an experiment using an upper-torso humanoid robot and a second one using synthetic data. The experimental results show that the method enables the robot to select a set of actions that allow it to recognize target objects quickly and accurately. The numerical experiment using the synthetic data shows that the proposed method can work appropriately even when the number of actions is large and a set of target objects involves objects categorized into multiple classes. 
The results support our theoretical outcomes.

  7. Comparative analysis on the selection of number of clusters in community detection

    NASA Astrophysics Data System (ADS)

    Kawamoto, Tatsuro; Kabashima, Yoshiyuki

    2018-02-01

    We conduct a comparative analysis on various estimates of the number of clusters in community detection. An exhaustive comparison requires testing of all possible combinations of frameworks, algorithms, and assessment criteria. In this paper we focus on the framework based on a stochastic block model, and investigate the performance of greedy algorithms, statistical inference, and spectral methods. For the assessment criteria, we consider modularity, map equation, Bethe free energy, prediction errors, and isolated eigenvalues. From the analysis, the tendency of overfit and underfit that the assessment criteria and algorithms have becomes apparent. In addition, we propose that the alluvial diagram is a suitable tool to visualize statistical inference results and can be useful to determine the number of clusters.

  8. Joint Optimization of Receiver Placement and Illuminator Selection for a Multiband Passive Radar Network.

    PubMed

    Xie, Rui; Wan, Xianrong; Hong, Sheng; Yi, Jianxin

    2017-06-14

    The performance of a passive radar network can be greatly improved by an optimal radar network structure. Generally, radar network structure optimization consists of two aspects, namely the placement of receivers in suitable places and selection of appropriate illuminators. The present study investigates issues concerning the joint optimization of receiver placement and illuminator selection for a passive radar network. Firstly, the required radar cross section (RCS) for target detection is chosen as the performance metric, and the joint optimization model boils down to the partition p-center problem (PPCP). The PPCP is then solved by a proposed bisection algorithm. The key to the bisection algorithm lies in solving the partition set covering problem (PSCP), which can be solved by a hybrid algorithm developed by coupling convex optimization with the greedy dropping algorithm. In the end, the performance of the proposed algorithm is validated via numerical simulations.

  9. A constraint optimization based virtual network mapping method

    NASA Astrophysics Data System (ADS)

    Li, Xiaoling; Guo, Changguo; Wang, Huaimin; Li, Zhendong; Yang, Zhiwen

    2013-03-01

    The virtual network mapping problem, which maps different virtual networks onto the substrate network, is extremely challenging. This paper proposes a constraint optimization based mapping method for solving the virtual network mapping problem. The method divides the problem into two phases, a node mapping phase and a link mapping phase, both of which are NP-hard problems. A node mapping algorithm and a link mapping algorithm are proposed for solving the node mapping phase and the link mapping phase, respectively. The node mapping algorithm follows the idea of a greedy algorithm and mainly considers two factors: the available resources supplied by the nodes and the distance between the nodes. The link mapping algorithm builds on the result of the node mapping phase and follows the idea of the distributed constraint optimization method, which can guarantee obtaining the optimal mapping with the minimum network cost. Finally, simulation experiments are used to validate the method, and the results show that the method performs very well.

  10. Restarting and recentering genetic algorithm variations for DNA fragment assembly: The necessity of a multi-strategy approach.

    PubMed

    Hughes, James Alexander; Houghten, Sheridan; Ashlock, Daniel

    2016-12-01

    DNA fragment assembly - an NP-Hard problem - is one of the major steps in DNA sequencing. Multiple strategies have been used for this problem, including greedy graph-based algorithms, de Bruijn graphs, and the overlap-layout-consensus approach. This study focuses on the overlap-layout-consensus approach. Heuristics and computational intelligence methods are combined to exploit their respective benefits. These algorithm combinations were able to produce high quality results surpassing the best results obtained by a number of competitive algorithms specially designed and tuned for this problem on thirteen of sixteen popular benchmarks. This work also reinforces the necessity of using multiple search strategies as it is clearly observed that algorithm performance is dependent on problem instance; without a deeper look into many searches, top solutions could be missed entirely. Copyright © 2016. Published by Elsevier Ireland Ltd.

  11. Semi-supervised learning via regularized boosting working on multiple semi-supervised assumptions.

    PubMed

    Chen, Ke; Wang, Shihai

    2011-01-01

    Semi-supervised learning concerns the problem of learning in the presence of labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes all three semi-supervised assumptions, i.e., smoothness, cluster, and manifold assumptions, together into account during boosting learning. In this paper, we propose a novel cost functional consisting of the margin cost on labeled data and the regularization penalty on unlabeled data based on three fundamental semi-supervised assumptions. Thus, minimizing our proposed cost functional with a greedy yet stagewise functional optimization procedure leads to a generic boosting framework for semi-supervised learning. Extensive experiments demonstrate that our algorithm yields favorable results for benchmark and real-world classification tasks in comparison to state-of-the-art semi-supervised learning algorithms, including newly developed boosting algorithms. Finally, we discuss relevant issues and relate our algorithm to the previous work.

  12. A generalized interval fuzzy mixed integer programming model for a multimodal transportation problem under uncertainty

    NASA Astrophysics Data System (ADS)

    Tian, Wenli; Cao, Chengxuan

    2017-03-01

    A generalized interval fuzzy mixed integer programming model is proposed for the multimodal freight transportation problem under uncertainty, in which the optimal mode of transport and the optimal amount of each type of freight transported through each path need to be decided. For practical purposes, three mathematical methods, i.e. the interval ranking method, fuzzy linear programming method and linear weighted summation method, are applied to obtain equivalents of constraints and parameters, and then a fuzzy expected value model is presented. A heuristic algorithm based on a greedy criterion and the linear relaxation algorithm are designed to solve the model.

  13. OGUPSA sensor scheduling architecture and algorithm

    NASA Astrophysics Data System (ADS)

    Zhang, Zhixiong; Hintz, Kenneth J.

    1996-06-01

    This paper introduces a new architecture for a sensor measurement scheduler as well as a dynamic sensor scheduling algorithm called the on-line, greedy, urgency-driven, preemptive scheduling algorithm (OGUPSA). OGUPSA incorporates a preemptive mechanism which uses three policies, (1) most-urgent-first (MUF), (2) earliest-completed-first (ECF), and (3) least-versatile-first (LVF). The three policies are used successively to dynamically allocate and schedule and distribute a set of arriving tasks among a set of sensors. OGUPSA also can detect the failure of a task to meet a deadline as well as generate an optimal schedule in the sense of minimum makespan for a group of tasks with the same priorities. A side benefit is OGUPSA's ability to improve dynamic load balance among all sensors while being a polynomial time algorithm. Results of a simulation are presented for a simple sensor system.

  14. An Improved SoC Test Scheduling Method Based on Simulated Annealing Algorithm

    NASA Astrophysics Data System (ADS)

    Zheng, Jingjing; Shen, Zhihang; Gao, Huaien; Chen, Bianna; Zheng, Weida; Xiong, Xiaoming

    2017-02-01

    In this paper, we propose an improved SoC test scheduling method based on the simulated annealing (SA) algorithm. We first perturb the IP core assignment for each TAM to produce a new solution for SA, then allocate a TAM width to each TAM using a greedy algorithm and calculate the corresponding testing time. The core assignment is then accepted according to the principle of simulated annealing, and the optimum solution is finally attained. We run the test scheduling experiment on the international reference circuits provided by the International Test Conference 2002 (ITC’02), and the results show that our algorithm is superior to the conventional integer linear programming (ILP), simulated annealing (SA), and genetic algorithm (GA) approaches. When the TAM width reaches 48, 56, and 64, the testing time of our algorithm is less than that of the classic methods, with optimization rates of 30.74%, 3.32%, and 16.13%, respectively. Moreover, the testing time of our algorithm is very close to that of the improved genetic algorithm (IGA), which is the current state of the art.

  15. Portfolios in Stochastic Local Search: Efficiently Computing Most Probable Explanations in Bayesian Networks

    NASA Technical Reports Server (NTRS)

    Mengshoel, Ole J.; Roth, Dan; Wilkins, David C.

    2001-01-01

    Portfolio methods support the combination of different algorithms and heuristics, including stochastic local search (SLS) heuristics, and have been identified as a promising approach to solve computationally hard problems. While successful in experiments, theoretical foundations and analytical results for portfolio-based SLS heuristics are less developed. This article aims to improve the understanding of the role of portfolios of heuristics in SLS. We emphasize the problem of computing most probable explanations (MPEs) in Bayesian networks (BNs). Algorithmically, we discuss a portfolio-based SLS algorithm for MPE computation, Stochastic Greedy Search (SGS). SGS supports the integration of different initialization operators (or initialization heuristics) and different search operators (greedy and noisy heuristics), thereby enabling new analytical and experimental results. Analytically, we introduce a novel Markov chain model tailored to portfolio-based SLS algorithms including SGS, thereby enabling us to analytically form expected hitting time results that explain empirical run time results. For a specific BN, we show the benefit of using a homogenous initialization portfolio. To further illustrate the portfolio approach, we consider novel additive search heuristics for handling determinism in the form of zero entries in conditional probability tables in BNs. Our additive approach adds rather than multiplies probabilities when computing the utility of an explanation. We motivate the additive measure by studying the dramatic impact of zero entries in conditional probability tables on the number of zero-probability explanations, which again complicates the search process. We consider the relationship between MAXSAT and MPE, and show that additive utility (or gain) is a generalization, to the probabilistic setting, of MAXSAT utility (or gain) used in the celebrated GSAT and WalkSAT algorithms and their descendants. 
Utilizing our Markov chain framework, we show that expected hitting time is a rational function - i.e. a ratio of two polynomials - of the probability of applying an additive search operator. Experimentally, we report on synthetically generated BNs as well as BNs from applications, and compare SGS's performance to that of Hugin, which performs BN inference by compilation to and propagation in clique trees. On synthetic networks, SGS speeds up computation by approximately two orders of magnitude compared to Hugin. In application networks, our approach is highly competitive in Bayesian networks with a high degree of determinism. In addition to showing that stochastic local search can be competitive with clique tree clustering, our empirical results provide an improved understanding of the circumstances under which portfolio-based SLS outperforms clique tree clustering and vice versa.

  16. Unsupervised quantification of abdominal fat from CT images using Greedy Snakes

    NASA Astrophysics Data System (ADS)

    Agarwal, Chirag; Dallal, Ahmed H.; Arbabshirani, Mohammad R.; Patel, Aalpen; Moore, Gregory

    2017-02-01

    Adipose tissue has been associated with adverse consequences of obesity. Total adipose tissue (TAT) is divided into subcutaneous adipose tissue (SAT) and visceral adipose tissue (VAT). Intra-abdominal fat (VAT), located inside the abdominal cavity, is a major factor for the classic obesity related pathologies. Since direct measurement of visceral and subcutaneous fat is not trivial, substitute metrics like waist circumference (WC) and body mass index (BMI) are used in clinical settings to quantify obesity. Abdominal fat can be assessed effectively using CT or MRI, but manual fat segmentation is rather subjective and time-consuming. Hence, an automatic and accurate quantification tool for abdominal fat is needed. The goal of this study is to extract TAT, VAT and SAT fat from abdominal CT in a fully automated unsupervised fashion using energy minimization techniques. We applied a four step framework consisting of 1) initial body contour estimation, 2) approximation of the body contour, 3) estimation of inner abdominal contour using Greedy Snakes algorithm, and 4) voting, to segment the subcutaneous and visceral fat. We validated our algorithm on 952 clinical abdominal CT images (from 476 patients with a very wide BMI range) collected from various radiology departments of Geisinger Health System. To our knowledge, this is the first study of its kind on such a large and diverse clinical dataset. Our algorithm obtained a 3.4% error for VAT segmentation compared to manual segmentation. These personalized and accurate measurements of fat can complement traditional population health driven obesity metrics such as BMI and WC.

  17. Feature selection with harmony search.

    PubMed

    Diao, Ren; Shen, Qiang

    2012-12-01

    Many search strategies have been exploited for the task of feature selection (FS), in an effort to identify more compact and better quality subsets. Such work typically involves the use of greedy hill climbing (HC), or nature-inspired heuristics, in order to discover the optimal solution without going through exhaustive search. In this paper, a novel FS approach based on harmony search (HS) is presented. It is a general approach that can be used in conjunction with many subset evaluation techniques. The simplicity of HS is exploited to reduce the overall complexity of the search process. The proposed approach is able to escape from local solutions and identify multiple solutions owing to the stochastic nature of HS. Additional parameter control schemes are introduced to reduce the effort and impact of parameter configuration. These can be further combined with the iterative refinement strategy, tailored to enforce the discovery of quality subsets. The resulting approach is compared with those that rely on HC, genetic algorithms, and particle swarm optimization, accompanied by in-depth studies of the suggested improvements.
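
    The greedy hill climbing (HC) baseline that this abstract contrasts with harmony search can be sketched as plain forward selection. The subset scores below are invented for illustration and the function name is hypothetical; the point is only to show how HC can stop at a local optimum.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def greedy_forward_selection(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Add the single best feature at each step; stop when nothing improves."""
    features = list(features)
    selected: FrozenSet[str] = frozenset()
    best = score(selected)
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Evaluate every one-feature extension of the current subset.
        top = max(candidates, key=lambda f: score(selected | {f}))
        top_score = score(selected | {top})
        if top_score <= best:
            break  # local optimum: no single addition helps
        selected, best = selected | {top}, top_score
    return selected, best


# Invented subset qualities: {b, c} is the best subset, but "a" looks best alone.
toy_scores = {
    frozenset(): 0.0,
    frozenset("a"): 0.6, frozenset("b"): 0.5, frozenset("c"): 0.4,
    frozenset("ab"): 0.65, frozenset("ac"): 0.62, frozenset("bc"): 0.9,
    frozenset("abc"): 0.64,
}
subset, quality = greedy_forward_selection("abc", toy_scores.__getitem__)
print(sorted(subset), quality)
```

    On these toy scores the search commits to "a" first and ends at {a, b}, never reaching the better subset {b, c}. This is the kind of local solution that stochastic searches aim to escape.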

  18. Photon-efficient super-resolution laser radar

    NASA Astrophysics Data System (ADS)

    Shin, Dongeek; Shapiro, Jeffrey H.; Goyal, Vivek K.

    2017-08-01

    The resolution achieved in photon-efficient active optical range imaging systems can be low due to non-idealities such as propagation through a diffuse scattering medium. We propose a constrained optimization-based framework to address extremes in scarcity of photons and blurring by a forward imaging kernel. We provide two algorithms for the resulting inverse problem: a greedy algorithm, inspired by sparse pursuit algorithms; and a convex optimization heuristic that incorporates image total variation regularization. We demonstrate that our framework outperforms existing deconvolution imaging techniques in terms of peak signal-to-noise ratio. Since our proposed method is able to super-resolve depth features using small numbers of photon counts, it can be useful for observing fine-scale phenomena in remote sensing through a scattering medium and through-the-skin biomedical imaging applications.

  19. Self-Coexistence among IEEE 802.22 Networks: Distributed Allocation of Power and Channel

    PubMed Central

    Sakin, Sayef Azad; Alamri, Atif; Tran, Nguyen H.

    2017-01-01

    Ensuring self-coexistence among IEEE 802.22 networks is a challenging problem owing to opportunistic access of incumbent-free radio resources by users in co-located networks. In this study, we propose a fully-distributed non-cooperative approach to ensure self-coexistence in downlink channels of IEEE 802.22 networks. We formulate the self-coexistence problem as a mixed-integer non-linear optimization problem for maximizing the network data rate, which is an NP-hard one. This work explores a sub-optimal solution by dividing the optimization problem into downlink channel allocation and power assignment sub-problems. Considering fairness, quality of service and minimum interference for customer-premises-equipment, we also develop a greedy algorithm for channel allocation and a non-cooperative game-theoretic framework for near-optimal power allocation. The base stations of networks are treated as players in a game, where they try to increase spectrum utilization by controlling power and reaching a Nash equilibrium point. We further develop a utility function for the game to increase the data rate by minimizing the transmission power and, subsequently, the interference from neighboring networks. A theoretical proof of the uniqueness and existence of the Nash equilibrium has been presented. Performance improvements in terms of data-rate with a degree of fairness compared to a cooperative branch-and-bound-based algorithm and a non-cooperative greedy approach have been shown through simulation studies. PMID:29215591

  20. Hybrid Self-Adaptive Evolution Strategies Guided by Neighborhood Structures for Combinatorial Optimization Problems.

    PubMed

    Coelho, V N; Coelho, I M; Souza, M J F; Oliveira, T A; Cota, L P; Haddad, M N; Mladenovic, N; Silva, R C P; Guimarães, F G

    2016-01-01

    This article presents an Evolution Strategy (ES)-based algorithm, designed to self-adapt its mutation operators, guiding the search into the solution space using a Self-Adaptive Reduced Variable Neighborhood Search procedure. In view of the specific local search operators for each individual, the proposed population-based approach also fits into the context of the Memetic Algorithms. The proposed variant uses the Greedy Randomized Adaptive Search Procedure with different greedy parameters for generating its initial population, providing an interesting exploration-exploitation balance. To validate the proposal, this framework is applied to solve three different NP-Hard combinatorial optimization problems: an Open-Pit-Mining Operational Planning Problem with dynamic allocation of trucks, an Unrelated Parallel Machine Scheduling Problem with Setup Times, and the calibration of a hybrid fuzzy model for Short-Term Load Forecasting. Computational results point out the convergence of the proposed model and highlight its ability in combining the application of move operations from distinct neighborhood structures along the optimization. The results gathered and reported in this article represent collective evidence of the performance of the method in challenging combinatorial optimization problems from different application domains. The proposed evolution strategy demonstrates an ability of adapting the strength of the mutation disturbance during the generations of its evolution process. The effectiveness of the proposal motivates the application of this novel evolutionary framework for solving other combinatorial optimization problems.
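
    The construction phase of a Greedy Randomized Adaptive Search Procedure with a tunable greedy parameter can be sketched as below. This is a generic illustration on a toy tour-building problem, not the authors' code or any of their three application problems; the name `grasp_construct`, the parameter `alpha`, and the distance matrix are assumptions.

```python
import random
from typing import List, Sequence


def grasp_construct(
    dist: Sequence[Sequence[float]],
    alpha: float,
    rng: random.Random,
    start: int = 0,
) -> List[int]:
    """Build one tour; alpha=0 is purely greedy, alpha=1 is purely random."""
    tour = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        last = tour[-1]
        costs = {j: dist[last][j] for j in unvisited}
        c_min, c_max = min(costs.values()), max(costs.values())
        # Restricted candidate list: moves within alpha of the best cost.
        threshold = c_min + alpha * (c_max - c_min)
        rcl = sorted(j for j, c in costs.items() if c <= threshold)
        nxt = rng.choice(rcl)
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


dist = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]
rng = random.Random(0)
# Different greedy parameters give a mix of greedy and diverse individuals.
population = [grasp_construct(dist, a, rng) for a in (0.0, 0.3, 0.7, 1.0)]
print(population)
```

    Varying `alpha` across individuals is one simple way to trade the quality of greedy starts against the diversity of random ones when seeding a population.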

  1. Self-Coexistence among IEEE 802.22 Networks: Distributed Allocation of Power and Channel.

    PubMed

    Sakin, Sayef Azad; Razzaque, Md Abdur; Hassan, Mohammad Mehedi; Alamri, Atif; Tran, Nguyen H; Fortino, Giancarlo

    2017-12-07

    Ensuring self-coexistence among IEEE 802.22 networks is a challenging problem owing to opportunistic access of incumbent-free radio resources by users in co-located networks. In this study, we propose a fully-distributed non-cooperative approach to ensure self-coexistence in downlink channels of IEEE 802.22 networks. We formulate the self-coexistence problem as a mixed-integer non-linear optimization problem for maximizing the network data rate, which is an NP-hard one. This work explores a sub-optimal solution by dividing the optimization problem into downlink channel allocation and power assignment sub-problems. Considering fairness, quality of service and minimum interference for customer-premises-equipment, we also develop a greedy algorithm for channel allocation and a non-cooperative game-theoretic framework for near-optimal power allocation. The base stations of networks are treated as players in a game, where they try to increase spectrum utilization by controlling power and reaching a Nash equilibrium point. We further develop a utility function for the game to increase the data rate by minimizing the transmission power and, subsequently, the interference from neighboring networks. A theoretical proof of the uniqueness and existence of the Nash equilibrium has been presented. Performance improvements in terms of data-rate with a degree of fairness compared to a cooperative branch-and-bound-based algorithm and a non-cooperative greedy approach have been shown through simulation studies.

  2. An Effective Hybrid Cuckoo Search Algorithm with Improved Shuffled Frog Leaping Algorithm for 0-1 Knapsack Problems

    PubMed Central

    Wang, Gai-Ge; Feng, Qingjiang; Zhao, Xiang-Jun

    2014-01-01

    An effective hybrid cuckoo search algorithm (CS) with improved shuffled frog-leaping algorithm (ISFLA) is put forward for solving the 0-1 knapsack problem. First of all, within the framework of SFLA, an improved frog-leap operator is designed that takes into account the effect of the global optimal information on the frog leaping and the information exchange between frog individuals, combined with genetic mutation with a small probability. Subsequently, in order to improve the convergence speed and enhance the exploitation ability, a novel CS model is proposed that considers the specific advantages of Lévy flights and the frog-leap operator. Furthermore, the greedy transform method is used to repair infeasible solutions and optimize feasible solutions. Finally, numerical simulations are carried out on six different types of 0-1 knapsack instances, and the comparative results show the effectiveness of the proposed algorithm and its ability to achieve good quality solutions, outperforming the binary cuckoo search, the binary differential evolution, and the genetic algorithm. PMID:25404940
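
    A greedy repair-and-optimize step of the kind mentioned above is commonly built on value-to-weight ratios. The sketch below shows that general idea and is not claimed to be the paper's exact greedy transform method; the function name and the instance are hypothetical, and all weights are assumed to be positive.

```python
from typing import List, Sequence


def greedy_repair(
    bits: Sequence[int],
    values: Sequence[float],
    weights: Sequence[float],
    capacity: float,
) -> List[int]:
    """Make a 0-1 knapsack vector feasible, then fill any leftover capacity."""
    # Item indices from best to worst value/weight ratio (weights assumed > 0).
    order = sorted(range(len(bits)), key=lambda i: values[i] / weights[i],
                   reverse=True)
    repaired = list(bits)
    load = sum(w for b, w in zip(repaired, weights) if b)
    # Repair phase: drop the worst-ratio packed items until the load fits.
    for i in reversed(order):
        if load <= capacity:
            break
        if repaired[i]:
            repaired[i] = 0
            load -= weights[i]
    # Optimize phase: add unpacked items that still fit, best ratio first.
    for i in order:
        if not repaired[i] and load + weights[i] <= capacity:
            repaired[i] = 1
            load += weights[i]
    return repaired


values = [60, 100, 120, 30]
weights = [10, 20, 30, 15]
fixed = greedy_repair([1, 1, 1, 1], values, weights, capacity=50)
print(fixed)
```

    The result is always feasible but not necessarily optimal, which is why such a step is typically paired with a global search rather than used alone.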

  3. Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes

    PubMed Central

    Xia, Xiao-Lei; Xing, Huanlai; Liu, Xueqin

    2013-01-01

    One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including the Support Vector Machines (SVM) in particular. The SVM is a typical sample-based classifier whose performance comes down to how discriminant samples are. However, DEGs identified by statistical tests are not guaranteed to result in a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method namely the Kernel Matrix Gene Selection (KMGS) is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by performing -like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is a method of Kernel Matrix Sequential Forward Selection (KMSFS) which shares the KMGS method's essential ideas but proceeds in a greedy manner. On three public microarray datasets, our proposed algorithms achieved noticeably competitive performance in terms of the B.632+ error rate. PMID:24349110

  4. Algorithm to determine the percolation largest component in interconnected networks.

    PubMed

    Schneider, Christian M; Araújo, Nuno A M; Herrmann, Hans J

    2013-04-01

    Interconnected networks have been shown to be much more vulnerable to random and targeted failures than isolated ones, raising several interesting questions regarding the identification and mitigation of their risk. The paradigm to address these questions is the percolation model, where the resilience of the system is quantified by the dependence of the size of the largest cluster on the number of failures. Numerically, the major challenge is the identification of this cluster and the calculation of its size. Here, we propose an efficient algorithm to tackle this problem. We show that the algorithm scales as O(N log N), where N is the number of nodes in the network, a significant improvement compared to O(N^2) for a greedy algorithm, which permits studying much larger networks. Our new strategy can be applied to any network topology and distribution of interdependencies, as well as any sequence of failures.

  5. A Globally Optimal Particle Tracking Technique for Stereo Imaging Velocimetry Experiments

    NASA Technical Reports Server (NTRS)

    McDowell, Mark

    2008-01-01

    An important phase of any Stereo Imaging Velocimetry experiment is particle tracking. Particle tracking seeks to identify and characterize the motion of individual particles entrained in a fluid or air experiment. We analyze a cylindrical chamber filled with water and seeded with density-matched particles. In every four-frame sequence, we identify a particle track by assigning a unique track label for each camera image. The conventional approach to particle tracking is to use an exhaustive tree-search method utilizing greedy algorithms to reduce search times. However, these types of algorithms are not optimal due to a cascade effect of incorrect decisions upon adjacent tracks. We examine the use of a guided evolutionary neural net with simulated annealing to arrive at a globally optimal assignment of tracks. The net is guided both by the minimization of the search space through the use of prior limiting assumptions about valid tracks and by a strategy which seeks to avoid high-energy intermediate states which can trap the net in a local minimum. A stochastic search algorithm is used in place of back-propagation of error to further reduce the chance of being trapped in an energy well. Global optimization is achieved by minimizing an objective function, which includes both track smoothness and particle-image utilization parameters. In this paper we describe our model and present our experimental results. We compare our results with a nonoptimizing, predictive tracker and obtain an average increase in valid track yield of 27 percent

  6. Spatiotemporal Local-Remote Senor Fusion (ST-LRSF) for Cooperative Vehicle Positioning

    PubMed Central

    Bhawiyuga, Adhitya

    2018-01-01

    Vehicle positioning plays an important role in the design of protocols, algorithms, and applications in the intelligent transport systems. In this paper, we present a new framework of spatiotemporal local-remote sensor fusion (ST-LRSF) that cooperatively improves the accuracy of absolute vehicle positioning based on two state estimates of a vehicle in the vicinity: a local sensing estimate, measured by the on-board exteroceptive sensors, and a remote sensing estimate, received from neighbor vehicles via vehicle-to-everything communications. Given both estimates of vehicle state, the ST-LRSF scheme identifies the set of vehicles in the vicinity, determines the reference vehicle state, proposes a spatiotemporal dissimilarity metric between two reference vehicle states, and presents a greedy algorithm to compute a minimal weighted matching (MWM) between them. Given the outcome of MWM, the theoretical position uncertainty of the proposed refinement algorithm is proven to be inversely proportional to the square root of matching size. To further reduce the positioning uncertainty, we also develop an extended Kalman filter model with the refined position of ST-LRSF as one of the measurement inputs. The numerical results demonstrate that the proposed ST-LRSF framework can achieve high positioning accuracy for many different scenarios of cooperative vehicle positioning. PMID:29617341
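    The abstract above describes a greedy algorithm for a minimal weighted matching between two sets of vehicle state estimates. As a rough illustration of the general idea only, the sketch below matches rows to columns of a dissimilarity matrix by repeatedly taking the cheapest remaining pair. The function name `greedy_matching`, the `max_cost` cut-off, and the sample matrix are assumptions and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def greedy_matching(cost: Sequence[Sequence[float]],
                    max_cost: float) -> List[Tuple[int, int]]:
    """Greedily pair row i with column j in order of increasing cost.

    Pairs whose cost exceeds `max_cost` (an assumed threshold) are never matched.
    """
    # Enumerate all admissible pairs, cheapest first.
    pairs = sorted(
        (cost[i][j], i, j)
        for i in range(len(cost))
        for j in range(len(cost[i]))
        if cost[i][j] <= max_cost
    )
    used_rows, used_cols = set(), set()
    matching = []
    for _, i, j in pairs:
        # Accept a pair only if both endpoints are still unmatched.
        if i not in used_rows and j not in used_cols:
            matching.append((i, j))
            used_rows.add(i)
            used_cols.add(j)
    return matching


# Illustrative dissimilarities: 2 local estimates vs. 3 remote estimates.
cost = [[0.2, 1.5, 3.0],
        [1.4, 0.3, 2.5]]
matches = greedy_matching(cost, max_cost=1.0)
print(matches)
```

    A greedy matching of this kind is generally not guaranteed to have minimum total weight; an exact assignment solver would be the alternative when optimality matters more than speed.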

  7. Algorithm to solve a chance-constrained network capacity design problem with stochastic demands and finite support

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Schumacher, Kathryn M.; Chen, Richard Li-Yang; Cohn, Amy E. M.

    2016-04-15

    Here, we consider the problem of determining the capacity to assign to each arc in a given network, subject to uncertainty in the supply and/or demand of each node. This design problem underlies many real-world applications, such as the design of power transmission and telecommunications networks. We first consider the case where a set of supply/demand scenarios are provided, and we must determine the minimum-cost set of arc capacities such that a feasible flow exists for each scenario. We briefly review existing theoretical approaches to solving this problem and explore implementation strategies to reduce run times. With this as a foundation, our primary focus is on a chance-constrained version of the problem in which α% of the scenarios must be feasible under the chosen capacity, where α is a user-defined parameter and the specific scenarios to be satisfied are not predetermined. We describe an algorithm which utilizes a separation routine for identifying violated cut-sets which can solve the problem to optimality, and we present computational results. We also present a novel greedy algorithm, our primary contribution, which can be used to solve for a high quality heuristic solution. We present computational analysis to evaluate the performance of our proposed approaches.

  8. Routing design and fleet allocation optimization of freeway service patrol: Improved results using genetic algorithm

    NASA Astrophysics Data System (ADS)

    Sun, Xiuqiao; Wang, Jian

    2018-07-01

    Freeway service patrol (FSP) is considered to be an effective method for incident management and can help transportation agency decision-makers alter existing route coverage and fleet allocation. This paper investigates the FSP problem of patrol routing design and fleet allocation, with the objective of minimizing the overall average incident response time. While the simulated annealing (SA) algorithm and its improvements have been applied to solve this problem, they often become trapped in local optimal solutions. Moreover, the issue of searching efficiency remains to be further addressed. In this paper, we employ the genetic algorithm (GA) and SA to solve the FSP problem. To maintain population diversity and avoid premature convergence, a niche strategy is incorporated into the traditional genetic algorithm. We also employ an elitist strategy to speed up the convergence. Numerical experiments have been conducted with the help of the Sioux Falls network. Results show that the GA slightly outperforms the dual-based greedy (DBG) algorithm, the very large-scale neighborhood searching (VLNS) algorithm, the SA algorithm and the scenario algorithm.

  9. Seeding for pervasively overlapping communities

    NASA Astrophysics Data System (ADS)

    Lee, Conrad; Reid, Fergal; McDaid, Aaron; Hurley, Neil

    2011-06-01

    In some social and biological networks, the majority of nodes belong to multiple communities. It has recently been shown that a number of the algorithms specifically designed to detect overlapping communities do not perform well in such highly overlapping settings. Here, we consider one class of these algorithms, those which optimize a local fitness measure, typically by using a greedy heuristic to expand a seed into a community. We perform synthetic benchmarks which indicate that an appropriate seeding strategy becomes more important as the extent of community overlap increases. We find that distinct cliques provide the best seeds. We find further support for this seeding strategy with benchmarks on a Facebook network and the yeast interactome.
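    The abstract above refers to algorithms that greedily expand a seed into a community by optimizing a local fitness measure. The sketch below illustrates that pattern under stated assumptions: the fitness function k_in / (k_in + k_out)^alpha is one widely used choice and is not necessarily the one studied in the paper, and the names `fitness` and `expand_seed` and the toy graph are hypothetical.

```python
from typing import Dict, Iterable, Set


def fitness(graph: Dict[int, Set[int]], community: Set[int],
            alpha: float = 1.0) -> float:
    """Assumed local fitness: internal degree over total degree**alpha."""
    k_in = k_out = 0
    for u in community:
        for v in graph[u]:
            if v in community:
                k_in += 1   # each internal edge is counted from both ends
            else:
                k_out += 1
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def expand_seed(graph: Dict[int, Set[int]], seed: Iterable[int],
                alpha: float = 1.0) -> Set[int]:
    """Greedily add the neighbor that most improves fitness until none does."""
    community = set(seed)
    while True:
        frontier = {v for u in community for v in graph[u]} - community
        best, best_fit = None, fitness(graph, community, alpha)
        for v in sorted(frontier):
            f = fitness(graph, community | {v}, alpha)
            if f > best_fit:
                best, best_fit = v, f
        if best is None:
            return community
        community.add(best)


# Toy graph: two triangles (0,1,2) and (3,4,5) joined by the edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
         3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
community = expand_seed(graph, seed={0, 1})
print(community)
```

    Starting the expansion from a clique such as {0, 1, 2} instead of a single edge gives the same community on this toy graph; on heavily overlapping networks the choice of seed is the part the abstract reports as mattering most.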

  10. Equation Discovery for Model Identification in Respiratory Mechanics of the Mechanically Ventilated Human Lung

    NASA Astrophysics Data System (ADS)

    Ganzert, Steven; Guttmann, Josef; Steinmann, Daniel; Kramer, Stefan

    Lung protective ventilation strategies reduce the risk of ventilator associated lung injury. To develop such strategies, knowledge about mechanical properties of the mechanically ventilated human lung is essential. This study was designed to develop an equation discovery system to identify mathematical models of the respiratory system in time-series data obtained from mechanically ventilated patients. Two techniques were combined: (i) the use of declarative bias to reduce search space complexity while inherently supporting the processing of background knowledge, and (ii) a newly developed heuristic for traversing the hypothesis space with a greedy, randomized strategy analogous to the GSAT algorithm. In 96.8% of all runs the applied equation discovery system was capable of detecting the well-established equation of motion model of the respiratory system in the provided data. We see the potential of this semi-automatic approach to detect more complex mathematical descriptions of the respiratory system from respiratory data.

  11. Learning from Noisy and Delayed Rewards: The Value of Reinforcement Learning to Defense Modeling and Simulation

    DTIC Science & Technology

    2012-09-01

    following 500 trials with 1000 replications with single reward upon attainment of the goal state by algorithm and policy. DQ- C with -greedy obtained...aspects of the civilian population rather than combat forces. These agents represent not a single human, but a population segment. Similar...TD(λ) combines elements of MC and TD methods into a single framework to estimate the value of each state, V(s), through the use of eligibility traces

  12. A Greedy Algorithm for Brain MRI's Registration.

    PubMed

    Chesseboeuf, Clément

    2016-12-01

    This document presents a non-rigid registration algorithm for comparing brain magnetic resonance (MR) images. More precisely, we want to compare pre-operative and post-operative MR images in order to assess the deformation due to a surgical removal. The proposed algorithm has been studied in Chesseboeuf et al. (Non-rigid registration of magnetic resonance imaging of brain. IEEE, 385-390. doi: 10.1109/IPTA.2015.7367172, 2015), following ideas of Trouvé (An infinite dimensional group approach for physics based models in patterns recognition. Technical Report DMI Ecole Normale Supérieure, Cachan, 1995), in which the author introduces the algorithm within a very general framework. Here we recall this theory from a practical point of view. The emphasis is on illustrations and description of the numerical procedure. Our version of the algorithm is associated with a particular matching criterion, and a section is devoted to the description of this object. In the last section we focus on the construction of a statistical method of evaluation.

  13. Heuristic algorithms for the minmax regret flow-shop problem with interval processing times.

    PubMed

    Ćwik, Michał; Józefczyk, Jerzy

    2018-01-01

    An uncertain version of the permutation flow-shop with unlimited buffers and the makespan as a criterion is considered. The investigated parametric uncertainty is represented by given interval-valued processing times. The maximum regret is used for the evaluation of uncertainty. Consequently, the minmax regret discrete optimization problem is solved. Due to its high complexity, two relaxations are applied to simplify the optimization procedure. First of all, a greedy procedure is used for calculating the criterion's value, as such calculation is itself an NP-hard problem. Moreover, the lower bound is used instead of solving the internal deterministic flow-shop. The constructive heuristic algorithm is applied to the relaxed optimization problem. The algorithm is compared with previously elaborated heuristic algorithms based on the evolutionary and the middle interval approaches. The conducted computational experiments showed the advantage of the constructive heuristic algorithm with regard to both the criterion and the time of computations. The Wilcoxon paired-rank statistical test confirmed this conclusion.

  14. Design and coverage of high throughput genotyping arrays optimized for individuals of East Asian, African American, and Latino race/ethnicity using imputation and a novel hybrid SNP selection algorithm.

    PubMed

    Hoffmann, Thomas J; Zhan, Yiping; Kvale, Mark N; Hesselson, Stephanie E; Gollub, Jeremy; Iribarren, Carlos; Lu, Yontao; Mei, Gangwu; Purdy, Matthew M; Quesenberry, Charles; Rowell, Sarah; Shapero, Michael H; Smethurst, David; Somkin, Carol P; Van den Eeden, Stephen K; Walter, Larry; Webster, Teresa; Whitmer, Rachel A; Finn, Andrea; Schaefer, Catherine; Kwok, Pui-Yan; Risch, Neil

    2011-12-01

    Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies. Copyright © 2011 Elsevier Inc. All rights reserved.

  15. Nearest greedy for solving the waste collection vehicle routing problem: A case study

    NASA Astrophysics Data System (ADS)

    Mat, Nur Azriati; Benjamin, Aida Mauziah; Abdul-Rahman, Syariza; Wibowo, Antoni

    2017-11-01

    This paper presents a real case study pertaining to an issue related to waste collection in the northern part of Malaysia by using a constructive heuristic algorithm known as the Nearest Greedy (NG) technique. This technique has been widely used to devise initial solutions for issues concerning vehicle routing. Basically, the waste collection cycle involves the following steps: i) each vehicle starts from a depot, ii) visits a number of customers to collect waste, iii) unloads waste at the disposal site, and lastly, iv) returns to the depot. Moreover, the sample data set used in this paper consisted of six areas, where each area involved up to 103 customers. In this paper, the NG technique was employed to construct an initial route for each area. The solution proposed from the technique was compared with the present vehicle routes implemented by a waste collection company within the city. The comparison showed that NG offered better vehicle routes with an 11.07% reduction of the total distance traveled, in comparison to the present vehicle routes.
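    The four-step collection cycle described above (depot, customers, disposal site, depot) can be sketched with a nearest-neighbor style construction. This is a simplified illustration, not the authors' implementation: it assumes a single vehicle, no capacity limit, straight-line distances, and made-up coordinates, and the names `nearest_greedy_route` and `route_length` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest_greedy_route(depot: Point, customers: Sequence[Point],
                         disposal: Point) -> List[Point]:
    """Build depot -> customers (nearest first) -> disposal site -> depot."""
    route = [depot]
    remaining = list(customers)
    current = depot
    while remaining:
        # Greedy step: always visit the closest customer not yet served.
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    # Unload at the disposal site, then return to the depot.
    route += [disposal, depot]
    return route


def route_length(route: Sequence[Point]) -> float:
    """Total Euclidean length of consecutive legs."""
    return sum(math.dist(a, b) for a, b in zip(route, route[1:]))


# Illustrative coordinates only.
depot = (0.0, 0.0)
disposal = (0.0, 5.0)
customers = [(4.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
route = nearest_greedy_route(depot, customers, disposal)
print(route, round(route_length(route), 3))
```

    A route built this way is usually treated as an initial solution; comparing `route_length` for the constructed route against an existing route is the kind of distance comparison the abstract reports.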

  16. Introducing TreeCollapse: a novel greedy algorithm to solve the cophylogeny reconstruction problem.

    PubMed

    Drinkwater, Benjamin; Charleston, Michael A

    2014-01-01

    Cophylogeny mapping is used to uncover deep coevolutionary associations between two or more phylogenetic histories at a macro coevolutionary scale. As cophylogeny mapping is NP-Hard, this technique relies heavily on heuristics to solve all but the most trivial cases. One notable approach utilises a metaheuristic to search only a subset of the exponential number of fixed node orderings possible for the phylogenetic histories in question. This is of particular interest as it is the only known heuristic that guarantees biologically feasible solutions. This has enabled research to focus on larger coevolutionary systems, such as coevolutionary associations between figs and their pollinator wasps, including over 200 taxa. Although able to converge on solutions for problem instances of this size, a reduction from the current cubic running time is required to handle larger systems, such as Wolbachia and their insect hosts. Rather than solving this underlying problem optimally this work presents a greedy algorithm called TreeCollapse, which uses common topological patterns to recover an approximation of the coevolutionary history where the internal node ordering is fixed. This approach offers a significant speed-up compared to previous methods, running in linear time. This algorithm has been applied to over 100 well-known coevolutionary systems converging on Pareto optimal solutions in over 68% of test cases, even where in some cases the Pareto optimal solution has not previously been recoverable. Further, while TreeCollapse applies a local search technique, it can guarantee solutions are biologically feasible, making this the fastest method that can provide such a guarantee. As a result, we argue that the newly proposed algorithm is a valuable addition to the field of coevolutionary research. Not only does it offer a significantly faster method to estimate the cost of cophylogeny mappings but by using this approach, in conjunction with existing heuristics, it can assist in recovering a larger subset of the Pareto front than has previously been possible.

  17. Greedy subspace clustering.

    DOT National Transportation Integrated Search

    2016-09-01

    We consider the problem of subspace clustering: given points that lie on or near the union of many low-dimensional linear subspaces, recover the subspaces. To this end, one first identifies sets of points close to the same subspace and uses the sets ...

  18. Network immunization under limited budget using graph spectra

    NASA Astrophysics Data System (ADS)

    Zahedi, R.; Khansari, M.

    2016-03-01

    In this paper, we propose a new algorithm that minimizes the worst expected growth of an epidemic by reducing the size of the largest connected component (LCC) of the underlying contact network. The proposed algorithm is applicable to any level of available resources and, unlike the greedy approaches of most immunization strategies, selects nodes simultaneously. In each iteration, the proposed method partitions the LCC into two groups. These are the best candidates for communities in that component, and the available resources are sufficient to separate them. Using Laplacian spectral partitioning, the proposed method performs community detection inference with a time complexity that rivals that of the best previous methods. Experiments show that our method outperforms targeted immunization approaches in both real and synthetic networks.

  19. A native Bayesian classifier based routing protocol for VANETS

    NASA Astrophysics Data System (ADS)

    Bao, Zhenshan; Zhou, Keqin; Zhang, Wenbo; Gong, Xiaolei

    2016-12-01

    Geographic routing protocols are one of the hottest research areas in VANET (Vehicular Ad-hoc Network). However, few routing protocols take both the transmission efficiency and the usage ratio into account. As we have noticed, different messages in VANET may require different quality of service. We therefore propose a Naive Bayesian Classifier based routing protocol (Naive Bayesian Classifier-Greedy, NBC-Greedy), which can classify and transmit different messages according to their degree of urgency. As a result, this protocol can balance the transmission efficiency and the usage ratio. Based on Matlab simulation, we conclude that NBC-Greedy is more efficient and stable than LR-Greedy and GPSR.

  20. Pathgroups, a dynamic data structure for genome reconstruction problems.

    PubMed

    Zheng, Chunfang

    2010-07-01

    Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html Contact: chunfang313@gmail.com Supplementary data are available at Bioinformatics online.

  1. A cooperative game framework for detecting overlapping communities in social networks

    NASA Astrophysics Data System (ADS)

    Jonnalagadda, Annapurna; Kuppusamy, Lakshmanan

    2018-02-01

    Community detection in social networks is a challenging and complex task, which received much attention from researchers of multiple domains in recent years. The evolution of communities in social networks happens merely due to the self-interest of the nodes. The interesting feature of community structure in social networks is the multi membership of the nodes resulting in overlapping communities. Assuming the nodes of the social network as self-interested players, the dynamics of community formation can be captured in the form of a game. In this paper, we propose a greedy algorithm, namely, Weighted Graph Community Game (WGCG), in order to model the interactions among the self-interested nodes of the social network. The proposed algorithm employs the Shapley value mechanism to discover the inherent communities of the underlying social network. The experimental evaluation on the real-world and synthetic benchmark networks demonstrates that the performance of the proposed algorithm is superior to the state-of-the-art overlapping community detection algorithms.

  2. Variable neighborhood search for reverse engineering of gene regulatory networks.

    PubMed

    Nicholson, Charles; Goodwin, Leslie; Clark, Corey

    2017-01-01

    A new search heuristic, Divided Neighborhood Exploration Search, designed to be used with inference algorithms such as Bayesian networks to improve on the reverse engineering of gene regulatory networks is presented. The approach systematically moves through the search space to find topologies representative of gene regulatory networks that are more likely to explain microarray data. In empirical testing it is demonstrated that the novel method is superior to the widely employed greedy search techniques in both the quality of the inferred networks and computational time. Copyright © 2016 Elsevier Inc. All rights reserved.

  3. Channel and Switchbox Routing Using a Greedy Based Channel Algorithm with Outward Scanning Technique.

    DTIC Science & Technology

    1988-12-01

    ... V. CONCLUSION AND DISCUSSION ... APPENDIX A. NPGS ROUTER USER GUIDE ... APPENDIX B. C PROGRAM ... problem and shows some of the terminology, previously mentioned, that is peculiar to VLSI routing. ...

  4. LNDriver: identifying driver genes by integrating mutation and expression data based on gene-gene interaction network.

    PubMed

    Wei, Pi-Jing; Zhang, Di; Xia, Junfeng; Zheng, Chun-Hou

    2016-12-23

    Cancer is a complex disease which is characterized by the accumulation of genetic alterations during the patient's lifetime. With the development of next-generation sequencing technology, multiple omics data, such as cancer genomic, epigenomic and transcriptomic data, can be measured from each individual. Correspondingly, one of the key challenges is to pinpoint functional driver mutations or pathways, which contribute to tumorigenesis, from millions of functionally neutral passenger mutations. In this paper, in order to identify driver genes effectively, we applied a generalized additive model to mutation profiles to filter genes with long length and constructed a new gene-gene interaction network. Then we integrated the mutation data and expression data into the gene-gene interaction network. Lastly, a greedy algorithm was used to prioritize candidate driver genes from the integrated data. We named the proposed method Length-Net-Driver (LNDriver). Experiments on three TCGA datasets, i.e., head and neck squamous cell carcinoma, kidney renal clear cell carcinoma and thyroid carcinoma, demonstrated that the proposed method was effective. Also, it can identify not only frequently mutated drivers, but also rare candidate driver genes.

  5. An Effective Mechanism for Virtual Machine Placement using Aco in IAAS Cloud

    NASA Astrophysics Data System (ADS)

    Shenbaga Moorthy, Rajalakshmi; Fareentaj, U.; Divya, T. K.

    2017-08-01

    Cloud computing provides an effective way to dynamically provide numerous resources to meet customer demands. A major challenge for cloud providers is designing efficient mechanisms for optimal virtual machine placement (OVMP). Such mechanisms enable the cloud providers to effectively utilize their available resources and obtain higher profits. In order to provide appropriate resources to the clients, an optimal virtual machine placement algorithm is proposed. Virtual machine placement is an NP-hard problem, and such problems can be solved using heuristic algorithms. In this paper, Ant Colony Optimization based virtual machine placement is proposed. Our proposed system focuses on minimizing the cost of each plan for hosting virtual machines in a multiple cloud provider environment, and the response time of each cloud provider is monitored periodically so as to minimize delay in providing the resources to the users. The performance of the proposed algorithm is compared with a greedy mechanism. The proposed algorithm is simulated in Eclipse IDE. The results clearly show that the proposed algorithm minimizes the cost, the response time and the number of migrations.

  6. A sub-space greedy search method for efficient Bayesian Network inference.

    PubMed

    Zhang, Qing; Cao, Yong; Li, Yong; Zhu, Yanming; Sun, Samuel S M; Guo, Dianjing

    2011-09-01

    Bayesian network (BN) has been successfully used to infer the regulatory relationships of genes from microarray dataset. However, one major limitation of BN approach is the computational cost because the calculation time grows more than exponentially with the dimension of the dataset. In this paper, we propose a sub-space greedy search method for efficient Bayesian Network inference. Particularly, this method limits the greedy search space by only selecting gene pairs with higher partial correlation coefficients. Using both synthetic and real data, we demonstrate that the proposed method achieved comparable results with standard greedy search method yet saved ∼50% of the computational time. We believe that sub-space search method can be widely used for efficient BN inference in systems biology. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Learning planar Ising models

    DOE PAGES

    Johnson, Jason K.; Oyen, Diane Adele; Chertkov, Michael; ...

    2016-12-01

    Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus on the class of planar Ising models, for which exact inference is tractable using techniques of statistical physics. Based on these techniques and recent methods for planarity testing and planar embedding, we propose a greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. Finally, we demonstrate our method in simulations and for two applications: modeling senate voting records and identifying geo-chemical depth trends from Mars rover data.

  8. Learning planar Ising models

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, Jason K.; Oyen, Diane Adele; Chertkov, Michael

    Inference and learning of graphical models are both well-studied problems in statistics and machine learning that have found many applications in science and engineering. However, exact inference is intractable in general graphical models, which suggests the problem of seeking the best approximation to a collection of random variables within some tractable family of graphical models. In this paper, we focus on the class of planar Ising models, for which exact inference is tractable using techniques of statistical physics. Based on these techniques and recent methods for planarity testing and planar embedding, we propose a greedy algorithm for learning the best planar Ising model to approximate an arbitrary collection of binary random variables (possibly from sample data). Given the set of all pairwise correlations among variables, we select a planar graph and optimal planar Ising model defined on this graph to best approximate that set of correlations. Finally, we demonstrate our method in simulations and for two applications: modeling senate voting records and identifying geo-chemical depth trends from Mars rover data.

  9. Algorithms for Automatic Alignment of Arrays

    NASA Technical Reports Server (NTRS)

    Chatterjee, Siddhartha; Gilbert, John R.; Oliker, Leonid; Schreiber, Robert; Sheffler, Thomas J.

    1996-01-01

    Aggregate data objects (such as arrays) are distributed across the processor memories when compiling a data-parallel language for a distributed-memory machine. The mapping determines the amount of communication needed to bring operands of parallel operations into alignment with each other. A common approach is to break the mapping into two stages: an alignment that maps all the objects to an abstract template, followed by a distribution that maps the template to the processors. This paper describes algorithms for solving the various facets of the alignment problem: axis and stride alignment, static and mobile offset alignment, and replication labeling. We show that optimal axis and stride alignment is NP-complete for general program graphs, and give a heuristic method that can explore the space of possible solutions in a number of ways. We show that some of these strategies can give better solutions than a simple greedy approach proposed earlier. We also show how local graph contractions can reduce the size of the problem significantly without changing the best solution. This allows more complex and effective heuristics to be used. We show how to model the static offset alignment problem using linear programming, and we show that loop-dependent mobile offset alignment is sometimes necessary for optimum performance. We describe an algorithm for determining mobile alignments for objects within do loops. We also identify situations in which replicated alignment is either required by the program itself or can be used to improve performance. We describe an algorithm based on network flow that replicates objects so as to minimize the total amount of broadcast communication in replication.

  10. Sequential Insertion Heuristic with Adaptive Bee Colony Optimisation Algorithm for Vehicle Routing Problem with Time Windows

    PubMed Central

    Jawarneh, Sana; Abdullah, Salwani

    2015-01-01

    This paper presents a bee colony optimisation (BCO) algorithm to tackle the vehicle routing problem with time window (VRPTW). The VRPTW involves recovering an ideal set of routes for a fleet of vehicles serving a defined number of customers. The BCO algorithm is a population-based algorithm that mimics the social communication patterns of honeybees in solving problems. The performance of the BCO algorithm is dependent on its parameters, so the online (self-adaptive) parameter tuning strategy is used to improve its effectiveness and robustness. Compared with the basic BCO, the adaptive BCO performs better. Diversification is crucial to the performance of the population-based algorithm, but the initial population in the BCO algorithm is generated using a greedy heuristic, which has insufficient diversification. Therefore the ways in which the sequential insertion heuristic (SIH) for the initial population drives the population toward improved solutions are examined. Experimental comparisons indicate that the proposed adaptive BCO-SIH algorithm works well across all instances and is able to obtain 11 best results in comparison with the best-known results in the literature when tested on Solomon’s 56 VRPTW 100 customer instances. Also, a statistical test shows that there is a significant difference between the results. PMID:26132158

  11. Optimizing spread dynamics on graphs by message passing

    NASA Astrophysics Data System (ADS)

    Altarelli, F.; Braunstein, A.; Dall'Asta, L.; Zecchina, R.

    2013-09-01

    Cascade processes are responsible for many important phenomena in natural and social sciences. Simple models of irreversible dynamics on graphs, in which nodes activate depending on the state of their neighbors, have been successfully applied to describe cascades in a large variety of contexts. Over the past decades, much effort has been devoted to understanding the typical behavior of the cascades arising from initial conditions extracted at random from some given ensemble. However, the problem of optimizing the trajectory of the system, i.e. of identifying appropriate initial conditions to maximize (or minimize) the final number of active nodes, is still considered to be practically intractable, with the only exception being models that satisfy a sort of diminishing returns property called submodularity. Submodular models can be approximately solved by means of greedy strategies, but by definition they lack cooperative characteristics which are fundamental in many real systems. Here we introduce an efficient algorithm based on statistical physics for the optimization of trajectories in cascade processes on graphs. We show that for a wide class of irreversible dynamics, even in the absence of submodularity, the spread optimization problem can be solved efficiently on large networks. Analytic and algorithmic results on random graphs are complemented by the solution of the spread maximization problem on a real-world network (the Epinions consumer reviews network).

  12. Multi-camera sensor system for 3D segmentation and localization of multiple mobile robots.

    PubMed

    Losada, Cristina; Mazo, Manuel; Palazuelos, Sira; Pizarro, Daniel; Marrón, Marta

    2010-01-01

    This paper presents a method for obtaining the motion segmentation and 3D localization of multiple mobile robots in an intelligent space using a multi-camera sensor system. The set of calibrated and synchronized cameras are placed in fixed positions within the environment (intelligent space). The proposed algorithm for motion segmentation and 3D localization is based on the minimization of an objective function. This function includes information from all the cameras, and it does not rely on previous knowledge or invasive landmarks on board the robots. The proposed objective function depends on three groups of variables: the segmentation boundaries, the motion parameters and the depth. For the objective function minimization, we use a greedy iterative algorithm with three steps that, after initialization of segmentation boundaries and depth, are repeated until convergence.

  13. An Enhanced Artificial Bee Colony Algorithm with Solution Acceptance Rule and Probabilistic Multisearch.

    PubMed

    Yurtkuran, Alkın; Emel, Erdal

    2016-01-01

    The artificial bee colony (ABC) algorithm is a popular swarm based technique, which is inspired from the intelligent foraging behavior of honeybee swarms. This paper proposes a new variant of ABC algorithm, namely, enhanced ABC with solution acceptance rule and probabilistic multisearch (ABC-SA) to address global optimization problems. A new solution acceptance rule is proposed where, instead of greedy selection between old solution and new candidate solution, worse candidate solutions have a probability to be accepted. Additionally, the acceptance probability of worse candidates is nonlinearly decreased throughout the search process adaptively. Moreover, in order to improve the performance of the ABC and balance the intensification and diversification, a probabilistic multisearch strategy is presented. Three different search equations with distinctive characters are employed using predetermined search probabilities. By implementing a new solution acceptance rule and a probabilistic multisearch approach, the intensification and diversification performance of the ABC algorithm is improved. The proposed algorithm has been tested on well-known benchmark functions of varying dimensions by comparing against novel ABC variants, as well as several recent state-of-the-art algorithms. Computational results show that the proposed ABC-SA outperforms other ABC variants and is superior to state-of-the-art algorithms proposed in the literature.
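
The acceptance rule described above can be illustrated with a minimal sketch. The abstract does not give the actual formula, so the polynomial decay schedule, the names `acceptance_probability` and `accept_candidate`, and the parameters `p0` and `power` are all assumptions made for illustration; minimization is assumed as well.

```python
import random


def acceptance_probability(iteration, max_iterations, p0=0.3, power=2.0):
    """Hypothetical schedule: probability of accepting a worse candidate.

    It starts at p0 and decays nonlinearly to 0 at the last iteration.
    The real ABC-SA schedule may differ; this is an illustrative assumption.
    """
    remaining = 1.0 - iteration / max_iterations
    return p0 * remaining ** power


def accept_candidate(old_cost, new_cost, iteration, max_iterations, rng=random):
    """Return True if the candidate should replace the old solution."""
    if new_cost <= old_cost:
        # An improving (or equal) candidate is always kept, as in greedy selection.
        return True
    # A worse candidate is kept only with a shrinking probability.
    return rng.random() < acceptance_probability(iteration, max_iterations)
```

Compared with plain greedy selection, the only change is the last line: early in the search a worse candidate occasionally survives, which favors diversification, while late in the search the rule behaves almost exactly like greedy selection.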

  14. Improving M-SBL for Joint Sparse Recovery Using a Subspace Penalty

    NASA Astrophysics Data System (ADS)

    Ye, Jong Chul; Kim, Jong Min; Bresler, Yoram

    2015-12-01

    The multiple measurement vector problem (MMV) is a generalization of the compressed sensing problem that addresses the recovery of a set of jointly sparse signal vectors. One of the important contributions of this paper is to reveal that the seemingly least related state-of-the-art MMV joint sparse recovery algorithms - M-SBL (multiple sparse Bayesian learning) and subspace-based hybrid greedy algorithms - have a very important link. More specifically, we show that replacing the $\log\det(\cdot)$ term in M-SBL by a rank proxy that exploits the spark reduction property discovered in subspace-based joint sparse recovery algorithms provides significant improvements. In particular, if we use the Schatten-$p$ quasi-norm as the corresponding rank proxy, the global minimiser of the proposed algorithm becomes identical to the true solution as $p \rightarrow 0$. Furthermore, under the same regularity conditions, we show that the convergence to a local minimiser is guaranteed using an alternating minimization algorithm that has closed form expressions for each of the minimization steps, which are convex. Numerical simulations under a variety of scenarios in terms of SNR and condition number of the signal amplitude matrix demonstrate that the proposed algorithm consistently outperforms M-SBL and other state-of-the-art algorithms.

  15. Design and implementation of priority and time-window based traffic scheduling and routing-spectrum allocation mechanism in elastic optical networks

    NASA Astrophysics Data System (ADS)

    Wang, Honghuan; Xing, Fangyuan; Yin, Hongxi; Zhao, Nan; Lian, Bizhan

    2016-02-01

    With the explosive growth of network services, reasonable traffic scheduling and efficient configuration of network resources are important for increasing network efficiency. In this paper, an adaptive traffic scheduling policy based on priority and time window is proposed, and the performance of this algorithm is evaluated in terms of scheduling ratio. Routing and spectrum allocation are achieved by using the Floyd shortest path algorithm and by establishing a node spectrum resource allocation model based on a greedy algorithm that we propose. A fairness index is introduced to improve the capability of spectrum configuration. The results show that the designed traffic scheduling strategy can be applied to networks with multicast and broadcast functionalities and gives them a real-time and efficient response. The node spectrum configuration scheme improves frequency resource utilization and makes fuller use of the network's efficiency.

  16. Design of Clinical Support Systems Using Integrated Genetic Algorithm and Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Chen, Yung-Fu; Huang, Yung-Fa; Jiang, Xiaoyi; Hsu, Yuan-Nian; Lin, Hsuan-Hung

    A clinical decision support system (CDSS) provides knowledge and specific information for clinicians to enhance diagnostic efficiency and improve healthcare quality. An appropriate CDSS can greatly improve patient safety, improve healthcare quality, and increase cost-effectiveness. The support vector machine (SVM) is believed to be superior to traditional statistical and neural network classifiers. However, it is critical to determine a suitable combination of SVM parameters with regard to classification performance. A genetic algorithm (GA) can find an optimal solution within an acceptable time, and is faster than a greedy algorithm with an exhaustive searching strategy. By taking advantage of the GA in quickly selecting the salient features and adjusting SVM parameters, a method using integrated GA and SVM (IGS), which is different from the traditional method with GA used for feature selection and SVM for classification, was used to design CDSSs for prediction of successful ventilation weaning, diagnosis of patients with severe obstructive sleep apnea, and discrimination of different cell types from Pap smear. The results show that IGS is better than methods using SVM alone or a linear discriminator.

  17. Robust MST-Based Clustering Algorithm.

    PubMed

    Liu, Qidong; Zhang, Ruisheng; Zhao, Zhili; Wang, Zhenghai; Jiao, Mengyao; Wang, Guangjing

    2018-06-01

    Minimax similarity stresses the connectedness of points via mediating elements rather than favoring high mutual similarity. The grouping principle yields superior clustering results when mining arbitrarily-shaped clusters in data. However, it is not robust against noise and outliers in the data. There are two main problems with the grouping principle: first, a single object that is far away from all other objects defines a separate cluster, and second, two connected clusters would be regarded as two parts of one cluster. In order to solve such problems, we propose a robust minimum spanning tree (MST)-based clustering algorithm in this letter. First, we separate the connected objects by applying a density-based coarsening phase, resulting in a low-rank matrix in which each element denotes a supernode formed by combining a set of nodes. Then a greedy method is presented to partition those supernodes by working on the low-rank matrix. Instead of removing the longest edges from the MST, our algorithm groups the data set based on the minimax similarity. Finally, the assignment of all data points can be achieved through their corresponding supernodes. Experimental results on many synthetic and real-world data sets show that our algorithm consistently outperforms the compared clustering algorithms.

  18. Efficient least angle regression for identification of linear-in-the-parameters models

    PubMed Central

    Beach, Thomas H.; Rezgui, Yacine

    2017-01-01

    Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods, in that it is neither too greedy nor too slow. It is closely related to L1 norm optimization, which has the advantage of low prediction variance through sacrificing part of model bias property in order to enhance model generalization capability. In this paper, we propose an efficient least angle regression algorithm for model selection for a large class of linear-in-the-parameters models with the purpose of accelerating the model selection process. The entire algorithm works completely in a recursive manner, where the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step. The model coefficients are only computed when the algorithm finishes. The direct involvement of matrix inversions is thereby relieved. A detailed computational complexity analysis indicates that the proposed algorithm possesses significant computational efficiency, compared with the original approach where the well-known efficient Cholesky decomposition is involved in solving least angle regression. Three artificial and real-world examples are employed to demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm. PMID:28293140

  19. Evolutionary-inspired probabilistic search for enhancing sampling of local minima in the protein energy surface

    PubMed Central

    2012-01-01

    Background: Despite computational challenges, elucidating conformations that a protein system assumes under physiologic conditions for the purpose of biological activity is a central problem in computational structural biology. While these conformations are associated with low energies in the energy surface that underlies the protein conformational space, few existing conformational search algorithms focus on explicitly sampling low-energy local minima in the protein energy surface. Methods: This work proposes a novel probabilistic search framework, PLOW, that explicitly samples low-energy local minima in the protein energy surface. The framework combines algorithmic ingredients from evolutionary computation and computational structural biology to effectively explore the subspace of local minima. A greedy local search maps a conformation sampled in conformational space to a nearby local minimum. A perturbation move jumps out of a local minimum to obtain a new starting conformation for the greedy local search. The process repeats in an iterative fashion, resulting in a trajectory-based exploration of the subspace of local minima. Results and conclusions: The analysis of PLOW's performance shows that, by navigating only the subspace of local minima, PLOW is able to sample conformations near a protein's native structure, either more effectively than or as well as state-of-the-art methods that focus on reproducing the native structure for a protein system. Analysis of the actual subspace of local minima shows that PLOW samples this subspace more effectively than a naive sampling approach. Additional theoretical analysis reveals that the perturbation function employed by PLOW is key to its ability to sample a diverse set of low-energy conformations. This analysis also suggests directions for further research and novel applications for the proposed framework. PMID:22759582

  20. Robust 2DPCA with non-greedy l1-norm maximization for image analysis.

    PubMed

    Wang, Rong; Nie, Feiping; Yang, Xiaojun; Gao, Feifei; Yao, Minli

    2015-05-01

    2-D principal component analysis based on the l1-norm (2DPCA-L1) is a recently developed approach for robust dimensionality reduction and feature extraction in the image domain. Normally, a greedy strategy is applied due to the difficulty of directly solving the l1-norm maximization problem; this strategy, however, easily gets stuck in a local solution. In this paper, we propose a robust 2DPCA with non-greedy l1-norm maximization in which all projection directions are optimized simultaneously. Experimental results on face and other datasets confirm the effectiveness of the proposed approach.

  1. Adaptive Greedy Dictionary Selection for Web Media Summarization.

    PubMed

    Cong, Yang; Liu, Ji; Sun, Gan; You, Quanzeng; Li, Yuncheng; Luo, Jiebo

    2017-01-01

    Initializing an effective dictionary is an indispensable step for sparse representation. In this paper, we focus on the dictionary selection problem with the objective of selecting a compact subset of basis elements from the original training data instead of learning a new dictionary matrix as dictionary learning models do. We first design a new dictionary selection model via the l2,0-norm. For model optimization, we propose two methods: one is the standard forward-backward greedy algorithm, which is not suitable for large-scale problems; the other is based on the gradient cues at each forward iteration and speeds up the process dramatically. In comparison with the state-of-the-art dictionary selection models, our model is not only more effective and efficient, but can also control the sparsity. To evaluate the performance of our new model, we select two practical web media summarization problems: 1) we build a new data set consisting of around 500 users, 3000 albums, and 1 million images, and achieve effective assisted albuming based on our model and 2) by formulating the video summarization problem as a dictionary selection issue, we employ our model to extract keyframes from a video sequence in a more flexible way. Generally, our model outperforms the state-of-the-art methods in both tasks.

  2. Network community-detection enhancement by proper weighting

    NASA Astrophysics Data System (ADS)

    Khadivi, Alireza; Ajdari Rad, Ali; Hasler, Martin

    2011-04-01

    In this paper, we show how proper assignment of weights to the edges of a complex network can enhance the detection of communities and how it can circumvent the resolution limit and the extreme degeneracy problems associated with modularity. Our general weighting scheme takes advantage of graph theoretic measures and it introduces two heuristics for tuning its parameters. We use this weighting as a preprocessing step for the greedy modularity optimization algorithm of Newman to improve its performance. The result of the experiments of our approach on computer-generated and real-world data networks confirm that the proposed approach not only mitigates the problems of modularity but also improves the modularity optimization.

  3. Two Simple and Efficient Algorithms to Compute the SP-Score Objective Function of a Multiple Sequence Alignment.

    PubMed

    Ranwez, Vincent

    2016-01-01

    Multiple sequence alignment (MSA) is a crucial step in many molecular analyses and many MSA tools have been developed. Most of them use a greedy approach to construct a first alignment that is then refined by optimizing the sum of pair score (SP-score). The SP-score estimation is thus a bottleneck for most MSA tools since it is repeatedly required and is time consuming. Given an alignment of n sequences and L sites, I introduce here optimized solutions reaching O(nL) time complexity for affine gap cost, instead of O(n^2 L), which are easy to implement.
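
The gap between O(n^2 L) and O(nL) can be illustrated on a simplified variant of the SP-score. The sketch below is not the paper's algorithm: it assumes a fixed match/mismatch score and a linear (not affine) gap penalty, with gap-gap pairs scoring zero, and the function name and default scores are hypothetical. The idea is to count characters per column instead of enumerating all sequence pairs.

```python
from collections import Counter


def sp_score_by_columns(alignment, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score computed from per-column character counts.

    Simplifying assumptions: constant match/mismatch scores, a linear
    penalty for each residue-gap pair, and zero for gap-gap pairs.
    Runs in O(nL) for a fixed alphabet instead of O(n^2 L).
    """
    total = 0
    for column in zip(*alignment):
        counts = Counter(column)
        gaps = counts.pop("-", 0)
        residues = len(column) - gaps
        # Pairs of identical residues in this column.
        same = sum(c * (c - 1) // 2 for c in counts.values())
        # All residue-residue pairs; the rest of them are mismatches.
        residue_pairs = residues * (residues - 1) // 2
        total += match * same
        total += mismatch * (residue_pairs - same)
        total += gap * gaps * residues  # every residue-gap pair
    return total
```

Affine gap costs are harder because a gap opening depends on the previous column for each pair of sequences, which is the case the record above addresses.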

  4. Scheduling Results for the THEMIS Observation Scheduling Tool

    NASA Technical Reports Server (NTRS)

    Mclaren, David; Rabideau, Gregg; Chien, Steve; Knight, Russell; Anwar, Sadaat; Mehall, Greg; Christensen, Philip

    2011-01-01

    We describe a scheduling system intended to assist in the development of instrument data acquisitions for the THEMIS instrument, onboard the Mars Odyssey spacecraft, and compare results from multiple scheduling algorithms. This tool creates observations of both (a) targeted geographical regions of interest and (b) general mapping observations, while respecting spacecraft constraints such as data volume, observation timing, visibility, lighting, season, and science priorities. This tool therefore must address both geometric and state/timing/resource constraints. We describe a tool that maps geometric polygon overlap constraints to set covering constraints using a grid-based approach. These set covering constraints are then incorporated into a greedy optimization scheduling algorithm incorporating operations constraints to generate feasible schedules. The resultant tool generates schedules of hundreds of observations per week out of potential thousands of observations. This tool is currently under evaluation by the THEMIS observation planning team at Arizona State University.
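
Once polygon overlaps have been mapped to grid cells, the selection step resembles greedy set covering. The following is a minimal sketch of that generic technique only; it ignores the data volume, timing, lighting and priority constraints that the described tool handles, and the function name and data layout (a dict from observation name to a set of covered cells) are assumptions.

```python
def greedy_set_cover(cells, candidates):
    """Greedily pick observations until all grid cells are covered.

    cells: iterable of grid-cell identifiers that should be covered.
    candidates: dict mapping an observation name to the set of cells it covers.
    Returns (chosen observation names, cells left uncovered).
    """
    uncovered = set(cells)
    chosen = []
    while uncovered and candidates:
        # Pick the observation that covers the most still-uncovered cells.
        best = max(candidates, key=lambda name: len(candidates[name] & uncovered))
        gained = candidates[best] & uncovered
        if not gained:
            break  # no candidate covers any remaining cell
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered
```

A scheduler built on this idea would additionally filter `candidates` at each step to those that remain feasible under the operational constraints, which is where most of the real complexity lies.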

  5. Application of constraint-based satellite mission planning model in forest fire monitoring

    NASA Astrophysics Data System (ADS)

    Guo, Bingjun; Wang, Hongfei; Wu, Peng

    2017-10-01

    In this paper, a constraint-based satellite mission planning model is established based on the idea of constraint satisfaction. It includes target, request, observation, satellite, payload and other elements, linked by constraints. The optimization goal of the model is to make full use of time and resources and to improve the efficiency of target observation. A greedy algorithm is used to solve the model and produce the observation plan and the data transmission plan. Two simulation experiments are designed and carried out: routine monitoring of global forest fires and emergency monitoring of forest fires in Australia. The simulation results show that the model and algorithm perform well and that the model has good emergency response capability. With this model, efficient and reasonable plans can be worked out to meet users' needs in complex cases involving multiple payloads, multiple targets and variable priorities.

  6. Determination system for solar cell layout in traffic light network using dominating set

    NASA Astrophysics Data System (ADS)

    Eka Yulia Retnani, Windi; Fambudi, Brelyanes Z.; Slamin

    2018-04-01

    Graph theory is one of the fields of mathematics that deals with discrete problems, and its applications are used to solve various problems in daily life. One graph-theoretic concept used for this purpose is the dominating set, which can be applied, for example, to locate objects systematically. In this study, a dominating set is used to determine the dominating points for solar panels, where each vertex represents a traffic light point and each edge represents the connection between two traffic light points. The dominating points are searched for with a greedy algorithm, which determines the locations of the solar panels. This research produced an application that can determine the locations of solar panels with optimal results, that is, the minimum number of dominating points.
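
A greedy search for a dominating set can be sketched as follows. This is a generic textbook-style heuristic, not the authors' application: the function name and the adjacency-dict input format are assumptions, and in general a greedy choice yields a small dominating set without guaranteeing the minimum one on every graph.

```python
def greedy_dominating_set(adjacency):
    """Pick vertices until every vertex is chosen or adjacent to a chosen one.

    adjacency: dict mapping each vertex (e.g. a traffic light) to the set of
    vertices it is connected to.
    """
    undominated = set(adjacency)
    chosen = []
    while undominated:
        # Pick the vertex whose closed neighborhood dominates the most
        # vertices that are not yet dominated.
        best = max(
            adjacency,
            key=lambda v: len(({v} | adjacency[v]) & undominated),
        )
        chosen.append(best)
        undominated -= {best} | adjacency[best]
    return chosen
```

For a chain of five traffic lights, 1-2-3-4-5, this picks lights 2 and 4, so two solar panel locations would serve all five points.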

  7. Resource-aware taxon selection for maximizing phylogenetic diversity.

    PubMed

    Pardi, Fabio; Goldman, Nick

    2007-06-01

    Phylogenetic diversity (PD) is a useful metric for selecting taxa in a range of biological applications, for example, bioconservation and genomics, where the selection is usually constrained by the limited availability of resources. We formalize taxon selection as a conceptually simple optimization problem, aiming to maximize PD subject to resource constraints. This allows us to take into account the different amounts of resources required by the different taxa. Although this is a computationally difficult problem, we present a dynamic programming algorithm that solves it in pseudo-polynomial time. Our algorithm can also solve many instances of the Noah's Ark Problem, a more realistic formulation of taxon selection for biodiversity conservation that allows for taxon-specific extinction risks. These instances extend the set of problems for which solutions are available beyond previously known greedy-tractable cases. Finally, we discuss the relevance of our results to real-life scenarios.

  8. Performance tradeoffs in static and dynamic load balancing strategies

    NASA Technical Reports Server (NTRS)

    Iqbal, M. A.; Saltz, J. H.; Bokhari, S. H.

    1986-01-01

    The problem of uniformly distributing the load of a parallel program over a multiprocessor system was considered. A program was analyzed whose structure permits the computation of the optimal static solution. Then four strategies for load balancing were described and their performance compared. The strategies are: (1) the optimal static assignment algorithm, which is guaranteed to yield the best static solution, (2) the static binary dissection method, which is very fast but sub-optimal, (3) the greedy algorithm, a static fully polynomial time approximation scheme, which estimates the optimal solution to arbitrary accuracy, and (4) the predictive dynamic load balancing heuristic, which uses information on the precedence relationships within the program and outperforms any of the static methods. It is also shown that the overhead incurred by the dynamic heuristic is reduced considerably if it is started off with a static assignment provided by any of the other three strategies.

  9. Efficiently hiding sensitive itemsets with transaction deletion based on genetic algorithms.

    PubMed

    Lin, Chun-Wei; Zhang, Binbin; Yang, Kuo-Tung; Hong, Tzung-Pei

    2014-01-01

    Data mining is used to mine meaningful and useful information or knowledge from a very large database. Some secure or private information can be discovered by data mining techniques, thus resulting in an inherent risk of threats to privacy. Privacy-preserving data mining (PPDM) has thus arisen in recent years to sanitize the original database for hiding sensitive information, which can be regarded as an NP-hard problem in the sanitization process. In this paper, a compact prelarge GA-based (cpGA2DT) algorithm to delete transactions for hiding sensitive itemsets is proposed. It addresses the limitations of the evolutionary process by adopting both the compact GA-based (cGA) mechanism and the prelarge concept. A flexible fitness function with three adjustable weights is designed to find the appropriate transactions to be deleted in order to hide sensitive itemsets with minimal side effects of hiding failure, missing cost, and artificial cost. Experiments are conducted to show the performance of the proposed cpGA2DT algorithm compared to the simple GA-based (sGA2DT) algorithm and the greedy approach in terms of execution time and the three side effects.

  10. A Q-Learning-Based Delay-Aware Routing Algorithm to Extend the Lifetime of Underwater Sensor Networks.

    PubMed

    Jin, Zhigang; Ma, Yingying; Su, Yishan; Li, Shuo; Fu, Xiaomei

    2017-07-19

    Underwater sensor networks (UWSNs) have become a hot research topic because of their various aquatic applications. As the underwater sensor nodes are powered by built-in batteries which are difficult to replace, extending the network lifetime is a most urgent need. Due to the low and variable transmission speed of sound, the design of reliable routing algorithms for UWSNs is challenging. In this paper, we propose a Q-learning based delay-aware routing (QDAR) algorithm to extend the lifetime of underwater sensor networks. In QDAR, a data collection phase is designed to adapt to the dynamic environment. With the application of the Q-learning technique, QDAR can determine a global optimal next hop rather than a greedy one. We define an action-utility function in which residual energy and propagation delay are both considered for adequate routing decisions. Thus, the QDAR algorithm can extend the network lifetime by uniformly distributing the residual energy and provide lower end-to-end delay. The simulation results show that our protocol can yield nearly the same network lifetime, and can reduce the end-to-end delay by 20-25% compared with a classic lifetime-extended routing protocol (QELAR).

  11. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Duchaineau, M.; Wolinsky, M.; Sigeti, D.E.

    Real-time terrain rendering for interactive visualization remains a demanding task. We present a novel algorithm with several advantages over previous methods: our method is unusually stingy with polygons yet achieves real-time performance and is scalable to arbitrary regions and resolutions. The method provides a continuous terrain mesh of specified triangle count having provably minimum error in restricted but reasonably general classes of permissible meshes and error metrics. Our method provides an elegant solution to guaranteeing certain elusive types of consistency in scenes produced by multiple scene generators which share a common finest-resolution database but which otherwise operate entirely independently. This consistency is achieved by exploiting the freedom of choice of error metric allowed by the algorithm to provide, for example, multiple exact lines-of-sight in real-time. Our methods rely on an off-line pre-processing phase to construct a multi-scale data structure consisting of triangular terrain approximations enhanced ("thickened") with world-space error information. In real time, this error data is efficiently transformed into screen-space where it is used to guide a greedy top-down triangle subdivision algorithm which produces the desired minimal error continuous terrain mesh. Our algorithm has been implemented and it operates at real-time rates.

  12. A Q-Learning-Based Delay-Aware Routing Algorithm to Extend the Lifetime of Underwater Sensor Networks

    PubMed Central

    Ma, Yingying; Su, Yishan; Li, Shuo; Fu, Xiaomei

    2017-01-01

    Underwater sensor networks (UWSNs) have become a hot research topic because of their various aquatic applications. As the underwater sensor nodes are powered by built-in batteries which are difficult to replace, extending the network lifetime is a most urgent need. Due to the low and variable transmission speed of sound, the design of reliable routing algorithms for UWSNs is challenging. In this paper, we propose a Q-learning based delay-aware routing (QDAR) algorithm to extend the lifetime of underwater sensor networks. In QDAR, a data collection phase is designed to adapt to the dynamic environment. With the application of the Q-learning technique, QDAR can determine a global optimal next hop rather than a greedy one. We define an action-utility function in which residual energy and propagation delay are both considered for adequate routing decisions. Thus, the QDAR algorithm can extend the network lifetime by uniformly distributing the residual energy and provide lower end-to-end delay. The simulation results show that our protocol can yield nearly the same network lifetime, and can reduce the end-to-end delay by 20–25% compared with a classic lifetime-extended routing protocol (QELAR). PMID:28753951

  13. Randomized algorithms for high quality treatment planning in volumetric modulated arc therapy

    NASA Astrophysics Data System (ADS)

    Yang, Yu; Dong, Bin; Wen, Zaiwen

    2017-02-01

    In recent years, volumetric modulated arc therapy (VMAT) has become an increasingly important radiation technique widely used in clinical application for cancer treatment. One of the key problems in VMAT is treatment plan optimization, which is complicated due to the constraints imposed by the involved equipment. In this paper, we consider a model with four major constraints: the bound on the beam intensity, an upper bound on the rate of the change of the beam intensity, the moving speed of leaves of the multi-leaf collimator (MLC) and its directional-convexity. We solve the model by a two-stage algorithm: performing minimization with respect to the shapes of the aperture and the beam intensities alternately. Specifically, the shapes of the aperture are obtained by a greedy algorithm whose performance is enhanced by random sampling in the leaf pairs with a decremental rate. The beam intensity is optimized using a gradient projection method with non-monotonic line search. We further improve the proposed algorithm by an incremental random importance sampling of the voxels to reduce the computational cost of the energy functional. Numerical simulations on two clinical cancer data sets demonstrate that our method is highly competitive with the state-of-the-art algorithms in terms of both computational time and quality of treatment planning.

  14. UAVs Task and Motion Planning in the Presence of Obstacles and Prioritized Targets

    PubMed Central

    Gottlieb, Yoav; Shima, Tal

    2015-01-01

    The intertwined task assignment and motion planning problem of assigning a team of fixed-winged unmanned aerial vehicles to a set of prioritized targets in an environment with obstacles is addressed. It is assumed that the targets’ locations and initial priorities are determined using a network of unattended ground sensors used to detect potential threats at restricted zones. The targets are characterized by a time-varying level of importance, and timing constraints must be fulfilled before a vehicle is allowed to visit a specific target. It is assumed that the vehicles are carrying body-fixed sensors and, thus, are required to approach a designated target while flying straight and level. The fixed-winged aerial vehicles are modeled as Dubins vehicles, i.e., having a constant speed and a minimum turning radius constraint. The investigated integrated problem of task assignment and motion planning is posed in the form of a decision tree, and two search algorithms are proposed: an exhaustive algorithm that improves over run time and provides the minimum cost solution, encoded in the tree, and a greedy algorithm that provides a quick feasible solution. To satisfy the target’s visitation timing constraint, a path elongation motion planning algorithm amidst obstacles is provided. Using simulations, the performance of the algorithms is compared, evaluated and exemplified. PMID:26610522

  15. Phase unwrapping with graph cuts optimization and dual decomposition acceleration for 3D high-resolution MRI data.

    PubMed

    Dong, Jianwu; Chen, Feng; Zhou, Dong; Liu, Tian; Yu, Zhaofei; Wang, Yi

    2017-03-01

    Existence of low SNR regions and rapid-phase variations pose challenges to spatial phase unwrapping algorithms. Global optimization-based phase unwrapping methods are widely used, but are significantly slower than greedy methods. In this paper, dual decomposition acceleration is introduced to speed up a three-dimensional graph cut-based phase unwrapping algorithm. The phase unwrapping problem is formulated as a global discrete energy minimization problem, whereas the technique of dual decomposition is used to increase the computational efficiency by splitting the full problem into overlapping subproblems and enforcing the congruence of overlapping variables. Using three dimensional (3D) multiecho gradient echo images from an agarose phantom and five brain hemorrhage patients, we compared this proposed method with an unaccelerated graph cut-based method. Experimental results show up to 18-fold acceleration in computation time. Dual decomposition significantly improves the computational efficiency of 3D graph cut-based phase unwrapping algorithms. Magn Reson Med 77:1353-1358, 2017. © 2016 International Society for Magnetic Resonance in Medicine.

  16. Enhancing battery efficiency for pervasive health-monitoring systems based on electronic textiles.

    PubMed

    Zheng, Nenggan; Wu, Zhaohui; Lin, Man; Yang, Laurence Tianruo

    2010-03-01

    Electronic textiles are regarded as one of the most important computation platforms for future computer-assisted health-monitoring applications. In these novel systems, multiple batteries are used in order to prolong their operational lifetime, which is a significant metric for system usability. However, due to the nonlinear features of batteries, computing systems with multiple batteries cannot achieve the same battery efficiency as those powered by a monolithic battery of equal capacity. In this paper, we propose an algorithm aiming to maximize battery efficiency globally for the computer-assisted health-care systems with multiple batteries. Based on an accurate analytical battery model, the concept of weighted battery fatigue degree is introduced and the novel battery-scheduling algorithm called predicted weighted fatigue degree least first (PWFDLF) is developed. We also discuss our attempts during the search for PWFDLF: a weighted round-robin (WRR) and a greedy algorithm achieving highest local battery efficiency, which reduces to the sequential discharging policy. Evaluation results show that a considerable improvement in battery efficiency can be obtained by PWFDLF under various battery configurations and current profiles compared to conventional sequential and WRR discharging policies.

  17. An algorithm for direct causal learning of influences on patient outcomes.

    PubMed

    Rathnam, Chandramouli; Lee, Sanghoon; Jiang, Xia

    2017-01-01

    This study aims at developing and introducing a new algorithm, called direct causal learner (DCL), for learning the direct causal influences of a single target. We applied it to both simulated and real clinical and genome wide association study (GWAS) datasets and compared its performance to classic causal learning algorithms. The DCL algorithm learns the causes of a single target from passive data using Bayesian-scoring, instead of using independence checks, and a novel deletion algorithm. We generate 14,400 simulated datasets and measure the number of datasets for which DCL correctly and partially predicts the direct causes. We then compare its performance with the constraint-based path consistency (PC) and conservative PC (CPC) algorithms, the Bayesian-score based fast greedy search (FGS) algorithm, and the partial ancestral graphs algorithm fast causal inference (FCI). In addition, we extend our comparison of all five algorithms to both a real GWAS dataset and real breast cancer datasets over various time-points in order to observe how effective they are at predicting the causal influences of Alzheimer's disease and breast cancer survival. DCL consistently outperforms FGS, PC, CPC, and FCI in discovering the parents of the target for the datasets simulated using a simple network. Overall, DCL predicts significantly more datasets correctly (McNemar's test significance: p<0.0001) than any of the other algorithms for these network types. For example, when assessing overall performance (simple and complex network results combined), DCL correctly predicts approximately 1400 more datasets than the top FGS method, 1600 more datasets than the top CPC method, 4500 more datasets than the top PC method, and 5600 more datasets than the top FCI method. 
Although FGS did correctly predict more datasets than DCL for the complex networks, and DCL correctly predicted only a few more datasets than CPC for these networks, there is no significant difference in performance between these three algorithms for this network type. However, when we use a more continuous measure of accuracy, we find that all the DCL methods are able to better partially predict more direct causes than FGS and CPC for the complex networks. In addition, DCL consistently had faster runtimes than the other algorithms. In the application to the real datasets, DCL identified rs6784615, located on the NISCH gene, and rs10824310, located on the PRKG1 gene, as direct causes of late onset Alzheimer's disease (LOAD) development. In addition, DCL identified ER category as a direct predictor of breast cancer mortality within 5 years, and HER2 status as a direct predictor of 10-year breast cancer mortality. These predictors have been identified in previous studies to have a direct causal relationship with their respective phenotypes, supporting the predictive power of DCL. When the other algorithms discovered predictors from the real datasets, these predictors were either also found by DCL or could not be supported by previous studies. Our results show that DCL outperforms FGS, PC, CPC, and FCI in almost every case, demonstrating its potential to advance causal learning. Furthermore, our DCL algorithm effectively identifies direct causes in the LOAD and Metabric GWAS datasets, which indicates its potential for clinical applications. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. The Greedy Little Boy Teacher's Manual [With Units for Levels A and B].

    ERIC Educational Resources Information Center

    Otto, Dale; George, Larry

    The Center for the Study of Migrant and Indian Education has recognized the need to develop special materials to improve the non-Indian's understanding of the differences he observes in his Indian classmates and to promote a better understanding by American Indian children of their unique cultural heritage. The Greedy Little Boy is a traditional…

  19. Landmine detection using two-tapped joint orthogonal matching pursuits

    NASA Astrophysics Data System (ADS)

    Goldberg, Sean; Glenn, Taylor; Wilson, Joseph N.; Gader, Paul D.

    2012-06-01

    Joint Orthogonal Matching Pursuits (JOMP) is used here in the context of landmine detection using data obtained from an electromagnetic induction (EMI) sensor. The response from an object containing metal can be decomposed into a discrete spectrum of relaxation frequencies (DSRF) from which we construct a dictionary. A greedy iterative algorithm is proposed for computing successive residuals of a signal by subtracting away the highest matching dictionary element at each step. The final confidence of a particular signal is a combination of the reciprocal of this residual and the mean of the complex component. A two-tap approach comparing signals on opposite sides of the geometric location of the sensor is examined and found to produce better classification. It is found that using only a single pursuit does a comparable job, reducing complexity and allowing for real-time implementation in automated target recognition systems. JOMP is particularly highlighted in comparison with a previous EMI detection algorithm known as String Match.
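
    The greedy residual step described above (subtract the best-matching dictionary element, then repeat) can be illustrated with plain matching pursuit. This is a minimal sketch, not the JOMP method itself: it omits the orthogonalization and joint (two-tap) aspects, uses real-valued atoms rather than complex DSRF responses, and assumes every atom has unit norm. The function names are illustrative.

```python
import math


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def matching_pursuit(signal, dictionary, n_iter):
    """Plain matching pursuit over unit-norm atoms (lists of floats).

    At each step the atom with the largest absolute inner product with
    the current residual is selected, and its projection is subtracted.
    Returns (coefficients, final_residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Correlate the current residual with every atom.
        scores = [sum(r * a for r, a in zip(residual, atom))
                  for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < 1e-12:
            break  # nothing left that any atom can explain
        coeffs[best] += scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual
```

    A small residual norm after a few iterations means the signal is well explained by a few atoms; the abstract uses the reciprocal of the residual as one ingredient of its confidence value.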

  20. Research on Operation Strategy for Bundled Wind-thermal Generation Power Systems Based on Two-Stage Optimization Model

    NASA Astrophysics Data System (ADS)

    Sun, Congcong; Wang, Zhijie; Liu, Sanming; Jiang, Xiuchen; Sheng, Gehao; Liu, Tianyu

    2017-05-01

    Wind power has the advantages of being clean and non-polluting and the development of bundled wind-thermal generation power systems (BWTGSs) is one of the important means to improve wind power accommodation rate and implement “clean alternative” on generation side. A two-stage optimization strategy for BWTGSs considering wind speed forecasting results and load characteristics is proposed. By taking short-term wind speed forecasting results of generation side and load characteristics of demand side into account, a two-stage optimization model for BWTGSs is formulated. By using the environmental benefit index of BWTGSs as the objective function, supply-demand balance and generator operation as the constraints, the first-stage optimization model is developed with the chance-constrained programming theory. By using the operation cost for BWTGSs as the objective function, the second-stage optimization model is developed with the greedy algorithm. The improved PSO algorithm is employed to solve the model and numerical test verifies the effectiveness of the proposed strategy.

  1. Scaling up spike-and-slab models for unsupervised feature learning.

    PubMed

    Goodfellow, Ian J; Courville, Aaron; Bengio, Yoshua

    2013-08-01

    We describe the use of two spike-and-slab models for modeling real-valued data, with an emphasis on their applications to object recognition. The first model, which we call spike-and-slab sparse coding (S3C), is a preexisting model for which we introduce a faster approximate inference algorithm. We introduce a deep variant of S3C, which we call the partially directed deep Boltzmann machine (PD-DBM) and extend our S3C inference algorithm for use on this model. We describe learning procedures for each. We demonstrate that our inference procedure for S3C enables scaling the model to unprecedented large problem sizes, and demonstrate that using S3C as a feature extractor results in very good object recognition performance, particularly when the number of labeled examples is low. We show that the PD-DBM generates better samples than its shallow counterpart, and that unlike DBMs or DBNs, the PD-DBM may be trained successfully without greedy layerwise training.

  2. Evaluation of five diffeomorphic image registration algorithms for mouse brain magnetic resonance microscopy.

    PubMed

    Fu, Zhenrong; Lin, Lan; Tian, Miao; Wang, Jingxuan; Zhang, Baiwen; Chu, Pingping; Li, Shaowu; Pathan, Muhammad Mohsin; Deng, Yulin; Wu, Shuicai

    2017-11-01

    The development of genetically engineered mouse models for neuronal diseases and behavioural disorders has generated a growing need for small animal imaging. High-resolution magnetic resonance microscopy (MRM) provides powerful capabilities for noninvasive studies of mouse brains, while avoiding some limits associated with the histological procedures. Quantitative comparison of structural images is a critical step in brain imaging analysis, which relies heavily on the performance of image registration techniques. Human brain registration algorithms are now proliferating, while fine-tuning of those algorithms for mouse brain MRMs is rarely addressed. Because of their topology preservation property and outstanding performance in human studies, diffeomorphic transformations have become popular in computational anatomy. In this study, we specially tuned five diffeomorphic image registration algorithms [DARTEL, geodesic shooting, diffeo-demons, SyN (Greedy-SyN and geodesic-SyN)] for mouse brain MRMs and evaluated their performance using three measures [volume overlap percentage (VOP), residual intensity error (RIE) and surface concordance ratio (SCR)]. Geodesic-SyN performed significantly better than the other methods according to all three measures. These findings are important for the studies on structural brain changes that may occur in wild-type and transgenic mouse brains. © 2017 The Authors Journal of Microscopy © 2017 Royal Microscopical Society.

  3. Limited-memory adaptive snapshot selection for proper orthogonal decomposition

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oxberry, Geoffrey M.; Kostova-Vassilevska, Tanya; Arrighi, Bill

    2015-04-02

    Reduced order models are useful for accelerating simulations in many-query contexts, such as optimization, uncertainty quantification, and sensitivity analysis. However, offline training of reduced order models can have prohibitively expensive memory and floating-point operation costs in high-performance computing applications, where memory per core is limited. To overcome this limitation for proper orthogonal decomposition, we propose a novel adaptive selection method for snapshots in time that limits offline training costs by selecting snapshots according to an error control mechanism similar to that found in adaptive time-stepping ordinary differential equation solvers. The error estimator used in this work is related to theory bounding the approximation error in time of proper orthogonal decomposition-based reduced order models, and memory usage is minimized by computing the singular value decomposition using a single-pass incremental algorithm. Results for a viscous Burgers’ test problem demonstrate convergence in the limit as the algorithm error tolerances go to zero; in this limit, the full order model is recovered to within discretization error. The resulting method can be used on supercomputers to generate proper orthogonal decomposition-based reduced order models, or as a subroutine within hyperreduction algorithms that require taking snapshots in time, or within greedy algorithms for sampling parameter space.

  4. A System for Automatically Generating Scheduling Heuristics

    NASA Technical Reports Server (NTRS)

    Morris, Robert

    1996-01-01

    The goal of this research is to improve the performance of automated schedulers by designing and implementing an algorithm for automatically generating heuristics for selecting a schedule. The particular application selected for applying this method solves the problem of scheduling telescope observations, and is called the Associate Principal Astronomer (APA). The input to the APA scheduler is a set of observation requests submitted by one or more astronomers. Each observation request specifies an observation program as well as scheduling constraints and preferences associated with the program. The scheduler employs greedy heuristic search to synthesize a schedule that satisfies all hard constraints of the domain and achieves a good score with respect to soft constraints expressed as an objective function established by an astronomer-user.

  5. On the Complexity of the Metric TSP under Stability Considerations

    NASA Astrophysics Data System (ADS)

    Mihalák, Matúš; Schöngens, Marcel; Šrámek, Rastislav; Widmayer, Peter

    We consider the metric Traveling Salesman Problem (Δ-TSP for short) and study how stability (as defined by Bilu and Linial [3]) influences the complexity of the problem. On an intuitive level, an instance of Δ-TSP is γ-stable (γ> 1), if there is a unique optimum Hamiltonian tour and any perturbation of arbitrary edge weights by at most γ does not change the edge set of the optimal solution (i.e., there is a significant gap between the optimum tour and all other tours). We show that for γ ≥ 1.8 a simple greedy algorithm (resembling Prim's algorithm for constructing a minimum spanning tree) computes the optimum Hamiltonian tour for every γ-stable instance of the Δ-TSP, whereas a simple local search algorithm can fail to find the optimum even if γ is arbitrary. We further show that there are γ-stable instances of Δ-TSP for every 1 < γ< 2. These results provide a different view on the hardness of the Δ-TSP and give rise to a new class of problem instances which are substantially easier to solve than instances of the general Δ-TSP.

  6. Joint Sparse Recovery With Semisupervised MUSIC

    NASA Astrophysics Data System (ADS)

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-05-01

    Discrete multiple signal classification (MUSIC), with its low computational cost and mild condition requirements, has become a significant noniterative algorithm for joint sparse recovery (JSR). However, it fails on rank-defective problems caused by coherent or too few multiple measurement vectors (MMVs). In this letter, we provide a novel insight to address this problem by interpreting JSR as a binary classification problem with respect to atoms. Meanwhile, MUSIC essentially constructs a supervised classifier based on the labeled MMVs so that its performance will heavily depend on the quality and quantity of these training samples. From this viewpoint, we develop a semisupervised MUSIC (SS-MUSIC) in the spirit of machine learning, which declares that the insufficient supervised information in the training samples can be compensated from those unlabeled atoms. Instead of constructing a classifier in a fully supervised manner, we iteratively refine a semisupervised classifier by exploiting the labeled MMVs and some reliable unlabeled atoms simultaneously. In this way, the required conditions and iterations can be greatly relaxed and reduced. Numerical experimental results demonstrate that SS-MUSIC can achieve much better recovery performance than other MUSIC extended algorithms as well as some typical greedy algorithms for JSR in terms of iterations and recovery probability.

  7. A novel structured dictionary for fast processing of 3D medical images, with application to computed tomography restoration and denoising

    NASA Astrophysics Data System (ADS)

    Karimi, Davood; Ward, Rabab K.

    2016-03-01

    Sparse representation of signals in learned overcomplete dictionaries has proven to be a powerful tool with applications in denoising, restoration, compression, reconstruction, and more. Recent research has shown that learned overcomplete dictionaries can lead to better results than analytical dictionaries such as wavelets in almost all image processing applications. However, a major disadvantage of these dictionaries is that their learning and usage is very computationally intensive. In particular, finding the sparse representation of a signal in these dictionaries requires solving an optimization problem that leads to very long computational times, especially in 3D image processing. Moreover, the sparse representation found by greedy algorithms is usually sub-optimal. In this paper, we propose a novel two-level dictionary structure that improves the performance and the speed of standard greedy sparse coding methods. The first (i.e., the top) level in our dictionary is a fixed orthonormal basis, whereas the second level includes the atoms that are learned from the training data. We explain how such a dictionary can be learned from the training data and how the sparse representation of a new signal in this dictionary can be computed. As an application, we use the proposed dictionary structure for removing the noise and artifacts in 3D computed tomography (CT) images. Our experiments with real CT images show that the proposed method achieves results that are comparable with standard dictionary-based methods while substantially reducing the computational time.

  8. Exploring Maps with Greedy Navigators

    NASA Astrophysics Data System (ADS)

    Lee, Sang Hoon; Holme, Petter

    2012-03-01

    During the last decade of network research focusing on structural and dynamical properties of networks, the role of network users has been more or less underestimated from the bird’s-eye view of global perspective. In this era of global positioning system equipped smartphones, however, a user’s ability to access local geometric information and find efficient pathways on networks plays a crucial role, rather than the globally optimal pathways. We present a simple greedy spatial navigation strategy as a probe to explore spatial networks. These greedy navigators use directional information in every move they take, without being trapped in a dead end based on their memory about previous routes. We suggest that centrality measures have to be modified to incorporate the navigators’ behavior, and present the intriguing effect of navigators’ greediness where removing some edges may actually enhance the routing efficiency, which is reminiscent of Braess’s paradox. In addition, using samples of road structures in large cities around the world, it is shown that the navigability measure we define reflects unique structural properties, which are not easy to predict from other topological characteristics. In this respect, we believe that our routing scheme significantly moves the routing problem on networks one step closer to reality, incorporating the inevitable incompleteness of navigators’ information.

  9. Selection Strategies for Social Influence in the Threshold Model

    NASA Astrophysics Data System (ADS)

    Karampourniotis, Panagiotis; Szymanski, Boleslaw; Korniss, Gyorgy

    The ubiquity of online social networks makes the study of social influence extremely significant for its applications to marketing, politics and security. Maximizing the spread of influence by strategically selecting nodes as initiators of a new opinion or trend is a challenging problem. We study the performance of various strategies for selection of large fractions of initiators on a classical social influence model, the Threshold model (TM). Under the TM, a node adopts a new opinion only when the fraction of its first neighbors possessing that opinion exceeds a pre-assigned threshold. The strategies we study are of two kinds: strategies based solely on the initial network structure (Degree-rank, Dominating Sets, PageRank etc.) and strategies that take into account the change of the states of the nodes during the evolution of the cascade, e.g. the greedy algorithm. We find that the performance of these strategies depends largely on both the network structure properties, e.g. the assortativity, and the distribution of the thresholds assigned to the nodes. We conclude that the optimal strategy needs to combine the network specifics and the model specific parameters to identify the most influential spreaders. Supported in part by ARL NS-CTA, ARO, and ONR.
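
    The adoption rule stated above (a node adopts once the fraction of its first neighbors holding the opinion exceeds its threshold) can be sketched directly. The greedy selection shown here is one plausible reading, assumed for illustration: at each step it adds the initiator that yields the largest final cascade. The abstract does not specify the authors' exact greedy criterion, and the function names are hypothetical.

```python
def cascade(neighbors, thresholds, seeds):
    """Run the Threshold model to its fixed point.

    neighbors: dict node -> list of neighboring nodes
    thresholds: dict node -> threshold in [0, 1]
    A node activates when the active fraction of its neighbors strictly
    exceeds its threshold. Returns the final set of active nodes.
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in active or not nbrs:
                continue
            frac = sum(1 for n in nbrs if n in active) / len(nbrs)
            if frac > thresholds[node]:
                active.add(node)
                changed = True
    return active


def greedy_initiators(neighbors, thresholds, k):
    """Pick k initiators, each time adding the node whose inclusion
    gives the largest final cascade (ties go to the first node seen)."""
    seeds = set()
    for _ in range(k):
        candidates = [n for n in neighbors if n not in seeds]
        if not candidates:
            break
        best = max(candidates,
                   key=lambda n: len(cascade(neighbors, thresholds,
                                             seeds | {n})))
        seeds.add(best)
    return seeds
```

    Each greedy step reruns the cascade for every candidate, so this naive version is only practical for small networks; structure-based rankings such as degree avoid that cost but ignore how node states evolve.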

  10. Emergence of social cohesion in a model society of greedy, mobile individuals

    PubMed Central

    Roca, Carlos P.; Helbing, Dirk

    2011-01-01

    Human wellbeing in modern societies relies on social cohesion, which can be characterized by high levels of cooperation and a large number of social ties. Both features, however, are frequently challenged by individual self-interest. In fact, the stability of social and economic systems can suddenly break down as the recent financial crisis and outbreaks of civil wars illustrate. To understand the conditions for the emergence and robustness of social cohesion, we simulate the creation of public goods among mobile agents, assuming that behavioral changes are determined by individual satisfaction. Specifically, we study a generalized win-stay-lose-shift learning model, which is only based on previous experience and rules out greenbeard effects that would allow individuals to guess future gains. The most noteworthy aspect of this model is that it promotes cooperation in social dilemma situations despite very low information requirements and without assuming imitation, a shadow of the future, reputation effects, signaling, or punishment. We find that moderate greediness favors social cohesion by a coevolution between cooperation and spatial organization, additionally showing that those cooperation-enforcing levels of greediness can be evolutionarily selected. However, a maladaptive trend of increasing greediness, although enhancing individuals’ returns in the beginning, eventually causes cooperation and social relationships to fall apart. Our model is, therefore, expected to shed light on the long-standing problem of the emergence and stability of cooperative behavior. PMID:21709245

  11. Emergence of social cohesion in a model society of greedy, mobile individuals.

    PubMed

    Roca, Carlos P; Helbing, Dirk

    2011-07-12

    Human wellbeing in modern societies relies on social cohesion, which can be characterized by high levels of cooperation and a large number of social ties. Both features, however, are frequently challenged by individual self-interest. In fact, the stability of social and economic systems can suddenly break down as the recent financial crisis and outbreaks of civil wars illustrate. To understand the conditions for the emergence and robustness of social cohesion, we simulate the creation of public goods among mobile agents, assuming that behavioral changes are determined by individual satisfaction. Specifically, we study a generalized win-stay-lose-shift learning model, which is only based on previous experience and rules out greenbeard effects that would allow individuals to guess future gains. The most noteworthy aspect of this model is that it promotes cooperation in social dilemma situations despite very low information requirements and without assuming imitation, a shadow of the future, reputation effects, signaling, or punishment. We find that moderate greediness favors social cohesion by a coevolution between cooperation and spatial organization, additionally showing that those cooperation-enforcing levels of greediness can be evolutionarily selected. However, a maladaptive trend of increasing greediness, although enhancing individuals' returns in the beginning, eventually causes cooperation and social relationships to fall apart. Our model is, therefore, expected to shed light on the long-standing problem of the emergence and stability of cooperative behavior.

  12. Scheduling and control strategies for the departure problem in air traffic control

    NASA Astrophysics Data System (ADS)

    Bolender, Michael Alan

    Two problems relating to the departure problem in air traffic control automation are examined. The first problem that is addressed is the scheduling of aircraft for departure. The departure operations at a major US hub airport are analyzed, and a discrete event simulation of the departure operations is constructed. Specifically, the case where there is a single departure runway is considered. The runway is fed by two queues of aircraft. Each queue, in turn, is fed by a single taxiway. Two salient areas regarding scheduling are addressed. The first is the construction of optimal departure sequences for the aircraft that are queued. Several greedy search algorithms are designed to minimize the total time to depart a set of queued aircraft. Each algorithm has a different set of heuristic rules to resolve situations within the search space whenever two branches of the search tree with equal edge costs are encountered. These algorithms are then compared and contrasted with a genetic search algorithm in order to assess the performance of the heuristics. This is done in the context of a static departure problem where the length of the departure queue is fixed. A greedy algorithm which deepens the search whenever two branches of the search tree with non-unique costs are encountered is shown to outperform the other heuristic algorithms. This search strategy is then implemented in the discrete event simulation. A baseline performance level is established, and a sensitivity analysis is performed by implementing changes in traffic mix, routing, and miles-in-trail restrictions for comparison. It is concluded that to minimize the average time spent in the queue for different traffic conditions, a queue assignment algorithm is needed to maintain an even balance of aircraft in the queues. A necessary consideration is to base queue assignment upon traffic management restrictions such as miles-in-trail constraints. 
The second problem addresses the technical challenges associated with merging departure aircraft onto their filed routes in a congested airspace environment. Conflicts between departures and en route aircraft within the Center airspace are analyzed. Speed control, holding the aircraft at an intermediate altitude, re-routing, and vectoring are posed as possible deconfliction maneuvers. A cost assessment of these merge strategies, which are based upon 4D flight management and conflict detection and resolution principles, is given. Several merge conflicts are studied and a cost for each resolution is computed. It is shown that vectoring tends to be the most expensive resolution technique. Altitude hold is simple, costs less than vectoring, but may require a long time for the aircraft to achieve separation. Re-routing is the simplest, and provides the most cost benefit since the aircraft flies a shorter distance than if it had followed its filed route. Speed control is shown to be ineffective as a means of increasing separation, but is effective for maintaining separation between aircraft. In addition, the effects of uncertainties on the cost are assessed. The analysis shows that cost is invariant with the decision time.

  13. Short paths in expander graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kleinberg, J.; Rubinfeld, R.

    Graph expansion has proved to be a powerful general tool for analyzing the behavior of routing algorithms and the interconnection networks on which they run. We develop new routing algorithms and structural results for bounded-degree expander graphs. Our results are unified by the fact that they are all based upon, and extend, a body of work asserting that expanders are rich in short, disjoint paths. In particular, our work has consequences for the disjoint paths problem, multicommodity flow, and graph minor containment. We show: (i) A greedy algorithm for approximating the maximum disjoint paths problem achieves a polylogarithmic approximation ratio in bounded-degree expanders. Although our algorithm is both deterministic and on-line, its performance guarantee is an improvement over previous bounds in expanders. (ii) For a multicommodity flow problem with arbitrary demands on a bounded-degree expander, there is a (1 + ε)-optimal solution using only flow paths of polylogarithmic length. It follows that the multicommodity flow algorithm of Awerbuch and Leighton runs in nearly linear time per commodity in expanders. Our analysis is based on establishing the following: given edge weights on an expander G, one can increase some of the weights very slightly so the resulting shortest-path metric is smooth - the min-weight path between any pair of nodes uses a polylogarithmic number of edges. (iii) Every bounded-degree expander on n nodes contains every graph with O(n/log^{O(1)} n) nodes and edges as a minor.
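
    As a rough illustration of item (i), an on-line greedy rule for edge-disjoint paths can be sketched as below: each request is routed along a shortest path through still-unused edges, and is rejected if no such path exists within a length bound. This is a minimal sketch under stated assumptions, not the algorithm analyzed in the paper; the names `bfs_path` and `bounded_greedy_paths` and the parameter `max_len` are hypothetical.

```python
from collections import deque


def bfs_path(adj, used, s, t):
    """Shortest s-t path (as a node list) avoiding used edges, or None."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent and frozenset((u, v)) not in used:
                parent[v] = u
                queue.append(v)
    return None


def bounded_greedy_paths(adj, requests, max_len):
    """Process requests in arrival order; accept a request only if a path
    of at most max_len edges exists among the edges not yet used."""
    used = set()      # undirected edges consumed by accepted paths
    accepted = []     # accepted paths, in the order they were routed
    for s, t in requests:
        path = bfs_path(adj, used, s, t)
        if path is not None and len(path) - 1 <= max_len:
            accepted.append(path)
            used.update(frozenset(e) for e in zip(path, path[1:]))
    return accepted


# A 6-cycle: two edge-disjoint routes exist between nodes 0 and 3.
CYCLE = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 0]}
paths = bounded_greedy_paths(CYCLE, [(0, 3), (0, 3), (0, 3)], max_len=3)
```

    On the 6-cycle the first two requests take the two opposite routes and the third is rejected because every edge is already in use.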

  14. Sparse Regression as a Sparse Eigenvalue Problem

    NASA Technical Reports Server (NTRS)

    Moghaddam, Baback; Gruber, Amit; Weiss, Yair; Avidan, Shai

    2008-01-01

    We extend the l0-norm "subspectral" algorithms for sparse-LDA [5] and sparse-PCA [6] to general quadratic costs such as MSE in linear (kernel) regression. The resulting "Sparse Least Squares" (SLS) problem is also NP-hard, by way of its equivalence to a rank-1 sparse eigenvalue problem (e.g., binary sparse-LDA [7]). Specifically, for a general quadratic cost we use a highly-efficient technique for direct eigenvalue computation using partitioned matrix inverses which leads to dramatic ×10^3 speed-ups over standard eigenvalue decomposition. This increased efficiency mitigates the O(n^4) scaling behaviour that up to now has limited the previous algorithms' utility for high-dimensional learning problems. Moreover, the new computation prioritizes the role of the less-myopic backward elimination stage which becomes more efficient than forward selection. Similarly, branch-and-bound search for Exact Sparse Least Squares (ESLS) also benefits from partitioned matrix inverse techniques. Our Greedy Sparse Least Squares (GSLS) generalizes Natarajan's algorithm [9] also known as Order-Recursive Matching Pursuit (ORMP). Specifically, the forward half of GSLS is exactly equivalent to ORMP but more efficient. By including the backward pass, which only doubles the computation, we can achieve lower MSE than ORMP. Experimental comparisons to the state-of-the-art LARS algorithm [3] show forward-GSLS is faster, more accurate and more flexible in terms of choice of regularization.

  15. Study of Huizhou architecture component point cloud in surface reconstruction

    NASA Astrophysics Data System (ADS)

    Zhang, Runmei; Wang, Guangyin; Ma, Jixiang; Wu, Yulu; Zhang, Guangbin

    2017-06-01

    Surface reconfiguration software has many problems, such as complicated operations on point cloud data, too many interaction definitions, and overly stringent requirements for input data. Thus, it has not been widely adopted so far. This paper selects the unique Huizhou Architecture chuandou wooden beam framework as the research object, and presents a complete implementation covering point data acquisition, point cloud preprocessing, and finally surface reconstruction. First, the acquired point cloud data are preprocessed, including segmentation and filtering. Second, the surface normals are deduced directly from the point cloud dataset. Finally, the surface reconstruction is studied by using the Greedy Projection Triangulation Algorithm. Comparing the reconstructed model with those produced by three-dimensional surface reconstruction software, the results show that the proposed scheme is smoother, more time-efficient, and more portable.

  16. Hammerstein system representation of financial volatility processes

    NASA Astrophysics Data System (ADS)

    Capobianco, E.

    2002-05-01

    We show new modeling aspects of stock return volatility processes, by first representing them through Hammerstein Systems, and by then approximating the observed and transformed dynamics with wavelet-based atomic dictionaries. We thus propose a hybrid statistical methodology for volatility approximation and non-parametric estimation, and aim to use the information embedded in a bank of volatility sources obtained by decomposing the observed signal with multiresolution techniques. Scale dependent information refers both to market activity inherent to different temporally aggregated trading horizons, and to a variable degree of sparsity in representing the signal. A decomposition of the expansion coefficients in least dependent coordinates is then implemented through Independent Component Analysis. Based on the described steps, the features of volatility can be more effectively detected through global and greedy algorithms.

  17. Distributed resource allocation under communication constraints

    NASA Astrophysics Data System (ADS)

    Dodin, Pierre; Nimier, Vincent

    2001-03-01

    This paper deals with a study of the multi-sensor management problem for multi-target tracking. The collaboration between many sensors observing the same target means that they are able to fuse their data during the information process. One must then take this possibility into account to compute the optimal sensor-target association at each time step. In order to solve this problem for a real large-scale system, one must consider both the information aspect and the control aspect of the problem. To unify these problems, one possibility is to use a decentralized filtering algorithm locally driven by an assignment algorithm. The decentralized filtering algorithm we use in our model is the filtering algorithm of Grime, which relaxes the usual full-connected hypothesis. By full-connected, one means that the information in a full-connected system is totally distributed everywhere at the same moment, which is unacceptable for a real large-scale system. We model the distributed assignment decision with the help of a greedy algorithm. Each sensor performs a global optimization, in order to estimate other information sets. A consequence of the relaxation of the full-connected hypothesis is that the sensors' information sets are not the same at each time step, producing an information dis-symmetry in the system. The assignment algorithm uses a local knowledge of this dis-symmetry. By testing the reactions and the coherence of the local assignment decisions of our system against maneuvering targets, we show that it is still possible to manage with decentralized assignment control even though the system is not full-connected.

  18. Classroom Materials for Job-Related BSEP 2 Program

    DTIC Science & Technology

    1983-09-01

    gathered D. gathered, combined, camouflage 10. The greedy man was happy to take the money. A. greedy C. was *B. was happy D. take 11. The banana ...tastes good with peanut butter on it. A. tastes C. tastes good B. on D. banana III. Instructions: You are given a choice of two verbs in the following...previously o.- before. (The M16 had ALREADY been fired.) 162 peel Grammar Activity Sheet 36A Good Usage of English Name 6. ALL RIGHT - "ALRIGHT" ALL RIGHT

  19. Inferring Stop-Locations from WiFi.

    PubMed

    Wind, David Kofoed; Sapiezynski, Piotr; Furman, Magdalena Anna; Lehmann, Sune

    2016-01-01

    Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring 'stop locations' (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select the most important routers and one which uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants' GPS data as well as ground truth data collected during a two month period.

  20. Inferring Stop-Locations from WiFi

    PubMed Central

    Wind, David Kofoed; Sapiezynski, Piotr; Furman, Magdalena Anna; Lehmann, Sune

    2016-01-01

    Human mobility patterns are inherently complex. In terms of understanding these patterns, the process of converting raw data into series of stop-locations and transitions is an important first step which greatly reduces the volume of data, thus simplifying the subsequent analyses. Previous research into the mobility of individuals has focused on inferring ‘stop locations’ (places of stationarity) from GPS or CDR data, or on detection of state (static/active). In this paper we bridge the gap between the two approaches: we introduce methods for detecting both mobility state and stop-locations. In addition, our methods are based exclusively on WiFi data. We study two months of WiFi data collected every two minutes by a smartphone, and infer stop-locations in the form of labelled time-intervals. For this purpose, we investigate two algorithms, both of which scale to large datasets: a greedy approach to select the most important routers and one which uses a density-based clustering algorithm to detect router fingerprints. We validate our results using participants’ GPS data as well as ground truth data collected during a two month period. PMID:26901663

  1. Linear time algorithms to construct populations fitting multiple constraint distributions at genomic scales.

    PubMed

    Siragusa, Enrico; Haiminen, Niina; Utro, Filippo; Parida, Laxmi

    2017-10-09

    Computer simulations can be used to study population genetic methods, models and parameters, as well as to predict potential outcomes. For example, in plant populations, predicting the outcome of breeding operations can be studied using simulations. In-silico construction of populations with pre-specified characteristics is an important task in breeding optimization and other population genetic studies. We present two linear time Simulation using Best-fit Algorithms (SimBA) for two classes of problems where each co-fits two distributions: SimBA-LD fits linkage disequilibrium and minimum allele frequency distributions, while SimBA-hap fits founder-haplotype and polyploid allele dosage distributions. An incremental gap-filling version of previously introduced SimBA-LD is here demonstrated to accurately fit the target distributions, allowing efficient large scale simulations. SimBA-hap accuracy and efficiency are demonstrated by simulating tetraploid populations with varying numbers of founder haplotypes; we evaluate both a linear time greedy algorithm and an optimal solution based on mixed-integer programming. SimBA is available on http://researcher.watson.ibm.com/project/5669.

  2. Contextual Multi-armed Bandits under Feature Uncertainty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yun, Seyoung; Nam, Jun Hyun; Mo, Sangwoo

    We study contextual multi-armed bandit problems under linear realizability on rewards and uncertainty (or noise) on features. For the case of identical noise on features across actions, we propose an algorithm, coined NLinRel, having an O(T^{7/8}(log(dT) + K√d)) regret bound for T rounds, K actions, and d-dimensional feature vectors. Next, for the case of non-identical noise, we observe that popular linear hypotheses, including NLinRel, cannot achieve such sub-linear regret. Instead, under the assumption of Gaussian feature vectors, we prove that a greedy algorithm has an O(T^{2/3}√(log d)) regret bound with respect to the optimal linear hypothesis. Utilizing our theoretical understanding of the Gaussian case, we also design a practical variant of NLinRel, coined Universal-NLinRel, for arbitrary feature distributions. It first runs NLinRel to find the ‘true’ coefficient vector using feature uncertainties and then adjusts it to minimize its regret using the statistical feature information. We justify the performance of Universal-NLinRel on both synthetic and real-world datasets.

  3. A Distributed and Energy-Efficient Algorithm for Event K-Coverage in Underwater Sensor Networks

    PubMed Central

    Jiang, Peng; Xu, Yiming; Liu, Jun

    2017-01-01

    For event dynamic K-coverage algorithms, each management node selects its assistant node by using a greedy algorithm without considering the residual energy and situations in which a node is selected by several events. This approach affects network energy consumption and balance. Therefore, this study proposes a distributed and energy-efficient event K-coverage algorithm (DEEKA). After the network achieves 1-coverage, the nodes that detect the same event compete for the event management node with the number of candidate nodes and the average residual energy, as well as the distance to the event. Second, each management node estimates the probability of its neighbor nodes’ being selected by the event it manages with the distance level, the residual energy level, and the number of dynamic coverage event of these nodes. Third, each management node establishes an optimization model that uses expectation energy consumption and the residual energy variance of its neighbor nodes and detects the performance of the events it manages as targets. Finally, each management node uses a constrained non-dominated sorting genetic algorithm (NSGA-II) to obtain the Pareto set of the model and the best strategy via technique for order preference by similarity to an ideal solution (TOPSIS). The algorithm first considers the effect of harsh underwater environments on information collection and transmission. It also considers the residual energy of a node and a situation in which the node is selected by several other events. Simulation results show that, unlike the on-demand variable sensing K-coverage algorithm, DEEKA balances and reduces network energy consumption, thereby prolonging the network’s best service quality and lifetime. PMID:28106837

  4. A Distributed and Energy-Efficient Algorithm for Event K-Coverage in Underwater Sensor Networks.

    PubMed

    Jiang, Peng; Xu, Yiming; Liu, Jun

    2017-01-19

    For event dynamic K-coverage algorithms, each management node selects its assistant node by using a greedy algorithm without considering the residual energy and situations in which a node is selected by several events. This approach affects network energy consumption and balance. Therefore, this study proposes a distributed and energy-efficient event K-coverage algorithm (DEEKA). After the network achieves 1-coverage, the nodes that detect the same event compete for the event management node with the number of candidate nodes and the average residual energy, as well as the distance to the event. Second, each management node estimates the probability of its neighbor nodes' being selected by the event it manages with the distance level, the residual energy level, and the number of dynamic coverage event of these nodes. Third, each management node establishes an optimization model that uses expectation energy consumption and the residual energy variance of its neighbor nodes and detects the performance of the events it manages as targets. Finally, each management node uses a constrained non-dominated sorting genetic algorithm (NSGA-II) to obtain the Pareto set of the model and the best strategy via technique for order preference by similarity to an ideal solution (TOPSIS). The algorithm first considers the effect of harsh underwater environments on information collection and transmission. It also considers the residual energy of a node and a situation in which the node is selected by several other events. Simulation results show that, unlike the on-demand variable sensing K-coverage algorithm, DEEKA balances and reduces network energy consumption, thereby prolonging the network's best service quality and lifetime.

  5. Region-Based Collision Avoidance Beaconless Geographic Routing Protocol in Wireless Sensor Networks.

    PubMed

    Lee, JeongCheol; Park, HoSung; Kang, SeokYoon; Kim, Ki-Il

    2015-06-05

    Due to the lack of dependency on beacon messages for location exchange, the beaconless geographic routing protocol has attracted considerable attention from the research community. However, existing beaconless geographic routing protocols are likely to generate duplicated data packets when multiple winners in the greedy area are selected. Furthermore, these protocols are designed for a uniform sensor field, so they cannot be directly applied to practical irregular sensor fields with partial voids. To prevent the failure of finding a forwarding node and to remove unnecessary duplication, in this paper, we propose a region-based collision avoidance beaconless geographic routing protocol to increase forwarding opportunities for randomly-deployed sensor networks. By assigning different contention priorities to the mutually-communicable nodes and the rest of the nodes in the greedy area, every neighbor node in the greedy area can be used for data forwarding without any packet duplication. Moreover, simulation results are given to demonstrate the increased packet delivery ratio and shortened end-to-end delay compared with well-known existing protocols.

  6. Region-Based Collision Avoidance Beaconless Geographic Routing Protocol in Wireless Sensor Networks

    PubMed Central

    Lee, JeongCheol; Park, HoSung; Kang, SeokYoon; Kim, Ki-Il

    2015-01-01

    Due to the lack of dependency on beacon messages for location exchange, the beaconless geographic routing protocol has attracted considerable attention from the research community. However, existing beaconless geographic routing protocols are likely to generate duplicated data packets when multiple winners in the greedy area are selected. Furthermore, these protocols are designed for a uniform sensor field, so they cannot be directly applied to practical irregular sensor fields with partial voids. To prevent the failure of finding a forwarding node and to remove unnecessary duplication, in this paper, we propose a region-based collision avoidance beaconless geographic routing protocol to increase forwarding opportunities for randomly-deployed sensor networks. By assigning different contention priorities to the mutually-communicable nodes and the rest of the nodes in the greedy area, every neighbor node in the greedy area can be used for data forwarding without any packet duplication. Moreover, simulation results are given to demonstrate the increased packet delivery ratio and shortened end-to-end delay compared with well-known existing protocols. PMID:26057037

  7. An Enhanced Method for Scheduling Observations of Large Sky Error Regions for Finding Optical Counterparts to Transients

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rana, Javed; Singhal, Akshat; Gadre, Bhooshan

    2017-04-01

    The discovery and subsequent study of optical counterparts to transient sources is crucial for their complete astrophysical understanding. Various gamma-ray burst (GRB) detectors, and more notably the ground-based gravitational wave detectors, typically have large uncertainties in the sky positions of detected sources. Searching these large sky regions spanning hundreds of square degrees is a formidable challenge for most ground-based optical telescopes, which can usually image less than tens of square degrees of the sky in a single night. We present algorithms for better scheduling of such follow-up observations in order to maximize the probability of imaging the optical counterpart, based on the all-sky probability distribution of the source position. We incorporate realistic observing constraints such as the diurnal cycle, telescope pointing limitations, available observing time, and the rising/setting of the target at the observatory’s location. We use simulations to demonstrate that our proposed algorithms outperform the default greedy observing schedule used by many observatories. Our algorithms are applicable for follow-up of other transient sources with large positional uncertainties, such as Fermi-detected GRBs, and can easily be adapted for scheduling radio or space-based X-ray follow-up.

  8. Tug-of-war model for the two-bandit problem: nonlocally-correlated parallel exploration via resource conservation.

    PubMed

    Kim, Song-Ju; Aono, Masashi; Hara, Masahiko

    2010-07-01

    We propose a model - the "tug-of-war (TOW) model" - to conduct unique parallel searches using many nonlocally-correlated search agents. The model is based on the property of a single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a "nonlocal correlation" among the branches, i.e., volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the case of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward sum with incompatible demands, by either exploiting the rewards obtained using the already collected information or exploring new information for acquiring higher payoffs involving risks. Our model can efficiently manage the "exploration-exploitation dilemma" and exhibits good performance. The average accuracy rate of our model is higher than those of well-known algorithms such as the modified ε-greedy algorithm and modified softmax algorithm, especially for solving relatively difficult problems. Moreover, our model flexibly adapts to changing environments, a property essential for living organisms surviving in uncertain environments.
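
    For orientation, the textbook ε-greedy rule that this abstract uses as a baseline can be sketched as below for Bernoulli arms. This is the plain version, not the "modified" variant compared in the paper and not the TOW model itself; the function name, parameters, and reward probabilities are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(reward_probs, rounds, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit: explore a random arm with probability
    epsilon, otherwise exploit the arm with the best estimated mean."""
    rng = random.Random(seed)
    k = len(reward_probs)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # running average reward per arm
    total = 0
    for _ in range(rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                   # explore
        else:
            arm = max(range(k), key=means.__getitem__)  # exploit
        reward = 1 if rng.random() < reward_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, means, total


# Hypothetical two-armed problem in which only the second arm ever pays.
counts, means, total = epsilon_greedy([0.0, 1.0], rounds=1000, epsilon=0.1)
```

    The single parameter `epsilon` is the whole exploration-exploitation trade-off here: with `epsilon=0` the rule never leaves the arm it starts on, which is the failure mode that exploration is meant to avoid.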

  9. Cost- and reliability-oriented aggregation point association in long-term evolution and passive optical network hybrid access infrastructure for smart grid neighborhood area network

    NASA Astrophysics Data System (ADS)

    Cheng, Xiao; Feng, Lei; Zhou, Fanqin; Wei, Lei; Yu, Peng; Li, Wenjing

    2018-02-01

    With the rapid development of the smart grid, the data aggregation point (AP) in the neighborhood area network (NAN) is becoming increasingly important for forwarding the information between the home area network and wide area network. Due to limited budgets, a single access technology cannot meet the ongoing requirements on AP coverage. This paper first introduces the wired and wireless hybrid access network with the integration of long-term evolution (LTE) and passive optical network (PON) system for NAN, which allows a good trade-off among cost, flexibility, and reliability. Then, based on the already existing wireless LTE network, an AP association optimization model is proposed to make the PON serve as many APs as possible, considering both the economic efficiency and network reliability. Moreover, given the features of the constraints and variables of this NP-hard problem, a hybrid intelligent optimization algorithm is proposed, which combines genetic, ant colony, and dynamic greedy algorithms. By comparing with other published methods, simulation results verify the performance of the proposed method in improving the AP coverage and the performance of the proposed algorithm in terms of convergence.

  10. Highly scalable and robust rule learner: performance evaluation and comparison.

    PubMed

    Kurgan, Lukasz A; Cios, Krzysztof J; Dick, Scott

    2006-02-01

    Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug companies. In all cases, there needs to be an underlying data mining system, and this mining system must be highly scalable. To this end, we describe a new rule learner called DataSqueezer. The learner belongs to the family of inductive supervised rule extraction algorithms. DataSqueezer is a simple, greedy, rule builder that generates a set of production rules from labeled input data. In spite of its relative simplicity, DataSqueezer is a very effective learner. The rules generated by the algorithm are compact, comprehensible, and have accuracy comparable to rules generated by other state-of-the-art rule extraction algorithms. The main advantages of DataSqueezer are very high efficiency, and missing data resistance. DataSqueezer exhibits log-linear asymptotic complexity with the number of training examples, and it is faster than other state-of-the-art rule learners. The learner is also robust to large quantities of missing data, as verified by extensive experimental comparison with the other learners. DataSqueezer is thus well suited to modern data mining and business intelligence tasks, which commonly involve huge datasets with a large fraction of missing data.

  11. BiCluE - Exact and heuristic algorithms for weighted bi-cluster editing of biomedical data

    PubMed Central

    2013-01-01

    Background: The explosion of biological data has dramatically reformed today's biology research. The biggest challenge to biologists and bioinformaticians is the integration and analysis of large quantities of data to provide meaningful insights. One major problem is the combined analysis of data from different types. Bi-cluster editing, as a special case of clustering, which partitions two different types of data simultaneously, might be used for several biomedical scenarios. However, the underlying algorithmic problem is NP-hard. Results: Here we contribute with BiCluE, a software package designed to solve the weighted bi-cluster editing problem. It implements (1) an exact algorithm based on fixed-parameter tractability and (2) a polynomial-time greedy heuristic based on solving the hardest part, edge deletions, first. We evaluated its performance on artificial graphs. Afterwards we exemplarily applied our implementation on real world biomedical data, GWAS data in this case. BiCluE generally works on any kind of data types that can be modeled as (weighted or unweighted) bipartite graphs. Conclusions: To our knowledge, this is the first software package solving the weighted bi-cluster editing problem. BiCluE as well as the supplementary results are available online at http://biclue.mpi-inf.mpg.de. PMID:24565035

  12. The maintenance of cooperation in multiplex networks with limited and partible resources of agents

    NASA Astrophysics Data System (ADS)

    Li, Zhaofeng; Shen, Bi; Jiang, Yichuan

    2017-02-01

    In this paper, we try to explain the maintenance of cooperation in multiplex networks with limited and partible resources of agents: defection brings larger short-term benefit and cooperative agents may become defective because of the unaffordable costs of cooperative behaviors that are performed in multiple layers simultaneously. Recent studies have identified the positive effects of multiple layers on evolutionary cooperation but generally overlook the maximum costs of agents in these synchronous games. By utilizing network effects and designing evolutionary mechanisms, cooperative behaviors become prevailing in public goods games, and agents can allocate personal resources across multiple layers. First, we generalize degree diversity into multiplex networks to improve the prospect for cooperation. Second, to prevent agents allocating all the resources into one layer, a greedy-first mechanism is proposed, in which agents prefer to add additional investments in the higher-payoff layer. It is found that greedy-first agents can perform cooperative behaviors in multiplex networks when one layer is scale-free network and degree differences between conjoint nodes increase. Our work may help to explain the emergence of cooperation in the absence of individual reputation and punishment mechanisms.

  13. Greed and the frightening rumble of psychic hunger.

    PubMed

    Waska, Robert

    2004-09-01

    Many patients are desperately struggling with feelings of envy and greed. For some, greed is experienced as a constant hunger, a feeling of being empty and alone. This type of patient can be aggressive or resentful in the way they feel and act. They are determined to take what they feel is rightly theirs. Other such patients are much more conflicted about their greedy phantasies and striving. This paper focuses on patients who are fearful and anxious about the greedy urges that shape their inner world. Case material is used for illustration.

  14. Efficient selection of tagging single-nucleotide polymorphisms in multiple populations.

    PubMed

    Howie, Bryan N; Carlson, Christopher S; Rieder, Mark J; Nickerson, Deborah A

    2006-08-01

    Common genetic polymorphism may explain a portion of the heritable risk for common diseases, so considerable effort has been devoted to finding and typing common single-nucleotide polymorphisms (SNPs) in the human genome. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), suggesting that only a subset of all SNPs (known as tagging SNPs, or tagSNPs) need to be genotyped for disease association studies. Based on the genetic differences that exist among human populations, most tagSNP sets are defined in a single population and applied only in populations that are closely related. To improve the efficiency of multi-population analyses, we have developed an algorithm called MultiPop-TagSelect that finds a near-minimal union of population-specific tagSNP sets across an arbitrary number of populations. We present this approach as an extension of LD-select, a tagSNP selection method that uses a greedy algorithm to group SNPs into bins based on their pairwise association patterns, although the MultiPop-TagSelect algorithm could be used with any SNP tagging approach that allows choices between nearly equivalent SNPs. We evaluate the algorithm by considering tagSNP selection in candidate-gene resequencing data and lower density whole-chromosome data. Our analysis reveals that an exhaustive search is often intractable, while the developed algorithm can quickly and reliably find near-optimal solutions even for difficult tagSNP selection problems. Using populations of African, Asian, and European ancestry, we also show that an optimal multi-population set of tagSNPs can be substantially smaller (up to 44%) than a typical set obtained through independent or sequential selection.
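
    The greedy binning step attributed to LD-select in this abstract can be sketched roughly as follows: repeatedly pick the unbinned SNP that exceeds an r² threshold with the most other unbinned SNPs, and group it with those partners. This is a simplified, single-population sketch under stated assumptions, not the published LD-select or MultiPop-TagSelect code; the function name, the 0.8 threshold, and the toy `R2` matrix are hypothetical.

```python
def greedy_ld_bins(r2, threshold=0.8):
    """Group SNP indices into bins from a symmetric pairwise r^2 matrix.

    Returns a list of (tag_snp, members) pairs, where tag_snp is the SNP
    that seeded the bin and members is the sorted list of SNPs in it.
    """
    unbinned = set(range(len(r2)))
    bins = []

    def partners(i):
        # Unbinned SNPs (other than i) in strong LD with SNP i.
        return {j for j in unbinned if j != i and r2[i][j] >= threshold}

    while unbinned:
        # Greedy choice: the SNP tagging the most remaining SNPs
        # (ties go to the lowest index because of the sort).
        seed = max(sorted(unbinned), key=lambda i: len(partners(i)))
        members = partners(seed) | {seed}
        bins.append((seed, sorted(members)))
        unbinned -= members
    return bins


# Toy r^2 matrix: SNP 1 is in strong LD with SNPs 0 and 2; SNPs 3 and 4 pair up.
R2 = [
    [1.0, 0.9, 0.5, 0.1, 0.1],
    [0.9, 1.0, 0.9, 0.1, 0.1],
    [0.5, 0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 0.1, 1.0, 0.85],
    [0.1, 0.1, 0.1, 0.85, 1.0],
]
bins = greedy_ld_bins(R2)
```

    In the toy matrix, SNP 1 tags a three-SNP bin even though SNPs 0 and 2 are only weakly correlated with each other, which shows why the choice of seed matters.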

  15. Optimal placement of multiple types of communicating sensors with availability and coverage redundancy constraints

    NASA Astrophysics Data System (ADS)

    Vecherin, Sergey N.; Wilson, D. Keith; Pettit, Chris L.

    2010-04-01

    Determination of an optimal configuration (numbers, types, and locations) of a sensor network is an important practical problem. In most applications, complex signal propagation effects and inhomogeneous coverage preferences lead to an optimal solution that is highly irregular and nonintuitive. The general optimization problem can be strictly formulated as a binary linear programming problem. Due to the combinatorial nature of this problem, however, its strict solution requires significant computational resources (NP-complete class of complexity) and is unobtainable for large spatial grids of candidate sensor locations. For this reason, a greedy algorithm for approximate solution was recently introduced [S. N. Vecherin, D. K. Wilson, and C. L. Pettit, "Optimal sensor placement with terrain-based constraints and signal propagation effects," Unattended Ground, Sea, and Air Sensor Technologies and Applications XI, SPIE Proc. Vol. 7333, paper 73330S (2009)]. Here further extensions to the developed algorithm are presented to include such practical needs and constraints as sensor availability, coverage by multiple sensors, and wireless communication of the sensor information. Both communication and detection are considered in a probabilistic framework. Communication signal and signature propagation effects are taken into account when calculating probabilities of communication and detection. Comparison of approximate and strict solutions on reduced-size problems suggests that the approximate algorithm yields quick and good solutions, which thus justifies using that algorithm for full-size problems. Examples of three-dimensional outdoor sensor placement are provided using a terrain-based software analysis tool.

  16. Clustering evolving proteins into homologous families.

    PubMed

    Chan, Cheong Xin; Mahbob, Maisarah; Ragan, Mark A

    2013-04-08

    Clustering sequences into groups of putative homologs (families) is a critical first step in many areas of comparative biology and bioinformatics. The performance of clustering approaches in delineating biologically meaningful families depends strongly on characteristics of the data, including content bias and degree of divergence. New, highly scalable methods have recently been introduced to cluster the very large datasets being generated by next-generation sequencing technologies. However, there has been little systematic investigation of how characteristics of the data impact the performance of these approaches. Using clusters from a manually curated dataset as reference, we examined the performance of a widely used graph-based Markov clustering algorithm (MCL) and a greedy heuristic approach (UCLUST) in delineating protein families coded by three sets of bacterial genomes of different G+C content. Both MCL and UCLUST generated clusters that are comparable to the reference sets at specific parameter settings, although UCLUST tends to under-cluster compositionally biased sequences (G+C content 33% and 66%). Using simulated data, we sought to assess the individual effects of sequence divergence, rate heterogeneity, and underlying G+C content. Performance decreased with increasing sequence divergence, decreasing among-site rate variation, and increasing G+C bias. Two MCL-based methods recovered the simulated families more accurately than did UCLUST. MCL using local alignment distances is more robust across the investigated range of sequence features than are greedy heuristics using distances based on global alignment. Our results demonstrate that sequence divergence, rate heterogeneity and content bias can individually and in combination affect the accuracy with which MCL and UCLUST can recover homologous protein families. For application to data that are more divergent, and exhibit higher among-site rate variation and/or content bias, MCL may often be the better choice, especially if computational resources are not limiting.

  17. Greedy Sampling and Incremental Surrogate Model-Based Tailoring of Aeroservoelastic Model Database for Flexible Aircraft

    NASA Technical Reports Server (NTRS)

    Wang, Yi; Pant, Kapil; Brenner, Martin J.; Ouellette, Jeffrey A.

    2018-01-01

    This paper presents a data analysis and modeling framework to tailor and develop a linear parameter-varying (LPV) aeroservoelastic (ASE) model database for flexible aircraft in a broad 2D flight parameter space. The Kriging surrogate model is constructed using ASE models at a fraction of grid points within the original model database, and then the ASE model at any flight condition can be obtained simply through surrogate model interpolation. The greedy sampling algorithm is developed to select the next sample point that carries the worst relative error between the surrogate model prediction and the benchmark model in the frequency domain among all input-output channels. The process is iterated to incrementally improve surrogate model accuracy until a pre-determined tolerance or iteration budget is met. The methodology is applied to the ASE model database of a flexible aircraft currently being tested at NASA/AFRC for flutter suppression and gust load alleviation. Our studies indicate that the proposed method can reduce the number of models in the original database by 67%. Even so, the ASE models obtained through Kriging interpolation match the model in the original database constructed directly from the physics-based tool with the worst relative error far below 1%. The interpolated ASE model exhibits continuously varying gains along a set of prescribed flight conditions. More importantly, the selected grid points are distributed non-uniformly in the parameter space, a) capturing the distinctly different dynamic behavior and its dependence on flight parameters, and b) reiterating the need and utility for adaptive space sampling techniques for ASE model database compaction. The present framework is directly extendible to high-dimensional flight parameter space, and can be used to guide the ASE model development, model order reduction, robust control synthesis and novel vehicle design of flexible aircraft.
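
    The greedy sampling loop described above (add the point with the worst relative error, then repeat until a tolerance or budget is met) can be sketched in one dimension. As a loudly labeled simplification, piecewise-linear interpolation via `numpy.interp` stands in for the Kriging surrogate, and a scalar function stands in for the ASE models; the name `greedy_sample` and all parameters are hypothetical.

```python
import numpy as np


def greedy_sample(grid, values, tol=0.01, budget=50):
    """Pick grid indices greedily until the surrogate meets `tol` (sketch).

    grid   : increasing 1-D array of parameter values (assumed)
    values : benchmark model output at every grid point (assumed scalar)
    """
    chosen = [0, len(grid) - 1]  # start from the two end points
    while len(chosen) < budget:
        idx = sorted(chosen)
        # Stand-in surrogate: linear interpolation through the chosen samples.
        pred = np.interp(grid, grid[idx], values[idx])
        rel_err = np.abs(pred - values) / np.maximum(np.abs(values), 1e-12)
        worst = int(np.argmax(rel_err))
        if rel_err[worst] <= tol:
            break  # tolerance met everywhere on the grid
        chosen.append(worst)  # greedy step: sample where the error is worst
    return sorted(chosen)


grid = np.linspace(0.0, 1.0, 101)
values = 1.0 + grid**2
picked = greedy_sample(grid, values)
print(len(picked), "of", len(grid), "grid points retained")
```

    Because already-sampled points have zero interpolation error, the loop never selects the same point twice; the retained points cluster where the benchmark varies most.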

  18. Efficient Optimization of Stimuli for Model-Based Design of Experiments to Resolve Dynamical Uncertainty.

    PubMed

    Mdluli, Thembi; Buzzard, Gregery T; Rundell, Ann E

    2015-09-01

    This model-based design of experiments (MBDOE) method determines the input magnitudes of the experimental stimuli to apply and the associated measurements that should be taken to optimally constrain the uncertain dynamics of a biological system under study. The ideal global solution for this experiment design problem is generally computationally intractable because of parametric uncertainties in the mathematical model of the biological system. Others have addressed this issue by limiting the solution to a local estimate of the model parameters. Here we present an approach that is independent of the local parameter constraint. This approach is made computationally efficient and tractable by the use of: (1) sparse grid interpolation that approximates the biological system dynamics, (2) representative parameters that uniformly represent the data-consistent dynamical space, and (3) probability weights of the represented experimentally distinguishable dynamics. Our approach identifies data-consistent representative parameters using sparse grid interpolants, constructs the optimal input sequence from a greedy search, and defines the associated optimal measurements using a scenario tree. We explore the optimality of this MBDOE algorithm using a 3-dimensional Hes1 model and a 19-dimensional T-cell receptor model. The 19-dimensional T-cell model also demonstrates the MBDOE algorithm's scalability to higher dimensions. In both cases, the dynamical uncertainty region that bounds the trajectories of the target system states was reduced by as much as 86% and 99%, respectively, after completing the designed experiments in silico. Our results suggest that for resolving dynamical uncertainty, the ability to design an input sequence paired with its associated measurements is particularly important when limited by the number of measurements.

  19. Design of ACM system based on non-greedy punctured LDPC codes

    NASA Astrophysics Data System (ADS)

    Lu, Zijun; Jiang, Zihong; Zhou, Lin; He, Yucheng

    2017-08-01

    In this paper, an adaptive coded modulation (ACM) scheme based on rate-compatible LDPC (RC-LDPC) codes was designed. The RC-LDPC codes were constructed by a non-greedy puncturing method which showed good performance in the high code rate region. Moreover, an incremental redundancy scheme for the LDPC-based ACM system over the AWGN channel was proposed. With this scheme, code rates vary from 2/3 to 5/6 and the complexity of the ACM system is lowered. Simulations show that the proposed ACM system obtains an increasingly pronounced coding gain together with higher throughput.

  20. Learning Multirobot Hose Transportation and Deployment by Distributed Round-Robin Q-Learning.

    PubMed

    Fernandez-Gauna, Borja; Etxeberria-Agiriano, Ismael; Graña, Manuel

    2015-01-01

    Multi-Agent Reinforcement Learning (MARL) algorithms face two main difficulties: the curse of dimensionality, and environment non-stationarity due to the independent learning processes carried out by the agents concurrently. In this paper we formalize and prove the convergence of a Distributed Round Robin Q-learning (D-RR-QL) algorithm for cooperative systems. The computational complexity of this algorithm increases linearly with the number of agents. Moreover, it eliminates environment non-stationarity by carrying out a round-robin scheduling of the action selection and execution. This learning scheme allows the implementation of Modular State-Action Vetoes (MSAV) in cooperative multi-agent systems, which speeds up learning convergence in over-constrained systems by vetoing state-action pairs which lead to undesired termination states (UTS) in the relevant state-action subspace. Each agent's local state-action value function learning is an independent process, including the MSAV policies. Coordination of locally optimal policies to obtain the global optimal joint policy is achieved by a greedy selection procedure using message passing. We show that D-RR-QL improves over state-of-the-art approaches, such as Distributed Q-Learning, Team Q-Learning and Coordinated Reinforcement Learning in a paradigmatic Linked Multi-Component Robotic System (L-MCRS) control problem: the hose transportation task. L-MCRS are over-constrained systems with many UTS induced by the interaction of the passive linking element and the active mobile robots.

  1. A novel method of language modeling for automatic captioning in TC video teleconferencing.

    PubMed

    Zhang, Xiaojia; Zhao, Yunxin; Schopp, Laura

    2007-05-01

    We are developing an automatic captioning system for teleconsultation video teleconferencing (TC-VTC) in telemedicine, based on large vocabulary conversational speech recognition. In TC-VTC, doctors' speech contains a large number of infrequently used medical terms in spontaneous styles. Due to insufficiency of data, we adopted mixture language modeling, with models trained from several datasets of medical and nonmedical domains. This paper proposes novel modeling and estimation methods for the mixture language model (LM). Component LMs are trained from individual datasets, with class n-gram LMs trained from in-domain datasets and word n-gram LMs trained from out-of-domain datasets, and they are interpolated into a mixture LM. For class LMs, semantic categories are used for class definition on medical terms, names, and digits. The interpolation weights of a mixture LM are estimated by a greedy algorithm of forward weight adjustment (FWA). The proposed mixing of in-domain class LMs and out-of-domain word LMs, the semantic definitions of word classes, as well as the weight-estimation algorithm of FWA are effective on the TC-VTC task. As compared with using mixtures of word LMs with weights estimated by the conventional expectation-maximization algorithm, the proposed methods led to a 21% reduction of perplexity on test sets of five doctors, which translated into improvements of captioning accuracy.

  2. Energy-Efficient Scheduling for Hybrid Tasks in Control Devices for the Internet of Things

    PubMed Central

    Gao, Zhigang; Wu, Yifan; Dai, Guojun; Xia, Haixia

    2012-01-01

    In control devices for the Internet of Things (IoT), energy is one of the critical restriction factors. Dynamic voltage scaling (DVS) has been proved to be an effective method for reducing the energy consumption of processors. This paper proposes an energy-efficient scheduling algorithm for IoT control devices with hard real-time control tasks (HRCTs) and soft real-time tasks (SRTs). The main contribution of this paper includes two parts. First, it builds the Hybrid tasks with multi-subtasks of different function Weight (HoW) task model for IoT control devices. HoW describes the structure of HRCTs and SRTs, and their properties, e.g., deadlines, execution time, preemption properties, and energy-saving goals. Second, it presents the Hybrid Tasks' Dynamic Voltage Scaling (HTDVS) algorithm. HTDVS first sets the slowdown factors of subtasks while meeting the different real-time requirements of HRCTs and SRTs, and then dynamically reclaims, reserves, and reuses the slack time of the subtasks to meet their ideal energy-saving goals. Experimental results show that HTDVS can reduce energy consumption by about 10%–80% while meeting the real-time requirements of HRCTs, that HRCTs help to reduce the deadline miss ratio (DMR) of systems, and that HTDVS has comparable performance with the greedy algorithm and is more favorable for keeping the subtasks' ideal speeds. PMID:23112659

  3. Predicting protein-protein interactions from protein domains using a set cover approach.

    PubMed

    Huang, Chengbang; Morcos, Faruck; Kanaan, Simon P; Wuchty, Stefan; Chen, Danny Z; Izaguirre, Jesús A

    2007-01-01

    One goal of contemporary proteome research is the elucidation of cellular protein interactions. Based on currently available protein-protein interaction and domain data, we introduce a novel method, Maximum Specificity Set Cover (MSSC), for the prediction of protein-protein interactions. In our approach, we map the relationship between interactions of proteins and their corresponding domain architectures to a generalized weighted set cover problem. The application of a greedy algorithm provides sets of domain interactions which explain the presence of protein interactions to the largest degree of specificity. Utilizing domain and protein interaction data of S. cerevisiae, MSSC enables prediction of previously unknown protein interactions, links that are well supported by a high tendency of coexpression and functional homogeneity of the corresponding proteins. Focusing on concrete examples, we show that MSSC reliably predicts protein interactions in well-studied molecular systems, such as the 26S proteasome and RNA polymerase II of S. cerevisiae. We also show that the quality of the predictions is comparable to the Maximum Likelihood Estimation while MSSC is faster. This new algorithm and all data sets used are accessible through a Web portal at http://ppi.cse.nd.edu.
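
    For orientation, the textbook greedy heuristic for weighted set cover (repeatedly pick the subset with the lowest weight per newly covered element) can be sketched as below. This is the generic heuristic, not the MSSC method itself, whose generalized formulation and specificity weights are not given in the abstract; the function name and toy instance are hypothetical.

```python
def greedy_weighted_set_cover(universe, subsets, weights):
    """Textbook greedy heuristic for weighted set cover (illustrative only).

    universe : iterable of elements to cover
    subsets  : dict mapping a subset name to a set of elements
    weights  : dict mapping a subset name to a positive cost
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for name in sorted(subsets):  # sorted for a deterministic tie-break
            gain = len(subsets[name] & uncovered)
            if gain == 0:
                continue
            ratio = weights[name] / gain  # cost per newly covered element
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("universe cannot be covered by the given subsets")
        chosen.append(best)
        uncovered -= subsets[best]
    return chosen


# Toy instance (made up): elements could stand for observed interactions,
# subsets for candidate explanations of those interactions.
toy_subsets = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4, 5}, "D": {5}}
toy_weights = {"A": 3, "B": 1, "C": 3, "D": 1}
print(greedy_weighted_set_cover({1, 2, 3, 4, 5}, toy_subsets, toy_weights))
```

    The greedy choice gives no optimality guarantee on a single instance, but it carries the classical logarithmic approximation bound for set cover.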

  4. Estimating haplotype frequencies by combining data from large DNA pools with database information.

    PubMed

    Gasbarra, Dario; Kulathinal, Sangita; Pirinen, Matti; Sillanpää, Mikko J

    2011-01-01

    We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors.

  5. A trust-based sensor allocation algorithm in cooperative space search problems

    NASA Astrophysics Data System (ADS)

    Shen, Dan; Chen, Genshe; Pham, Khanh; Blasch, Erik

    2011-06-01

    Sensor allocation is an important and challenging problem within the field of multi-agent systems. The sensor allocation problem involves deciding how to assign a number of targets or cells to a set of agents according to some allocation protocol. Generally, in order to make efficient allocations, we need to design mechanisms that consider both the task performers' costs for the service and the associated probability of success (POS). In our problem, the costs are the used sensor resource, and the POS is the target tracking performance. Usually, POS may be perceived differently by different agents because they typically have different standards or means of evaluating the performance of their counterparts (other sensors in the search and tracking problem). Given this, we turn to the notion of trust to capture such subjective perceptions. In our approach, we develop a trust model to construct a novel mechanism that motivates sensor agents to limit their greediness or selfishness. Then we model the sensor allocation optimization problem with trust-in-loop negotiation game and solve it using a sub-game perfect equilibrium. Numerical simulations are performed to demonstrate the trust-based sensor allocation algorithm in cooperative space situation awareness (SSA) search problems.

  6. An improved exploratory search technique for pure integer linear programming problems

    NASA Technical Reports Server (NTRS)

    Fogle, F. R.

    1990-01-01

    The development is documented of a heuristic method for the solution of pure integer linear programming problems. The procedure draws its methodology from the ideas of Hooke and Jeeves type 1 and 2 exploratory searches, greedy procedures, and neighborhood searches. It uses an efficient rounding method to obtain its first feasible integer point from the optimal continuous solution obtained via the simplex method. Since this method is based entirely on simple addition or subtraction of one to each variable of a point in n-space and the subsequent comparison of candidate solutions to a given set of constraints, it facilitates significant complexity improvements over existing techniques. It also obtains the same optimal solution found by the branch-and-bound technique in 44 of 45 small to moderate size test problems. Two example problems are worked in detail to show the inner workings of the method. Furthermore, using an established weighted scheme for comparing computational effort involved in an algorithm, a comparison of this algorithm is made to the more established and rigorous branch-and-bound method. A computer implementation of the procedure, in PC compatible Pascal, is also presented and discussed.

  7. A Genetic Algorithm Approach for the TV Self-Promotion Assignment Problem

    NASA Astrophysics Data System (ADS)

    Pereira, Paulo A.; Fontes, Fernando A. C. C.; Fontes, Dalila B. M. M.

    2009-09-01

    We report on the development of a Genetic Algorithm (GA), which has been integrated into a Decision Support System to plan the best assignment of the weekly self-promotion space for a TV station. The problem addressed consists of deciding which shows to advertise and when, such that the number of viewers in an intended group or target is maximized. The proposed GA incorporates a greedy heuristic to find good initial solutions. These solutions, as well as the solutions later obtained through the use of the GA, then go through a repair procedure. This is used with two objectives, which are addressed in turn. Firstly, it checks the solution feasibility and, if the solution is infeasible, fixes it by removing some shows. Secondly, it tries to improve the solution by adding some extra shows. Since the problem faced by the commercial TV station is too big and has too many features, it cannot be solved exactly. Therefore, in order to test the quality of the solutions provided by the proposed GA, we have randomly generated some smaller problem instances. For these problems we have obtained solutions on average within 1% of the optimal solution value.

  8. Greedy algorithms and Zipf laws

    NASA Astrophysics Data System (ADS)

    Moran, José; Bouchaud, Jean-Philippe

    2018-04-01

    We consider a simple model of firm/city/etc growth based on a multi-item criterion: whenever entity B fares better than entity A on a subset of M items out of K, the agent originally in A moves to B. We solve the model analytically in the cases K = 1 and K → ∞. The resulting stationary distribution of sizes is generically a Zipf-law provided M > K/2. When M ≤ K/2, no selection occurs and the size distribution remains thin-tailed. In the special case M = K, one needs to regularize the problem by introducing a small ‘default’ probability ϕ. We find that the stationary distribution has a power-law tail that becomes a Zipf-law when ϕ → 0. The approach to the stationary state can also be characterized, with strong similarities with a simple ‘aging’ model considered by Barrat and Mézard.

  9. An Efficient Offloading Scheme For MEC System Considering Delay and Energy Consumption

    NASA Astrophysics Data System (ADS)

    Sun, Yanhua; Hao, Zhe; Zhang, Yanhua

    2018-01-01

    With the increasing number of mobile devices, mobile edge computing (MEC), which provides cloud computing capabilities proximate to mobile devices in 5G networks, has been envisioned as a promising paradigm to enhance user experience. In this paper, we investigate a joint consideration of delay and energy consumption offloading scheme (JCDE) for MEC systems in 5G heterogeneous networks. An optimization problem is formulated to minimize the delay as well as the energy consumption of the offloading system, in which the delay and energy consumption of transmitting and calculating tasks are taken into account. We adopt an iterative greedy algorithm to solve the optimization problem. Furthermore, simulations were carried out to validate the utility and effectiveness of our proposed scheme. The effect of parameter variations on the system is analysed as well. Numerical results demonstrate that our proposed scheme improves delay and energy efficiency compared with a scheme from another paper.

  10. A Comparison of Techniques for Scheduling Fleets of Earth-Observing Satellites

    NASA Technical Reports Server (NTRS)

    Globus, Al; Crawford, James; Lohn, Jason; Pryor, Anna

    2003-01-01

    Earth observing satellite (EOS) scheduling is a complex real-world domain representative of a broad class of over-subscription scheduling problems. Over-subscription problems are those where requests for a facility exceed its capacity. These problems arise in a wide variety of NASA and terrestrial domains and are an important class of scheduling problems because such facilities often represent large capital investments. We have run experiments comparing multiple variants of the genetic algorithm, hill climbing, simulated annealing, squeaky wheel optimization and iterated sampling on two variants of a realistically-sized model of the EOS scheduling problem. These are implemented as permutation-based methods: methods that search in the space of priority orderings of observation requests and evaluate each permutation by using it to drive a greedy scheduler. Simulated annealing performs best and random mutation operators outperform our squeaky (more intelligent) operator. Furthermore, taking smaller steps towards the end of the search improves performance.
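
    The idea of evaluating a permutation by letting it drive a greedy scheduler can be sketched with a deliberately tiny model. Everything here is an assumption for illustration: a single resource, integer time steps, requests given as (earliest, latest, duration) windows, and the hypothetical name `greedy_schedule`. The model in the paper is far richer.

```python
def greedy_schedule(order, requests, horizon):
    """Place requests in priority order at their earliest feasible start.

    order    : list of request names, highest priority first (the permutation)
    requests : dict name -> (earliest, latest, duration); `latest` is the
               exclusive end of the allowed window, in integer time steps
    horizon  : number of time steps available on the single resource
    """
    busy = [False] * horizon
    plan = {}
    for name in order:
        earliest, latest, duration = requests[name]
        last_start = min(latest, horizon) - duration
        for start in range(earliest, last_start + 1):
            if not any(busy[start:start + duration]):
                for t in range(start, start + duration):
                    busy[t] = True
                plan[name] = start
                break  # greedy: commit to the first fit and never backtrack
    return plan


# Over-subscribed toy instance (made up): not every request can be served.
toy_requests = {"a": (0, 4, 3), "b": (0, 2, 2), "c": (2, 4, 2)}
for order in (["a", "b", "c"], ["b", "c", "a"]):
    plan = greedy_schedule(order, toy_requests, horizon=4)
    print(order, "->", plan, "scheduled:", len(plan))
```

    A search method such as hill climbing or simulated annealing would then mutate `order` and keep permutations whose greedy schedule scores better, here by the number of requests served.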

  11. Compression of Flow Can Reveal Overlapping-Module Organization in Networks

    NASA Astrophysics Data System (ADS)

    Viamontes Esquivel, Alcides; Rosvall, Martin

    2011-10-01

    To better understand the organization of overlapping modules in large networks with respect to flow, we introduce the map equation for overlapping modules. In this information-theoretic framework, we use the correspondence between compression and regularity detection. The generalized map equation measures how well we can compress a description of flow in the network when we partition it into modules with possible overlaps. When we minimize the generalized map equation over overlapping network partitions, we detect modules that capture flow and determine which nodes at the boundaries between modules should be classified in multiple modules and to what degree. With a novel greedy-search algorithm, we find that some networks, for example, the neural network of the nematode Caenorhabditis elegans, are best described by modules dominated by hard boundaries, but that others, for example, the sparse European-roads network, have an organization of highly overlapping modules.

  12. Iterative non-sequential protein structural alignment.

    PubMed

    Salem, Saeed; Zaki, Mohammed J; Bystroff, Christopher

    2009-06-01

    Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments was assessed by comparing against the manually curated reference alignments in the challenging SISY and RIPC datasets. Moreover, when applied to a dataset of 4410 protein pairs selected from the CATH database, SNAP produced longer alignments with lower rmsd than several state-of-the-art alignment methods. Classification of folds using SNAP alignments was both highly sensitive and highly selective. The SNAP software, along with the datasets, is available online at http://www.cs.rpi.edu/~zaki/software/SNAP.

  13. A bi-objective model for robust yard allocation scheduling for outbound containers

    NASA Astrophysics Data System (ADS)

    Liu, Changchun; Zhang, Canrong; Zheng, Li

    2017-01-01

    This article examines the yard allocation problem for outbound containers, with consideration of uncertainty factors, mainly including the arrival and operation time of calling vessels. Based on the time buffer inserting method, a bi-objective model is constructed to minimize the total operational cost and to maximize the robustness of fighting against the uncertainty. Due to the NP-hardness of the constructed model, a two-stage heuristic is developed to solve the problem. In the first stage, initial solutions are obtained by a greedy algorithm that looks n-steps ahead with the uncertainty factors set as their respective expected values; in the second stage, based on the solutions obtained in the first stage and with consideration of uncertainty factors, a neighbourhood search heuristic is employed to generate robust solutions that can fight better against the fluctuation of uncertainty factors. Finally, extensive numerical experiments are conducted to test the performance of the proposed method.

  14. Smiles2Monomers: a link between chemical and biological structures for polymers.

    PubMed

    Dufresne, Yoann; Noé, Laurent; Leclère, Valérie; Pupin, Maude

    2015-01-01

    The monomeric composition of polymers is powerful for structure comparison and synthetic biology, among others. Many databases give access to the atomic structure of compounds but the monomeric structure of polymers is often lacking. We have designed a smart algorithm, implemented in the tool Smiles2Monomers (s2m), to infer efficiently and accurately the monomeric structure of a polymer from its chemical structure. Our strategy is divided into two steps: first, monomers are mapped on the atomic structure by an efficient subgraph-isomorphism algorithm; second, the best tiling is computed so that non-overlapping monomers cover all the structure of the target polymer. The mapping is based on a Markovian index built by a dynamic programming algorithm. The index enables s2m to search quickly for all the given monomers on a target polymer. Afterwards, a greedy algorithm combines the mapped monomers into a consistent monomeric structure. Finally, a local branch and cut algorithm refines the structure. We tested this method on two manually annotated databases of polymers and reconstructed the structures de novo with a sensitivity over 90%. The average computation time per polymer is 2 s. s2m automatically creates de novo monomeric annotations for polymers, efficiently in terms of computation time and sensitivity. s2m allowed us to detect annotation errors in the tested databases and to easily find the accurate structures. So, s2m could be integrated into the curation process of databases of small compounds to verify the current entries and accelerate the annotation of new polymers. The full method can be downloaded or accessed via a website for peptide-like polymers at http://bioinfo.lifl.fr/norine/smiles2monomers.jsp.

  15. Automated Reconstruction of Neural Trees Using Front Re-initialization

    PubMed Central

    Mukherjee, Amit; Stepanyants, Armen

    2013-01-01

    This paper proposes a greedy algorithm for automated reconstruction of neural arbors from light microscopy stacks of images. The algorithm is based on the minimum cost path method. While the minimum cost path, obtained using the Fast Marching Method, results in a trace with the least cumulative cost between the start and the end points, it is not sufficient for the reconstruction of neural trees. This is because sections of the minimum cost path can erroneously travel through the image background with undetectable detriment to the cumulative cost. To circumvent this problem we propose an algorithm that grows a neural tree from a specified root by iteratively re-initializing the Fast Marching fronts. The speed image used in the Fast Marching Method is generated by computing the average outward flux of the gradient vector flow field. Each iteration of the algorithm produces a candidate extension by allowing the front to travel a specified distance and then tracking from the farthest point of the front back to the tree. Robust likelihood ratio test is used to evaluate the quality of the candidate extension by comparing voxel intensities along the extension to those in the foreground and the background. The qualified extensions are appended to the current tree, the front is re-initialized, and Fast Marching is continued until the stopping criterion is met. To evaluate the performance of the algorithm we reconstructed 6 stacks of two-photon microscopy images and compared the results to the ground truth reconstructions by using the DIADEM metric. The average comparison score was 0.82 out of 1.0, which is on par with the performance achieved by expert manual tracers. PMID:24386539

  16. Quantum annealing for combinatorial clustering

    NASA Astrophysics Data System (ADS)

    Kumar, Vaibhaw; Bass, Gideon; Tomlin, Casey; Dulny, Joseph

    2018-02-01

    Clustering is a powerful machine learning technique that groups "similar" data points based on their characteristics. Many clustering algorithms work by approximating the minimization of an objective function, namely the sum of within-the-cluster distances between points. The straightforward approach involves examining all the possible assignments of points to each of the clusters. This approach guarantees the solution will be a global minimum; however, the number of possible assignments scales quickly with the number of data points and becomes computationally intractable even for very small datasets. In order to circumvent this issue, cost function minima are found using popular local search-based heuristic approaches such as k-means and hierarchical clustering. Due to their greedy nature, such techniques do not guarantee that a global minimum will be found and can lead to sub-optimal clustering assignments. Other classes of global search-based techniques, such as simulated annealing, tabu search, and genetic algorithms, may offer better quality results but can be too time-consuming to implement. In this work, we describe how quantum annealing can be used to carry out clustering. We map the clustering objective to a quadratic binary optimization problem and discuss two clustering algorithms which are then implemented on commercially available quantum annealing hardware, as well as on a purely classical solver "qbsolv." The first algorithm assigns N data points to K clusters, and the second one can be used to perform binary clustering in a hierarchical manner. We present our results in the form of benchmarks against well-known k-means clustering and discuss the advantages and disadvantages of the proposed techniques.

  17. Multi-agent coordination algorithms for control of distributed energy resources in smart grids

    NASA Astrophysics Data System (ADS)

    Cortes, Andres

    Sustainable energy is a top priority for researchers these days, since electricity and transportation are pillars of modern society. Integration of clean energy technologies such as wind, solar, and plug-in electric vehicles (PEVs) is a major engineering challenge in operation and management of power systems. This is due to the uncertain nature of renewable energy technologies and the large amount of extra load that PEVs would add to the power grid. Given the networked structure of a power system, multi-agent control and optimization strategies are natural approaches to address the various problems of interest for the safe and reliable operation of the power grid. The distributed computation in multi-agent algorithms addresses three problems at the same time: i) it allows for the handling of problems with millions of variables that a single processor cannot compute, ii) it allows certain independence and privacy to electricity customers by not requiring any usage information, and iii) it is robust to localized failures in the communication network, being able to solve problems by simply neglecting the failing section of the system. We propose various algorithms to coordinate storage, generation, and demand resources in a power grid using multi-agent computation and decentralized decision making. First, we introduce a hierarchical vehicle-one-grid (V1G) algorithm for coordination of PEVs under usage constraints, where energy only flows from the grid into the batteries of PEVs. We then present a hierarchical vehicle-to-grid (V2G) algorithm for PEV coordination that takes into consideration line capacity constraints in the distribution grid, and where energy flows both ways, from the grid into the batteries, and from the batteries to the grid. Next, we develop a greedy-like hierarchical algorithm for management of demand response events with on/off loads. Finally, we introduce distributed algorithms for the optimal control of distributed energy resources, i.e., generation and storage in a microgrid. The algorithms we present are provably correct and tested in simulation. Each algorithm is assumed to work on a particular network topology, and simulation studies are carried out in order to demonstrate their convergence properties to a desired solution.

  18. A Greedy Double Auction Mechanism for Grid Resource Allocation

    NASA Astrophysics Data System (ADS)

    Ding, Ding; Luo, Siwei; Gao, Zhan

    To improve resource utilization and satisfy more users, a Greedy Double Auction Mechanism (GDAM) is proposed to allocate resources in grid environments. GDAM trades resources at discriminatory prices instead of a uniform price, reflecting the variance in requirements for profits and quantities. Moreover, GDAM applies different auction rules to different cases: over-demand, over-supply, and equilibrium of demand and supply. As a new mechanism for grid resource allocation, GDAM is proved to be strategy-proof, economically efficient, weakly budget-balanced and individually rational. Simulation results also confirm that GDAM outperforms the traditional mechanism on both the total trade amount and the user satisfaction percentage, especially as more users are involved in the auction market.

  19. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment

    PubMed Central

    Alonso-Mora, Javier; Samaranayake, Samitha; Wallar, Alex; Frazzoli, Emilio; Rus, Daniela

    2017-01-01

    Ride-sharing services are transforming urban mobility by providing timely and convenient transportation to anybody, anywhere, and anytime. These services present enormous potential for positive societal impacts with respect to pollution, energy consumption, congestion, etc. Current mathematical models, however, do not fully address the potential of ride-sharing. Recently, a large-scale study highlighted some of the benefits of car pooling but was limited to static routes with two riders per vehicle (optimally) or three (with heuristics). We present a more general mathematical model for real-time high-capacity ride-sharing that (i) scales to large numbers of passengers and trips and (ii) dynamically generates optimal routes with respect to online demand and vehicle locations. The algorithm starts from a greedy assignment and improves it through a constrained optimization, quickly returning solutions of good quality and converging to the optimal assignment over time. We quantify experimentally the tradeoff between fleet size, capacity, waiting time, travel delay, and operational costs for low- to medium-capacity vehicles, such as taxis and van shuttles. The algorithm is validated with ∼3 million rides extracted from the New York City taxicab public dataset. Our experimental study considers ride-sharing with rider capacity of up to 10 simultaneous passengers per vehicle. The algorithm applies to fleets of autonomous vehicles and also incorporates rebalancing of idling vehicles to areas of high demand. This framework is general and can be used for many real-time multivehicle, multitask assignment problems. PMID:28049820
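    The greedy starting point mentioned in this abstract can be illustrated with a minimal sketch. It assumes a hypothetical cost table keyed by (vehicle, request) pairs and at most one request per vehicle; the names `greedy_assignment`, `total_cost`, and `costs` are illustrative only and are not taken from the paper, whose method assigns multi-passenger trips and then refines the result through a constrained optimization.

```python
from typing import Dict, Tuple


def greedy_assignment(costs: Dict[Tuple[str, str], float]) -> Dict[str, str]:
    """Pair vehicles with requests by repeatedly taking the cheapest free pair."""
    assignment: Dict[str, str] = {}
    used_requests = set()
    # Visit candidate pairs from cheapest to most expensive.
    for (vehicle, request), _cost in sorted(costs.items(), key=lambda kv: kv[1]):
        if vehicle in assignment or request in used_requests:
            continue  # vehicle or request is already committed
        assignment[vehicle] = request
        used_requests.add(request)
    return assignment


def total_cost(costs: Dict[Tuple[str, str], float],
               assignment: Dict[str, str]) -> float:
    """Sum the cost of every (vehicle, request) pair in an assignment."""
    return sum(costs[(v, r)] for v, r in assignment.items())


# Hypothetical toy instance: two vehicles, two requests.
costs = {
    ("v1", "r1"): 1.0,
    ("v1", "r2"): 2.0,
    ("v2", "r1"): 2.0,
    ("v2", "r2"): 10.0,
}
greedy = greedy_assignment(costs)
alternative = {"v1": "r2", "v2": "r1"}
print(greedy, total_cost(costs, greedy))
print(alternative, total_cost(costs, alternative))
```

    On this toy instance the greedy pairing costs 11 while the alternative pairing costs 4, which is the kind of gap that a subsequent optimization step can close.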

  20. On-demand high-capacity ride-sharing via dynamic trip-vehicle assignment.

    PubMed

    Alonso-Mora, Javier; Samaranayake, Samitha; Wallar, Alex; Frazzoli, Emilio; Rus, Daniela

    2017-01-17

    Ride-sharing services are transforming urban mobility by providing timely and convenient transportation to anybody, anywhere, and anytime. These services present enormous potential for positive societal impacts with respect to pollution, energy consumption, congestion, etc. Current mathematical models, however, do not fully address the potential of ride-sharing. Recently, a large-scale study highlighted some of the benefits of car pooling but was limited to static routes with two riders per vehicle (optimally) or three (with heuristics). We present a more general mathematical model for real-time high-capacity ride-sharing that (i) scales to large numbers of passengers and trips and (ii) dynamically generates optimal routes with respect to online demand and vehicle locations. The algorithm starts from a greedy assignment and improves it through a constrained optimization, quickly returning solutions of good quality and converging to the optimal assignment over time. We quantify experimentally the tradeoff between fleet size, capacity, waiting time, travel delay, and operational costs for low- to medium-capacity vehicles, such as taxis and van shuttles. The algorithm is validated with ∼3 million rides extracted from the New York City taxicab public dataset. Our experimental study considers ride-sharing with rider capacity of up to 10 simultaneous passengers per vehicle. The algorithm applies to fleets of autonomous vehicles and also incorporates rebalancing of idling vehicles to areas of high demand. This framework is general and can be used for many real-time multivehicle, multitask assignment problems.

  1. Technical Note: A novel leaf sequencing optimization algorithm which considers previous underdose and overdose events for MLC tracking radiotherapy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wisotzky, Eric, E-mail: eric.wisotzky@charite.de, E-mail: eric.wisotzky@ipk.fraunhofer.de; O’Brien, Ricky; Keall, Paul J., E-mail: paul.keall@sydney.edu.au

    2016-01-15

    Purpose: Multileaf collimator (MLC) tracking radiotherapy is complex as the beam pattern needs to be modified due to the planned intensity modulation as well as the real-time target motion. The target motion cannot be planned; therefore, the modified beam pattern differs from the original plan and the MLC sequence needs to be recomputed online. Current MLC tracking algorithms use a greedy heuristic in that they optimize for a given time, but ignore past errors. To overcome this problem, the authors have developed and improved an algorithm that minimizes large underdose and overdose regions. Additionally, previous underdose and overdose events are taken into account to avoid regions with a high quantity of dose events. Methods: The authors improved the existing MLC motion control algorithm by introducing a cumulative underdose/overdose map. This map represents the actual projection of the planned tumor shape and logs occurring dose events at each specific region. These events have an impact on the dose cost calculation and reduce recurrence of dose events at each region. The authors studied the improvement of the new temporal optimization algorithm in terms of the L1-norm minimization of the sum of overdose and underdose compared to not accounting for previous dose events. For evaluation, the authors simulated the delivery of 5 conformal and 14 intensity-modulated radiotherapy (IMRT)-plans with 7 3D patient measured tumor motion traces. Results: Simulations with conformal shapes showed an improvement of L1-norm up to 8.5% after 100 MLC modification steps. Experiments showed comparable improvements with the same type of treatment plans. Conclusions: A novel leaf sequencing optimization algorithm which considers previous dose events for MLC tracking radiotherapy has been developed and investigated. Reductions in underdose/overdose are observed for conformal and IMRT delivery.

  2. Multi-Satellite Scheduling Approach for Dynamic Areal Tasks Triggered by Emergent Disasters

    NASA Astrophysics Data System (ADS)

    Niu, X. N.; Zhai, X. J.; Tang, H.; Wu, L. X.

    2016-06-01

    The process of satellite mission scheduling, which plays a significant role in rapid response to emergent disasters, e.g. earthquakes, is used to allocate the observation resources and execution time to a series of imaging tasks by maximizing one or more objectives while satisfying certain given constraints. In practice, the information obtained about the disaster situation changes dynamically, which in turn leads to dynamic imaging requirements from users. We propose a satellite scheduling model to address dynamic imaging tasks triggered by emergent disasters. The goal of the proposed model is to meet the emergency response requirements by producing an imaging plan that acquires rapid and effective information about the affected area. In the model, the reward of the schedule is maximized. To solve the model, we first present a dynamic segmenting algorithm to partition area targets. Then a dynamic heuristic algorithm embedding a greedy criterion is designed to obtain the optimal solution. To evaluate the model, we conduct experimental simulations in the scenario of the Wenchuan Earthquake. The results show that the simulated imaging plan can schedule satellites to observe a wider scope of target area. We conclude that our satellite scheduling model can optimize the usage of satellite resources so as to obtain images in disaster response in a more timely and efficient manner.

  3. Polarity related influence maximization in signed social networks.

    PubMed

    Li, Dong; Xu, Zhi-Ming; Chakraborty, Nilanjan; Gupta, Anika; Sycara, Katia; Li, Sheng

    2014-01-01

    Influence maximization in social networks has been widely studied motivated by applications like spread of ideas or innovations in a network and viral marketing of products. Current studies focus almost exclusively on unsigned social networks containing only positive relationships (e.g. friend or trust) between users. Influence maximization in signed social networks containing both positive relationships and negative relationships (e.g. foe or distrust) between users is still a challenging problem that has not been studied. Thus, in this paper, we propose the polarity-related influence maximization (PRIM) problem which aims to find the seed node set with maximum positive influence or maximum negative influence in signed social networks. To address the PRIM problem, we first extend the standard Independent Cascade (IC) model to the signed social networks and propose a Polarity-related Independent Cascade (named IC-P) diffusion model. We prove that the influence function of the PRIM problem under the IC-P model is monotonic and submodular. Thus, a greedy algorithm can be used to achieve an approximation ratio of 1-1/e for solving the PRIM problem in signed social networks. Experimental results on two signed social network datasets, Epinions and Slashdot, validate that our approximation algorithm for solving the PRIM problem outperforms state-of-the-art methods.
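    The greedy scheme that attains the 1-1/e ratio for monotone submodular objectives under a cardinality constraint can be sketched as below. This is a generic illustration, not the paper's IC-P influence function: a simple set-coverage function stands in as the monotone submodular objective, and the names `greedy_max`, `reach`, and `coverage` are hypothetical.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_max(candidates: Sequence[str], k: int,
               f: Callable[[FrozenSet[str]], float]) -> List[str]:
    """Pick up to k elements, each time adding the largest marginal gain in f."""
    chosen: List[str] = []
    chosen_set: set = set()
    current = f(frozenset())
    for _ in range(k):
        best, best_gain = None, 0.0
        for c in candidates:
            if c in chosen_set:
                continue
            gain = f(frozenset(chosen_set | {c})) - current
            if gain > best_gain:  # strict '>' keeps the first element on ties
                best, best_gain = c, gain
        if best is None:  # no candidate improves the objective
            break
        chosen.append(best)
        chosen_set.add(best)
        current += best_gain
    return chosen


# Hypothetical stand-in objective: each seed "reaches" a fixed set of users.
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}, "d": {1, 7}}


def coverage(seeds: FrozenSet[str]) -> float:
    """Number of distinct users reached (monotone and submodular)."""
    return len(set().union(*(reach[s] for s in seeds)))


seeds = greedy_max(list(reach), 2, coverage)
print(seeds, coverage(frozenset(seeds)))
```

    In an influence-maximization setting the objective would typically be an expected spread estimated by simulation, which is far more expensive to evaluate than the toy coverage function used here.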

  4. Polarity Related Influence Maximization in Signed Social Networks

    PubMed Central

    Li, Dong; Xu, Zhi-Ming; Chakraborty, Nilanjan; Gupta, Anika; Sycara, Katia; Li, Sheng

    2014-01-01

    Influence maximization in social networks has been widely studied motivated by applications like spread of ideas or innovations in a network and viral marketing of products. Current studies focus almost exclusively on unsigned social networks containing only positive relationships (e.g. friend or trust) between users. Influence maximization in signed social networks containing both positive relationships and negative relationships (e.g. foe or distrust) between users is still a challenging problem that has not been studied. Thus, in this paper, we propose the polarity-related influence maximization (PRIM) problem which aims to find the seed node set with maximum positive influence or maximum negative influence in signed social networks. To address the PRIM problem, we first extend the standard Independent Cascade (IC) model to the signed social networks and propose a Polarity-related Independent Cascade (named IC-P) diffusion model. We prove that the influence function of the PRIM problem under the IC-P model is monotonic and submodular. Thus, a greedy algorithm can be used to achieve an approximation ratio of 1-1/e for solving the PRIM problem in signed social networks. Experimental results on two signed social network datasets, Epinions and Slashdot, validate that our approximation algorithm for solving the PRIM problem outperforms state-of-the-art methods. PMID:25061986

  5. Energy-landscape paving for prediction of face-centered-cubic hydrophobic-hydrophilic lattice model proteins

    NASA Astrophysics Data System (ADS)

    Liu, Jingfa; Song, Beibei; Liu, Zhaoxia; Huang, Weibo; Sun, Yuanyuan; Liu, Wenjie

    2013-11-01

    Protein structure prediction (PSP) is a classical NP-hard problem in computational biology. The energy-landscape paving (ELP) method is a class of heuristic global optimization algorithms, and has been successfully applied to solving many optimization problems with complex energy landscapes in the continuous space. By putting forward a new update mechanism of the histogram function in ELP and incorporating the generation of initial conformation based on the greedy strategy and the neighborhood search strategy based on pull moves into ELP, an improved energy-landscape paving (ELP+) method is put forward. Twelve general benchmark instances are first tested on both two-dimensional and three-dimensional (3D) face-centered-cubic (fcc) hydrophobic-hydrophilic (HP) lattice models. The lowest energies by ELP+ are as good as or better than those of other methods in the literature for all instances. Then, five sets of larger-scale instances, denoted by S, R, F90, F180, and CASP target instances on the 3D fcc HP lattice model are tested. The proposed algorithm finds lower energies than those by the five other methods in the literature. Not unexpectedly, this is particularly pronounced for the longer sequences considered. Computational results show that ELP+ is an effective method for PSP on the fcc HP lattice model.

  6. A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.

    PubMed

    Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed

    2014-01-01

    With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naïve Bayes remains one of the oldest and most popular classifiers. On the one hand, implementation of naïve Bayes is simple and, on the other hand, it also requires less training data. From the literature review, it is found that naïve Bayes performs poorly compared to other classifiers in text classification. As a result, this makes the naïve Bayes classifier unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based first on univariate feature selection and then on feature clustering, where we use the univariate feature selection method to reduce the search space and then apply clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods like greedy search based wrapper or CFS.

  7. Distribution-Preserving Stratified Sampling for Learning Problems.

    PubMed

    Cervellera, Cristiano; Maccio, Danilo

    2017-06-09

    The need for extracting a small sample from a large amount of real data, possibly streaming, arises routinely in learning problems, e.g., for storage, to cope with computational limitations, obtain good training/test/validation sets, and select minibatches for stochastic gradient neural network training. Unless we have reasons to select the samples in an active way dictated by the specific task and/or model at hand, it is important that the distribution of the selected points is as similar as possible to the original data. This is obvious for unsupervised learning problems, where the goal is to gain insights on the distribution of the data, but it is also relevant for supervised problems, where the theory explains how the training set distribution influences the generalization error. In this paper, we analyze the technique of stratified sampling from the point of view of distances between probabilities. This allows us to introduce an algorithm, based on recursive binary partition of the input space, aimed at obtaining samples that are distributed as much as possible as the original data. A theoretical analysis is proposed, proving the (greedy) optimality of the procedure together with explicit error bounds. An adaptive version of the algorithm is also introduced to cope with streaming data. Simulation tests on various data sets and different learning tasks are also provided.

  8. Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

    PubMed Central

    Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio

    2014-01-01

    Study of emotions in human–computer interaction is a growing research area. This paper shows an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish languages using different methods for feature selection. RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. Achieved results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition, with all different feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy searching approach (FSS-Forward) has been applied and a comparison between them is provided. Based on achieved results, a set of most relevant non-speaker dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686

  9. Minimizing the average distance to a closest leaf in a phylogenetic tree.

    PubMed

    Matsen, Frederick A; Gallagher, Aaron; McCoy, Connor O

    2013-11-01

    When performing an analysis on a collection of molecular sequences, it can be convenient to reduce the number of sequences under consideration while maintaining some characteristic of a larger collection of sequences. For example, one may wish to select a subset of high-quality sequences that represent the diversity of a larger collection of sequences. One may also wish to specialize a large database of characterized "reference sequences" to a smaller subset that is as close as possible on average to a collection of "query sequences" of interest. Such a representative subset can be useful whenever one wishes to find a set of reference sequences that is appropriate to use for comparative analysis of environmentally derived sequences, such as for selecting "reference tree" sequences for phylogenetic placement of metagenomic reads. In this article, we formalize these problems in terms of the minimization of the Average Distance to the Closest Leaf (ADCL) and investigate algorithms to perform the relevant minimization. We show that the greedy algorithm is not effective, show that a variant of the Partitioning Around Medoids (PAM) heuristic gets stuck in local minima, and develop an exact dynamic programming approach. Using this exact program we note that the performance of PAM appears to be good for simulated trees, and is faster than the exact algorithm for small trees. On the other hand, the exact program gives solutions for all numbers of leaves less than or equal to the given desired number of leaves, whereas PAM only gives a solution for the prespecified number of leaves. Via application to real data, we show that the ADCL criterion chooses chimeric sequences less often than random subsets, whereas the maximization of phylogenetic diversity chooses them more often than random. These algorithms have been implemented in publicly available software.
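    The gap between greedy and exact selection noted in this abstract can be illustrated on a toy instance. The sketch below is an assumption-laden stand-in: it places "sequences" as points on a line instead of leaves of a phylogenetic tree, uses a forward (additive) greedy rule, and replaces the authors' dynamic program with brute-force enumeration; the names `adcl`, `greedy_subset`, and `exact_subset` are hypothetical.

```python
from itertools import combinations
from typing import List, Sequence


def adcl(points: Sequence[int], chosen: Sequence[int]) -> float:
    """Average distance from every point to its closest chosen point."""
    return sum(min(abs(p - c) for c in chosen) for p in points) / len(points)


def greedy_subset(points: Sequence[int], k: int) -> List[int]:
    """Add one point at a time, always the one that lowers ADCL the most."""
    chosen: List[int] = []
    for _ in range(k):
        remaining = [p for p in points if p not in chosen]
        best = min(remaining, key=lambda p: adcl(points, chosen + [p]))
        chosen.append(best)
    return chosen


def exact_subset(points: Sequence[int], k: int) -> List[int]:
    """Brute-force search over all k-subsets (feasible only for tiny inputs)."""
    return list(min(combinations(points, k), key=lambda s: adcl(points, s)))


# Hypothetical toy data: two tight groups and one point in between.
points = [0, 1, 2, 5, 8, 9, 10]
g = greedy_subset(points, 2)
e = exact_subset(points, 2)
print("greedy:", g, adcl(points, g))
print("exact: ", e, adcl(points, e))
```

    Here the greedy rule first commits to the central point, which is the best single representative but a poor member of the best pair, so its final ADCL is worse than that of the exhaustive search.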

  10. Amoeba-inspired Tug-of-War algorithms for exploration-exploitation dilemma in extended Bandit Problem.

    PubMed

    Aono, Masashi; Kim, Song-Ju; Hara, Masahiko; Munakata, Toshinori

    2014-03-01

    The true slime mold Physarum polycephalum, a single-celled amoeboid organism, is capable of efficiently allocating a constant amount of intracellular resource to its pseudopod-like branches that best fit the environment where dynamic light stimuli are applied. Inspired by the resource allocation process, the authors formulated a concurrent search algorithm, called the Tug-of-War (TOW) model, for maximizing the profit in the multi-armed Bandit Problem (BP). A player (gambler) of the BP should decide as quickly and accurately as possible which slot machine to invest in out of the N machines and faces an "exploration-exploitation dilemma." The dilemma is a trade-off between the speed and accuracy of the decision making, which are conflicting objectives. The TOW model maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a nonlocal correlation among the branches, i.e., volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). Owing to this nonlocal correlation, the TOW model can efficiently manage the dilemma. In this study, we extend the TOW model to apply it to a stretched variant of BP, the Extended Bandit Problem (EBP), which is a problem of selecting the best M-tuple of the N machines. We demonstrate that the extended TOW model exhibits better performance for 2-tuple-3-machine and 2-tuple-4-machine instances of EBP compared with the extended versions of well-known algorithms for BP, the ϵ-Greedy and SoftMax algorithms, particularly in terms of its short-term decision-making capability that is essential for the survival of the amoeba in a hostile environment. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
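    The ϵ-Greedy baseline named in this abstract can be sketched for the ordinary (single-machine) Bandit Problem as follows. This is a textbook-style illustration under stated assumptions, not the TOW model or the extended M-tuple variant: rewards are assumed to be Bernoulli, and the function name `epsilon_greedy` and its parameters are hypothetical.

```python
import random
from typing import List, Tuple


def epsilon_greedy(probs: List[float], steps: int, epsilon: float,
                   seed: int = 0) -> Tuple[List[int], List[float]]:
    """Play a Bernoulli bandit; return pull counts and mean-reward estimates."""
    rng = random.Random(seed)
    n = len(probs)
    counts = [0] * n
    estimates = [0.0] * n
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)  # explore: pick a machine uniformly
        else:
            # exploit: pick the machine with the best estimate (first on ties)
            arm = max(range(n), key=lambda i: estimates[i])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # incremental update of the running mean reward for this machine
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


# With epsilon = 0 the player never explores and stays on the first machine.
pure_greedy_counts, _ = epsilon_greedy([0.0, 1.0], steps=50, epsilon=0.0)
# With a small epsilon the better machine is eventually discovered.
explore_counts, explore_estimates = epsilon_greedy([0.0, 1.0], steps=2000,
                                                   epsilon=0.1)
print(pure_greedy_counts, explore_counts)
```

    The two runs show the dilemma in miniature: without exploration the player can lock onto a poor machine, while exploration costs some pulls but lets the estimates identify the better one.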

  11. Ant colony optimisation-direct cover: a hybrid ant colony direct cover technique for multi-level synthesis of multiple-valued logic functions

    NASA Astrophysics Data System (ADS)

    Abd-El-Barr, Mostafa

    2010-12-01

    The use of non-binary (multiple-valued) logic in the synthesis of digital systems can lead to savings in chip area. Advances in very large scale integration (VLSI) technology have enabled the successful implementation of multiple-valued logic (MVL) circuits. A number of heuristic algorithms for the synthesis of (near) minimal sum-of-products (two-level) realisation of MVL functions have been reported in the literature. The direct cover (DC) technique is one such algorithm. The ant colony optimisation (ACO) algorithm is a meta-heuristic that uses constructive greediness to explore a large solution space in finding (near) optimal solutions. The ACO algorithm mimics the ant's behaviour in the real world in using the shortest path to reach food sources. We have previously introduced an ACO-based heuristic for the synthesis of two-level MVL functions. In this article, we introduce the ACO-DC hybrid technique for the synthesis of multi-level MVL functions. The basic idea is to use an ant to decompose a given MVL function into a number of levels and then synthesise each sub-function using a DC-based technique. The results obtained using the proposed approach are compared to those obtained using existing techniques reported in the literature. A benchmark set consisting of 50,000 randomly generated 2-variable 4-valued functions is used in the comparison. The results obtained using the proposed ACO-DC technique are shown to produce efficient realisation in terms of the average number of gates (as a measure of chip area) needed for the synthesis of a given MVL function.

  12. Understanding and predicting binding between human leukocyte antigens (HLAs) and peptides by network analysis.

    PubMed

    Luo, Heng; Ye, Hao; Ng, Hui; Shi, Leming; Tong, Weida; Mattes, William; Mendrick, Donna; Hong, Huixiao

    2015-01-01

    As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are among the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Nine modules were identified from analyzing the HLA-peptide binding network with the highest modularity compared to all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature. Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs.

  13. Understanding and predicting binding between human leukocyte antigens (HLAs) and peptides by network analysis

    PubMed Central

    2015-01-01

    Background As the major histocompatibility complex (MHC), human leukocyte antigens (HLAs) are among the most polymorphic genes in humans. Patients carrying certain HLA alleles may develop adverse drug reactions (ADRs) after taking specific drugs. Peptides play an important role in HLA related ADRs as they are the necessary co-binders of HLAs with drugs. Many experimental data have been generated for understanding HLA-peptide binding. However, efficiently utilizing the data for understanding and accurately predicting HLA-peptide binding is challenging. Therefore, we developed a network analysis based method to understand and predict HLA-peptide binding. Methods Qualitative Class I HLA-peptide binding data were harvested and prepared from four major databases. An HLA-peptide binding network was constructed from this dataset and modules were identified by the fast greedy modularity optimization algorithm. To examine the significance of signals in the yielded models, the modularity was compared with the modularity values generated from 1,000 random networks. The peptides and HLAs in the modules were characterized by similarity analysis. The neighbor-edges based and unbiased leverage algorithm (Nebula) was developed for predicting HLA-peptide binding. Leave-one-out (LOO) validations and two-fold cross-validations were conducted to evaluate the performance of Nebula using the constructed HLA-peptide binding network. Results Nine modules were identified from analyzing the HLA-peptide binding network, with a modularity higher than that of all the random networks. Peptide length and functional side chains of amino acids at certain positions of the peptides were different among the modules. HLA sequences were module dependent to some extent. Nebula achieved an overall prediction accuracy of 0.816 in the LOO validations and an average accuracy of 0.795 in the two-fold cross-validations and outperformed the method reported in the literature. Conclusions Network analysis is a useful approach for analyzing large and sparse datasets such as the HLA-peptide binding dataset. The modules identified from the network analysis clustered peptides and HLAs with similar sequences and properties of amino acids. Nebula performed well in the predictions of HLA-peptide binding. We demonstrated that network analysis coupled with Nebula is an efficient approach to understand and predict HLA-peptide binding interactions and thus, could further our understanding of ADRs. PMID:26424483
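
    The abstract above names a fast greedy modularity optimization algorithm for finding modules. As a rough illustration only, the sketch below performs naive greedy agglomeration: it starts from single-node communities and repeatedly merges the pair that raises modularity the most. It assumes an undirected, unweighted edge list; the function names are hypothetical, and the quadratic search is not the "fast" variant or the authors' implementation.

```python
from itertools import combinations


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = len(edges)
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    inside = [0] * len(communities)   # edges with both ends in community c
    degree = [0] * len(communities)   # total degree of community c
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            inside[label[u]] += 1
    return sum(inside[c] / m - (degree[c] / (2 * m)) ** 2
               for c in range(len(communities)))


def greedy_modularity(edges):
    """Merge the best pair of communities until no merge improves Q."""
    nodes = sorted({n for edge in edges for n in edge})
    communities = [frozenset([n]) for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(communities, 2):
            merged = [c for c in communities if c not in (a, b)] + [a | b]
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda t: t[0])
        if q <= best_q:          # greedy stop: no merge increases modularity
            break
        best_q, communities = q, merged
    return communities, best_q
```

    On two triangles joined by a single bridge edge, this sketch returns the two triangles as separate modules.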

  14. Masking Strategies for Image Manifolds.

    PubMed

    Dadkhahi, Hamid; Duarte, Marco F

    2016-07-07

    We consider the problem of selecting an optimal mask for an image manifold, i.e., choosing a subset of the pixels of the image that preserves the manifold's geometric structure present in the original data. Such masking implements a form of compressive sensing through emerging imaging sensor platforms for which the power expense grows with the number of pixels acquired. Our goal is for the manifold learned from masked images to resemble its full image counterpart as closely as possible. More precisely, we show that one can indeed accurately learn an image manifold without having to consider a large majority of the image pixels. In doing so, we consider two masking methods that preserve the local and global geometric structure of the manifold, respectively. In each case, the process of finding the optimal masking pattern can be cast as a binary integer program, which is computationally expensive but can be approximated by a fast greedy algorithm. Numerical experiments show that the relevant manifold structure is preserved through the data-dependent masking process, even for modest mask sizes.

  15. Tag-Based Social Image Search: Toward Relevant and Diverse Results

    NASA Astrophysics Data System (ADS)

    Yang, Kuiyuan; Wang, Meng; Hua, Xian-Sheng; Zhang, Hong-Jiang

    Recent years have witnessed the great success of social media websites. Tag-based image search is an important approach to access the image content of interest on these websites. However, the existing ranking methods for tag-based image search frequently return results that are irrelevant or lack diversity. This chapter presents a diverse relevance ranking scheme which simultaneously takes relevance and diversity into account by exploring the content of images and their associated tags. First, it estimates the relevance scores of images with respect to the query term based on both visual information of images and semantic information of associated tags. Then semantic similarities of social images are estimated based on their tags. Based on the relevance scores and the similarities, the ranking list is generated by a greedy ordering algorithm which optimizes Average Diverse Precision (ADP), a novel measure that is extended from the conventional Average Precision (AP). Comprehensive experiments and user studies demonstrate the effectiveness of the approach.
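
    The abstract above does not define ADP, so the sketch below does not optimize it. It only illustrates the general shape of a greedy ordering that trades relevance against diversity, using a stand-in gain (relevance multiplied by novelty with respect to items already ranked). The function name, the gain formula, and the example scores are all assumptions made for illustration.

```python
def greedy_diverse_ranking(relevance, similarity):
    """Order items greedily by relevance discounted by redundancy.

    relevance:  dict mapping item id -> relevance score in [0, 1]
    similarity: function (a, b) -> similarity in [0, 1]
    The gain used here is a stand-in, NOT the ADP measure from the abstract.
    """
    remaining = set(relevance)
    ranking = []
    while remaining:
        def gain(item):
            # Novelty is 1 minus the similarity to the closest ranked item.
            closest = max((similarity(item, r) for r in ranking), default=0.0)
            return relevance[item] * (1.0 - closest)

        # sorted() makes tie-breaking deterministic.
        best = max(sorted(remaining), key=gain)
        ranking.append(best)
        remaining.remove(best)
    return ranking


# Hypothetical scores: "a" and "b" are near-duplicates, "c" is different.
scores = {"a": 0.9, "b": 0.85, "c": 0.5}
pair_sim = {frozenset("ab"): 0.9, frozenset("ac"): 0.1, frozenset("bc"): 0.1}
order = greedy_diverse_ranking(scores, lambda x, y: pair_sim[frozenset((x, y))])
```

    With these example scores the less relevant but dissimilar item "c" is ranked ahead of the near-duplicate "b".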

  16. Sparsity-based Poisson denoising with dictionary learning.

    PubMed

    Giryes, Raja; Elad, Michael

    2014-12-01

    The problem of Poisson denoising appears in various imaging applications, such as low-light photography, medical imaging, and microscopy. In cases of high SNR, several transformations exist so as to convert the Poisson noise into additive independent identically distributed Gaussian noise, for which many effective algorithms are available. However, in a low-SNR regime, these transformations are significantly less accurate, and a strategy that relies directly on the true noise statistics is required. Salmon et al. took this route, proposing a patch-based exponential image representation model based on a Gaussian mixture model, leading to state-of-the-art results. In this paper, we propose to harness sparse-representation modeling to the image patches, adopting the same exponential idea. Our scheme uses a greedy pursuit with boot-strapping-based stopping condition and dictionary learning within the denoising process. The reconstruction performance of the proposed scheme is competitive with leading methods in high SNR and achieves state-of-the-art results in cases of low SNR.

  17. On the inherent competition between valid and spurious inductive inferences in Boolean data

    NASA Astrophysics Data System (ADS)

    Andrecut, M.

    Inductive inference is the process of extracting general rules from specific observations. This problem also arises in the analysis of biological networks, such as genetic regulatory networks, where the interactions are complex and the observations are incomplete. A typical task in these problems is to extract general interaction rules as combinations of Boolean covariates that explain a measured response variable. The inductive inference process can be considered as an incompletely specified Boolean function synthesis problem. This incompleteness of the problem will also generate spurious inferences, which are a serious threat to valid inductive inference rules. Using random Boolean data as a null model, here we attempt to measure the competition between valid and spurious inductive inference rules from a given data set. We formulate two greedy search algorithms, which synthesize a given Boolean response variable in a sparse disjunctive normal form and in a sparse generalized algebraic normal form, respectively, of the variables from the observation data, and we evaluate numerically their performance.

  18. Automated construction of arterial and venous trees in retinal images.

    PubMed

    Hu, Qiao; Abràmoff, Michael D; Garvin, Mona K

    2015-10-01

    While many approaches exist to segment retinal vessels in fundus photographs, only a limited number focus on the construction and disambiguation of arterial and venous trees. Previous approaches are local and/or greedy in nature, making them susceptible to errors or limiting their applicability to large vessels. We propose a more global framework to generate arteriovenous trees in retinal images, given a vessel segmentation. In particular, our approach consists of three stages. The first stage is to generate an overconnected vessel network, named the vessel potential connectivity map (VPCM), consisting of vessel segments and the potential connectivity between them. The second stage is to disambiguate the VPCM into multiple anatomical trees, using a graph-based metaheuristic algorithm. The third stage is to classify these trees into arterial or venous (A/V) trees. We evaluated our approach with a ground truth built based on a public database, showing a pixel-wise classification accuracy of 88.15% using a manual vessel segmentation as input, and 86.11% using an automatic vessel segmentation as input.

  19. Increasing the Lifetime of Mobile WSNs via Dynamic Optimization of Sensor Node Communication Activity

    PubMed Central

    Guimarães, Dayan Adionel; Sakai, Lucas Jun; Alberti, Antonio Marcos; de Souza, Rausley Adriano Amaral

    2016-01-01

    In this paper, a simple and flexible method for increasing the lifetime of fixed or mobile wireless sensor networks is proposed. Based on past residual energy information reported by the sensor nodes, the sink node or another central node dynamically optimizes the communication activity levels of the sensor nodes to save energy without sacrificing the data throughput. The activity levels are defined to represent portions of time or time-frequency slots in a frame, during which the sensor nodes are scheduled to communicate with the sink node to report sensory measurements. Besides node mobility, it is considered that sensors’ batteries may be recharged via a wireless power transmission or equivalent energy harvesting scheme, bringing to the optimization problem an even more dynamic character. We report large increased lifetimes over the non-optimized network and comparable or even larger lifetime improvements with respect to an idealized greedy algorithm that uses both the real-time channel state and the residual energy information. PMID:27657075

  20. Gaussian functional regression for output prediction: Model assimilation and experimental design

    NASA Astrophysics Data System (ADS)

    Nguyen, N. C.; Peraire, J.

    2016-03-01

    In this paper, we introduce a Gaussian functional regression (GFR) technique that integrates multi-fidelity models with model reduction to efficiently predict the input-output relationship of a high-fidelity model. The GFR method combines the high-fidelity model with a low-fidelity model to provide an estimate of the output of the high-fidelity model in the form of a posterior distribution that can characterize uncertainty in the prediction. A reduced basis approximation is constructed upon the low-fidelity model and incorporated into the GFR method to yield an inexpensive posterior distribution of the output estimate. As this posterior distribution depends crucially on a set of training inputs at which the high-fidelity models are simulated, we develop a greedy sampling algorithm to select the training inputs. Our approach results in an output prediction model that inherits the fidelity of the high-fidelity model and has the computational complexity of the reduced basis approximation. Numerical results are presented to demonstrate the proposed approach.

  1. Increasing the Lifetime of Mobile WSNs via Dynamic Optimization of Sensor Node Communication Activity.

    PubMed

    Guimarães, Dayan Adionel; Sakai, Lucas Jun; Alberti, Antonio Marcos; de Souza, Rausley Adriano Amaral

    2016-09-20

    In this paper, a simple and flexible method for increasing the lifetime of fixed or mobile wireless sensor networks is proposed. Based on past residual energy information reported by the sensor nodes, the sink node or another central node dynamically optimizes the communication activity levels of the sensor nodes to save energy without sacrificing the data throughput. The activity levels are defined to represent portions of time or time-frequency slots in a frame, during which the sensor nodes are scheduled to communicate with the sink node to report sensory measurements. Besides node mobility, it is considered that sensors' batteries may be recharged via a wireless power transmission or equivalent energy harvesting scheme, bringing to the optimization problem an even more dynamic character. We report large increased lifetimes over the non-optimized network and comparable or even larger lifetime improvements with respect to an idealized greedy algorithm that uses both the real-time channel state and the residual energy information.

  2. Optimized Structure of the Traffic Flow Forecasting Model With a Deep Learning Approach.

    PubMed

    Yang, Hao-Fan; Dillon, Tharam S; Chen, Yi-Ping Phoebe

    2017-10-01

    Forecasting accuracy is an important issue for successful intelligent traffic management, especially in the domain of traffic efficiency and congestion reduction. The dawning of the big data era brings opportunities to greatly improve prediction accuracy. In this paper, we propose a novel model, stacked autoencoder Levenberg-Marquardt model, which is a type of deep architecture of neural network approach aiming to improve forecasting accuracy. The proposed model is designed using the Taguchi method to develop an optimized structure and to learn traffic flow features through layer-by-layer feature granulation with a greedy layerwise unsupervised learning algorithm. It is applied to real-world data collected from the M6 freeway in the U.K. and is compared with three existing traffic predictors. To the best of our knowledge, this is the first time that an optimized structure of the traffic flow forecasting model with a deep learning approach is presented. The evaluation results demonstrate that the proposed model with an optimized structure has superior performance in traffic flow forecasting.

  3. Entropy Based Feature Selection for Fuzzy Set-Valued Information Systems

    NASA Astrophysics Data System (ADS)

    Ahmed, Waseem; Sufyan Beg, M. M.; Ahmad, Tanvir

    2018-06-01

    In Set-valued Information Systems (SIS), several objects contain more than one value for some attributes. Tolerance relation used for handling SIS sometimes leads to loss of certain information. To surmount this problem, fuzzy rough model was introduced. However, in some cases, SIS may contain some real or continuous set-values. Therefore, the existing fuzzy rough model for handling Information system with fuzzy set-values needs some changes. In this paper, Fuzzy Set-valued Information System (FSIS) is proposed and fuzzy similarity relation for FSIS is defined. Yager's relative conditional entropy was studied to find the significance measure of a candidate attribute of FSIS. Later, using these significance values, three greedy forward algorithms are discussed for finding the reduct and relative reduct for the proposed FSIS. An experiment was conducted on a sample population of the real dataset and a comparison of classification accuracies of the proposed FSIS with the existing SIS and single-valued Fuzzy Information Systems was made, which demonstrated the effectiveness of proposed FSIS.

  4. High-speed and high-ratio referential genome compression.

    PubMed

    Liu, Yuansheng; Peng, Hui; Wong, Limsoon; Li, Jinyan

    2017-11-01

    The rapidly increasing number of genomes generated by high-throughput sequencing platforms and assembly algorithms is accompanied by problems in data storage, compression and communication. Traditional compression algorithms are unable to meet the demand of high compression ratio due to the intrinsic challenging features of DNA sequences such as small alphabet size, frequent repeats and palindromes. Reference-based lossless compression, by which only the differences between two similar genomes are stored, is a promising approach with high compression ratio. We present a high-performance referential genome compression algorithm named HiRGC. It is based on a 2-bit encoding scheme and an advanced greedy-matching search on a hash table. We compare the performance of HiRGC with four state-of-the-art compression methods on a benchmark dataset of eight human genomes. HiRGC takes <30 min to compress about 21 gigabytes of each set of the seven target genomes into 96-260 megabytes, achieving compression ratios of 217 to 82 times. This performance is at least 1.9 times better than the best competing algorithm on its best case. Our compression speed is also at least 2.9 times faster. HiRGC is stable and robust to deal with different reference genomes. In contrast, the competing methods' performance varies widely on different reference genomes. More experiments on 100 human genomes from the 1000 Genome Project and on genomes of several other species again demonstrate that HiRGC's performance is consistently excellent. The C++ and Java source codes of our algorithm are freely available for academic and non-commercial use. They can be downloaded from https://github.com/yuansliu/HiRGC. Supplementary data are available at Bioinformatics online.
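
    The two ingredients named in the abstract above, a 2-bit base encoding and a greedy match search on a hash table, can be sketched in a few lines. This is a toy illustration under stated assumptions (sequences contain only A, C, G and T; the k-mer length and all function names are hypothetical) and does not reproduce HiRGC's actual data structures or output format.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # 2 bits per base


def encode_kmer(kmer):
    """Pack a k-mer into an integer using 2 bits per base."""
    value = 0
    for base in kmer:
        value = (value << 2) | CODE[base]
    return value


def build_index(reference, k):
    """Hash table: 2-bit encoded k-mer -> list of start positions."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(encode_kmer(reference[i:i + k]), []).append(i)
    return index


def greedy_matches(reference, target, k):
    """Describe target as reference matches plus literal bases."""
    index = build_index(reference, k)
    ops, i = [], 0
    while i < len(target):
        best_pos, best_len = -1, 0
        if i + k <= len(target):
            for pos in index.get(encode_kmer(target[i:i + k]), []):
                length = k
                # Greedy step: extend the seed match as far as it goes.
                while (pos + length < len(reference)
                       and i + length < len(target)
                       and reference[pos + length] == target[i + length]):
                    length += 1
                if length > best_len:
                    best_pos, best_len = pos, length
        if best_len:
            ops.append(("match", best_pos, best_len))
            i += best_len
        else:
            ops.append(("literal", target[i]))
            i += 1
    return ops


def decode(reference, ops):
    """Rebuild the target from the reference and the operation list."""
    parts = []
    for op in ops:
        if op[0] == "match":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)
```

    Only the operation list needs to be stored alongside the reference; calling decode(reference, ops) recovers the target exactly.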

  5. "Norwegians fear fatness more than anything else"--a qualitative study of normative newspaper messages on obesity and health.

    PubMed

    Malterud, Kirsti; Ulriksen, Kjersti

    2010-10-01

    To explore normative aspects of the Norwegian discourse on obesity. We conducted a qualitative study with data from five Norwegian newspapers, focusing on normative entries about body weight. Discourse analysis provided a focus on the cultural attitudes when systematic text condensation was conducted. Data comprised 26 normative messages (prescriptions or comments on how obese people are or should be, messages mediating or discussing values prescribing a 'good' body). Two main normative domains within the obesity discourse were identified. One group of entries warned about obesity from an aesthetic point of view, notifying the reader that beauty would suffer when weight increases, due to reduced attractiveness. These texts appealed to bodily conformity, linking leanness with attractiveness and delight, suggesting that fat people are ugly and unhappy. The other group referred to lack of control in the obese person, linking greediness to lack of responsibility and bad health. Fat people were displayed as undisciplined and greedy individuals who should be ashamed. Cultural messages of blame and shame are associated with obesity, but also spreading from body weight to the very scene of life. People with obesity cannot escape this cultural context, only find a way of coping with it. Quality care for people with obesity implies that public health and clinical medicine acknowledge the burden of cultural stigma. Developing awareness for cultural prejudices on body weight, doctors could counteract stigmatization and contribute to empowerment and health.

  6. Multi-dimensional Rankings, Program Termination, and Complexity Bounds of Flowchart Programs

    NASA Astrophysics Data System (ADS)

    Alias, Christophe; Darte, Alain; Feautrier, Paul; Gonnord, Laure

    Proving the termination of a flowchart program can be done by exhibiting a ranking function, i.e., a function from the program states to a well-founded set, which strictly decreases at each program step. A standard method to automatically generate such a function is to compute invariants for each program point and to search for a ranking in a restricted class of functions that can be handled with linear programming techniques. Previous algorithms based on affine rankings either are applicable only to simple loops (i.e., single-node flowcharts) and rely on enumeration, or are not complete in the sense that they are not guaranteed to find a ranking in the class of functions they consider, if one exists. Our first contribution is to propose an efficient algorithm to compute ranking functions: It can handle flowcharts of arbitrary structure, the class of candidate rankings it explores is larger, and our method, although greedy, is provably complete. Our second contribution is to show how to use the ranking functions we generate to get upper bounds for the computational complexity (number of transitions) of the source program. This estimate is a polynomial, which means that we can handle programs with more than linear complexity. We applied the method on a collection of test cases from the literature. We also show the links and differences with previous techniques based on the insertion of counters.

  7. Experimental investigations on airborne gravimetry based on compressed sensing.

    PubMed

    Yang, Yapeng; Wu, Meiping; Wang, Jinling; Zhang, Kaidong; Cao, Juliang; Cai, Shaokun

    2014-03-18

    Gravity surveys are an important research topic in geophysics and geodynamics. This paper investigates a method for high accuracy large scale gravity anomaly data reconstruction. Based on the airborne gravimetry technology, a flight test was carried out in China with the strap-down airborne gravimeter (SGA-WZ) developed by the Laboratory of Inertial Technology of the National University of Defense Technology. Taking into account the sparsity of airborne gravimetry by the discrete Fourier transform (DFT), this paper proposes a method for gravity anomaly data reconstruction using the theory of compressed sensing (CS). The gravity anomaly data reconstruction is an ill-posed inverse problem, which can be transformed into a sparse optimization problem. This paper uses the zero-norm as the objective function and presents a greedy algorithm called Orthogonal Matching Pursuit (OMP) to solve the corresponding minimization problem. The test results have revealed that the compressed sampling rate is approximately 14%, the standard deviation of the reconstruction error by OMP is 0.03 mGal and the signal-to-noise ratio (SNR) is 56.48 dB. In contrast, the standard deviation of the reconstruction error by the existing nearest-interpolation method (NIPM) is 0.15 mGal and the SNR is 42.29 dB. These results have shown that the OMP algorithm can reconstruct the gravity anomaly data with higher accuracy and fewer measurements.
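
    For readers unfamiliar with Orthogonal Matching Pursuit, the greedy algorithm named in the abstract above, a minimal pure-Python sketch follows. It assumes the dictionary columns (atoms) have unit norm and solves the small least-squares problems through the normal equations. The function names and the tiny example are illustrative assumptions, not the authors' gravimetry code.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def least_squares(columns, y):
    """Solve min ||sum_j c_j * columns[j] - y|| via the normal equations."""
    n = len(columns)
    # Augmented matrix [G | b] with G the Gram matrix of the columns.
    aug = [[dot(columns[i], columns[j]) for j in range(n)]
           + [dot(columns[i], y)] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def omp(atoms, y, sparsity, tol=1e-9):
    """Return {atom index: coefficient} for a sparse approximation of y."""
    residual = list(y)
    support, coeffs = [], []
    for _ in range(sparsity):
        # Greedy step: pick the unused atom most correlated with the residual.
        scores = [-1.0 if j in support else abs(dot(atom, residual))
                  for j, atom in enumerate(atoms)]
        best = max(range(len(atoms)), key=scores.__getitem__)
        if scores[best] <= tol:
            break
        support.append(best)
        # Orthogonal step: refit all selected coefficients jointly.
        coeffs = least_squares([atoms[j] for j in support], y)
        residual = [yi - sum(c * atoms[j][i] for c, j in zip(coeffs, support))
                    for i, yi in enumerate(y)]
    return dict(zip(support, coeffs))
```

    The joint refit after every selection is what distinguishes OMP from plain matching pursuit, which updates only the newest coefficient.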

  8. Experimental Investigations on Airborne Gravimetry Based on Compressed Sensing

    PubMed Central

    Yang, Yapeng; Wu, Meiping; Wang, Jinling; Zhang, Kaidong; Cao, Juliang; Cai, Shaokun

    2014-01-01

    Gravity surveys are an important research topic in geophysics and geodynamics. This paper investigates a method for high accuracy large scale gravity anomaly data reconstruction. Based on the airborne gravimetry technology, a flight test was carried out in China with the strap-down airborne gravimeter (SGA-WZ) developed by the Laboratory of Inertial Technology of the National University of Defense Technology. Taking into account the sparsity of airborne gravimetry by the discrete Fourier transform (DFT), this paper proposes a method for gravity anomaly data reconstruction using the theory of compressed sensing (CS). The gravity anomaly data reconstruction is an ill-posed inverse problem, which can be transformed into a sparse optimization problem. This paper uses the zero-norm as the objective function and presents a greedy algorithm called Orthogonal Matching Pursuit (OMP) to solve the corresponding minimization problem. The test results have revealed that the compressed sampling rate is approximately 14%, the standard deviation of the reconstruction error by OMP is 0.03 mGal and the signal-to-noise ratio (SNR) is 56.48 dB. In contrast, the standard deviation of the reconstruction error by the existing nearest-interpolation method (NIPM) is 0.15 mGal and the SNR is 42.29 dB. These results have shown that the OMP algorithm can reconstruct the gravity anomaly data with higher accuracy and fewer measurements. PMID:24647125

  9. Contribution of Geographic Information Systems and location models to planning of wastewater systems.

    PubMed

    Leitão, J P; Matos, J S; Gonçalves, A B; Matos, J L

    2005-01-01

    This paper presents the contributions of Geographic Information Systems (GIS) and location models towards planning regional wastewater systems (sewers and wastewater treatment plants) serving small agglomerations, i.e. agglomerations with fewer than 2,000 inhabitants. The main goal was to develop a decision support tool for tracing and locating regional wastewater systems. The main results of the model are expressed in terms of number, capacity and location of Wastewater Treatment Plants (WWTP) and the length of main sewers. The decision process concerning the location and capacity of wastewater systems has a number of parameters that can be optimized. These parameters include the total sewer length and number, capacity and location of WWTP. The optimization of parameters should lead to the minimization of construction and operation costs of the integrated system. Location models have been considered as tools for decision support, mainly when a geo-referenced database can be used. In these cases, the GIS may play an important role in the analysis of data and results, especially in the preliminary stage of planning and design. After selecting the spatial location model and the heuristics, two greedy algorithms were implemented in Visual Basic for Applications on the ArcGIS software environment. To illustrate the application of these algorithms a case study was developed, in a rural area located in the central part of Portugal.

  10. Seismic signal time-frequency analysis based on multi-directional window using greedy strategy

    NASA Astrophysics Data System (ADS)

    Chen, Yingpin; Peng, Zhenming; Cheng, Zhuyuan; Tian, Lin

    2017-08-01

    Wigner-Ville distribution (WVD) is an important time-frequency analysis technology with a high energy distribution in seismic signal processing. However, it suffers from interference by many cross terms. To suppress the cross terms of the WVD and keep the concentration of its high energy distribution, an adaptive multi-directional filtering window in the ambiguity domain is proposed. Starting from the relationship between the Cohen distribution and the Gabor transform, the method combines the greedy strategy with the rotational invariance property of the fractional Fourier transform to construct the multi-directional window, which extends the one-dimensional, one-directional, optimal window function of the optimal fractional Gabor transform (OFrGT) to a two-dimensional, multi-directional window in the ambiguity domain. In this way, the multi-directional window matches the main auto terms of the WVD more precisely. Using the greedy strategy, the proposed window takes into account the optimal and other suboptimal directions, which also solves the problem of the OFrGT, called the local concentration phenomenon, when encountering a multi-component signal. Experiments on different types of both the signal models and the real seismic signals reveal that the proposed window can overcome the drawbacks of the WVD and the OFrGT mentioned above. Finally, the proposed method is applied to a seismic signal's spectral decomposition. The results show that the proposed method can explore the space distribution of a reservoir more precisely.

  11. Development of an adjoint sensitivity field-based treatment-planning technique for the use of newly designed directional LDR sources in brachytherapy.

    PubMed

    Chaswal, V; Thomadsen, B R; Henderson, D L

    2012-02-21

    The development and application of an automated 3D greedy heuristic (GH) optimization algorithm utilizing the adjoint sensitivity fields for treatment planning to assess the advantage of directional interstitial prostate brachytherapy is presented. Directional and isotropic dose kernels generated using Monte Carlo simulations based on Best Industries model 2301 I-125 source are utilized for treatment planning. The newly developed GH algorithm is employed for optimization of the treatment plans for seven interstitial prostate brachytherapy cases using mixed sources (directional brachytherapy) and using only isotropic sources (conventional brachytherapy). All treatment plans resulted in V100 > 98% and D90 > 45 Gy for the target prostate region. For the urethra region, the D10(Ur), D90(Ur) and V150(Ur) and for the rectum region the V100cc, D2cc, D90(Re) and V90(Re) all are reduced significantly when mixed sources brachytherapy is used employing directional sources. The simulations demonstrated that the use of directional sources in the low dose-rate (LDR) brachytherapy of the prostate clearly benefits in sparing the urethra and the rectum sensitive structures from overdose. The time taken for a conventional treatment plan is less than three seconds, while the time taken for a mixed source treatment plan is less than nine seconds, as tested on an Intel Core2 Duo 2.2 GHz processor with 1GB RAM. The new 3D GH algorithm is successful in generating a feasible LDR brachytherapy treatment planning solution with an extra degree of freedom, i.e. directionality in very little time.

  12. Development of an adjoint sensitivity field-based treatment-planning technique for the use of newly designed directional LDR sources in brachytherapy

    NASA Astrophysics Data System (ADS)

    Chaswal, V.; Thomadsen, B. R.; Henderson, D. L.

    2012-02-01

    The development and application of an automated 3D greedy heuristic (GH) optimization algorithm utilizing the adjoint sensitivity fields for treatment planning to assess the advantage of directional interstitial prostate brachytherapy is presented. Directional and isotropic dose kernels generated using Monte Carlo simulations based on Best Industries model 2301 I-125 source are utilized for treatment planning. The newly developed GH algorithm is employed for optimization of the treatment plans for seven interstitial prostate brachytherapy cases using mixed sources (directional brachytherapy) and using only isotropic sources (conventional brachytherapy). All treatment plans resulted in V100 > 98% and D90 > 45 Gy for the target prostate region. For the urethra region, the D10Ur, D90Ur and V150Ur and for the rectum region the V100cc, D2cc, D90Re and V90Re all are reduced significantly when mixed sources brachytherapy is used employing directional sources. The simulations demonstrated that the use of directional sources in the low dose-rate (LDR) brachytherapy of the prostate clearly benefits in sparing the urethra and the rectum sensitive structures from overdose. The time taken for a conventional treatment plan is less than three seconds, while the time taken for a mixed source treatment plan is less than nine seconds, as tested on an Intel Core2 Duo 2.2 GHz processor with 1GB RAM. The new 3D GH algorithm is successful in generating a feasible LDR brachytherapy treatment planning solution with an extra degree of freedom, i.e. directionality in very little time.

  13. Minimal-delay traffic grooming for WDM star networks

    NASA Astrophysics Data System (ADS)

    Choi, Hongsik; Garg, Nikhil; Choi, Hyeong-Ah

    2003-10-01

    All-optical networks face the challenge of reducing slower opto-electronic conversions by managing assignment of traffic streams to wavelengths in an intelligent manner, while at the same time utilizing bandwidth resources to the maximum. This challenge becomes harder in networks closer to the end users that have insufficient data to saturate single wavelengths as well as traffic streams outnumbering the usable wavelengths, resulting in traffic grooming which requires costly traffic analysis at access nodes. We study the problem of traffic grooming that reduces the need to analyze traffic, for a class of network architecture most used by Metropolitan Area Networks; the star network. The problem being NP-complete, we provide an efficient twice-optimal-bound greedy heuristic for the same, that can be used to intelligently groom traffic at the LANs to reduce latency at the access nodes. Simulation results show that our greedy heuristic achieves a near-optimal solution.

  14. NITPICK: peak identification for mass spectrometry data

    PubMed Central

    Renard, Bernhard Y; Kirchner, Marc; Steen, Hanno; Steen, Judith AJ; Hamprecht, Fred A

    2008-01-01

    Background The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. Results This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Conclusion Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternative, publicly available, non-greedy feature extraction routine. NITPICK is available as a software package for the R programming language and can be downloaded from . PMID:18755032

  15. Prediction based Greedy Perimeter Stateless Routing Protocol for Vehicular Self-organizing Network

    NASA Astrophysics Data System (ADS)

    Wang, Chunlin; Fan, Quanrun; Chen, Xiaolin; Xu, Wanjin

    2018-03-01

    PGPSR (Prediction-based Greedy Perimeter Stateless Routing) extends the GPSR protocol to adapt to the high-speed mobility of the vehicular self-organizing network (VANET) and the resulting changes in network topology. When GPSR is used in a VANET environment, its packet loss rate and throughput are poor, and it may fail to work at all. To address these problems, the proposed PGPSR routing protocol redefines the hello and query packet structures by adding node speed and direction information. Before the next update is received, this speed and direction information can be used to predict the position of a node and the new network topology, and to select a suitable next hop and path. In addition, outdated node information in the neighbor table is deleted promptly. Simulation experiments show that the performance of PGPSR is better than that of GPSR.
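
    The idea in the abstract above, using the speed and direction carried in hello packets to predict where a neighbor is now and then forwarding greedily, can be sketched as follows. The neighbor-table layout, the straight-line motion model, and all names are assumptions made for illustration; the actual PGPSR packet format and prediction rule are not given in the abstract.

```python
import math


def predict_position(x, y, speed, heading, dt):
    """Dead-reckon a position dt seconds after the last hello packet.

    Assumes straight-line motion; heading is in radians.
    """
    return (x + speed * math.cos(heading) * dt,
            y + speed * math.sin(heading) * dt)


def greedy_next_hop(self_pos, dest, neighbors, now, radio_range):
    """Pick the reachable neighbor whose predicted position is closest to dest.

    neighbors: dict id -> (x, y, speed, heading, t_hello), a hypothetical
    neighbor-table entry. Returns None when no neighbor is closer to the
    destination than this node (where GPSR would switch to perimeter mode).
    """
    best_id, best_dist = None, math.dist(self_pos, dest)
    for node_id, (x, y, speed, heading, t_hello) in neighbors.items():
        pos = predict_position(x, y, speed, heading, now - t_hello)
        if math.dist(self_pos, pos) > radio_range:
            continue  # predicted to have left radio range: skip stale entry
        d = math.dist(pos, dest)
        if d < best_dist:
            best_id, best_dist = node_id, d
    return best_id
```

    In this sketch a fast neighbor that looked ideal at hello time is skipped once its predicted position falls outside radio range, which is the failure case that forwarding on stale positions would miss.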

  16. Craving, longing, denial, and the dangers of change: clinical manifestations of greed.

    PubMed

    Waska, Robert

    2002-08-01

    Greed is the unrelenting and unrealistic search for all the good an object has to offer and, via identification, all the good one can produce and provide. In phantasy, and sometimes in the patient's early developmental environment, the object and the ego demand more from each other than either have to give. Some patients cannot contain their urge to possess all and to be all, so it becomes a part of the interpersonal and psychological relationship with the analyst rather quickly. These patients feel something is owed to them, and they demand to be fed immediately. Other patients try and hide these greedy phantasies by being the opposite of greedy. They strive to be independent and charitable, while having great conflict over deeper desires to be dependent and in possession of an idealized giving object, an all-providing breast. Case material was used to explore these ideas.

  17. No place to hide: when shame causes proselfs to cooperate.

    PubMed

    Declerck, Carolyn Henriette; Boone, Christophe; Kiyonari, Toko

    2014-01-01

    Shame is considered a social emotion with action tendencies that elicit socially beneficial behavior. Yet, unlike other social emotions, prior experimental studies do not indicate that incidental shame boosts prosocial behavior. Based on the affect as information theory, we hypothesize that incidental feelings of shame can increase cooperation, but only for self-interested individuals, and only in a context where shame is relevant with regards to its action tendency. To test this hypothesis, cooperation levels are compared between a simultaneous prisoner's dilemma (where "defect" may result from multiple motives) and a sequential prisoner's dilemma (where "second player defect" is the result of intentional greediness). As hypothesized, shame positively affected proselfs in a sequential prisoner's dilemma. Hence ashamed proselfs become inclined to cooperate when they believe they have no way to hide their greediness, and not necessarily because they want to make up for earlier wrong-doing.

  18. Optimization of rainfall networks using information entropy and temporal variability analysis

    NASA Astrophysics Data System (ADS)

    Wang, Wenqi; Wang, Dong; Singh, Vijay P.; Wang, Yuankun; Wu, Jichun; Wang, Lachun; Zou, Xinqing; Liu, Jiufu; Zou, Ying; He, Ruimin

    2018-04-01

    Rainfall networks are the most direct sources of precipitation data, and their optimization and evaluation are essential. Information entropy can not only represent the uncertainty of rainfall distribution but can also reflect the correlation and information transmission between rainfall stations. Using entropy, this study optimizes rainfall networks of similar size located in two big cities in China, Shanghai (in the Yangtze River basin) and Xi'an (in the Yellow River basin), with respect to temporal variability analysis. Through an easy-to-implement greedy ranking algorithm based on the Maximum Information Minimum Redundancy (MIMR) criterion, stations of the networks in the two areas (each area is further divided into two subareas) are ranked during sliding inter-annual series and under different meteorological conditions. It is found that observation series with different starting days affect the ranking, pointing to temporal variability during network evaluation. We propose a dynamic network evaluation framework for considering temporal variability, which ranks stations under different starting days with a fixed time window (1-year, 2-year, and 5-year). We can therefore identify rainfall stations that are temporarily important or redundant and provide some useful suggestions for decision makers. The proposed framework can serve as a supplement for the primary MIMR optimization approach. In addition, during different periods (wet season or dry season) the optimal network from MIMR exhibits differences in entropy values, and the optimal network from the wet season tended to produce higher entropy values. Differences in the spatial distribution of the optimal networks suggest that optimizing the rainfall network for changing meteorological conditions may be preferable.

  19. Target Control in Logical Models Using the Domain of Influence of Nodes.

    PubMed

    Yang, Gang; Gómez Tejeda Zañudo, Jorge; Albert, Réka

    2018-01-01

    Dynamical models of biomolecular networks are successfully used to understand the mechanisms underlying complex diseases and to design therapeutic strategies. Network control, and its special case of target control, is a promising avenue toward developing disease therapies. In target control it is assumed that a small subset of nodes is most relevant to the system's state and the goal is to drive the target nodes into their desired states. An example of target control would be driving a cell to commit to apoptosis (programmed cell death). From the experimental perspective, gene knockout, pharmacological inhibition of proteins, and providing sustained external signals are among practical intervention techniques. We identify methodologies to use the stabilizing effect of sustained interventions for target control in Boolean network models of biomolecular networks. Specifically, we define the domain of influence (DOI) of a node (in a certain state) to be the nodes (and their corresponding states) that will be ultimately stabilized by the sustained state of this node regardless of the initial state of the system. We also define the related concept of the logical domain of influence (LDOI) of a node, and develop an algorithm for its identification using an auxiliary network that incorporates the regulatory logic. This way a solution to the target control problem is a set of nodes whose DOI can cover the desired target node states. We perform greedy randomized adaptive search in node state space to find such solutions. We apply our strategy to in silico biological network models of real systems to demonstrate its effectiveness.

  20. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB.

    PubMed

    Xu, Qifang; Dunbrack, Roland L

    2012-11-01

    Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM-HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly.
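
    The greedy step with partial overlaps mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: hits are processed from strongest to weakest E-value, and a hit is kept if it overlaps every already accepted hit by at most a fixed number of residues. The overlap limit, the tuple layout, and the family names are hypothetical; the multilevel procedure in the paper (sequence/HMM, HMM-HMM, and structure alignments, plus joining of split alignments) is not reproduced.

```python
def overlap(a, b):
    """Number of residues shared by two 1-based inclusive ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def greedy_assign(hits, max_overlap=10):
    """Greedily accept domain hits, strongest E-value first.

    hits: list of (family, start, end, evalue) tuples.
    max_overlap: assumed tolerance for partial overlaps, in residues.
    Returns the accepted hits ordered by start position.
    """
    accepted = []
    for hit in sorted(hits, key=lambda h: h[3]):
        span = (hit[1], hit[2])
        if all(overlap(span, (a[1], a[2])) <= max_overlap for a in accepted):
            accepted.append(hit)
    return sorted(accepted, key=lambda h: h[1])


# Hypothetical hits on one chain; family names are placeholders.
HITS = [
    ("PF_A", 1, 100, 1e-50),
    ("PF_B", 95, 200, 1e-30),   # overlaps PF_A by 6 residues: kept
    ("PF_C", 50, 150, 1e-10),   # overlaps PF_A by 51 residues: rejected
]
```

    Calling greedy_assign(HITS) keeps PF_A and PF_B and drops PF_C, because the weaker hit conflicts with a stronger one that was already accepted.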

  1. CNN-SVM for Microvascular Morphological Type Recognition with Data Augmentation.

    PubMed

    Xue, Di-Xiu; Zhang, Rong; Feng, Hui; Wang, Ya-Lei

    2016-01-01

    This paper focuses on the problem of feature extraction and the classification of microvascular morphological types to aid esophageal cancer detection. We present a patch-based system with a hybrid SVM model with data augmentation for intraepithelial papillary capillary loop recognition. A greedy patch-generating algorithm and a specialized CNN named NBI-Net are designed to extract hierarchical features from patches. We investigate a series of data augmentation techniques to progressively improve the prediction invariance of image scaling and rotation. For classifier boosting, SVM is used as an alternative to softmax to enhance generalization ability. The effectiveness of CNN feature representation ability is discussed for a set of widely used CNN models, including AlexNet, VGG-16, and GoogLeNet. Experiments are conducted on the NBI-ME dataset. The recognition rate is up to 92.74% on the patch level with data augmentation and classifier boosting. The results show that the combined CNN-SVM model beats models of traditional features with SVM as well as the original CNN with softmax. The synthesis results indicate that our system is able to assist clinical diagnosis to a certain extent.

  2. Short-term scheduling of an open-pit mine with multiple objectives

    NASA Astrophysics Data System (ADS)

    Blom, Michelle; Pearce, Adrian R.; Stuckey, Peter J.

    2017-05-01

    This article presents a novel algorithm for the generation of multiple short-term production schedules for an open-pit mine, in which several objectives, of varying priority, characterize the quality of each solution. A short-term schedule selects regions of a mine site, known as 'blocks', to be extracted in each week of a planning horizon (typically spanning 13 weeks). Existing tools for constructing these schedules use greedy heuristics, with little optimization. To construct a single schedule in which infrastructure is sufficiently utilized, with production grades consistently close to a desired target, a planner must often run these heuristics many times, adjusting parameters after each iteration. A planner's intuition and experience can evaluate the relative quality and mineability of different schedules in a way that is difficult to automate. Of interest to a short-term planner is the generation of multiple schedules, extracting available ore and waste in varying sequences, which can then be manually compared. This article presents a tool in which multiple, diverse, short-term schedules are constructed, meeting a range of common objectives without the need for iterative parameter adjustment.

  3. Automated construction of arterial and venous trees in retinal images

    PubMed Central

    Hu, Qiao; Abràmoff, Michael D.; Garvin, Mona K.

    2015-01-01

    Abstract. While many approaches exist to segment retinal vessels in fundus photographs, only a limited number focus on the construction and disambiguation of arterial and venous trees. Previous approaches are local and/or greedy in nature, making them susceptible to errors or limiting their applicability to large vessels. We propose a more global framework to generate arteriovenous trees in retinal images, given a vessel segmentation. In particular, our approach consists of three stages. The first stage is to generate an overconnected vessel network, named the vessel potential connectivity map (VPCM), consisting of vessel segments and the potential connectivity between them. The second stage is to disambiguate the VPCM into multiple anatomical trees, using a graph-based metaheuristic algorithm. The third stage is to classify these trees into arterial or venous (A/V) trees. We evaluated our approach with a ground truth built based on a public database, showing a pixel-wise classification accuracy of 88.15% using a manual vessel segmentation as input, and 86.11% using an automatic vessel segmentation as input. PMID:26636114

  4. More reliable protein NMR peak assignment via improved 2-interval scheduling.

    PubMed

    Chen, Zhi-Zhong; Lin, Guohui; Rizzi, Romeo; Wen, Jianjun; Xu, Dong; Xu, Ying; Jiang, Tao

    2005-03-01

    Protein NMR peak assignment refers to the process of assigning a group of "spin systems" obtained experimentally to a protein sequence of amino acids. The automation of this process is still an unsolved and challenging problem in NMR protein structure determination. Recently, protein NMR peak assignment has been formulated as an interval scheduling problem (ISP), where a protein sequence P of amino acids is viewed as a discrete time interval I (the amino acids on P one-to-one correspond to the time units of I), each subset S of spin systems that are known to originate from consecutive amino acids from P is viewed as a "job" j(s), the preference of assigning S to a subsequence P' of consecutive amino acids on P is viewed as the profit of executing job j(s) in the subinterval of I corresponding to P', and the goal is to maximize the total profit of executing the jobs (on a single machine) during I. The interval scheduling problem is MAX SNP-hard in general; but in the real practice of protein NMR peak assignment, each job j(s) usually requires at most 10 consecutive time units, and typically the jobs that require one or two consecutive time units are the most difficult to assign/schedule. In order to solve these most difficult assignments, we present an efficient 13/7-approximation algorithm for the special case of the interval scheduling problem where each job takes one or two consecutive time units. Combining this algorithm with a greedy filtering strategy for handling long jobs (i.e., jobs that need more than two consecutive time units), we obtain a new efficient heuristic for protein NMR peak assignment. Our experimental study shows that the new heuristic produces the best peak assignment in most of the cases, compared with the NMR peak assignment algorithms in the recent literature. The above algorithm is also the first approximation algorithm for a nontrivial case of the well-known interval scheduling problem that breaks the ratio 2 barrier.

  5. TH-EF-BRB-05: 4pi Non-Coplanar IMRT Beam Angle Selection by Convex Optimization with Group Sparsity Penalty

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    O’Connor, D; Nguyen, D; Voronenko, Y

    Purpose: Integrated beam orientation and fluence map optimization is expected to be the foundation of robust automated planning but existing heuristic methods do not promise global optimality. We aim to develop a new method for beam angle selection in 4π non-coplanar IMRT systems based on solving (globally) a single convex optimization problem, and to demonstrate the effectiveness of the method by comparison with a state of the art column generation method for 4π beam angle selection. Methods: The beam angle selection problem is formulated as a large scale convex fluence map optimization problem with an additional group sparsity term that encourages most candidate beams to be inactive. The optimization problem is solved using an accelerated first-order method, the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA). The beam angle selection and fluence map optimization algorithm is used to create non-coplanar 4π treatment plans for several cases (including head and neck, lung, and prostate cases) and the resulting treatment plans are compared with 4π treatment plans created using the column generation algorithm. Results: In our experiments the treatment plans created using the group sparsity method meet or exceed the dosimetric quality of plans created using the column generation algorithm, which was shown superior to clinical plans. Moreover, the group sparsity approach converges in about 3 minutes in these cases, as compared with runtimes of a few hours for the column generation method. Conclusion: This work demonstrates the first non-greedy approach to non-coplanar beam angle selection, based on convex optimization, for 4π IMRT systems. The method given here improves both treatment plan quality and runtime as compared with a state of the art column generation algorithm. When the group sparsity term is set to zero, we obtain an excellent method for fluence map optimization, useful when beam angles have already been selected.
NIH R43CA183390, NIH R01CA188300, Varian Medical Systems; Part of this research took place while D. O’Connor was a summer intern at RefleXion Medical.
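
    To illustrate how a group sparsity term can switch whole candidate beams off, the sketch below implements group soft-thresholding, the proximal step that FISTA-type methods apply for a sum-of-Euclidean-norms (group lasso) penalty. The abstract does not state the exact form of the penalty, so the choice of this penalty, the function name, and the toy numbers are assumptions; each inner list stands for the fluence values of one candidate beam.

```python
import math


def group_soft_threshold(groups, threshold):
    """Proximal operator of threshold * sum_g ||x_g||_2 (assumed penalty form).

    Each group is shrunk toward zero as a block; a group whose Euclidean
    norm does not exceed the threshold is set to zero entirely, which is
    how a candidate beam becomes inactive.
    """
    result = []
    for g in groups:
        norm = math.sqrt(sum(v * v for v in g))
        scale = max(0.0, 1.0 - threshold / norm) if norm > 0.0 else 0.0
        result.append([scale * v for v in g])
    return result


# Toy example: one strong beam and one weak beam (values are made up).
SHRUNK = group_soft_threshold([[3.0, 4.0], [0.3, 0.4]], threshold=1.0)
```

    In this toy case the strong group (norm 5) is scaled by 0.8, while the weak group (norm 0.5) falls below the threshold and is zeroed as a whole rather than entry by entry.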

  6. Two-agent cooperative search using game models with endurance-time constraints

    NASA Astrophysics Data System (ADS)

    Sujit, P. B.; Ghose, Debasish

    2010-07-01

    In this article, the problem of two Unmanned Aerial Vehicles (UAVs) cooperatively searching an unknown region is addressed. The search region is discretized into hexagonal cells and each cell is assumed to possess an uncertainty value. The UAVs have to cooperatively search these cells taking limited endurance, sensor and communication range constraints into account. Due to limited endurance, the UAVs need to return to the base station for refuelling and also need to select a base station when multiple base stations are present. This article proposes a route planning algorithm that takes endurance time constraints into account and uses game theoretical strategies to reduce the uncertainty. The route planning algorithm selects only those cells that ensure the agent will return to any one of the available bases. A set of paths are formed using these cells which the game theoretical strategies use to select a path that yields maximum uncertainty reduction. We explore non-cooperative Nash, cooperative and security strategies from game theory to enhance the search effectiveness. Monte-Carlo simulations are carried out which show the superiority of the game theoretical strategies over greedy strategy for different look ahead step length paths. Within the game theoretical strategies, non-cooperative Nash and cooperative strategy perform similarly in an ideal case, but Nash strategy performs better than the cooperative strategy when the perceived information is different. We also propose a heuristic based on partitioning of the search space into sectors to reduce computational overhead without performance degradation.

  7. The Inventor-Investor Conundrum

    ERIC Educational Resources Information Center

    Hobbs, Francis

    2006-01-01

    The complexities of developing a business based on a novel product may appear insurmountable. Stereotypical convention suggests that there are two major players: polarized inventors and "greedy" investors. Surely there is a way of aligning the inventor-investor relationship into something positive for both parties? In this paper Francis…

  8. An Unexamined Translation of Plutarch: "Libro contre la cobdicia delas riquezas" ("The Book against the Greediness of the Rich," Valladolid, 1538)

    ERIC Educational Resources Information Center

    Beardsley, Theodore S., Jr.

    1973-01-01

    Special issue as a tribute to Dr. Arnold Reichenberger, well-known Hispanist, who has served as chairman of the Department of Romance Languages at Pennsylvania State University, University Park, Pennsylvania. (DS)

  9. A multimetric, map-aware routing protocol for VANETs in urban areas.

    PubMed

    Tripp-Barba, Carolina; Urquiza-Aguiar, Luis; Aguilar Igartua, Mónica; Rebollo-Monedero, David; de la Cruz Llopis, Luis J; Mezher, Ahmad Mohamad; Aguilar-Calderón, José Alfonso

    2014-01-28

    In recent years, the general interest in routing for vehicular ad hoc networks (VANETs) has increased notably. Many proposals have been presented to improve the behavior of the routing decisions in these very changeable networks. In this paper, we propose a new routing protocol for VANETs that uses four different metrics, namely the distance to destination, the vehicles' density, the vehicles' trajectory and the available bandwidth, making use of the information retrieved by the sensors of the vehicle, in order to make forwarding decisions, minimizing packet losses and packet delay. Through simulation, we compare our proposal to other protocols, such as AODV (Ad hoc On-Demand Distance Vector), GPSR (Greedy Perimeter Stateless Routing), I-GPSR (Improvement GPSR) and to our previous proposal, GBSR-B (Greedy Buffer Stateless Routing Building-aware). Besides, we present a performance evaluation of the individual importance of each metric to make forwarding decisions. Experimental results show that our proposed forwarding decision outperforms existing solutions in terms of packet delivery.

  10. NITPICK: peak identification for mass spectrometry data.

    PubMed

    Renard, Bernhard Y; Kirchner, Marc; Steen, Hanno; Steen, Judith A J; Hamprecht, Fred A

    2008-08-28

    The reliable extraction of features from mass spectra is a fundamental step in the automated analysis of proteomic mass spectrometry (MS) experiments. This contribution proposes a sparse template regression approach to peak picking called NITPICK. NITPICK is a Non-greedy, Iterative Template-based peak PICKer that deconvolves complex overlapping isotope distributions in multicomponent mass spectra. NITPICK is based on fractional averagine, a novel extension to Senko's well-known averagine model, and on a modified version of sparse, non-negative least angle regression, for which a suitable, statistically motivated early stopping criterion has been derived. The strength of NITPICK is the deconvolution of overlapping mixture mass spectra. Extensive comparative evaluation has been carried out and results are provided for simulated and real-world data sets. NITPICK outperforms pepex, to date the only alternate, publicly available, non-greedy feature extraction routine. NITPICK is available as a software package for the R programming language and can be downloaded from (http://hci.iwr.uni-heidelberg.de/mip/proteomics/).

  11. A greedy-navigator approach to navigable city plans

    NASA Astrophysics Data System (ADS)

    Lee, Sang Hoon; Holme, Petter

    2013-01-01

    We use a set of four theoretical navigability indices for street maps to investigate the shape of the resulting street networks, if they are grown by optimizing these indices. The indices compare the performance of simulated navigators (having partial information about the surroundings, like humans in many real situations) to the performance of optimally navigating individuals. We show that our simple greedy shortcut construction strategy generates emergent structures that differ from real road networks, but are not inconceivable. The resulting city plans, for all navigation indices, share common qualitative properties such as the tendency for triangular blocks to appear, while the more quantitative features, such as degree distributions and clustering, are characteristically different depending on the type of metrics and routing strategies. We show that it is the type of metrics used which determines the overall shapes characterized by structural heterogeneity, but the routing schemes contribute to more subtle details of locality, which is more pronounced in the case of unrestricted connections, when edge crossing is allowed.

  12. Small-Tip-Angle Spokes Pulse Design Using Interleaved Greedy and Local Optimization Methods

    PubMed Central

    Grissom, William A.; Khalighi, Mohammad-Mehdi; Sacolick, Laura I.; Rutt, Brian K.; Vogel, Mika W.

    2013-01-01

    Current spokes pulse design methods can be grouped into methods based either on sparse approximation or on iterative local (gradient descent-based) optimization of the transverse-plane spatial frequency locations visited by the spokes. These two classes of methods have complementary strengths and weaknesses: sparse approximation-based methods perform an efficient search over a large swath of candidate spatial frequency locations but most are incompatible with off-resonance compensation, multifrequency designs, and target phase relaxation, while local methods can accommodate off-resonance and target phase relaxation but are sensitive to initialization and suboptimal local cost function minima. This article introduces a method that interleaves local iterations, which optimize the radiofrequency pulses, target phase patterns, and spatial frequency locations, with a greedy method to choose new locations. Simulations and experiments at 3 and 7 T show that the method consistently produces single- and multifrequency spokes pulses with lower flip angle inhomogeneity compared to current methods. PMID:22392822

  13. A protein interaction network analysis for yeast integral membrane protein.

    PubMed

    Shi, Ming-Guang; Huang, De-Shuang; Li, Xue-Ling

    2008-01-01

    Although the yeast Saccharomyces cerevisiae is the best exemplified single-celled eukaryote, the vast number of protein-protein interactions of integral membrane proteins of Saccharomyces cerevisiae have not been characterized by experiments. Here, based on the kernel method of Greedy Kernel Principal Component analysis plus Linear Discriminant Analysis, we identify 300 protein-protein interactions involving 189 membrane proteins and get the outcome of a highly connected protein-protein interactions network. Furthermore, we study the global topological features of integral membrane proteins network of Saccharomyces cerevisiae. These results give the comprehensive description of protein-protein interactions of integral membrane proteins and reveal global topological and robustness of the interactome network at a system level. This work represents an important step towards a comprehensive understanding of yeast protein interactions.

  14. Ranking influential spreaders is an ill-defined problem

    NASA Astrophysics Data System (ADS)

    Gu, Jain; Lee, Sungmin; Saramäki, Jari; Holme, Petter

    2017-06-01

    Finding influential spreaders of information and disease in networks is an important theoretical problem, and one of considerable recent interest. It has been almost exclusively formulated as a node-ranking problem: methods for identifying influential spreaders output a ranking of the nodes. In this work, we show that such a greedy heuristic does not necessarily work: the set of most influential nodes depends on the number of nodes in the set. Therefore, the set of n most important nodes to vaccinate does not need to have any node in common with the set of n + 1 most important nodes. We propose a method for quantifying the extent and impact of this phenomenon. By this method, we show that it is a common phenomenon in both empirical and model networks.
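
    The non-nesting effect described in this abstract can be reproduced on a toy graph. The sketch below replaces the spreading and vaccination dynamics studied in the paper with a much simpler proxy, one-hop coverage (the number of nodes that are in the chosen set or adjacent to it). The graph, the node names, and the proxy itself are assumptions made for illustration only.

```python
from itertools import combinations

# Hypothetical 9-node graph: a middle node M and two hubs H1, H2.
EDGES = [
    ("M", "a1"), ("M", "a2"), ("M", "b1"), ("M", "b2"),
    ("H1", "a1"), ("H1", "a2"), ("H1", "a3"),
    ("H2", "b1"), ("H2", "b2"), ("H2", "b3"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def coverage(adj, seeds):
    """One-hop coverage: nodes in the seed set or adjacent to it (proxy only)."""
    covered = set(seeds)
    for s in seeds:
        covered |= adj[s]
    return len(covered)


def best_set(adj, k):
    """Exhaustively find a size-k set with maximum coverage."""
    return max(combinations(sorted(adj), k), key=lambda c: coverage(adj, c))


def greedy_set(adj, k):
    """Build a size-k set by repeatedly adding the node with the best gain."""
    chosen = []
    for _ in range(k):
        candidates = [n for n in sorted(adj) if n not in chosen]
        chosen.append(max(candidates, key=lambda n: coverage(adj, chosen + [n])))
    return chosen


ADJ = build_adjacency(EDGES)
```

    On this graph the best single node is M (coverage 5), while the best pair is H1 and H2 (coverage 8), so the two optimal sets share no node. A ranking-based greedy choice starts with M and reaches a coverage of only 7 with two nodes.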

  15. Modulation Depth Estimation and Variable Selection in State-Space Models for Neural Interfaces

    PubMed Central

    Hochberg, Leigh R.; Donoghue, John P.; Brown, Emery N.

    2015-01-01

    Rapid developments in neural interface technology are making it possible to record increasingly large signal sets of neural activity. Various factors such as asymmetrical information distribution and across-channel redundancy may, however, limit the benefit of high-dimensional signal sets, and the increased computational complexity may not yield corresponding improvement in system performance. High-dimensional system models may also lead to overfitting and lack of generalizability. To address these issues, we present a generalized modulation depth measure using the state-space framework that quantifies the tuning of a neural signal channel to relevant behavioral covariates. For a dynamical system, we develop computationally efficient procedures for estimating modulation depth from multivariate data. We show that this measure can be used to rank neural signals and select an optimal channel subset for inclusion in the neural decoding algorithm. We present a scheme for choosing the optimal subset based on model order selection criteria. We apply this method to neuronal ensemble spike-rate decoding in neural interfaces, using our framework to relate motor cortical activity with intended movement kinematics. With offline analysis of intracortical motor imagery data obtained from individuals with tetraplegia using the BrainGate neural interface, we demonstrate that our variable selection scheme is useful for identifying and ranking the most information-rich neural signals. We demonstrate that our approach offers several orders of magnitude lower complexity but virtually identical decoding performance compared to greedy search and other selection schemes. Our statistical analysis shows that the modulation depth of human motor cortical single-unit signals is well characterized by the generalized Pareto distribution. Our variable selection scheme has wide applicability in problems involving multisensor signal modeling and estimation in biomedical engineering systems. 
PMID:25265627

  16. Assignment of protein sequences to existing domain and family classification systems: Pfam and the PDB

    PubMed Central

    Dunbrack, Roland L.

    2012-01-01

    Motivation: Automating the assignment of existing domain and protein family classifications to new sets of sequences is an important task. Current methods often miss assignments because remote relationships fail to achieve statistical significance. Some assignments are not as long as the actual domain definitions because local alignment methods often cut alignments short. Long insertions in query sequences often erroneously result in two copies of the domain assigned to the query. Divergent repeat sequences in proteins are often missed. Results: We have developed a multilevel procedure to produce nearly complete assignments of protein families of an existing classification system to a large set of sequences. We apply this to the task of assigning Pfam domains to sequences and structures in the Protein Data Bank (PDB). We found that HHsearch alignments frequently scored more remotely related Pfams in Pfam clans higher than closely related Pfams, thus, leading to erroneous assignment at the Pfam family level. A greedy algorithm allowing for partial overlaps was, thus, applied first to sequence/HMM alignments, then HMM–HMM alignments and then structure alignments, taking care to join partial alignments split by large insertions into single-domain assignments. Additional assignment of repeat Pfams with weaker E-values was allowed after stronger assignments of the repeat HMM. Our database of assignments, presented in a database called PDBfam, contains Pfams for 99.4% of chains >50 residues. Availability: The Pfam assignment data in PDBfam are available at http://dunbrack2.fccc.edu/ProtCid/PDBfam, which can be searched by PDB codes and Pfam identifiers. They will be updated regularly. Contact: Roland.Dunbracks@fccc.edu PMID:22942020

  17. Efficient Optimization of Stimuli for Model-Based Design of Experiments to Resolve Dynamical Uncertainty

    PubMed Central

    Mdluli, Thembi; Buzzard, Gregery T.; Rundell, Ann E.

    2015-01-01

    This model-based design of experiments (MBDOE) method determines the input magnitudes of the experimental stimuli to apply and the associated measurements that should be taken to optimally constrain the uncertain dynamics of a biological system under study. The ideal global solution for this experiment design problem is generally computationally intractable because of parametric uncertainties in the mathematical model of the biological system. Others have addressed this issue by limiting the solution to a local estimate of the model parameters. Here we present an approach that is independent of the local parameter constraint. This approach is made computationally efficient and tractable by the use of: (1) sparse grid interpolation that approximates the biological system dynamics, (2) representative parameters that uniformly represent the data-consistent dynamical space, and (3) probability weights of the represented experimentally distinguishable dynamics. Our approach identifies data-consistent representative parameters using sparse grid interpolants, constructs the optimal input sequence from a greedy search, and defines the associated optimal measurements using a scenario tree. We explore the optimality of this MBDOE algorithm using a 3-dimensional Hes1 model and a 19-dimensional T-cell receptor model. The 19-dimensional T-cell model also demonstrates the MBDOE algorithm’s scalability to higher dimensions. In both cases, the dynamical uncertainty region that bounds the trajectories of the target system states was reduced by as much as 86% and 99% respectively after completing the designed experiments in silico. Our results suggest that for resolving dynamical uncertainty, the ability to design an input sequence paired with its associated measurements is particularly important when limited by the number of measurements. PMID:26379275

  18. Experiential learning for education on Earth Sciences

    NASA Astrophysics Data System (ADS)

    Marsili, Antonella; D'Addezio, Giuliana; Todaro, Riccardo; Scipilliti, Francesca

    2015-04-01

    The Laboratorio Divulgazione Scientifica e Attività Museali of the Istituto Nazionale di Geofisica e Vulcanologia (INGV's Laboratory for Outreach and Museum Activities) in Rome organizes intense educational and outreach activities every year to convey scientific knowledge and to promote research on Earth Science, focusing on volcanic and seismic hazard. Focusing on kids, we designed and implemented the "greedy laboratory for children curious on science (Laboratorio goloso per bambini curiosi di scienza)", to intrigue children from primary schools and to attract their interest by addressing in a fun and unusual way topics regarding the Earth, seismicity and seismic risk. We performed the "greedy laboratory" using experiential teaching, an innovative method envisaging the use and handling of commonly used substances. In particular, in the "greedy laboratory" we proposed the use of everyday elements, such as food, to engage, entertain and convey notions concerning Earth processes in a simple and interesting way. We proposed the initiative to the public during the "European Researchers Night" in Rome, on September 26, 2014. Children attending the "greedy laboratory", guided by researchers and technicians, had the opportunity to become familiar with scientific concepts, such as the composition of the Earth, plate tectonics, earthquake generation, the propagation of seismic waves and their shaking effects on the anthropogenic environment. During the hands-on laboratory, each child used harmless substances such as honey, chocolate, flour, barley, boiled eggs and biscuits. At the end, we administered a questionnaire rating the proposed activities, first evaluating the level of general satisfaction with the laboratory and then the various activities into which it was divided. This survey supplied our team with feedback, revealing some valuable hints on appreciation and margins of improvement.
We provided a semi-quantitative assessment with a questionnaire focused on appreciation and on emotional and cognitive learning, and aimed at testing the issues we addressed when we built the activity. The questionnaire was set up in a semi-structured way, leaving only a few questions open. One hundred boys and girls attended the laboratory, seventy-one of whom completed the questionnaire. As a general result, we registered a very high level of satisfaction and interest. We analyzed the questionnaires, using first the variables "age" and "gender". Children 5 to 11 years old completed the questionnaire; about 72% were girls. This experiential teaching for primary schools intrigues and involves children using the methodology of "learning by doing". Our experience demonstrates that this teaching approach may represent a successful and effective method to transfer useful information about geo-hazards, strengthening the culture of prevention.

  19. Optimization methods for decision making in disease prevention and epidemic control.

    PubMed

    Deng, Yan; Shen, Siqian; Vorobeychik, Yevgeniy

    2013-11-01

    This paper investigates problems of disease prevention and epidemic control (DPEC), in which we optimize two sets of decisions: (i) vaccinating individuals and (ii) closing locations, given respective budgets with the goal of minimizing the expected number of infected individuals after intervention. The spread of diseases is inherently stochastic due to the uncertainty about disease transmission and human interaction. We use a bipartite graph to represent individuals' propensities of visiting a set of locations, and formulate two integer nonlinear programming models to optimize choices of individuals to vaccinate and locations to close. Our first model assumes that if a location is closed, its visitors stay in a safe location and will not visit other locations. Our second model incorporates compensatory behavior by assuming multiple behavioral groups, always visiting the most preferred locations that remain open. The paper develops algorithms based on a greedy strategy, dynamic programming, and integer programming, and compares the computational efficacy and solution quality. We test problem instances derived from daily behavior patterns of 100 randomly chosen individuals (corresponding to 195 locations) in Portland, Oregon, and provide policy insights regarding the use of the two DPEC models. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Cyber War Game in Temporal Networks

    PubMed Central

    Cho, Jin-Hee; Gao, Jianxi

    2016-01-01

    In a cyber war game where a network is fully distributed and characterized by resource constraints and high dynamics, attackers or defenders often face a situation that may require optimal strategies to win the game with minimum effort. Given the system goal states of attackers and defenders, we study what strategies attackers or defenders can take to reach their respective system goal state (i.e., winning system state) with minimum resource consumption. However, due to the dynamics of a network caused by a node’s mobility, failure or its resource depletion over time or action(s), this optimization problem becomes NP-complete. We propose two heuristic strategies in a greedy manner based on a node’s two characteristics: resource level and influence based on k-hop reachability. We analyze complexity and optimality of each algorithm compared to optimal solutions for a small-scale static network. Further, we conduct a comprehensive experimental study for a large-scale temporal network to investigate best strategies, given a different environmental setting of network temporality and density. We demonstrate the performance of each strategy under various scenarios of attacker/defender strategies in terms of win probability, resource consumption, and system vulnerability. PMID:26859840
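    The "influence based on k-hop reachability" criterion mentioned above can be sketched as follows. This is only an illustration on a static snapshot of a network, not the authors' algorithm: the adjacency dictionary, the function names k_hop_reach and greedy_pick, and the tie-breaking rule are all assumptions made for the example, and the temporal dynamics of the paper are ignored.

```python
from collections import deque


def k_hop_reach(adj, start, k):
    """Count nodes reachable from `start` in at most k hops (breadth-first search)."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand beyond k hops
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return len(seen) - 1  # exclude the start node itself


def greedy_pick(adj, k):
    """Greedy choice: the node with the largest k-hop reach (ties go to the smallest name)."""
    return max(sorted(adj), key=lambda n: k_hop_reach(adj, n, k))


# Hypothetical five-node network, given as an undirected adjacency list.
adj = {"a": ["b"], "b": ["a", "c", "d"], "c": ["b"], "d": ["b", "e"], "e": ["d"]}
target = greedy_pick(adj, k=1)
```

    A resource-level heuristic would follow the same pattern, with the key function replaced by a per-node resource value.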

  1. Retro-regression--another important multivariate regression improvement.

    PubMed

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of a greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  2. Policy oscillation is overshooting.

    PubMed

    Wagner, Paul

    2014-04-01

    A majority of approximate dynamic programming approaches to the reinforcement learning problem can be categorized into greedy value function methods and value-based policy gradient methods. The former approach, although fast, is well known to be susceptible to the policy oscillation phenomenon. We take a fresh view to this phenomenon by casting, within the context of non-optimistic policy iteration, a considerable subset of the former approach as a limiting special case of the latter. We explain the phenomenon in terms of this view and illustrate the underlying mechanism with artificial examples. We also use it to derive the constrained natural actor-critic algorithm that can interpolate between the aforementioned approaches. In addition, it has been suggested in the literature that the oscillation phenomenon might be subtly connected to the grossly suboptimal performance in the Tetris benchmark problem of all attempted approximate dynamic programming methods. Based on empirical findings, we offer a hypothesis that might explain the inferior performance levels and the associated policy degradation phenomenon, and which would partially support the suggested connection. Finally, we report scores in the Tetris problem that improve on existing dynamic programming based results by an order of magnitude. Copyright © 2014 Elsevier Ltd. All rights reserved.

  3. Kernel methods and flexible inference for complex stochastic dynamics

    NASA Astrophysics Data System (ADS)

    Capobianco, Enrico

    2008-07-01

    Approximation theory suggests that series expansions and projections represent standard tools for random process applications from both numerical and statistical standpoints. Such instruments emphasize the role of both sparsity and smoothness for compression purposes, the decorrelation power achieved in the expansion coefficients space compared to the signal space, and the reproducing kernel property when some special conditions are met. We consider these three aspects central to the discussion in this paper, and attempt to analyze the characteristics of some known approximation instruments employed in a complex application domain such as financial market time series. Volatility models are often built ad hoc, parametrically and through very sophisticated methodologies. But they can hardly deal with stochastic processes with regard to non-Gaussianity, covariance non-stationarity or complex dependence without paying a big price in terms of either model mis-specification or computational efficiency. It is thus a good idea to look at other more flexible inference tools; hence the strategy of combining greedy approximation and space dimensionality reduction techniques, which are less dependent on distributional assumptions and more targeted to achieve computationally efficient performances. Advantages and limitations of their use will be evaluated by looking at algorithmic and model building strategies, and by reporting statistical diagnostics.

  4. Maximization of the Supportable Number of Sensors in QoS-Aware Cluster-Based Underwater Acoustic Sensor Networks

    PubMed Central

    Nguyen, Thi-Tham; Van Le, Duc; Yoon, Seokhoon

    2014-01-01

    This paper proposes a practical low-complexity MAC (medium access control) scheme for quality of service (QoS)-aware and cluster-based underwater acoustic sensor networks (UASN), in which the provision of differentiated QoS is required. In such a network, underwater sensors (U-sensor) in a cluster are divided into several classes, each of which has a different QoS requirement. The major problem considered in this paper is the maximization of the number of nodes that a cluster can accommodate while still providing the required QoS for each class in terms of the PDR (packet delivery ratio). In order to address the problem, we first estimate the packet delivery probability (PDP) and use it to formulate an optimization problem to determine the optimal value of the maximum packet retransmissions for each QoS class. The custom greedy and interior-point algorithms are used to find the optimal solutions, which are verified by extensive simulations. The simulation results show that, by solving the proposed optimization problem, the supportable number of underwater sensor nodes can be maximized while satisfying the QoS requirements for each class. PMID:24608009

  5. Maximization of the supportable number of sensors in QoS-aware cluster-based underwater acoustic sensor networks.

    PubMed

    Nguyen, Thi-Tham; Le, Duc Van; Yoon, Seokhoon

    2014-03-07

    This paper proposes a practical low-complexity MAC (medium access control) scheme for quality of service (QoS)-aware and cluster-based underwater acoustic sensor networks (UASN), in which the provision of differentiated QoS is required. In such a network, underwater sensors (U-sensor) in a cluster are divided into several classes, each of which has a different QoS requirement. The major problem considered in this paper is the maximization of the number of nodes that a cluster can accommodate while still providing the required QoS for each class in terms of the PDR (packet delivery ratio). In order to address the problem, we first estimate the packet delivery probability (PDP) and use it to formulate an optimization problem to determine the optimal value of the maximum packet retransmissions for each QoS class. The custom greedy and interior-point algorithms are used to find the optimal solutions, which are verified by extensive simulations. The simulation results show that, by solving the proposed optimization problem, the supportable number of underwater sensor nodes can be maximized while satisfying the QoS requirements for each class.

  6. Intellectual and Moral Differences among Today's College Students

    ERIC Educational Resources Information Center

    Sokolov, A. V.

    2006-01-01

    Post-Soviet young people are said to be "scornful of ordinary, diligent labor, greedy for easy wealth, and massively antipatriotic." Social scientist A. S. Panarin observes that the demoralization and disorientation of the younger generation are not subject to doubt. Proceeding on the assumption that having a trusting and frank dialogue…

  7. Gaining Insights into Children's Geometric Knowledge

    ERIC Educational Resources Information Center

    Mack, Nancy K.

    2007-01-01

    This article describes how research on children's geometric thinking was used in conjunction with the picture book "The Greedy Triangle" to gain valuable insights into children's prior geometric knowledge of polygons. Exercises focused on the names, visual appearance, and properties of polygons, as well as real-world connections for each, are…

  8. A "Mixed" Strategy for Collaborative Group Formation and Its Learning Outcomes

    ERIC Educational Resources Information Center

    Acharya, Anal; Sinha, Devadatta

    2018-01-01

    This study uses homogeneity in personal learning styles and heterogeneity in subject knowledge for collaborative learning group decomposition indicating that groups are "mixed" in nature. Homogeneity within groups was formed using K-means clustering and greedy search, whereas heterogeneity imbibed using agenda-driven search. For checking…

  9. Teaching with Children's Books: The "Wow" Factor

    ERIC Educational Resources Information Center

    Von Drasek, Lisa

    2006-01-01

    No classroom teacher needs convincing of the benefits of using children's picture books in his or her math program. As Marilyn Burns, the creator and founder of Math Solutions Professional Development, and the author of "The Greedy Triangle" (Scholastic, 1996), says, "Evidence shows that teaching math through children's books motivates children to…

  10. The President as Public Intellectual

    ERIC Educational Resources Information Center

    Ungar, Sanford J.

    2006-01-01

    As likely as not, college and university presidents are in the news now for rather more uncomfortable reasons--for investigations into their seemingly greedy and extravagant ways, for compromising circumstances involving big-time athletic teams and corrupt coaches, for personal scandals, or for attempts to discuss pseudo-academic issues that veer…

  11. Political Science Careers at Comprehensive Universities: Building Balanced Careers at "Greedy" Institutions

    ERIC Educational Resources Information Center

    Hendrickson, Ryan C.; Mueller, Melinda A.; Strand, Jonathan R.

    2011-01-01

    A considerable amount of research exists about political science careers at community colleges and liberal arts institutions, as well as about training and hiring practices across different types of institutions. However, there is virtually no commentary available on political science careers at comprehensive institutions, where a significant…

  12. Critical Features of Fragment Libraries for Protein Structure Prediction

    PubMed Central

    dos Santos, Karina Baptista

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction. PMID:28085928

  13. Critical Features of Fragment Libraries for Protein Structure Prediction.

    PubMed

    Trevizani, Raphael; Custódio, Fábio Lima; Dos Santos, Karina Baptista; Dardenne, Laurent Emmanuel

    2017-01-01

    The use of fragment libraries is a popular approach among protein structure prediction methods and has proven to substantially improve the quality of predicted structures. However, some vital aspects of a fragment library that influence the accuracy of modeling a native structure remain to be determined. This study investigates some of these features. Particularly, we analyze the effect of using secondary structure prediction guiding fragments selection, different fragments sizes and the effect of structural clustering of fragments within libraries. To have a clearer view of how these factors affect protein structure prediction, we isolated the process of model building by fragment assembly from some common limitations associated with prediction methods, e.g., imprecise energy functions and optimization algorithms, by employing an exact structure-based objective function under a greedy algorithm. Our results indicate that shorter fragments reproduce the native structure more accurately than the longer. Libraries composed of multiple fragment lengths generate even better structures, where longer fragments show to be more useful at the beginning of the simulations. The use of many different fragment sizes shows little improvement when compared to predictions carried out with libraries that comprise only three different fragment sizes. Models obtained from libraries built using only sequence similarity are, on average, better than those built with a secondary structure prediction bias. However, we found that the use of secondary structure prediction allows greater reduction of the search space, which is invaluable for prediction methods. The results of this study can be critical guidelines for the use of fragment libraries in protein structure prediction.

  14. Methodologic considerations in the design and analysis of nested case-control studies: association between cytokines and postoperative delirium.

    PubMed

    Ngo, Long H; Inouye, Sharon K; Jones, Richard N; Travison, Thomas G; Libermann, Towia A; Dillon, Simon T; Kuchel, George A; Vasunilashorn, Sarinnapha M; Alsop, David C; Marcantonio, Edward R

    2017-06-06

    The nested case-control study (NCC) design within a prospective cohort study is used when outcome data are available for all subjects, but the exposure of interest has not been collected, and is difficult or prohibitively expensive to obtain for all subjects. A NCC analysis with good matching procedures yields estimates that are as efficient and unbiased as estimates from the full cohort study. We present methodological considerations in a matched NCC design and analysis, which include the choice of match algorithms, analysis methods to evaluate the association of exposures of interest with outcomes, and consideration of overmatching. Matched, NCC design within a longitudinal observational prospective cohort study in the setting of two academic hospitals. Study participants are patients aged over 70 years who underwent scheduled major non-cardiac surgery. The primary outcome was postoperative delirium from in-hospital interviews and medical record review. The main exposure was IL-6 concentration (pg/ml) from blood sampled at three time points before delirium occurred. We used a nonparametric signed-rank test to test for the median of the paired differences. We used conditional logistic regression to model the risk of IL-6 on delirium incidence. Simulation was used to generate a sample of cohort data on which unconditional multivariable logistic regression was used, and the results were compared to those of the conditional logistic regression. Partial R-square was used to assess the level of overmatching. We found that the optimal match algorithm yielded more matched pairs than the greedy algorithm. The choice of analytic strategy (whether to consider measured cytokine levels as the predictor or the outcome) yielded inferences that have different clinical interpretations but similar levels of statistical significance.
Estimation results from NCC design using conditional logistic regression, and from simulated cohort design using unconditional logistic regression, were similar. We found minimal evidence for overmatching. Using a matched NCC approach introduces methodological challenges into the study design and data analysis. Nonetheless, with careful selection of the match algorithm, match factors, and analysis methods, this design is cost effective and, for our study, yields estimates that are similar to those from a prospective cohort study design.
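    The finding that optimal matching can yield more pairs than greedy matching can be illustrated with a small sketch. The ages, the caliper and the function names below are hypothetical and are not taken from the study. Greedy matching is taken here to mean nearest available control within a caliper, and optimal matching is approximated by maximum-cardinality bipartite matching with augmenting paths, which is one common way to maximize the number of pairs; the study's actual match algorithms may differ.

```python
def greedy_match(cases, controls, caliper):
    """Match each case, in order, to the nearest unused control within the caliper."""
    used = set()
    pairs = []
    for i, c in enumerate(cases):
        best = None
        for j, k in enumerate(controls):
            if j in used or abs(c - k) > caliper:
                continue
            if best is None or abs(c - k) < abs(c - controls[best]):
                best = j
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    return pairs


def optimal_match(cases, controls, caliper):
    """Maximum-cardinality matching via augmenting paths (Kuhn's algorithm)."""
    owner = {}  # control index -> case index currently holding it

    def try_assign(i, seen):
        for j, k in enumerate(controls):
            if abs(cases[i] - k) > caliper or j in seen:
                continue
            seen.add(j)
            # Take a free control, or re-route the case that currently holds it.
            if j not in owner or try_assign(owner[j], seen):
                owner[j] = i
                return True
        return False

    for i in range(len(cases)):
        try_assign(i, set())
    return sorted((i, j) for j, i in owner.items())


cases = [70, 72]     # hypothetical matching variable (e.g. age) for cases
controls = [71, 69]  # hypothetical values for candidate controls
greedy_pairs = greedy_match(cases, controls, caliper=1)
optimal_pairs = optimal_match(cases, controls, caliper=1)
```

    In this toy example the greedy pass gives the first case the control that the second case also needs, leaving the second case unmatched, whereas the augmenting-path search re-routes the first case and matches both.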

  15. MO-FG-CAMPUS-TeP2-05: Optimizing Stereotactic Radiosurgery Treatment of Multiple Brain Metastasis Lesions with Individualized Rotational Arc Trajectories

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dong, P; Xing, L; Ma, L

    Purpose: Radiosurgery of multiple (n>4) brain metastasis lesions requires 3–4 noncoplanar VMAT arcs with excessively high monitor units and long delivery time. We investigated whether an improved optimization technique would decrease the needed arc numbers and increase the delivery efficiency, while improving or maintaining the plan quality. Methods: The proposed 4pi arc space optimization algorithm consists of two steps: automatic couch angle selection followed by aperture generation for each arc with optimized control points distribution. We use a greedy algorithm to select the couch angles. Starting from a single coplanar arc plan, we search through the candidate noncoplanar arcs to pick a single noncoplanar arc that will bring the best plan quality when added into the existing treatment plan. Each time, only one additional noncoplanar arc is considered, making the calculation time tractable. This process repeats itself until the desired number of arcs is reached. The technique is first evaluated in a coplanar arc delivery scheme with test cases and then applied to noncoplanar treatments of a case with 12 brain metastasis lesions. Results: Clinically acceptable plans are created within minutes. For the coplanar test cases, the algorithm yields single-arc plans with better dose distributions than those of two-arc VMAT, simultaneously with a 12–17% reduction in the delivery time and a 14–21% reduction in MUs. For the treatment of 12 brain mets, while Paddick conformity indexes of the two plans were comparable, the SCG-optimization with 2 arcs (1 noncoplanar and 1 coplanar) significantly improved on the conventional VMAT with 3 arcs (2 noncoplanar and 1 coplanar). Specifically, V16, V10 and V5 of the brain were reduced by 11%, 11% and 12%, respectively. The beam delivery time was shortened by approximately 30%.
Conclusion: The proposed 4pi arc space optimization technique promises to significantly reduce the brain toxicity while greatly improving the treatment efficiency.
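    The greedy couch angle selection described in the Methods can be sketched as a forward-selection loop. This is a minimal sketch under stated assumptions, not the authors' implementation: greedy_select_arcs and toy_cost are hypothetical names, and toy_cost is a stand-in that merely rewards well-separated couch angles, whereas a real planner would score each trial plan by its dose distribution.

```python
def greedy_select_arcs(candidates, plan_cost, initial_plan, n_arcs):
    """Add one arc at a time, always the one giving the lowest-cost plan."""
    plan = list(initial_plan)
    remaining = [a for a in candidates if a not in plan]
    while len(plan) < n_arcs and remaining:
        # Evaluate every candidate as a single addition to the current plan.
        best = min(remaining, key=lambda a: plan_cost(plan + [a]))
        plan.append(best)
        remaining.remove(best)
    return plan


def toy_cost(plan):
    """Hypothetical cost: lower when couch angles are farther apart (not a dose metric)."""
    gaps = [abs(a - b) for i, a in enumerate(plan) for b in plan[i + 1:]]
    return -min(gaps) if gaps else 0.0


# Couch angles in degrees; the plan starts from a single coplanar arc at 0.
plan = greedy_select_arcs([0, 10, 45, 90], toy_cost, initial_plan=[0], n_arcs=3)
```

    Because only one additional arc is evaluated per step, the number of cost evaluations grows roughly with the number of candidates times the number of arcs, rather than with the number of arc combinations.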

  16. Adaptive radial basis function mesh deformation using data reduction

    NASA Astrophysics Data System (ADS)

    Gillebaart, T.; Blom, D. S.; van Zuijlen, A. H.; Bijl, H.

    2016-09-01

    Radial Basis Function (RBF) mesh deformation is one of the most robust mesh deformation methods available. Using the greedy (data reduction) method in combination with an explicit boundary correction results in an efficient method, as shown in the literature. However, to ensure the method remains robust, two issues are addressed: 1) how to ensure that the set of control points remains an accurate representation of the geometry in time and 2) how to use/automate the explicit boundary correction, while ensuring a high mesh quality. In this paper, we propose an adaptive RBF mesh deformation method, which ensures the set of control points always represents the geometry/displacement up to a certain (user-specified) criterion, by keeping track of the boundary error throughout the simulation and re-selecting when needed. As opposed to the unit displacement and prescribed displacement selection methods, the adaptive method is more robust, user-independent and efficient, for the cases considered. Secondly, the analysis of a single high aspect ratio cell is used to formulate an equation for the correction radius needed, depending on the characteristics of the correction function used, maximum aspect ratio, minimum first cell height and boundary error. Based on the analysis, two new radial basis correction functions are derived and proposed. This proposed automated procedure is verified while varying the correction function, Reynolds number (and thus first cell height and aspect ratio) and boundary error. Finally, the parallel efficiency is studied for the two adaptive methods, unit displacement and prescribed displacement for both the CPU as well as the memory formulation with a 2D oscillating and translating airfoil with oscillating flap, a 3D flexible locally deforming tube and deforming wind turbine blade.
Generally, the memory formulation requires less work (due to the large amount of work required for evaluating RBFs), but the parallel efficiency reduces due to the limited bandwidth available between CPU and memory. In terms of parallel efficiency/scaling the different studied methods perform similarly, with the greedy algorithm being the bottleneck. In terms of absolute computational work the adaptive methods are better for the cases studied due to their more efficient selection of the control points. By automating most of the RBF mesh deformation, a robust, efficient and almost user-independent mesh deformation method is presented.

  17. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

    PubMed Central

    Schmidhuber, Jürgen

    2013-01-01

    Most of computer science focuses on automatically solving given computational problems. I focus on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion. Consider the infinite set of all computable descriptions of tasks with possibly computable solutions. Given a general problem-solving architecture, at any given time, the novel algorithmic framework PowerPlay (Schmidhuber, 2011) searches the space of possible pairs of new tasks and modifications of the current problem solver, until it finds a more powerful problem solver that provably solves all previously learned tasks plus the new one, while the unmodified predecessor does not. Newly invented tasks may require to achieve a wow-effect by making previously learned skills more efficient such that they require less time and space. New skills may (partially) re-use previously learned skills. The greedy search of typical PowerPlay variants uses time-optimal program search to order candidate pairs of tasks and solver modifications by their conditional computational (time and space) complexity, given the stored experience so far. The new task and its corresponding task-solving skill are those first found and validated. This biases the search toward pairs that can be described compactly and validated quickly. The computational costs of validating new tasks need not grow with task repertoire size. Standard problem solver architectures of personal computers or neural networks tend to generalize by solving numerous tasks outside the self-invented training set; PowerPlay’s ongoing search for novelty keeps breaking the generalization abilities of its present solver. This is related to Gödel’s sequence of increasingly powerful formal theories based on adding formerly unprovable statements to the axioms without affecting previously provable theorems. 
The continually increasing repertoire of problem-solving procedures can be exploited by a parallel search for solutions to additional externally posed tasks. PowerPlay may be viewed as a greedy but practical implementation of basic principles of creativity (Schmidhuber, 2006a, 2010). A first experimental analysis can be found in separate papers (Srivastava et al., 2012a,b, 2013). PMID:23761771

  18. Three Do's and Three Don'ts for Expert Witnesses.

    ERIC Educational Resources Information Center

    Oates, R. Kim

    1993-01-01

    Guidelines are offered for child protection workers who are appearing in court as expert witnesses. Guidelines include be objective, be accurate, stick to the area of expertise, don't get manipulated by lawyers, don't be greedy, and maintain one's expert witness work as a minor part of one's professional activities. (JDD)

  19. Report on the Black Hills Alliance.

    ERIC Educational Resources Information Center

    Ryan, Joe

    1979-01-01

    A rally to save the Black Hills from coal- and uranium-greedy energy companies was held on July 6 and over 2,000 joined in a 15-mile walk on July 7 in Rapid City, South Dakota. The Black Hills Alliance, an Indian coalition concerned about energy development proposals in the Great Plains, sponsored the gathering. (NQ)

  20. An Analysis of the Motivations of Oregon's Ranchers to Diversify into Agritourism

    Treesearch

    Fernanda de Vasconcellos Pêgas; Joanne F. Tynon

    2004-01-01

    Cattle ranches are unique American cultural icons. Unfortunately, ranching is also associated by some with the exploitation of natural resources and labeled an environmentally destructive activity motivated by greedy and neglectful livestock operators (Jacobs, 1991; Wuerthner, 1990). Some believe that livestock ranching is a major contributor to unsustainable land use...

  1. A Real-Time Greedy-Index Dispatching Policy for using PEVs to Provide Frequency Regulation Service

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ke, Xinda; Wu, Di; Lu, Ning

    This article presents a real-time greedy-index dispatching policy (GIDP) for using plug-in electric vehicles (PEVs) to provide frequency regulation services. A new service cost allocation mechanism is proposed to award PEVs based on the amount of service they provided, while considering compensations for delayed-charging and reduction of battery lifetime due to participation of the service. The GIDP transforms the optimal dispatch problem from a high-dimensional space into a one-dimensional space while preserving the solution optimality. When solving the transformed problem in real-time, the global optimality of the GIDP solution can be guaranteed by mathematically proved “indexability”. Because the GIDP index can be calculated upon the PEV’s arrival and used for the entire decision making process till its departure, the computational burden is minimized and the complexity of the aggregator dispatch process is significantly reduced. Finally, simulation results are used to evaluate the proposed GIDP, and to demonstrate the potential profitability from providing frequency regulation service by using PEVs.

  2. A Greedy Scanning Data Collection Strategy for Large-Scale Wireless Sensor Networks with a Mobile Sink.

    PubMed

    Zhu, Chuan; Zhang, Sai; Han, Guangjie; Jiang, Jinfang; Rodrigues, Joel J P C

    2016-09-06

    Mobile sink is widely used for data collection in wireless sensor networks. It can avoid 'hot spot' problems but energy consumption caused by multihop transmission is still inefficient in real-time application scenarios. In this paper, a greedy scanning data collection strategy (GSDCS) is proposed, and we focus on how to reduce routing energy consumption by shortening the total length of routing paths. We propose that the mobile sink adjusts its trajectory dynamically according to changes in the network, instead of following a predetermined trajectory or random walk. Next, the mobile sink determines which area has more source nodes, then it moves toward this area. The benefit of GSDCS is that most source nodes no longer need to upload sensory data over long distances. Especially in event-driven application scenarios, when the event area changes, the mobile sink could arrive at the new event area where most source nodes are currently located. Hence energy can be saved. Analytical and simulation results show that compared with existing work, our GSDCS has a better performance in specific application scenarios.

  3. A Real-Time Greedy-Index Dispatching Policy for using PEVs to Provide Frequency Regulation Service

    DOE PAGES

    Ke, Xinda; Wu, Di; Lu, Ning

    2017-09-18

    This article presents a real-time greedy-index dispatching policy (GIDP) for using plug-in electric vehicles (PEVs) to provide frequency regulation services. A new service cost allocation mechanism is proposed to award PEVs based on the amount of service they provided, while considering compensations for delayed-charging and reduction of battery lifetime due to participation of the service. The GIDP transforms the optimal dispatch problem from a high-dimensional space into a one-dimensional space while preserving the solution optimality. When solving the transformed problem in real-time, the global optimality of the GIDP solution can be guaranteed by mathematically proved “indexability”. Because the GIDP index can be calculated upon the PEV’s arrival and used for the entire decision making process till its departure, the computational burden is minimized and the complexity of the aggregator dispatch process is significantly reduced. Finally, simulation results are used to evaluate the proposed GIDP, and to demonstrate the potential profitability from providing frequency regulation service by using PEVs.

  4. A Greedy Scanning Data Collection Strategy for Large-Scale Wireless Sensor Networks with a Mobile Sink

    PubMed Central

    Zhu, Chuan; Zhang, Sai; Han, Guangjie; Jiang, Jinfang; Rodrigues, Joel J. P. C.

    2016-01-01

    Mobile sink is widely used for data collection in wireless sensor networks. It can avoid ‘hot spot’ problems but energy consumption caused by multihop transmission is still inefficient in real-time application scenarios. In this paper, a greedy scanning data collection strategy (GSDCS) is proposed, and we focus on how to reduce routing energy consumption by shortening the total length of routing paths. We propose that the mobile sink adjusts its trajectory dynamically according to changes in the network, instead of following a predetermined trajectory or random walk. Next, the mobile sink determines which area has more source nodes, then it moves toward this area. The benefit of GSDCS is that most source nodes no longer need to upload sensory data over long distances. Especially in event-driven application scenarios, when the event area changes, the mobile sink could arrive at the new event area where most source nodes are currently located. Hence energy can be saved. Analytical and simulation results show that compared with existing work, our GSDCS has a better performance in specific application scenarios. PMID:27608022

  5. Ant system: optimization by a colony of cooperating agents.

    PubMed

    Dorigo, M; Maniezzo, V; Colorni, A

    1996-01-01

    An analogy with the way ant colonies function has suggested the definition of a new computational paradigm, which we call ant system (AS). We propose it as a viable new approach to stochastic combinatorial optimization. The main characteristics of this model are positive feedback, distributed computation, and the use of a constructive greedy heuristic. Positive feedback accounts for rapid discovery of good solutions, distributed computation avoids premature convergence, and the greedy heuristic helps find acceptable solutions in the early stages of the search process. We apply the proposed methodology to the classical traveling salesman problem (TSP), and report simulation results. We also discuss parameter selection and the early setups of the model, and compare it with tabu search and simulated annealing using TSP. To demonstrate the robustness of the approach, we show how the ant system (AS) can be applied to other optimization problems like the asymmetric traveling salesman, the quadratic assignment and the job-shop scheduling. Finally we discuss the salient characteristics of the AS: global data structure revision, distributed communication, and probabilistic transitions.
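
The combination of probabilistic transitions and a greedy heuristic can be illustrated with the transition rule commonly given for Ant System, where the probability of moving to city j is proportional to pheromone^alpha times (1/distance)^beta. The sketch below is a minimal illustration and not the authors' implementation; the function name, the list-of-lists layout, and the default values of alpha and beta are assumptions made for the example.

```python
def transition_probabilities(current, unvisited, pheromone, distance,
                             alpha=1.0, beta=2.0):
    """Return {city: probability} for the next move of one ant.

    pheromone[i][j] and distance[i][j] are square matrices (lists of lists).
    alpha weights the pheromone trail (positive feedback); beta weights the
    greedy "visibility" term 1/distance. Both defaults are illustrative.
    """
    weights = {}
    for j in unvisited:
        visibility = 1.0 / distance[current][j]      # greedy preference for near cities
        weights[j] = (pheromone[current][j] ** alpha) * (visibility ** beta)
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


# Three cities, uniform pheromone: only the greedy term distinguishes the moves.
distance = [[0, 1, 2],
            [1, 0, 1],
            [2, 1, 0]]
pheromone = [[1.0] * 3 for _ in range(3)]
probs = transition_probabilities(0, [1, 2], pheromone, distance)
```

With uniform pheromone the nearer city is favored purely by the greedy term; as pheromone accumulates on good tours, the first factor increasingly shapes the choice.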

  6. Inferring Alcoholism SNPs and Regulatory Chemical Compounds Based on Ensemble Bayesian Network.

    PubMed

    Chen, Huan; Sun, Jiatong; Jiang, Hong; Wang, Xianyue; Wu, Lingxiang; Wu, Wei; Wang, Qh

    2017-01-01

    The disturbance of consciousness is one of the most common symptoms of those who have alcoholism and may cause disability and mortality. Previous studies indicated that several single nucleotide polymorphisms (SNPs) increase the susceptibility to alcoholism. In this study, we utilized the Ensemble Bayesian Network (EBN) method to identify causal SNPs of alcoholism based on the verified GAW14 data. We built a Bayesian network combining random process and greedy search by using the Genetic Analysis Workshop 14 (GAW14) dataset to establish an EBN of SNPs. Then we predicted the association between SNPs and alcoholism by determining Bayes' prior probability. Thirteen out of eighteen SNPs directly connected with alcoholism were found to be concordant with potential risk regions of alcoholism in the OMIM database. As many SNPs were found to contribute to alterations in gene expression, known as expression quantitative trait loci (eQTLs), we further sought to identify chemical compounds acting as regulators of alcoholism genes captured by causal SNPs. Chloroprene and valproic acid were identified as the expression regulators for genes C11orf66 and SALL3, respectively, which were captured by alcoholism SNPs. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.

  7. Proceedings of the Second NASA Formal Methods Symposium

    NASA Technical Reports Server (NTRS)

    Munoz, Cesar (Editor)

    2010-01-01

    This publication contains the proceedings of the Second NASA Formal Methods Symposium sponsored by the National Aeronautics and Space Administration and held in Washington D.C. April 13-15, 2010. Topics covered include: Decision Engines for Software Analysis using Satisfiability Modulo Theories Solvers; Verification and Validation of Flight-Critical Systems; Formal Methods at Intel -- An Overview; Automatic Review of Abstract State Machines by Meta Property Verification; Hardware-independent Proofs of Numerical Programs; Slice-based Formal Specification Measures -- Mapping Coupling and Cohesion Measures to Formal Z; How Formal Methods Impels Discovery: A Short History of an Air Traffic Management Project; A Machine-Checked Proof of A State-Space Construction Algorithm; Automated Assume-Guarantee Reasoning for Omega-Regular Systems and Specifications; Modeling Regular Replacement for String Constraint Solving; Using Integer Clocks to Verify the Timing-Sync Sensor Network Protocol; Can Regulatory Bodies Expect Efficient Help from Formal Methods?; Synthesis of Greedy Algorithms Using Dominance Relations; A New Method for Incremental Testing of Finite State Machines; Verification of Faulty Message Passing Systems with Continuous State Space in PVS; Phase Two Feasibility Study for Software Safety Requirements Analysis Using Model Checking; A Prototype Embedding of Bluespec System Verilog in the PVS Theorem Prover; SimCheck: An Expressive Type System for Simulink; Coverage Metrics for Requirements-Based Testing: Evaluation of Effectiveness; Software Model Checking of ARINC-653 Flight Code with MCP; Evaluation of a Guideline by Formal Modelling of Cruise Control System in Event-B; Formal Verification of Large Software Systems; Symbolic Computation of Strongly Connected Components Using Saturation; Towards the Formal Verification of a Distributed Real-Time Automotive System; Slicing AADL Specifications for Model Checking; Model Checking with Edge-valued Decision Diagrams; and Data-flow based Model Analysis.

  8. A mathematical framework for the selection of an optimal set of peptides for epitope-based vaccines.

    PubMed

    Toussaint, Nora C; Dönnes, Pierre; Kohlbacher, Oliver

    2008-12-01

    Epitope-based vaccines (EVs) have a wide range of applications: from therapeutic to prophylactic approaches, from infectious diseases to cancer. The development of an EV is based on the knowledge of target-specific antigens from which immunogenic peptides, so-called epitopes, are derived. Such epitopes form the key components of the EV. Due to regulatory, economic, and practical concerns the number of epitopes that can be included in an EV is limited. Furthermore, as the major histocompatibility complex (MHC) binding these epitopes is highly polymorphic, every patient possesses a set of MHC class I and class II molecules of differing specificities. A peptide combination effective for one person can thus be completely ineffective for another. This renders the optimal selection of these epitopes an important and interesting optimization problem. In this work we present a mathematical framework based on integer linear programming (ILP) that allows the formulation of various flavors of the vaccine design problem and the efficient identification of optimal sets of epitopes. Out of a user-defined set of predicted or experimentally determined epitopes, the framework selects the set with the maximum likelihood of eliciting a broad and potent immune response. Our ILP approach allows an elegant and flexible formulation of numerous variants of the EV design problem. In order to demonstrate this, we show how common immunological requirements for a good EV (e.g., coverage of epitopes from each antigen, coverage of all MHC alleles in a set, or avoidance of epitopes with high mutation rates) can be translated into constraints or modifications of the objective function within the ILP framework. An implementation of the algorithm outperforms a simple greedy strategy as well as a previously suggested evolutionary algorithm and has runtimes on the order of seconds for typical problem sizes.

  9. NOBLE - Flexible concept recognition for large-scale biomedical natural language processing.

    PubMed

    Tseytlin, Eugene; Mitchell, Kevin; Legowski, Elizabeth; Corrigan, Julia; Chavan, Girish; Jacobson, Rebecca S

    2016-01-14

    Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and clinical corpus. NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator. We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems. NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.

  10. Automatic Classification of volcano-seismic events based on Deep Neural Networks.

    NASA Astrophysics Data System (ADS)

    Titos Luzón, M.; Bueno Rodriguez, A.; Garcia Martinez, L.; Benitez, C.; Ibáñez, J. M.

    2017-12-01

    Seismic monitoring of active volcanoes is a popular remote sensing technique to detect seismic activity, often associated with energy exchanges between the volcano and the environment. As a result, seismographs register a wide range of volcano-seismic signals that reflect the nature and underlying physics of volcanic processes. Machine learning and signal processing techniques provide an appropriate framework to analyze such data. In this research, we propose a new classification framework for seismic events based on deep neural networks. Deep neural networks are composed of multiple processing layers, and can discover intrinsic patterns from the data itself. Internal parameters can be initialized using a greedy unsupervised pre-training stage, leading to an efficient training of fully connected architectures. We aim to determine the robustness of these architectures as classifiers of seven different types of seismic events recorded at "Volcán de Fuego" (Colima, Mexico). Two deep neural networks with different pre-training strategies are studied: stacked denoising autoencoder and deep belief networks. Results are compared to existing machine learning algorithms (SVM, Random Forest, Multilayer Perceptron). We used 5 LPC coefficients over three non-overlapping segments as training features in order to characterize temporal evolution, avoid redundancy and encode the signal, regardless of its duration. Experimental results show that deep architectures can classify seismic events with higher accuracy than classical algorithms, attaining up to 92% recognition accuracy. Pre-training initialization helps these models to detect events that occur simultaneously in time (such as explosions and rockfalls), increase robustness against noisy inputs, and provide better generalization. These results demonstrate that deep neural networks are robust classifiers, and can be deployed in real environments to monitor the seismicity of restless volcanoes.

  11. Identifying influential spreaders in complex networks through local effective spreading paths

    NASA Astrophysics Data System (ADS)

    Wang, Xiaojie; Zhang, Xue; Yi, Dongyun; Zhao, Chengli

    2017-05-01

    How to effectively identify a set of influential spreaders in complex networks is of great theoretical and practical value, which can help to inhibit the rapid spread of epidemics, promote the sales of products by word-of-mouth advertising, and so on. A naive strategy is to select the top ranked nodes as identified by some centrality indices, and other strategies are mainly based on greedy methods and heuristic methods. However, most of those approaches do not consider the connections between nodes. Usually, the selected spreaders are very close to one another, leading to a serious overlapping of their influence. As a consequence, the global influence of the spreaders in networks will be greatly reduced, which largely restricts the performance of those methods. In this paper, a simple and efficient method is proposed to identify a set of discrete yet influential spreaders. By analyzing the spreading paths in the network, we present the concept of effective spreading paths and measure the influence of nodes via expectation calculation. Numerical analyses in undirected and directed networks all show that our proposed method outperforms many other centrality-based and heuristic benchmarks, especially in large-scale networks. Besides, experimental results on different spreading models and parameters demonstrate the stability and wide applicability of our method.
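
The overlap problem described in this abstract can be seen on a toy graph. The sketch below contrasts the naive top-k-by-degree selection with a simple greedy coverage heuristic that picks, at each step, the node adding the most not-yet-covered nodes. It is an illustration only and does not implement the effective-spreading-paths method of the paper; the function names, the one-hop notion of "influence", and the example graph are all assumptions.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def top_k_by_degree(adj, k):
    """Naive strategy: the k nodes of highest degree (ties broken by node id)."""
    return sorted(adj, key=lambda v: (-len(adj[v]), v))[:k]


def greedy_cover(adj, k):
    """Greedy strategy: repeatedly add the node with the largest marginal coverage.

    Coverage here is just the node plus its direct neighbors, a deliberately
    crude stand-in for spreading influence.
    """
    covered, chosen = set(), []
    for _ in range(k):
        candidates = [v for v in sorted(adj) if v not in chosen]
        best = max(candidates, key=lambda v: len(({v} | adj[v]) - covered))
        chosen.append(best)
        covered |= {best} | adj[best]
    return chosen


def coverage(adj, seeds):
    """Number of nodes that are seeds or adjacent to a seed."""
    reached = set(seeds)
    for s in seeds:
        reached |= adj[s]
    return len(reached)


# Two adjacent hubs (0 and 1) share all their neighbors; hub 5 sits elsewhere.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 2), (1, 3), (1, 4),
         (5, 6), (5, 7), (5, 8)]
adj = build_adjacency(edges)
naive = top_k_by_degree(adj, 2)
greedy = greedy_cover(adj, 2)
```

On this graph the two highest-degree nodes are neighbors with identical neighborhoods, so the naive pair reaches far fewer nodes than the greedy pair, which spreads its seeds across the two clusters.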

  12. Dynamic minimum set problem for reserve design: Heuristic solutions for large problems

    PubMed Central

    Sabbadin, Régis; Johnson, Fred A.; Stith, Bradley

    2018-01-01

    Conversion of wild habitats to human dominated landscape is a major cause of biodiversity loss. An approach to mitigate the impact of habitat loss consists of designating reserves where habitat is preserved and managed. Determining the most valuable areas to preserve in a landscape is called the reserve design problem. There exist several possible formulations of the reserve design problem, depending on the objectives and the constraints. In this article, we considered the dynamic problem of designing a reserve that contains a desired area of several key habitats. The dynamic case implies that the reserve cannot be designed in one time step, due to budget constraints, and that habitats can be lost before they are reserved, due for example to climate change or human development. We proposed two heuristic strategies that can be used to select sites to reserve each year for large reserve design problems. The first heuristic is a combination of the Marxan and site-ordering algorithms and the second heuristic is an augmented version of the common naive myopic heuristic. We evaluated the strategies on several simulated examples and showed that the augmented greedy heuristic is particularly interesting when some of the habitats to protect are particularly threatened and/or the compactness of the network is accounted for. PMID:29543830

  13. GFam: a platform for automatic annotation of gene families.

    PubMed

    Sasidharan, Rajkumar; Nepusz, Tamás; Swarbreck, David; Huala, Eva; Paccanaro, Alberto

    2012-10-01

    We have developed GFam, a platform for automatic annotation of gene/protein families. GFam provides a framework for genome initiatives and model organism resources to build domain-based families, derive meaningful functional labels and offers a seamless approach to propagate functional annotation across periodic genome updates. GFam is a hybrid approach that uses a greedy algorithm to chain component domains from InterPro annotation provided by its 12 member resources followed by a sequence-based connected component analysis of un-annotated sequence regions to derive consensus domain architecture for each sequence and subsequently generate families based on common architectures. Our integrated approach increases sequence coverage by 7.2 percentage points and residue coverage by 14.6 percentage points relative to the best single-constituent database within InterPro for the proteome of Arabidopsis. The true power of GFam lies in maximizing annotation provided by the different InterPro data sources that offer resource-specific coverage for different regions of a sequence. GFam's capability to capture higher sequence and residue coverage can be useful for genome annotation, comparative genomics and functional studies. GFam is general-purpose software and can be used for any collection of protein sequences. The software is open source and can be obtained from http://www.paccanarolab.org/software/gfam/.

  14. Cooperative Opportunistic Pressure Based Routing for Underwater Wireless Sensor Networks.

    PubMed

    Javaid, Nadeem; Muhammad; Sher, Arshad; Abdul, Wadood; Niaz, Iftikhar Azim; Almogren, Ahmad; Alamri, Atif

    2017-03-19

    In this paper, three opportunistic pressure based routing techniques for underwater wireless sensor networks (UWSNs) are proposed. The first one is the cooperative opportunistic pressure based routing protocol (Co-Hydrocast), the second technique is the improved Hydrocast (improved-Hydrocast), and the third one is the cooperative improved Hydrocast (Co-improved Hydrocast). In order to minimize lengthy routing paths between the source and the destination and to avoid void holes in sparse networks, sensor nodes are deployed at different strategic locations. The deployment of sensor nodes at strategic locations assures the maximum monitoring of the network field. To conserve energy and minimize the number of hops, a greedy algorithm is used to transmit data packets from the source to the destination. Moreover, opportunistic routing is also exploited to avoid void regions by making backward transmissions to find a reliable path towards the destination in the network. The relay cooperation mechanism is used for reliable data packet delivery: when the signal to noise ratio (SNR) of the received signal is not within the predefined threshold, maximal ratio combining (MRC) is used as a diversity technique to improve the SNR of the received signals at the destination. Extensive simulations validate that our schemes perform better in terms of packet delivery ratio and energy consumption than the existing technique, Hydrocast.

  15. Causal gene identification using combinatorial V-structure search.

    PubMed

    Cai, Ruichu; Zhang, Zhenjie; Hao, Zhifeng

    2013-07-01

    With the advances of biomedical techniques in the last decade, the costs of human genomic sequencing and genomic activity monitoring are coming down rapidly. To support the huge genome-based business in the near future, researchers are eager to find killer applications based on human genome information. Causal gene identification is one of the most promising applications, which may help potential patients to estimate the risk of certain genetic diseases and locate the target gene for further genetic therapy. Unfortunately, existing pattern recognition techniques, such as Bayesian networks, cannot be directly applied to find the accurate causal relationship between genes and diseases. This is mainly due to the insufficient number of samples and the extremely high dimensionality of the gene space. In this paper, we present the first practical solution to causal gene identification, utilizing a new combinatorial formulation over V-Structures commonly used in conventional Bayesian networks, by exploring the combinations of significant V-Structures. We prove the NP-hardness of the combinatorial search problem under general settings of the significance measure on the V-Structures, and present a greedy algorithm to find sub-optimal results. Extensive experiments show that our proposal is both scalable and effective, particularly with interesting findings on the causal genes over real human genome data. Copyright © 2013 Elsevier Ltd. All rights reserved.

  16. Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.

    PubMed

    Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka

    2014-02-01

    In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping needs typically to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM-algorithm with a greedy search. This is demonstrated to be faster and yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain.

  17. Colored Traveling Salesman Problem.

    PubMed

    Li, Jun; Zhou, MengChu; Sun, Qirui; Dai, Xianzhong; Yu, Xiaolong

    2015-11-01

    The multiple traveling salesman problem (MTSP) is an important combinatorial optimization problem. It has been widely and successfully applied to the practical cases in which multiple traveling individuals (salesmen) share the common workspace (city set). However, it cannot represent some application problems where multiple traveling individuals not only have their own exclusive tasks but also share a group of tasks with each other. This work proposes a new MTSP called colored traveling salesman problem (CTSP) for handling such cases. Two types of city groups are defined, i.e., each group of exclusive cities of a single color for a salesman to visit and a group of shared cities of multiple colors allowing all salesmen to visit. Evidence shows that CTSP is NP-hard and that a multidepot MTSP and multiple single traveling salesman problems are its special cases. We present a genetic algorithm (GA) with dual-chromosome coding for CTSP and analyze the corresponding solution space. Then, GA is improved by incorporating greedy, hill-climbing (HC), and simulated annealing (SA) operations to achieve better performance. By experiments, the limitation of the exact solution method is revealed and the performance of the presented GAs is compared. The results suggest that SAGA can achieve the best quality of solutions and that HCGA should be the choice for a good tradeoff between solution quality and computing time.

  18. Cooperative Opportunistic Pressure Based Routing for Underwater Wireless Sensor Networks

    PubMed Central

    Javaid, Nadeem; Muhammad; Sher, Arshad; Abdul, Wadood; Niaz, Iftikhar Azim; Almogren, Ahmad; Alamri, Atif

    2017-01-01

    In this paper, three opportunistic pressure based routing techniques for underwater wireless sensor networks (UWSNs) are proposed. The first one is the cooperative opportunistic pressure based routing protocol (Co-Hydrocast), the second technique is the improved Hydrocast (improved-Hydrocast), and the third one is the cooperative improved Hydrocast (Co-improved Hydrocast). In order to minimize lengthy routing paths between the source and the destination and to avoid void holes in sparse networks, sensor nodes are deployed at different strategic locations. The deployment of sensor nodes at strategic locations assures the maximum monitoring of the network field. To conserve energy and minimize the number of hops, a greedy algorithm is used to transmit data packets from the source to the destination. Moreover, opportunistic routing is also exploited to avoid void regions by making backward transmissions to find a reliable path towards the destination in the network. The relay cooperation mechanism is used for reliable data packet delivery: when the signal to noise ratio (SNR) of the received signal is not within the predefined threshold, maximal ratio combining (MRC) is used as a diversity technique to improve the SNR of the received signals at the destination. Extensive simulations validate that our schemes perform better in terms of packet delivery ratio and energy consumption than the existing technique, Hydrocast. PMID:28335494

  19. Designing Robust and Resilient Tactical MANETs

    DTIC Science & Technology

    2014-09-25

    Bounds on the Throughput Efficiency of Greedy Maximal Scheduling in Wireless Networks, IEEE/ACM Transactions on Networking, (06 2011): 0. doi: N... Wireless Sensor Networks and Effects of Long Range Dependant Data, Special IWSM Issue of Sequential Analysis, (11 2012): 0. doi: A. D. Dominguez...Bushnell, R. Poovendran. A Convex Optimization Approach for Clone Detection in Wireless Sensor Networks, Pervasive and Mobile Computing, (01 2012

  20. Optimal Achievable Encoding for Brain Machine Interface

    DTIC Science & Technology

    2017-12-22

    dictionary-based encoding approach to translate a visual image into sequential patterns of electrical stimulation in real time, in a manner that...including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed, and completing and...networks, and by applying linear decoding to complete recorded populations of retinal ganglion cells for the first time. Third, we developed a greedy

  1. Differential discounting and present impact of past information.

    PubMed

    Brandimarte, Laura; Vosgerau, Joachim; Acquisti, Alessandro

    2018-01-01

    How does information about a person's past, accessed now, affect individuals' impressions of that person? In 2 survey experiments and 2 experiments with actual incentives, we compare whether, when evaluating a person, information about that person's past greedy or immoral behaviors is discounted similarly to information about her past generous or moral behaviors. We find that, no matter how far in the past a person behaved greedily or immorally, information about her negative behaviors is hardly discounted at all. In contrast, information about her past positive behaviors is discounted heavily: recent behaviors are much more influential than behaviors that occurred a long time ago. The lesser discounting of information about immoral and greedy behaviors is not caused by these behaviors being more influential, memorable, extreme, or attention-grabbing; rather, they are perceived as more diagnostic of a person's character than past moral or generous behaviors. The phenomenon of differential discounting of past information has particular relevance in the digital age, where information about people's past is easily retrieved. Our findings have significant implications for theories of impression formation and social information processing. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  2. Dispositional greed.

    PubMed

    Seuntjens, Terri G; Zeelenberg, Marcel; van de Ven, Niels; Breugelmans, Seger M

    2015-06-01

    Greed is an important motive: it is seen as both productive (a source of ambition; the motor of the economy) and destructive (undermining social relationships; the cause of the late 2000s financial crisis). However, relatively little is known about what greed is and does. This article reports on 5 studies that develop and test the 7-item Dispositional Greed Scale (DGS). Study 1 (including 4 separate samples from 2 different countries, total N = 6092) provides evidence for the construct and discriminant validity of the DGS in terms of positive correlations with maximization, self-interest, envy, materialism, and impulsiveness, and negative correlations with self-control and life satisfaction. Study 2 (N = 290) presents further evidence for discriminant validity, finding that the DGS predicts greedy behavioral tendencies over and above materialism. Furthermore, the DGS predicts economic behavior: greedy people allocate more money to themselves in dictator games (Study 3, N = 300) and ultimatum games (Study 4, N = 603), and take more in a resource dilemma (Study 5, N = 305). These findings shed light on what greed is and does, how people differ in greed, and how greed can be measured. In addition, they show the importance of greed in economic behavior and provide directions for future studies. (c) 2015 APA, all rights reserved.

  3. Starvation dynamics of a greedy forager

    NASA Astrophysics Data System (ADS)

    Bhat, U.; Redner, S.; Bénichou, O.

    2017-07-01

    We investigate the dynamics of a greedy forager that moves by random walking in an environment where each site initially contains one unit of food. Upon encountering a food-containing site, the forager eats all the food there and can subsequently hop an additional S steps without food before starving to death. Upon encountering an empty site, the forager goes hungry and comes one time unit closer to starvation. We investigate the new feature of forager greed; if the forager has a choice between hopping to an empty site or to a food-containing site in its nearest neighborhood, it hops preferentially towards food. If the neighboring sites all contain food or are all empty, the forager hops equiprobably to one of these neighbors. Paradoxically, the lifetime of the forager can depend non-monotonically on greed, and the sense of the non-monotonicity is opposite in one and two dimensions. Even more unexpectedly, the forager lifetime in one dimension is substantially enhanced when the greed is negative; here the forager tends to avoid food in its local neighborhood. We also determine the average amount of food consumed at the instant when the forager starves. We present analytic, heuristic, and numerical results to elucidate these intriguing phenomena.
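The hopping rule described in this abstract can be illustrated with a small one-dimensional simulation. This is only a minimal sketch under stated assumptions: the function name `forager_lifetime`, the greed parameter in [-1, 1] mapped to a probability (1 + greed) / 2 of stepping towards food, the forager starting on an already-eaten origin site, and the `max_steps` cap are all illustrative choices, not details confirmed by the abstract.

```python
import random


def forager_lifetime(S, greed, rng, max_steps=100_000):
    """Simulate one greedy forager on a 1D lattice.

    Assumptions (not from the abstract): greed lies in [-1, 1] and the
    probability of stepping towards food, when exactly one neighbor has
    food, is (1 + greed) / 2. The origin counts as already eaten.
    Returns (steps survived, number of sites eaten including the origin).
    """
    eaten = {0}   # sites whose food has been consumed
    pos = 0
    hunger = 0    # consecutive steps taken without finding food
    t = 0
    while hunger < S and t < max_steps:
        left_food = (pos - 1) not in eaten
        right_food = (pos + 1) not in eaten
        if left_food == right_food:
            # Both neighbors full or both empty: hop equiprobably.
            step = rng.choice((-1, 1))
        else:
            food_dir = -1 if left_food else 1
            towards_food = rng.random() < (1 + greed) / 2
            step = food_dir if towards_food else -food_dir
        pos += step
        t += 1
        if pos in eaten:
            hunger += 1      # empty site: one step closer to starvation
        else:
            eaten.add(pos)   # eat all the food and reset the hunger clock
            hunger = 0
    return t, len(eaten)


# Fully food-averse forager: bounces between two eaten sites and starves.
print(forager_lifetime(S=5, greed=-1.0, rng=random.Random(0)))
# Fully greedy forager in 1D never starves, so the step cap ends the run.
print(forager_lifetime(S=5, greed=1.0, rng=random.Random(1), max_steps=50))
```

Averaging the returned lifetime over many runs for intermediate greed values would be one way to explore the non-monotonic dependence the abstract reports; the sketch itself makes no claim about those results.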

  4. Real-time Control of sewer pumps by using ControlNEXT to smooth inflow at Waste Water Treatment Plant Garmerwolde

    NASA Astrophysics Data System (ADS)

    van Heeringen, Klaas-Jan; van Nooijen, Ronald; Kooij, Kees; Postma, Bokke

    2016-04-01

    The Garmerwolde waste water treatment plant (WWTP) in the Groningen area of the Netherlands receives waste water from a large area. That waste water is collected from many sewer systems and transported to the WWTP through pressurized pipes. The supply of waste water to the WWTP is relatively low and very irregular during dry-weather conditions, resulting in a random pattern of flows. This irregularity is the effect of the local control of the pumps, where the pumps are individually operated as an on/off control based on the water levels in the connected sewer system. The influent may change from zero to high values in a few minutes. The treatment processes at the WWTP are negatively influenced by this irregularity, which results in high costs for energy and use of chemicals. The ControlNEXT central control system is used to control the 5 largest pump stations, such that the total inflow at the WWTP becomes much smoother. This results in a reduction of operational costs of about 10%. The control algorithm determines whether the actual condition is dry or wet, based on real-time radar precipitation images and the rainfall forecast product HiRLAM. All actual data, such as water levels, pump operations and pump availability, is also collected and validated. This data management is done using Delft-FEWS. If the situation is identified as "wet", the sewer systems are emptied as far as possible to create maximum storage. If the situation is "dry" (and of course there is a dead band between dry and wet), the pumps are operated such that the total inflow into the WWTP is smoothed. This is done with a Greedy algorithm, developed by Delft University of Technology. The algorithm makes a plan for the next 24 hours (as the daily inflow has a typical daily pattern) and generally stores some water volume in the sewer systems during the day to be able to continue operations during the night.
The pumps are controlled with a time step of 5 minutes, where ControlNEXT manages the communication of pump operation setpoints to the SCADA system. In case of failing communication, backup procedures are programmed in the PLC of the pump stations. In that case the old on/off operation based on local water levels will be used. The system has been operational since January 2016 and has been monitored since then. In addition to monitoring the positive effect on the inflow at the WWTP, an important issue is the possible sedimentation in the sewer systems. This will be monitored too.

  5. Seeking Balance between Challenge and Success in an Age of Accountability: A First-Year Faculty Growth Model (FFGM)

    ERIC Educational Resources Information Center

    Tenuto, Penny L.; Gardiner, Mary E.

    2013-01-01

    Committing to a tenure-track role by novice university faculty has been described as a difficult marriage, and higher educational organizations referred to as greedy, pointing to the need for research on the transition experiences of faculty themselves. The first year for faculty on the tenure-track is critical for academic faculty success in a…

  6. Transjugular Intrahepatic Porto-Systemic Shunt in Patients with Liver Cirrhosis and Model for End-Stage Liver Disease ≥15.

    PubMed

    Ascha, Mona; Hanouneh, Mohamad; S Ascha, Mustafa; Zein, Nizar N; Sands, Mark; Lopez, Rocio; Hanouneh, Ibrahim A

    2017-02-01

    It is not known whether transjugular intrahepatic porto-systemic shunt (TIPS) is safe in patients with advanced liver cirrhosis. The aim of our study was to evaluate the impact of TIPS on transplant-free survival in patients with liver cirrhosis and MELD score ≥15. All adult patients who underwent TIPS at our institution between 2004 and 2011 were identified (N = 470). A total of 144 patients had MELD ≥15 at the time of TIPS. These patients were matched 1:1 to patients with liver cirrhosis who did not undergo TIPS based on age and MELD score using the greedy algorithm. Patients were followed up until time of death or liver transplantation. Kaplan-Meier curves and log-rank tests were used to test for differences in survival outcome between the two groups. A total of 288 patients with liver cirrhosis were included, of whom 144 underwent TIPS and 144 did not. The two groups were matched based on age and MELD score and were comparable with regard to gender and ethnicity. Mean MELD and Child-Pugh scores in the study population were 20.9 ± 6.5 and 10.5 ± 1.8, respectively. The most common indication for TIPS was varices (49 %), followed by refractory ascites (42 %). In the first 2 months post-TIPS, there was increased mortality or liver transplantation in patients who had TIPS compared to those who did not, but this did not reach statistical significance (p = 0.07). However, after 2 months, TIPS was associated with a 56 % lower risk of dying or needing liver transplantation (p < 0.01) compared with cirrhotic patients who did not undergo TIPS. In patients with liver cirrhosis and MELD ≥15, TIPS might improve transplant-free survival for patients who live for at least 2 months after the procedure.
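The abstract above says patients were matched 1:1 on age and MELD score "using the greedy algorithm" but gives no further detail. The following is a minimal sketch of one common form of greedy 1:1 matching without replacement. The distance function (sum of absolute differences in age and MELD), the function names, and the sample records are all hypothetical and are not taken from the study.

```python
def greedy_match(treated, controls, distance):
    """Greedy 1:1 matching without replacement.

    Each treated record, in order, takes the closest control that is still
    unmatched. The choice is never revisited, which is what makes the
    procedure greedy rather than globally optimal.
    """
    available = list(range(len(controls)))
    pairs = []
    for t_idx, t in enumerate(treated):
        if not available:
            break  # no controls left to match
        best = min(available, key=lambda c: distance(t, controls[c]))
        pairs.append((t_idx, best))
        available.remove(best)
    return pairs


def age_meld_distance(a, b):
    # Hypothetical distance: unweighted sum of absolute differences.
    return abs(a["age"] - b["age"]) + abs(a["meld"] - b["meld"])


# Hypothetical records for illustration only.
treated = [{"age": 55, "meld": 18}, {"age": 62, "meld": 22}]
controls = [
    {"age": 61, "meld": 21},
    {"age": 54, "meld": 17},
    {"age": 70, "meld": 30},
]

pairs = greedy_match(treated, controls, age_meld_distance)
print(pairs)  # [(0, 1), (1, 0)]
```

Because each pairing is fixed once made, the result can depend on the order in which treated records are processed; real studies often add a caliper or sort the records first, neither of which is shown here.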

  7. The effects of dutasteride and finasteride on BPH-related hospitalization, surgery and prostate cancer diagnosis: a record-linkage analysis.

    PubMed

    Cindolo, Luca; Fanizza, Caterina; Romero, Marilena; Pirozzi, Luisella; Autorino, Riccardo; Berardinelli, Francesco; Schips, Luigi

    2013-06-01

    To investigate differences in the risk of benign prostatic hyperplasia (BPH)-related hospitalization, for surgical and non-surgical reasons, and of new prostate cancer (PCa) diagnosis between patients using finasteride or dutasteride. A retrospective cohort study was conducted using data from record linkage of administrative databases (pharmaceutical prescription data, hospital discharge records, Italian population registry). Men aged ≥ 40 years old who had received a prescription for at least 10 packs/year between January 1, 2004 and December 31, 2004 were included and followed for 5 years. The association of the outcomes was assessed using a multiple Cox proportional hazard model. Propensity score-matched analysis and a 5-1, greedy 1:1 matching algorithm were performed. 8,132 patients were identified. Overall incidence rates of BPH hospitalization and BPH-related surgery were 21.05 (95 % CI 19.52-22.71) and 20.97 (95 % CI 19.45-22.61) per 1,000 person-years, respectively. In the dutasteride group compared with finasteride group, the incidence rate of both events was statistically significantly lower: 16.07 versus 21.76 for BPH hospitalization and 15.91 versus 21.69 for BPH-related surgery. The incidence rate of new PCa was also lower for the dutasteride group [8.34 (95 % CI 5.96-11.68) vs. 10.25 (95 % CI 9.15-11.49)]. Dutasteride was associated with a reduction in BPH-related hospitalizations (HR 0.75, 95 % CI 0.58-0.98 and 0.58-0.98 for surgical and non-surgical reasons). The matched analysis confirmed the risk reduction with dutasteride for BPH-related surgery. These findings suggest that the clinical effects of dutasteride and finasteride might be different. Patients treated with dutasteride seem to be less likely to experience BPH-related hospitalization. Comparative studies are needed to confirm these results.

  8. Offshore wind farm layout optimization

    NASA Astrophysics Data System (ADS)

    Elkinton, Christopher Neil

    Offshore wind energy technology is maturing in Europe and is poised to make a significant contribution to the U.S. energy production portfolio. Building on the knowledge the wind industry has gained to date, this dissertation investigates the influences of different site conditions on offshore wind farm micrositing---the layout of individual turbines within the boundaries of a wind farm. For offshore wind farms, these conditions include, among others, the wind and wave climates, water depths, and soil conditions at the site. An analysis tool has been developed that is capable of estimating the cost of energy (COE) from offshore wind farms. For this analysis, the COE has been divided into several modeled components: major costs (e.g. turbines, electrical interconnection, maintenance, etc.), energy production, and energy losses. By treating these component models as functions of site-dependent parameters, the analysis tool can investigate the influence of these parameters on the COE. Some parameters result in simultaneous increases of both energy and cost. In these cases, the analysis tool was used to determine the value of the parameter that yielded the lowest COE and, thus, the best balance of cost and energy. The models have been validated and generally compare favorably with existing offshore wind farm data. The analysis technique was then paired with optimization algorithms to form a tool with which to design offshore wind farm layouts for which the COE was minimized. Greedy heuristic and genetic optimization algorithms have been tuned and implemented. The use of these two algorithms in series has been shown to produce the best, most consistent solutions. The influences of site conditions on the COE have been studied further by applying the analysis and optimization tools to the initial design of a small offshore wind farm near the town of Hull, Massachusetts. 
The results of an initial full-site analysis and optimization were used to constrain the boundaries of the farm. A more thorough optimization highlighted the features of the area that would result in a minimized COE. The results showed reasonable layout designs and COE estimates that are consistent with existing offshore wind farms.

  9. SOPRA: Scaffolding algorithm for paired reads via statistical optimization.

    PubMed

    Dayarian, Adel; Michael, Todd P; Sengupta, Anirvan M

    2010-06-24

    High throughput sequencing (HTS) platforms produce gigabases of short read (<100 bp) data per run. While these short reads are adequate for resequencing applications, de novo assembly of moderate size genomes from such reads remains a significant challenge. These limitations could be partially overcome by utilizing mate pair technology, which provides pairs of short reads separated by a known distance along the genome. We have developed SOPRA, a tool designed to exploit the mate pair/paired-end information for assembly of short reads. The main focus of the algorithm is selecting a sufficiently large subset of simultaneously satisfiable mate pair constraints to achieve a balance between the size and the quality of the output scaffolds. Scaffold assembly is presented as an optimization problem for variables associated with vertices and with edges of the contig connectivity graph. Vertices of this graph are individual contigs with edges drawn between contigs connected by mate pairs. Similar graph problems have been invoked in the context of shotgun sequencing and scaffold building for previous generation of sequencing projects. However, given the error-prone nature of HTS data and the fundamental limitations from the shortness of the reads, the ad hoc greedy algorithms used in the earlier studies are likely to lead to poor quality results in the current context. SOPRA circumvents this problem by treating all the constraints on equal footing for solving the optimization problem, the solution itself indicating the problematic constraints (chimeric/repetitive contigs, etc.) to be removed. The process of solving and removing of constraints is iterated till one reaches a core set of consistent constraints. For SOLiD sequencer data, SOPRA uses a dynamic programming approach to robustly translate the color-space assembly to base-space. 
For assessing the quality of an assembly, we report the no-match/mismatch error rate as well as the rates of various rearrangement errors. Applying SOPRA to real data from bacterial genomes, we were able to assemble contigs into scaffolds of significant length (N50 up to 200 Kb) with very few errors introduced in the process. In general, the methodology presented here will allow better scaffold assemblies of any type of mate pair sequencing data.

  10. Model-Based Optimal Experimental Design for Complex Physical Systems

    DTIC Science & Technology

    2015-12-03

    for public release. magnitude reduction in estimator error required to make solving the exact optimal design problem tractable. Instead of using a naive...for designing a sequence of experiments uses suboptimal approaches: batch design that has no feedback, or greedy (myopic) design that optimally...approved for public release. Equation 1 is difficult to solve directly, but can be expressed in an equivalent form using the principle of dynamic programming

  11. The United States Army Battalion Surgeon: Frontline Requirement or Relic of a Bygone Era?

    DTIC Science & Technology

    2009-12-11

    Battalion Aid Station BN Battalion BS Battalion Surgeon CBMM Core Battalion Medical Mission DOW Died of Wounds FSO Full Spectrum Operations GMO...General Medical Officers or GMOs. Young, motivated, and greedy for knowledge, GMOs propelled the field of military medicine forward during...peacetime through analysis, research, and innovation. Their treated populations were small and exceedingly healthy. GMOs had no mission to treat dependents

  12. Adaptive feature selection using v-shaped binary particle swarm optimization.

    PubMed

    Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.

  13. Adaptive feature selection using v-shaped binary particle swarm optimization

    PubMed Central

    Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers. PMID:28358850

  14. Optimization in the utility maximization framework for conservation planning: a comparison of solution procedures in a study of multifunctional agriculture

    PubMed Central

    Kreitler, Jason R.; Stoms, David M.; Davis, Frank W.

    2014-01-01

    Quantitative methods of spatial conservation prioritization have traditionally been applied to issues in conservation biology and reserve design, though their use in other types of natural resource management is growing. The utility maximization problem is one form of a covering problem where multiple criteria can represent the expected social benefits of conservation action. This approach allows flexibility with a problem formulation that is more general than typical reserve design problems, though the solution methods are very similar. However, few studies have addressed optimization in utility maximization problems for conservation planning, and the effect of solution procedure is largely unquantified. Therefore, this study mapped five criteria describing elements of multifunctional agriculture to determine a hypothetical conservation resource allocation plan for agricultural land conservation in the Central Valley of CA, USA. We compared solution procedures within the utility maximization framework to determine the difference between an open source integer programming approach and a greedy heuristic, and find gains from optimization of up to 12%. We also model land availability for conservation action as a stochastic process and determine the decline in total utility compared to the globally optimal set using both solution algorithms. Our results are comparable to other studies illustrating the benefits of optimization for different conservation planning problems, and highlight the importance of maximizing the effectiveness of limited funding for conservation and natural resource management. PMID:25538868
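The gap between a greedy heuristic and an exact optimizer that this abstract reports can be seen on a toy budget-constrained selection problem. This is a minimal sketch under assumptions: the greedy rule (rank parcels by utility per unit cost), the exhaustive search standing in for an integer-programming solver, and the parcel data are all hypothetical and are not the study's formulation or results.

```python
from itertools import combinations


def greedy_select(parcels, budget):
    """Pick parcels in order of utility-per-cost while the budget allows."""
    order = sorted(
        range(len(parcels)),
        key=lambda i: parcels[i][1] / parcels[i][0],
        reverse=True,
    )
    chosen, spent = [], 0
    for i in order:
        cost, _ = parcels[i]
        if spent + cost <= budget:
            chosen.append(i)
            spent += cost
    return sorted(chosen)


def optimal_select(parcels, budget):
    """Exhaustive search over all subsets (only viable for tiny inputs)."""
    best, best_utility = [], 0
    for r in range(1, len(parcels) + 1):
        for subset in combinations(range(len(parcels)), r):
            cost = sum(parcels[i][0] for i in subset)
            utility = sum(parcels[i][1] for i in subset)
            if cost <= budget and utility > best_utility:
                best, best_utility = list(subset), utility
    return best


def total_utility(parcels, chosen):
    return sum(parcels[i][1] for i in chosen)


# Hypothetical parcels as (cost, utility) pairs.
parcels = [(6, 60), (5, 45), (5, 45)]
budget = 10

greedy = greedy_select(parcels, budget)
optimal = optimal_select(parcels, budget)
print(greedy, total_utility(parcels, greedy))    # [0] 60
print(optimal, total_utility(parcels, optimal))  # [1, 2] 90
```

The greedy rule commits to the parcel with the best ratio and then cannot afford either remaining parcel, while the exhaustive search finds the pair that uses the whole budget. The size of the gap here is an artifact of the toy data and says nothing about the magnitude reported in the study.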

  15. JIGSAW: Joint Inhomogeneity estimation via Global Segment Assembly for Water-fat separation.

    PubMed

    Lu, Wenmiao; Lu, Yi

    2011-07-01

    Water-fat separation in magnetic resonance imaging (MRI) is of great clinical importance, and the key to uniform water-fat separation lies in field map estimation. This work deals with three-point field map estimation, in which water and fat are modelled as two single-peak spectral lines, and field inhomogeneities shift the spectrum by an unknown amount. Due to the simplified spectrum modelling, there exists inherent ambiguity in forming field maps from multiple locally feasible field map values at each pixel. To resolve such ambiguity, spatial smoothness of field maps has been incorporated as a constraint of an optimization problem. However, there are two issues: the optimization problem is computationally intractable and even when it is solved exactly, it does not always separate water and fat images. Hence, robust field map estimation remains challenging in many clinically important imaging scenarios. This paper proposes a novel field map estimation technique called JIGSAW. It extends a loopy belief propagation (BP) algorithm to obtain an approximate solution to the optimization problem. The solution produces locally smooth segments and avoids error propagation associated with greedy methods. The locally smooth segments are then assembled into a globally consistent field map by exploiting the periodicity of the feasible field map values. In vivo results demonstrate that JIGSAW outperforms existing techniques and produces correct water-fat separation in challenging imaging scenarios.

  16. Optimization in the utility maximization framework for conservation planning: a comparison of solution procedures in a study of multifunctional agriculture

    USGS Publications Warehouse

    Kreitler, Jason R.; Stoms, David M.; Davis, Frank W.

    2014-01-01

    Quantitative methods of spatial conservation prioritization have traditionally been applied to issues in conservation biology and reserve design, though their use in other types of natural resource management is growing. The utility maximization problem is one form of a covering problem where multiple criteria can represent the expected social benefits of conservation action. This approach allows flexibility with a problem formulation that is more general than typical reserve design problems, though the solution methods are very similar. However, few studies have addressed optimization in utility maximization problems for conservation planning, and the effect of solution procedure is largely unquantified. Therefore, this study mapped five criteria describing elements of multifunctional agriculture to determine a hypothetical conservation resource allocation plan for agricultural land conservation in the Central Valley of CA, USA. We compared solution procedures within the utility maximization framework to determine the difference between an open source integer programming approach and a greedy heuristic, and find gains from optimization of up to 12%. We also model land availability for conservation action as a stochastic process and determine the decline in total utility compared to the globally optimal set using both solution algorithms. Our results are comparable to other studies illustrating the benefits of optimization for different conservation planning problems, and highlight the importance of maximizing the effectiveness of limited funding for conservation and natural resource management.

  17. Segmentation of Large Unstructured Point Clouds Using Octree-Based Region Growing and Conditional Random Fields

    NASA Astrophysics Data System (ADS)

    Bassier, M.; Bonduel, M.; Van Genechten, B.; Vergauwen, M.

    2017-11-01

    Point cloud segmentation is a crucial step in scene understanding and interpretation. The goal is to decompose the initial data into sets of workable clusters with similar properties. Additionally, it is a key aspect in the automated procedure from point cloud data to BIM. Current approaches typically only segment a single type of primitive such as planes or cylinders. Also, current algorithms suffer from oversegmenting the data and are often sensor or scene dependent. In this work, a method is presented to automatically segment large unstructured point clouds of buildings. More specifically, the segmentation is formulated as a graph optimisation problem. First, the data is oversegmented with a greedy octree-based region growing method. The growing is conditioned on the segmentation of planes as well as smooth surfaces. Next, the candidate clusters are represented by a Conditional Random Field after which the most likely configuration of candidate clusters is computed given a set of local and contextual features. The experiments prove that the used method is a fast and reliable framework for unstructured point cloud segmentation. Processing speeds up to 40,000 points per second are recorded for the region growing. Additionally, the recall and precision of the graph clustering is approximately 80%. Overall, nearly 22% of oversegmentation is reduced by clustering the data. These clusters will be classified and used as a basis for the reconstruction of BIM models.

  18. Fast Component Pursuit for Large-Scale Inverse Covariance Estimation.

    PubMed

    Han, Lei; Zhang, Yu; Zhang, Tong

    2016-08-01

    The maximum likelihood estimation (MLE) for the Gaussian graphical model, which is also known as the inverse covariance estimation problem, has gained increasing interest recently. Most existing works assume that inverse covariance estimators contain sparse structure and then construct models with the ℓ1 regularization. In this paper, different from existing works, we study the inverse covariance estimation problem from another perspective by efficiently modeling the low-rank structure in the inverse covariance, which is assumed to be a combination of a low-rank part and a diagonal matrix. One motivation for this assumption is that the low-rank structure is common in many applications including climate and financial analysis, and another one is that such an assumption can reduce the computational complexity when computing its inverse. Specifically, we propose an efficient COmponent Pursuit (COP) method to obtain the low-rank part, where each component can be sparse. For optimization, the COP method greedily learns a rank-one component in each iteration by maximizing the log-likelihood. Moreover, the COP algorithm enjoys several appealing properties including the existence of an efficient solution in each iteration and the theoretical guarantee on the convergence of this greedy approach. Experiments on large-scale synthetic and real-world datasets including thousands of millions of variables show that the COP method is faster than the state-of-the-art techniques for the inverse covariance estimation problem when achieving comparable log-likelihood on test data.

  19. An Adaptive Data Gathering Scheme for Multi-Hop Wireless Sensor Networks Based on Compressed Sensing and Network Coding.

    PubMed

    Yin, Jun; Yang, Yuwang; Wang, Lei

    2016-04-01

    Joint design of compressed sensing (CS) and network coding (NC) has been demonstrated to provide a new data gathering paradigm for multi-hop wireless sensor networks (WSNs). By exploiting the correlation of the network sensed data, a variety of data gathering schemes based on NC and CS (Compressed Data Gathering--CDG) have been proposed. However, these schemes assume that the sparsity of the network sensed data is constant and the value of the sparsity is known before starting each data gathering epoch, thus they ignore the variation of the data observed by the WSNs which are deployed in practical circumstances. In this paper, we present a complete design of the feedback CDG scheme where the sink node adaptively queries those interested nodes to acquire an appropriate number of measurements. The adaptive measurement-formation procedure and its termination rules are proposed and analyzed in detail. Moreover, in order to minimize the number of overall transmissions in the formation procedure of each measurement, we have developed an NP-complete model (Maximum Leaf Nodes Minimum Steiner Nodes--MLMS) and realized a scalable greedy algorithm to solve the problem. Experimental results show that the proposed measurement-formation method outperforms previous schemes, and experiments on both datasets from ocean temperature and practical network deployment also prove the effectiveness of our proposed feedback CDG scheme.

  20. Brain functional network connectivity based on a visual task: visual information processing-related brain regions are significantly activated in the task state.

    PubMed

    Yang, Yan-Li; Deng, Hong-Xia; Xing, Gui-Yang; Xia, Xiao-Luan; Li, Hai-Fang

    2015-02-01

    It is not clear whether the method used in functional brain-network related research can be applied to explore the feature binding mechanism of visual perception. In this study, we investigated feature binding of color and shape in visual perception. Functional magnetic resonance imaging data were collected from 38 healthy volunteers at rest and while performing a visual perception task to construct brain networks active during resting and task states. Results showed that brain regions involved in visual information processing were obviously activated during the task. The components were partitioned using a greedy algorithm, indicating the visual network existed during the resting state. Z-values in the vision-related brain regions were calculated, confirming the dynamic balance of the brain network. Connectivity between brain regions was determined, and the result showed that occipital and lingual gyri were stable brain regions in the visual system network, the parietal lobe played a very important role in the binding process of color features and shape features, and the fusiform and inferior temporal gyri were crucial for processing color and shape information. Experimental findings indicate that understanding visual feature binding and cognitive processes will help establish computational models of vision, improve image recognition technology, and provide a new theoretical mechanism for feature binding in visual perception.

  1. Altered structural and effective connectivity in anorexia and bulimia nervosa in circuits that regulate energy and reward homeostasis.

    PubMed

    Frank, G K W; Shott, M E; Riederer, J; Pryor, T L

    2016-11-01

    Anorexia and bulimia nervosa are severe eating disorders that share many behaviors. Structural and functional brain circuits could provide biological links that those disorders have in common. We recruited 77 young adult women, 26 healthy controls, 26 women with anorexia and 25 women with bulimia nervosa. Probabilistic tractography was used to map white matter connectivity strength across taste and food intake regulating brain circuits. An independent multisample greedy equivalence search algorithm tested effective connectivity between those regions during sucrose tasting. Anorexia and bulimia nervosa had greater structural connectivity in pathways between insula, orbitofrontal cortex and ventral striatum, but lower connectivity from orbitofrontal cortex and amygdala to the hypothalamus (P<0.05, corrected for comorbidity, medication and multiple comparisons). Functionally, in controls the hypothalamus drove ventral striatal activity, but in anorexia and bulimia nervosa effective connectivity was directed from anterior cingulate via ventral striatum to the hypothalamus. Across all groups, sweetness perception was predicted by connectivity strength in pathways connecting to the middle orbitofrontal cortex. This study provides evidence that white matter structural as well as effective connectivity within the energy-homeostasis and food reward-regulating circuitry is fundamentally different in anorexia and bulimia nervosa compared with that in controls. In eating disorders, anterior cingulate cognitive-emotional top down control could affect food reward and eating drive, override hypothalamic inputs to the ventral striatum and enable prolonged food restriction.

  2. Time-saving impact of an algorithm to identify potential surgical site infections.

    PubMed

    Knepper, B C; Young, H; Jenkins, T C; Price, C S

    2013-10-01

    To develop and validate a partially automated algorithm to identify surgical site infections (SSIs) using commonly available electronic data to reduce manual chart review. Retrospective cohort study of patients undergoing specific surgical procedures over a 4-year period from 2007 through 2010 (algorithm development cohort) or over a 3-month period from January 2011 through March 2011 (algorithm validation cohort). A single academic safety-net hospital in a major metropolitan area. Patients undergoing at least 1 included surgical procedure during the study period. Procedures were identified in the National Healthcare Safety Network; SSIs were identified by manual chart review. Commonly available electronic data, including microbiologic, laboratory, and administrative data, were identified via a clinical data warehouse. Algorithms using combinations of these electronic variables were constructed and assessed for their ability to identify SSIs and reduce chart review. The most efficient algorithm identified in the development cohort combined microbiologic data with postoperative procedure and diagnosis codes. This algorithm resulted in 100% sensitivity and 85% specificity. Time savings from the algorithm was almost 600 person-hours of chart review. The algorithm demonstrated similar sensitivity on application to the validation cohort. A partially automated algorithm to identify potential SSIs was highly sensitive and dramatically reduced the amount of manual chart review required of infection control personnel during SSI surveillance.

  3. gamAID: Greedy CP tensor decomposition for supervised EHR-based disease trajectory differentiation.

    PubMed

    Henderson, Jette; Ho, Joyce; Ghosh, Joydeep

    2017-07-01

    We propose gamAID, an exploratory, supervised nonnegative tensor factorization method that iteratively extracts phenotypes from tensors constructed from medical count data. Using data from diabetic patients who later on get diagnosed with chronic kidney disorder (CKD) as well as diabetic patients who do not receive a CKD diagnosis, we demonstrate the potential of gamAID to discover phenotypes that characterize patients who are at risk for developing a disease.

  4. Significant locations in auxiliary data as seeds for typical use cases of point clustering

    NASA Astrophysics Data System (ADS)

    Kröger, Johannes

    2018-05-01

    Random greedy clustering and grid-based clustering are highly sensitive to their initial parameters. When used for point data clustering in maps, they often change the apparent distribution of the underlying data. We propose a process that uses precomputed weighted seed points for the initialization of clusters, for example from local maxima in population density data. Exemplary results from the clustering of a dataset of petrol stations are presented.

  5. Determining the Most Vital Arcs Within a Multi-Mode Communication Network Using Set-Based Measures

    DTIC Science & Technology

    2015-03-26

    Disruption • Greedy Disruption Figure 8. A bar graph displaying the average change in value at each degree of disruption wh...Force Institute of Technology Graduate School of Engineering and Management (AFIT/ENS) 2950 Hobson Way WPAFB OH 45433-7765 AFIT-ENS-MS-15-M-131

  6. Phytoplankton global mapping from space with a support vector machine algorithm

    NASA Astrophysics Data System (ADS)

    de Boissieu, Florian; Menkes, Christophe; Dupouy, Cécile; Rodier, Martin; Bonnet, Sophie; Mangeas, Morgan; Frouin, Robert J.

    2014-11-01

    In recent years great progress has been made in global mapping of phytoplankton from space. Two main trends have emerged, the recognition of phytoplankton functional types (PFT) based on reflectance normalized to chlorophyll-a concentration, and the recognition of phytoplankton size class (PSC) based on the relationship between cell size and chlorophyll-a concentration. However, PFTs and PSCs are not decorrelated, and one approach can complement the other in a recognition task. In this paper, we explore the recognition of several dominant PFTs by combining reflectance anomalies, chlorophyll-a concentration and other environmental parameters, such as sea surface temperature and wind speed. Remote sensing pixels are labeled thanks to coincident in-situ pigment data from GeP&CO, NOMAD and MAREDAT datasets, covering various oceanographic environments. The recognition is made with a supervised Support Vector Machine classifier trained on the labeled pixels. This algorithm enables a non-linear separation of the classes in the input space and is especially adapted for small training datasets as available here. Moreover, it provides a class probability estimate, allowing one to enhance the robustness of the classification results through the choice of a minimum probability threshold. A greedy feature selection associated to a 10-fold cross-validation procedure is applied to select the most discriminative input features and evaluate the classification performance. The best classifiers are finally applied on daily remote sensing datasets (SeaWIFS, MODISA) and the resulting dominant PFT maps are compared with other studies. 
Several conclusions are drawn: (1) the feature selection highlights the weight of temperature, chlorophyll-a and wind speed variables in phytoplankton recognition; (2) the classifiers show good results and dominant PFT maps in agreement with phytoplankton distribution knowledge; (3) classification on MODISA data seems to perform better than on SeaWIFS data; (4) the probability threshold correctly screens the areas of lowest confidence, such as the interclass regions.
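
The greedy feature selection with 10-fold cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `score` callable (which would wrap a cross-validated classifier such as an SVM), and the stopping rule `min_gain` are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Sequence, Tuple


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    # Spread the remainder over the first n % k folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def greedy_forward_selection(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Add one feature at a time, keeping the one that most improves `score`.

    `score` is a hypothetical callable returning, for example, the mean
    cross-validated accuracy of a classifier trained on the given features.
    """
    selected: List[str] = []
    best = float("-inf")
    remaining = list(features)
    while remaining:
        # Evaluate every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        trial_score, trial_feature = max(trials)
        if selected and trial_score - best <= min_gain:
            break  # no candidate improves the score by more than min_gain
        selected.append(trial_feature)
        remaining.remove(trial_feature)
        best = trial_score
    return selected, best
```

In practice `score` would loop over `k_fold_indices(n_samples, 10)`, fit the classifier on each training split, and average the test accuracy. The greedy loop evaluates at most on the order of the square of the number of features instead of every subset, at the cost of possibly missing features that only help in combination.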

  7. CLUSTOM-CLOUD: In-Memory Data Grid-Based Software for Clustering 16S rRNA Sequence Data in the Cloud Environment.

    PubMed

    Oh, Jeongsu; Choi, Chi-Hwan; Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo

    2016-01-01

    High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology-a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. 
CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr.

  8. Practical optimization of Steiner trees via the cavity method

    NASA Astrophysics Data System (ADS)

    Braunstein, Alfredo; Muntoni, Anna

    2016-07-01

    The optimization version of the cavity method for single instances, called Max-Sum, has been applied in the past to the minimum Steiner tree problem on graphs and variants. Max-Sum has been shown experimentally to give asymptotically optimal results on certain types of weighted random graphs, and to give good solutions in short computation times for some types of real networks. However, the hypotheses behind the formulation and the cavity method itself limit substantially the class of instances on which the approach gives good results (or even converges). Moreover, in the standard model formulation, the diameter of the tree solution is limited by a predefined bound, that affects both computation time and convergence properties. In this work we describe two main enhancements to the Max-Sum equations to be able to cope with optimization of real-world instances. First, we develop an alternative ‘flat’ model formulation that allows the relevant configuration space to be reduced substantially, making the approach feasible on instances with large solution diameter, in particular when the number of terminal nodes is small. Second, we propose an integration between Max-Sum and three greedy heuristics. This integration allows Max-Sum to be transformed into a highly competitive self-contained algorithm, in which a feasible solution is given at each step of the iterative procedure. Part of this development participated in the 2014 DIMACS Challenge on Steiner problems, and we report the results here. The performance on the challenge of the proposed approach was highly satisfactory: it maintained a small gap to the best bound in most cases, and obtained the best results on several instances in two different categories. We also present several improvements with respect to the version of the algorithm that participated in the competition, including new best solutions for some of the instances of the challenge.

  9. CLUSTOM-CLOUD: In-Memory Data Grid-Based Software for Clustering 16S rRNA Sequence Data in the Cloud Environment

    PubMed Central

    Park, Min-Kyu; Kim, Byung Kwon; Hwang, Kyuin; Lee, Sang-Heon; Hong, Soon Gyu; Nasir, Arshan; Cho, Wan-Sup; Kim, Kyung Mo

    2016-01-01

    High-throughput sequencing can produce hundreds of thousands of 16S rRNA sequence reads corresponding to different organisms present in the environmental samples. Typically, analysis of microbial diversity in bioinformatics starts from pre-processing followed by clustering 16S rRNA reads into relatively fewer operational taxonomic units (OTUs). The OTUs are reliable indicators of microbial diversity and greatly accelerate the downstream analysis time. However, existing hierarchical clustering algorithms that are generally more accurate than greedy heuristic algorithms struggle with large sequence datasets. To keep pace with the rapid rise in sequencing data, we present CLUSTOM-CLOUD, which is the first distributed sequence clustering program based on In-Memory Data Grid (IMDG) technology–a distributed data structure to store all data in the main memory of multiple computing nodes. The IMDG technology helps CLUSTOM-CLOUD to enhance both its capability of handling larger datasets and its computational scalability better than its ancestor, CLUSTOM, while maintaining high accuracy. Clustering speed of CLUSTOM-CLOUD was evaluated on published 16S rRNA human microbiome sequence datasets using the small laboratory cluster (10 nodes) and under the Amazon EC2 cloud-computing environments. Under the laboratory environment, it required only ~3 hours to process dataset of size 200 K reads regardless of the complexity of the human microbiome data. In turn, one million reads were processed in approximately 20, 14, and 11 hours when utilizing 20, 30, and 40 nodes on the Amazon EC2 cloud-computing environment. The running time evaluation indicates that CLUSTOM-CLOUD can handle much larger sequence datasets than CLUSTOM and is also a scalable distributed processing system. The comparative accuracy test using 16S rRNA pyrosequences of a mock community shows that CLUSTOM-CLOUD achieves higher accuracy than DOTUR, mothur, ESPRIT-Tree, UCLUST and Swarm. 
CLUSTOM-CLOUD is written in JAVA and is freely available at http://clustomcloud.kopri.re.kr. PMID:26954507

  10. U.S. Africa Policy: Some Possible Course Adjustment

    DTIC Science & Technology

    1994-08-29

    installations on behalf of an African Marxist government. A lot of Angola’s oil was exported to the United States; the U.S. companies involved made money. The...country, with major legitimate social service and infrastructure needs and, probably, very greedy leadership, would be able to forego exportation of its...droughts like the Sahel-and fully capable of feeding its population and producing for agricultural export . It has diamonds for ready cash. The size of

  11. Scheduling Capacitated One-Way Vehicles on Paths with Deadlines

    NASA Astrophysics Data System (ADS)

    Uchida, Jun; Karuno, Yoshiyuki; Nagamochi, Hiroshi

    In this paper, we deal with a scheduling problem of minimizing the number of employed vehicles on paths. Let $G=(V,E)$ be a path with a set $V=\{v_i \mid i=1,2,\ldots,n\}$ of vertices and a set $E=\{\{v_i,v_{i+1}\} \mid i=1,2,\ldots,n-1\}$ of edges. Vehicles with capacity $b$ are initially situated at $v_1$. There is a job $i$ at each vertex $v_i \in V$, which has its own handling time $h_i$ and deadline $d_i$. With each edge $\{v_i,v_{i+1}\} \in E$, a travel time $w_{i,i+1}$ is associated. Each job is processed by exactly one vehicle, and the number of jobs processed by a vehicle does not exceed the capacity $b$. A routing of a vehicle is called one-way if the vehicle visits every edge $\{v_i,v_{i+1}\}$ exactly once (i.e., it simply moves from $v_1$ to $v_n$ on $G$). Any vehicle is assumed to follow the one-way routing constraint. The problem asks to find a schedule that minimizes the number of one-way vehicles, meeting the deadline and capacity constraints. A greedy heuristic is proposed, which repeats a dynamic programming procedure for a single one-way vehicle problem of maximizing the number of non-tardy jobs. We show that the greedy heuristic runs in $O(n^3)$ time, and the approximation ratio is at most $\ln b+1$.

  12. A Fast, Locally Adaptive, Interactive Retrieval Algorithm for the Analysis of DIAL Measurements

    NASA Astrophysics Data System (ADS)

    Samarov, D. V.; Rogers, R.; Hair, J. W.; Douglass, K. O.; Plusquellic, D.

    2010-12-01

    Differential absorption light detection and ranging (DIAL) is a laser-based tool which is used for remote, range-resolved measurement of particular gases in the atmosphere, such as carbon-dioxide and methane. In many instances it is of interest to study how these gases are distributed over a region such as a landfill, factory, or farm. While a single DIAL measurement only tells us about the distribution of a gas along a single path, a sequence of consecutive measurements provides us with information on how that gas is distributed over a region, making DIAL a natural choice for such studies. DIAL measurements present a number of interesting challenges; first, in order to convert the raw data to concentration it is necessary to estimate the derivative along the path of the measurement. Second, as the distribution of gases across a region can be highly heterogeneous it is important that the spatial nature of the measurements be taken into account. Finally, since it is common for the set of collected measurements to be quite large it is important for the method to be computationally efficient. Existing work based on Local Polynomial Regression (LPR) has been developed which addresses the first two issues, but the issue of computational speed remains an open problem. In addition to the latter, another desirable property is to allow user input into the algorithm. In this talk we present a novel method based on LPR which utilizes a variant of the RODEO algorithm to provide a fast, locally adaptive and interactive approach to the analysis of DIAL measurements. This methodology is motivated by and applied to several simulated examples and a study out of NASA Langley Research Center (LaRC) looking at the estimation of aerosol extinction in the atmosphere. A comparison study of our method against several other algorithms is also presented. References Chaudhuri, P., Marron, J.S., Scale-space view of curve estimation, Annals of Statistics 28 (2000) 408-428. 
Duong, T., Cowling, A., Koch, I., Wand, M.P., Feature significance for multivariate kernel density estimation, Computational Statistics and Data Analysis 52 (2008) 4225-4242. Godtliebsen, F., Marron, J.S., Chaudhuri, P., Statistical Significance of features in digital images, Image and Vision Computing 22 (2004) 1093-1104. Lafferty, J., Wasserman, L., RODEO: Sparse, Greedy Nonparametric Regression, Annals of Statistics 36 (2008) 28-63. Lindstrom, T., Holst, U., Weibring, P., Analysis of lidar fields using local polynomial regression, Environmetrics 16 (2005) 619-634

  13. Validation of an International Classification of Diseases, Ninth Revision Code Algorithm for Identifying Chiari Malformation Type 1 Surgery in Adults.

    PubMed

    Greenberg, Jacob K; Ladner, Travis R; Olsen, Margaret A; Shannon, Chevis N; Liu, Jingxia; Yarbrough, Chester K; Piccirillo, Jay F; Wellons, John C; Smyth, Matthew D; Park, Tae Sung; Limbrick, David D

    2015-08-01

    The use of administrative billing data may enable large-scale assessments of treatment outcomes for Chiari Malformation type I (CM-1). However, to utilize such data sets, validated International Classification of Diseases, Ninth Revision (ICD-9-CM) code algorithms for identifying CM-1 surgery are needed. To validate 2 ICD-9-CM code algorithms identifying patients undergoing CM-1 decompression surgery. We retrospectively analyzed the validity of 2 ICD-9-CM code algorithms for identifying adult CM-1 decompression surgery performed at 2 academic medical centers between 2001 and 2013. Algorithm 1 included any discharge diagnosis code of 348.4 (CM-1), as well as a procedure code of 01.24 (cranial decompression) or 03.09 (spinal decompression, or laminectomy). Algorithm 2 restricted this group to patients with a primary diagnosis of 348.4. The positive predictive value (PPV) and sensitivity of each algorithm were calculated. Among 340 first-time admissions identified by Algorithm 1, the overall PPV for CM-1 decompression was 65%. Among the 214 admissions identified by Algorithm 2, the overall PPV was 99.5%. The PPV for Algorithm 1 was lower in the Vanderbilt (59%) cohort, males (40%), and patients treated between 2009 and 2013 (57%), whereas the PPV of Algorithm 2 remained high (≥99%) across subgroups. The sensitivity of Algorithms 1 (86%) and 2 (83%) were above 75% in all subgroups. ICD-9-CM code Algorithm 2 has excellent PPV and good sensitivity to identify adult CM-1 decompression surgery. These results lay the foundation for studying CM-1 treatment outcomes by using large administrative databases.
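
The two code algorithms and the reported metrics can be sketched as follows. The ICD-9-CM codes are taken from the abstract; the `Admission` record layout (primary diagnosis listed first) and the `true_cm1_surgery` chart-review flag are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

CM1_DIAGNOSIS = "348.4"                        # CM-1 diagnosis code (from the abstract)
DECOMPRESSION_PROCEDURES = {"01.24", "03.09"}  # cranial / spinal decompression


@dataclass
class Admission:
    diagnoses: List[str]     # discharge diagnosis codes, primary first (assumed layout)
    procedures: List[str]    # procedure codes recorded for the admission
    true_cm1_surgery: bool   # reference standard from chart review (hypothetical field)


def algorithm_1(a: Admission) -> bool:
    """Any diagnosis of 348.4 plus a decompression procedure code."""
    has_procedure = any(p in DECOMPRESSION_PROCEDURES for p in a.procedures)
    return CM1_DIAGNOSIS in a.diagnoses and has_procedure


def algorithm_2(a: Admission) -> bool:
    """Algorithm 1 restricted to a primary diagnosis of 348.4."""
    return algorithm_1(a) and a.diagnoses[:1] == [CM1_DIAGNOSIS]


def ppv_and_sensitivity(
    admissions: List[Admission], rule: Callable[[Admission], bool]
) -> Tuple[float, float]:
    """Return (PPV, sensitivity) of `rule` against the reference standard."""
    tp = sum(1 for a in admissions if rule(a) and a.true_cm1_surgery)
    fp = sum(1 for a in admissions if rule(a) and not a.true_cm1_surgery)
    fn = sum(1 for a in admissions if not rule(a) and a.true_cm1_surgery)
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    return ppv, sensitivity
```

Because Algorithm 2 only ever flags a subset of what Algorithm 1 flags, it can remove false positives (raising PPV) but can never find additional true cases, which is consistent with the slightly lower sensitivity reported for it.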

  14. Blind compressive sensing dynamic MRI

    PubMed Central

    Lingala, Sajan Goud; Jacob, Mathews

    2013-01-01

    We propose a novel blind compressive sensing (BCS) framework to recover dynamic magnetic resonance images from undersampled measurements. This scheme models the dynamic signal as a sparse linear combination of temporal basis functions, chosen from a large dictionary. In contrast to classical compressed sensing, the BCS scheme simultaneously estimates the dictionary and the sparse coefficients from the undersampled measurements. Apart from the sparsity of the coefficients, the key difference between the BCS scheme and current low rank methods is the non-orthogonal nature of the dictionary basis functions. Since the number of degrees of freedom of the BCS model is smaller than that of the low-rank methods, it provides improved reconstructions at high acceleration rates. We formulate the reconstruction as a constrained optimization problem; the objective function is the linear combination of a data consistency term and a sparsity promoting ℓ1 prior on the coefficients. The Frobenius norm dictionary constraint is used to avoid scale ambiguity. We introduce a simple and efficient majorize-minimize algorithm, which decouples the original criterion into three simpler subproblems. An alternating minimization strategy is used, where we cycle through the minimization of three simpler problems. This algorithm is seen to be considerably faster than approaches that alternate between sparse coding and dictionary estimation, as well as the extension of the K-SVD dictionary learning scheme. The use of the ℓ1 penalty and Frobenius norm dictionary constraint enables the attenuation of insignificant basis functions compared to the ℓ0 norm and column norm constraint assumed in most dictionary learning algorithms; this is especially important since the number of basis functions that can be reliably estimated is restricted by the available measurements. We also observe that the proposed scheme is more robust to local minima compared to the K-SVD method, which relies on greedy sparse coding. 
Our phase transition experiments demonstrate that the BCS scheme provides much better recovery rates than classical Fourier-based CS schemes, while being only marginally worse than the dictionary aware setting. Since the overhead in additionally estimating the dictionary is low, this method can be very useful in dynamic MRI applications, where the signal is not sparse in known dictionaries. We demonstrate the utility of the BCS scheme in accelerating contrast enhanced dynamic data. We observe superior reconstruction performance with the BCS scheme in comparison to existing low rank and compressed sensing schemes. PMID:23542951
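
The constrained problem described in this abstract can be written out as a sketch. The symbols are assumptions for illustration rather than the authors' confirmed notation: $\mathcal{A}$ is the undersampled measurement operator, $\mathbf{b}$ the measurements, $\mathbf{U}$ the sparse coefficient matrix, $\mathbf{V}$ the dictionary of temporal basis functions, $\lambda$ a regularization weight, and $c$ a bound on the dictionary's squared Frobenius norm.

```latex
\[
\min_{\mathbf{U},\mathbf{V}} \;
\bigl\| \mathcal{A}(\mathbf{U}\mathbf{V}) - \mathbf{b} \bigr\|_2^2
+ \lambda \, \| \mathbf{U} \|_{\ell_1}
\quad \text{subject to} \quad
\| \mathbf{V} \|_F^2 \le c
\]
```

The constraint on $\mathbf{V}$ is what removes the scale ambiguity: without it, scaling $\mathbf{V}$ up and $\mathbf{U}$ down by the same factor would leave the data consistency term unchanged while shrinking the $\ell_1$ penalty arbitrarily.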

  15. Altered structural and effective connectivity in anorexia and bulimia nervosa in circuits that regulate energy and reward homeostasis

    PubMed Central

    Frank, G K W; Shott, M E; Riederer, J; Pryor, T L

    2016-01-01

    Anorexia and bulimia nervosa are severe eating disorders that share many behaviors. Structural and functional brain circuits could provide biological links that those disorders have in common. We recruited 77 young adult women, 26 healthy controls, 26 women with anorexia and 25 women with bulimia nervosa. Probabilistic tractography was used to map white matter connectivity strength across taste and food intake regulating brain circuits. An independent multisample greedy equivalence search algorithm tested effective connectivity between those regions during sucrose tasting. Anorexia and bulimia nervosa had greater structural connectivity in pathways between insula, orbitofrontal cortex and ventral striatum, but lower connectivity from orbitofrontal cortex and amygdala to the hypothalamus (P<0.05, corrected for comorbidity, medication and multiple comparisons). Functionally, in controls the hypothalamus drove ventral striatal activity, but in anorexia and bulimia nervosa effective connectivity was directed from anterior cingulate via ventral striatum to the hypothalamus. Across all groups, sweetness perception was predicted by connectivity strength in pathways connecting to the middle orbitofrontal cortex. This study provides evidence that white matter structural as well as effective connectivity within the energy-homeostasis and food reward-regulating circuitry is fundamentally different in anorexia and bulimia nervosa compared with that in controls. In eating disorders, anterior cingulate cognitive–emotional top down control could affect food reward and eating drive, override hypothalamic inputs to the ventral striatum and enable prolonged food restriction. PMID:27801897

  16. A Practical, Robust Methodology for Acquiring New Observation Data Using Computationally Expensive Groundwater Models

    NASA Astrophysics Data System (ADS)

    Siade, Adam J.; Hall, Joel; Karelse, Robert N.

    2017-11-01

    Regional groundwater flow models play an important role in decision making regarding water resources; however, the uncertainty embedded in model parameters and model assumptions can significantly hinder the reliability of model predictions. One way to reduce this uncertainty is to collect new observation data from the field. However, determining where and when to obtain such data is not straightforward. There exist a number of data-worth and experimental design strategies developed for this purpose. However, these studies often ignore issues related to real-world groundwater models such as computational expense, existing observation data, high-parameter dimension, etc. In this study, we propose a methodology, based on existing methods and software, to efficiently conduct such analyses for large-scale, complex regional groundwater flow systems for which there is a wealth of available observation data. The method utilizes the well-established d-optimality criterion, and the minimax criterion for robust sampling strategies. The so-called Null-Space Monte Carlo method is used to reduce the computational burden associated with uncertainty quantification. And, a heuristic methodology, based on the concept of the greedy algorithm, is proposed for developing robust designs with subsets of the posterior parameter samples. The proposed methodology is tested on a synthetic regional groundwater model, and subsequently applied to an existing, complex, regional groundwater system in the Perth region of Western Australia. The results indicate that robust designs can be obtained efficiently, within reasonable computational resources, for making regional decisions regarding groundwater level sampling.

  17. Probabilistic cross-link analysis and experiment planning for high-throughput elucidation of protein structure.

    PubMed

    Ye, Xiaoduan; O'Neil, Patrick K; Foster, Adrienne N; Gajda, Michal J; Kosinski, Jan; Kurowski, Michal A; Bujnicki, Janusz M; Friedman, Alan M; Bailey-Kellogg, Chris

    2004-12-01

    Emerging high-throughput techniques for the characterization of protein and protein-complex structures yield noisy data with sparse information content, placing a significant burden on computation to properly interpret the experimental data. One such technique uses cross-linking (chemical or by cysteine oxidation) to confirm or select among proposed structural models (e.g., from fold recognition, ab initio prediction, or docking) by testing the consistency between cross-linking data and model geometry. This paper develops a probabilistic framework for analyzing the information content in cross-linking experiments, accounting for anticipated experimental error. This framework supports a mechanism for planning experiments to optimize the information gained. We evaluate potential experiment plans using explicit trade-offs among key properties of practical importance: discriminability, coverage, balance, ambiguity, and cost. We devise a greedy algorithm that considers those properties and, from a large number of combinatorial possibilities, rapidly selects sets of experiments expected to discriminate pairs of models efficiently. In an application to residue-specific chemical cross-linking, we demonstrate the ability of our approach to plan experiments effectively involving combinations of cross-linkers and introduced mutations. We also describe an experiment plan for the bacteriophage lambda Tfa chaperone protein in which we plan dicysteine mutants for discriminating threading models by disulfide formation. Preliminary results from a subset of the planned experiments are consistent and demonstrate the practicality of planning. Our methods provide the experimenter with a valuable tool (available from the authors) for understanding and optimizing cross-linking experiments.

  18. An Active Patch Model for Real World Texture and Appearance Classification

    PubMed Central

    Mao, Junhua; Zhu, Jun; Yuille, Alan L.

    2014-01-01

    This paper addresses the task of natural texture and appearance classification. Our goal is to develop a simple and intuitive method that performs at state of the art on datasets ranging from homogeneous texture (e.g., material texture), to less homogeneous texture (e.g., the fur of animals), and to inhomogeneous texture (the appearance patterns of vehicles). Our method uses a bag-of-words model where the features are based on a dictionary of active patches. Active patches are raw intensity patches which can undergo spatial transformations (e.g., rotation and scaling) and adjust themselves to best match the image regions. The dictionary of active patches is required to be compact and representative, in the sense that we can use it to approximately reconstruct the images that we want to classify. We propose a probabilistic model to quantify the quality of image reconstruction and design a greedy learning algorithm to obtain the dictionary. We classify images using the occurrence frequency of the active patches. Feature extraction is fast (about 100 ms per image) using the GPU. The experimental results show that our method improves the state of the art on a challenging material texture benchmark dataset (KTH-TIPS2). To test our method on less homogeneous or inhomogeneous images, we construct two new datasets consisting of appearance image patches of animals and vehicles cropped from the PASCAL VOC dataset. Our method outperforms competing methods on these datasets. PMID:25531013

  19. Development of an Agent-Based Model (ABM) to Simulate the Immune System and Integration of a Regression Method to Estimate the Key ABM Parameters by Fitting the Experimental Data

    PubMed Central

    Tong, Xuming; Chen, Jinghang; Miao, Hongyu; Li, Tingting; Zhang, Le

    2015-01-01

    Agent-based models (ABM) and differential equations (DE) are two commonly used methods for immune system simulation. However, it is difficult for ABM to estimate key parameters of the model by incorporating experimental data, whereas the differential equation model is incapable of describing the complicated immune system in detail. To overcome these problems, we developed an integrated ABM regression model (IABMR). It can combine the advantages of ABM and DE by employing ABM to mimic the multi-scale immune system with various phenotypes and types of cells as well as using the input and output of ABM to build up the Loess regression for key parameter estimation. Next, we employed the greedy algorithm to estimate the key parameters of the ABM with respect to the same experimental data set and used ABM to describe a 3D immune system similar to previous studies that employed the DE model. These results indicate that IABMR not only has the potential to simulate the immune system at various scales, phenotypes and cell types, but can also accurately infer the key parameters like the DE model. Therefore, this study innovatively developed a complex system development mechanism that could simulate the complicated immune system in detail like ABM and validate the reliability and efficiency of the model like DE by fitting the experimental data. PMID:26535589

  20. Extending SME to Handle Large-Scale Cognitive Modeling.

    PubMed

    Forbus, Kenneth D; Ferguson, Ronald W; Lovett, Andrew; Gentner, Dedre

    2017-07-01

    Analogy and similarity are central phenomena in human cognition, involved in processes ranging from visual perception to conceptual change. To capture this centrality requires that a model of comparison must be able to integrate with other processes and handle the size and complexity of the representations required by the tasks being modeled. This paper describes extensions to Structure-Mapping Engine (SME) since its inception in 1986 that have increased its scope of operation. We first review the basic SME algorithm, describe psychological evidence for SME as a process model, and summarize its role in simulating similarity-based retrieval and generalization. Then we describe five techniques now incorporated into the SME that have enabled it to tackle large-scale modeling tasks: (a) Greedy merging rapidly constructs one or more best interpretations of a match in polynomial time: O(n^2 log(n)); (b) Incremental operation enables mappings to be extended as new information is retrieved or derived about the base or target, to model situations where information in a task is updated over time; (c) Ubiquitous predicates model the varying degrees to which items may suggest alignment; (d) Structural evaluation of analogical inferences models aspects of plausibility judgments; (e) Match filters enable large-scale task models to communicate constraints to SME to influence the mapping process. We illustrate via examples from published studies how these enable it to capture a broader range of psychological phenomena than before. Copyright © 2016 Cognitive Science Society, Inc.

  1. Netgram: Visualizing Communities in Evolving Networks

    PubMed Central

    Mall, Raghvendra; Langone, Rocco; Suykens, Johan A. K.

    2015-01-01

    Real-world complex networks are dynamic in nature and change over time. The change is usually observed in the interactions within the network over time. Complex networks exhibit community like structures. A key feature of the dynamics of complex networks is the evolution of communities over time. Several methods have been proposed to detect and track the evolution of these groups over time. However, there is no generic tool which visualizes all the aspects of group evolution in dynamic networks including birth, death, splitting, merging, expansion, shrinkage and continuation of groups. In this paper, we propose Netgram: a tool for visualizing evolution of communities in time-evolving graphs. Netgram maintains evolution of communities over 2 consecutive time-stamps in tables which are used to create a query database using the sql outer-join operation. It uses a line-based visualization technique which adheres to certain design principles and aesthetic guidelines. Netgram uses a greedy solution to order the initial community information provided by the evolutionary clustering technique such that we have fewer line cross-overs in the visualization. This makes it easier to track the progress of individual communities in time evolving graphs. Netgram is a generic toolkit which can be used with any evolutionary community detection algorithm as illustrated in our experiments. We use Netgram for visualization of topic evolution in the NIPS conference over a period of 11 years and observe the emergence and merging of several disciplines in the field of information processing systems. PMID:26356538

  2. 3.5 GHz Environmental Sensing Capability Detection Thresholds and Deployment

    PubMed Central

    Nguyen, Thao T.; Souryal, Michael R.; Sahoo, Anirudha; Hall, Timothy A.

    2017-01-01

    Spectrum sharing in the 3.5 GHz band between commercial and government users along U.S. coastal areas depends on an environmental sensing capability (ESC)—that is, a network of radio frequency sensors and a decision system—to detect the presence of incumbent shipborne radar systems and trigger protective measures, as needed. It is well known that the sensitivity of these sensors depends on the aggregate interference generated by commercial systems to the incumbent radar receivers, but to date no comprehensive study has been made of the aggregate interference in realistic scenarios and its impact on the requirement for detection of the radar signal. This paper presents systematic methods for determining the placement of ESC sensors and their detection thresholds to adequately protect incumbent shipborne radar systems from harmful interference. Using terrain-based propagation models and a population-based deployment model, the analysis finds the offshore distances at which protection must be triggered and relates these to the detection levels of coastline sensors. We further show that sensor placement is a form of the well-known set cover problem, which has been shown to be NP-complete, and demonstrate practical solutions achieved with a greedy algorithm. Results show detection thresholds to be as much as 22 dB lower than required by current industry standards. The methodology and results presented in this paper can be used by ESC operators for planning and deployment of sensors and by regulators for testing sensor performance. PMID:29303162
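
    The abstract above casts sensor placement as a set cover problem solved greedily. The following is a minimal sketch of the textbook greedy set cover heuristic, not the authors' implementation; the site names and the numbered coverage regions are hypothetical toy data.

```python
def greedy_set_cover(universe, candidate_sets):
    """Greedy set cover: repeatedly pick the candidate that covers the
    most still-uncovered elements. Returns the chosen candidate names."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Candidate covering the largest number of uncovered elements
        # (ties go to the first candidate in dictionary order).
        best = max(candidate_sets,
                   key=lambda name: len(candidate_sets[name] & uncovered))
        gained = candidate_sets[best] & uncovered
        if not gained:
            raise ValueError("remaining elements cannot be covered")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical example: each candidate site covers some numbered regions.
sensors = {
    "site_A": {1, 2, 3},
    "site_B": {3, 4},
    "site_C": {4, 5, 6},
    "site_D": {1, 6},
}
plan = greedy_set_cover(range(1, 7), sensors)
print(plan)
```

    The heuristic does not guarantee a minimum-size cover, which is consistent with the problem being NP-complete; it trades optimality for speed.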

  3. The efficiency of average linkage hierarchical clustering algorithm associated multi-scale bootstrap resampling in identifying homogeneous precipitation catchments

    NASA Astrophysics Data System (ADS)

    Chuan, Zun Liang; Ismail, Noriszura; Shinyie, Wendy Ling; Lit Ken, Tan; Fam, Soo-Fen; Senawi, Azlyna; Yusoff, Wan Nur Syahidah Wan

    2018-04-01

    Because historical precipitation records are limited, agglomerative hierarchical clustering algorithms are widely used to extrapolate information from gauged to ungauged precipitation catchments, yielding more reliable projections of extreme hydro-meteorological events such as extreme precipitation events. However, accurately identifying the optimum number of homogeneous precipitation catchments from the dendrogram produced by agglomerative hierarchical algorithms is very subjective. The main objective of this study is to propose an efficient regionalized algorithm to identify the homogeneous precipitation catchments for non-stationary precipitation time series. The homogeneous precipitation catchments are identified using an average linkage hierarchical clustering algorithm combined with multi-scale bootstrap resampling, with the uncentered correlation coefficient as the similarity measure. The regionalized homogeneous precipitation is consolidated using the K-sample Anderson-Darling non-parametric test. The analysis results show that the proposed regionalized algorithm performed better than the agglomerative hierarchical clustering algorithms proposed in previous studies.

  4. BIPAD: A web server for modeling bipartite sequence elements

    PubMed Central

    Bi, Chengpeng; Rogan, Peter K

    2006-01-01

    Background Many dimeric protein complexes bind cooperatively to families of bipartite nucleic acid sequence elements, which consist of pairs of conserved half-site sequences separated by intervening distances that vary among individual sites. Results We introduce the Bipad Server [1], a web interface to predict sequence elements embedded within unaligned sequences. Either a bipartite model, consisting of a pair of one-block position weight matrices (PWM's) with a gap distribution, or a single PWM matrix for contiguous single block motifs may be produced. The Bipad program performs multiple local alignment by entropy minimization and cyclic refinement using a stochastic greedy search strategy. The best models are refined by maximizing incremental information contents among a set of potential models with varying half site and gap lengths. Conclusion The web service generates information positional weight matrices, identifies binding site motifs, graphically represents the set of discovered elements as a sequence logo, and depicts the gap distribution as a histogram. Server performance was evaluated by generating a collection of bipartite models for distinct DNA binding proteins. PMID:16503993

  5. Automated Recognition of 3D Features in GPIR Images

    NASA Technical Reports Server (NTRS)

    Park, Han; Stough, Timothy; Fijany, Amir

    2007-01-01

    A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature-extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects. In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts (see figure). Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
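
    The object-linking step described above can be illustrated with a small sketch. It assumes, purely for illustration, that each 2D feature is reduced to an (x, y) centre and that all features are of the same type; the function name, the data layout, and the toy coordinates are hypothetical and are not taken from the GPIR software.

```python
import math
from collections import defaultdict


def link_features(slices, radius):
    """Link features in successive slices that lie within `radius`.

    slices: list of slices, each a list of (x, y) feature centres.
    Returns a directed graph as a dict mapping (slice_index, feature_index)
    to the list of linked nodes in the next slice.
    """
    graph = defaultdict(list)
    for i in range(len(slices) - 1):
        for a, (xa, ya) in enumerate(slices[i]):
            for b, (xb, yb) in enumerate(slices[i + 1]):
                # Edge points from slice i to slice i + 1 when the
                # centres are within the threshold radius.
                if math.hypot(xb - xa, yb - ya) <= radius:
                    graph[(i, a)].append((i + 1, b))
    return dict(graph)


# Hypothetical toy data: one pipe-like feature drifting slowly across
# three slices, plus two unrelated detections.
slices = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(0.5, 0.2), (30.0, 30.0)],
    [(1.0, 0.3)],
]
links = link_features(slices, radius=1.0)
print(links)
```

    Following the edges from a node in the first slice traces one candidate 3D object through the stack; isolated detections produce no edges.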

  6. Diagnosis of paediatric HIV infection in a primary health care setting with a clinical algorithm.

    PubMed Central

    Horwood, C.; Liebeschuetz, S.; Blaauw, D.; Cassol, S.; Qazi, S.

    2003-01-01

    OBJECTIVE: To determine the validity of an algorithm used by primary care health workers to identify children with symptomatic human immunodeficiency virus (HIV) infection. This HIV algorithm is being implemented in South Africa as part of the Integrated Management of Childhood Illness (IMCI), a strategy that aims to improve childhood morbidity and mortality by improving care at the primary care level. As AIDS is a leading cause of death in children in southern Africa, diagnosis and management of symptomatic HIV infection was added to the existing IMCI algorithm. METHODS: In total, 690 children who attended the outpatients department in a district hospital in South Africa were assessed with the HIV algorithm and by a paediatrician. All children were then tested for HIV viral load. The validity of the algorithm in detecting symptomatic HIV was compared with clinical diagnosis by a paediatrician and the result of an HIV test. Detailed clinical data were used to improve the algorithm. FINDINGS: Overall, 198 (28.7%) enrolled children were infected with HIV. The paediatrician correctly identified 142 (71.7%) children infected with HIV, whereas the IMCI/HIV algorithm identified 111 (56.1%). Odds ratios were calculated to identify predictors of HIV infection and used to develop an improved HIV algorithm that is 67.2% sensitive and 81.5% specific in clinically detecting HIV infection. CONCLUSIONS: Children with symptomatic HIV infection can be identified effectively by primary level health workers through the use of an algorithm. The improved HIV algorithm developed in this study could be used by countries with high prevalences of HIV to enable IMCI practitioners to identify and care for HIV-infected children. PMID:14997238

  7. Chiari malformation Type I surgery in pediatric patients. Part 1: validation of an ICD-9-CM code search algorithm.

    PubMed

    Ladner, Travis R; Greenberg, Jacob K; Guerrero, Nicole; Olsen, Margaret A; Shannon, Chevis N; Yarbrough, Chester K; Piccirillo, Jay F; Anderson, Richard C E; Feldstein, Neil A; Wellons, John C; Smyth, Matthew D; Park, Tae Sung; Limbrick, David D

    2016-05-01

    OBJECTIVE Administrative billing data may facilitate large-scale assessments of treatment outcomes for pediatric Chiari malformation Type I (CM-I). Validated International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) code algorithms for identifying CM-I surgery are critical prerequisites for such studies but are currently only available for adults. The objective of this study was to validate two ICD-9-CM code algorithms using hospital billing data to identify pediatric patients undergoing CM-I decompression surgery. METHODS The authors retrospectively analyzed the validity of two ICD-9-CM code algorithms for identifying pediatric CM-I decompression surgery performed at 3 academic medical centers between 2001 and 2013. Algorithm 1 included any discharge diagnosis code of 348.4 (CM-I), as well as a procedure code of 01.24 (cranial decompression) or 03.09 (spinal decompression or laminectomy). Algorithm 2 restricted this group to the subset of patients with a primary discharge diagnosis of 348.4. The positive predictive value (PPV) and sensitivity of each algorithm were calculated. RESULTS Among 625 first-time admissions identified by Algorithm 1, the overall PPV for CM-I decompression was 92%. Among the 581 admissions identified by Algorithm 2, the PPV was 97%. The PPV for Algorithm 1 was lower in one center (84%) compared with the other centers (93%-94%), whereas the PPV of Algorithm 2 remained high (96%-98%) across all subgroups. The sensitivity of Algorithms 1 (91%) and 2 (89%) was very good and remained so across subgroups (82%-97%). CONCLUSIONS An ICD-9-CM algorithm requiring a primary diagnosis of CM-I has excellent PPV and very good sensitivity for identifying CM-I decompression surgery in pediatric patients. These results establish a basis for utilizing administrative billing data to assess pediatric CM-I treatment outcomes.
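
    The two code-search algorithms in the abstract above amount to simple predicates over an admission record. The sketch below uses the ICD-9-CM codes quoted in the abstract; the record layout (a dict with `diagnoses` and `procedures` lists, primary diagnosis first) and the placeholder codes are assumptions made for illustration only.

```python
CM1_DIAGNOSIS = "348.4"                    # CM-I, as quoted in the abstract
DECOMPRESSION_PROCEDURES = {"01.24", "03.09"}


def algorithm_1(admission):
    """Any discharge diagnosis of 348.4 plus a decompression procedure."""
    has_dx = CM1_DIAGNOSIS in admission["diagnoses"]
    has_proc = bool(DECOMPRESSION_PROCEDURES & set(admission["procedures"]))
    return has_dx and has_proc


def algorithm_2(admission):
    """Subset of Algorithm 1 whose primary diagnosis is 348.4.

    Assumes the first entry of `diagnoses` is the primary diagnosis.
    """
    return algorithm_1(admission) and admission["diagnoses"][0] == CM1_DIAGNOSIS


# Hypothetical admissions ("OTHER_DX" and "OTHER_PROC" are placeholders).
adm_a = {"diagnoses": ["348.4", "OTHER_DX"], "procedures": ["01.24"]}
adm_b = {"diagnoses": ["OTHER_DX", "348.4"], "procedures": ["03.09"]}
adm_c = {"diagnoses": ["348.4"], "procedures": ["OTHER_PROC"]}

for adm in (adm_a, adm_b, adm_c):
    print(algorithm_1(adm), algorithm_2(adm))
```

    Because Algorithm 2 only narrows Algorithm 1, it can never flag more admissions, which matches the smaller count (581 versus 625) reported in the abstract.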

  8. Novel Spectral Representations and Sparsity-Driven Algorithms for Shape Modeling and Analysis

    NASA Astrophysics Data System (ADS)

    Zhong, Ming

    In this dissertation, we focus on extending classical spectral shape analysis by incorporating spectral graph wavelets and sparsity-seeking algorithms. Defined with the graph Laplacian eigenbasis, the spectral graph wavelets are localized both in the vertex domain and graph spectral domain, and thus are very effective in describing local geometry. With a rich dictionary of elementary vectors and forcing certain sparsity constraints, a real life signal can often be well approximated by a very sparse coefficient representation. The many successful applications of sparse signal representation in computer vision and image processing inspire us to explore the idea of employing sparse modeling techniques with a dictionary of spectral bases to solve various shape modeling problems. Conventional spectral mesh compression uses the eigenfunctions of the mesh Laplacian as shape bases, which are highly inefficient in representing local geometry. To address this, we advocate an innovative approach to 3D mesh compression using spectral graph wavelets as a dictionary to encode mesh geometry. The spectral graph wavelets are locally defined at individual vertices and can better capture local shape information than the Laplacian eigenbasis. The multi-scale SGWs form a redundant dictionary as shape basis, so we formulate the compression of 3D shape as a sparse approximation problem that can be readily handled by greedy pursuit algorithms. Surface inpainting refers to the completion or recovery of missing shape geometry based on the shape information that is currently available. We devise a new surface inpainting algorithm founded upon the theory and techniques of sparse signal recovery. Instead of estimating the missing geometry directly, our novel method is to find a low-dimensional representation which describes the entire original shape.
More specifically, we find that, for many shapes, the vertex coordinate function can be well approximated by a very sparse coefficient representation with respect to the dictionary comprising its Laplacian eigenbasis, and it is then possible to recover this sparse representation from partial measurements of the original shape. Taking advantage of the sparsity cue, we advocate a novel variational approach for surface inpainting, integrating data fidelity constraints on the shape domain with coefficient sparsity constraints on the transformed domain. Because of the powerful properties of Laplacian eigenbasis, the inpainting results of our method tend to be globally coherent with the remaining shape. Informative and discriminative feature descriptors are vital in qualitative and quantitative shape analysis for a large variety of graphics applications. We advocate novel strategies to define generalized, user-specified features on shapes. Our new region descriptors are primarily built upon the coefficients of spectral graph wavelets that are both multi-scale and multi-level in nature, consisting of both local and global information. Based on our novel spectral feature descriptor, we developed a user-specified feature detection framework and a tensor-based shape matching algorithm. Through various experiments, we demonstrate the competitive performance of our proposed methods and the great potential of spectral basis and sparsity-driven methods for shape modeling.
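
    The abstract above refers to greedy pursuit algorithms for sparse approximation over a redundant dictionary. As a generic illustration, the sketch below implements plain matching pursuit, one common member of that family; it is not claimed to be the variant used in the dissertation. The dictionary atoms are assumed to have unit norm, and the toy dictionary and signal are hypothetical.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Plain matching pursuit over a dictionary of unit-norm atoms.

    At each step, pick the atom most correlated with the residual, add
    that correlation to its coefficient, and subtract its contribution.
    Returns (coefficients, final residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Inner product of every atom with the current residual.
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        coeffs[k] += corr[k]
        residual = [r - corr[k] * a for r, a in zip(residual, atoms[k])]
    return coeffs, residual


# Hypothetical redundant dictionary: the standard basis of R^3 plus one
# extra unit-norm atom.
atoms = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.6, 0.8, 0.0],
]
# Signal built as 5 * atoms[3] + 2 * atoms[2], so two atoms suffice.
signal = [3.0, 4.0, 2.0]
coeffs, residual = matching_pursuit(signal, atoms, n_iter=2)
print(coeffs, residual)
```

    In this toy case two iterations recover the two-atom representation up to floating-point error, whereas the standard basis alone would need three coefficients.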

  9. Avoiding "greedy reductionism" in personality theory. Comment on "Personality from a cognitive-biological perspective" by Y. Neuman

    NASA Astrophysics Data System (ADS)

    Smillie, Luke D.; Zhao, Kun; Barford, Kate A.

    2014-12-01

    Personality traits - i.e., broad descriptions of regularities in behaviour and experience - can be parsimoniously organised in terms of five trait 'domains' [8]. This is demonstrated by Neuman's [12] observation of the overlap between these 'Big Five' domains and traits derived from Panksepp's Affective Neuroscience framework [13]. This overlap reflects the fact that the Big Five - which can be recovered from factor analyses of questionnaires designed to measure other trait systems [2,10] - represent the major dimensions of covariation among all personality traits [5].

  10. A systematic review of validated methods for identifying hypersensitivity reactions other than anaphylaxis (fever, rash, and lymphadenopathy), using administrative and claims data.

    PubMed

    Schneider, Gary; Kachroo, Sumesh; Jones, Natalie; Crean, Sheila; Rotella, Philip; Avetisyan, Ruzan; Reynolds, Matthew W

    2012-01-01

    The Food and Drug Administration's Mini-Sentinel pilot program aims to conduct active surveillance to refine safety signals that emerge for marketed medical products. A key facet of this surveillance is to develop and understand the validity of algorithms for identifying health outcomes of interest from administrative and claims data. This article summarizes the process and findings of the algorithm review of hypersensitivity reactions. PubMed and Iowa Drug Information Service searches were conducted to identify citations applicable to the hypersensitivity reactions of health outcomes of interest. Level 1 abstract reviews and Level 2 full-text reviews were conducted to find articles using administrative and claims data to identify hypersensitivity reactions and including validation estimates of the coding algorithms. We identified five studies that provided validated hypersensitivity-reaction algorithms. Algorithm positive predictive values (PPVs) for various definitions of hypersensitivity reactions ranged from 3% to 95%. PPVs were high (i.e. 90%-95%) when both exposures and diagnoses were very specific. PPV generally decreased when the definition of hypersensitivity was expanded, except in one study that used data mining methodology for algorithm development. The ability of coding algorithms to identify hypersensitivity reactions varied, with decreasing performance occurring with expanded outcome definitions. This examination of hypersensitivity-reaction coding algorithms provides an example of surveillance bias resulting from outcome definitions that include mild cases. Data mining may provide tools for algorithm development for hypersensitivity and other health outcomes. Research needs to be conducted on designing validation studies to test hypersensitivity-reaction algorithms and estimating their predictive power, sensitivity, and specificity. Copyright © 2012 John Wiley & Sons, Ltd.

  11. New algorithms for identifying the flavour of [Formula: see text] mesons using pions and protons.

    PubMed

    Aaij, R; Adeva, B; Adinolfi, M; Ajaltouni, Z; Akar, S; Albrecht, J; Alessio, F; Alexander, M; Ali, S; Alkhazov, G; Alvarez Cartelle, P; Alves, A A; Amato, S; Amerio, S; Amhis, Y; An, L; Anderlini, L; Andreassi, G; Andreotti, M; Andrews, J E; Appleby, R B; Archilli, F; d'Argent, P; Arnau Romeu, J; Artamonov, A; Artuso, M; Aslanides, E; Auriemma, G; Baalouch, M; Babuschkin, I; Bachmann, S; Back, J J; Badalov, A; Baesso, C; Baker, S; Baldini, W; Barlow, R J; Barschel, C; Barsuk, S; Barter, W; Baszczyk, M; Batozskaya, V; Batsukh, B; Battista, V; Bay, A; Beaucourt, L; Beddow, J; Bedeschi, F; Bediaga, I; Bel, L J; Bellee, V; Belloli, N; Belous, K; Belyaev, I; Ben-Haim, E; Bencivenni, G; Benson, S; Benton, J; Berezhnoy, A; Bernet, R; Bertolin, A; Betti, F; Bettler, M-O; van Beuzekom, M; Bezshyiko, Ia; Bifani, S; Billoir, P; Bird, T; Birnkraut, A; Bitadze, A; Bizzeti, A; Blake, T; Blanc, F; Blouw, J; Blusk, S; Bocci, V; Boettcher, T; Bondar, A; Bondar, N; Bonivento, W; Bordyuzhin, I; Borgheresi, A; Borghi, S; Borisyak, M; Borsato, M; Bossu, F; Boubdir, M; Bowcock, T J V; Bowen, E; Bozzi, C; Braun, S; Britsch, M; Britton, T; Brodzicka, J; Buchanan, E; Burr, C; Bursche, A; Buytaert, J; Cadeddu, S; Calabrese, R; Calvi, M; Calvo Gomez, M; Camboni, A; Campana, P; Campora Perez, D; Campora Perez, D H; Capriotti, L; Carbone, A; Carboni, G; Cardinale, R; Cardini, A; Carniti, P; Carson, L; Carvalho Akiba, K; Casse, G; Cassina, L; Castillo Garcia, L; Cattaneo, M; Cauet, Ch; Cavallero, G; Cenci, R; Charles, M; Charpentier, Ph; Chatzikonstantinidis, G; Chefdeville, M; Chen, S; Cheung, S F; Chobanova, V; Chrzaszcz, M; Cid Vidal, X; Ciezarek, G; Clarke, P E L; Clemencic, M; Cliff, H V; Closier, J; Coco, V; Cogan, J; Cogneras, E; Cogoni, V; Cojocariu, L; Collins, P; Comerma-Montells, A; Contu, A; Cook, A; Coombs, G; Coquereau, S; Corti, G; Corvo, M; Costa Sobral, C M; Couturier, B; Cowan, G A; Craik, D C; Crocombe, A; Cruz Torres, M; Cunliffe, S; Currie, R; D'Ambrosio, C; Da Cunha 
Marinho, F; Dall'Occo, E; Dalseno, J; David, P N Y; Davis, A; De Aguiar Francisco, O; De Bruyn, K; De Capua, S; De Cian, M; De Miranda, J M; De Paula, L; De Serio, M; De Simone, P; Dean, C T; Decamp, D; Deckenhoff, M; Del Buono, L; Demmer, M; Dendek, A; Derkach, D; Deschamps, O; Dettori, F; Dey, B; Di Canto, A; Dijkstra, H; Dordei, F; Dorigo, M; Dosil Suárez, A; Dovbnya, A; Dreimanis, K; Dufour, L; Dujany, G; Dungs, K; Durante, P; Dzhelyadin, R; Dziurda, A; Dzyuba, A; Déléage, N; Easo, S; Ebert, M; Egede, U; Egorychev, V; Eidelman, S; Eisenhardt, S; Eitschberger, U; Ekelhof, R; Eklund, L; Elsasser, Ch; Ely, S; Esen, S; Evans, H M; Evans, T; Falabella, A; Farley, N; Farry, S; Fay, R; Fazzini, D; Ferguson, D; Fernandez Prieto, A; Ferrari, F; Ferreira Rodrigues, F; Ferro-Luzzi, M; Filippov, S; Fini, R A; Fiore, M; Fiorini, M; Firlej, M; Fitzpatrick, C; Fiutowski, T; Fleuret, F; Fohl, K; Fontana, M; Fontanelli, F; Forshaw, D C; Forty, R; Franco Lima, V; Frank, M; Frei, C; Fu, J; Furfaro, E; Färber, C; Gallas Torreira, A; Galli, D; Gallorini, S; Gambetta, S; Gandelman, M; Gandini, P; Gao, Y; Garcia Martin, L M; García Pardiñas, J; Garra Tico, J; Garrido, L; Garsed, P J; Gascon, D; Gaspar, C; Gavardi, L; Gazzoni, G; Gerick, D; Gersabeck, E; Gersabeck, M; Gershon, T; Ghez, Ph; Gianì, S; Gibson, V; Girard, O G; Giubega, L; Gizdov, K; Gligorov, V V; Golubkov, D; Golutvin, A; Gomes, A; Gorelov, I V; Gotti, C; Grabalosa Gándara, M; Graciani Diaz, R; Granado Cardoso, L A; Graugés, E; Graverini, E; Graziani, G; Grecu, A; Griffith, P; Grillo, L; Gruberg Cazon, B R; Grünberg, O; Gushchin, E; Guz, Yu; Gys, T; Göbel, C; Hadavizadeh, T; Hadjivasiliou, C; Haefeli, G; Haen, C; Haines, S C; Hall, S; Hamilton, B; Han, X; Hansmann-Menzemer, S; Harnew, N; Harnew, S T; Harrison, J; Hatch, M; He, J; Head, T; Heister, A; Hennessy, K; Henrard, P; Henry, L; Hernando Morata, J A; van Herwijnen, E; Heß, M; Hicheur, A; Hill, D; Hombach, C; Hopchev, P H; Hulsbergen, W; Humair, T; Hushchyn, M; 
Hussain, N; Hutchcroft, D; Idzik, M; Ilten, P; Jacobsson, R; Jaeger, A; Jalocha, J; Jans, E; Jawahery, A; Jiang, F; John, M; Johnson, D; Jones, C R; Joram, C; Jost, B; Jurik, N; Kandybei, S; Kanso, W; Karacson, M; Kariuki, J M; Karodia, S; Kecke, M; Kelsey, M; Kenyon, I R; Kenzie, M; Ketel, T; Khairullin, E; Khanji, B; Khurewathanakul, C; Kirn, T; Klaver, S; Klimaszewski, K; Koliiev, S; Kolpin, M; Komarov, I; Koopman, R F; Koppenburg, P; Kosmyntseva, A; Kozeiha, M; Kravchuk, L; Kreplin, K; Kreps, M; Krokovny, P; Kruse, F; Krzemien, W; Kucewicz, W; Kucharczyk, M; Kudryavtsev, V; Kuonen, A K; Kurek, K; Kvaratskheliya, T; Lacarrere, D; Lafferty, G; Lai, A; Lambert, D; Lanfranchi, G; Langenbruch, C; Latham, T; Lazzeroni, C; Le Gac, R; van Leerdam, J; Lees, J-P; Leflat, A; Lefrançois, J; Lefèvre, R; Lemaitre, F; Lemos Cid, E; Leroy, O; Lesiak, T; Leverington, B; Li, Y; Likhomanenko, T; Lindner, R; Linn, C; Lionetto, F; Liu, B; Liu, X; Loh, D; Longstaff, I; Lopes, J H; Lucchesi, D; Lucio Martinez, M; Luo, H; Lupato, A; Luppi, E; Lupton, O; Lusiani, A; Lyu, X; Machefert, F; Maciuc, F; Maev, O; Maguire, K; Malde, S; Malinin, A; Maltsev, T; Manca, G; Mancinelli, G; Manning, P; Maratas, J; Marchand, J F; Marconi, U; Marin Benito, C; Marino, P; Marks, J; Martellotti, G; Martin, M; Martinelli, M; Martinez Santos, D; Martinez Vidal, F; Martins Tostes, D; Massacrier, L M; Massafferri, A; Matev, R; Mathad, A; Mathe, Z; Matteuzzi, C; Mauri, A; Maurin, B; Mazurov, A; McCann, M; McCarthy, J; McNab, A; McNulty, R; Meadows, B; Meier, F; Meissner, M; Melnychuk, D; Merk, M; Merli, A; Michielin, E; Milanes, D A; Minard, M-N; Mitzel, D S; Mogini, A; Molina Rodriguez, J; Monroy, I A; Monteil, S; Morandin, M; Morawski, P; Mordà, A; Morello, M J; Moron, J; Morris, A B; Mountain, R; Muheim, F; Mulder, M; Mussini, M; Müller, D; Müller, J; Müller, K; Müller, V; Naik, P; Nakada, T; Nandakumar, R; Nandi, A; Nasteva, I; Needham, M; Neri, N; Neubert, S; Neufeld, N; Neuner, M; Nguyen, A D; Nguyen, T 
D; Nguyen-Mau, C; Nieswand, S; Niet, R; Nikitin, N; Nikodem, T; Novoselov, A; O'Hanlon, D P; Oblakowska-Mucha, A; Obraztsov, V; Ogilvy, S; Oldeman, R; Onderwater, C J G; Otalora Goicochea, J M; Otto, A; Owen, P; Oyanguren, A; Pais, P R; Palano, A; Palombo, F; Palutan, M; Panman, J; Papanestis, A; Pappagallo, M; Pappalardo, L L; Parker, W; Parkes, C; Passaleva, G; Pastore, A; Patel, G D; Patel, M; Patrignani, C; Pearce, A; Pellegrino, A; Penso, G; Pepe Altarelli, M; Perazzini, S; Perret, P; Pescatore, L; Petridis, K; Petrolini, A; Petrov, A; Petruzzo, M; Picatoste Olloqui, E; Pietrzyk, B; Pikies, M; Pinci, D; Pistone, A; Piucci, A; Playfer, S; Plo Casasus, M; Poikela, T; Polci, F; Poluektov, A; Polyakov, I; Polycarpo, E; Pomery, G J; Popov, A; Popov, D; Popovici, B; Poslavskii, S; Potterat, C; Price, E; Price, J D; Prisciandaro, J; Pritchard, A; Prouve, C; Pugatch, V; Puig Navarro, A; Punzi, G; Qian, W; Quagliani, R; Rachwal, B; Rademacker, J H; Rama, M; Ramos Pernas, M; Rangel, M S; Raniuk, I; Ratnikov, F; Raven, G; Redi, F; Reichert, S; Dos Reis, A C; Remon Alepuz, C; Renaudin, V; Ricciardi, S; Richards, S; Rihl, M; Rinnert, K; Rives Molina, V; Robbe, P; Rodrigues, A B; Rodrigues, E; Rodriguez Lopez, J A; Rodriguez Perez, P; Rogozhnikov, A; Roiser, S; Rollings, A; Romanovskiy, V; Romero Vidal, A; Ronayne, J W; Rotondo, M; Rudolph, M S; Ruf, T; Ruiz Valls, P; Saborido Silva, J J; Sadykhov, E; Sagidova, N; Saitta, B; Salustino Guimaraes, V; Sanchez Mayordomo, C; Sanmartin Sedes, B; Santacesaria, R; Santamarina Rios, C; Santimaria, M; Santovetti, E; Sarti, A; Satriano, C; Satta, A; Saunders, D M; Savrina, D; Schael, S; Schellenberg, M; Schiller, M; Schindler, H; Schlupp, M; Schmelling, M; Schmelzer, T; Schmidt, B; Schneider, O; Schopper, A; Schubert, K; Schubiger, M; Schune, M-H; Schwemmer, R; Sciascia, B; Sciubba, A; Semennikov, A; Sergi, A; Serra, N; Serrano, J; Sestini, L; Seyfert, P; Shapkin, M; Shapoval, I; Shcheglov, Y; Shears, T; Shekhtman, L; Shevchenko, V; 
Shires, A; Siddi, B G; Silva Coutinho, R; Silva de Oliveira, L; Simi, G; Simone, S; Sirendi, M; Skidmore, N; Skwarnicki, T; Smith, E; Smith, I T; Smith, J; Smith, M; Snoek, H; Sokoloff, M D; Soler, F J P; Souza De Paula, B; Spaan, B; Spradlin, P; Sridharan, S; Stagni, F; Stahl, M; Stahl, S; Stefko, P; Stefkova, S; Steinkamp, O; Stemmle, S; Stenyakin, O; Stevenson, S; Stoica, S; Stone, S; Storaci, B; Stracka, S; Straticiuc, M; Straumann, U; Sun, L; Sutcliffe, W; Swientek, K; Syropoulos, V; Szczekowski, M; Szumlak, T; T'Jampens, S; Tayduganov, A; Tekampe, T; Teklishyn, M; Tellarini, G; Teubert, F; Thomas, E; van Tilburg, J; Tilley, M J; Tisserand, V; Tobin, M; Tolk, S; Tomassetti, L; Tonelli, D; Topp-Joergensen, S; Toriello, F; Tournefier, E; Tourneur, S; Trabelsi, K; Traill, M; Tran, M T; Tresch, M; Trisovic, A; Tsaregorodtsev, A; Tsopelas, P; Tully, A; Tuning, N; Ukleja, A; Ustyuzhanin, A; Uwer, U; Vacca, C; Vagnoni, V; Valassi, A; Valat, S; Valenti, G; Vallier, A; Vazquez Gomez, R; Vazquez Regueiro, P; Vecchi, S; van Veghel, M; Velthuis, J J; Veltri, M; Veneziano, G; Venkateswaran, A; Vernet, M; Vesterinen, M; Viaud, B; Vieira, D; Vieites Diaz, M; Vilasis-Cardona, X; Volkov, V; Vollhardt, A; Voneki, B; Vorobyev, A; Vorobyev, V; Voß, C; de Vries, J A; Vázquez Sierra, C; Waldi, R; Wallace, C; Wallace, R; Walsh, J; Wang, J; Ward, D R; Wark, H M; Watson, N K; Websdale, D; Weiden, A; Whitehead, M; Wicht, J; Wilkinson, G; Wilkinson, M; Williams, M; Williams, M P; Williams, M; Williams, T; Wilson, F F; Wimberley, J; Wishahi, J; Wislicki, W; Witek, M; Wormser, G; Wotton, S A; Wraight, K; Wyllie, K; Xie, Y; Xu, Z; Yang, Z; Yin, H; Yu, J; Yuan, X; Yushchenko, O; Zarebski, K A; Zavertyaev, M; Zhang, L; Zhang, Y; Zhelezov, A; Zheng, Y; Zhokhov, A; Zhu, X; Zhukov, V; Zucchelli, S

    2017-01-01

    Two new algorithms for use in the analysis of [Formula: see text] collisions are developed to identify the flavour of [Formula: see text] mesons at production using pions and protons from the hadronization process. The algorithms are optimized and calibrated on data, using [Formula: see text] decays from [Formula: see text] collision data collected by LHCb at centre-of-mass energies of 7 and 8 TeV. The tagging power of the new pion algorithm is 60% greater than the previously available one; the algorithm using protons to identify the flavour of a [Formula: see text] meson is the first of its kind.

  12. Validation of two algorithms for managing children with a non-blanching rash.

    PubMed

    Riordan, F Andrew I; Jones, Laura; Clark, Julia

    2016-08-01

    Paediatricians are concerned that children who present with a non-blanching rash (NBR) may have meningococcal disease (MCD). Two algorithms have been devised to help identify which children with an NBR have MCD. This study evaluated the NBR algorithms' ability to identify children with MCD. The Newcastle-Birmingham-Liverpool (NBL) algorithm was applied retrospectively to three cohorts of children who had presented with NBRs. This algorithm was also piloted in four hospitals, and then used prospectively for 12 months in one hospital. The National Institute for Health and Care Excellence (NICE) algorithm was validated retrospectively using data from all cohorts. The cohorts included 625 children, 145 (23%) of whom had confirmed or probable MCD. Paediatricians empirically treated 324 (52%) children with antibiotics. The NBL algorithm identified all children with MCD and suggested treatment for a further 86 children (sensitivity 100%, specificity 82%). One child with MCD did not receive immediate antibiotic treatment, despite this being suggested by the algorithm. The NICE algorithm suggested that 382 children (61%) should be treated with antibiotics. This included 141 of the 145 children with MCD (sensitivity 97%, specificity 50%). These algorithms may help paediatricians identify children with MCD who present with NBRs. The NBL algorithm may be more specific than the NICE algorithm as it includes fewer features suggesting MCD. The only significant delay in treatment of MCD occurred when the algorithms were not followed. Published by the BMJ Publishing Group Limited.
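
    The sensitivity and specificity figures quoted above can be re-derived from the counts given in the abstract. The following is a minimal Python sketch; the function name and argument layout are illustrative assumptions, not taken from the study.

```python
def sensitivity_specificity(n_total, n_disease, n_flagged, n_flagged_with_disease):
    """Return (sensitivity, specificity) from simple cohort counts."""
    true_pos = n_flagged_with_disease
    false_neg = n_disease - true_pos
    false_pos = n_flagged - true_pos
    true_neg = (n_total - n_disease) - false_pos
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Counts quoted in the abstract: 625 children, 145 with MCD.
# NBL flagged all 145 children with MCD plus a further 86 children.
nbl = sensitivity_specificity(625, 145, 145 + 86, 145)
# NICE flagged 382 children, including 141 of the 145 with MCD.
nice = sensitivity_specificity(625, 145, 382, 141)

print(f"NBL:  sensitivity {nbl[0]:.0%}, specificity {nbl[1]:.0%}")
print(f"NICE: sensitivity {nice[0]:.0%}, specificity {nice[1]:.0%}")
```

    Rounded to whole percentages, both results agree with the values reported in the abstract.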

  13. A Systematic Review of Validated Methods for Identifying Cerebrovascular Accident or Transient Ischemic Attack Using Administrative Data

    PubMed Central

    Andrade, Susan E.; Harrold, Leslie R.; Tjia, Jennifer; Cutrona, Sarah L.; Saczynski, Jane S.; Dodd, Katherine S.; Goldberg, Robert J.; Gurwitz, Jerry H.

    2012-01-01

    Purpose To perform a systematic review of the validity of algorithms for identifying cerebrovascular accidents (CVAs) or transient ischemic attacks (TIAs) using administrative and claims data. Methods PubMed and Iowa Drug Information Service (IDIS) searches of the English language literature were performed to identify studies published between 1990 and 2010 that evaluated the validity of algorithms for identifying CVAs (ischemic and hemorrhagic strokes, intracranial hemorrhage and subarachnoid hemorrhage) and/or TIAs in administrative data. Two study investigators independently reviewed the abstracts and articles to determine relevant studies according to pre-specified criteria. Results A total of 35 articles met the criteria for evaluation. Of these, 26 articles provided data to evaluate the validity of stroke, 7 reported the validity of TIA, 5 reported the validity of intracranial bleeds (intracerebral hemorrhage and subarachnoid hemorrhage), and 10 studies reported the validity of algorithms to identify the composite endpoints of stroke/TIA or cerebrovascular disease. Positive predictive values (PPVs) varied depending on the specific outcomes and algorithms evaluated. Specific algorithms to evaluate the presence of stroke and intracranial bleeds were found to have high PPVs (80% or greater). Algorithms to evaluate TIAs in adult populations were generally found to have PPVs of 70% or greater. Conclusions The algorithms and definitions to identify CVAs and TIAs using administrative and claims data differ greatly in the published literature. The choice of the algorithm employed should be determined by the stroke subtype of interest. PMID:22262598

  14. Automatable algorithms to identify nonmedical opioid use using electronic data: a systematic review.

    PubMed

    Canan, Chelsea; Polinski, Jennifer M; Alexander, G Caleb; Kowal, Mary K; Brennan, Troyen A; Shrank, William H

    2017-11-01

    Improved methods to identify nonmedical opioid use can help direct health care resources to individuals who need them. Automated algorithms that use large databases of electronic health care claims or records for surveillance are a potential means to achieve this goal. In this systematic review, we reviewed the utility, attempts at validation, and application of such algorithms to detect nonmedical opioid use. We searched PubMed and Embase for articles describing automatable algorithms that used electronic health care claims or records to identify patients or prescribers with likely nonmedical opioid use. We assessed algorithm development, validation, and performance characteristics and the settings where they were applied. Study variability precluded a meta-analysis. Of 15 included algorithms, 10 targeted patients, 2 targeted providers, 2 targeted both, and 1 identified medications with high abuse potential. Most patient-focused algorithms (67%) used prescription drug claims and/or medical claims, with diagnosis codes of substance abuse and/or dependence as the reference standard. Eleven algorithms were developed via regression modeling. Four used natural language processing, data mining, audit analysis, or factor analysis. Automated algorithms can facilitate population-level surveillance. However, there is no true gold standard for determining nonmedical opioid use. Users must recognize the implications of identifying false positives and, conversely, false negatives. Few algorithms have been applied in real-world settings. Automated algorithms may facilitate identification of patients and/or providers most likely to need more intensive screening and/or intervention for nonmedical opioid use. Additional implementation research in real-world settings would clarify their utility. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. 

  15. Identifying Psoriasis and Psoriatic Arthritis Patients in Retrospective Databases When Diagnosis Codes Are Not Available: A Validation Study Comparing Medication/Prescriber Visit-Based Algorithms with Diagnosis Codes.

    PubMed

    Dobson-Belaire, Wendy; Goodfield, Jason; Borrelli, Richard; Liu, Fei Fei; Khan, Zeba M

    2018-01-01

    Using diagnosis code-based algorithms is the primary method of identifying patient cohorts for retrospective studies; nevertheless, many databases lack reliable diagnosis code information. To develop precise algorithms based on medication claims/prescriber visits (MCs/PVs) to identify psoriasis (PsO) patients and psoriatic patients with arthritic conditions (PsO-AC), a proxy for psoriatic arthritis, in Canadian databases lacking diagnosis codes. Algorithms were developed using medications with narrow indication profiles in combination with prescriber specialty to define PsO and PsO-AC. For a 3-year study period from July 1, 2009, algorithms were validated using the PharMetrics Plus database, which contains both adjudicated medication claims and diagnosis codes. Positive predictive value (PPV), negative predictive value (NPV), sensitivity, and specificity of the developed algorithms were assessed using diagnosis code as the reference standard. Chosen algorithms were then applied to Canadian drug databases to profile the algorithm-identified PsO and PsO-AC cohorts. In the selected database, 183,328 patients were identified for validation. The highest PPVs for PsO (85%) and PsO-AC (65%) occurred when a predictive algorithm of two or more MCs/PVs was compared with the reference standard of one or more diagnosis codes. NPV and specificity were high (99%-100%), whereas sensitivity was low (≤30%). Reducing the number of MCs/PVs or increasing diagnosis claims decreased the algorithms' PPVs. We have developed an MC/PV-based algorithm to identify PsO patients with a high degree of accuracy, but accuracy for PsO-AC requires further investigation. Such methods allow researchers to conduct retrospective studies in databases in which diagnosis codes are absent. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  16. Development and validation of an algorithm for identifying urinary retention in a cohort of patients with epilepsy in a large US administrative claims database.

    PubMed

    Quinlan, Scott C; Cheng, Wendy Y; Ishihara, Lianna; Irizarry, Michael C; Holick, Crystal N; Duh, Mei Sheng

    2016-04-01

    The aim of this study was to develop and validate an insurance claims-based algorithm for identifying urinary retention (UR) in epilepsy patients receiving antiepileptic drugs to facilitate safety monitoring. Data from the HealthCore Integrated Research Database(SM) in 2008-2011 (retrospective) and 2012-2013 (prospective) were used to identify epilepsy patients with UR. During the retrospective phase, three algorithms identified potential UR: (i) UR diagnosis code with a catheterization procedure code; (ii) UR diagnosis code alone; or (iii) diagnosis with UR-related symptoms. Medical records for 50 randomly selected patients satisfying ≥1 algorithm were reviewed by urologists to ascertain UR status. Positive predictive value (PPV) and 95% confidence intervals (CI) were calculated for the three component algorithms and the overall algorithm (defined as satisfying ≥1 component algorithms). Algorithms were refined using urologist review notes. In the prospective phase, the UR algorithm was refined using medical records for an additional 150 cases. In the retrospective phase, the PPV of the overall algorithm was 72.0% (95%CI: 57.5-83.8%). Algorithm 3 performed poorly and was dropped. Algorithm 1 was unchanged; urinary incontinence and cystitis were added as exclusionary diagnoses to Algorithm 2. The PPV for the modified overall algorithm was 89.2% (74.6-97.0%). In the prospective phase, the PPV for the modified overall algorithm was 76.0% (68.4-82.6%). Upon adding overactive bladder, nocturia and urinary frequency as exclusionary diagnoses, the PPV for the final overall algorithm was 81.9% (73.7-88.4%). The current UR algorithm yielded a PPV > 80% and could be used for more accurate identification of UR among epilepsy patients in a large claims database. Copyright © 2016 John Wiley & Sons, Ltd.
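
    A PPV with a 95% confidence interval, as reported above, can be computed from the number of confirmed cases among the records reviewed. The sketch below uses an exact (Clopper-Pearson) binomial interval, which is an assumption: the abstract does not say which interval method the authors used. The count of 36 confirmed cases out of 50 is likewise inferred from the reported 72.0% and is not stated in the abstract.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(go_right, iterations=60):
    """Locate the switch point in [0, 1] of a predicate that is True
    to the left of the switch point and False to the right of it."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if go_right(mid):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    # Lower limit: the p at which P(X >= k) rises to alpha / 2.
    lower = 0.0 if k == 0 else _bisect(
        lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper limit: the p at which P(X <= k) falls to alpha / 2.
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


confirmed, reviewed = 36, 50  # inferred from PPV = 72.0% of 50 records
low, high = clopper_pearson(confirmed, reviewed)
print(f"PPV {confirmed / reviewed:.1%} (95% CI {low:.1%} to {high:.1%})")
```

    If the authors did use an exact interval, the printed limits should be close to the reported 57.5-83.8%.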

  17. Anticipation-related brain connectivity in bipolar and unipolar depression: a graph theory approach

    PubMed Central

    Almeida, Jorge R. C.; Stiffler, Richelle; Lockovich, Jeanette C.; Aslam, Haris A.; Phillips, Mary L.

    2016-01-01

    Bipolar disorder is often misdiagnosed as major depressive disorder, which leads to inadequate treatment. Depressed individuals versus healthy control subjects, show increased expectation of negative outcomes. Due to increased impulsivity and risk for mania, however, depressed individuals with bipolar disorder may differ from those with major depressive disorder in neural mechanisms underlying anticipation processes. Graph theory methods for neuroimaging data analysis allow the identification of connectivity between multiple brain regions without prior model specification, and may help to identify neurobiological markers differentiating these disorders, thereby facilitating development of better therapeutic interventions. This study aimed to compare brain connectivity among regions involved in win/loss anticipation in depressed individuals with bipolar disorder (BDD) versus depressed individuals with major depressive disorder (MDD) versus healthy control subjects using graph theory methods. The study was conducted at the University of Pittsburgh Medical Center and included 31 BDD, 39 MDD, and 36 healthy control subjects. Participants were scanned while performing a number guessing reward task that included the periods of win and loss anticipation. We first identified the anticipatory network across all 106 participants by contrasting brain activation during all anticipation periods (win anticipation + loss anticipation) versus baseline, and win anticipation versus loss anticipation. Brain connectivity within the identified network was determined using the Independent Multiple sample Greedy Equivalence Search (IMaGES) and Linear non-Gaussian Orientation, Fixed Structure (LOFS) algorithms. Density of connections (the number of connections in the network), path length, and the global connectivity direction (‘top-down’ versus ‘bottom-up’) were compared across groups (BDD/MDD/healthy control subjects) and conditions (win/loss anticipation). 
These analyses showed that loss anticipation was characterized by denser top-down fronto-striatal and fronto-parietal connectivity in healthy control subjects, by bottom-up striatal-frontal connectivity in MDD, and by sparse connectivity lacking fronto-striatal connections in BDD. Win anticipation was characterized by dense connectivity of medial frontal with striatal and lateral frontal cortical regions in BDD, by sparser bottom-up striatum-medial frontal cortex connectivity in MDD, and by sparse connectivity in healthy control subjects. In summary, this is the first study to demonstrate that BDD and MDD with comparable levels of current depression differed from each other and healthy control subjects in density of connections, connectivity path length, and connectivity direction as a function of win or loss anticipation. These findings suggest that different neurobiological mechanisms may underlie aberrant anticipation processes in BDD and MDD, and that distinct therapeutic strategies may be required for these individuals to improve coping strategies during expectation of positive and negative outcomes. PMID:27368345

  18. Anticipation-related brain connectivity in bipolar and unipolar depression: a graph theory approach.

    PubMed

    Manelis, Anna; Almeida, Jorge R C; Stiffler, Richelle; Lockovich, Jeanette C; Aslam, Haris A; Phillips, Mary L

    2016-09-01

    Bipolar disorder is often misdiagnosed as major depressive disorder, which leads to inadequate treatment. Depressed individuals versus healthy control subjects, show increased expectation of negative outcomes. Due to increased impulsivity and risk for mania, however, depressed individuals with bipolar disorder may differ from those with major depressive disorder in neural mechanisms underlying anticipation processes. Graph theory methods for neuroimaging data analysis allow the identification of connectivity between multiple brain regions without prior model specification, and may help to identify neurobiological markers differentiating these disorders, thereby facilitating development of better therapeutic interventions. This study aimed to compare brain connectivity among regions involved in win/loss anticipation in depressed individuals with bipolar disorder (BDD) versus depressed individuals with major depressive disorder (MDD) versus healthy control subjects using graph theory methods. The study was conducted at the University of Pittsburgh Medical Center and included 31 BDD, 39 MDD, and 36 healthy control subjects. Participants were scanned while performing a number guessing reward task that included the periods of win and loss anticipation. We first identified the anticipatory network across all 106 participants by contrasting brain activation during all anticipation periods (win anticipation + loss anticipation) versus baseline, and win anticipation versus loss anticipation. Brain connectivity within the identified network was determined using the Independent Multiple sample Greedy Equivalence Search (IMaGES) and Linear non-Gaussian Orientation, Fixed Structure (LOFS) algorithms. Density of connections (the number of connections in the network), path length, and the global connectivity direction ('top-down' versus 'bottom-up') were compared across groups (BDD/MDD/healthy control subjects) and conditions (win/loss anticipation). 
These analyses showed that loss anticipation was characterized by denser top-down fronto-striatal and fronto-parietal connectivity in healthy control subjects, by bottom-up striatal-frontal connectivity in MDD, and by sparse connectivity lacking fronto-striatal connections in BDD. Win anticipation was characterized by dense connectivity of medial frontal with striatal and lateral frontal cortical regions in BDD, by sparser bottom-up striatum-medial frontal cortex connectivity in MDD, and by sparse connectivity in healthy control subjects. In summary, this is the first study to demonstrate that BDD and MDD with comparable levels of current depression differed from each other and healthy control subjects in density of connections, connectivity path length, and connectivity direction as a function of win or loss anticipation. These findings suggest that different neurobiological mechanisms may underlie aberrant anticipation processes in BDD and MDD, and that distinct therapeutic strategies may be required for these individuals to improve coping strategies during expectation of positive and negative outcomes. © The Author (2016). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.

  19. Feature Selection Method Based on Neighborhood Relationships: Applications in EEG Signal Identification and Chinese Character Recognition

    PubMed Central

    Zhao, Yu-Xiang; Chou, Chien-Hsing

    2016-01-01

    In this study, a new feature selection algorithm, the neighborhood-relationship feature selection (NRFS) algorithm, is proposed for identifying rat electroencephalogram signals and recognizing Chinese characters. In these two applications, dependent relationships exist among the feature vectors and their neighboring feature vectors. Therefore, the proposed NRFS algorithm was designed for solving this problem. By applying the NRFS algorithm, unselected feature vectors have a high priority of being added into the feature subset if the neighboring feature vectors have been selected. In addition, selected feature vectors have a high priority of being eliminated if the neighboring feature vectors are not selected. In the experiments conducted in this study, the NRFS algorithm was compared with two feature algorithms. The experimental results indicated that the NRFS algorithm can extract the crucial frequency bands for identifying rat vigilance states and identifying crucial character regions for recognizing Chinese characters. PMID:27314346

  20. SA-SOM algorithm for detecting communities in complex networks

    NASA Astrophysics Data System (ADS)

    Chen, Luogeng; Wang, Yanran; Huang, Xiaoming; Hu, Mengyu; Hu, Fang

    2017-10-01

    Community detection is currently an active research topic. Building on the self-organizing map (SOM) algorithm, this paper introduces the idea of self-adaptation (SA), whereby the number of communities is identified automatically, and proposes a novel algorithm, SA-SOM, for detecting communities in complex networks. Several representative real-world networks and a set of computer-generated networks produced by the LFR benchmark are used to verify the accuracy and efficiency of this algorithm. The experimental findings demonstrate that the algorithm can identify communities automatically, accurately and efficiently. Furthermore, it achieves higher values of modularity, NMI and density than the SOM algorithm.
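
    Modularity, one of the quality measures named above, scores a partition by comparing the share of edges inside each community with the share expected from node degrees alone. The following is a minimal Python sketch of the standard Newman modularity for an undirected, unweighted graph without self-loops or duplicate edges. It is not the SA-SOM algorithm itself, and the edge-list and dictionary inputs are illustrative assumptions.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q for a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # summed degree of the community's nodes
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Q = sum over communities of (internal edge share - expected share).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
print(q_two, q_one)
```

    On this toy graph the two-triangle partition scores 5/14, while placing every node in one community scores zero, which is why a higher modularity is read as a better split.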

  1. Fault tolerance in protein interaction networks: stable bipartite subgraphs and redundant pathways.

    PubMed

    Brady, Arthur; Maxwell, Kyle; Daniels, Noah; Cowen, Lenore J

    2009-01-01

    As increasing amounts of high-throughput data for the yeast interactome become available, more system-wide properties are uncovered. One interesting question concerns the fault tolerance of protein interaction networks: whether there exist alternative pathways that can perform some required function if a gene essential to the main mechanism is defective, absent or suppressed. A signature pattern for redundant pathways is the BPM (between-pathway model) motif, introduced by Kelley and Ideker. Past methods proposed to search the yeast interactome for BPM motifs have had several important limitations. First, they have been driven heuristically by local greedy searches, which can lead to the inclusion of extra genes that may not belong in the motif; second, they have been validated solely by functional coherence of the putative pathways using GO enrichment, making it difficult to evaluate putative BPMs in the absence of already known biological annotation. We introduce stable bipartite subgraphs, and show they form a clean and efficient way of generating meaningful BPMs which naturally discard extra genes included by local greedy methods. We show by GO enrichment measures that our BPM set outperforms previous work, covering more known complexes and functional pathways. Perhaps most importantly, since our BPMs are initially generated by examining the genetic-interaction network only, the location of edges in the protein-protein physical interaction network can then be used to statistically validate each candidate BPM, even with sparse GO annotation (or none at all). We uncover some interesting biological examples of previously unknown putative redundant pathways in such areas as vesicle-mediated transport and DNA repair.

  2. Fault Tolerance in Protein Interaction Networks: Stable Bipartite Subgraphs and Redundant Pathways

    PubMed Central

    Brady, Arthur; Maxwell, Kyle; Daniels, Noah; Cowen, Lenore J.

    2009-01-01

    As increasing amounts of high-throughput data for the yeast interactome become available, more system-wide properties are uncovered. One interesting question concerns the fault tolerance of protein interaction networks: whether there exist alternative pathways that can perform some required function if a gene essential to the main mechanism is defective, absent or suppressed. A signature pattern for redundant pathways is the BPM (between-pathway model) motif, introduced by Kelley and Ideker. Past methods proposed to search the yeast interactome for BPM motifs have had several important limitations. First, they have been driven heuristically by local greedy searches, which can lead to the inclusion of extra genes that may not belong in the motif; second, they have been validated solely by functional coherence of the putative pathways using GO enrichment, making it difficult to evaluate putative BPMs in the absence of already known biological annotation. We introduce stable bipartite subgraphs, and show they form a clean and efficient way of generating meaningful BPMs which naturally discard extra genes included by local greedy methods. We show by GO enrichment measures that our BPM set outperforms previous work, covering more known complexes and functional pathways. Perhaps most importantly, since our BPMs are initially generated by examining the genetic-interaction network only, the location of edges in the protein-protein physical interaction network can then be used to statistically validate each candidate BPM, even with sparse GO annotation (or none at all). We uncover some interesting biological examples of previously unknown putative redundant pathways in such areas as vesicle-mediated transport and DNA repair. PMID:19399174

  3. Development of an algorithm to identify fall-related injuries and costs in Medicare data.

    PubMed

    Kim, Sung-Bou; Zingmond, David S; Keeler, Emmett B; Jennings, Lee A; Wenger, Neil S; Reuben, David B; Ganz, David A

    2016-12-01

    Identifying fall-related injuries and costs using healthcare claims data is cost-effective and easier to implement than using medical records or patient self-report to track falls. We developed a comprehensive four-step algorithm for identifying episodes of care for fall-related injuries and associated costs, using fee-for-service Medicare and Medicare Advantage health plan claims data for 2,011 patients from 5 medical groups between 2005 and 2009. First, as a preparatory step, we identified care received in acute inpatient and skilled nursing facility settings, in addition to emergency department visits. Second, based on diagnosis and procedure codes, we identified all fall-related claim records. Third, with these records, we identified six types of encounters for fall-related injuries, with different levels of injury and care. In the final step, we used these encounters to identify episodes of care for fall-related injuries. To illustrate the algorithm, we present a representative example of a fall episode and examine descriptive statistics of injuries and costs for such episodes. Altogether, we found that the results support the use of our algorithm for identifying episodes of care for fall-related injuries. When we decomposed an episode, we found that the details present a realistic and coherent story of fall-related injuries and healthcare services. Variation of episode characteristics across medical groups supported the use of a complex algorithm approach, and descriptive statistics on the proportion, duration, and cost of episodes by healthcare services and injuries verified that our results are consistent with other studies. This algorithm can be used to identify and analyze various types of fall-related outcomes including episodes of care, injuries, and associated costs. Furthermore, the algorithm can be applied and adopted in other fall-related studies with relative ease.

  4. Optimizing Algorithm Choice for Metaproteomics: Comparing X!Tandem and Proteome Discoverer for Soil Proteomes

    NASA Astrophysics Data System (ADS)

    Diaz, K. S.; Kim, E. H.; Jones, R. M.; de Leon, K. C.; Woodcroft, B. J.; Tyson, G. W.; Rich, V. I.

    2014-12-01

    The growing field of metaproteomics links microbial communities to their expressed functions by using mass spectrometry methods to characterize community proteins. Comparison of mass spectrometry protein search algorithms and their biases is crucial for maximizing the quality and amount of protein identifications in mass spectral data. Available algorithms employ different approaches when mapping mass spectra to peptides against a database. We compared mass spectra from four microbial proteomes derived from high-organic content soils searched with two search algorithms: 1) Sequest HT as packaged within Proteome Discoverer (v.1.4) and 2) X!Tandem as packaged in TransProteomicPipeline (v.4.7.1). Searches used matched metagenomes, and results were filtered to allow identification of high probability proteins. There was little overlap in proteins identified by both algorithms, on average just ~24% of the total. However, when adjusted for spectral abundance, the overlap improved to ~70%. Proteome Discoverer generally outperformed X!Tandem, identifying an average of 12.5% more proteins than X!Tandem, with X!Tandem identifying more proteins only in the first two proteomes. For spectrally-adjusted results, the algorithms were similar, with X!Tandem marginally outperforming Proteome Discoverer by an average of ~4%. We then assessed differences in heat shock protein (HSP) identification by the two algorithms by BLASTing identified proteins against the Heat Shock Protein Information Resource, because HSP hits typically account for the majority of the signal in proteomes, due to extraction protocols. Total HSP identifications for each of the 4 proteomes were ~15%, ~11%, ~17%, and ~19%, with ~14% for total HSPs with redundancies removed. Of the ~15% average of proteins from the 4 proteomes identified as HSPs, ~10% of proteins and spectra were identified by both algorithms. On average, Proteome Discoverer identified ~9% more HSPs than X!Tandem.

  5. Diagnostic accuracy of administrative data algorithms in the diagnosis of osteoarthritis: a systematic review.

    PubMed

    Shrestha, Swastina; Dave, Amish J; Losina, Elena; Katz, Jeffrey N

    2016-07-07

    Administrative health care data are frequently used to study disease burden and treatment outcomes in many conditions including osteoarthritis (OA). OA is a chronic condition with significant disease burden affecting over 27 million adults in the US. There are few studies examining the performance of administrative data algorithms to diagnose OA. The purpose of this study is to perform a systematic review of administrative data algorithms for OA diagnosis and to evaluate the diagnostic characteristics of algorithms based on restrictiveness and reference standards. Two reviewers independently screened English-language articles published in Medline, Embase, PubMed, and Cochrane databases that used administrative data to identify OA cases. Each algorithm was classified as restrictive or less restrictive based on number and type of administrative codes required to satisfy the case definition. We recorded sensitivity and specificity of algorithms and calculated positive likelihood ratio (LR+) and positive predictive value (PPV) based on assumed OA prevalence of 0.1, 0.25, and 0.50. The search identified 7 studies that used 13 algorithms. Of these 13 algorithms, 5 were classified as restrictive and 8 as less restrictive. Restrictive algorithms had lower median sensitivity and higher median specificity compared to less restrictive algorithms when reference standards were self-report and American College of Rheumatology (ACR) criteria. The algorithms compared to reference standard of physician diagnosis had higher sensitivity and specificity than those compared to self-reported diagnosis or ACR criteria. Restrictive algorithms are more specific for OA diagnosis and can be used to identify cases when false positives have higher costs, e.g., interventional studies. Less restrictive algorithms are more sensitive and suited for studies that attempt to identify all cases, e.g., screening programs.
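
    The review derives LR+ and PPV from each algorithm's sensitivity and specificity at assumed prevalences of 0.1, 0.25, and 0.50. A minimal Python sketch of that calculation follows. The sensitivity and specificity values used here are hypothetical illustration values, not figures from the review.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1 - specificity)


def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV via Bayes' rule for a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical "restrictive" algorithm: modest sensitivity, high specificity.
sens, spec = 0.60, 0.95

print(f"LR+ = {positive_likelihood_ratio(sens, spec):.1f}")
for prevalence in (0.1, 0.25, 0.50):  # prevalences assumed in the review
    value = positive_predictive_value(sens, spec, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV = {value:.1%}")
```

    The same sensitivity and specificity yield a markedly higher PPV as prevalence rises, which is why the review reports PPV at several assumed prevalences rather than as a single number.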

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mitrani, J

    Bayesian networks (BN) are an excellent tool for modeling uncertainties in systems with several interdependent variables. A BN is a directed acyclic graph, and consists of a structure, or the set of directional links between variables that depend on other variables, and conditional probabilities (CP) for each variable. In this project, we apply BNs to understand uncertainties in National Ignition Facility (NIF) ignition experiments. One can represent various physical properties of NIF capsule implosions as variables in a BN. A dataset containing simulations of NIF capsule implosions was provided. The dataset was generated from a radiation hydrodynamics code, and it contained 120 simulations of 16 variables. Relevant knowledge about the physics of NIF capsule implosions and greedy search algorithms were used to search for hypothetical structures for a BN. Our preliminary results found 6 links between variables in the dataset. However, we thought there should have been more links between the dataset variables based on the physics of NIF capsule implosions. Important reasons for the paucity of links are the relatively small size of the dataset, and the sampling of the values for dataset variables. Another factor that might have caused the paucity of links is the fact that in the dataset, 20% of the simulations represented successful fusion, and 80% didn't (simulations of unsuccessful fusion are useful for measuring certain diagnostics), which skewed the distributions of several variables, and possibly reduced the number of links. Nevertheless, by illustrating the interdependencies and conditional probabilities of several parameters and diagnostics, an accurate and complete BN built from an appropriate simulation set would provide uncertainty quantification for NIF capsule implosions.

  7. Real-time distributed scheduling algorithm for supporting QoS over WDM networks

    NASA Astrophysics Data System (ADS)

    Kam, Anthony C.; Siu, Kai-Yeung

    1998-10-01

    Most existing or proposed WDM networks employ circuit switching, typically with one session having exclusive use of one entire wavelength. Consequently they are not suitable for data applications involving bursty traffic patterns. The MIT AON Consortium has developed an all-optical LAN/MAN testbed which provides time-slotted WDM service and employs fast-tunable transceivers in each optical terminal. In this paper, we explore extensions of this service to achieve fine-grained statistical multiplexing with different virtual circuits time-sharing the wavelengths in a fair manner. In particular, we develop a real-time distributed protocol for best-effort traffic over this time-slotted WDM service with near-optimal fairness and throughput characteristics. As an additional design feature, our protocol supports the allocation of guaranteed bandwidths to selected connections. This feature acts as a first step towards supporting integrated services and quality-of-service guarantees over WDM networks. To achieve high throughput, our approach is based on scheduling transmissions, as opposed to collision-based schemes. Our distributed protocol involves one MAN scheduler and several LAN schedulers (one per LAN) in a master-slave arrangement. Because of propagation delays and limits on control channel capacities, all schedulers are designed to work with partial, delayed traffic information. Our distributed protocol is of the `greedy' type to ensure fast execution in real-time in response to dynamic traffic changes. It employs a hybrid form of rate and credit control for resource allocation. We have performed extensive simulations, which show that our protocol allocates resources (transmitters, receivers, wavelengths) fairly with high throughput, and supports bandwidth guarantees.

  8. Brain network response underlying decisions about abstract reinforcers.

    PubMed

    Mills-Finnerty, Colleen; Hanson, Catherine; Hanson, Stephen Jose

    2014-12-01

    Decision making studies typically use tasks that involve concrete action-outcome contingencies, in which subjects do something and get something. No studies have addressed decision making involving abstract reinforcers, where there are no action-outcome contingencies and choices are entirely hypothetical. The present study examines these kinds of choices, as well as whether the same biases that exist for concrete reinforcer decisions, specifically framing effects, also apply during abstract reinforcer decisions. We use both General Linear Model as well as Bayes network connectivity analysis using the Independent Multi-sample Greedy Equivalence Search (IMaGES) algorithm to examine network response underlying choices for abstract reinforcers under positive and negative framing. We find for the first time that abstract reinforcer decisions activate the same network of brain regions as concrete reinforcer decisions, including the striatum, insula, anterior cingulate, and VMPFC, results that are further supported via comparison to a meta-analysis of decision making studies. Positive and negative framing activated different parts of this network, with stronger activation in VMPFC during negative framing and in DLPFC during positive, suggesting different decision making pathways depending on frame. These results were further clarified using connectivity analysis, which revealed stronger connections between anterior cingulate, insula, and accumbens during negative framing compared to positive. Taken together, these results suggest that not only do abstract reinforcer decisions rely on the same brain substrates as concrete reinforcers, but that the response underlying framing effects on abstract reinforcers also resemble those for concrete reinforcers, specifically increased limbic system connectivity during negative frames. Copyright © 2014 Elsevier Inc. All rights reserved.

  9. Hubble Source Catalog

    NASA Astrophysics Data System (ADS)

    Lubow, S.; Budavári, T.

    2013-10-01

    We have created an initial catalog of objects observed by the WFPC2 and ACS instruments on the Hubble Space Telescope (HST). The catalog is based on observations taken on more than 6000 visits (telescope pointings) of ACS/WFC and more than 25000 visits of WFPC2. The catalog is obtained by cross matching by position in the sky all Hubble Legacy Archive (HLA) Source Extractor source lists for these instruments. The source lists describe properties of source detections within a visit. The calculations are performed on a SQL Server database system. First we collect overlapping images into groups, e.g., Eta Car, and determine nearby (approximately matching) pairs of sources from different images within each group. We then apply a novel algorithm for improving the cross matching of pairs of sources by adjusting the astrometry of the images. Next, we combine pairwise matches into maximal sets of possible multi-source matches. We apply a greedy Bayesian method to split the maximal matches into more reliable matches. We test the accuracy of the matches by comparing the fluxes of the matched sources. The result is a set of information that ties together multiple observations of the same object. A byproduct of the catalog is greatly improved relative astrometry for many of the HST images. We also provide information on nondetections that can be used to determine dropouts. With the catalog, for the first time, one can carry out time domain, multi-wavelength studies across a large set of HST data. The catalog is publicly available. Much more can be done to expand the catalog capabilities.

  10. An Improved Snake Model for Refinement of Lidar-Derived Building Roof Contours Using Aerial Images

    NASA Astrophysics Data System (ADS)

    Chen, Qi; Wang, Shugen; Liu, Xiuguo

    2016-06-01

    Building roof contours are considered as very important geometric data, which have been widely applied in many fields, including but not limited to urban planning, land investigation, change detection and military reconnaissance. Currently, the demand on building contours at a finer scale (especially in urban areas) has been raised in a growing number of studies such as urban environment quality assessment, urban sprawl monitoring and urban air pollution modelling. LiDAR is known as an effective means of acquiring 3D roof points with high elevation accuracy. However, the precision of the building contour obtained from LiDAR data is restricted by its relatively low scanning resolution. With the use of the texture information from high-resolution imagery, the precision can be improved. In this study, an improved snake model is proposed to refine the initial building contours extracted from LiDAR. First, an improved snake model is constructed with the constraints of the deviation angle, image gradient, and area. Then, the nodes of the contour are moved in a certain range to find the best optimized result using greedy algorithm. Considering both precision and efficiency, the candidate shift positions of the contour nodes are constrained, and the searching strategy for the candidate nodes is explicitly designed. The experiments on three datasets indicate that the proposed method for building contour refinement is effective and feasible. The average quality index is improved from 91.66% to 93.34%. The statistics of the evaluation results for every single building demonstrated that 77.0% of the total number of contours is updated with higher quality index.
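
    The greedy node-shifting step mentioned in this abstract can be sketched generically. The code below is an illustration only: `greedy_refine`, its parameters, and the toy energy function are hypothetical, and the energy is a stand-in for the paper's actual constraints (deviation angle, image gradient, and area), which the abstract does not specify in enough detail to reproduce.

```python
def greedy_refine(contour, energy, offsets, max_iters=50):
    """Greedy contour refinement (generic sketch).

    contour : list of (x, y) node positions
    energy  : callable(nodes, index, candidate_position) -> float
    offsets : candidate shifts (dx, dy) tried for each node
    """
    nodes = list(contour)
    for _ in range(max_iters):
        moved = False
        for i, (x, y) in enumerate(nodes):
            best_pos = (x, y)
            best_energy = energy(nodes, i, best_pos)
            # Try every allowed shift and keep the lowest-energy position.
            for dx, dy in offsets:
                candidate = (x + dx, y + dy)
                e = energy(nodes, i, candidate)
                if e < best_energy:
                    best_pos, best_energy = candidate, e
            if best_pos != (x, y):
                nodes[i] = best_pos
                moved = True
        if not moved:  # no node improved: a local minimum was reached
            break
    return nodes


if __name__ == "__main__":
    # Toy energy (assumption): squared distance to a known target position.
    targets = [(1, 1), (3, 2)]

    def toy_energy(nodes, i, pos):
        return (pos[0] - targets[i][0]) ** 2 + (pos[1] - targets[i][1]) ** 2

    shifts = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    print(greedy_refine([(0, 0), (3, 0)], toy_energy, shifts))
```

    Restricting `offsets` to a small neighbourhood mirrors the abstract's point that the candidate shift positions are constrained for efficiency; a greedy search of this kind only guarantees a local minimum of the chosen energy.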

  11. Polynomial meta-models with canonical low-rank approximations: Numerical insights and comparison to sparse polynomial chaos expansions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Konakli, Katerina, E-mail: konakli@ibk.baug.ethz.ch; Sudret, Bruno

    2016-09-15

    The growing need for uncertainty analysis of complex computational models has led to an expanding use of meta-models across engineering and sciences. The efficiency of meta-modeling techniques relies on their ability to provide statistically-equivalent analytical representations based on relatively few evaluations of the original model. Polynomial chaos expansions (PCE) have proven a powerful tool for developing meta-models in a wide range of applications; the key idea thereof is to expand the model response onto a basis made of multivariate polynomials obtained as tensor products of appropriate univariate polynomials. The classical PCE approach nevertheless faces the “curse of dimensionality”, namely the exponential increase of the basis size with increasing input dimension. To address this limitation, the sparse PCE technique has been proposed, in which the expansion is carried out on only a few relevant basis terms that are automatically selected by a suitable algorithm. An alternative for developing meta-models with polynomial functions in high-dimensional problems is offered by the newly emerged low-rank approximations (LRA) approach. By exploiting the tensor–product structure of the multivariate basis, LRA can provide polynomial representations in highly compressed formats. Through extensive numerical investigations, we herein first shed light on issues relating to the construction of canonical LRA with a particular greedy algorithm involving a sequential updating of the polynomial coefficients along separate dimensions. Specifically, we examine the selection of optimal rank, stopping criteria in the updating of the polynomial coefficients and error estimation. In the sequel, we confront canonical LRA to sparse PCE in structural-mechanics and heat-conduction applications based on finite-element solutions. Canonical LRA exhibit smaller errors than sparse PCE in cases when the number of available model evaluations is small with respect to the input dimension, a situation that is often encountered in real-life problems. By introducing the conditional generalization error, we further demonstrate that canonical LRA tend to outperform sparse PCE in the prediction of extreme model responses, which is critical in reliability analysis.

  12. Development and Validation of an Algorithm to Identify Planned Readmissions From Claims Data.

    PubMed

    Horwitz, Leora I; Grady, Jacqueline N; Cohen, Dorothy B; Lin, Zhenqiu; Volpe, Mark; Ngo, Chi K; Masica, Andrew L; Long, Theodore; Wang, Jessica; Keenan, Megan; Montague, Julia; Suter, Lisa G; Ross, Joseph S; Drye, Elizabeth E; Krumholz, Harlan M; Bernheim, Susannah M

    2015-10-01

    It is desirable not to include planned readmissions in readmission measures because they represent deliberate, scheduled care. To develop an algorithm to identify planned readmissions, describe its performance characteristics, and identify improvements. Consensus-driven algorithm development and chart review validation study at 7 acute-care hospitals in 2 health systems. For development, all discharges qualifying for the publicly reported hospital-wide readmission measure. For validation, all qualifying same-hospital readmissions that were characterized by the algorithm as planned, and a random sampling of same-hospital readmissions that were characterized as unplanned. We calculated weighted sensitivity and specificity, and positive and negative predictive values of the algorithm (version 2.1), compared to gold standard chart review. In consultation with 27 experts, we developed an algorithm that characterizes 7.8% of readmissions as planned. For validation we reviewed 634 readmissions. The weighted sensitivity of the algorithm was 45.1% overall, 50.9% in large teaching centers and 40.2% in smaller community hospitals. The weighted specificity was 95.9%, positive predictive value was 51.6%, and negative predictive value was 94.7%. We identified 4 minor changes to improve algorithm performance. The revised algorithm had a weighted sensitivity 49.8% (57.1% at large hospitals), weighted specificity 96.5%, positive predictive value 58.7%, and negative predictive value 94.5%. Positive predictive value was poor for the 2 most common potentially planned procedures: diagnostic cardiac catheterization (25%) and procedures involving cardiac devices (33%). An administrative claims-based algorithm to identify planned readmissions is feasible and can facilitate public reporting of primarily unplanned readmissions. © 2015 Society of Hospital Medicine.

  13. An algorithm to identify functional groups in organic molecules.

    PubMed

    Ertl, Peter

    2017-06-07

    The concept of functional groups forms a basis of organic chemistry, medicinal chemistry, toxicity assessment, spectroscopy and also chemical nomenclature. All current software systems to identify functional groups are based on a predefined list of substructures. We are not aware of any program that can identify all functional groups in a molecule automatically. The algorithm presented in this article is an attempt to solve this scientific challenge. An algorithm to identify functional groups in a molecule based on iterative marching through its atoms is described. The procedure is illustrated by extracting functional groups from the bioactive portion of the ChEMBL database, resulting in identification of 3080 unique functional groups. A new algorithm to identify all functional groups in organic molecules is presented. The algorithm is relatively simple and full details with examples are provided, therefore implementation in any cheminformatics toolkit should be relatively easy. The new method allows the analysis of functional groups in large chemical databases in a way that was not possible using previous approaches.

  14. A systematic review of validated methods for identifying anaphylaxis, including anaphylactic shock and angioneurotic edema, using administrative and claims data.

    PubMed

    Schneider, Gary; Kachroo, Sumesh; Jones, Natalie; Crean, Sheila; Rotella, Philip; Avetisyan, Ruzan; Reynolds, Matthew W

    2012-01-01

    The Food and Drug Administration's Mini-Sentinel pilot program initially aims to conduct active surveillance to refine safety signals that emerge for marketed medical products. A key facet of this surveillance is to develop and understand the validity of algorithms for identifying health outcomes of interest from administrative and claims data. This article summarizes the process and findings of the algorithm review of anaphylaxis. PubMed and Iowa Drug Information Service searches were conducted to identify citations applicable to the anaphylaxis health outcome of interest. Level 1 abstract reviews and Level 2 full-text reviews were conducted to find articles using administrative and claims data to identify anaphylaxis and including validation estimates of the coding algorithms. Our search revealed limited literature focusing on anaphylaxis that provided administrative and claims data-based algorithms and validation estimates. Only four studies identified via literature searches provided validated algorithms; however, two additional studies were identified by Mini-Sentinel collaborators and were incorporated. The International Classification of Diseases, Ninth Revision, codes varied, as did the positive predictive value, depending on the cohort characteristics and the specific codes used to identify anaphylaxis. Research needs to be conducted on designing validation studies to test anaphylaxis algorithms and estimating their predictive power, sensitivity, and specificity. Copyright © 2012 John Wiley & Sons, Ltd.

  15. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records.

    PubMed

    Peissig, Peggy L; Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries.

  16. Bounding the errors for convex dynamics on one or more polytopes.

    PubMed

    Tresser, Charles

    2007-09-01

    We discuss the greedy algorithm for approximating a sequence of inputs in a family of polytopes lying in affine spaces by an output sequence made of vertices of the respective polytopes. More precisely, we consider here the case when the greed of the algorithm is dictated by the Euclidean norms of the successive cumulative errors. This algorithm can be interpreted as a time-dependent dynamical system in the vector space, where the errors live, or as a time-dependent dynamical system in an affine space containing copies of all the original polytopes. This affine space contains the inputs, as well as the inputs modified by adding the respective former errors; it is the evolution of these modified inputs that the dynamical system in affine space describes. Scheduling problems with many polytopes arise naturally, for instance, when the inputs are from a single polytope P, but one imposes the constraint that whenever the input belongs to a codimension n face, the output has to be in the same codimension n face (as when scheduling drivers among participants of a carpool). It has been previously shown that the error is bounded in the case of a single polytope by proving the existence of an arbitrarily large convex invariant region for the dynamics in affine space: A region that is simultaneously invariant for several polytopes, each considered separately, was also constructed. It was then shown that there cannot be an invariant region in affine space in the general case of a family of polytopes. Here we prove the existence of an arbitrarily large convex invariant set for the dynamics in the vector space in the case when the sizes of the polytopes in the family are bounded and the set of all the outgoing normals to all the faces of all the polytopes is finite. It was also previously known that starting from zero as the initial error set, the error set could not be saturated in finitely many steps in some cases with several polytopes: Contradicting a former conjecture, we show that the same happens for some single quadrilaterals and for a single pentagon with an axial symmetry. The disproof of that conjecture is the new piece of information that leads us to expect, and then to verify, as we recount here, that the proof that the errors are bounded in the general case could be a small step beyond the proof of the same statement for the single polytope case.
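
    The greedy scheme described in this abstract (add the previous cumulative error to the current input, then output the vertex that leaves the smallest Euclidean error) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the tuple representation of points, and the tie-breaking rule (first vertex in list order) are hypothetical choices, not details from the paper.

```python
import math


def greedy_vertex_sequence(inputs, polytopes):
    """Greedy approximation of inputs by polytope vertices.

    inputs    : list of points (tuples of floats), one per time step
    polytopes : list of vertex lists, one per time step
    Returns the chosen vertices and the norm of the cumulative error
    after each step.
    """
    error = [0.0] * len(inputs[0])  # cumulative error starts at zero
    outputs, error_norms = [], []
    for x, vertices in zip(inputs, polytopes):
        # Modified input: current input plus the former cumulative error.
        modified = [xi + ei for xi, ei in zip(x, error)]
        # Greedy choice: vertex minimising the norm of the new error.
        # Ties go to the first vertex in list order (an assumption).
        best = min(vertices, key=lambda v: math.dist(modified, v))
        error = [mi - vi for mi, vi in zip(modified, best)]
        outputs.append(best)
        error_norms.append(math.hypot(*error))
    return outputs, error_norms


if __name__ == "__main__":
    # One-dimensional polytope: the segment [0, 1] with vertices 0 and 1.
    segment = [(0.0,), (1.0,)]
    outs, norms = greedy_vertex_sequence([(0.5,)] * 4, [segment] * 4)
    print(outs)
    print(norms)
```

    In the toy run the output alternates between the two endpoints, so the running sum of outputs tracks the running sum of inputs and the cumulative error stays bounded, which is the property the abstract is concerned with.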

  17. Bounding the errors for convex dynamics on one or more polytopes

    NASA Astrophysics Data System (ADS)

    Tresser, Charles

    2007-09-01

    We discuss the greedy algorithm for approximating a sequence of inputs in a family of polytopes lying in affine spaces by an output sequence made of vertices of the respective polytopes. More precisely, we consider here the case when the greed of the algorithm is dictated by the Euclidean norms of the successive cumulative errors. This algorithm can be interpreted as a time-dependent dynamical system in the vector space, where the errors live, or as a time-dependent dynamical system in an affine space containing copies of all the original polytopes. This affine space contains the inputs, as well as the inputs modified by adding the respective former errors; it is the evolution of these modified inputs that the dynamical system in affine space describes. Scheduling problems with many polytopes arise naturally, for instance, when the inputs are from a single polytope P, but one imposes the constraint that whenever the input belongs to a codimension n face, the output has to be in the same codimension n face (as when scheduling drivers among participants of a carpool). It has been previously shown that the error is bounded in the case of a single polytope by proving the existence of an arbitrarily large convex invariant region for the dynamics in affine space: A region that is simultaneously invariant for several polytopes, each considered separately, was also constructed. It was then shown that there cannot be an invariant region in affine space in the general case of a family of polytopes. Here we prove the existence of an arbitrarily large convex invariant set for the dynamics in the vector space in the case when the sizes of the polytopes in the family are bounded and the set of all the outgoing normals to all the faces of all the polytopes is finite. It was also previously known that starting from zero as the initial error set, the error set could not be saturated in finitely many steps in some cases with several polytopes: Contradicting a former conjecture, we show that the same happens for some single quadrilaterals and for a single pentagon with an axial symmetry. The disproof of that conjecture is the new piece of information that leads us to expect, and then to verify, as we recount here, that the proof that the errors are bounded in the general case could be a small step beyond the proof of the same statement for the single polytope case.

  18. Bouc-Wen hysteresis model identification using Modified Firefly Algorithm

    NASA Astrophysics Data System (ADS)

    Zaman, Mohammad Asif; Sikder, Urmita

    2015-12-01

    The parameters of Bouc-Wen hysteresis model are identified using a Modified Firefly Algorithm. The proposed algorithm uses dynamic process control parameters to improve its performance. The algorithm is used to find the model parameter values that results in the least amount of error between a set of given data points and points obtained from the Bouc-Wen model. The performance of the algorithm is compared with the performance of conventional Firefly Algorithm, Genetic Algorithm and Differential Evolution algorithm in terms of convergence rate and accuracy. Compared to the other three optimization algorithms, the proposed algorithm is found to have good convergence rate with high degree of accuracy in identifying Bouc-Wen model parameters. Finally, the proposed method is used to find the Bouc-Wen model parameters from experimental data. The obtained model is found to be in good agreement with measured data.

  19. Convex Regression with Interpretable Sharp Partitions

    PubMed Central

    Petersen, Ashley; Simon, Noah; Witten, Daniela

    2016-01-01

    We consider the problem of predicting an outcome variable on the basis of a small number of covariates, using an interpretable yet non-additive model. We propose convex regression with interpretable sharp partitions (CRISP) for this task. CRISP partitions the covariate space into blocks in a data-adaptive way, and fits a mean model within each block. Unlike other partitioning methods, CRISP is fit using a non-greedy approach by solving a convex optimization problem, resulting in low-variance fits. We explore the properties of CRISP, and evaluate its performance in a simulation study and on a housing price data set. PMID:27635120

  20. Energy entrepreneurs who bilked the public

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barnes, E.

    1982-11-01

    The rush to invest in domestic energy development was accompanied by fraud, which has cost victims financial loss and diverted needed capital from legitimate projects. Government policies and tax incentives encouraged greedy entrepreneurs to perpetrate energy-related frauds. The three major areas targeted for abuse were tax shelters, deferred delivery contracts, and securities in companies promoting energy-related products. The courts have been lenient in the conviction and punishment of unlawful promoters, while victims who risk losing tax deductions are often reluctant to cooperate. Several case histories illustrate the activities of con artists and the rewards available to the unscrupulous. (DCK)

  1. Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.

    PubMed

    Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang

    2015-01-01

    Most epileptic EEG classification algorithms are supervised and require large training datasets, which hinders their use in real-time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm in efficiency. In addition, three classifiers, K-means, MSK-means, and support vector machine (SVM), are used to identify seizures and localize the epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizures with the MSK-means algorithm and delay permutation entropy achieves 4.7% higher accuracy than K-means and 0.7% higher accuracy than the SVM.
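
    The delay permutation entropy feature named in this abstract can be illustrated with the usual ordinal-pattern definition of permutation entropy, extended with a time delay. The abstract does not give the authors' exact formulation or parameter values, so the function name, the default `order` and `delay`, and the normalisation by log(order!) are assumptions made for this sketch.

```python
import math
from collections import Counter


def delay_permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D signal with a time delay.

    order : number of samples per ordinal pattern (embedding dimension)
    delay : spacing between the samples of one pattern
    """
    n_vectors = len(signal) - (order - 1) * delay
    if n_vectors <= 0:
        raise ValueError("signal too short for this order and delay")
    patterns = Counter()
    for i in range(n_vectors):
        window = [signal[i + j * delay] for j in range(order)]
        # Ordinal pattern: indices that sort the window (stable for ties).
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    # Shannon entropy of the pattern distribution.
    entropy = 0.0
    for count in patterns.values():
        p = count / n_vectors
        entropy -= p * math.log(p)
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy


if __name__ == "__main__":
    zigzag = [1, 2, 1, 2, 1]
    print(delay_permutation_entropy(zigzag, order=2, delay=1))
    print(delay_permutation_entropy(zigzag, order=2, delay=2))
```

    The two calls on the same zigzag signal give different values because the delay changes which samples are compared, which is why the delay is treated as a parameter of the feature.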

  2. Evaluation of the performance of existing non-laboratory based cardiovascular risk assessment algorithms

    PubMed Central

    2013-01-01

    Background The high burden and rising incidence of cardiovascular disease (CVD) in resource constrained countries necessitates implementation of robust and pragmatic primary and secondary prevention strategies. Many current CVD management guidelines recommend absolute cardiovascular (CV) risk assessment as a clinically sound guide to preventive and treatment strategies. Development of non-laboratory based cardiovascular risk assessment algorithms enable absolute risk assessment in resource constrained countries. The objective of this review is to evaluate the performance of existing non-laboratory based CV risk assessment algorithms using the benchmarks for clinically useful CV risk assessment algorithms outlined by Cooney and colleagues. Methods A literature search to identify non-laboratory based risk prediction algorithms was performed in MEDLINE, CINAHL, Ovid Premier Nursing Journals Plus, and PubMed databases. The identified algorithms were evaluated using the benchmarks for clinically useful cardiovascular risk assessment algorithms outlined by Cooney and colleagues. Results Five non-laboratory based CV risk assessment algorithms were identified. The Gaziano and Framingham algorithms met the criteria for appropriateness of statistical methods used to derive the algorithms and endpoints. The Swedish Consultation, Framingham and Gaziano algorithms demonstrated good discrimination in derivation datasets. Only the Gaziano algorithm was externally validated where it had optimal discrimination. The Gaziano and WHO algorithms had chart formats which made them simple and user friendly for clinical application. Conclusion Both the Gaziano and Framingham non-laboratory based algorithms met most of the criteria outlined by Cooney and colleagues. External validation of the algorithms in diverse samples is needed to ascertain their performance and applicability to different populations and to enhance clinicians’ confidence in them. PMID:24373202

  3. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets

    PubMed Central

    Wernisch, Lorenz

    2017-01-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm. PMID:29036190

  4. Clusternomics: Integrative context-dependent clustering for heterogeneous datasets.

    PubMed

    Gabasova, Evelina; Reid, John; Wernisch, Lorenz

    2017-10-01

    Integrative clustering is used to identify groups of samples by jointly analysing multiple datasets describing the same set of biological samples, such as gene expression, copy number, methylation etc. Most existing algorithms for integrative clustering assume that there is a shared consistent set of clusters across all datasets, and most of the data samples follow this structure. However in practice, the structure across heterogeneous datasets can be more varied, with clusters being joined in some datasets and separated in others. In this paper, we present a probabilistic clustering method to identify groups across datasets that do not share the same cluster structure. The proposed algorithm, Clusternomics, identifies groups of samples that share their global behaviour across heterogeneous datasets. The algorithm models clusters on the level of individual datasets, while also extracting global structure that arises from the local cluster assignments. Clusters on both the local and the global level are modelled using a hierarchical Dirichlet mixture model to identify structure on both levels. We evaluated the model both on simulated and on real-world datasets. The simulated data exemplifies datasets with varying degrees of common structure. In such a setting Clusternomics outperforms existing algorithms for integrative and consensus clustering. In a real-world application, we used the algorithm for cancer subtyping, identifying subtypes of cancer from heterogeneous datasets. We applied the algorithm to TCGA breast cancer dataset, integrating gene expression, miRNA expression, DNA methylation and proteomics. The algorithm extracted clinically meaningful clusters with significantly different survival probabilities. We also evaluated the algorithm on lung and kidney cancer TCGA datasets with high dimensionality, again showing clinically significant results and scalability of the algorithm.

  5. School-Based Screening for Suicide Risk: Balancing Costs and Benefits

    PubMed Central

    Wilcox, Holly; Huo, Yanling; Turner, J. Blake; Fisher, Prudence; Shaffer, David

    2010-01-01

    Objectives. We examined the effects of a scoring algorithm change on the burden and sensitivity of a screen for adolescent suicide risk. Methods. The Columbia Suicide Screen was used to screen 641 high school students for high suicide risk (recent ideation or lifetime attempt and depression, or anxiety, or substance use), determined by subsequent blind assessment with the Diagnostic Interview Schedule for Children. We compared the accuracy of different screen algorithms in identifying high-risk cases. Results. A screen algorithm comprising recent ideation or lifetime attempt or depression, anxiety, or substance-use problems set at moderate-severity level classed 35% of students as positive and identified 96% of high-risk students. Increasing the algorithm's threshold reduced the proportion identified to 24% and identified 92% of high-risk cases. Asking only about recent suicidal ideation or lifetime suicide attempt identified 17% of the students and 89% of high-risk cases. The proportion of nonsuicidal diagnosis–bearing students found with the 3 algorithms was 62%, 34%, and 12%, respectively. Conclusions. The Columbia Suicide Screen threshold can be altered to reduce the screen-positive population, saving costs and time while identifying almost all students at high risk for suicide. PMID:20634467

  6. Validation of classification algorithms for childhood diabetes identified from administrative data.

    PubMed

    Vanderloo, Saskia E; Johnson, Jeffrey A; Reimer, Kim; McCrea, Patrick; Nuernberger, Kimberly; Krueger, Hans; Aydede, Sema K; Collet, Jean-Paul; Amed, Shazhan

    2012-05-01

    Type 1 diabetes is the most common form of diabetes among children; however, the proportion of cases of childhood type 2 diabetes is increasing. In Canada, the National Diabetes Surveillance System (NDSS) uses administrative health data to describe trends in the epidemiology of diabetes, but does not specify diabetes type. The objective of this study was to validate algorithms to classify diabetes type in children <20 yr identified using the NDSS methodology. We applied the NDSS case definition to children living in British Columbia between 1 April 1996 and 31 March 2007. Through an iterative process, four potential classification algorithms were developed based on demographic characteristics and drug-utilization patterns. Each algorithm was then validated against a gold standard clinical database. Algorithms based primarily on an age rule (i.e., age <10 at diagnosis categorized type 1 diabetes) were most sensitive in the identification of type 1 diabetes; algorithms with restrictions on drug utilization (i.e., no prescriptions for insulin ± glucose monitoring strips categorized type 2 diabetes) were most sensitive for identifying type 2 diabetes. One algorithm was identified as having the optimal balance of sensitivity (Sn) and specificity (Sp) for the identification of both type 1 (Sn: 98.6%; Sp: 78.2%; PPV: 97.8%) and type 2 diabetes (Sn: 83.2%; Sp: 97.5%; PPV: 73.7%). Demographic characteristics in combination with drug-utilization patterns can be used to differentiate diabetes type among cases of pediatric diabetes identified within administrative health databases. Validation of similar algorithms in other regions is warranted. © 2011 John Wiley & Sons A/S.

  7. A systematic review of validated methods to capture myopericarditis using administrative or claims data.

    PubMed

    Idowu, Rachel T; Carnahan, Ryan; Sathe, Nila A; McPheeters, Melissa L

    2013-12-30

    To identify algorithms that can capture incident cases of myocarditis and pericarditis in administrative and claims databases; these algorithms can eventually be used to identify cardiac inflammatory adverse events following vaccine administration. We searched MEDLINE from 1991 to September 2012 using controlled vocabulary and key terms related to myocarditis. We also searched the reference lists of included studies. Two investigators independently assessed the full text of studies against pre-determined inclusion criteria. Two reviewers independently extracted data regarding participant and algorithm characteristics as well as study conduct. Nine publications (including one study reported in two publications) met criteria for inclusion. Two studies performed medical record review in order to confirm that these coding algorithms actually captured patients with the disease of interest. One of these studies identified five potential cases, none of which were confirmed as acute myocarditis upon review. The other study, which employed a search algorithm based on diagnostic surveillance (using ICD-9 codes 420.90, 420.99, 422.90, 422.91 and 429.0) and sentinel reporting, identified 59 clinically confirmed cases of myopericarditis among 492,671 United States military service personnel who received smallpox vaccine between 2002 and 2003. Neither study provided algorithm validation statistics (positive predictive value, sensitivity, or specificity). A validated search algorithm is currently unavailable for identifying incident cases of pericarditis or myocarditis. Several authors have published unvalidated ICD-9-based search algorithms that appear to capture myocarditis events occurring in the context of other underlying cardiac or autoimmune conditions. Copyright © 2013. Published by Elsevier Ltd.

  8. Importance of multi-modal approaches to effectively identify cataract cases from electronic health records

    PubMed Central

    Rasmussen, Luke V; Berg, Richard L; Linneman, James G; McCarty, Catherine A; Waudby, Carol; Chen, Lin; Denny, Joshua C; Wilke, Russell A; Pathak, Jyotishman; Carrell, David; Kho, Abel N; Starren, Justin B

    2012-01-01

    Objective: There is increasing interest in using electronic health records (EHRs) to identify subjects for genomic association studies, due in part to the availability of large amounts of clinical data and the expected cost efficiencies of subject identification. We describe the construction and validation of an EHR-based algorithm to identify subjects with age-related cataracts. Materials and methods: We used a multi-modal strategy consisting of structured database querying, natural language processing on free-text documents, and optical character recognition on scanned clinical images to identify cataract subjects and related cataract attributes. Extensive validation on 3657 subjects compared the multi-modal results to manual chart review. The algorithm was also implemented at participating electronic MEdical Records and GEnomics (eMERGE) institutions. Results: An EHR-based cataract phenotyping algorithm was successfully developed and validated, resulting in positive predictive values (PPVs) >95%. The multi-modal approach increased the identification of cataract subject attributes by a factor of three compared to single-mode approaches while maintaining high PPV. Components of the cataract algorithm were successfully deployed at three other institutions with similar accuracy. Discussion: A multi-modal strategy incorporating optical character recognition and natural language processing may increase the number of cases identified while maintaining similar PPVs. Such algorithms, however, require that the needed information be embedded within clinical documents. Conclusion: We have demonstrated that algorithms to identify and characterize cataracts can be developed utilizing data collected via the EHR. These algorithms provide a high level of accuracy even when implemented across multiple EHRs and institutional boundaries. PMID:22319176

  9. A physarum-inspired prize-collecting steiner tree approach to identify subnetworks for drug repositioning.

    PubMed

    Sun, Yahui; Hameed, Pathima Nusrath; Verspoor, Karin; Halgamuge, Saman

    2016-12-05

    Drug repositioning can reduce the time, costs and risks of drug development by identifying new therapeutic effects for known drugs. It is challenging to reposition drugs as pharmacological data is large and complex. Subnetwork identification has already been used to simplify the visualization and interpretation of biological data, but it has not been applied to drug repositioning so far. In this paper, we fill this gap by proposing a new Physarum-inspired Prize-Collecting Steiner Tree algorithm to identify subnetworks for drug repositioning. Drug Similarity Networks (DSN) are generated using the chemical, therapeutic, protein, and phenotype features of drugs. In DSNs, vertex prizes and edge costs represent the similarities and dissimilarities between drugs respectively, and terminals represent drugs in the cardiovascular class, as defined in the Anatomical Therapeutic Chemical classification system. A new Physarum-inspired Prize-Collecting Steiner Tree algorithm is proposed in this paper to identify subnetworks. We apply both the proposed algorithm and the widely-used GW algorithm to identify subnetworks in our 18 generated DSNs. In these DSNs, our proposed algorithm identifies subnetworks with an average Rand Index of 81.1%, while the GW algorithm can only identify subnetworks with an average Rand Index of 64.1%. We select 9 subnetworks with high Rand Index to find drug repositioning opportunities. 10 frequently occurring drugs in these subnetworks are identified as candidates to be repositioned for cardiovascular diseases. We find evidence to support previous discoveries that nitroglycerin, theophylline and acarbose may be able to be repositioned for cardiovascular diseases. Moreover, we identify seven previously unknown drug candidates that also may interact with the biological cardiovascular system. These discoveries show our proposed Prize-Collecting Steiner Tree approach as a promising strategy for drug repositioning.
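    The Rand Index quoted above measures how often two groupings of the same items agree on whether a pair of items belongs together. The following is a minimal sketch of that measure only, not the authors' evaluation code; the function name `rand_index` and the toy drug labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    items together, or both partitions keep them apart.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical toy data: five drugs, reference classes vs. found subnetworks.
reference = ["cardio", "cardio", "cardio", "other", "other"]
found = [1, 1, 2, 2, 2]
score = rand_index(reference, found)  # 6 of the 10 pairs agree
```

    Label values only need to be comparable within one partition, so class names and integer cluster ids can be mixed as in the toy data.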

  10. Metaphor Identification in Large Texts Corpora

    PubMed Central

    Neuman, Yair; Assaf, Dan; Cohen, Yohai; Last, Mark; Argamon, Shlomo; Howard, Newton; Frieder, Ophir

    2013-01-01

    Identifying metaphorical language-use (e.g., sweet child) is one of the challenges facing natural language processing. This paper describes three novel algorithms for automatic metaphor identification. The algorithms are variations of the same core algorithm. We evaluate the algorithms on two corpora of Reuters and the New York Times articles. The paper presents the most comprehensive study of metaphor identification in terms of scope of metaphorical phrases and annotated corpora size. Algorithms’ performance in identifying linguistic phrases as metaphorical or literal has been compared to human judgment. Overall, the algorithms outperform the state-of-the-art algorithm with 71% precision and 27% averaged improvement in prediction over the base-rate of metaphors in the corpus. PMID:23658625

  11. I was greedy, too.

    PubMed

    Coutu, Diane L

    2003-02-01

    Americans are outraged at the greediness of Wall Street analysts, dot-com entrepreneurs, and, most of all, chief executive officers. How could Tyco's Dennis Kozlowski use company funds to throw his wife a million-dollar birthday bash on an Italian island? How could Enron's Ken Lay sell thousands of shares of his company's once high-flying stock just before it crashed, leaving employees with nothing? Even America's most popular domestic guru, Martha Stewart, is suspected of having her hand in the cookie jar. To some extent, our outrage may be justified, writes HBR senior editor Diane Coutu. And yet, it's easy to forget that just a couple years ago these same people were lauded as heroes. Many Americans wanted nothing more, in fact, than to emulate them, to share in their fortunes. Indeed, we spent an enormous amount of time talking and thinking about double-digit returns, IPOs, day trading, and stock options. It could easily be argued that it was public indulgence in corporate money lust that largely created the mess we're now in. It's time to take a hard look at greed, both in its general form and in its peculiarly American incarnation, says Coutu. If Federal Reserve Board chairman Alan Greenspan was correct in telling Congress that "infectious greed" contaminated U.S. business, then we need to try to understand its causes--and how the average American may have contributed to it. Why did so many of us fall prey to greed? With a deep, almost reflexive trust in the free market, are Americans somehow greedier than other peoples? And as we look at the wreckage from the 1990s, can we be sure it won't happen again?

  12. Thinking about ... leadership. Warts and all.

    PubMed

    Kellerman, Barbara

    2004-01-01

    Does using Tyco's funds to purchase a $6,000 shower curtain and a $15,000 dog-shaped umbrella stand make Dennis Kozlowski a bad leader? Is Martha Stewart's career any less instructive because she may have sold some shares on the basis of a tip-off? Is leadership synonymous with moral leadership? Before 1970, the answer from most leadership theorists would certainly have been no. Look at Hitler, Stalin, Pol Pot, Mao Tsetung--great leaders all, but hardly good men. In fact, capricious, murderous, high-handed, corrupt, and evil leaders are effective and commonplace. Machiavelli celebrated them; the U.S. constitution built in safeguards against them. Everywhere, power goes hand in hand with corruption--everywhere, that is, except in the literature of business leadership. To read Tom Peters, Jay Conger, John Kotter, and most of their colleagues, leaders are, as Warren Bennis puts it, individuals who create shared meaning, have a distinctive voice, have the capacity to adapt, and have integrity. According to today's business literature, to be a leader is, by definition, to be benevolent. But leadership is not a moral concept, and it is high time we acknowledge that fact. We have as much to learn from those we would regard as bad examples as we do from the far fewer good examples we're presented with these days. Leaders are like the rest of us: trustworthy and deceitful, cowardly and brave, greedy and generous. To assume that all good leaders are good people is to be willfully blind to the reality of the human condition, and it severely limits our ability to become better leaders. Worse, it may cause senior executives to think that, because they are leaders, they are never deceitful, cowardly, or greedy. That way lies disaster.

  13. Identifying Physician-Recognized Depression from Administrative Data: Consequences for Quality Measurement

    PubMed Central

    Spettell, Claire M; Wall, Terry C; Allison, Jeroan; Calhoun, Jaimee; Kobylinski, Richard; Fargason, Rachel; Kiefe, Catarina I

    2003-01-01

    Background: Multiple factors limit identification of patients with depression from administrative data. However, administrative data drives many quality measurement systems, including the Health Plan Employer Data and Information Set (HEDIS®). Methods: We investigated two algorithms for identification of physician-recognized depression. The study sample was drawn from primary care physician member panels of a large managed care organization. All members were continuously enrolled between January 1 and December 31, 1997. Algorithm 1 required at least two criteria in any combination: (1) an outpatient diagnosis of depression or (2) a pharmacy claim for an antidepressant. Algorithm 2 included the same criteria as algorithm 1, but required a diagnosis of depression for all patients. With algorithm 1, we identified the medical records of a stratified, random subset of patients with and without depression (n=465). We also identified patients of primary care physicians with a minimum of 10 depressed members by algorithm 1 (n=32,819) and algorithm 2 (n=6,837). Results: The sensitivity, specificity, and positive predictive values were: Algorithm 1: 95 percent, 65 percent, 49 percent; Algorithm 2: 52 percent, 88 percent, 60 percent. Compared to algorithm 1, profiles from algorithm 2 revealed higher rates of follow-up visits (43 percent, 55 percent) and appropriate antidepressant dosage acutely (82 percent, 90 percent) and chronically (83 percent, 91 percent) (p<0.05 for all). Conclusions: Both algorithms had high false positive rates. Denominator construction (algorithm 1 versus 2) contributed significantly to variability in measured quality. Our findings raise concern about interpreting depression quality reports based upon administrative data. PMID:12968818

  14. Cloud computing-based TagSNP selection algorithm for human genome data.

    PubMed

    Hung, Che-Lun; Chen, Wen-Pei; Hua, Guan-Jie; Zheng, Huiru; Tsai, Suh-Jen Jane; Lin, Yaw-Ling

    2015-01-05

    Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used.

  15. Robust crop and weed segmentation under uncontrolled outdoor illumination.

    PubMed

    Jeon, Hong Y; Tian, Lei F; Zhu, Heping

    2011-01-01

    An image processing algorithm for detecting individual weeds was developed and evaluated. Weed detection processes included were normalized excessive green conversion, statistical threshold value estimation, adaptive image segmentation, median filter, morphological feature calculation and Artificial Neural Network (ANN). The developed algorithm was validated for its ability to identify and detect weeds and crop plants under uncontrolled outdoor illuminations. A machine vision implementing field robot captured field images under outdoor illuminations and the image processing algorithm automatically processed them without manual adjustment. The errors of the algorithm, when processing 666 field images, ranged from 2.1 to 2.9%. The ANN correctly detected 72.6% of crop plants from the identified plants, and considered the rest as weeds. However, the ANN identification rates for crop plants were improved up to 95.1% by addressing the error sources in the algorithm. The developed weed detection and image processing algorithm provides a novel method to identify plants against soil background under the uncontrolled outdoor illuminations, and to differentiate weeds from crop plants. Thus, the proposed new machine vision and processing algorithm may be useful for outdoor applications including plant specific direct applications (PSDA).

  16. Cloud Computing-Based TagSNP Selection Algorithm for Human Genome Data

    PubMed Central

    Hung, Che-Lun; Chen, Wen-Pei; Hua, Guan-Jie; Zheng, Huiru; Tsai, Suh-Jen Jane; Lin, Yaw-Ling

    2015-01-01

    Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used. PMID:25569088

  17. Identifying patients with ischemic heart disease in an electronic medical record.

    PubMed

    Ivers, Noah; Pylypenko, Bogdan; Tu, Karen

    2011-01-01

    Increasing utilization of electronic medical records (EMRs) presents an opportunity to efficiently measure quality indicators in primary care. Achieving this goal requires the development of accurate patient-disease registries. This study aimed to develop and validate an algorithm for identifying patients with ischemic heart disease (IHD) within the EMR. An algorithm was developed to search the unstructured text within the medical history fields in the EMR for IHD-related terminology. This algorithm was applied to a 5% random sample of adult patient charts (n = 969) drawn from a convenience sample of 17 Ontario family physicians. The accuracy of the algorithm for identifying patients with IHD was compared to the results of 3 trained chart abstractors. The manual chart abstraction identified 87 patients with IHD in the random sample (prevalence = 8.98%). The accuracy of the algorithm for identifying patients with IHD was as follows: sensitivity = 72.4% (95% confidence interval [CI]: 61.8-81.5); specificity = 99.3% (95% CI: 98.5-99.8); positive predictive value = 91.3% (95% CI: 82.0-96.7); negative predictive value = 97.3% (95% CI: 96.1-98.3); and kappa = 0.79 (95% CI: 0.72-0.86). Patients with IHD can be accurately identified by applying a search algorithm for the medical history fields in the EMR of primary care providers who were not using standardized approaches to code diagnoses. The accuracy compares favorably to other methods for identifying patients with IHD. The results of this study may aid policy makers, researchers, and clinicians to develop registries and to examine quality indicators for IHD in primary care.
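    The validation statistics in this record all derive from one 2x2 table of algorithm result against chart abstraction. The sketch below shows the arithmetic. The abstract does not report the cell counts: the values tp=63, fn=24, fp=6, tn=876 are an assumption, chosen because they are integers that sum to n = 969 with 87 true cases and reproduce the reported rates after rounding. The function name `screening_metrics` is hypothetical.

```python
def screening_metrics(tp, fn, fp, tn):
    """Standard accuracy measures from a 2x2 confusion table."""
    total = tp + fn + fp + tn
    observed = (tp + tn) / total  # raw agreement with the reference standard
    # Agreement expected by chance, from the marginal totals (Cohen's kappa).
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / total,
        "kappa": (observed - expected) / (1 - expected),
    }


# Assumed cell counts (not given in the abstract), inferred from the rates.
metrics = screening_metrics(tp=63, fn=24, fp=6, tn=876)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

    The confidence intervals quoted in the abstract are not reproduced here, since the abstract does not say which interval method was used.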

  18. Efficient Record Linkage Algorithms Using Complete Linkage Clustering.

    PubMed

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Datasets held by different agencies often contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times.
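    Two of the supporting techniques named in this abstract, sorting to find identical copies and blocking to limit comparisons, can be illustrated briefly. This is a minimal sketch under assumptions, not the authors' algorithm: the record layout, the helper names `deduplicate` and `block_by`, and the choice of the first surname letter as the blocking key are all hypothetical, and the complete linkage clustering step itself is not shown.

```python
from itertools import groupby


def deduplicate(records):
    """Sort so identical copies become adjacent, then keep one of each."""
    return [record for record, _ in groupby(sorted(records))]


def block_by(records, key):
    """Group records into blocks; only records sharing a block are compared."""
    blocks = {}
    for record in records:
        blocks.setdefault(key(record), []).append(record)
    return blocks


# Hypothetical records: (surname, given name, date of birth).
records = [
    ("smith", "john", "1980-01-02"),
    ("smith", "john", "1980-01-02"),  # exact copy, removed by sorting
    ("smyth", "john", "1980-01-02"),  # near match, left for clustering
    ("doe", "jane", "1975-07-30"),
]
unique = deduplicate(records)
blocks = block_by(unique, key=lambda record: record[0][0])
```

    After these two steps a clustering method would only need to compare "smith" with "smyth" inside block "s", rather than every pair in the full dataset.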

  19. Efficient Record Linkage Algorithms Using Complete Linkage Clustering

    PubMed Central

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Datasets held by different agencies often contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times. PMID:27124604

  20. Validity of administrative database code algorithms to identify vascular access placement, surgical revisions, and secondary patency.

    PubMed

    Al-Jaishi, Ahmed A; Moist, Louise M; Oliver, Matthew J; Nash, Danielle M; Fleet, Jamie L; Garg, Amit X; Lok, Charmaine E

    2018-03-01

    We assessed the validity of physician billing codes and hospital admission using International Classification of Diseases 10th revision codes to identify vascular access placement, secondary patency, and surgical revisions in administrative data. We included adults (≥18 years) with a vascular access placed between 1 April 2004 and 31 March 2013 at the University Health Network, Toronto. Our reference standard was a prospective vascular access database (VASPRO) that contains information on vascular access type and dates of placement, dates for failure, and any revisions. We used VASPRO to assess the validity of different administrative coding algorithms by calculating the sensitivity, specificity, and positive predictive values of vascular access events. The sensitivity (95% confidence interval) of the best performing algorithm to identify arteriovenous access placement was 86% (83%, 89%) and specificity was 92% (89%, 93%). The corresponding numbers to identify catheter insertion were 84% (82%, 86%) and 84% (80%, 87%), respectively. The sensitivity of the best performing coding algorithm to identify arteriovenous access surgical revisions was 81% (67%, 90%) and specificity was 89% (87%, 90%). The algorithm capturing arteriovenous access placement and catheter insertion had a positive predictive value greater than 90% and arteriovenous access surgical revisions had a positive predictive value of 20%. The duration of arteriovenous access secondary patency was on average 578 (553, 603) days in VASPRO and 555 (530, 580) days in administrative databases. Administrative data algorithms have fair to good operating characteristics to identify vascular access placement and arteriovenous access secondary patency. Low positive predictive values for surgical revisions algorithm suggest that administrative data should only be used to rule out the occurrence of an event.

  1. System and method for resolving gamma-ray spectra

    DOEpatents

    Gentile, Charles A.; Perry, Jason; Langish, Stephen W.; Silber, Kenneth; Davis, William M.; Mastrovito, Dana

    2010-05-04

    A system for identifying radionuclide emissions is described. The system includes at least one processor for processing output signals from a radionuclide detecting device, at least one training algorithm run by the at least one processor for analyzing data derived from at least one set of known sample data from the output signals, at least one classification algorithm derived from the training algorithm for classifying unknown sample data, wherein the at least one training algorithm analyzes the at least one sample data set to derive at least one rule used by said classification algorithm for identifying at least one radionuclide emission detected by the detecting device.

  2. Closed-form expressions for flip angle variation that maximize total signal in T1-weighted rapid gradient echo MRI.

    PubMed

    Drobnitzky, Matthias; Klose, Uwe

    2017-03-01

    Magnetization-prepared rapid gradient-echo (MPRAGE) sequences are commonly employed for T1-weighted structural brain imaging. Following a contrast preparation radiofrequency (RF) pulse, the data acquisition proceeds under nonequilibrium conditions of the relaxing longitudinal magnetization. Variation of the flip angle can be used to maximize total available signal. Simulated annealing or greedy algorithms have so far been published to numerically solve this problem, with signal-to-noise ratios optimized for clinical imaging scenarios by adhering to a predefined shape of the signal evolution. We propose an unconstrained optimization of the MPRAGE experiment that employs techniques from resource allocation theory. A new dynamic programming solution is introduced that yields closed-form expressions for optimal flip angle variation. Flip angle series are proposed that maximize total transverse magnetization (Mxy) for a range of physiologic T1 values. A 3D MPRAGE sequence is modified to allow for a controlled variation of the excitation angle. Experiments employing a T1 contrast phantom are performed at 3T. 1D acquisitions without phase encoding permit measurement of the temporal development of Mxy. Image mean signal and standard deviation for reference flip angle trains are compared in 2D measurements. Signal profiles at sharp phantom edges are acquired to assess image blurring related to nonuniform Mxy development. A novel closed-form expression for flip angle variation is found that constitutes the optimal policy to reach maximum total signal. It numerically equals previously published results of other authors when evaluated under their simplifying assumptions. Longitudinal magnetization (Mz) is exhaustively used without causing abrupt changes in the measured MR signal, which is a prerequisite for artifact free images. Phantom experiments at 3T verify the expected benefit for total accumulated k-space signal when compared with published flip angle series. Describing the MR signal collection in MPRAGE sequences as a Bellman problem is a new concept. By means of recursively solving a series of overlapping subproblems, this leads to an elegant solution for the problem of maximizing total available MR signal in k-space. A closed-form expression for flip angle variation avoids the complexity of numerical optimization and eases access to controlled variation in an attempt to identify potential clinical applications. © 2017 American Association of Physicists in Medicine.

  3. A New Population of Galactic Bulge Planetary Nebulas

    NASA Astrophysics Data System (ADS)

    Stenborg, T. N.

    A new population of Galactic bulge planetary nebulas is presented. Nebula candidates were discovered by systematically reviewing archival [OIII] on/off band survey imaging of the central -5° ≤ l ≤ 5°, -5° ≤ b ≤ 5° region around the Galactic centre. An image segmentation and interleaving scheme was developed to facilitate this review. The resultant candidates (> 200) were then double checked against complementary archival Hα sky survey data to screen for obvious planetary nebula (PN) mimics or spurious image artefacts. Confirmatory spectroscopy of the PN candidates was pursued with thin slit, fibre multiobject and wide field spectrographs. Custom software was built to streamline interfacing with third-party spectroscopic management tools and a parallel greedy set cover algorithm implemented for efficient field selection in constrained multi-object observations. The combined imaging and spectroscopic evidence yielded true (4), probable (31) and possible (83) PNs toward the bulge. Secondary discoveries such as new PN mimics and late type stars were by-products of the confirmatory spectroscopy. Instances of literature PN duplication encountered during the investigation were noticed and documented. Spectral analysis of new PNs, including those obtained with a new optimised sky subtraction technique devised and demonstrated here, provided diagnostic data allowing radial velocity and Balmer decrement determination. Using a combined diameter and radial velocity criterion, bona fide bulge PNs were distinguished from new foreground PNs. Where Balmer decrements were available for new bulge PNs, differential aperture photometry was used to provide a modest data increment to Galactic bulge planetary nebula luminosity function (PNLF). 
The PNLF was revised with data from some new bulge PNs, but more significantly, by a series of corrections to the data derived from previously known bulge PNs (~225), such as improved filter transmission effects, statistically justified binning and application of a uniform bulge-relevant extinction law. The result was the most rigorous bulge PNLF to date. An improvement on the legacy PNLF, the revised PNLF exhibited a form inconsistent with typical extragalactic examples, an expected result of the unusual extinction correction method used to address bulge-specific observational limitations. Issues restricting the accuracy of the bulge PNLF were identified. Until those restrictions are ameliorated, the utility of the PNLF in aiding physical understanding of its constituent members or their progenitors cannot be realised.
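
    The greedy set cover idea mentioned in this record for field selection can be sketched as follows. This is a minimal serial illustration, not the author's parallel implementation; the field names and target numbers are hypothetical.

```python
def greedy_set_cover(universe, subsets):
    """Pick subsets greedily until every element of `universe` is covered.

    `subsets` maps a label (here, a hypothetical field name) to the set of
    targets that field can observe.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Choose the subset that covers the most still-uncovered elements.
        best = max(subsets, key=lambda name: len(subsets[name] & uncovered))
        gained = subsets[best] & uncovered
        if not gained:
            raise ValueError("remaining targets cannot be covered")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical example: six candidate targets and four possible fields.
fields = {
    "field_a": {1, 2, 3},
    "field_b": {3, 4},
    "field_c": {4, 5, 6},
    "field_d": {1, 6},
}
selection = greedy_set_cover(range(1, 7), fields)
```

    Greedy selection does not guarantee the smallest possible number of fields, but it is simple and usually gives a good approximation.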

  4. Intervention in gene regulatory networks with maximal phenotype alteration.

    PubMed

    Yousefi, Mohammadmahdi R; Dougherty, Edward R

    2013-07-15

    A basic issue for translational genomics is to model gene interaction via gene regulatory networks (GRNs) and thereby provide an informatics environment to study the effects of intervention (say, via drugs) and to derive effective intervention strategies. Taking the view that the phenotype is characterized by the long-run behavior (steady-state distribution) of the network, we desire interventions to optimally move the probability mass from undesirable to desirable states. Heretofore, two external control approaches have been taken to shift the steady-state mass of a GRN: (i) use a user-defined cost function for which desirable shift of the steady-state mass is a by-product and (ii) use heuristics to design a greedy algorithm. Neither approach provides an optimal control policy relative to long-run behavior. We use a linear programming approach to optimally shift the steady-state mass from undesirable to desirable states, i.e. optimization is directly based on the amount of shift and therefore must outperform previously proposed methods. Moreover, the same basic linear programming structure is used for both unconstrained and constrained optimization, where in the latter case, constraints on the optimization limit the amount of mass that may be shifted to 'ambiguous' states, these being states that are not directly undesirable relative to the pathology of interest but which bear some perceived risk. We apply the method to probabilistic Boolean networks, but the theory applies to any Markovian GRN. Supplementary materials, including the simulation results, MATLAB source code and description of suboptimal methods are available at http://gsp.tamu.edu/Publications/supplementary/yousefi13b. Contact: edward@ece.tamu.edu. Supplementary data are available at Bioinformatics online.

  5. Selection of examples in case-based computer-aided decision systems

    PubMed Central

    Mazurowski, Maciej A.; Zurada, Jacek M.; Tourassi, Georgia D.

    2013-01-01

    Case-based computer-aided decision (CB-CAD) systems rely on a database of previously stored, known examples when classifying new, incoming queries. Such systems can be particularly useful since they do not need retraining every time a new example is deposited in the case base. The adaptive nature of case-based systems is well suited to the current trend of continuously expanding digital databases in the medical domain. To maintain efficiency, however, such systems need sophisticated strategies to effectively manage the available evidence database. In this paper, we discuss the general problem of building an evidence database by selecting the most useful examples to store while satisfying existing storage requirements. We evaluate three intelligent techniques for this purpose: genetic algorithm-based selection, greedy selection and random mutation hill climbing. These techniques are compared to a random selection strategy used as the baseline. The study is performed with a previously presented CB-CAD system applied for false positive reduction in screening mammograms. The experimental evaluation shows that when the development goal is to maximize the system’s diagnostic performance, the intelligent techniques are able to reduce the size of the evidence database to 37% of the original database by eliminating superfluous and/or detrimental examples while at the same time significantly improving the CAD system’s performance. Furthermore, if the case-base size is a main concern, the total number of examples stored in the system can be reduced to only 2–4% of the original database without a decrease in the diagnostic performance. Comparison of the techniques shows that random mutation hill climbing provides the best balance between the diagnostic performance and computational efficiency when building the evidence database of the CB-CAD system. PMID:18854606

  6. LensFlow: A Convolutional Neural Network in Search of Strong Gravitational Lenses

    NASA Astrophysics Data System (ADS)

    Pourrahmani, Milad; Nayyeri, Hooshang; Cooray, Asantha

    2018-03-01

    In this work, we present our machine learning classification algorithm for identifying strong gravitational lenses from wide-area surveys using convolutional neural networks; LENSFLOW. We train and test the algorithm using a wide variety of strong gravitational lens configurations from simulations of lensing events. Images are processed through multiple convolutional layers that extract feature maps necessary to assign a lens probability to each image. LENSFLOW provides a ranking scheme for all sources that could be used to identify potential gravitational lens candidates by significantly reducing the number of images that have to be visually inspected. We apply our algorithm to the HST/ACS i-band observations of the COSMOS field and present our sample of identified lensing candidates. The developed machine learning algorithm is more computationally efficient and complementary to classical lens identification algorithms and is ideal for discovering such events across wide areas from current and future surveys such as LSST and WFIRST.

  7. An energy-based perturbation and a taboo strategy for improving the searching ability of stochastic structural optimization methods

    NASA Astrophysics Data System (ADS)

    Cheng, Longjiu; Cai, Wensheng; Shao, Xueguang

    2005-03-01

    An energy-based perturbation and a new idea of taboo strategy are proposed for structural optimization and applied in a benchmark problem, i.e., the optimization of Lennard-Jones (LJ) clusters. It is proved that the energy-based perturbation is much better than the traditional random perturbation both in convergence speed and searching ability when it is combined with a simple greedy method. By tabooing the most wide-spread funnel instead of the visited solutions, the hit rate of other funnels can be significantly improved. Global minima of (LJ) clusters up to 200 atoms are found with high efficiency.

  8. [Images of nursing mothers in France, 18th and 19th centuries].

    PubMed

    Morel, Marie-France

    2010-01-01

    As they became more widely adopted in eighteenth- and nineteenth-century France, wet-nursing and wet-nurses appeared prominently in the iconography of the time. Such images turned negative as criticism against “mercenary breast-feeding” mounted. Over the nineteenth century in particular, wet-nurses were heavily featured in press caricatures: they were being mocked while described as simple-minded, dumb, greedy creatures, with proclivities ranging from a taste for garish attire, to sexual appetites fuelling trysts in public gardens with soldiers on leave. A representative sample of such images will be selected to highlight the codes and values underpinning this mockery.

  9. Testing the accuracy of redshift-space group-finding algorithms

    NASA Astrophysics Data System (ADS)

    Frederic, James J.

    1995-04-01

    Using simulated redshift surveys generated from a high-resolution N-body cosmological structure simulation, we study algorithms used to identify groups of galaxies in redshift space. Two algorithms are investigated; both are friends-of-friends schemes with variable linking lengths in the radial and transverse dimensions. The chief difference between the algorithms is in the redshift linking length. The algorithm proposed by Huchra & Geller (1982) uses a generous linking length designed to find 'fingers of god,' while that of Nolthenius & White (1987) uses a smaller linking length to minimize contamination by projection. We find that neither of the algorithms studied is intrinsically superior to the other; rather, the ideal algorithm as well as the ideal algorithm parameters depends on the purpose for which groups are to be studied. The Huchra & Geller algorithm misses few real groups, at the cost of including some spurious groups and members, while the Nolthenius & White algorithm misses high velocity dispersion groups and members but is less likely to include interlopers in its group assignments. Adjusting the parameters of either algorithm results in a trade-off between group accuracy and completeness. In a companion paper we investigate the accuracy of virial mass estimates and clustering properties of groups identified using these algorithms.
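
    The friends-of-friends idea with separate transverse and radial linking lengths can be sketched as below. This is a simplified illustration under stated assumptions: a flat-sky geometry, fixed linking lengths (the published algorithms scale them with redshift), and hypothetical names `d_link` and `v_link`; it is not either paper's implementation.

```python
import math


def friends_of_friends(galaxies, d_link, v_link):
    """Group galaxies given as (x, y, v) tuples.

    Two galaxies are "friends" if their separation on the sky is at most
    d_link AND their radial velocity difference is at most v_link.
    Groups are the connected components of the friendship graph.
    """
    parent = list(range(len(galaxies)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(galaxies)):
        for j in range(i + 1, len(galaxies)):
            xi, yi, vi = galaxies[i]
            xj, yj, vj = galaxies[j]
            close_on_sky = math.hypot(xi - xj, yi - yj) <= d_link
            close_in_velocity = abs(vi - vj) <= v_link
            if close_on_sky and close_in_velocity:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical toy catalogue: (x, y, radial velocity).
catalogue = [(0.0, 0.0, 1000.0), (0.5, 0.0, 1300.0), (1.0, 0.0, 1900.0), (5.0, 5.0, 1000.0)]
generous = friends_of_friends(catalogue, d_link=0.6, v_link=1000.0)
strict = friends_of_friends(catalogue, d_link=0.6, v_link=350.0)
```

    In this toy case the generous radial linking length merges the first three galaxies into one group, while the stricter one splits off the high-velocity member, mirroring the completeness versus contamination trade-off described in the abstract.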

  10. Identification of periods of clear sky irradiance in time series of GHI measurements

    DOE PAGES

    Reno, Matthew J.; Hansen, Clifford W.

    2016-01-18

    In this study, we present a simple algorithm for identifying periods of time with broadband global horizontal irradiance (GHI) similar to that occurring during clear sky conditions from a time series of GHI measurements. Other available methods to identify these periods do so by identifying periods with clear sky conditions using additional measurements, such as direct or diffuse irradiance. Our algorithm compares characteristics of the time series of measured GHI with the output of a clear sky model without requiring additional measurements. We validate our algorithm using data from several locations by comparing our results with those obtained from a clear sky detection algorithm, and with satellite and ground-based sky imagery.

  11. Identification of periods of clear sky irradiance in time series of GHI measurements

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Reno, Matthew J.; Hansen, Clifford W.

    In this study, we present a simple algorithm for identifying periods of time with broadband global horizontal irradiance (GHI) similar to that occurring during clear sky conditions from a time series of GHI measurements. Other available methods to identify these periods do so by identifying periods with clear sky conditions using additional measurements, such as direct or diffuse irradiance. Our algorithm compares characteristics of the time series of measured GHI with the output of a clear sky model without requiring additional measurements. We validate our algorithm using data from several locations by comparing our results with those obtained from a clear sky detection algorithm, and with satellite and ground-based sky imagery.

  12. Use of administrative and electronic health record data for development of automated algorithms for childhood diabetes case ascertainment and type classification: the SEARCH for Diabetes in Youth Study

    PubMed Central

    Zhong, Victor W.; Pfaff, Emily R.; Beavers, Daniel P.; Thomas, Joan; Jaacks, Lindsay M.; Bowlby, Deborah A.; Carey, Timothy S.; Lawrence, Jean M.; Dabelea, Dana; Hamman, Richard F.; Pihoker, Catherine; Saydah, Sharon H.; Mayer-Davis, Elizabeth J.

    2014-01-01

    Background The performance of automated algorithms for childhood diabetes case ascertainment and type classification may differ by demographic characteristics. Objective This study evaluated the potential of administrative and electronic health record (EHR) data from a large academic care delivery system to conduct diabetes case ascertainment in youth according to type, age and race/ethnicity. Subjects 57,767 children aged <20 years as of December 31, 2011 seen at University of North Carolina Health Care System in 2011 were included. Methods Using an initial algorithm including billing data, patient problem lists, laboratory test results and diabetes related medications between July 1, 2008 and December 31, 2011, presumptive cases were identified and validated by chart review. More refined algorithms were evaluated by type (type 1 versus type 2), age (<10 versus ≥10 years) and race/ethnicity (non-Hispanic white versus “other”). Sensitivity, specificity and positive predictive value were calculated and compared. Results The best algorithm for ascertainment of diabetes cases overall was billing data. The best type 1 algorithm was the ratio of the number of type 1 billing codes to the sum of type 1 and type 2 billing codes ≥0.5. A useful algorithm to ascertain type 2 youth with “other” race/ethnicity was identified. Considerable age and racial/ethnic differences were present in type-non-specific and type 2 algorithms. Conclusions Administrative and EHR data may be used to identify cases of childhood diabetes (any type), and to identify type 1 cases. The performance of type 2 case ascertainment algorithms differed substantially by race/ethnicity. PMID:24913103
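
    The best type 1 rule reported above (ratio of type 1 billing codes to the sum of type 1 and type 2 billing codes ≥0.5) can be written as a short check. The function name, argument names and the handling of patients with no diabetes billing codes are assumptions made for illustration only.

```python
def looks_like_type1(n_type1_codes, n_type2_codes, threshold=0.5):
    """Return True if the share of type 1 billing codes meets the threshold."""
    total = n_type1_codes + n_type2_codes
    if total == 0:
        # Assumption: with no diabetes billing codes, make no type 1 call.
        return False
    return n_type1_codes / total >= threshold
```

    For example, a patient with three type 1 codes and one type 2 code has a ratio of 0.75 and would be flagged as type 1 under this rule.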

  13. Chemodynamical Clustering Applied to APOGEE Data: Rediscovering Globular Clusters

    NASA Astrophysics Data System (ADS)

    Chen, Boquan; D’Onghia, Elena; Pardy, Stephen A.; Pasquali, Anna; Bertelli Motta, Clio; Hanlon, Bret; Grebel, Eva K.

    2018-06-01

    We have developed a novel technique based on a clustering algorithm that searches for kinematically and chemically clustered stars in the APOGEE DR12 Cannon data. As compared to classical chemical tagging, the kinematic information included in our methodology allows us to identify stars that are members of known globular clusters with greater confidence. We apply our algorithm to the entire APOGEE catalog of 150,615 stars whose chemical abundances are derived by the Cannon. Our methodology found anticorrelations between the elements Al and Mg, Na and O, and C and N previously identified in the optical spectra in globular clusters, even though we omit these elements in our algorithm. Our algorithm identifies globular clusters without a priori knowledge of their locations in the sky. Thus, not only does this technique promise to discover new globular clusters, but it also allows us to identify candidate streams of kinematically and chemically clustered stars in the Milky Way.

  14. An Automated Summarization Assessment Algorithm for Identifying Summarizing Strategies

    PubMed Central

    Abdi, Asad; Idris, Norisma; Alguliyev, Rasim M.; Aliguliyev, Ramiz M.

    2016-01-01

    Background Summarization is a process to select important information from a source text. Summarizing strategies are the core cognitive processes in summarization activity. Since summarization can be important as a tool to improve comprehension, it has attracted interest of teachers for teaching summary writing through direct instruction. To do this, they need to review and assess the students' summaries and these tasks are very time-consuming. Thus, a computer-assisted assessment can be used to help teachers to conduct this task more effectively. Design/Results This paper aims to propose an algorithm based on the combination of semantic relations between words and their syntactic composition to identify summarizing strategies employed by students in summary writing. An innovative aspect of our algorithm lies in its ability to identify summarizing strategies at the syntactic and semantic levels. The efficiency of the algorithm is measured in terms of Precision, Recall and F-measure. We then implemented the algorithm for the automated summarization assessment system that can be used to identify the summarizing strategies used by students in summary writing. PMID:26735139

  15. Administrative Data Algorithms Can Describe Ambulatory Physician Utilization

    PubMed Central

    Shah, Baiju R; Hux, Janet E; Laupacis, Andreas; Zinman, Bernard; Cauch-Dudek, Karen; Booth, Gillian L

    2007-01-01

    Objective To validate algorithms using administrative data that characterize ambulatory physician care for patients with a chronic disease. Data Sources Seven-hundred and eighty-one people with diabetes were recruited mostly from community pharmacies to complete a written questionnaire about their physician utilization in 2002. These data were linked with administrative databases detailing health service utilization. Study Design An administrative data algorithm was defined that identified whether or not patients received specialist care, and it was tested for agreement with self-report. Other algorithms, which assigned each patient to a primary care and specialist physician, were tested for concordance with self-reported regular providers of care. Principal Findings The algorithm to identify whether participants received specialist care had 80.4 percent agreement with questionnaire responses (κ = 0.59). Compared with self-report, administrative data had a sensitivity of 68.9 percent and specificity 88.3 percent for identifying specialist care. The best administrative data algorithm to assign each participant's regular primary care and specialist providers was concordant with self-report in 82.6 and 78.2 percent of cases, respectively. Conclusions Administrative data algorithms can accurately match self-reported ambulatory physician utilization. PMID:17610448

  16. Robust Crop and Weed Segmentation under Uncontrolled Outdoor Illumination

    PubMed Central

    Jeon, Hong Y.; Tian, Lei F.; Zhu, Heping

    2011-01-01

    An image processing algorithm for detecting individual weeds was developed and evaluated. Weed detection processes included were normalized excessive green conversion, statistical threshold value estimation, adaptive image segmentation, median filter, morphological feature calculation and Artificial Neural Network (ANN). The developed algorithm was validated for its ability to identify and detect weeds and crop plants under uncontrolled outdoor illuminations. A machine vision implementing field robot captured field images under outdoor illuminations and the image processing algorithm automatically processed them without manual adjustment. The errors of the algorithm, when processing 666 field images, ranged from 2.1 to 2.9%. The ANN correctly detected 72.6% of crop plants from the identified plants, and considered the rest as weeds. However, the ANN identification rates for crop plants were improved up to 95.1% by addressing the error sources in the algorithm. The developed weed detection and image processing algorithm provides a novel method to identify plants against soil background under the uncontrolled outdoor illuminations, and to differentiate weeds from crop plants. Thus, the proposed new machine vision and processing algorithm may be useful for outdoor applications including plant specific direct applications (PSDA). PMID:22163954

  17. Evaluation of the CDC proposed laboratory HIV testing algorithm among men who have sex with men (MSM) from five US metropolitan statistical areas using specimens collected in 2011.

    PubMed

    Masciotra, Silvina; Smith, Amanda J; Youngpairoj, Ae S; Sprinkle, Patrick; Miles, Isa; Sionean, Catlainn; Paz-Bailey, Gabriela; Johnson, Jeffrey A; Owen, S Michele

    2013-12-01

    Until recently most testing algorithms in the United States (US) utilized Western blot (WB) as the supplemental test. CDC has proposed an algorithm for HIV diagnosis which includes an initial screen with a Combo Antigen/Antibody 4th generation-immunoassay (IA), followed by an HIV-1/2 discriminatory IA of initially reactive-IA specimens. Discordant results in the proposed algorithm are resolved by nucleic acid-amplification testing (NAAT). Evaluate the results obtained with the CDC proposed laboratory-based algorithm using specimens from men who have sex with men (MSM) obtained in five metropolitan statistical areas (MSAs). Specimens from 992 MSM from five MSAs participating in the CDC's National HIV Behavioral Surveillance System in 2011 were tested at local facilities and CDC. The five MSAs utilized algorithms of various screening assays and specimen types, and WB as the supplemental test. At the CDC, serum/plasma specimens were screened with 4th generation-IA and the Multispot HIV-1/HIV-2 discriminatory assay was used as the supplemental test. NAAT was used to resolve discordant results and to further identify acute HIV infections from all screened-non-reactive missed by the proposed algorithm. Performance of the proposed algorithm was compared to site-specific WB-based algorithms. The proposed algorithm detected 254 infections. The WB-based algorithms detected 19 fewer infections; 4 by oral fluid (OF) rapid testing and 15 by WB supplemental testing (12 OF and 3 blood). One acute infection was identified by NAAT from all screened-non-reactive specimens. The proposed algorithm identified more infections than the WB-based algorithms in a high-risk MSM population. OF testing was associated with most of the discordant results between algorithms. HIV testing with the proposed algorithm can increase diagnosis of infected individuals, including early infections. Published by Elsevier B.V.

  18. Sentiment analysis enhancement with target variable in Kumar’s Algorithm

    NASA Astrophysics Data System (ADS)

    Arman, A. A.; Kawi, A. B.; Hurriyati, R.

    2016-04-01

    Sentiment analysis (also known as opinion mining) refers to the use of text analysis and computational linguistics to identify and extract subjective information in source materials. Sentiment analysis is widely applied to reviews and discussions on social media for many purposes, ranging from marketing and customer service to gauging public opinion on public policy. One popular algorithm for implementing sentiment analysis is the Kumar algorithm, developed by Kumar and Sebastian. The Kumar algorithm can identify the sentiment score of a statement, sentence or tweet, but cannot determine the relationship of the object or target to the sentiment being analysed. This research proposes a solution to that challenge by adding a component that represents the object or target to the existing Kumar algorithm. The result of this research is a modified algorithm that can give a sentiment score based on a given object or target.

  19. A simple algorithm for the identification of clinical COPD phenotypes.

    PubMed

    Burgel, Pierre-Régis; Paillasseur, Jean-Louis; Janssens, Wim; Piquet, Jacques; Ter Riet, Gerben; Garcia-Aymerich, Judith; Cosio, Borja; Bakke, Per; Puhan, Milo A; Langhammer, Arnulf; Alfageme, Inmaculada; Almagro, Pere; Ancochea, Julio; Celli, Bartolome R; Casanova, Ciro; de-Torres, Juan P; Decramer, Marc; Echazarreta, Andrés; Esteban, Cristobal; Gomez Punter, Rosa Mar; Han, MeiLan K; Johannessen, Ane; Kaiser, Bernhard; Lamprecht, Bernd; Lange, Peter; Leivseth, Linda; Marin, Jose M; Martin, Francis; Martinez-Camblor, Pablo; Miravitlles, Marc; Oga, Toru; Sofia Ramírez, Ana; Sin, Don D; Sobradillo, Patricia; Soler-Cataluña, Juan J; Turner, Alice M; Verdu Rivera, Francisco Javier; Soriano, Joan B; Roche, Nicolas

    2017-11-01

    This study aimed to identify simple rules for allocating chronic obstructive pulmonary disease (COPD) patients to clinical phenotypes identified by cluster analyses. Data from 2409 COPD patients of French/Belgian COPD cohorts were analysed using cluster analysis resulting in the identification of subgroups, for which clinical relevance was determined by comparing 3-year all-cause mortality. Classification and regression trees (CARTs) were used to develop an algorithm for allocating patients to these subgroups. This algorithm was tested in 3651 patients from the COPD Cohorts Collaborative International Assessment (3CIA) initiative. Cluster analysis identified five subgroups of COPD patients with different clinical characteristics (especially regarding severity of respiratory disease and the presence of cardiovascular comorbidities and diabetes). The CART-based algorithm indicated that the variables relevant for patient grouping differed markedly between patients with isolated respiratory disease (FEV1, dyspnoea grade) and those with multi-morbidity (dyspnoea grade, age, FEV1 and body mass index). Application of this algorithm to the 3CIA cohorts confirmed that it identified subgroups of patients with different clinical characteristics, mortality rates (median, from 4% to 27%) and age at death (median, from 68 to 76 years). A simple algorithm, integrating respiratory characteristics and comorbidities, allowed the identification of clinically relevant COPD phenotypes. Copyright ©ERS 2017.

  20. Use of electronic data and existing screening tools to identify clinically significant obstructive sleep apnea.

    PubMed

    Severson, Carl A; Pendharkar, Sachin R; Ronksley, Paul E; Tsai, Willis H

    2015-01-01

    To assess the ability of electronic health data and existing screening tools to identify clinically significant obstructive sleep apnea (OSA), as defined by symptomatic or severe OSA. The present retrospective cohort study of 1041 patients referred for sleep diagnostic testing was undertaken at a tertiary sleep centre in Calgary, Alberta. A diagnosis of clinically significant OSA or an alternative sleep diagnosis was assigned to each patient through blinded independent chart review by two sleep physicians. Predictive variables were identified from online questionnaire data, and diagnostic algorithms were developed. The performance of electronically derived algorithms for identifying patients with clinically significant OSA was determined. Diagnostic performance of these algorithms was compared with versions of the STOP-Bang questionnaire and adjusted neck circumference score (ANC) derived from electronic data. Electronic questionnaire data were highly sensitive (>95%) at identifying clinically significant OSA, but not specific. Sleep diagnostic testing-determined respiratory disturbance index was very specific (specificity ≥95%) for clinically relevant disease, but not sensitive (<35%). Derived algorithms had similar accuracy to the STOP-Bang or ANC, but required fewer questions and calculations. These data suggest that a two-step process using a small number of clinical variables (maximizing sensitivity) and objective diagnostic testing (maximizing specificity) is required to identify clinically significant OSA. When used in an online setting, simple algorithms can identify clinically relevant OSA with similar performance to existing decision rules such as the STOP-Bang or ANC.

  1. Sensitivity and specificity of administrative mortality data for identifying prescription opioid–related deaths

    PubMed Central

    Gladstone, Emilie; Smolina, Kate; Morgan, Steven G.; Fernandes, Kimberly A.; Martins, Diana; Gomes, Tara

    2016-01-01

    Background: Comprehensive systems for surveilling prescription opioid–related harms provide clear evidence that deaths from prescription opioids have increased dramatically in the United States. However, these harms are not systematically monitored in Canada. In light of a growing public health crisis, accessible, nationwide data sources to examine prescription opioid–related harms in Canada are needed. We sought to examine the performance of 5 algorithms to identify prescription opioid–related deaths from vital statistics data against data abstracted from the Office of the Chief Coroner of Ontario as a gold standard. Methods: We identified all prescription opioid–related deaths from Ontario coroners’ data that occurred between Jan. 31, 2003, and Dec. 31, 2010. We then used 5 different algorithms to identify prescription opioid–related deaths from vital statistics death data in 2010. We selected the algorithm with the highest sensitivity and a positive predictive value of more than 80% as the optimal algorithm for identifying prescription opioid–related deaths. Results: Four of the 5 algorithms had positive predictive values of more than 80%. The algorithm with the highest sensitivity (75%) in 2010 improved slightly in its predictive performance from 2003 to 2010. Interpretation: In the absence of specific systems for monitoring prescription opioid–related deaths in Canada, readily available national vital statistics data can be used to study prescription opioid–related mortality with considerable accuracy. Despite some limitations, these data may facilitate the implementation of national surveillance and monitoring strategies. PMID:26622006
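
    The selection rule described in the Methods (highest sensitivity among algorithms with a positive predictive value above 80%) can be sketched as follows. The algorithm labels and the true positive, false positive and false negative counts are made up for illustration and are not results from the study.

```python
def sensitivity(tp, fn):
    """Share of gold-standard deaths that the algorithm identified."""
    return tp / (tp + fn)


def positive_predictive_value(tp, fp):
    """Share of algorithm-identified deaths confirmed by the gold standard."""
    return tp / (tp + fp)


def pick_algorithm(results, min_ppv=0.80):
    """Return the label with the highest sensitivity among those with PPV > min_ppv."""
    eligible = {
        name: counts
        for name, counts in results.items()
        if positive_predictive_value(counts["tp"], counts["fp"]) > min_ppv
    }
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda name: sensitivity(eligible[name]["tp"], eligible[name]["fn"]),
    )


# Hypothetical counts against a gold standard (not study data).
hypothetical_results = {
    "algorithm_a": {"tp": 60, "fp": 5, "fn": 40},   # high PPV, lower sensitivity
    "algorithm_b": {"tp": 70, "fp": 10, "fn": 30},  # PPV above 80%, higher sensitivity
    "algorithm_c": {"tp": 90, "fp": 40, "fn": 10},  # most sensitive, but PPV below 80%
}
optimal = pick_algorithm(hypothetical_results)
```

    Here the most sensitive candidate is excluded because its positive predictive value falls below the cut-off, so the rule settles on the second candidate.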

  2. Sensitivity and specificity of administrative mortality data for identifying prescription opioid-related deaths.

    PubMed

    Gladstone, Emilie; Smolina, Kate; Morgan, Steven G; Fernandes, Kimberly A; Martins, Diana; Gomes, Tara

    2016-03-01

    Comprehensive systems for surveilling prescription opioid-related harms provide clear evidence that deaths from prescription opioids have increased dramatically in the United States. However, these harms are not systematically monitored in Canada. In light of a growing public health crisis, accessible, nationwide data sources to examine prescription opioid-related harms in Canada are needed. We sought to examine the performance of 5 algorithms to identify prescription opioid-related deaths from vital statistics data against data abstracted from the Office of the Chief Coroner of Ontario as a gold standard. We identified all prescription opioid-related deaths from Ontario coroners' data that occurred between Jan. 31, 2003, and Dec. 31, 2010. We then used 5 different algorithms to identify prescription opioid-related deaths from vital statistics death data in 2010. We selected the algorithm with the highest sensitivity and a positive predictive value of more than 80% as the optimal algorithm for identifying prescription opioid-related deaths. Four of the 5 algorithms had positive predictive values of more than 80%. The algorithm with the highest sensitivity (75%) in 2010 improved slightly in its predictive performance from 2003 to 2010. In the absence of specific systems for monitoring prescription opioid-related deaths in Canada, readily available national vital statistics data can be used to study prescription opioid-related mortality with considerable accuracy. Despite some limitations, these data may facilitate the implementation of national surveillance and monitoring strategies. © 2016 Canadian Medical Association or its licensors.

  3. An Enhanced K-Means Algorithm for Water Quality Analysis of The Haihe River in China.

    PubMed

    Zou, Hui; Zou, Zhihong; Wang, Xiaojing

    2015-11-12

    The increase and the complexity of data caused by the uncertain environment is today's reality. In order to identify water quality effectively and reliably, this paper presents a modified fast clustering algorithm for water quality analysis. The algorithm has adopted a varying weights K-means cluster algorithm to analyze water monitoring data. The varying weights scheme was the best weighting indicator selected by a modified indicator weight self-adjustment algorithm based on K-means, which is named MIWAS-K-means. The new clustering algorithm avoids the margin of the iteration not being calculated in some cases. With the fast clustering analysis, we can identify the quality of water samples. The algorithm is applied in water quality analysis of the Haihe River (China) data obtained by the monitoring network over a period of eight years (2006-2013) with four indicators at seven different sites (2078 samples). Both the theoretical and simulated results demonstrate that the algorithm is efficient and reliable for water quality analysis of the Haihe River. In addition, the algorithm can be applied to more complex data matrices with high dimensionality.

  4. 78 FR 57639 - Request for Comments on Pediatric Planned Procedure Algorithm

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-09-19

    ... Comments on Pediatric Planned Procedure Algorithm AGENCY: Agency for Healthcare Research and Quality (AHRQ), HHS. ACTION: Notice of request for comments on pediatric planned procedure algorithm from the members... Quality (AHRQ) is requesting comments from the public on an algorithm for identifying pediatric planned...

  5. Abbreviation definition identification based on automatic precision estimates.

    PubMed

    Sohn, Sunghwan; Comeau, Donald C; Kim, Won; Wilbur, W John

    2008-09-25

    The rapid growth of biomedical literature presents challenges for automatic text processing, and one of the challenges is abbreviation identification. The presence of unrecognized abbreviations in text hinders indexing algorithms and adversely affects information retrieval and extraction. Automatic abbreviation definition identification can help resolve these issues. However, abbreviations and their definitions identified by an automatic process are of uncertain validity. Due to the size of databases such as MEDLINE only a small fraction of abbreviation-definition pairs can be examined manually. An automatic way to estimate the accuracy of abbreviation-definition pairs extracted from text is needed. In this paper we propose an abbreviation definition identification algorithm that employs a variety of strategies to identify the most probable abbreviation definition. In addition our algorithm produces an accuracy estimate, pseudo-precision, for each strategy without using a human-judged gold standard. The pseudo-precisions determine the order in which the algorithm applies the strategies in seeking to identify the definition of an abbreviation. On the Medstract corpus our algorithm produced 97% precision and 85% recall which is higher than previously reported results. We also annotated 1250 randomly selected MEDLINE records as a gold standard. On this set we achieved 96.5% precision and 83.2% recall. This compares favourably with the well known Schwartz and Hearst algorithm. We developed an algorithm for abbreviation identification that uses a variety of strategies to identify the most probable definition for an abbreviation and also produces an estimated accuracy of the result. This process is purely automatic.

  6. An administrative data validation study of the accuracy of algorithms for identifying rheumatoid arthritis: the influence of the reference standard on algorithm performance.

    PubMed

    Widdifield, Jessica; Bombardier, Claire; Bernatsky, Sasha; Paterson, J Michael; Green, Diane; Young, Jacqueline; Ivers, Noah; Butt, Debra A; Jaakkimainen, R Liisa; Thorne, J Carter; Tu, Karen

    2014-06-23

    We have previously validated administrative data algorithms to identify patients with rheumatoid arthritis (RA) using rheumatology clinic records as the reference standard. Here we reassessed the accuracy of the algorithms using primary care records as the reference standard. We performed a retrospective chart abstraction study using a random sample of 7500 adult patients under the care of 83 family physicians contributing to the Electronic Medical Record Administrative data Linked Database (EMRALD) in Ontario, Canada. Using physician-reported diagnoses as the reference standard, we computed and compared the sensitivity, specificity, and predictive values for over 100 administrative data algorithms for RA case ascertainment. We identified 69 patients with RA for a lifetime RA prevalence of 0.9%. All algorithms had excellent specificity (>97%). However, sensitivity varied (75-90%) among physician billing algorithms. Despite the low prevalence of RA, most algorithms had adequate positive predictive value (PPV; 51-83%). The algorithm of "[1 hospitalization RA diagnosis code] or [3 physician RA diagnosis codes with ≥1 by a specialist over 2 years]" had a sensitivity of 78% (95% CI 69-88), specificity of 100% (95% CI 100-100), PPV of 78% (95% CI 69-88) and NPV of 100% (95% CI 100-100). Administrative data algorithms for detecting RA patients achieved a high degree of accuracy amongst the general population. However, results varied slightly from our previous report, which can be attributed to differences in the reference standards with respect to disease prevalence, spectrum of disease, and type of comparator group.
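
    The dependence of predictive values on prevalence mentioned above follows from Bayes' rule. The sketch below is illustrative only: the prevalence (69 of 7500) and the 78% sensitivity come from the abstract, while the specificity values passed in are assumptions chosen to show how strongly PPV reacts to small changes in specificity when a disease is rare.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity and prevalence."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


prevalence = 69 / 7500  # about 0.9%, as reported in the abstract

# Specificities below are assumed values for illustration.
ppv_high_spec = ppv(0.78, 0.998, prevalence)
ppv_lower_spec = ppv(0.78, 0.97, prevalence)
npv_high_spec = npv(0.78, 0.998, prevalence)
```

    Under these assumptions, a drop in specificity of under three percentage points cuts the PPV sharply, while the NPV stays close to 1 because the condition is rare.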

  7. Identifying Optimal Measurement Subspace for the Ensemble Kalman Filter

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhou, Ning; Huang, Zhenyu; Welch, Greg

    2012-05-24

    To reduce the computational load of the ensemble Kalman filter while maintaining its efficacy, an optimization algorithm based on the generalized eigenvalue decomposition method is proposed for identifying the most informative measurement subspace. When the number of measurements is large, the proposed algorithm can be used to make an effective tradeoff between computational complexity and estimation accuracy. This algorithm also can be extended to other Kalman filters for measurement subspace selection.

  8. Billing code algorithms to identify cases of peripheral artery disease from administrative data

    PubMed Central

    Fan, Jin; Arruda-Olson, Adelaide M; Leibson, Cynthia L; Smith, Carin; Liu, Guanghui; Bailey, Kent R; Kullo, Iftikhar J

    2013-01-01

    Objective: To construct and validate billing code algorithms for identifying patients with peripheral arterial disease (PAD). Methods: We extracted all encounters and line item details including PAD-related billing codes at Mayo Clinic Rochester, Minnesota, between July 1, 1997 and June 30, 2008; 22 712 patients evaluated in the vascular laboratory were divided into training and validation sets. Multiple logistic regression analysis was used to create an integer code score from the training dataset, and this was tested in the validation set. We applied a model-based code algorithm to patients evaluated in the vascular laboratory and compared this with a simpler algorithm (presence of at least one of the ICD-9 PAD codes 440.20–440.29). We also applied both algorithms to a community-based sample (n=4420), followed by a manual review. Results: The logistic regression model performed well in both training and validation datasets (c statistic=0.91). In patients evaluated in the vascular laboratory, the model-based code algorithm provided better negative predictive value. The simpler algorithm was reasonably accurate for identification of PAD status, with lesser sensitivity and greater specificity. In the community-based sample, the sensitivity (38.7% vs 68.0%) of the simpler algorithm was much lower, whereas the specificity (92.0% vs 87.6%) was higher than the model-based algorithm. Conclusions: A model-based billing code algorithm had reasonable accuracy in identifying PAD cases from the community, and in patients referred to the non-invasive vascular laboratory. The simpler algorithm had reasonable accuracy for identification of PAD in patients referred to the vascular laboratory but was significantly less sensitive in a community-based sample. PMID:24166724
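
    The simpler algorithm above (presence of at least one ICD-9 code in the range 440.20–440.29) amounts to a set-membership test. A minimal sketch, assuming codes arrive as dotted strings; the function name and input format are hypothetical, and the model-based integer score is not reproduced here because its weights are not given in the abstract.

```python
# ICD-9 codes 440.20 through 440.29, as named in the abstract.
PAD_CODES = {f"440.2{digit}" for digit in range(10)}


def has_pad_code(billing_codes):
    """Return True if at least one billing code falls in 440.20-440.29."""
    return any(code.strip() in PAD_CODES for code in billing_codes)


flagged = has_pad_code(["401.9", "440.21"])
not_flagged = has_pad_code(["440.1", "440.30", "250.00"])
```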

  9. Nucleus detection using gradient orientation information and linear least squares regression

    NASA Astrophysics Data System (ADS)

    Kwak, Jin Tae; Hewitt, Stephen M.; Xu, Sheng; Pinto, Peter A.; Wood, Bradford J.

    2015-03-01

    Computerized histopathology image analysis enables an objective, efficient, and quantitative assessment of digitized histopathology images. Such analysis often requires an accurate and efficient detection and segmentation of histological structures such as glands, cells and nuclei. The segmentation is used to characterize tissue specimens and to determine the disease status or outcomes. The segmentation of nuclei, in particular, is challenging due to the overlapping or clumped nuclei. Here, we propose a nuclei seed detection method for the individual and overlapping nuclei that utilizes the gradient orientation or direction information. The initial nuclei segmentation is provided by a multiview boosting approach. The angle of the gradient orientation is computed and traced for the nuclear boundaries. Taking the first derivative of the angle of the gradient orientation, high concavity points (junctions) are discovered. False junctions are found and removed by adopting a greedy search scheme with the goodness-of-fit statistic in a linear least squares sense. Then, the junctions determine boundary segments. Partial boundary segments belonging to the same nucleus are identified and combined by examining the overlapping area between them. Using the final set of the boundary segments, we generate the list of seeds in tissue images. The method achieved an overall precision of 0.89 and a recall of 0.88 in comparison to the manual segmentation.

  10. [Algorithms based on medico-administrative data in the field of endocrine, nutritional and metabolic diseases, especially diabetes].

    PubMed

    Fosse-Edorh, S; Rigou, A; Morin, S; Fezeu, L; Mandereau-Bruno, L; Fagot-Campagna, A

    2017-10-01

    Medico-administrative databases represent a very interesting source of information in the field of endocrine, nutritional and metabolic diseases. The objective of this article is to describe the early works of the Redsiam working group in this field. Algorithms developed in France in the field of diabetes, the treatment of dyslipidemia, precocious puberty, and bariatric surgery based on the National Inter-schema Information System on Health Insurance (SNIIRAM) data were identified and described. Three algorithms for identifying people with diabetes are available in France. These algorithms are based either on full insurance coverage for diabetes or on claims of diabetes treatments, or on the combination of these two methods associated with hospitalizations related to diabetes. Each of these algorithms has a different purpose, and the choice should depend on the goal of the study. Algorithms for identifying people treated for dyslipidemia or precocious puberty or who underwent bariatric surgery are also available. Early work from the Redsiam working group in the field of endocrine, nutritional and metabolic diseases produced an inventory of existing algorithms in France, linked with their goals, together with a presentation of their limitations and advantages, providing useful information for the scientific community. This work will continue with discussions about algorithms on the incidence of diabetes in children, thyroidectomy for thyroid nodules, hypothyroidism, hypoparathyroidism, and amyloidosis. Copyright © 2017 Elsevier Masson SAS. All rights reserved.

  11. A systematic review of validated methods for identifying transfusion-related ABO incompatibility reactions using administrative and claims data.

    PubMed

    Carnahan, Ryan M; Kee, Vicki R

    2012-01-01

    This paper aimed to systematically review algorithms to identify transfusion-related ABO incompatibility reactions in administrative data, with a focus on studies that have examined the validity of the algorithms. A literature search was conducted using PubMed, Iowa Drug Information Service database, and Embase. A Google Scholar search was also conducted because of the difficulty identifying relevant studies. Reviews were conducted by two investigators to identify studies using data sources from the USA or Canada because these data sources were most likely to reflect the coding practices of Mini-Sentinel data sources. One study was found that validated International Classification of Diseases (ICD-9-CM) codes representing transfusion reactions. None of these cases were ABO incompatibility reactions. Several studies consistently used ICD-9-CM code 999.6, which represents ABO incompatibility reactions, and a technical report identified the ICD-10 code for these reactions. One study included the E-code E8760 for mismatched blood in transfusion in the algorithm. Another study reported finding no ABO incompatibility reaction codes in the Healthcare Cost and Utilization Project Nationwide Inpatient Sample database, which contains data of 2.23 million patients who received transfusions, raising questions about the sensitivity of administrative data for identifying such reactions. Two studies reported perfect specificity, with sensitivity ranging from 21% to 83%, for the code identifying allogeneic red blood cell transfusions in hospitalized patients. There is no information to assess the validity of algorithms to identify transfusion-related ABO incompatibility reactions. Further information on the validity of algorithms to identify transfusions would also be useful. Copyright © 2012 John Wiley & Sons, Ltd.

  12. GDPC: Gravitation-based Density Peaks Clustering algorithm

    NASA Astrophysics Data System (ADS)

    Jiang, Jianhua; Hao, Dehao; Chen, Yujun; Parmar, Milan; Li, Keqin

    2018-07-01

    The Density Peaks Clustering algorithm, which we refer to as DPC, is a novel and efficient density-based clustering approach that was published in Science in 2014. DPC has the advantage of discovering clusters with varying sizes and densities, but it has limitations in detecting the number of clusters and identifying anomalies. We develop an enhanced algorithm with an alternative decision graph based on gravitation theory and nearby distance to identify centroids and anomalies accurately. We apply our method to some UCI and synthetic data sets. We report comparative clustering performances using F-Measure and 2-dimensional vision. We also compare our method to other clustering algorithms, such as K-Means, Affinity Propagation (AP) and DPC. We present F-Measure scores and clustering accuracies of our GDPC algorithm compared to K-Means, AP and DPC on different data sets. We show that GDPC has superior performance in its capability of: (1) detecting the number of clusters clearly; (2) aggregating clusters with varying sizes and densities efficiently; (3) identifying anomalies accurately.
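
    For orientation, the two quantities behind the original DPC decision graph can be sketched in a few lines: a local density (number of neighbours within a cutoff distance) and, for each point, the distance to the nearest point of higher density. This is a sketch of baseline DPC only, not of the gravitation-based GDPC variant, whose decision graph is not specified in the abstract. The cutoff `d_c` and the toy points are assumptions.

```python
import math


def density_peaks(points, d_c):
    """Return (rho, delta) for each point, following the baseline DPC definitions.

    rho[i]   = number of other points closer than the cutoff d_c
    delta[i] = distance to the nearest point with strictly higher rho;
               points with no denser point get their maximum distance instead
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Two small cross-shaped clusters plus one isolated point (toy data).
points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1),
          (10, 0), (11, 0), (9, 0), (10, 1), (10, -1),
          (5, 8)]
rho, delta = density_peaks(points, d_c=1.5)

# Candidate centres have both high rho and high delta.
gamma = [r * d for r, d in zip(rho, delta)]
centres = sorted(range(len(points)), key=gamma.__getitem__, reverse=True)[:2]
```

    In this toy set the two cross centres stand out with high density and high distance, while the isolated point has zero density and a large distance, the signature usually read as an anomaly.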

  13. Optimizing research in symptomatic uterine fibroids with development of a computable phenotype for use with electronic health records.

    PubMed

    Hoffman, Sarah R; Vines, Anissa I; Halladay, Jacqueline R; Pfaff, Emily; Schiff, Lauren; Westreich, Daniel; Sundaresan, Aditi; Johnson, La-Shell; Nicholson, Wanda K

    2018-06-01

    Women with symptomatic uterine fibroids can report a myriad of symptoms, including pain, bleeding, infertility, and psychosocial sequelae. Optimizing fibroid research requires the ability to enroll populations of women with image-confirmed symptomatic uterine fibroids. Our objective was to develop an electronic health record-based algorithm to identify women with symptomatic uterine fibroids for a comparative effectiveness study of medical or surgical treatments on quality-of-life measures. Using an iterative process and text-mining techniques, an effective computable phenotype algorithm, composed of demographics, and clinical and laboratory characteristics, was developed with reasonable performance. Such algorithms provide a feasible, efficient way to identify populations of women with symptomatic uterine fibroids for the conduct of large traditional or pragmatic trials and observational comparative effectiveness studies. Symptomatic uterine fibroids, due to menorrhagia, pelvic pain, bulk symptoms, or infertility, are a source of substantial morbidity for reproductive-age women. Comparing Treatment Options for Uterine Fibroids is a multisite registry study to compare the effectiveness of hormonal or surgical fibroid treatments on women's perceptions of their quality of life. Electronic health record-based algorithms are able to identify large numbers of women with fibroids, but additional work is needed to develop electronic health record algorithms that can identify women with symptomatic fibroids to optimize fibroid research. We sought to develop an efficient electronic health record-based algorithm that can identify women with symptomatic uterine fibroids in a large health care system for recruitment into large-scale observational and interventional research in fibroid management. We developed and assessed the accuracy of 3 algorithms to identify patients with symptomatic fibroids using an iterative approach. 
The data source was the Carolina Data Warehouse for Health, a repository for the health system's electronic health record data. In addition to International Classification of Diseases, Ninth Revision diagnosis and procedure codes and clinical characteristics, text data-mining software was used to derive information from imaging reports to confirm the presence of uterine fibroids. Results of each algorithm were compared with expert manual review to calculate the positive predictive values for each algorithm. Algorithm 1 was composed of the following criteria: (1) age 18-54 years; (2) either ≥1 International Classification of Diseases, Ninth Revision diagnosis codes for uterine fibroids or mention of fibroids using text-mined key words in imaging records or documents; and (3) no International Classification of Diseases, Ninth Revision or Current Procedural Terminology codes for hysterectomy and no reported history of hysterectomy. The positive predictive value was 47% (95% confidence interval 39-56%). Algorithm 2 required ≥2 International Classification of Diseases, Ninth Revision diagnosis codes for fibroids and positive text-mined key words and had a positive predictive value of 65% (95% confidence interval 50-79%). In algorithm 3, further refinements included ≥2 International Classification of Diseases, Ninth Revision diagnosis codes for fibroids on separate outpatient visit dates, the exclusion of women who had a positive pregnancy test within 3 months of their fibroid-related visit, and exclusion of incidentally detected fibroids during prenatal or emergency department visits. Algorithm 3 achieved a positive predictive value of 76% (95% confidence interval 71-81%). An electronic health record-based algorithm is capable of identifying cases of symptomatic uterine fibroids with moderate positive predictive value and may be an efficient approach for large-scale study recruitment. Copyright © 2018 Elsevier Inc. All rights reserved.
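
    The three criteria listed for Algorithm 1 translate directly into a rule. The sketch below is a hypothetical encoding: the `Patient` fields and function name are invented for illustration, and the text-mining step is represented only by a boolean flag.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age: int
    fibroid_dx_codes: int             # count of ICD-9 fibroid diagnosis codes
    imaging_mentions_fibroids: bool   # text-mined key word found in imaging records
    hysterectomy_code: bool           # any ICD-9 or CPT hysterectomy code
    reported_hysterectomy: bool       # reported history of hysterectomy


def algorithm_1(p: Patient) -> bool:
    """Criteria (1)-(3) of Algorithm 1 as stated in the abstract."""
    in_age_range = 18 <= p.age <= 54
    fibroid_evidence = p.fibroid_dx_codes >= 1 or p.imaging_mentions_fibroids
    no_hysterectomy = not (p.hysterectomy_code or p.reported_hysterectomy)
    return in_age_range and fibroid_evidence and no_hysterectomy


included = algorithm_1(Patient(35, 0, True, False, False))
excluded_age = algorithm_1(Patient(60, 2, True, False, False))
excluded_surgery = algorithm_1(Patient(40, 1, False, True, False))
```

    Algorithms 2 and 3 tighten the evidence requirement and add exclusions, which is consistent with the rising positive predictive values reported above.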

  14. Blooming Trees: Substructures and Surrounding Groups of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Yu, Heng; Diaferio, Antonaldo; Serra, Ana Laura; Baldi, Marco

    2018-06-01

    We develop the Blooming Tree Algorithm, a new technique that uses spectroscopic redshift data alone to identify the substructures and the surrounding groups of galaxy clusters, along with their member galaxies. Based on the estimated binding energy of galaxy pairs, the algorithm builds a binary tree that hierarchically arranges all of the galaxies in the field of view. The algorithm searches for buds, corresponding to gravitational potential minima on the binary tree branches; for each bud, the algorithm combines the number of galaxies, their velocity dispersion, and their average pairwise distance into a parameter that discriminates between the buds that do not correspond to any substructure or group, and thus eventually die, and the buds that correspond to substructures and groups, and thus bloom into the identified structures. We test our new algorithm with a sample of 300 mock redshift surveys of clusters in different dynamical states; the clusters are extracted from a large cosmological N-body simulation of a ΛCDM model. We limit our analysis to substructures and surrounding groups identified in the simulation with mass larger than $10^{13}\,h^{-1}\,M_\odot$. With mock redshift surveys with 200 galaxies within $6\,h^{-1}$ Mpc from the cluster center, the technique recovers 80% of the real substructures and 60% of the surrounding groups; in 57% of the identified structures, at least 60% of the member galaxies of the substructures and groups belong to the same real structure. These results improve by roughly a factor of two the performance of the best substructure identification algorithm currently available, the σ plateau algorithm, and suggest that our Blooming Tree Algorithm can be an invaluable tool for detecting substructures of galaxy clusters and investigating their complex dynamics.

  15. An Evaluation of Algorithms for Identifying Metastatic Breast, Lung, or Colorectal Cancer in Administrative Claims Data.

    PubMed

    Whyte, Joanna L; Engel-Nitz, Nicole M; Teitelbaum, April; Gomez Rey, Gabriel; Kallich, Joel D

    2015-07-01

    Administrative health care claims data are used for epidemiologic, health services, and outcomes cancer research and thus play a significant role in policy. Cancer stage, which is often a major driver of cost and clinical outcomes, is not typically included in claims data. Evaluate algorithms used in a dataset of cancer patients to identify patients with metastatic breast (BC), lung (LC), or colorectal (CRC) cancer using claims data. Clinical data on BC, LC, or CRC patients (between January 1, 2007 and March 31, 2010) were linked to a health care claims database. Inclusion required health plan enrollment ≥3 months before initial cancer diagnosis date. Algorithms were used in the claims database to identify patients' disease status, which was compared with physician-reported metastases. Generic and tumor-specific algorithms were evaluated using ICD-9 codes, varying diagnosis time frames, and including/excluding other tumors. Positive and negative predictive values, sensitivity, and specificity were assessed. The linked databases included 14,480 patients; of whom, 32%, 17%, and 14.2% had metastatic BC, LC, and CRC, respectively, at diagnosis and met inclusion criteria. Nontumor-specific algorithms had lower specificity than tumor-specific algorithms. Tumor-specific algorithms' sensitivity and specificity were 53% and 99% for BC, 55% and 85% for LC, and 59% and 98% for CRC, respectively. Algorithms to distinguish metastatic BC, LC, and CRC from locally advanced disease should use tumor-specific primary cancer codes with 2 claims for the specific primary cancer >30-42 days apart to reduce misclassification. These performed best overall in specificity, positive predictive values, and overall accuracy to identify metastatic cancer in a health care claims database.
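
    The recommended requirement of two claims for the same primary cancer separated by a minimum gap can be checked by comparing the earliest and latest claim dates. A minimal sketch with a hypothetical function name; the 30-day default is the lower end of the 30-42 day range quoted above.

```python
from datetime import date


def has_two_claims_apart(claim_dates, min_gap_days=30):
    """True if some pair of claims is more than min_gap_days apart.

    If any pair satisfies the gap, the earliest and latest claims do too,
    so checking that single pair is sufficient.
    """
    if len(claim_dates) < 2:
        return False
    return (max(claim_dates) - min(claim_dates)).days > min_gap_days


close_together = has_two_claims_apart([date(2009, 1, 5), date(2009, 1, 20)])
far_apart = has_two_claims_apart(
    [date(2009, 1, 5), date(2009, 1, 20), date(2009, 2, 20)]
)
```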

  16. ContextD: an algorithm to identify contextual properties of medical terms in a Dutch clinical corpus.

    PubMed

    Afzal, Zubair; Pons, Ewoud; Kang, Ning; Sturkenboom, Miriam C J M; Schuemie, Martijn J; Kors, Jan A

    2014-11-29

    In order to extract meaningful information from electronic medical records, such as signs and symptoms, diagnoses, and treatments, it is important to take into account the contextual properties of the identified information: negation, temporality, and experiencer. Most work on automatic identification of these contextual properties has been done on English clinical text. This study presents ContextD, an adaptation of the English ConText algorithm to the Dutch language, and a Dutch clinical corpus. We created a Dutch clinical corpus containing four types of anonymized clinical documents: entries from general practitioners, specialists' letters, radiology reports, and discharge letters. Using a Dutch list of medical terms extracted from the Unified Medical Language System, we identified medical terms in the corpus with exact matching. The identified terms were annotated for negation, temporality, and experiencer properties. To adapt the ConText algorithm, we translated English trigger terms to Dutch and added several general and document specific enhancements, such as negation rules for general practitioners' entries and a regular expression based temporality module. The ContextD algorithm utilized 41 unique triggers to identify the contextual properties in the clinical corpus. For the negation property, the algorithm obtained an F-score from 87% to 93% for the different document types. For the experiencer property, the F-score was 99% to 100%. For the historical and hypothetical values of the temporality property, F-scores ranged from 26% to 54% and from 13% to 44%, respectively. The ContextD showed good performance in identifying negation and experiencer property values across all Dutch clinical document types. Accurate identification of the temporality property proved to be difficult and requires further work. The anonymized and annotated Dutch clinical corpus can serve as a useful resource for further algorithm development.

  17. Fast Prediction and Evaluation of Gravitational Waveforms Using Surrogate Models

    NASA Astrophysics Data System (ADS)

    Field, Scott E.; Galley, Chad R.; Hesthaven, Jan S.; Kaye, Jason; Tiglio, Manuel

    2014-07-01

    We propose a solution to the problem of quickly and accurately predicting gravitational waveforms within any given physical model. The method is relevant for both real-time applications and more traditional scenarios where the generation of waveforms using standard methods can be prohibitively expensive. Our approach is based on three offline steps resulting in an accurate reduced order model in both parameter and physical dimensions that can be used as a surrogate for the true or fiducial waveform family. First, a set of m parameter values is determined using a greedy algorithm from which a reduced basis representation is constructed. Second, these m parameters induce the selection of m time values for interpolating a waveform time series using an empirical interpolant that is built for the fiducial waveform family. Third, a fit in the parameter dimension is performed for the waveform's value at each of these m times. The cost of predicting L waveform time samples for a generic parameter choice is of order $O(mL + m c_{\mathrm{fit}})$ online operations, where $c_{\mathrm{fit}}$ denotes the fitting function operation count and, typically, $m \ll L$. The result is a compact, computationally efficient, and accurate surrogate model that retains the original physics of the fiducial waveform family while also being fast to evaluate. We generate accurate surrogate models for effective-one-body waveforms of nonspinning binary black hole coalescences with durations as long as $10^5 M$, mass ratios from 1 to 10, and for multiple spherical harmonic modes. We find that these surrogates are more than 3 orders of magnitude faster to evaluate as compared to the cost of generating effective-one-body waveforms in standard ways. Surrogate model building for other waveform families and models follows the same steps and has the same low computational online scaling cost.
For expensive numerical simulations of binary black hole coalescences, we thus anticipate extremely large speedups in generating new waveforms with a surrogate. As waveform generation is one of the dominant costs in parameter estimation algorithms and parameter space exploration, surrogate models offer a new and practical way to dramatically accelerate such studies without impacting accuracy. Surrogates built in this paper, as well as others, are available from GWSurrogate, a publicly available python package.
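
    The first offline step, greedy selection of parameter values for a reduced basis, can be illustrated on a toy problem. The sketch below is not the authors' implementation and does not use the GWSurrogate package: it assumes a plain Euclidean inner product, a made-up training family sin(t + p), and hypothetical function names. At each iteration it picks the training vector worst represented by the current basis and orthonormalizes it against that basis.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residual(v, basis):
    """Part of v orthogonal to an orthonormal basis (modified Gram-Schmidt)."""
    r = list(v)
    for e in basis:
        c = dot(r, e)
        r = [ri - c * ei for ri, ei in zip(r, e)]
    return r


def greedy_reduced_basis(training, tol=1e-8):
    """Greedily select training vectors until all are represented within tol."""
    basis, chosen = [], []
    while len(basis) < len(training):
        residuals = [residual(v, basis) for v in training]
        errors = [math.sqrt(dot(r, r)) for r in residuals]
        worst = max(range(len(training)), key=errors.__getitem__)
        if errors[worst] < tol:
            break
        # Normalize the worst residual and add it as a new basis element.
        basis.append([x / errors[worst] for x in residuals[worst]])
        chosen.append(worst)
    return chosen, basis


# Toy family: sin(t + p) = cos(p) sin(t) + sin(p) cos(t), which spans 2 dimensions.
times = [2 * math.pi * k / 50 for k in range(50)]
params = [math.pi * k / 20 for k in range(20)]
training = [[math.sin(t + p) for t in times] for p in params]

chosen, basis = greedy_reduced_basis(training)
max_error = max(
    math.sqrt(dot(r, r)) for r in (residual(v, basis) for v in training)
)
```

    Because every member of this toy family is a combination of sin(t) and cos(t), the greedy loop stops after selecting two parameter values, which plays the role of m in the text.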

  18. Multiagent pursuit-evasion games: Algorithms and experiments

    NASA Astrophysics Data System (ADS)

    Kim, Hyounjin

    Deployment of intelligent agents has been made possible through advances in control software, microprocessors, sensor/actuator technology, communication technology, and artificial intelligence. Intelligent agents now play important roles in many applications where human operation is too dangerous or inefficient. There is little doubt that the world of the future will be filled with intelligent robotic agents employed to autonomously perform tasks, or embedded in systems all around us, extending our capabilities to perceive, reason and act, and replacing human efforts. There are numerous real-world applications in which a single autonomous agent is not suitable and multiple agents are required. However, after years of active research in multi-agent systems, current technology is still far from achieving many of these real-world applications. Here, we consider the problem of deploying a team of unmanned ground vehicles (UGV) and unmanned aerial vehicles (UAV) to pursue a second team of UGV evaders while concurrently building a map in an unknown environment. This pursuit-evasion game encompasses many of the challenging issues that arise in operations using intelligent multi-agent systems. We cast the problem in a probabilistic game theoretic framework and consider two computationally feasible pursuit policies: greedy and global-max. We also formulate this probabilistic pursuit-evasion game as a partially observable Markov decision process and employ a policy search algorithm to obtain a good pursuit policy from a restricted class of policies. The estimated value of this policy is guaranteed to be uniformly close to the optimal value in the given policy class under mild conditions. To implement this scenario on real UAVs and UGVs, we propose a distributed hierarchical hybrid system architecture which emphasizes the autonomy of each agent yet allows for coordinated team efforts. 
We then describe our implementation on a fleet of UGVs and UAVs, detailing components such as high level pursuit policy computation, inter-agent communication, navigation, sensing, and regulation. We present both simulation and experimental results on real pursuit-evasion games between our fleet of UAVs and UGVs and evaluate the pursuit policies, relating expected capture times to the speed and intelligence of the evaders and the sensing capabilities of the pursuers. The architecture and algorithms described in this dissertation are general enough to be applied to many real-world applications.
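
    The abstract names the greedy and global-max pursuit policies without defining them. One plausible reading, stated here as an assumption rather than as the dissertation's definition, is that a greedy pursuer moves to the reachable neighbouring cell with the highest estimated evader probability, while a global-max pursuer heads for the most probable cell on the whole map. The grid, probabilities and function names below are hypothetical.

```python
def greedy_move(pos, prob_map):
    """Next cell for a greedy pursuer: best cell among the current one and its neighbours."""
    rows, cols = len(prob_map), len(prob_map[0])
    r, c = pos
    candidates = [
        (r + dr, c + dc)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if 0 <= r + dr < rows and 0 <= c + dc < cols
    ]
    return max(candidates, key=lambda rc: prob_map[rc[0]][rc[1]])


def global_max_target(prob_map):
    """Cell with the highest evader probability anywhere on the map."""
    cells = [(r, c) for r in range(len(prob_map)) for c in range(len(prob_map[0]))]
    return max(cells, key=lambda rc: prob_map[rc[0]][rc[1]])


# Hypothetical evader probability map.
prob_map = [
    [0.00, 0.10, 0.00],
    [0.20, 0.05, 0.00],
    [0.00, 0.00, 0.60],
]
step_from_corner = greedy_move((0, 0), prob_map)
target = global_max_target(prob_map)
```

    From the top-left corner the greedy pursuer steps toward the local maximum at (1, 0), whereas the global-max target is the far corner (2, 2), which shows how the two policies can disagree.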

  19. Some Physical Principles Governing Spatial and Temporal Organization in Living Systems

    NASA Astrophysics Data System (ADS)

    Ali, Md Zulfikar

    Spatial and temporal organization in living organisms are crucial for a variety of biological functions and arise from the interplay of a large number of interacting molecules. One of the central questions in systems biology is to understand how such an intricate organization emerges from the molecular biochemistry of the cell. In this dissertation we explore two projects. The first project relates to pattern formation in a cell membrane as an example of spatial organization, and the second project relates to the evolution of oscillatory networks as a simple example of temporal organization. For the first project, we introduce a model for pattern formation in a two-component lipid bilayer and study the interplay between membrane composition and membrane geometry, demonstrating the existence of a rich phase diagram. Pattern formation is governed by the interplay between phase separation driven by lipid-lipid interactions and the tendency of lipid domains with high intrinsic curvature to deform the membrane away from its preferred position. Depending on membrane parameters, we find the formation of compact lipid micro-clusters or of striped domains. We calculate the stripe width analytically and find good agreement with stripe widths obtained from the simulations. For the second project, we introduce a minimal model for the evolution of functional protein-interaction networks using a sequence-based mutational algorithm and apply it to study the following problems. Using the model, we study robustness and designability of a 2-component network that generates oscillations. We completely enumerate the sequence space and the phenotypic space, and discuss the relationship between designability, robustness and evolvability. We further apply the model to studies of neutral drift in networks that yield oscillatory dynamics, e.g. starting with a relatively simple network and allowing it to evolve by adding nodes and connections while requiring that oscillatory dynamics be preserved. Our studies demonstrate both the importance of employing a sequence-based evolutionary scheme and the relative rapidity (in evolutionary time) for the redistribution of function over new nodes via neutral drift. In addition we discovered another much slower timescale for network evolution, reflecting hidden order in sequence space that we interpret in terms of sparsely connected domains. Finally, we use the model to study the evolution of an oscillator from a non-oscillatory network under the influence of external periodic forcing as a model for evolution of circadian rhythm in living systems. We use a greedy algorithm based on optimizing biologically motivated fitness functions and find that the algorithm successfully produces oscillators. However, the distribution of free-period of evolved oscillators depends on the choice of fitness functions and the nature of forcing.

  20. An ant colony optimization based algorithm for identifying gene regulatory elements.

    PubMed

    Liu, Wei; Chen, Hanwu; Chen, Ling

    2013-08-01

    It is one of the most important tasks in bioinformatics to identify the regulatory elements in gene sequences. Most of the existing algorithms for identifying regulatory elements are inclined to converge into a local optimum, and have high time complexity. Ant Colony Optimization (ACO) is a meta-heuristic method based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of real ants. Taking advantage of the ACO in traits such as self-organization and robustness, this paper designs and implements an ACO based algorithm named ACRI (ant-colony-regulatory-identification) for identifying all possible binding sites of transcription factor from the upstream of co-expressed genes. To accelerate the ants' searching process, a strategy of local optimization is presented to adjust the ants' start positions on the searched sequences. By exploiting the powerful optimization ability of ACO, the algorithm ACRI can not only improve precision of the results, but also achieve a very high speed. Experimental results on real world datasets show that ACRI can outperform other traditional algorithms in the respects of speed and quality of solutions. Copyright © 2013 Elsevier Ltd. All rights reserved.
