Sample records for graph animals subgraph

  1. An algorithm for finding a similar subgraph of all Hamiltonian cycles

    NASA Astrophysics Data System (ADS)

    Wafdan, R.; Ihsan, M.; Suhaimi, D.

    2018-01-01

    This paper discusses an algorithm to find a similar subgraph called findSimSubG algorithm. A similar subgraph is a subgraph with a maximum number of edges, contains no isolated vertex and is contained in every Hamiltonian cycle of a Hamiltonian Graph. The algorithm runs only on Hamiltonian graphs with at least two Hamiltonian cycles. The algorithm works by examining whether the initial subgraph of the first Hamiltonian cycle is a subgraph of comparison graphs. If the initial subgraph is not in comparison graphs, the algorithm will remove edges and vertices of the initial subgraph that are not in comparison graphs. There are two main processes in the algorithm, changing Hamiltonian cycle into a cycle graph and removing edges and vertices of the initial subgraph that are not in comparison graphs. The findSimSubG algorithm can find the similar subgraph without using backtracking method. The similar subgraph cannot be found on certain graphs, such as an n-antiprism graph, complete bipartite graph, complete graph, 2n-crossed prism graph, n-crown graph, n-möbius ladder, prism graph, and wheel graph. The complexity of this algorithm is O(m|V|), where m is the number of Hamiltonian cycles and |V| is the number of vertices of a Hamiltonian graph.

  2. MISAGA: An Algorithm for Mining Interesting Subgraphs in Attributed Graphs.

    PubMed

    He, Tiantian; Chan, Keith C C

    2018-05-01

    An attributed graph contains vertices that are associated with a set of attribute values. Mining clusters or communities, which are interesting subgraphs in the attributed graph is one of the most important tasks of graph analytics. Many problems can be defined as the mining of interesting subgraphs in attributed graphs. Algorithms that discover subgraphs based on predefined topologies cannot be used to tackle these problems. To discover interesting subgraphs in the attributed graph, we propose an algorithm called mining interesting subgraphs in attributed graph algorithm (MISAGA). MISAGA performs its tasks by first using a probabilistic measure to determine whether the strength of association between a pair of attribute values is strong enough to be interesting. Given the interesting pairs of attribute values, then the degree of association is computed for each pair of vertices using an information theoretic measure. Based on the edge structure and degree of association between each pair of vertices, MISAGA identifies interesting subgraphs by formulating it as a constrained optimization problem and solves it by identifying the optimal affiliation of subgraphs for the vertices in the attributed graph. MISAGA has been tested with several large-sized real graphs and is found to be potentially very useful for various applications.

  3. SING: Subgraph search In Non-homogeneous Graphs

    PubMed Central

    2010-01-01

    Background Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs. PMID:20170516

  4. OLSVis: an animated, interactive visual browser for bio-ontologies

    PubMed Central

    2012-01-01

    Background More than one million terms from biomedical ontologies and controlled vocabularies are available through the Ontology Lookup Service (OLS). Although OLS provides ample possibility for querying and browsing terms, the visualization of parts of the ontology graphs is rather limited and inflexible. Results We created the OLSVis web application, a visualiser for browsing all ontologies available in the OLS database. OLSVis shows customisable subgraphs of the OLS ontologies. Subgraphs are animated via a real-time force-based layout algorithm which is fully interactive: each time the user makes a change, e.g. browsing to a new term, hiding, adding, or dragging terms, the algorithm performs smooth and only essential reorganisations of the graph. This assures an optimal viewing experience, because subsequent screen layouts are not grossly altered, and users can easily navigate through the graph. URL: http://ols.wordvis.com Conclusions The OLSVis web application provides a user-friendly tool to visualise ontologies from the OLS repository. It broadens the possibilities to investigate and select ontology subgraphs through a smooth visualisation method. PMID:22646023

  5. Mining connected global and local dense subgraphs for bigdata

    NASA Astrophysics Data System (ADS)

    Wu, Bo; Shen, Haiying

    2016-01-01

    The problem of discovering connected dense subgraphs of natural graphs is important in data analysis. Discovering dense subgraphs that do not contain denser subgraphs or are not contained in denser subgraphs (called significant dense subgraphs) is also critical for wide-ranging applications. In spite of many works on discovering dense subgraphs, there are no algorithms that can guarantee the connectivity of the returned subgraphs or discover significant dense subgraphs. Hence, in this paper, we define two subgraph discovery problems to discover connected and significant dense subgraphs, propose polynomial-time algorithms and theoretically prove their validity. We also propose an algorithm to further improve the time and space efficiency of our basic algorithm for discovering significant dense subgraphs in big data by taking advantage of the unique features of large natural graphs. In the experiments, we use massive natural graphs to evaluate our algorithms in comparison with previous algorithms. The experimental results show the effectiveness of our algorithms for the two problems and their efficiency. This work is also the first that reveals the physical significance of significant dense subgraphs in natural graphs from different domains.

  6. Functional network organization of the human brain

    PubMed Central

    Power, Jonathan D; Cohen, Alexander L; Nelson, Steven M; Wig, Gagan S; Barnes, Kelly Anne; Church, Jessica A; Vogel, Alecia C; Laumann, Timothy O; Miezin, Fran M; Schlaggar, Bradley L; Petersen, Steven E

    2011-01-01

    Summary Real-world complex systems may be mathematically modeled as graphs, revealing properties of the system. Here we study graphs of functional brain organization in healthy adults using resting state functional connectivity MRI. We propose two novel brain-wide graphs, one of 264 putative functional areas, the other a modification of voxelwise networks that eliminates potentially artificial short-distance relationships. These graphs contain many subgraphs in good agreement with known functional brain systems. Other subgraphs lack established functional identities; we suggest possible functional characteristics for these subgraphs. Further, graph measures of the areal network indicate that the default mode subgraph shares network properties with sensory and motor subgraphs: it is internally integrated but isolated from other subgraphs, much like a “processing” system. The modified voxelwise graph also reveals spatial motifs in the patterning of systems across the cortex. PMID:22099467

  7. Differentially Private Frequent Subgraph Mining

    PubMed Central

    Xu, Shengzhi; Xiong, Li; Cheng, Xiang; Xiao, Ke

    2016-01-01

    Mining frequent subgraphs from a collection of input graphs is an important topic in data mining research. However, if the input graphs contain sensitive information, releasing frequent subgraphs may pose considerable threats to individual's privacy. In this paper, we study the problem of frequent subgraph mining (FGM) under the rigorous differential privacy model. We introduce a novel differentially private FGM algorithm, which is referred to as DFG. In this algorithm, we first privately identify frequent subgraphs from input graphs, and then compute the noisy support of each identified frequent subgraph. In particular, to privately identify frequent subgraphs, we present a frequent subgraph identification approach which can improve the utility of frequent subgraph identifications through candidates pruning. Moreover, to compute the noisy support of each identified frequent subgraph, we devise a lattice-based noisy support derivation approach, where a series of methods has been proposed to improve the accuracy of the noisy supports. Through formal privacy analysis, we prove that our DFG algorithm satisfies ε-differential privacy. Extensive experimental results on real datasets show that the DFG algorithm can privately find frequent subgraphs with high data utility. PMID:27616876

  8. A Coding Method for Efficient Subgraph Querying on Vertex- and Edge-Labeled Graphs

    PubMed Central

    Zhu, Lei; Song, Qinbao; Guo, Yuchen; Du, Lei; Zhu, Xiaoyan; Wang, Guangtao

    2014-01-01

    Labeled graphs are widely used to model complex data in many domains, so subgraph querying has been attracting more and more attention from researchers around the world. Unfortunately, subgraph querying is very time consuming since it involves subgraph isomorphism testing that is known to be an NP-complete problem. In this paper, we propose a novel coding method for subgraph querying that is based on Laplacian spectrum and the number of walks. Our method follows the filtering-and-verification framework and works well on graph databases with frequent updates. We also propose novel two-step filtering conditions that can filter out most false positives and prove that the two-step filtering conditions satisfy the no-false-negative requirement (no dismissal in answers). Extensive experiments on both real and synthetic graphs show that, compared with six existing counterpart methods, our method can effectively improve the efficiency of subgraph querying. PMID:24853266

  9. Detecting community structure via the maximal sub-graphs and belonging degrees in complex networks

    NASA Astrophysics Data System (ADS)

    Cui, Yaozu; Wang, Xingyuan; Eustace, Justine

    2014-12-01

    Community structure is a common phenomenon in complex networks, and it has been shown that some communities in complex networks often overlap each other. So in this paper we propose a new algorithm to detect overlapping community structure in complex networks. To identify the overlapping community structure, our algorithm firstly extracts fully connected sub-graphs which are maximal sub-graphs from original networks. Then two maximal sub-graphs having the key pair-vertices can be merged into a new larger sub-graph using some belonging degree functions. Furthermore we extend the modularity function to evaluate the proposed algorithm. In addition, overlapping nodes between communities are founded successfully. Finally we report the comparison between the modularity and the computational complexity of the proposed algorithm with some other existing algorithms. The experimental results show that the proposed algorithm gives satisfactory results.

  10. A graph lattice approach to maintaining and learning dense collections of subgraphs as image features.

    PubMed

    Saund, Eric

    2013-10-01

    Effective object and scene classification and indexing depend on extraction of informative image features. This paper shows how large families of complex image features in the form of subgraphs can be built out of simpler ones through construction of a graph lattice—a hierarchy of related subgraphs linked in a lattice. Robustness is achieved by matching many overlapping and redundant subgraphs, which allows the use of inexpensive exact graph matching, instead of relying on expensive error-tolerant graph matching to a minimal set of ideal model graphs. Efficiency in exact matching is gained by exploitation of the graph lattice data structure. Additionally, the graph lattice enables methods for adaptively growing a feature space of subgraphs tailored to observed data. We develop the approach in the domain of rectilinear line art, specifically for the practical problem of document forms recognition. We are especially interested in methods that require only one or very few labeled training examples per category. We demonstrate two approaches to using the subgraph features for this purpose. Using a bag-of-words feature vector we achieve essentially single-instance learning on a benchmark forms database, following an unsupervised clustering stage. Further performance gains are achieved on a more difficult dataset using a feature voting method and feature selection procedure.

  11. A new augmentation based algorithm for extracting maximal chordal subgraphs

    DOE PAGES

    Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

    2014-10-18

    If every cycle of a graph is chordal length greater than three then it contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms’more » parallelizability. In our paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. Finally, we experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.« less

  12. A New Augmentation Based Algorithm for Extracting Maximal Chordal Subgraphs.

    PubMed

    Bhowmick, Sanjukta; Chen, Tzu-Yi; Halappanavar, Mahantesh

    2015-02-01

    A graph is chordal if every cycle of length greater than three contains an edge between non-adjacent vertices. Chordal graphs are of interest both theoretically, since they admit polynomial time solutions to a range of NP-hard graph problems, and practically, since they arise in many applications including sparse linear algebra, computer vision, and computational biology. A maximal chordal subgraph is a chordal subgraph that is not a proper subgraph of any other chordal subgraph. Existing algorithms for computing maximal chordal subgraphs depend on dynamically ordering the vertices, which is an inherently sequential process and therefore limits the algorithms' parallelizability. In this paper we explore techniques to develop a scalable parallel algorithm for extracting a maximal chordal subgraph. We demonstrate that an earlier attempt at developing a parallel algorithm may induce a non-optimal vertex ordering and is therefore not guaranteed to terminate with a maximal chordal subgraph. We then give a new algorithm that first computes and then repeatedly augments a spanning chordal subgraph. After proving that the algorithm terminates with a maximal chordal subgraph, we then demonstrate that this algorithm is more amenable to parallelization and that the parallel version also terminates with a maximal chordal subgraph. That said, the complexity of the new algorithm is higher than that of the previous parallel algorithm, although the earlier algorithm computes a chordal subgraph which is not guaranteed to be maximal. We experimented with our augmentation-based algorithm on both synthetic and real-world graphs. We provide scalability results and also explore the effect of different choices for the initial spanning chordal subgraph on both the running time and on the number of edges in the maximal chordal subgraph.

  13. Finding Hierarchical and Overlapping Dense Subgraphs using Nucleus Decompositions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Seshadhri, Comandur; Pinar, Ali; Sariyuce, Ahmet Erdem

    Finding dense substructures in a graph is a fundamental graph mining operation, with applications in bioinformatics, social networks, and visualization to name a few. Yet most standard formulations of this problem (like clique, quasiclique, k-densest subgraph) are NP-hard. Furthermore, the goal is rarely to nd the \\true optimum", but to identify many (if not all) dense substructures, understand their distribution in the graph, and ideally determine a hierarchical structure among them. Current dense subgraph nding algorithms usually optimize some objective, and only nd a few such subgraphs without providing any hierarchy. It is also not clear how to account formore » overlaps in dense substructures. We de ne the nucleus decomposition of a graph, which represents the graph as a forest of nuclei. Each nucleus is a subgraph where smaller cliques are present in many larger cliques. The forest of nuclei is a hierarchy by containment, where the edge density increases as we proceed towards leaf nuclei. Sibling nuclei can have limited intersections, which allows for discovery of overlapping dense subgraphs. With the right parameters, the nuclear decomposition generalizes the classic notions of k-cores and k-trusses. We give provable e cient algorithms for nuclear decompositions, and empirically evaluate their behavior in a variety of real graphs. The tree of nuclei consistently gives a global, hierarchical snapshot of dense substructures, and outputs dense subgraphs of higher quality than other state-of-theart solutions. Our algorithm can process graphs with tens of millions of edges in less than an hour.« less

  14. Sudden emergence of q-regular subgraphs in random graphs

    NASA Astrophysics Data System (ADS)

    Pretti, M.; Weigt, M.

    2006-07-01

    We investigate the computationally hard problem whether a random graph of finite average vertex degree has an extensively large q-regular subgraph, i.e., a subgraph with all vertices having degree equal to q. We reformulate this problem as a constraint-satisfaction problem, and solve it using the cavity method of statistical physics at zero temperature. For q = 3, we find that the first large q-regular subgraphs appear discontinuously at an average vertex degree c3 - reg simeq 3.3546 and contain immediately about 24% of all vertices in the graph. This transition is extremely close to (but different from) the well-known 3-core percolation point c3 - core simeq 3.3509. For q > 3, the q-regular subgraph percolation threshold is found to coincide with that of the q-core.

  15. A PVS Graph Theory Library

    NASA Technical Reports Server (NTRS)

    Butler, Ricky W.; Sjogren, Jon A.

    1998-01-01

    This paper documents the NASA Langley PVS graph theory library. The library provides fundamental definitions for graphs, subgraphs, walks, paths, subgraphs generated by walks, trees, cycles, degree, separating sets, and four notions of connectedness. Theorems provided include Ramsey's and Menger's and the equivalence of all four notions of connectedness.

  16. GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Simmhan, Yogesh; Kumbhare, Alok; Wickramaarachchi, Charith

    2014-08-25

    Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines themore » scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.« less

  17. RNA Graph Partitioning for the Discovery of RNA Modularity: A Novel Application of Graph Partition Algorithm to Biology

    PubMed Central

    Elmetwaly, Shereef; Schlick, Tamar

    2014-01-01

    Graph representations have been widely used to analyze and design various economic, social, military, political, and biological networks. In systems biology, networks of cells and organs are useful for understanding disease and medical treatments and, in structural biology, structures of molecules can be described, including RNA structures. In our RNA-As-Graphs (RAG) framework, we represent RNA structures as tree graphs by translating unpaired regions into vertices and helices into edges. Here we explore the modularity of RNA structures by applying graph partitioning known in graph theory to divide an RNA graph into subgraphs. To our knowledge, this is the first application of graph partitioning to biology, and the results suggest a systematic approach for modular design in general. The graph partitioning algorithms utilize mathematical properties of the Laplacian eigenvector (µ2) corresponding to the second eigenvalues (λ2) associated with the topology matrix defining the graph: λ2 describes the overall topology, and the sum of µ2′s components is zero. The three types of algorithms, termed median, sign, and gap cuts, divide a graph by determining nodes of cut by median, zero, and largest gap of µ2′s components, respectively. We apply these algorithms to 45 graphs corresponding to all solved RNA structures up through 11 vertices (∼220 nucleotides). While we observe that the median cut divides a graph into two similar-sized subgraphs, the sign and gap cuts partition a graph into two topologically-distinct subgraphs. We find that the gap cut produces the best biologically-relevant partitioning for RNA because it divides RNAs at less stable connections while maintaining junctions intact. The iterative gap cuts suggest basic modules and assembly protocols to design large RNA structures. Our graph substructuring thus suggests a systematic approach to explore the modularity of biological networks. In our applications to RNA structures, subgraphs also suggest design strategies for novel RNA motifs. PMID:25188578

  18. On the modification Highly Connected Subgraphs (HCS) algorithm in graph clustering for weighted graph

    NASA Astrophysics Data System (ADS)

    Albirri, E. R.; Sugeng, K. A.; Aldila, D.

    2018-04-01

    Nowadays, in the modern world, since technology and human civilization start to progress, all city in the world is almost connected. The various places in this world are easier to visit. It is an impact of transportation technology and highway construction. The cities which have been connected can be represented by graph. Graph clustering is one of ways which is used to answer some problems represented by graph. There are some methods in graph clustering to solve the problem spesifically. One of them is Highly Connected Subgraphs (HCS) method. HCS is used to identify cluster based on the graph connectivity k for graph G. The connectivity in graph G is denoted by k(G)> \\frac{n}{2} that n is the total of vertices in G, then it is called as HCS or the cluster. This research used literature review and completed with simulation of program in a software. We modified HCS algorithm by using weighted graph. The modification is located in the Process Phase. Process Phase is used to cut the connected graph G into two subgraphs H and \\bar{H}. We also made a program by using software Octave-401. Then we applied the data of Flight Routes Mapping of One of Airlines in Indonesia to our program.

  19. Incremental k-core decomposition: Algorithms and evaluation

    DOE PAGES

    Sariyuce, Ahmet Erdem; Gedik, Bugra; Jacques-SIlva, Gabriela; ...

    2016-02-01

    A k-core of a graph is a maximal connected subgraph in which every vertex is connected to at least k vertices in the subgraph. k-core decomposition is often used in large-scale network analysis, such as community detection, protein function prediction, visualization, and solving NP-hard problems on real networks efficiently, like maximal clique finding. In many real-world applications, networks change over time. As a result, it is essential to develop efficient incremental algorithms for dynamic graph data. In this paper, we propose a suite of incremental k-core decomposition algorithms for dynamic graph data. These algorithms locate a small subgraph that ismore » guaranteed to contain the list of vertices whose maximum k-core values have changed and efficiently process this subgraph to update the k-core decomposition. We present incremental algorithms for both insertion and deletion operations, and propose auxiliary vertex state maintenance techniques that can further accelerate these operations. Our results show a significant reduction in runtime compared to non-incremental alternatives. We illustrate the efficiency of our algorithms on different types of real and synthetic graphs, at varying scales. Furthermore, for a graph of 16 million vertices, we observe relative throughputs reaching a million times, relative to the non-incremental algorithms.« less

  20. Super (a, d)-Cycle-Antimagic Total Labeling on Triangular Ladder Graph and Generalized Jahangir Graph

    NASA Astrophysics Data System (ADS)

    Roswitha, Mania; Amanda, Anna; Sri Martini, Titin; Winarno, Bowo

    2017-06-01

    Let G(V (G), E(G)) be a finite simple graph with |V (G)| = G and |E(G)| = eG . Let H be a subgraph of G. The graph G is said to be (a, d)-H-antimagic covering if every edge in G belongs to at least one of the subgraphs G isomorphic to H and there is a bijective function ξ : V ∪ E → {1, 2, …,νG + eG } such that all subgraphs H‧ isomorphic to H, the H‧ -weights w(H‧)=∑v∈V(H‧)ξ(v)+∑e∈E(H‧)ξ(e) constitutes an arithmetic progression {a, a + d, a + 2d, …, a + (t - 1)d}, where a and d are positive integers and t is the number of subgraphs G isomorphic to H. Such a labeling is called super if the vertices contain the smallest possible labels. This research provides super (a, d)-C 3-antimagic total labelng on triangular ladder TLn for n ≥ 2 and super (a, d)-C s+2-antimagic total labeling on generalized Jahangir Jk,s for k ≥ 2 and s ≥ 2.

  1. Super (a,d)-H-antimagic covering of möbius ladder graph

    NASA Astrophysics Data System (ADS)

    Indriyani, Novia; Sri Martini, Titin

    2018-04-01

    Let G = (V(G), E(G)) be a simple graph. Let H-covering of G is a subgraph H 1, H 2, …, Hj with every edge in G is contained in at least one graph Hi for 1 ≤ i ≤ j. If every Hi is isomorphic, then G admits an H-covering. Furthermore, an (a,d)-H-antimagic covering if there bijective function ξ :V(G)\\cup E(G)\\to \\{1,2,3,\\ldots,|V(G)|+|E(G)|\\}. The H‑-weights for all subgraphs H‑ isomorphic to H ω ({H}^{\\prime })={\\sum }v\\in V({H^{\\prime })}ξ (v)+{\\sum }e\\in E({H^{\\prime })}ξ (e). The weights of subgraphs constitutes an arithmatic progression {a, a + d, …, a + (t ‑ 1)d} where a and d are positive integers and t is the number of subgraphs G isomorphic to H. If ξ (V(G))=\\{1,2,\\ldots,|V(G)|\\} then ξ is called super (a, d)-H-antimagic covering. The research provides super (a, d)-H-antimagic covering with d = {1, 3} of Möbius ladder graph Mn for n > 5 and n is odd.

  2. An Improved Heuristic Method for Subgraph Isomorphism Problem

    NASA Astrophysics Data System (ADS)

    Xiang, Yingzhuo; Han, Jiesi; Xu, Haijiang; Guo, Xin

    2017-09-01

    This paper focus on the subgraph isomorphism (SI) problem. We present an improved genetic algorithm, a heuristic method to search the optimal solution. The contribution of this paper is that we design a dedicated crossover algorithm and a new fitness function to measure the evolution process. Experiments show our improved genetic algorithm performs better than other heuristic methods. For a large graph, such as a subgraph of 40 nodes, our algorithm outperforms the traditional tree search algorithms. We find that the performance of our improved genetic algorithm does not decrease as the number of nodes in prototype graphs.

  3. A simple method for finding the scattering coefficients of quantum graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cottrell, Seth S.

    2015-09-15

    Quantum walks are roughly analogous to classical random walks, and similar to classical walks they have been used to find new (quantum) algorithms. When studying the behavior of large graphs or combinations of graphs, it is useful to find the response of a subgraph to signals of different frequencies. In doing so, we can replace an entire subgraph with a single vertex with variable scattering coefficients. In this paper, a simple technique for quickly finding the scattering coefficients of any discrete-time quantum graph will be presented. These scattering coefficients can be expressed entirely in terms of the characteristic polynomial ofmore » the graph’s time step operator. This is a marked improvement over previous techniques which have traditionally required finding eigenstates for a given eigenvalue, which is far more computationally costly. With the scattering coefficients we can easily derive the “impulse response” which is the key to predicting the response of a graph to any signal. This gives us a powerful set of tools for rapidly understanding the behavior of graphs or for reducing a large graph into its constituent subgraphs regardless of how they are connected.« less

  4. Searching social networks for subgraph patterns

    NASA Astrophysics Data System (ADS)

    Ogaard, Kirk; Kase, Sue; Roy, Heather; Nagi, Rakesh; Sambhoos, Kedar; Sudit, Moises

    2013-06-01

    Software tools for Social Network Analysis (SNA) are being developed which support various types of analysis of social networks extracted from social media websites (e.g., Twitter). Once extracted and stored in a database such social networks are amenable to analysis by SNA software. This data analysis often involves searching for occurrences of various subgraph patterns (i.e., graphical representations of entities and relationships). The authors have developed the Graph Matching Toolkit (GMT) which provides an intuitive Graphical User Interface (GUI) for a heuristic graph matching algorithm called the Truncated Search Tree (TruST) algorithm. GMT is a visual interface for graph matching algorithms processing large social networks. GMT enables an analyst to draw a subgraph pattern by using a mouse to select categories and labels for nodes and links from drop-down menus. GMT then executes the TruST algorithm to find the top five occurrences of the subgraph pattern within the social network stored in the database. GMT was tested using a simulated counter-insurgency dataset consisting of cellular phone communications within a populated area of operations in Iraq. The results indicated GMT (when executing the TruST graph matching algorithm) is a time-efficient approach to searching large social networks. GMT's visual interface to a graph matching algorithm enables intelligence analysts to quickly analyze and summarize the large amounts of data necessary to produce actionable intelligence.

  5. Induced subgraph searching for geometric model fitting

    NASA Astrophysics Data System (ADS)

    Xiao, Fan; Xiao, Guobao; Yan, Yan; Wang, Xing; Wang, Hanzi

    2017-11-01

    In this paper, we propose a novel model fitting method based on graphs to fit and segment multiple-structure data. In the graph constructed on data, each model instance is represented as an induced subgraph. Following the idea of pursuing the maximum consensus, the multiple geometric model fitting problem is formulated as searching for a set of induced subgraphs including the maximum union set of vertices. After the generation and refinement of the induced subgraphs that represent the model hypotheses, the searching process is conducted on the "qualified" subgraphs. Multiple model instances can be simultaneously estimated by solving a converted problem. Then, we introduce the energy evaluation function to determine the number of model instances in data. The proposed method is able to effectively estimate the number and the parameters of model instances in data severely corrupted by outliers and noises. Experimental results on synthetic data and real images validate the favorable performance of the proposed method compared with several state-of-the-art fitting methods.

  6. Community structure and scale-free collections of Erdős-Rényi graphs.

    PubMed

    Seshadhri, C; Kolda, Tamara G; Pinar, Ali

    2012-05-01

    Community structure plays a significant role in the analysis of social networks and similar graphs, yet this structure is little understood and not well captured by most models. We formally define a community to be a subgraph that is internally highly connected and has no deeper substructure. We use tools of combinatorics to show that any such community must contain a dense Erdős-Rényi (ER) subgraph. Based on mathematical arguments, we hypothesize that any graph with a heavy-tailed degree distribution and community structure must contain a scale-free collection of dense ER subgraphs. These theoretical observations corroborate well with empirical evidence. From this, we propose the Block Two-Level Erdős-Rényi (BTER) model, and demonstrate that it accurately captures the observable properties of many real-world social networks.

  7. Topological structure of dictionary graphs

    NASA Astrophysics Data System (ADS)

    Fukś, Henryk; Krzemiński, Mark

    2009-09-01

    We investigate the topological structure of the subgraphs of dictionary graphs constructed from WordNet and Moby thesaurus data. In the process of learning a foreign language, the learner knows only a subset of all words of the language, corresponding to a subgraph of a dictionary graph. When this subgraph grows with time, its topological properties change. We introduce the notion of the pseudocore and argue that the growth of the vocabulary roughly follows decreasing pseudocore numbers—that is, one first learns words with a high pseudocore number followed by smaller pseudocores. We also propose an alternative strategy for vocabulary growth, involving decreasing core numbers as opposed to pseudocore numbers. We find that as the core or pseudocore grows in size, the clustering coefficient first decreases, then reaches a minimum and starts increasing again. The minimum occurs when the vocabulary reaches a size between 103 and 104. A simple model exhibiting similar behavior is proposed. The model is based on a generalized geometric random graph. Possible implications for language learning are discussed.

  8. Multi-Level Anomaly Detection on Time-Varying Graph Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bridges, Robert A; Collins, John P; Ferragut, Erik M

    This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating probabilities at finer levels, and these closely related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, thismore » multi-scale analysis facilitates intuitive visualizations by allowing users to narrow focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. We demonstrate that our graph statistics-based approach outperforms both a distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. To illustrate the accessibility of information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and precision greater than 0.786.« less

  9. [A retrieval method of drug molecules based on graph collapsing].

    PubMed

    Qu, J W; Lv, X Q; Liu, Z M; Liao, Y; Sun, P H; Wang, B; Tang, Z

    2018-04-18

    To establish a compact and efficient hypergraph representation and a graph-similarity-based retrieval method of molecules to achieve effective and efficient medicine information retrieval. Chemical structural formula (CSF) was a primary search target as a unique and precise identifier for each compound at the molecular level in the research field of medicine information retrieval. To retrieve medicine information effectively and efficiently, a complete workflow of the graph-based CSF retrieval system was introduced. This system accepted the photos taken from smartphones and the sketches drawn on tablet personal computers as CSF inputs, and formalized the CSFs with the corresponding graphs. Then this paper proposed a compact and efficient hypergraph representation for molecules on the basis of analyzing factors that directly affected the efficiency of graph matching. According to the characteristics of CSFs, a hierarchical collapsing method combining graph isomorphism and frequent subgraph mining was adopted. There was yet a fundamental challenge, subgraph overlapping during the collapsing procedure, which hindered the method from establishing the correct compact hypergraph of an original CSF graph. Therefore, a graph-isomorphism-based algorithm was proposed to select dominant acyclic subgraphs on the basis of overlapping analysis. Finally, the spatial similarity among graphical CSFs was evaluated by multi-dimensional measures of similarity. To evaluate the performance of the proposed method, the proposed system was firstly compared with Wikipedia Chemical Structure Explorer (WCSE), the state-of-the-art system that allowed CSF similarity searching within Wikipedia molecules dataset, on retrieval accuracy. The system achieved higher values on mean average precision, discounted cumulative gain, rank-biased precision, and expected reciprocal rank than WCSE from the top-2 to the top-10 retrieved results. Specifically, the system achieved 10%, 1.41, 6.42%, and 1.32% higher than WCSE on these metrics for top-10 retrieval results, respectively. Moreover, several retrieval cases were presented to intuitively compare with WCSE. The results of the above comparative study demonstrated that the proposed method outperformed the existing method with regard to accuracy and effectiveness. This paper proposes a graph-similarity-based retrieval approach for medicine information. To obtain satisfactory retrieval results, an isomorphism-based algorithm is proposed for dominant subgraph selection based on the subgraph overlapping analysis, as well as an effective and efficient hypergraph representation of molecules. Experiment results demonstrate the effectiveness of the proposed approach.

  10. Better Decomposition Heuristics for the Maximum-Weight Connected Graph Problem Using Betweenness Centrality

    NASA Astrophysics Data System (ADS)

    Yamamoto, Takanori; Bannai, Hideo; Nagasaki, Masao; Miyano, Satoru

    We present new decomposition heuristics for finding the optimal solution for the maximum-weight connected graph problem, which is known to be NP-hard. Previous optimal algorithms for solving the problem decompose the input graph into subgraphs using heuristics based on node degree. We propose new heuristics based on betweenness centrality measures, and show through computational experiments that our new heuristics tend to reduce the number of subgraphs in the decomposition, and therefore could lead to the reduction in computational time for finding the optimal solution. The method is further applied to analysis of biological pathway data.

  11. A multi-level anomaly detection algorithm for time-varying graph data with interactive visualization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bridges, Robert A.; Collins, John P.; Ferragut, Erik M.

    This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating node probabilities, and these related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, this multi-scale analysis facilitatesmore » intuitive visualizations by allowing users to narrow focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. We demonstrate that our graph statistics-based approach outperforms both a distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. Furthermore, to illustrate the accessibility of information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and precision greater than 0.786.« less

  12. A multi-level anomaly detection algorithm for time-varying graph data with interactive visualization

    DOE PAGES

    Bridges, Robert A.; Collins, John P.; Ferragut, Erik M.; ...

    2016-01-01

    This work presents a novel modeling and analysis framework for graph sequences which addresses the challenge of detecting and contextualizing anomalies in labelled, streaming graph data. We introduce a generalization of the BTER model of Seshadhri et al. by adding flexibility to community structure, and use this model to perform multi-scale graph anomaly detection. Specifically, probability models describing coarse subgraphs are built by aggregating node probabilities, and these related hierarchical models simultaneously detect deviations from expectation. This technique provides insight into a graph's structure and internal context that may shed light on a detected event. Additionally, this multi-scale analysis facilitatesmore » intuitive visualizations by allowing users to narrow focus from an anomalous graph to particular subgraphs or nodes causing the anomaly. For evaluation, two hierarchical anomaly detectors are tested against a baseline Gaussian method on a series of sampled graphs. We demonstrate that our graph statistics-based approach outperforms both a distribution-based detector and the baseline in a labeled setting with community structure, and it accurately detects anomalies in synthetic and real-world datasets at the node, subgraph, and graph levels. Furthermore, to illustrate the accessibility of information made possible via this technique, the anomaly detector and an associated interactive visualization tool are tested on NCAA football data, where teams and conferences that moved within the league are identified with perfect recall, and precision greater than 0.786.« less

  13. Dense Subgraphs with Restrictions and Applications to Gene Annotation Graphs

    NASA Astrophysics Data System (ADS)

    Saha, Barna; Hoch, Allison; Khuller, Samir; Raschid, Louiqa; Zhang, Xiao-Ning

    In this paper, we focus on finding complex annotation patterns representing novel and interesting hypotheses from gene annotation data. We define a generalization of the densest subgraph problem by adding an additional distance restriction (defined by a separate metric) to the nodes of the subgraph. We show that while this generalization makes the problem NP-hard for arbitrary metrics, when the metric comes from the distance metric of a tree, or an interval graph, the problem can be solved optimally in polynomial time. We also show that the densest subgraph problem with a specified subset of vertices that have to be included in the solution can be solved optimally in polynomial time. In addition, we consider other extensions when not just one solution needs to be found, but we wish to list all subgraphs of almost maximum density as well. We apply this method to a dataset of genes and their annotations obtained from The Arabidopsis Information Resource (TAIR). A user evaluation confirms that the patterns found in the distance restricted densest subgraph for a dataset of photomorphogenesis genes are indeed validated in the literature; a control dataset validates that these are not random patterns. Interestingly, the complex annotation patterns potentially lead to new and as yet unknown hypotheses. We perform experiments to determine the properties of the dense subgraphs, as we vary parameters, including the number of genes and the distance.

  14. CUDA Enabled Graph Subset Examiner

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnston, Jeremy T.

    2016-12-22

    Finding Godsil-McKay switching sets in graphs is one way to demonstrate that a specific graph is not determined by its spectrum--the eigenvalues of its adjacency matrix. An important area of active research in pure mathematics is determining which graphs are determined by their spectra, i.e. when the spectrum of the adjacency matrix uniquely determines the underlying graph. We are interested in exploring the spectra of graphs in the Johnson scheme and specifically seek to determine which of these graphs are determined by their spectra. Given a graph G, a Godsil-McKay switching set is an induced subgraph H on 2k verticesmore » with the following properties: I) H is regular, ii) every vertex in G/H is adjacent to either 0, k, or 2k vertices of H, and iii) at least one vertex in G/H is adjacent to k vertices in H. The software package examines each subset of a user specified size to determine whether or not it satisfies those 3 conditions. The software makes use of the massive parallel processing power of CUDA enabled GPUs. It also exploits the vertex transitivity of graphs in the Johnson scheme by reasoning that if G has a Godsil-McKay switching set, then it has a switching set which includes vertex 1. While the code (in its current state) is tuned to this specific problem, the method of examining each induced subgraph of G can be easily re-written to check for any user specified conditions on the subgraphs and can therefore be used much more broadly.« less

  15. Applied Graph-Mining Algorithms to Study Biomolecular Interaction Networks

    PubMed Central

    2014-01-01

    Protein-protein interaction (PPI) networks carry vital information on the organization of molecular interactions in cellular systems. The identification of functionally relevant modules in PPI networks is one of the most important applications of biological network analysis. Computational analysis is becoming an indispensable tool to understand large-scale biomolecular interaction networks. Several types of computational methods have been developed and employed for the analysis of PPI networks. Of these computational methods, graph comparison and module detection are the two most commonly used strategies. This review summarizes current literature on graph kernel and graph alignment methods for graph comparison strategies, as well as module detection approaches including seed-and-extend, hierarchical clustering, optimization-based, probabilistic, and frequent subgraph methods. Herein, we provide a comprehensive review of the major algorithms employed under each theme, including our recently published frequent subgraph method, for detecting functional modules commonly shared across multiple cancer PPI networks. PMID:24800226

  16. On P2 ⋄ Pn -supermagic labeling of edge corona product of cycle and path graph

    NASA Astrophysics Data System (ADS)

    Yulianto, R.; Martini, Titin S.

    2018-04-01

    A simple graph G = (V, E) admits a H-covering, where H is subgraph of G, if every edge in E belongs to a subgraph of G isomorphic to H. Graph G is H-magic if there is a total labeling f:V(G)\\cup E(G)\\to 1,2,\\ldots,|V(G)|+|E(G)|, such that each subgraph {H}{\\prime }=({V}{\\prime },{E}{\\prime }) of G isomorphic to H and satisfying f{({H}{\\prime })}=def{\\sum }\\upsilon \\in {V{\\prime }}f(\\upsilon )+{\\sum }e\\in {E{\\prime }}f(e)=m(f) where m(f) is a constant magic sum. Additionaly, G admits H-supermagic if f(V)=1,2,\\ldots,|V|. The edge corona {C}n \\diamond {P}n of Cn and Pn is defined as the graph obtained by taking one copy of Cn and n copies of Pn , and then joining two end-vertices of the i-th edge of Cn to every vertex in the i-th copy of Pn . This research aim is to find H-supermagic covering on an edge corona product of cycle and path graph {C}n \\diamond {P}n where H is {P}2 \\diamond {P}n. We use k-balanced multiset to solve our reserarch. Here, we find that an edge corona product of cycle and path graph {C}n \\diamond {P}n is {P}2 \\diamond {P}n supermagic for n > 3.

  17. Graph-theoretic approach to quantum correlations.

    PubMed

    Cabello, Adán; Severini, Simone; Winter, Andreas

    2014-01-31

    Correlations in Bell and noncontextuality inequalities can be expressed as a positive linear combination of probabilities of events. Exclusive events can be represented as adjacent vertices of a graph, so correlations can be associated to a subgraph. We show that the maximum value of the correlations for classical, quantum, and more general theories is the independence number, the Lovász number, and the fractional packing number of this subgraph, respectively. We also show that, for any graph, there is always a correlation experiment such that the set of quantum probabilities is exactly the Grötschel-Lovász-Schrijver theta body. This identifies these combinatorial notions as fundamental physical objects and provides a method for singling out experiments with quantum correlations on demand.

  18. The Construction of {P}_{2}\\vartriangleright H-antimagic graph using smaller edge-antimagic vertex labeling

    NASA Astrophysics Data System (ADS)

    Prihandini, Rafiantika M.; Agustin, I. H.; Dafik

    2018-04-01

    In this paper we use simple and non trivial graph. If there exist a bijective function g:V(G) \\cup E(G)\\to \\{1,2,\\ldots,|V(G)|+|E(G)|\\}, such that for all subgraphs {P}2\\vartriangleright H of G isomorphic to H, then graph G is called an (a, b)-{P}2\\vartriangleright H-antimagic total graph. Furthermore, we can consider the total {P}2\\vartriangleright H-weights W({P}2\\vartriangleright H)={\\sum }v\\in V({P2\\vartriangleright H)}f(v)+{\\sum }e\\in E({P2\\vartriangleright H)}f(e) which should form an arithmetic sequence {a, a + d, a + 2d, …, a + (n ‑ 1)d}, where a and d are positive integers and n is the number of all subgraphs isomorphic to H. Our paper describes the existence of super (a, b)-{P}2\\vartriangleright H antimagic total labeling for graph operation of comb product namely of G=L\\vartriangleright H, where L is a (b, d*)-edge antimagic vertex labeling graph and H is a connected graph.

  19. Classification of urine sediment based on convolution neural network

    NASA Astrophysics Data System (ADS)

    Pan, Jingjing; Jiang, Cunbo; Zhu, Tiantian

    2018-04-01

    By designing a new convolution neural network framework, this paper breaks the constraints of the original convolution neural network framework requiring large training samples and samples of the same size. Move and cropping the input images, generate the same size of the sub-graph. And then, the generated sub-graph uses the method of dropout, increasing the diversity of samples and preventing the fitting generation. Randomly select some proper subset in the sub-graphic set and ensure that the number of elements in the proper subset is same and the proper subset is not the same. The proper subsets are used as input layers for the convolution neural network. Through the convolution layer, the pooling, the full connection layer and output layer, we can obtained the classification loss rate of test set and training set. In the red blood cells, white blood cells, calcium oxalate crystallization classification experiment, the classification accuracy rate of 97% or more.

  20. Predictions of first passage times in sparse discrete fracture networks using graph-based reductions

    NASA Astrophysics Data System (ADS)

    Hyman, J.; Hagberg, A.; Srinivasan, G.; Mohd-Yusof, J.; Viswanathan, H. S.

    2017-12-01

    We present a graph-based methodology to reduce the computational cost of obtaining first passage times through sparse fracture networks. We derive graph representations of generic three-dimensional discrete fracture networks (DFNs) using the DFN topology and flow boundary conditions. Subgraphs corresponding to the union of the k shortest paths between the inflow and outflow boundaries are identified and transport on their equivalent subnetworks is compared to transport through the full network. The number of paths included in the subgraphs is based on the scaling behavior of the number of edges in the graph with the number of shortest paths. First passage times through the subnetworks are in good agreement with those obtained in the full network, both for individual realizations and in distribution. Accurate estimates of first passage times are obtained with an order of magnitude reduction of CPU time and mesh size using the proposed method.

  1. Predictions of first passage times in sparse discrete fracture networks using graph-based reductions

    NASA Astrophysics Data System (ADS)

    Hyman, Jeffrey D.; Hagberg, Aric; Srinivasan, Gowri; Mohd-Yusof, Jamaludin; Viswanathan, Hari

    2017-07-01

    We present a graph-based methodology to reduce the computational cost of obtaining first passage times through sparse fracture networks. We derive graph representations of generic three-dimensional discrete fracture networks (DFNs) using the DFN topology and flow boundary conditions. Subgraphs corresponding to the union of the k shortest paths between the inflow and outflow boundaries are identified and transport on their equivalent subnetworks is compared to transport through the full network. The number of paths included in the subgraphs is based on the scaling behavior of the number of edges in the graph with the number of shortest paths. First passage times through the subnetworks are in good agreement with those obtained in the full network, both for individual realizations and in distribution. Accurate estimates of first passage times are obtained with an order of magnitude reduction of CPU time and mesh size using the proposed method.

  2. Generalized monogamy of contextual inequalities from the no-disturbance principle.

    PubMed

    Ramanathan, Ravishankar; Soeda, Akihito; Kurzyński, Paweł; Kaszlikowski, Dagomir

    2012-08-03

    In this Letter, we demonstrate that the property of monogamy of Bell violations seen for no-signaling correlations in composite systems can be generalized to the monogamy of contextuality in single systems obeying the Gleason property of no disturbance. We show how one can construct monogamies for contextual inequalities by using the graph-theoretic technique of vertex decomposition of a graph representing a set of measurements into subgraphs of suitable independence numbers that themselves admit a joint probability distribution. After establishing that all the subgraphs that are chordal graphs admit a joint probability distribution, we formulate a precise graph-theoretic condition that gives rise to the monogamy of contextuality. We also show how such monogamies arise within quantum theory for a single four-dimensional system and interpret violation of these relations in terms of a violation of causality. These monogamies can be tested with current experimental techniques.

  3. Discovering interesting molecular substructures for molecular classification.

    PubMed

    Lam, Winnie W M; Chan, Keith C C

    2010-06-01

    Given a set of molecular structure data preclassified into a number of classes, the molecular classification problem is concerned with the discovering of interesting structural patterns in the data so that "unseen" molecules not originally in the dataset can be accurately classified. To tackle the problem, interesting molecular substructures have to be discovered and this is done typically by first representing molecular structures in molecular graphs, and then, using graph-mining algorithms to discover frequently occurring subgraphs in them. These subgraphs are then used to characterize different classes for molecular classification. While such an approach can be very effective, it should be noted that a substructure that occurs frequently in one class may also does occur in another. The discovering of frequent subgraphs for molecular classification may, therefore, not always be the most effective. In this paper, we propose a novel technique called mining interesting substructures in molecular data for classification (MISMOC) that can discover interesting frequent subgraphs not just for the characterization of a molecular class but also for the distinguishing of it from the others. Using a test statistic, MISMOC screens each frequent subgraph to determine if they are interesting. For those that are interesting, their degrees of interestingness are determined using an information-theoretic measure. When classifying an unseen molecule, its structure is then matched against the interesting subgraphs in each class and a total interestingness measure for the unseen molecule to be classified into a particular class is determined, which is based on the interestingness of each matched subgraphs. The performance of MISMOC is evaluated using both artificial and real datasets, and the results show that it can be an effective approach for molecular classification.

  4. VIGOR: Interactive Visual Exploration of Graph Query Results.

    PubMed

    Pienta, Robert; Hohman, Fred; Endert, Alex; Tamersoy, Acar; Roundy, Kevin; Gates, Chris; Navathe, Shamkant; Chau, Duen Horng

    2018-01-01

    Finding patterns in graphs has become a vital challenge in many domains from biological systems, network security, to finance (e.g., finding money laundering rings of bankers and business owners). While there is significant interest in graph databases and querying techniques, less research has focused on helping analysts make sense of underlying patterns within a group of subgraph results. Visualizing graph query results is challenging, requiring effective summarization of a large number of subgraphs, each having potentially shared node-values, rich node features, and flexible structure across queries. We present VIGOR, a novel interactive visual analytics system, for exploring and making sense of query results. VIGOR uses multiple coordinated views, leveraging different data representations and organizations to streamline analysts sensemaking process. VIGOR contributes: (1) an exemplar-based interaction technique, where an analyst starts with a specific result and relaxes constraints to find other similar results or starts with only the structure (i.e., without node value constraints), and adds constraints to narrow in on specific results; and (2) a novel feature-aware subgraph result summarization. Through a collaboration with Symantec, we demonstrate how VIGOR helps tackle real-world problems through the discovery of security blindspots in a cybersecurity dataset with over 11,000 incidents. We also evaluate VIGOR with a within-subjects study, demonstrating VIGOR's ease of use over a leading graph database management system, and its ability to help analysts understand their results at higher speed and make fewer errors.

  5. Layer-by-layer growth of vertex graph of Penrose tiling

    NASA Astrophysics Data System (ADS)

    Shutov, A. V.; Maleev, A. V.

    2017-09-01

    The growth form for the vertex graph of Penrose tiling is found to be a regular decagon. The lower and upper bounds for this form, coinciding with it, are strictly proven. A fractal character of layer-by-layer growth is revealed for some subgraphs of the vertex graph of Penrose tiling.

  6. An annealed chaotic maximum neural network for bipartite subgraph problem.

    PubMed

    Wang, Jiahai; Tang, Zheng; Wang, Ronglong

    2004-04-01

    In this paper, based on maximum neural network, we propose a new parallel algorithm that can help the maximum neural network escape from local minima by including a transient chaotic neurodynamics for bipartite subgraph problem. The goal of the bipartite subgraph problem, which is an NP- complete problem, is to remove the minimum number of edges in a given graph such that the remaining graph is a bipartite graph. Lee et al. presented a parallel algorithm using the maximum neural model (winner-take-all neuron model) for this NP- complete problem. The maximum neural model always guarantees a valid solution and greatly reduces the search space without a burden on the parameter-tuning. However, the model has a tendency to converge to a local minimum easily because it is based on the steepest descent method. By adding a negative self-feedback to the maximum neural network, we proposed a new parallel algorithm that introduces richer and more flexible chaotic dynamics and can prevent the network from getting stuck at local minima. After the chaotic dynamics vanishes, the proposed algorithm is then fundamentally reined by the gradient descent dynamics and usually converges to a stable equilibrium point. The proposed algorithm has the advantages of both the maximum neural network and the chaotic neurodynamics. A large number of instances have been simulated to verify the proposed algorithm. The simulation results show that our algorithm finds the optimum or near-optimum solution for the bipartite subgraph problem superior to that of the best existing parallel algorithms.

  7. On size-constrained minimum s–t cut problems and size-constrained dense subgraph problems

    DOE PAGES

    Chen, Wenbin; Samatova, Nagiza F.; Stallmann, Matthias F.; ...

    2015-10-30

    In some application cases, the solutions of combinatorial optimization problems on graphs should satisfy an additional vertex size constraint. In this paper, we consider size-constrained minimum s–t cut problems and size-constrained dense subgraph problems. We introduce the minimum s–t cut with at-least-k vertices problem, the minimum s–t cut with at-most-k vertices problem, and the minimum s–t cut with exactly k vertices problem. We prove that they are NP-complete. Thus, they are not polynomially solvable unless P = NP. On the other hand, we also study the densest at-least-k-subgraph problem (DalkS) and the densest at-most-k-subgraph problem (DamkS) introduced by Andersen andmore » Chellapilla [1]. We present a polynomial time algorithm for DalkS when k is bounded by some constant c. We also present two approximation algorithms for DamkS. In conclusion, the first approximation algorithm for DamkS has an approximation ratio of n-1/k-1, where n is the number of vertices in the input graph. The second approximation algorithm for DamkS has an approximation ratio of O (n δ), for some δ < 1/3.« less

  8. Percolation critical polynomial as a graph invariant

    DOE PAGES

    Scullard, Christian R.

    2012-10-18

    Every lattice for which the bond percolation critical probability can be found exactly possesses a critical polynomial, with the root in [0; 1] providing the threshold. Recent work has demonstrated that this polynomial may be generalized through a definition that can be applied on any periodic lattice. The polynomial depends on the lattice and on its decomposition into identical finite subgraphs, but once these are specified, the polynomial is essentially unique. On lattices for which the exact percolation threshold is unknown, the polynomials provide approximations for the critical probability with the estimates appearing to converge to the exact answer withmore » increasing subgraph size. In this paper, I show how the critical polynomial can be viewed as a graph invariant like the Tutte polynomial. In particular, the critical polynomial is computed on a finite graph and may be found using the deletion-contraction algorithm. This allows calculation on a computer, and I present such results for the kagome lattice using subgraphs of up to 36 bonds. For one of these, I find the prediction p c = 0:52440572:::, which differs from the numerical value, p c = 0:52440503(5), by only 6:9 X 10 -7.« less

  9. A Graph Analytic Metric for Mitigating Advanced Persistent Threat

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Johnson, John R.; Hogan, Emilie A.

    2013-06-04

    This paper introduces a novel graph analytic metric that can be used to measure the potential vulnerability of a cyber network to specific types of attacks that use lateral movement and privilege escalation such as the well known Pass The Hash, (PTH). The metric is computed from an oriented subgraph of the underlying cyber network induced by selecting only those edges for which a given property holds between the two vertices of the edge. The metric with respect to a select node on the subgraph is defined as the likelihood that the select node is reachable from another arbitrary nodemore » in the graph. This metric can be calculated dynamically from the authorization and auditing layers during the network security authorization phase and will potentially enable predictive deterrence against attacks such as PTH.« less

  10. Predictions of first passage times in sparse discrete fracture networks using graph-based reductions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hyman, Jeffrey De'Haven; Hagberg, Aric Arild; Mohd-Yusof, Jamaludin

    Here, we present a graph-based methodology to reduce the computational cost of obtaining first passage times through sparse fracture networks. We also derive graph representations of generic three-dimensional discrete fracture networks (DFNs) using the DFN topology and flow boundary conditions. Subgraphs corresponding to the union of the k shortest paths between the inflow and outflow boundaries are identified and transport on their equivalent subnetworks is compared to transport through the full network. The number of paths included in the subgraphs is based on the scaling behavior of the number of edges in the graph with the number of shortest paths.more » First passage times through the subnetworks are in good agreement with those obtained in the full network, both for individual realizations and in distribution. We obtain accurate estimates of first passage times with an order of magnitude reduction of CPU time and mesh size using the proposed method.« less

  11. Predictions of first passage times in sparse discrete fracture networks using graph-based reductions

    DOE PAGES

    Hyman, Jeffrey De'Haven; Hagberg, Aric Arild; Mohd-Yusof, Jamaludin; ...

    2017-07-10

    Here, we present a graph-based methodology to reduce the computational cost of obtaining first passage times through sparse fracture networks. We also derive graph representations of generic three-dimensional discrete fracture networks (DFNs) using the DFN topology and flow boundary conditions. Subgraphs corresponding to the union of the k shortest paths between the inflow and outflow boundaries are identified and transport on their equivalent subnetworks is compared to transport through the full network. The number of paths included in the subgraphs is based on the scaling behavior of the number of edges in the graph with the number of shortest paths.more » First passage times through the subnetworks are in good agreement with those obtained in the full network, both for individual realizations and in distribution. We obtain accurate estimates of first passage times with an order of magnitude reduction of CPU time and mesh size using the proposed method.« less

  12. Site- and bond-percolation thresholds in K_{n,n}-based lattices: Vulnerability of quantum annealers to random qubit and coupler failures on chimera topologies.

    PubMed

    Melchert, O; Katzgraber, Helmut G; Novotny, M A

    2016-04-01

    We estimate the critical thresholds of bond and site percolation on nonplanar, effectively two-dimensional graphs with chimeralike topology. The building blocks of these graphs are complete and symmetric bipartite subgraphs of size 2n, referred to as K_{n,n} graphs. For the numerical simulations we use an efficient union-find-based algorithm and employ a finite-size scaling analysis to obtain the critical properties for both bond and site percolation. We report the respective percolation thresholds for different sizes of the bipartite subgraph and verify that the associated universality class is that of standard two-dimensional percolation. For the canonical chimera graph used in the D-Wave Systems Inc. quantum annealer (n=4), we discuss device failure in terms of network vulnerability, i.e., we determine the critical fraction of qubits and couplers that can be absent due to random failures prior to losing large-scale connectivity throughout the device.

  13. A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choudhury, Sutanay; Holder, Larry; Chin, George

    2015-05-27

    Cyber security is one of the most significant technical challenges in current times. Detecting adversarial activities, prevention of theft of intellectual properties and customer data is a high priority for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving networks spanning institutional and national boundaries. Many of the cyber attacks can be described as subgraph patterns, with prominent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphs in amore » continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a ``Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named ``Relative Selectivity" that is used to select between different query processing strategies. Our experiments performed on real online news, network traffic stream and a synthetic social network benchmark demonstrate 10-100x speedups over non-incremental, selectivity agnostic approaches.« less

  14. Node similarity within subgraphs of protein interaction networks

    NASA Astrophysics Data System (ADS)

    Penner, Orion; Sood, Vishal; Musso, Gabriel; Baskerville, Kim; Grassberger, Peter; Paczuski, Maya

    2008-06-01

    We propose a biologically motivated quantity, twinness, to evaluate local similarity between nodes in a network. The twinness of a pair of nodes is the number of connected, labeled subgraphs of size n in which the two nodes possess identical neighbours. The graph animal algorithm is used to estimate twinness for each pair of nodes (for subgraph sizes n=4 to n=12) in four different protein interaction networks (PINs). These include an Escherichia coli PIN and three Saccharomyces cerevisiae PINs - each obtained using state-of-the-art high-throughput methods. In almost all cases, the average twinness of node pairs is vastly higher than that expected from a null model obtained by switching links. For all n, we observe a difference in the ratio of type A twins (which are unlinked pairs) to type B twins (which are linked pairs) distinguishing the prokaryote E. coli from the eukaryote S. cerevisiae. Interaction similarity is expected due to gene duplication, and whole genome duplication paralogues in S. cerevisiae have been reported to co-cluster into the same complexes. Indeed, we find that these paralogous proteins are over-represented as twins compared to pairs chosen at random. These results indicate that twinness can detect ancestral relationships from currently available PIN data.

  15. Islands and Bridges: Making Sense of Marked Nodes in Large Graphs

    DTIC Science & Technology

    2012-08-01

    time-evolving graphs. References [1] Nouf M. Kh. Alsudairy, Vijay V. Raghavan, Alaaeldin M. Hafez, and Hassan I Mathkour. Connection subgraphs: A survey...Riedy, David A. Bader, Karl Jiang, Pushkar Pande, , and Richa Sharma . Detecting communities from given seeds in social networks. Technical Report GT-CSE

  16. GenoLink: a graph-based querying and browsing system for investigating the function of genes and proteins.

    PubMed

    Durand, Patrick; Labarre, Laurent; Meil, Alain; Divo, Jean-Louis; Vandenbrouck, Yves; Viari, Alain; Wojcik, Jérôme

    2006-01-17

    A large variety of biological data can be represented by graphs. These graphs can be constructed from heterogeneous data coming from genomic and post-genomic technologies, but there is still need for tools aiming at exploring and analysing such graphs. This paper describes GenoLink, a software platform for the graphical querying and exploration of graphs. GenoLink provides a generic framework for representing and querying data graphs. This framework provides a graph data structure, a graph query engine, allowing to retrieve sub-graphs from the entire data graph, and several graphical interfaces to express such queries and to further explore their results. A query consists in a graph pattern with constraints attached to the vertices and edges. A query result is the set of all sub-graphs of the entire data graph that are isomorphic to the pattern and satisfy the constraints. The graph data structure does not rely upon any particular data model but can dynamically accommodate for any user-supplied data model. However, for genomic and post-genomic applications, we provide a default data model and several parsers for the most popular data sources. GenoLink does not require any programming skill since all operations on graphs and the analysis of the results can be carried out graphically through several dedicated graphical interfaces. GenoLink is a generic and interactive tool allowing biologists to graphically explore various sources of information. GenoLink is distributed either as a standalone application or as a component of the Genostar/Iogma platform. Both distributions are free for academic research and teaching purposes and can be requested at academy@genostar.com. A commercial licence form can be obtained for profit company at info@genostar.com. See also http://www.genostar.org.

  17. GenoLink: a graph-based querying and browsing system for investigating the function of genes and proteins

    PubMed Central

    Durand, Patrick; Labarre, Laurent; Meil, Alain; Divo1, Jean-Louis; Vandenbrouck, Yves; Viari, Alain; Wojcik, Jérôme

    2006-01-01

    Background A large variety of biological data can be represented by graphs. These graphs can be constructed from heterogeneous data coming from genomic and post-genomic technologies, but there is still need for tools aiming at exploring and analysing such graphs. This paper describes GenoLink, a software platform for the graphical querying and exploration of graphs. Results GenoLink provides a generic framework for representing and querying data graphs. This framework provides a graph data structure, a graph query engine, allowing to retrieve sub-graphs from the entire data graph, and several graphical interfaces to express such queries and to further explore their results. A query consists in a graph pattern with constraints attached to the vertices and edges. A query result is the set of all sub-graphs of the entire data graph that are isomorphic to the pattern and satisfy the constraints. The graph data structure does not rely upon any particular data model but can dynamically accommodate for any user-supplied data model. However, for genomic and post-genomic applications, we provide a default data model and several parsers for the most popular data sources. GenoLink does not require any programming skill since all operations on graphs and the analysis of the results can be carried out graphically through several dedicated graphical interfaces. Conclusion GenoLink is a generic and interactive tool allowing biologists to graphically explore various sources of information. GenoLink is distributed either as a standalone application or as a component of the Genostar/Iogma platform. Both distributions are free for academic research and teaching purposes and can be requested at academy@genostar.com. A commercial licence form can be obtained for profit company at info@genostar.com. See also . PMID:16417636

  18. High-performance analysis of filtered semantic graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Buluc, Aydin; Fox, Armando; Gilbert, John R.

    2012-01-01

    High performance is a crucial consideration when executing a complex analytic query on a massive semantic graph. In a semantic graph, vertices and edges carry "attributes" of various types. Analytic queries on semantic graphs typically depend on the values of these attributes; thus, the computation must either view the graph through a filter that passes only those individual vertices and edges of interest, or else must first materialize a subgraph or subgraphs consisting of only the vertices and edges of interest. The filtered approach is superior due to its generality, ease of use, and memory efficiency, but may carry amore » performance cost. In the Knowledge Discovery Toolbox (KDT), a Python library for parallel graph computations, the user writes filters in a high-level language, but those filters result in relatively low performance due to the bottleneck of having to call into the Python interpreter for each edge. In this work, we use the Selective Embedded JIT Specialization (SEJITS) approach to automatically translate filters defined by programmers into a lower-level efficiency language, bypassing the upcall into Python. We evaluate our approach by comparing it with the high-performance C++ /MPI Combinatorial BLAS engine, and show that the productivity gained by using a high-level filtering language comes without sacrificing performance.« less

  19. Counting Triangles to Sum Squares

    ERIC Educational Resources Information Center

    DeMaio, Joe

    2012-01-01

    Counting complete subgraphs of three vertices in complete graphs, yields combinatorial arguments for identities for sums of squares of integers, odd integers, even integers and sums of the triangular numbers.

  20. Integrative Analysis of Many Weighted Co-Expression Networks Using Tensor Computation

    PubMed Central

    Li, Wenyuan; Liu, Chun-Chi; Zhang, Tong; Li, Haifeng; Waterman, Michael S.; Zhou, Xianghong Jasmine

    2011-01-01

    The rapid accumulation of biological networks poses new challenges and calls for powerful integrative analysis tools. Most existing methods capable of simultaneously analyzing a large number of networks were primarily designed for unweighted networks, and cannot easily be extended to weighted networks. However, it is known that transforming weighted into unweighted networks by dichotomizing the edges of weighted networks with a threshold generally leads to information loss. We have developed a novel, tensor-based computational framework for mining recurrent heavy subgraphs in a large set of massive weighted networks. Specifically, we formulate the recurrent heavy subgraph identification problem as a heavy 3D subtensor discovery problem with sparse constraints. We describe an effective approach to solving this problem by designing a multi-stage, convex relaxation protocol, and a non-uniform edge sampling technique. We applied our method to 130 co-expression networks, and identified 11,394 recurrent heavy subgraphs, grouped into 2,810 families. We demonstrated that the identified subgraphs represent meaningful biological modules by validating against a large set of compiled biological knowledge bases. We also showed that the likelihood for a heavy subgraph to be meaningful increases significantly with its recurrence in multiple networks, highlighting the importance of the integrative approach to biological network analysis. Moreover, our approach based on weighted graphs detects many patterns that would be overlooked using unweighted graphs. In addition, we identified a large number of modules that occur predominately under specific phenotypes. This analysis resulted in a genome-wide mapping of gene network modules onto the phenome. Finally, by comparing module activities across many datasets, we discovered high-order dynamic cooperativeness in protein complex networks and transcriptional regulatory networks. PMID:21698123

  1. Multiscale weighted colored graphs for protein flexibility and rigidity analysis

    NASA Astrophysics Data System (ADS)

    Bramer, David; Wei, Guo-Wei

    2018-02-01

    Protein structural fluctuation, measured by Debye-Waller factors or B-factors, is known to correlate to protein flexibility and function. A variety of methods has been developed for protein Debye-Waller factor prediction and related applications to domain separation, docking pose ranking, entropy calculation, hinge detection, stability analysis, etc. Nevertheless, none of the current methodologies are able to deliver an accuracy of 0.7 in terms of the Pearson correlation coefficients averaged over a large set of proteins. In this work, we introduce a paradigm-shifting geometric graph model, multiscale weighted colored graph (MWCG), to provide a new generation of computational algorithms to significantly change the current status of protein structural fluctuation analysis. Our MWCG model divides a protein graph into multiple subgraphs based on interaction types between graph nodes and represents the protein rigidity by generalized centralities of subgraphs. MWCGs not only predict the B-factors of protein residues but also accurately analyze the flexibility of all atoms in a protein. The MWCG model is validated over a number of protein test sets and compared with many standard methods. An extensive numerical study indicates that the proposed MWCG offers an accuracy of over 0.8 and thus provides perhaps the first reliable method for estimating protein flexibility and B-factors. It also simultaneously predicts all-atom flexibility in a molecule.

  2. Some cycle-supermagic labelings of the calendula graphs

    NASA Astrophysics Data System (ADS)

    Pradipta, T. R.; Salman, A. N. M.

    2018-01-01

    In this paper, we introduce a calendula graph, denoted by Clm,n . It is a graph constructed from a cycle on m vertices Cm and m copies of Cn which are Cn1 , Cn2 , ⋯, Cnm and grafting the i-th edge of Cm to an edge of in Cni for each i ∈ {1,2,⋯,m}. A graph G = (V, E) admits a Cn -covering, if every edge e ∈ E(G) belongs to a subgraph of G isomorphic to Cn . The graph G is called cycle-magic, if there exists a total labeling ϕ: V ∪ E → {1,2,…,|V|+|E|} such that for every subgraph Cn ‧ = (V‧,E‧) of G isomorphic to Cn has the same weight. In this case, the weight of Cn , denoted by ϕ(Cn ’), is defined as ∑ v∈V(C’n ) ϕ(v) + ∑ e∈E(C’n ) ϕ(e). Furthermore, G is called cycle-supermagic, if ϕ:V→{1,2,…,|V|}. In this paper, we provide some cycle-supermagic labelings of calendula graphs. In order to prove it, we develop a technique, to make a partition of a multiset into m sub-multisets with the same cardinality such that the sum of all elements of each sub-multiset is same. The technique is called an m-balanced multiset.

  3. Graph Kernels for Molecular Similarity.

    PubMed

    Rupp, Matthias; Schneider, Gisbert

    2010-04-12

    Molecular similarity measures are important for many cheminformatics applications like ligand-based virtual screening and quantitative structure-property relationships. Graph kernels are formal similarity measures defined directly on graphs, such as the (annotated) molecular structure graph. Graph kernels are positive semi-definite functions, i.e., they correspond to inner products. This property makes them suitable for use with kernel-based machine learning algorithms such as support vector machines and Gaussian processes. We review the major types of kernels between graphs (based on random walks, subgraphs, and optimal assignments, respectively), and discuss their advantages, limitations, and successful applications in cheminformatics. Copyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  4. Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: I. Method development.

    PubMed

    Bandyopadhyay, Deepak; Huan, Jun; Prins, Jan; Snoeyink, Jack; Wang, Wei; Tropsha, Alexander

    2009-11-01

    Protein function prediction is one of the central problems in computational biology. We present a novel automated protein structure-based function prediction method using libraries of local residue packing patterns that are common to most proteins in a known functional family. Critical to this approach is the representation of a protein structure as a graph where residue vertices (residue name used as a vertex label) are connected by geometrical proximity edges. The approach employs two steps. First, it uses a fast subgraph mining algorithm to find all occurrences of family-specific labeled subgraphs for all well characterized protein structural and functional families. Second, it queries a new structure for occurrences of a set of motifs characteristic of a known family, using a graph index to speed up Ullman's subgraph isomorphism algorithm. The confidence of function inference from structure depends on the number of family-specific motifs found in the query structure compared with their distribution in a large non-redundant database of proteins. This method can assign a new structure to a specific functional family in cases where sequence alignments, sequence patterns, structural superposition and active site templates fail to provide accurate annotation.

  5. Optimizing graph-based patterns to extract biomedical events from the literature

    PubMed Central

    2015-01-01

    In BioNLP-ST 2013 We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs of input sentences. Our system was able to address both the GENIA (GE) task focusing on 13 molecular biology related event types and the Cancer Genetics (CG) task targeting a challenging group of 40 cancer biology related event types with varying arguments concerning 18 kinds of biological entities. In addition to adapting our system to the two tasks, we also attempted to integrate semantics into the graph matching scheme using a distributional similarity model for more events, and evaluated the event extraction impact of using paths of all possible lengths as key context dependencies beyond using only the shortest paths in our system. We achieved a 46.38% F-score in the CG task (ranking 3rd) and a 48.93% F-score in the GE task (ranking 4th). After BioNLP-ST 2013 We explored three ways to further extend our event extraction system in our previously published work: (1) We allow non-essential nodes to be skipped, and incorporated a node skipping penalty into the subgraph distance function of our approximate subgraph matching algorithm. (2) Instead of assigning a unified subgraph distance threshold to all patterns of an event type, we learned a customized threshold for each pattern. (3) We implemented the well-known Empirical Risk Minimization (ERM) principle to optimize the event pattern set by balancing prediction errors on training data against regularization. When evaluated on the official GE task test data, these extensions help to improve the extraction precision from 62% to 65%. However, the overall F-score stays equivalent to the previous performance due to a 1% drop in recall. PMID:26551594

  6. Unsupervised object segmentation with a hybrid graph model (HGM).

    PubMed

    Liu, Guangcan; Lin, Zhouchen; Yu, Yong; Tang, Xiaoou

    2010-05-01

    In this work, we address the problem of performing class-specific unsupervised object segmentation, i.e., automatic segmentation without annotated training images. Object segmentation can be regarded as a special data clustering problem where both class-specific information and local texture/color similarities have to be considered. To this end, we propose a hybrid graph model (HGM) that can make effective use of both symmetric and asymmetric relationship among samples. The vertices of a hybrid graph represent the samples and are connected by directed edges and/or undirected ones, which represent the asymmetric and/or symmetric relationship between them, respectively. When applied to object segmentation, vertices are superpixels, the asymmetric relationship is the conditional dependence of occurrence, and the symmetric relationship is the color/texture similarity. By combining the Markov chain formed by the directed subgraph and the minimal cut of the undirected subgraph, the object boundaries can be determined for each image. Using the HGM, we can conveniently achieve simultaneous segmentation and recognition by integrating both top-down and bottom-up information into a unified process. Experiments on 42 object classes (9,415 images in total) show promising results.

  7. An analysis of multi-type relational interactions in FMA using graph motifs with disjointness constraints.

    PubMed

    Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

    2012-01-01

    The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, whereby bringing in rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amendable for programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation.

  8. An Analysis of Multi-type Relational Interactions in FMA Using Graph Motifs with Disjointness Constraints

    PubMed Central

    Zhang, Guo-Qiang; Luo, Lingyun; Ogbuji, Chime; Joslyn, Cliff; Mejino, Jose; Sahoo, Satya S

    2012-01-01

    The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions for detecting logical inconsistencies as well as other anomalies represented by the motifs. MOCH represents patterns of multi-type interaction as small labeled (with multiple types of edges) sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology, we performed exhaustive analyses of a variety of labeled sub-graph motifs. The quality assurance feature of MOCH comes from the distinct use of a subset of the edges of the graph motifs as constraints for disjointness, whereby bringing in rule-based flavor to the approach as well. With possible disjointness implied by antonyms, we performed manual inspection of the resulting FMA fragments and tracked down sources of abnormal inferred conclusions (logical inconsistencies), which are amendable for programmatic revision of the FMA. Our results demonstrate that MOCH provides a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation. PMID:23304382

  9. Predicting and Detecting Emerging Cyberattack Patterns Using StreamWorks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, George; Choudhury, Sutanay; Feo, John T.

    2014-06-30

    The number and sophistication of cyberattacks on industries and governments have dramatically grown in recent years. To counter this movement, new advanced tools and techniques are needed to detect cyberattacks in their early stages such that defensive actions may be taken to avert or mitigate potential damage. From a cybersecurity analysis perspective, detecting cyberattacks may be cast as a problem of identifying patterns in computer network traffic. Logically and intuitively, these patterns may take on the form of a directed graph that conveys how an attack or intrusion propagates through the computers of a network. Such cyberattack graphs could providemore » cybersecurity analysts with powerful conceptual representations that are natural to express and analyze. We have been researching and developing graph-centric approaches and algorithms for dynamic cyberattack detection. The advanced dynamic graph algorithms we are developing will be packaged into a streaming network analysis framework known as StreamWorks. With StreamWorks, a scientist or analyst may detect and identify precursor events and patterns as they emerge in complex networks. This analysis framework is intended to be used in a dynamic environment where network data is streamed in and is appended to a large-scale dynamic graph. Specific graphical query patterns are decomposed and collected into a graph query library. The individual decomposed subpatterns in the library are continuously and efficiently matched against the dynamic graph as it evolves to identify and detect early, partial subgraph patterns. The scalable emerging subgraph pattern algorithms will match on both structural and semantic network properties.« less

  10. Super (a*, d*)-ℋ-antimagic total covering of second order of shackle graphs

    NASA Astrophysics Data System (ADS)

    Hesti Agustin, Ika; Dafik; Nisviasari, Rosanita; Prihandini, R. M.

    2017-12-01

    Let H be a simple and connected graph. A shackle of graph H, denoted by G = shack(H, v, n), is a graph G constructed by non-trivial graphs H 1, H 2, …, H n such that, for every 1 ≤ s, t ≤ n, H s and Ht have no a common vertex with |s - t| ≥ 2 and for every 1 ≤ i ≤ n - 1, Hi and H i+1 share exactly one common vertex v, called connecting vertex, and those k - 1 connecting vertices are all distinct. The graph G is said to be an (a*, d*)-H-antimagic total graph of second order if there exist a bijective function f : V(G) ∪ E(G) → {1, 2, …, |V(G)| + |E(G)|} such that for all subgraphs isomorphic to H, the total H-weights W(H)=\\displaystyle {\\sum }v\\in V(H)f(v)+\\displaystyle {\\sum }e\\in E(H)f(e) form an arithmetic sequence of second order of \\{a* ,a* +d* ,a* +3d* ,a* +6d* ,\\ldots ,a* +(\\frac{{n}2-n}{2})d* \\}, where a* and d* are positive integers and n is the number of all subgraphs isomorphic to H. An (a*, d*)-H-antimagic total labeling of second order f is called super if the smallest labels appear in the vertices. In this paper, we study a super (a*, d*)-H antimagic total labeling of second order of G = shack(H, v, n) by using a partition technique of second order.

  11. A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Choudhury, Sutanay; Holder, Larry; Chin, George

    2015-02-02

    Cyber security is one of the most significant technical challenges in current times. Detecting adversarial activities, prevention of theft of intellectual properties and customer data is a high priority for corporations and government agencies around the world. Cyber defenders need to analyze massive-scale, high-resolution network flows to identify, categorize, and mitigate attacks involving net- works spanning institutional and national boundaries. Many of the cyber attacks can be described as subgraph patterns, with promi- nent examples being insider infiltrations (path queries), denial of service (parallel paths) and malicious spreads (tree queries). This motivates us to explore subgraph matching on streaming graphsmore » in a continuous setting. The novelty of our work lies in using the subgraph distributional statistics collected from the streaming graph to determine the query processing strategy. We introduce a “Lazy Search" algorithm where the search strategy is decided on a vertex-to-vertex basis depending on the likelihood of a match in the vertex neighborhood. We also propose a metric named “Relative Selectivity" that is used to se- lect between different query processing strategies. Our experiments performed on real online news, network traffic stream and a syn- thetic social network benchmark demonstrate 10-100x speedups over selectivity agnostic approaches.« less

  12. GraphPrints: Towards a Graph Analytic Method for Network Anomaly Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Harshaw, Chris R; Bridges, Robert A; Iannacone, Michael D

    This paper introduces a novel graph-analytic approach for detecting anomalies in network flow data called \\textit{GraphPrints}. Building on foundational network-mining techniques, our method represents time slices of traffic as a graph, then counts graphlets\\textemdash small induced subgraphs that describe local topology. By performing outlier detection on the sequence of graphlet counts, anomalous intervals of traffic are identified, and furthermore, individual IPs experiencing abnormal behavior are singled-out. Initial testing of GraphPrints is performed on real network data with an implanted anomaly. Evaluation shows false positive rates bounded by 2.84\\% at the time-interval level, and 0.05\\% at the IP-level with 100\\% truemore » positive rates at both.« less

  13. A general method for computing Tutte polynomials of self-similar graphs

    NASA Astrophysics Data System (ADS)

    Gong, Helin; Jin, Xian'an

    2017-10-01

    Self-similar graphs were widely studied in both combinatorics and statistical physics. Motivated by the construction of the well-known 3-dimensional Sierpiński gasket graphs, in this paper we introduce a family of recursively constructed self-similar graphs whose inner duals are of the self-similar property. By combining the dual property of the Tutte polynomial and the subgraph-decomposition trick, we show that the Tutte polynomial of this family of graphs can be computed in an iterative way and in particular the exact expression of the formula of the number of their spanning trees is derived. Furthermore, we show our method is a general one that is easily extended to compute Tutte polynomials for other families of self-similar graphs such as Farey graphs, 2-dimensional Sierpiński gasket graphs, Hanoi graphs, modified Koch graphs, Apollonian graphs, pseudofractal scale-free web, fractal scale-free network, etc.

  14. Solving Connected Subgraph Problems in Wildlife Conservation

    NASA Astrophysics Data System (ADS)

    Dilkina, Bistra; Gomes, Carla P.

    We investigate mathematical formulations and solution techniques for a variant of the Connected Subgraph Problem. Given a connected graph with costs and profits associated with the nodes, the goal is to find a connected subgraph that contains a subset of distinguished vertices. In this work we focus on the budget-constrained version, where we maximize the total profit of the nodes in the subgraph subject to a budget constraint on the total cost. We propose several mixed-integer formulations for enforcing the subgraph connectivity requirement, which plays a key role in the combinatorial structure of the problem. We show that a new formulation based on subtour elimination constraints is more effective at capturing the combinatorial structure of the problem, providing significant advantages over the previously considered encoding which was based on a single commodity flow. We test our formulations on synthetic instances as well as on real-world instances of an important problem in environmental conservation concerning the design of wildlife corridors. Our encoding results in a much tighter LP relaxation, and more importantly, it results in finding better integer feasible solutions as well as much better upper bounds on the objective (often proving optimality or within less than 1% of optimality), both when considering the synthetic instances as well as the real-world wildlife corridor instances.

  15. Separation of ion types in tandem mass spectrometry data interpretation -- a graph-theoretic approach.

    PubMed

    Yan, Bo; Pan, Chongle; Olman, Victor N; Hettich, Robert L; Xu, Ying

    2004-01-01

    Mass spectrometry is one of the most popular analytical techniques for identification of individual proteins in a protein mixture, one of the basic problems in proteomics. It identifies a protein through identifying its unique mass spectral pattern. While the problem is theoretically solvable, it remains a challenging problem computationally. One of the key challenges comes from the difficulty in distinguishing the N- and C-terminus ions, mostly b- and y-ions respectively. In this paper, we present a graph algorithm for solving the problem of separating bfrom y-ions in a set of mass spectra. We represent each spectral peak as a node and consider two types of edges: a type-1 edge connects two peaks possibly of the same ion types and a type-2 edge connects two peaks possibly of different ion types, predicted based on local information. The ion-separation problem is then formulated and solved as a graph partition problem, which is to partition the graph into three subgraphs, namely b-, y-ions and others respectively, so to maximize the total weight of type-1 edges while minimizing the total weight of type-2 edges within each subgraph. We have developed a dynamic programming algorithm for rigorously solving this graph partition problem and implemented it as a computer program PRIME. We have tested PRIME on 18 data sets of high accurate FT-ICR tandem mass spectra and found that it achieved ~90% accuracy for separation of b- and y- ions.

  16. Protein-protein interaction network-based detection of functionally similar proteins within species.

    PubMed

    Song, Baoxing; Wang, Fen; Guo, Yang; Sang, Qing; Liu, Min; Li, Dengyun; Fang, Wei; Zhang, Deli

    2012-07-01

    Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, evolution of protein families, progression of co-evolution, and convergent evolution and others which cannot be obtained by detection of functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After denoting protein-protein interaction networks using graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a method of modified shortest path to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot detected by sequence alignment were identified. By analyzing the results, we found that, sometimes, it is difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent. Copyright © 2012 Wiley Periodicals, Inc.

  17. Aligning Biomolecular Networks Using Modular Graph Kernels

    NASA Astrophysics Data System (ADS)

    Towfic, Fadi; Greenlee, M. Heather West; Honavar, Vasant

    Comparative analysis of biomolecular networks constructed using measurements from different conditions, tissues, and organisms offer a powerful approach to understanding the structure, function, dynamics, and evolution of complex biological systems. We explore a class of algorithms for aligning large biomolecular networks by breaking down such networks into subgraphs and computing the alignment of the networks based on the alignment of their subgraphs. The resulting subnetworks are compared using graph kernels as scoring functions. We provide implementations of the resulting algorithms as part of BiNA, an open source biomolecular network alignment toolkit. Our experiments using Drosophila melanogaster, Saccharomyces cerevisiae, Mus musculus and Homo sapiens protein-protein interaction networks extracted from the DIP repository of protein-protein interaction data demonstrate that the performance of the proposed algorithms (as measured by % GO term enrichment of subnetworks identified by the alignment) is competitive with some of the state-of-the-art algorithms for pair-wise alignment of large protein-protein interaction networks. Our results also show that the inter-species similarity scores computed based on graph kernels can be used to cluster the species into a species tree that is consistent with the known phylogenetic relationships among the species.

  18. High performance genetic algorithm for VLSI circuit partitioning

    NASA Astrophysics Data System (ADS)

    Dinu, Simona

    2016-12-01

    Partitioning is one of the biggest challenges in computer-aided design for VLSI circuits (very large-scale integrated circuits). This work address the min-cut balanced circuit partitioning problem- dividing the graph that models the circuit into almost equal sized k sub-graphs while minimizing the number of edges cut i.e. minimizing the number of edges connecting the sub-graphs. The problem may be formulated as a combinatorial optimization problem. Experimental studies in the literature have shown the problem to be NP-hard and thus it is important to design an efficient heuristic algorithm to solve it. The approach proposed in this study is a parallel implementation of a genetic algorithm, namely an island model. The information exchange between the evolving subpopulations is modeled using a fuzzy controller, which determines an optimal balance between exploration and exploitation of the solution space. The results of simulations show that the proposed algorithm outperforms the standard sequential genetic algorithm both in terms of solution quality and convergence speed. As a direction for future study, this research can be further extended to incorporate local search operators which should include problem-specific knowledge. In addition, the adaptive configuration of mutation and crossover rates is another guidance for future research.

  19. The median problems on linear multichromosomal genomes: graph representation and fast exact solutions.

    PubMed

    Xu, Andrew Wei

    2010-09-01

    In genome rearrangement, given a set of genomes G and a distance measure d, the median problem asks for another genome q that minimizes the total distance [Formula: see text]. This is a key problem in genome rearrangement based phylogenetic analysis. Although this problem is known to be NP-hard, we have shown in a previous article, on circular genomes and under the DCJ distance measure, that a family of patterns in the given genomes--represented by adequate subgraphs--allow us to rapidly find exact solutions to the median problem in a decomposition approach. In this article, we extend this result to the case of linear multichromosomal genomes, in order to solve more interesting problems on eukaryotic nuclear genomes. A multi-way capping problem in the linear multichromosomal case imposes an extra computational challenge on top of the difficulty in the circular case, and this difficulty has been underestimated in our previous study and is addressed in this article. We represent the median problem by the capped multiple breakpoint graph, extend the adequate subgraphs into the capped adequate subgraphs, and prove optimality-preserving decomposition theorems, which give us the tools to solve the median problem and the multi-way capping optimization problem together. We also develop an exact algorithm ASMedian-linear, which iteratively detects instances of (capped) adequate subgraphs and decomposes problems into subproblems. Tested on simulated data, ASMedian-linear can rapidly solve most problems with up to several thousand genes, and it also can provide optimal or near-optimal solutions to the median problem under the reversal/HP distance measures. ASMedian-linear is available at http://sites.google.com/site/andrewweixu .

  20. Constraints on signaling network logic reveal functional subgraphs on Multiple Myeloma OMIC data.

    PubMed

    Miannay, Bertrand; Minvielle, Stéphane; Magrangeas, Florence; Guziolowski, Carito

    2018-03-21

    The integration of gene expression profiles (GEPs) and large-scale biological networks derived from pathways databases is a subject which is being widely explored. Existing methods are based on network distance measures among significantly measured species. Only a small number of them include the directionality and underlying logic existing in biological networks. In this study we approach the GEP-networks integration problem by considering the network logic, however our approach does not require a prior species selection according to their gene expression level. We start by modeling the biological network representing its underlying logic using Logic Programming. This model points to reachable network discrete states that maximize a notion of harmony between the molecular species active or inactive possible states and the directionality of the pathways reactions according to their activator or inhibitor control role. Only then, we confront these network states with the GEP. From this confrontation independent graph components are derived, each of them related to a fixed and optimal assignment of active or inactive states. These components allow us to decompose a large-scale network into subgraphs and their molecular species state assignments have different degrees of similarity when compared to the same GEP. We apply our method to study the set of possible states derived from a subgraph from the NCI-PID Pathway Interaction Database. This graph links Multiple Myeloma (MM) genes to known receptors for this blood cancer. We discover that the NCI-PID MM graph had 15 independent components, and when confronted to 611 MM GEPs, we find 1 component as being more specific to represent the difference between cancer and healthy profiles.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hamilton, Kathleen E.; Humble, Travis S.

    Using quantum annealing to solve an optimization problem requires minor embedding a logic graph into a known hardware graph. We introduce the minor set cover (MSC) of a known graph GG : a subset of graph minors which contain any remaining minor of the graph as a subgraph, in an effort to reduce the complexity of the minor embedding problem. Any graph that can be embedded into GG will be embeddable into a member of the MSC. Focusing on embedding into the hardware graph of commercially available quantum annealers, we establish the MSC for a particular known virtual hardware, whichmore » is a complete bipartite graph. Furthermore, we show that the complete bipartite graph K N,N has a MSC of N minors, from which K N+1 is identified as the largest clique minor of K N,N. In the case of determining the largest clique minor of hardware with faults we briefly discussed this open question.« less

  2. Continuous-Time Classical and Quantum Random Walk on Direct Product of Cayley Graphs

    NASA Astrophysics Data System (ADS)

    Salimi, S.; Jafarizadeh, M. A.

    2009-06-01

    In this paper we define direct product of graphs and give a recipe for obtaining probability of observing particle on vertices in the continuous-time classical and quantum random walk. In the recipe, the probability of observing particle on direct product of graph is obtained by multiplication of probability on the corresponding to sub-graphs, where this method is useful to determining probability of walk on complicated graphs. Using this method, we calculate the probability of continuous-time classical and quantum random walks on many of finite direct product Cayley graphs (complete cycle, complete Kn, charter and n-cube). Also, we inquire that the classical state the stationary uniform distribution is reached as t → ∞ but for quantum state is not always satisfied.

  3. The hypergraph regularity method and its applications

    PubMed Central

    Rödl, V.; Nagle, B.; Skokan, J.; Schacht, M.; Kohayakawa, Y.

    2005-01-01

    Szemerédi's regularity lemma asserts that every graph can be decomposed into relatively few random-like subgraphs. This random-like behavior enables one to find and enumerate subgraphs of a given isomorphism type, yielding the so-called counting lemma for graphs. The combined application of these two lemmas is known as the regularity method for graphs and has proved useful in graph theory, combinatorial geometry, combinatorial number theory, and theoretical computer science. Here, we report on recent advances in the regularity method for k-uniform hypergraphs, for arbitrary k ≥ 2. This method, purely combinatorial in nature, gives alternative proofs of density theorems originally due to E. Szemerédi, H. Furstenberg, and Y. Katznelson. Further results in extremal combinatorics also have been obtained with this approach. The two main components of the regularity method for k-uniform hypergraphs, the regularity lemma and the counting lemma, have been obtained recently: Rödl and Skokan (based on earlier work of Frankl and Rödl) generalized Szemerédi's regularity lemma to k-uniform hypergraphs, and Nagle, Rödl, and Schacht succeeded in proving a counting lemma accompanying the Rödl–Skokan hypergraph regularity lemma. The counting lemma is proved by reducing the counting problem to a simpler one previously investigated by Kohayakawa, Rödl, and Skokan. Similar results were obtained independently by W. T. Gowers, following a different approach. PMID:15919821

  4. Dynamic Visualization of Co-expression in Systems Genetics Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    New, Joshua Ryan; Huang, Jian; Chesler, Elissa J

    2008-01-01

    Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biologicalmore » networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.« less

  5. Eigenvector synchronization, graph rigidity and the molecule problemR

    PubMed Central

    Cucuringu, Mihai; Singer, Amit; Cowburn, David

    2013-01-01

    The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology. In this paper, we extend the previous work and propose the 3D-As-Synchronized-As-Possible (3D-ASAP) algorithm, for the graph realization problem in ℝ3, given a sparse and noisy set of distance measurements. 3D-ASAP is a divide and conquer, non-incremental and non-iterative algorithm, which integrates local distance information into a global structure determination. Our approach starts with identifying, for every node, a subgraph of its 1-hop neighborhood graph, which can be accurately embedded in its own coordinate system. In the noise-free case, the computed coordinates of the sensors in each patch must agree with their global positioning up to some unknown rigid motion, that is, up to translation, rotation and possibly reflection. In other words, to every patch, there corresponds an element of the Euclidean group, Euc(3), of rigid transformations in ℝ3, and the goal was to estimate the group elements that will properly align all the patches in a globally consistent way. Furthermore, 3D-ASAP successfully incorporates information specific to the molecule problem in structural biology, in particular information on known substructures and their orientation. In addition, we also propose 3D-spectral-partitioning (SP)-ASAP, a faster version of 3D-ASAP, which uses a spectral partitioning algorithm as a pre-processing step for dividing the initial graph into smaller subgraphs. Our extensive numerical simulations show that 3D-ASAP and 3D-SP-ASAP are very robust to high levels of noise in the measured distances and to sparse connectivity in the measurement graph, and compare favorably with similar state-of-the-art localization algorithms. PMID:24432187

  6. Matching and Vertex Packing: How Hard Are They?

    DTIC Science & Technology

    1991-01-01

    Theory, 29, Ann. Discrete Math ., North-Holland, Amsterdam, 1986. [2] M.D. Plummer, Matching theory - a sampler: from D~nes K~nig to the present...Ser. B, 28, 1980, 284-304. [20i N. Sbihi, Algorithme de recherche d’un stable de cardinalit6 maximum dans un graphe sans 6toile, Discrete Math ., 29...cliques and by finite families of graphs, Discrete Math ., 49, 1984, 45-59. [92] G. Cornu~jols, D. Hartvigsen and W.R. Pulleyblank, Packing subgraphs in

  7. Enhancing Community Detection By Affinity-based Edge Weighting Scheme

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yoo, Andy; Sanders, Geoffrey; Henson, Van

    Community detection refers to an important graph analytics problem of finding a set of densely-connected subgraphs in a graph and has gained a great deal of interest recently. The performance of current community detection algorithms is limited by an inherent constraint of unweighted graphs that offer very little information on their internal community structures. In this paper, we propose a new scheme to address this issue that weights the edges in a given graph based on recently proposed vertex affinity. The vertex affinity quantifies the proximity between two vertices in terms of their clustering strength, and therefore, it is idealmore » for graph analytics applications such as community detection. We also demonstrate that the affinity-based edge weighting scheme can improve the performance of community detection algorithms significantly.« less

  8. Identifying the minor set cover of dense connected bipartite graphs via random matching edge sets

    NASA Astrophysics Data System (ADS)

    Hamilton, Kathleen E.; Humble, Travis S.

    2017-04-01

    Using quantum annealing to solve an optimization problem requires minor embedding a logic graph into a known hardware graph. In an effort to reduce the complexity of the minor embedding problem, we introduce the minor set cover (MSC) of a known graph G: a subset of graph minors which contain any remaining minor of the graph as a subgraph. Any graph that can be embedded into G will be embeddable into a member of the MSC. Focusing on embedding into the hardware graph of commercially available quantum annealers, we establish the MSC for a particular known virtual hardware, which is a complete bipartite graph. We show that the complete bipartite graph K_{N,N} has a MSC of N minors, from which K_{N+1} is identified as the largest clique minor of K_{N,N}. The case of determining the largest clique minor of hardware with faults is briefly discussed but remains an open question.

  9. Identifying the minor set cover of dense connected bipartite graphs via random matching edge sets

    DOE PAGES

    Hamilton, Kathleen E.; Humble, Travis S.

    2017-02-23

    Using quantum annealing to solve an optimization problem requires minor embedding a logic graph into a known hardware graph. We introduce the minor set cover (MSC) of a known graph GG : a subset of graph minors which contain any remaining minor of the graph as a subgraph, in an effort to reduce the complexity of the minor embedding problem. Any graph that can be embedded into GG will be embeddable into a member of the MSC. Focusing on embedding into the hardware graph of commercially available quantum annealers, we establish the MSC for a particular known virtual hardware, whichmore » is a complete bipartite graph. Furthermore, we show that the complete bipartite graph K N,N has a MSC of N minors, from which K N+1 is identified as the largest clique minor of K N,N. In the case of determining the largest clique minor of hardware with faults we briefly discussed this open question.« less

  10. Renal cortex segmentation using optimal surface search with novel graph construction.

    PubMed

    Li, Xiuli; Chen, Xinjian; Yao, Jianhua; Zhang, Xing; Tian, Jie

    2011-01-01

    In this paper, we propose a novel approach to solve the renal cortex segmentation problem, which has rarely been studied. In this study, the renal cortex segmentation problem is handled as a multiple-surfaces extraction problem, which is solved using the optimal surface search method. We propose a novel graph construction scheme in the optimal surface search to better accommodate multiple surfaces. Different surface sub-graphs are constructed according to their properties, and inter-surface relationships are also modeled in the graph. The proposed method was tested on 17 clinical CT datasets. The true positive volume fraction (TPVF) and false positive volume fraction (FPVF) are 74.10% and 0.08%, respectively. The experimental results demonstrate the effectiveness of the proposed method.

  11. A tool for filtering information in complex systems

    PubMed Central

    Tumminello, M.; Aste, T.; Di Matteo, T.; Mantegna, R. N.

    2005-01-01

    We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties. PMID:16027373

  12. A tool for filtering information in complex systems.

    PubMed

    Tumminello, M; Aste, T; Di Matteo, T; Mantegna, R N

    2005-07-26

    We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties.

  13. Simultaneously Discovering and Localizing Common Objects in Wild Images.

    PubMed

    Wang, Zhenzhen; Yuan, Junsong

    2018-09-01

    Motivated by the recent success of supervised and weakly supervised common object discovery, in this paper, we move forward one step further to tackle common object discovery in a fully unsupervised way. Generally, object co-localization aims at simultaneously localizing objects of the same class across a group of images. Traditional object localization/detection usually trains specific object detectors which require bounding box annotations of object instances, or at least image-level labels to indicate the presence/absence of objects in an image. Given a collection of images without any annotations, our proposed fully unsupervised method is to simultaneously discover images that contain common objects and also localize common objects in corresponding images. Without requiring to know the total number of common objects, we formulate this unsupervised object discovery as a sub-graph mining problem from a weighted graph of object proposals, where nodes correspond to object proposals, and edges represent the similarities between neighbouring proposals. The positive images and common objects are jointly discovered by finding sub-graphs of strongly connected nodes, with each sub-graph capturing one object pattern. The optimization problem can be efficiently solved by our proposed maximal-flow-based algorithm. Instead of assuming that each image contains only one common object, our proposed solution can better address wild images where each image may contain multiple common objects or even no common object. Moreover, our proposed method can be easily tailored to the task of image retrieval in which the nodes correspond to the similarity between query and reference images. Extensive experiments on PASCAL VOC 2007 and Object Discovery data sets demonstrate that even without any supervision, our approach can discover/localize common objects of various classes in the presence of scale, view point, appearance variation, and partial occlusions. We also conduct broad experiments on image retrieval benchmarks, Holidays and Oxford5k data sets, to show that our proposed method, which considers both the similarity between query and reference images and also similarities among reference images, can help to improve the retrieval results significantly.

  14. An novel frequent probability pattern mining algorithm based on circuit simulation method in uncertain biological networks.

    PubMed

    He, Jieyue; Wang, Chunyan; Qiu, Kunpu; Zhong, Wei

    2014-01-01

    Motif mining has always been a hot research topic in bioinformatics. Most of current research on biological networks focuses on exact motif mining. However, due to the inevitable experimental error and noisy data, biological network data represented as the probability model could better reflect the authenticity and biological significance, therefore, it is more biological meaningful to discover probability motif in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery which is usually based on the possible world model having a relatively high computational complexity. In this paper, we present a novel method for detecting frequent probability patterns based on circuit simulation in the uncertain biological networks. First, the partition based efficient search is applied to the non-tree like subgraph mining where the probability of occurrence in random networks is small. Then, an algorithm of probability isomorphic based on circuit simulation is proposed. The probability isomorphic combines the analysis of circuit topology structure with related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs. The circuit simulation based probability isomorphic can avoid using traditional possible world model. Finally, based on the algorithm of probability subgraph isomorphism, two-step hierarchical clustering method is used to cluster subgraphs, and discover frequent probability patterns from the clusters. The experiment results on data sets of the Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover the frequent probability subgraphs. The discovered subgraphs in our study contain all probability motifs reported in the experiments published in other related papers. The algorithm of probability graph isomorphism evaluation based on circuit simulation method excludes most of subgraphs which are not probability isomorphism and reduces the search space of the probability isomorphism subgraphs using the mismatch values in the node voltage set. It is an innovative way to find the frequent probability patterns, which can be efficiently applied to probability motif discovery problems in the further studies.

  15. An novel frequent probability pattern mining algorithm based on circuit simulation method in uncertain biological networks

    PubMed Central

    2014-01-01

    Background Motif mining has always been a hot research topic in bioinformatics. Most of current research on biological networks focuses on exact motif mining. However, due to the inevitable experimental error and noisy data, biological network data represented as the probability model could better reflect the authenticity and biological significance, therefore, it is more biological meaningful to discover probability motif in uncertain biological networks. One of the key steps in probability motif mining is frequent pattern discovery which is usually based on the possible world model having a relatively high computational complexity. Methods In this paper, we present a novel method for detecting frequent probability patterns based on circuit simulation in the uncertain biological networks. First, the partition based efficient search is applied to the non-tree like subgraph mining where the probability of occurrence in random networks is small. Then, an algorithm of probability isomorphic based on circuit simulation is proposed. The probability isomorphic combines the analysis of circuit topology structure with related physical properties of voltage in order to evaluate the probability isomorphism between probability subgraphs. The circuit simulation based probability isomorphic can avoid using traditional possible world model. Finally, based on the algorithm of probability subgraph isomorphism, two-step hierarchical clustering method is used to cluster subgraphs, and discover frequent probability patterns from the clusters. Results The experiment results on data sets of the Protein-Protein Interaction (PPI) networks and the transcriptional regulatory networks of E. coli and S. cerevisiae show that the proposed method can efficiently discover the frequent probability subgraphs. The discovered subgraphs in our study contain all probability motifs reported in the experiments published in other related papers. Conclusions The algorithm of probability graph isomorphism evaluation based on circuit simulation method excludes most of subgraphs which are not probability isomorphism and reduces the search space of the probability isomorphism subgraphs using the mismatch values in the node voltage set. It is an innovative way to find the frequent probability patterns, which can be efficiently applied to probability motif discovery problems in the further studies. PMID:25350277

  16. CombiMotif: A new algorithm for network motifs discovery in protein-protein interaction networks

    NASA Astrophysics Data System (ADS)

    Luo, Jiawei; Li, Guanghui; Song, Dan; Liang, Cheng

    2014-12-01

    Discovering motifs in protein-protein interaction networks is becoming a current major challenge in computational biology, since the distribution of the number of network motifs can reveal significant systemic differences among species. However, this task can be computationally expensive because of the involvement of graph isomorphic detection. In this paper, we present a new algorithm (CombiMotif) that incorporates combinatorial techniques to count non-induced occurrences of subgraph topologies in the form of trees. The efficiency of our algorithm is demonstrated by comparing the obtained results with the current state-of-the art subgraph counting algorithms. We also show major differences between unicellular and multicellular organisms. The datasets and source code of CombiMotif are freely available upon request.

  17. Derivatives in discrete mathematics: a novel graph-theoretical invariant for generating new 2/3D molecular descriptors. I. Theory and QSPR application.

    PubMed

    Marrero-Ponce, Yovani; Santiago, Oscar Martínez; López, Yoan Martínez; Barigye, Stephen J; Torrens, Francisco

    2012-11-01

    In this report, we present a new mathematical approach for describing chemical structures of organic molecules at atomic-molecular level, proposing for the first time the use of the concept of the derivative ([Formula: see text]) of a molecular graph (MG) with respect to a given event (E), to obtain a new family of molecular descriptors (MDs). With this purpose, a new matrix representation of the MG, which generalizes graph's theory's traditional incidence matrix, is introduced. This matrix, denominated the generalized incidence matrix, Q, arises from the Boolean representation of molecular sub-graphs that participate in the formation of the graph molecular skeleton MG and could be complete (representing all possible connected sub-graphs) or constitute sub-graphs of determined orders or types as well as a combination of these. The Q matrix is a non-quadratic and unsymmetrical in nature, its columns (n) and rows (m) are conditions (letters) and collection of conditions (words) with which the event occurs. This non-quadratic and unsymmetrical matrix is transformed, by algebraic manipulation, to a quadratic and symmetric matrix known as relations frequency matrix, F, which characterizes the participation intensity of the conditions (letters) in the events (words). With F, we calculate the derivative over a pair of atomic nuclei. The local index for the atomic nuclei i, Δ(i), can therefore be obtained as a linear combination of all the pair derivatives of the atomic nuclei i with all the rest of the j's atomic nuclei. Here, we also define new strategies that generalize the present form of obtaining global or local (group or atom-type) invariants from atomic contributions (local vertex invariants, LOVIs). In respect to this, metric (norms), means and statistical invariants are introduced. These invariants are applied to a vector whose components are the values Δ(i) for the atomic nuclei of the molecule or its fragments. Moreover, with the purpose of differentiating among different atoms, an atomic weighting scheme (atom-type labels) is used in the formation of the matrix Q or in LOVIs state. The obtained indices were utilized to describe the partition coefficient (Log P) and the reactivity index (Log K) of the 34 derivatives of 2-furylethylenes. In all the cases, our MDs showed better statistical results than those previously obtained using some of the most used families of MDs in chemometric practice. Therefore, it has been demonstrated to that the proposed MDs are useful in molecular design and permit obtaining easier and robust mathematical models than the majority of those reported in the literature. All this range of mentioned possibilities open "the doors" to the creation of a new family of MDs, using the graph derivative, and avail a new tool for QSAR/QSPR and molecular diversity/similarity studies.

  18. Sequential Monte Carlo for Maximum Weight Subgraphs with Application to Solving Image Jigsaw Puzzles.

    PubMed

    Adluru, Nagesh; Yang, Xingwei; Latecki, Longin Jan

    2015-05-01

    We consider a problem of finding maximum weight subgraphs (MWS) that satisfy hard constraints in a weighted graph. The constraints specify the graph nodes that must belong to the solution as well as mutual exclusions of graph nodes, i.e., pairs of nodes that cannot belong to the same solution. Our main contribution is a novel inference approach for solving this problem in a sequential monte carlo (SMC) sampling framework. Usually in an SMC framework there is a natural ordering of the states of the samples. The order typically depends on observations about the states or on the annealing setup used. In many applications (e.g., image jigsaw puzzle problems), all observations (e.g., puzzle pieces) are given at once and it is hard to define a natural ordering. Therefore, we relax the assumption of having ordered observations about states and propose a novel SMC algorithm for obtaining maximum a posteriori estimate of a high-dimensional posterior distribution. This is achieved by exploring different orders of states and selecting the most informative permutations in each step of the sampling. Our experimental results demonstrate that the proposed inference framework significantly outperforms loopy belief propagation in solving the image jigsaw puzzle problem. In particular, our inference quadruples the accuracy of the puzzle assembly compared to that of loopy belief propagation.

  19. Optimal Co-segmentation of Tumor in PET-CT Images with Context Information

    PubMed Central

    Song, Qi; Bai, Junjie; Han, Dongfeng; Bhatia, Sudershan; Sun, Wenqing; Rockey, William; Bayouth, John E.; Buatti, John M.

    2014-01-01

    PET-CT images have been widely used in clinical practice for radiotherapy treatment planning of the radiotherapy. Many existing segmentation approaches only work for a single imaging modality, which suffer from the low spatial resolution in PET or low contrast in CT. In this work we propose a novel method for the co-segmentation of the tumor in both PET and CT images, which makes use of advantages from each modality: the functionality information from PET and the anatomical structure information from CT. The approach formulates the segmentation problem as a minimization problem of a Markov Random Field (MRF) model, which encodes the information from both modalities. The optimization is solved using a graph-cut based method. Two sub-graphs are constructed for the segmentation of the PET and the CT images, respectively. To achieve consistent results in two modalities, an adaptive context cost is enforced by adding context arcs between the two subgraphs. An optimal solution can be obtained by solving a single maximum flow problem, which leads to simultaneous segmentation of the tumor volumes in both modalities. The proposed algorithm was validated in robust delineation of lung tumors on 23 PET-CT datasets and two head-and-neck cancer subjects. Both qualitative and quantitative results show significant improvement compared to the graph cut methods solely using PET or CT. PMID:23693127

  20. Sequential Monte Carlo for Maximum Weight Subgraphs with Application to Solving Image Jigsaw Puzzles

    PubMed Central

    Adluru, Nagesh; Yang, Xingwei; Latecki, Longin Jan

    2015-01-01

    We consider a problem of finding maximum weight subgraphs (MWS) that satisfy hard constraints in a weighted graph. The constraints specify the graph nodes that must belong to the solution as well as mutual exclusions of graph nodes, i.e., pairs of nodes that cannot belong to the same solution. Our main contribution is a novel inference approach for solving this problem in a sequential monte carlo (SMC) sampling framework. Usually in an SMC framework there is a natural ordering of the states of the samples. The order typically depends on observations about the states or on the annealing setup used. In many applications (e.g., image jigsaw puzzle problems), all observations (e.g., puzzle pieces) are given at once and it is hard to define a natural ordering. Therefore, we relax the assumption of having ordered observations about states and propose a novel SMC algorithm for obtaining maximum a posteriori estimate of a high-dimensional posterior distribution. This is achieved by exploring different orders of states and selecting the most informative permutations in each step of the sampling. Our experimental results demonstrate that the proposed inference framework significantly outperforms loopy belief propagation in solving the image jigsaw puzzle problem. In particular, our inference quadruples the accuracy of the puzzle assembly compared to that of loopy belief propagation. PMID:26052182

  1. Residuals-Based Subgraph Detection with Cue Vertices

    DTIC Science & Technology

    2015-11-30

    Workshop, 2012, pp. 129–132. [5] M. E. J. Newman , “Finding community structure in networks using the eigenvectors of matrices,” Phys. Rev. E, vol. 74, no...from Data, vol. 1, no. 1, 2007. [7] M. W. Mahoney , L. Orecchia, and N. K. Vishnoi, “A spectral algorithm for improving graph partitions,” CoRR, vol. abs

  2. A tool for filtering information in complex systems

    NASA Astrophysics Data System (ADS)

    Tumminello, M.; Aste, T.; Di Matteo, T.; Mantegna, R. N.

    2005-07-01

    We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties. This paper was submitted directly (Track II) to the PNAS office.Abbreviations: MST, minimum spanning tree; PMFG, Planar Maximally Filtered Graph; r-clique, clique of r elements.

  3. The braingraph.org database of high resolution structural connectomes and the brain graph tools.

    PubMed

    Kerepesi, Csaba; Szalkai, Balázs; Varga, Bálint; Grolmusz, Vince

    2017-10-01

    Based on the data of the NIH-funded Human Connectome Project, we have computed structural connectomes of 426 human subjects in five different resolutions of 83, 129, 234, 463 and 1015 nodes and several edge weights. The graphs are given in anatomically annotated GraphML format that facilitates better further processing and visualization. For 96 subjects, the anatomically classified sub-graphs can also be accessed, formed from the vertices corresponding to distinct lobes or even smaller regions of interests of the brain. For example, one can easily download and study the connectomes, restricted to the frontal lobes or just to the left precuneus of 96 subjects using the data. Partially directed connectomes of 423 subjects are also available for download. We also present a GitHub-deposited set of tools, called the Brain Graph Tools, for several processing tasks of the connectomes on the site http://braingraph.org.

  4. An impatient evolutionary algorithm with probabilistic tabu search for unified solution of some NP-hard problems in graph and set theory via clique finding.

    PubMed

    Guturu, Parthasarathy; Dantu, Ram

    2008-06-01

    Many graph- and set-theoretic problems, because of their tremendous application potential and theoretical appeal, have been well investigated by the researchers in complexity theory and were found to be NP-hard. Since the combinatorial complexity of these problems does not permit exhaustive searches for optimal solutions, only near-optimal solutions can be explored using either various problem-specific heuristic strategies or metaheuristic global-optimization methods, such as simulated annealing, genetic algorithms, etc. In this paper, we propose a unified evolutionary algorithm (EA) to the problems of maximum clique finding, maximum independent set, minimum vertex cover, subgraph and double subgraph isomorphism, set packing, set partitioning, and set cover. In the proposed approach, we first map these problems onto the maximum clique-finding problem (MCP), which is later solved using an evolutionary strategy. The proposed impatient EA with probabilistic tabu search (IEA-PTS) for the MCP integrates the best features of earlier successful approaches with a number of new heuristics that we developed to yield a performance that advances the state of the art in EAs for the exploration of the maximum cliques in a graph. Results of experimentation with the 37 DIMACS benchmark graphs and comparative analyses with six state-of-the-art algorithms, including two from the smaller EA community and four from the larger metaheuristics community, indicate that the IEA-PTS outperforms the EAs with respect to a Pareto-lexicographic ranking criterion and offers competitive performance on some graph instances when individually compared to the other heuristic algorithms. It has also successfully set a new benchmark on one graph instance. On another benchmark suite called Benchmarks with Hidden Optimal Solutions, IEA-PTS ranks second, after a very recent algorithm called COVER, among its peers that have experimented with this suite.

  5. Path scanning for the detection of anomalous subgraphs and use of DNS requests and host agents for anomaly/change detection and network situational awareness

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Neil, Joshua Charles; Fisk, Michael Edward; Brugh, Alexander William

    A system, apparatus, computer-readable medium, and computer-implemented method are provided for detecting anomalous behavior in a network. Historical parameters of the network are determined in order to determine normal activity levels. A plurality of paths in the network are enumerated as part of a graph representing the network, where each computing system in the network may be a node in the graph and the sequence of connections between two computing systems may be a directed edge in the graph. A statistical model is applied to the plurality of paths in the graph on a sliding window basis to detect anomalousmore » behavior. Data collected by a Unified Host Collection Agent ("UHCA") may also be used to detect anomalous behavior.« less

  6. Exhibition of Monogamy Relations between Entropic Non-contextuality Inequalities

    NASA Astrophysics Data System (ADS)

    Zhu, Feng; Zhang, Wei; Huang, Yi-Dong

    2017-06-01

    We exhibit the monogamy relation between two entropic non-contextuality inequalities in the scenario where compatible projectors are orthogonal. We show the monogamy relation can be exhibited by decomposing the orthogonality graph into perfect induced subgraphs. Then we find two entropic non-contextuality inequalities are monogamous while the KCBS-type non-contextuality inequalities are not if the orthogonality graphs of the observable sets are two odd cycles with two shared vertices. Supported by 973 Programs of China under Grant Nos. 2011CBA00303 and 2013CB328700, Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList)

  7. The Development of Novel Chemical Fragment-Based Descriptors Using Frequent Common Subgraph Mining Approach and Their Application in QSAR Modeling.

    PubMed

    Khashan, Raed; Zheng, Weifan; Tropsha, Alexander

    2014-03-01

    We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graph. Fast Frequent Subgraph Mining (FFSM) is used to find chemical-fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the model validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Growth and structure of the World Wide Web: Towards realistic modeling

    NASA Astrophysics Data System (ADS)

    Tadić, Bosiljka

    2002-08-01

    We simulate evolution of the World Wide Web from the dynamic rules incorporating growth, bias attachment, and rewiring. We show that the emergent double-hierarchical structure with distinct distributions of out- and in-links is comparable with the observed empirical data when the control parameter (average graph flexibility β) is kept in the range β=3-4. We then explore the Web graph by simulating (a) Web crawling to determine size and depth of connected components, and (b) a random walker that discovers the structure of connected subgraphs with dominant attractor and promoter nodes. A random walker that adapts its move strategy to mimic local node linking preferences is shown to have a short access time to "important" nodes on the Web graph.

  9. A nonlinear merging protocol for consensus in multi-agent systems on signed and weighted graphs

    NASA Astrophysics Data System (ADS)

    Feng, Shasha; Wang, Li; Li, Yijia; Sun, Shiwen; Xia, Chengyi

    2018-01-01

    In this paper, we investigate the multi-agent consensus for networks with undirected graphs which are not connected, especially for the signed graph in which some edge weights are positive and some edges have negative weights, and the negative-weight graph whose edge weights are negative. We propose a novel nonlinear merging consensus protocol to drive the states of all agents to converge to the same state zero which is not dependent upon the initial states of agents. If the undirected graph whose edge weights are positive is connected, then the states of all agents converge to the same state more quickly when compared to most other protocols. While the undirected graph whose edge weights might be positive or negative is unconnected, the states of all agents can still converge to the same state zero under the premise that the undirected graph can be divided into several connected subgraphs with more than one node. Furthermore, we also discuss the impact of parameter r presented in our protocol. Current results can further deepen the understanding of consensus processes for multi-agent systems.

  10. Key-Node-Separated Graph Clustering and Layouts for Human Relationship Graph Visualization.

    PubMed

    Itoh, Takayuki; Klein, Karsten

    2015-01-01

    Many graph-drawing methods apply node-clustering techniques based on the density of edges to find tightly connected subgraphs and then hierarchically visualize the clustered graphs. However, users may want to focus on important nodes and their connections to groups of other nodes for some applications. For this purpose, it is effective to separately visualize the key nodes detected based on adjacency and attributes of the nodes. This article presents a graph visualization technique for attribute-embedded graphs that applies a graph-clustering algorithm that accounts for the combination of connections and attributes. The graph clustering step divides the nodes according to the commonality of connected nodes and similarity of feature value vectors. It then calculates the distances between arbitrary pairs of clusters according to the number of connecting edges and the similarity of feature value vectors and finally places the clusters based on the distances. Consequently, the technique separates important nodes that have connections to multiple large clusters and improves the visibility of such nodes' connections. To test this technique, this article presents examples with human relationship graph datasets, including a coauthorship and Twitter communication network dataset.

  11. K-theory of locally finite graph C∗-algebras

    NASA Astrophysics Data System (ADS)

    Iyudu, Natalia

    2013-09-01

    We calculate the K-theory of the Cuntz-Krieger algebra OE associated with an infinite, locally finite graph, via the Bass-Hashimoto operator. The formulae we get express the Grothendieck group and the Whitehead group in purely graph theoretic terms. We consider the category of finite (black-and-white, bi-directed) subgraphs with certain graph homomorphisms and construct a continuous functor to abelian groups. In this category K0 is an inductive limit of K-groups of finite graphs, which were calculated in Cornelissen et al. (2008) [3]. In the case of an infinite graph with the finite Betti number we obtain the formula for the Grothendieck group K0(OE)=Z, where β(E) is the first Betti number and γ(E) is the valency number of the graph E. We note that in the infinite case the torsion part of K0, which is present in the case of a finite graph, vanishes. The Whitehead group depends only on the first Betti number: K1(OE)=Z. These allow us to provide a counterexample to the fact, which holds for finite graphs, that K1(OE) is the torsion free part of K0(OE).

  12. Multi-A Graph Patrolling and Partitioning

    NASA Astrophysics Data System (ADS)

    Elor, Y.; Bruckstein, A. M.

    2012-12-01

    We introduce a novel multi agent patrolling algorithm inspired by the behavior of gas filled balloons. Very low capability ant-like agents are considered with the task of patrolling an unknown area modeled as a graph. While executing the proposed algorithm, the agents dynamically partition the graph between them using simple local interactions, every agent assuming the responsibility for patrolling his subgraph. Balanced graph partition is an emergent behavior due to the local interactions between the agents in the swarm. Extensive simulations on various graphs (environments) showed that the average time to reach a balanced partition is linear with the graph size. The simulations yielded a convincing argument for conjecturing that if the graph being patrolled contains a balanced partition, the agents will find it. However, we could not prove this. Nevertheless, we have proved that if a balanced partition is reached, the maximum time lag between two successive visits to any vertex using the proposed strategy is at most twice the optimal so the patrol quality is at least half the optimal. In case of weighted graphs the patrol quality is at least (1)/(2){lmin}/{lmax} of the optimal where lmax (lmin) is the longest (shortest) edge in the graph.

  13. Computing the Edge-Neighbour-Scattering Number of Graphs

    NASA Astrophysics Data System (ADS)

    Wei, Zongtian; Qi, Nannan; Yue, Xiaokui

    2013-11-01

    A set of edges X is subverted from a graph G by removing the closed neighbourhood N[X] from G. We denote the survival subgraph by G=X. An edge-subversion strategy X is called an edge-cut strategy of G if G=X is disconnected, a single vertex, or empty. The edge-neighbour-scattering number of a graph G is defined as ENS(G) = max{ω(G/X)-|X| : X is an edge-cut strategy of G}, where w(G=X) is the number of components of G=X. This parameter can be used to measure the vulnerability of networks when some edges are failed, especially spy networks and virus-infected networks. In this paper, we prove that the problem of computing the edge-neighbour-scattering number of a graph is NP-complete and give some upper and lower bounds for this parameter.

  14. Application-Specific Graph Sampling for Frequent Subgraph Mining and Community Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Purohit, Sumit; Choudhury, Sutanay; Holder, Lawrence B.

    Graph mining is an important data analysis methodology, but struggles as the input graph size increases. The scalability and usability challenges posed by such large graphs make it imperative to sample the input graph and reduce its size. The critical challenge in sampling is to identify the appropriate algorithm to insure the resulting analysis does not suffer heavily from the data reduction. Predicting the expected performance degradation for a given graph and sampling algorithm is also useful. In this paper, we present different sampling approaches for graph mining applications such as Frequent Subgrpah Mining (FSM), and Community Detection (CD). Wemore » explore graph metrics such as PageRank, Triangles, and Diversity to sample a graph and conclude that for heterogeneous graphs Triangles and Diversity perform better than degree based metrics. We also present two new sampling variations for targeted graph mining applications. We present empirical results to show that knowledge of the target application, along with input graph properties can be used to select the best sampling algorithm. We also conclude that performance degradation is an abrupt, rather than gradual phenomena, as the sample size decreases. We present the empirical results to show that the performance degradation follows a logistic function.« less

  15. Generalizing Experimental Findings

    DTIC Science & Technology

    2015-06-01

    ity.” In graphical terms, these assumptions may require several d-separation tests on several sub-graphs. It is utterly unimaginable therefore that...Education) (a) (Salary)(Education) (Skill) (b) S ( Test ) YX ZZ YX Figure 1: (a) A transportability model in which a post-treatment variable Z is S-admissible...observational studies to estimate population treatment ef- fects. Journal Royal Statistical Society: Series A (Statistics in Society) Forthcoming, doi

  16. Meraculous2

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2014-06-01

    meraculous2 is a whole genome shotgun assembler for short-reads that is capable of assembling large, polymorphic genomes with modest computational requirements. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (deBruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. Additional features include (1) handling of allelic variation using "bubble" structures within the deBruijn graph, (2) gap closing of repetitive and low quality regions using localized assemblies, and (3) an improved scaffolding algorithm that produces more complete assemblies without compromising onmore » scaffolding accuracy« less

  17. Top-k similar graph matching using TraM in biological networks.

    PubMed

    Amin, Mohammad Shafkat; Finley, Russell L; Jamil, Hasan M

    2012-01-01

    Many emerging database applications entail sophisticated graph-based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact subgraph isomorphism techniques has limited its efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. In this paper, we propose a novel technique called TraM for approximate graph matching that off-loads a significant amount of its processing on to the database making the approach viable for large graphs. Moreover, the vector space embedding of the graphs and efficient filtration of the search space enables computation of approximate graph similarity at a throw-away cost. We annotate nodes of the query graphs by means of their global topological properties and compare them with neighborhood biased segments of the datagraph for proper matches. We have conducted experiments on several real data sets, and have demonstrated the effectiveness and efficiency of the proposed method

  18. Review on Graph Clustering and Subgraph Similarity Based Analysis of Neurological Disorders

    PubMed Central

    Thomas, Jaya; Seo, Dongmin; Sael, Lee

    2016-01-01

    How can complex relationships among molecular or clinico-pathological entities of neurological disorders be represented and analyzed? Graphs seem to be the current answer to the question no matter the type of information: molecular data, brain images or neural signals. We review a wide spectrum of graph representation and graph analysis methods and their application in the study of both the genomic level and the phenotypic level of the neurological disorder. We find numerous research works that create, process and analyze graphs formed from one or a few data types to gain an understanding of specific aspects of the neurological disorders. Furthermore, with the increasing number of data of various types becoming available for neurological disorders, we find that integrative analysis approaches that combine several types of data are being recognized as a way to gain a global understanding of the diseases. Although there are still not many integrative analyses of graphs due to the complexity in analysis, multi-layer graph analysis is a promising framework that can incorporate various data types. We describe and discuss the benefits of the multi-layer graph framework for studies of neurological disease. PMID:27258269

  19. Review on Graph Clustering and Subgraph Similarity Based Analysis of Neurological Disorders.

    PubMed

    Thomas, Jaya; Seo, Dongmin; Sael, Lee

    2016-06-01

    How can complex relationships among molecular or clinico-pathological entities of neurological disorders be represented and analyzed? Graphs seem to be the current answer to the question no matter the type of information: molecular data, brain images or neural signals. We review a wide spectrum of graph representation and graph analysis methods and their application in the study of both the genomic level and the phenotypic level of the neurological disorder. We find numerous research works that create, process and analyze graphs formed from one or a few data types to gain an understanding of specific aspects of the neurological disorders. Furthermore, with the increasing number of data of various types becoming available for neurological disorders, we find that integrative analysis approaches that combine several types of data are being recognized as a way to gain a global understanding of the diseases. Although there are still not many integrative analyses of graphs due to the complexity in analysis, multi-layer graph analysis is a promising framework that can incorporate various data types. We describe and discuss the benefits of the multi-layer graph framework for studies of neurological disease.

  20. Non-isolated Resolving Sets of certain Graphs Cartesian Product with a Path

    NASA Astrophysics Data System (ADS)

    Hasibuan, I. M.; Salman, A. N. M.; Saputro, S. W.

    2018-04-01

    Let G be a connected, simple, and finite graph. For an ordered subset W = {w 1 , w 2 , · · ·, wk } of vertices in a graph G and a vertex v of G, the metric representation of v with respect to W is the k-vector r(v|W ) = (d(v, w 1), d(v, w 2), · · ·, d(v, wk )). The set W is called a resolving set for G if every vertex of G has a distinct representation. The minimum cardinality of W is called the metric dimension of G, denoted by dim(G). If the induced subgraph < W> has no isolated vertices, then W is called a non-isolated resolving set. The minimum cardinality of non-isolated resolving set of G is called the non-isolated resolving number of G, denoted by nr(G). In this paper, we consider H\\square {P}n that is a graph obtained from Cartesian product between a connected graph H and a path Pn . We determine nr(H\\square {P}n), for some classes of H, including cycles, complete graphs, complete bipartite graphs, and friendship graphs.

  1. Network Reliability: The effect of local network structure on diffusive processes

    PubMed Central

    Youssef, Mina; Khorramzadeh, Yasamin; Eubank, Stephen

    2014-01-01

    This paper re-introduces the network reliability polynomial – introduced by Moore and Shannon in 1956 – for studying the effect of network structure on the spread of diseases. We exhibit a representation of the polynomial that is well-suited for estimation by distributed simulation. We describe a collection of graphs derived from Erdős-Rényi and scale-free-like random graphs in which we have manipulated assortativity-by-degree and the number of triangles. We evaluate the network reliability for all these graphs under a reliability rule that is related to the expected size of a connected component. Through these extensive simulations, we show that for positively or neutrally assortative graphs, swapping edges to increase the number of triangles does not increase the network reliability. Also, positively assortative graphs are more reliable than neutral or disassortative graphs with the same number of edges. Moreover, we show the combined effect of both assortativity-by-degree and the presence of triangles on the critical point and the size of the smallest subgraph that is reliable. PMID:24329321

  2. Origin of hyperbolicity in brain-to-brain coordination networks

    NASA Astrophysics Data System (ADS)

    Tadić, Bosiljka; Andjelković, Miroslav; Šuvakov, Milovan

    2018-02-01

    Hyperbolicity or negative curvature of complex networks is the intrinsic geometric proximity of nodes in the graph metric space, which implies an improved network function. Here, we investigate hidden combinatorial geometries in brain-to-brain coordination networks arising through social communications. The networks originate from correlations among EEG signals previously recorded during spoken communications comprising of 14 individuals with 24 speaker-listener pairs. We find that the corresponding networks are delta-hyperbolic with delta_max=1 and the graph diameter D=3 in each brain. While the emergent hyperbolicity in the two-brain networks satisfies delta_max/D/2 < 1 and can be attributed to the topology of the subgraph formed around the cross-brains linking channels. We identify these subgraphs in each studied two-brain network and decompose their structure into simple geometric descriptors (triangles, tetrahedra and cliques of higher orders) that contribute to hyperbolicity. Considering topologies that exceed two separate brain networks as a measure of coordination synergy between the brains, we identify different neuronal correlation patterns ranging from weak coordination to super-brain structure. These topology features are in qualitative agreement with the listener’s self-reported ratings of own experience and quality of the speaker, suggesting that studies of the cross-brain connector networks can reveal new insight into the neural mechanisms underlying human social behavior.

  3. A graph edit dictionary for correcting errors in roof topology graphs reconstructed from point clouds

    NASA Astrophysics Data System (ADS)

    Xiong, B.; Oude Elberink, S.; Vosselman, G.

    2014-07-01

    In the task of 3D building model reconstruction from point clouds we face the problem of recovering a roof topology graph in the presence of noise, small roof faces and low point densities. Errors in roof topology graphs will seriously affect the final modelling results. The aim of this research is to automatically correct these errors. We define the graph correction as a graph-to-graph problem, similar to the spelling correction problem (also called the string-to-string problem). The graph correction is more complex than string correction, as the graphs are 2D while strings are only 1D. We design a strategy based on a dictionary of graph edit operations to automatically identify and correct the errors in the input graph. For each type of error the graph edit dictionary stores a representative erroneous subgraph as well as the corrected version. As an erroneous roof topology graph may contain several errors, a heuristic search is applied to find the optimum sequence of graph edits to correct the errors one by one. The graph edit dictionary can be expanded to include entries needed to cope with errors that were previously not encountered. Experiments show that the dictionary with only fifteen entries already properly corrects one quarter of erroneous graphs in about 4500 buildings, and even half of the erroneous graphs in one test area, achieving as high as a 95% acceptance rate of the reconstructed models.

  4. Assembly planning based on subassembly extraction

    NASA Technical Reports Server (NTRS)

    Lee, Sukhan; Shin, Yeong Gil

    1990-01-01

    A method is presented for the automatic determination of assembly partial orders from a liaison graph representation of an assembly through the extraction of preferred subassemblies. In particular, the authors show how to select a set of tentative subassemblies by decomposing a liaison graph into a set of subgraphs based on feasibility and difficulty of disassembly, how to evaluate each of the tentative subassemblies in terms of assembly cost using the subassembly selection indices, and how to construct a hierarchical partial order graph (HPOG) as an assembly plan. The method provides an approach to assembly planning by identifying spatial parallelism in assembly as a means of constructing temporal relationships among assembly operations and solves the problem of finding a cost-effective assembly plan in a flexible environment. A case study of the assembly planning of a mechanical assembly is presented.

  5. New graph polynomials in parametric QED Feynman integrals

    NASA Astrophysics Data System (ADS)

    Golz, Marcel

    2017-10-01

    In recent years enormous progress has been made in perturbative quantum field theory by applying methods of algebraic geometry to parametric Feynman integrals for scalar theories. The transition to gauge theories is complicated not only by the fact that their parametric integrand is much larger and more involved. It is, moreover, only implicitly given as the result of certain differential operators applied to the scalar integrand exp(-ΦΓ /ΨΓ) , where ΨΓ and ΦΓ are the Kirchhoff and Symanzik polynomials of the Feynman graph Γ. In the case of quantum electrodynamics we find that the full parametric integrand inherits a rich combinatorial structure from ΨΓ and ΦΓ. In the end, it can be expressed explicitly as a sum over products of new types of graph polynomials which have a combinatoric interpretation via simple cycle subgraphs of Γ.

  6. Collaborative mining of graph patterns from multiple sources

    NASA Astrophysics Data System (ADS)

    Levchuk, Georgiy; Colonna-Romanoa, John

    2016-05-01

    Intelligence analysts require automated tools to mine multi-source data, including answering queries, learning patterns of life, and discovering malicious or anomalous activities. Graph mining algorithms have recently attracted significant attention in intelligence community, because the text-derived knowledge can be efficiently represented as graphs of entities and relationships. However, graph mining models are limited to use-cases involving collocated data, and often make restrictive assumptions about the types of patterns that need to be discovered, the relationships between individual sources, and availability of accurate data segmentation. In this paper we present a model to learn the graph patterns from multiple relational data sources, when each source might have only a fragment (or subgraph) of the knowledge that needs to be discovered, and segmentation of data into training or testing instances is not available. Our model is based on distributed collaborative graph learning, and is effective in situations when the data is kept locally and cannot be moved to a centralized location. Our experiments show that proposed collaborative learning achieves learning quality better than aggregated centralized graph learning, and has learning time comparable to traditional distributed learning in which a knowledge of data segmentation is needed.

  7. Complexity and approximability for a problem of intersecting of proximity graphs with minimum number of equal disks

    NASA Astrophysics Data System (ADS)

    Kobylkin, Konstantin

    2016-10-01

    Computational complexity and approximability are studied for the problem of intersecting of a set of straight line segments with the smallest cardinality set of disks of fixed radii r > 0 where the set of segments forms straight line embedding of possibly non-planar geometric graph. This problem arises in physical network security analysis for telecommunication, wireless and road networks represented by specific geometric graphs defined by Euclidean distances between their vertices (proximity graphs). It can be formulated in a form of known Hitting Set problem over a set of Euclidean r-neighbourhoods of segments. Being of interest computational complexity and approximability of Hitting Set over so structured sets of geometric objects did not get much focus in the literature. Strong NP-hardness of the problem is reported over special classes of proximity graphs namely of Delaunay triangulations, some of their connected subgraphs, half-θ6 graphs and non-planar unit disk graphs as well as APX-hardness is given for non-planar geometric graphs at different scales of r with respect to the longest graph edge length. Simple constant factor approximation algorithm is presented for the case where r is at the same scale as the longest edge length.

  8. Locating influential nodes in complex networks

    PubMed Central

    Malliaros, Fragkiskos D.; Rossi, Maria-Evgenia G.; Vazirgiannis, Michalis

    2016-01-01

    Understanding and controlling spreading processes in networks is an important topic with many diverse applications, including information dissemination, disease propagation and viral marketing. It is of crucial importance to identify which entities act as influential spreaders that can propagate information to a large portion of the network, in order to ensure efficient information diffusion, optimize available resources or even control the spreading. In this work, we capitalize on the properties of the K-truss decomposition, a triangle-based extension of the core decomposition of graphs, to locate individual influential nodes. Our analysis on real networks indicates that the nodes belonging to the maximal K-truss subgraph show better spreading behavior compared to previously used importance criteria, including node degree and k-core index, leading to faster and wider epidemic spreading. We further show that nodes belonging to such dense subgraphs, dominate the small set of nodes that achieve the optimal spreading in the network. PMID:26776455

  9. Distributed Sensing and Processing Adaptive Collaboration Environment (D-SPACE)

    DTIC Science & Technology

    2014-07-01

    to the query graph, or subgraph permutations with the same mismatch cost (often the case for homogeneous and/or symmetrical data/query). To avoid...decisions are generated in a bottom-up manner using the metric of entropy at the cluster level (Figure 9c). Using the definition of belief messages...for a cluster and a set of data nodes in this cluster , we compute the entropy for forward and backward messages as (,) = −∑ (

  10. Parameterized Complexity Results for General Factors in Bipartite Graphs with an Application to Constraint Programming

    NASA Astrophysics Data System (ADS)

    Gutin, Gregory; Kim, Eun Jung; Soleimanfallah, Arezou; Szeider, Stefan; Yeo, Anders

    The NP-hard general factor problem asks, given a graph and for each vertex a list of integers, whether the graph has a spanning subgraph where each vertex has a degree that belongs to its assigned list. The problem remains NP-hard even if the given graph is bipartite with partition U ⊎ V, and each vertex in U is assigned the list {1}; this subproblem appears in the context of constraint programming as the consistency problem for the extended global cardinality constraint. We show that this subproblem is fixed-parameter tractable when parameterized by the size of the second partite set V. More generally, we show that the general factor problem for bipartite graphs, parameterized by |V |, is fixed-parameter tractable as long as all vertices in U are assigned lists of length 1, but becomes W[1]-hard if vertices in U are assigned lists of length at most 2. We establish fixed-parameter tractability by reducing the problem instance to a bounded number of acyclic instances, each of which can be solved in polynomial time by dynamic programming.

  11. Mining integrated semantic networks for drug repositioning opportunities

    PubMed Central

    Mullen, Joseph; Tipney, Hannah

    2016-01-01

    Current research and development approaches to drug discovery have become less fruitful and more costly. One alternative paradigm is that of drug repositioning. Many marketed examples of repositioned drugs have been identified through serendipitous or rational observations, highlighting the need for more systematic methodologies to tackle the problem. Systems level approaches have the potential to enable the development of novel methods to understand the action of therapeutic compounds, but requires an integrative approach to biological data. Integrated networks can facilitate systems level analyses by combining multiple sources of evidence to provide a rich description of drugs, their targets and their interactions. Classically, such networks can be mined manually where a skilled person is able to identify portions of the graph (semantic subgraphs) that are indicative of relationships between drugs and highlight possible repositioning opportunities. However, this approach is not scalable. Automated approaches are required to systematically mine integrated networks for these subgraphs and bring them to the attention of the user. We introduce a formal framework for the definition of integrated networks and their associated semantic subgraphs for drug interaction analysis and describe DReSMin, an algorithm for mining semantically-rich networks for occurrences of a given semantic subgraph. This algorithm allows instances of complex semantic subgraphs that contain data about putative drug repositioning opportunities to be identified in a computationally tractable fashion, scaling close to linearly with network data. We demonstrate the utility of our approach by mining an integrated drug interaction network built from 11 sources. This work identified and ranked 9,643,061 putative drug-target interactions, showing a strong correlation between highly scored associations and those supported by literature. We discuss the 20 top ranked associations in more detail, of which 14 are novel and 6 are supported by the literature. We also show that our approach better prioritizes known drug-target interactions, than other state-of-the art approaches for predicting such interactions. PMID:26844016

  12. Partitioning sparse matrices with eigenvectors of graphs

    NASA Technical Reports Server (NTRS)

    Pothen, Alex; Simon, Horst D.; Liou, Kang-Pu

    1990-01-01

    The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorithms for computing separators. Finally, the time required to compute the Laplacian eigenvector is reported, and the accuracy with which the eigenvector must be computed to obtain good separators is considered. The spectral algorithm has the advantage that it can be implemented on a medium-size multiprocessor in a straightforward manner.

  13. Topics on data transmission problem in software definition network

    NASA Astrophysics Data System (ADS)

    Gao, Wei; Liang, Li; Xu, Tianwei; Gan, Jianhou

    2017-08-01

    In normal computer networks, the data transmission between two sites go through the shortest path between two corresponding vertices. However, in the setting of software definition network (SDN), it should monitor the network traffic flow in each site and channel timely, and the data transmission path between two sites in SDN should consider the congestion in current networks. Hence, the difference of available data transmission theory between normal computer network and software definition network is that we should consider the prohibit graph structures in SDN, and these forbidden subgraphs represent the sites and channels in which data can't be passed by the serious congestion. Inspired by theoretical analysis of an available data transmission in SDN, we consider some computational problems from the perspective of the graph theory. Several results determined in the paper imply the sufficient conditions of data transmission in SDN in the various graph settings.

  14. Scalable Faceted Ranking in Tagging Systems

    NASA Astrophysics Data System (ADS)

    Orlicki, José I.; Alvarez-Hamelin, J. Ignacio; Fierens, Pablo I.

    Nowadays, web collaborative tagging systems which allow users to upload, comment on and recommend contents, are growing. Such systems can be represented as graphs where nodes correspond to users and tagged-links to recommendations. In this paper we analyze the problem of computing a ranking of users with respect to a facet described as a set of tags. A straightforward solution is to compute a PageRank-like algorithm on a facet-related graph, but it is not feasible for online computation. We propose an alternative: (i) a ranking for each tag is computed offline on the basis of tag-related subgraphs; (ii) a faceted order is generated online by merging rankings corresponding to all the tags in the facet. Based on the graph analysis of YouTube and Flickr, we show that step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their results with two gold standards.

  15. The Challenge of Characterizing Branching in Molecular Species.

    DTIC Science & Technology

    1986-07-16

    representing respectively paths of lengths two and three. Strictly speaking, a septuple rather than a pair should have been used to account for all the paths...same counts, are of fundmental importance in the study of isospectral graphs. These facts were exploited by the latter workers to establish a 1-1...case of the Hosoya index, Z(G), a composition principle was given [38] from which it was apparent that Z(G) depends on certain subgraphs of C for

  16. What Causal Forces Shape Internet Connectivity at the AS-level?

    DTIC Science & Technology

    2003-01-01

    business “peering relationship .” By focusing on the AS subgraph ASPC whose links repre- sent provider- customer relationships , we present an empirical... customer relationships may be determined in the actual Internet, we develop a new optimization-driven model for Internet growth at the ASPC level...among ASs. Two ASs are connected in an AS graph by a link only if they have a “peering relationship ” between them, e.g., provider- customer or peer-to

  17. Interactions of information transfer along separable causal paths

    NASA Astrophysics Data System (ADS)

    Jiang, Peishi; Kumar, Praveen

    2018-04-01

    Complex systems arise as a result of interdependences between multiple variables, whose causal interactions can be visualized in a time-series graph. Transfer entropy and information partitioning approaches have been used to characterize such dependences. However, these approaches capture net information transfer occurring through a multitude of pathways involved in the interaction and as a result mask our ability to discern the causal interaction within a subgraph of interest through specific pathways. We build on recent developments of momentary information transfer along causal paths proposed by Runge [Phys. Rev. E 92, 062829 (2015), 10.1103/PhysRevE.92.062829] to develop a framework for quantifying information partitioning along separable causal paths. Momentary information transfer along causal paths captures the amount of information transfer between any two variables lagged at two specific points in time. Our approach expands this concept to characterize the causal interaction in terms of synergistic, unique, and redundant information transfer through separable causal paths. Through a graphical model, we analyze the impact of the separable and nonseparable causal paths and the causality structure embedded in the graph as well as the noise effect on information partitioning by using synthetic data generated from two coupled logistic equation models. Our approach can provide a valuable reference for an autonomous information partitioning along separable causal paths which form a causal subgraph influencing a target.

  18. Classification of Domain Movements in Proteins Using Dynamic Contact Graphs

    PubMed Central

    Taylor, Daniel; Cawley, Gavin; Hayward, Steven

    2013-01-01

    A new method for the classification of domain movements in proteins is described and applied to 1822 pairs of structures from the Protein Data Bank that represent a domain movement in two-domain proteins. The method is based on changes in contacts between residues from the two domains in moving from one conformation to the other. We argue that there are five types of elemental contact changes and that these relate to five model domain movements called: “free”, “open-closed”, “anchored”, “sliding-twist”, and “see-saw.” A directed graph is introduced called the “Dynamic Contact Graph” which represents the contact changes in a domain movement. In many cases a graph, or part of a graph, provides a clear visual metaphor for the movement it represents and is a motif that can be easily recognised. The Dynamic Contact Graphs are often comprised of disconnected subgraphs indicating independent regions which may play different roles in the domain movement. The Dynamic Contact Graph for each domain movement is decomposed into elemental Dynamic Contact Graphs, those that represent elemental contact changes, allowing us to count the number of instances of each type of elemental contact change in the domain movement. This naturally leads to sixteen classes into which the 1822 domain movements are classified. PMID:24260562

  19. Modular biological function is most effectively captured by combining molecular interaction data types.

    PubMed

    Ames, Ryan M; Macpherson, Jamie I; Pinney, John W; Lovell, Simon C; Robertson, David L

    2013-01-01

    Large-scale molecular interaction data sets have the potential to provide a comprehensive, system-wide understanding of biological function. Although individual molecules can be promiscuous in terms of their contribution to function, molecular functions emerge from the specific interactions of molecules giving rise to modular organisation. As functions often derive from a range of mechanisms, we demonstrate that they are best studied using networks derived from different sources. Implementing a graph partitioning algorithm we identify subnetworks in yeast protein-protein interaction (PPI), genetic interaction and gene co-regulation networks. Among these subnetworks we identify cohesive subgraphs that we expect to represent functional modules in the different data types. We demonstrate significant overlap between the subgraphs generated from the different data types and show these overlaps can represent related functions as represented by the Gene Ontology (GO). Next, we investigate the correspondence between our subgraphs and the Gene Ontology. This revealed varying degrees of coverage of the biological process, molecular function and cellular component ontologies, dependent on the data type. For example, subgraphs from the PPI show enrichment for 84%, 58% and 93% of annotated GO terms, respectively. Integrating the interaction data into a combined network increases the coverage of GO. Furthermore, the different annotation types of GO are not predominantly associated with one of the interaction data types. Collectively our results demonstrate that successful capture of functional relationships by network data depends on both the specific biological function being characterised and the type of network data being used. We identify functions that require integrated information to be accurately represented, demonstrating the limitations of individual data types. Combining interaction subnetworks across data types is therefore essential for fully understanding the complex and emergent nature of biological function.

  20. Local Higher-Order Graph Clustering

    PubMed Central

    Yin, Hao; Benson, Austin R.; Leskovec, Jure; Gleich, David F.

    2018-01-01

    Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. These methods are attractive because they enable targeted clustering around a given seed node and are faster than traditional global graph clustering methods because their runtime does not depend on the size of the input graph. However, current local graph partitioning methods are not designed to account for the higher-order structures crucial to the network, nor can they effectively handle directed networks. Here we introduce a new class of local graph clustering methods that address these issues by incorporating higher-order network information captured by small subgraphs, also called network motifs. We develop the Motif-based Approximate Personalized PageRank (MAPPR) algorithm that finds clusters containing a seed node with minimal motif conductance, a generalization of the conductance metric for network motifs. We generalize existing theory to prove the fast running time (independent of the size of the graph) and obtain theoretical guarantees on the cluster quality (in terms of motif conductance). We also develop a theory of node neighborhoods for finding sets that have small motif conductance, and apply these results to the case of finding good seed nodes to use as input to the MAPPR algorithm. Experimental validation on community detection tasks in both synthetic and real-world networks, shows that our new framework MAPPR outperforms the current edge-based personalized PageRank methodology. PMID:29770258

  1. Approximation algorithms for the min-power symmetric connectivity problem

    NASA Astrophysics Data System (ADS)

    Plotnikov, Roman; Erzin, Adil; Mladenovic, Nenad

    2016-10-01

    We consider the NP-hard problem of synthesis of optimal spanning communication subgraph in a given arbitrary simple edge-weighted graph. This problem occurs in the wireless networks while minimizing the total transmission power consumptions. We propose several new heuristics based on the variable neighborhood search metaheuristic for the approximation solution of the problem. We have performed a numerical experiment where all proposed algorithms have been executed on the randomly generated test samples. For these instances, on average, our algorithms outperform the previously known heuristics.

  2. On k-ary n-cubes: Theory and applications

    NASA Technical Reports Server (NTRS)

    Mao, Weizhen; Nicol, David M.

    1994-01-01

    Many parallel processing networks can be viewed as graphs called k-ary n-cubes, whose special cases include rings, hypercubes and toruses. In this paper, combinatorial properties of k-ary n-cubes are explored. In particular, the problem of characterizing the subgraph of a given number of nodes with the maximum edge count is studied. These theoretical results are then used to compute a lower bounding function in branch-and-bound partitioning algorithms and to establish the optimality of some irregular partitions.

  3. DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases

    NASA Astrophysics Data System (ADS)

    Bröcheler, Matthias; Pugliese, Andrea; Subrahmanian, V. S.

    RDF is an increasingly important paradigm for the representation of information on the Web. As RDF databases increase in size to approach tens of millions of triples, and as sophisticated graph matching queries expressible in languages like SPARQL become increasingly important, scalability becomes an issue. To date, there is no graph-based indexing method for RDF data where the index was designed in a way that makes it disk-resident. There is therefore a growing need for indexes that can operate efficiently when the index itself resides on disk. In this paper, we first propose the DOGMA index for fast subgraph matching on disk and then develop a basic algorithm to answer queries over this index. This algorithm is then significantly sped up via an optimized algorithm that uses efficient (but correct) pruning strategies when combined with two different extensions of the index. We have implemented a preliminary system and tested it against four existing RDF database systems developed by others. Our experiments show that our algorithm performs very well compared to these systems, with orders of magnitude improvements for complex graph queries.

  4. Unwinding the hairball graph: Pruning algorithms for weighted complex networks

    NASA Astrophysics Data System (ADS)

    Dianati, Navid

    2016-01-01

    Empirical networks of weighted dyadic relations often contain "noisy" edges that alter the global characteristics of the network and obfuscate the most important structures therein. Graph pruning is the process of identifying the most significant edges according to a generative null model and extracting the subgraph consisting of those edges. Here, we focus on integer-weighted graphs commonly arising when weights count the occurrences of an "event" relating the nodes. We introduce a simple and intuitive null model related to the configuration model of network generation and derive two significance filters from it: the marginal likelihood filter (MLF) and the global likelihood filter (GLF). The former is a fast algorithm assigning a significance score to each edge based on the marginal distribution of edge weights, whereas the latter is an ensemble approach which takes into account the correlations among edges. We apply these filters to the network of air traffic volume between US airports and recover a geographically faithful representation of the graph. Furthermore, compared with thresholding based on edge weight, we show that our filters extract a larger and significantly sparser giant component.

  5. Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong

    The graph analysis is now considered as a promising technique to discover useful knowledge in data with a new perspective. We envi- sion that there are two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM) where each respectively focuses on subgraph pattern matching and automatic knowledge discovery in graph. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis which covers both OLGAP and GM in a single system is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existingmore » graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which are originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach allows a wide range of available RDF data sets directly applicable for holistic graph analysis within a system. For validation of our approach, we evaluate performance of our implementations with nine real-world datasets and three different computing environments - a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimen- tal results show that our implementation can provide promising and scalable performance for real world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.« less

  6. Enabling Graph Mining in RDF Triplestores using SPARQL for Holistic In-situ Graph Analysis

    DOE PAGES

    Lee, Sangkeun; Sukumar, Sreenivas R; Hong, Seokyong; ...

    2016-01-01

    The graph analysis is now considered as a promising technique to discover useful knowledge in data with a new perspective. We envi- sion that there are two dimensions of graph analysis: OnLine Graph Analytic Processing (OLGAP) and Graph Mining (GM) where each respectively focuses on subgraph pattern matching and automatic knowledge discovery in graph. Moreover, as these two dimensions aim to complementarily solve complex problems, holistic in-situ graph analysis which covers both OLGAP and GM in a single system is critical for minimizing the burdens of operating multiple graph systems and transferring intermediate result-sets between those systems. Nevertheless, most existingmore » graph analysis systems are only capable of one dimension of graph analysis. In this work, we take an approach to enabling GM capabilities (e.g., PageRank, connected-component analysis, node eccentricity, etc.) in RDF triplestores, which are originally developed to store RDF datasets and provide OLGAP capability. More specifically, to achieve our goal, we implemented six representative graph mining algorithms using SPARQL. The approach allows a wide range of available RDF data sets directly applicable for holistic graph analysis within a system. For validation of our approach, we evaluate performance of our implementations with nine real-world datasets and three different computing environments - a laptop computer, an Amazon EC2 instance, and a shared-memory Cray XMT2 URIKA-GD graph-processing appliance. The experimen- tal results show that our implementation can provide promising and scalable performance for real world graph analysis in all tested environments. The developed software is publicly available in an open-source project that we initiated.« less

  7. Acceleration of Binding Site Comparisons by Graph Partitioning.

    PubMed

    Krotzky, Timo; Klebe, Gerhard

    2015-08-01

    The comparison of protein binding sites is a prominent task in computational chemistry and has been studied in many different ways. For the automatic detection and comparison of putative binding cavities the Cavbase system has been developed which uses a coarse-grained set of pseudocenters to represent the physicochemical properties of a binding site and employs a graph-based procedure to calculate similarities between two binding sites. However, the comparison of two graphs is computationally quite demanding which makes large-scale studies such as the rapid screening of entire databases hardly feasible. In a recent work, we proposed the method Local Cliques (LC) for the efficient comparison of Cavbase binding sites. It employs a clique heuristic to detect the maximum common subgraph of two binding sites and an extended graph model to additionally compare the shape of individual surface patches. In this study, we present an alternative to further accelerate the LC method by partitioning the binding-site graphs into disjoint components prior to their comparisons. The pseudocenter sets are split with regard to their assigned phyiscochemical type, which leads to seven much smaller graphs than the original one. Applying this approach on the same test scenarios as in the former comprehensive way results in a significant speed-up without sacrificing accuracy. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  8. Evaluation of Graph Pattern Matching Workloads in Graph Analysis Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hong, Seokyong; Lee, Sangkeun; Lim, Seung-Hwan

    2016-01-01

    Graph analysis has emerged as a powerful method for data scientists to represent, integrate, query, and explore heterogeneous data sources. As a result, graph data management and mining became a popular area of research, and led to the development of plethora of systems in recent years. Unfortunately, the number of emerging graph analysis systems and the wide range of applications, coupled with a lack of apples-to-apples comparisons, make it difficult to understand the trade-offs between different systems and the graph operations for which they are designed. A fair comparison of these systems is a challenging task for the following reasons:more » multiple data models, non-standardized serialization formats, various query interfaces to users, and diverse environments they operate in. To address these key challenges, in this paper we present a new benchmark suite by extending the Lehigh University Benchmark (LUBM) to cover the most common capabilities of various graph analysis systems. We provide the design process of the benchmark, which generalizes the workflow for data scientists to conduct the desired graph analysis on different graph analysis systems. Equipped with this extended benchmark suite, we present performance comparison for nine subgraph pattern retrieval operations over six graph analysis systems, namely NetworkX, Neo4j, Jena, Titan, GraphX, and uRiKA. Through the proposed benchmark suite, this study reveals both quantitative and qualitative findings in (1) implications in loading data into each system; (2) challenges in describing graph patterns for each query interface; and (3) different sensitivity of each system to query selectivity. We envision that this study will pave the road for: (i) data scientists to select the suitable graph analysis systems, and (ii) data management system designers to advance graph analysis systems.« less

  9. Reproducibility of graph metrics of human brain structural networks.

    PubMed

    Duda, Jeffrey T; Cook, Philip A; Gee, James C

    2014-01-01

    Recent interest in human brain connectivity has led to the application of graph theoretical analysis to human brain structural networks, in particular white matter connectivity inferred from diffusion imaging and fiber tractography. While these methods have been used to study a variety of patient populations, there has been less examination of the reproducibility of these methods. A number of tractography algorithms exist and many of these are known to be sensitive to user-selected parameters. The methods used to derive a connectivity matrix from fiber tractography output may also influence the resulting graph metrics. Here we examine how these algorithm and parameter choices influence the reproducibility of proposed graph metrics on a publicly available test-retest dataset consisting of 21 healthy adults. The dice coefficient is used to examine topological similarity of constant density subgraphs both within and between subjects. Seven graph metrics are examined here: mean clustering coefficient, characteristic path length, largest connected component size, assortativity, global efficiency, local efficiency, and rich club coefficient. The reproducibility of these network summary measures is examined using the intraclass correlation coefficient (ICC). Graph curves are created by treating the graph metrics as functions of a parameter such as graph density. Functional data analysis techniques are used to examine differences in graph measures that result from the choice of fiber tracking algorithm. The graph metrics consistently showed good levels of reproducibility as measured with ICC, with the exception of some instability at low graph density levels. The global and local efficiency measures were the most robust to the choice of fiber tracking algorithm.

  10. NIBBS-search for fast and accurate prediction of phenotype-biased metabolic systems.

    PubMed

    Schmidt, Matthew C; Rocha, Andrea M; Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R; Samatova, Nagiza F

    2012-01-01

    Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edges labels are the enzymes that catalyze the reaction. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm ( MBS-Enum ). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS.

  11. NIBBS-Search for Fast and Accurate Prediction of Phenotype-Biased Metabolic Systems

    PubMed Central

    Padmanabhan, Kanchana; Shpanskaya, Yekaterina; Banfield, Jill; Scott, Kathleen; Mihelcic, James R.; Samatova, Nagiza F.

    2012-01-01

    Understanding of genotype-phenotype associations is important not only for furthering our knowledge on internal cellular processes, but also essential for providing the foundation necessary for genetic engineering of microorganisms for industrial use (e.g., production of bioenergy or biofuels). However, genotype-phenotype associations alone do not provide enough information to alter an organism's genome to either suppress or exhibit a phenotype. It is important to look at the phenotype-related genes in the context of the genome-scale network to understand how the genes interact with other genes in the organism. Identification of metabolic subsystems involved in the expression of the phenotype is one way of placing the phenotype-related genes in the context of the entire network. A metabolic system refers to a metabolic network subgraph; nodes are compounds and edges labels are the enzymes that catalyze the reaction. The metabolic subsystem could be part of a single metabolic pathway or span parts of multiple pathways. Arguably, comparative genome-scale metabolic network analysis is a promising strategy to identify these phenotype-related metabolic subsystems. Network Instance-Based Biased Subgraph Search (NIBBS) is a graph-theoretic method for genome-scale metabolic network comparative analysis that can identify metabolic systems that are statistically biased toward phenotype-expressing organismal networks. We set up experiments with target phenotypes like hydrogen production, TCA expression, and acid-tolerance. We show via extensive literature search that some of the resulting metabolic subsystems are indeed phenotype-related and formulate hypotheses for other systems in terms of their role in phenotype expression. NIBBS is also orders of magnitude faster than MULE, one of the most efficient maximal frequent subgraph mining algorithms that could be adjusted for this problem. Also, the set of phenotype-biased metabolic systems output by NIBBS comes very close to the set of phenotype-biased subgraphs output by an exact maximally-biased subgraph enumeration algorithm ( MBS-Enum ). The code (NIBBS and the module to visualize the identified subsystems) is available at http://freescience.org/cs/NIBBS. PMID:22589706

  12. Unapparent Information Revelation: Text Mining for Counterterrorism

    NASA Astrophysics Data System (ADS)

    Srihari, Rohini K.

    Unapparent information revelation (UIR) is a special case of text mining that focuses on detecting possible links between concepts across multiple text documents by generating an evidence trail explaining the connection. A traditional search involving, for example, two or more person names will attempt to find documents mentioning both these individuals. This research focuses on a different interpretation of such a query: what is the best evidence trail across documents that explains a connection between these individuals? For example, all may be good golfers. A generalization of this task involves query terms representing general concepts (e.g. indictment, foreign policy). Previous approaches to this problem have focused on graph mining involving hyperlinked documents, and link analysis exploiting named entities. A new robust framework is presented, based on (i) generating concept chain graphs, a hybrid content representation, (ii) performing graph matching to select candidate subgraphs, and (iii) subsequently using graphical models to validate hypotheses using ranked evidence trails. We adapt the DUC data set for cross-document summarization to evaluate evidence trails generated by this approach

  13. Augmenting computer networks

    NASA Technical Reports Server (NTRS)

    Bokhari, S. H.; Raza, A. D.

    1984-01-01

    Three methods of augmenting computer networks by adding at most one link per processor are discussed: (1) A tree of N nodes may be augmented such that the resulting graph has diameter no greater than 4log sub 2((N+2)/3)-2. Thi O(N(3)) algorithm can be applied to any spanning tree of a connected graph to reduce the diameter of that graph to O(log N); (2) Given a binary tree T and a chain C of N nodes each, C may be augmented to produce C so that T is a subgraph of C. This algorithm is O(N) and may be used to produce augmented chains or rings that have diameter no greater than 2log sub 2((N+2)/3) and are planar; (3) Any rectangular two-dimensional 4 (8) nearest neighbor array of size N = 2(k) may be augmented so that it can emulate a single step shuffle-exchange network of size N/2 in 3(t) time steps.

  14. F-RAG: Generating Atomic Coordinates from RNA Graphs by Fragment Assembly.

    PubMed

    Jain, Swati; Schlick, Tamar

    2017-11-24

    Coarse-grained models represent attractive approaches to analyze and simulate ribonucleic acid (RNA) molecules, for example, for structure prediction and design, as they simplify the RNA structure to reduce the conformational search space. Our structure prediction protocol RAGTOP (RNA-As-Graphs Topology Prediction) represents RNA structures as tree graphs and samples graph topologies to produce candidate graphs. However, for a more detailed study and analysis, construction of atomic from coarse-grained models is required. Here we present our graph-based fragment assembly algorithm (F-RAG) to convert candidate three-dimensional (3D) tree graph models, produced by RAGTOP into atomic structures. We use our related RAG-3D utilities to partition graphs into subgraphs and search for structurally similar atomic fragments in a data set of RNA 3D structures. The fragments are edited and superimposed using common residues, full atomic models are scored using RAGTOP's knowledge-based potential, and geometries of top scoring models is optimized. To evaluate our models, we assess all-atom RMSDs and Interaction Network Fidelity (a measure of residue interactions) with respect to experimentally solved structures and compare our results to other fragment assembly programs. For a set of 50 RNA structures, we obtain atomic models with reasonable geometries and interactions, particularly good for RNAs containing junctions. Additional improvements to our protocol and databases are outlined. These results provide a good foundation for further work on RNA structure prediction and design applications. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Limit of a nonpreferential attachment multitype network model

    NASA Astrophysics Data System (ADS)

    Shang, Yilun

    2017-02-01

    Here, we deal with a model of multitype network with nonpreferential attachment growth. The connection between two nodes depends asymmetrically on their types, reflecting the implication of time order in temporal networks. Based upon graph limit theory, we analytically determined the limit of the network model characterized by a kernel, in the sense that the number of copies of any fixed subgraph converges when network size tends to infinity. The results are confirmed by extensive simulations. Our work thus provides a theoretical framework for quantitatively understanding grown temporal complex networks as a whole.

  16. An effective trust-based recommendation method using a novel graph clustering algorithm

    NASA Astrophysics Data System (ADS)

    Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin

    2015-10-01

    Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.

  17. Scalable Triadic Analysis of Large-Scale Graphs: Multi-Core vs. Multi-Processor vs. Multi-Threaded Shared Memory Architectures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chin, George; Marquez, Andres; Choudhury, Sutanay

    2012-09-01

    Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis ofmore » large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.« less

  18. PLACNETw: a web-based tool for plasmid reconstruction from bacterial genomes.

    PubMed

    Vielva, Luis; de Toro, María; Lanza, Val F; de la Cruz, Fernando

    2017-12-01

    PLACNET is a graph-based tool for reconstruction of plasmids from next generation sequence pair-end datasets. PLACNET graphs contain two types of nodes (assembled contigs and reference genomes) and two types of edges (scaffold links and homology to references). Manual pruning of the graphs is a necessary requirement in PLACNET, but this is difficult for users without solid bioinformatic background. PLACNETw, a webtool based on PLACNET, provides an interactive graphic interface, automates BLAST searches, and extracts the relevant information for decision making. It allows a user with domain expertise to visualize the scaffold graphs and related information of contigs as well as reference sequences, so that the pruning operations can be done interactively from a personal computer without the need for additional tools. After successful pruning, each plasmid becomes a separate connected component subgraph. The resulting data are automatically downloaded by the user. PLACNETw is freely available at https://castillo.dicom.unican.es/upload/. delacruz@unican.es. A tutorial video and several solved examples are available at https://castillo.dicom.unican.es/placnetw_video/ and https://castillo.dicom.unican.es/examples/. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

  19. A Multilevel Gamma-Clustering Layout Algorithm for Visualization of Biological Networks

    PubMed Central

    Hruz, Tomas; Lucas, Christoph; Laule, Oliver; Zimmermann, Philip

    2013-01-01

    Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree which shows the network structure is drawn using a variation of a force-directed algorithm. The algorithm has a potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation which allows keeping the overview over complex graphs with dense subgraphs. PMID:23864855

  20. Extended Graph-Based Models for Enhanced Similarity Search in Cavbase.

    PubMed

    Krotzky, Timo; Fober, Thomas; Hüllermeier, Eyke; Klebe, Gerhard

    2014-01-01

    To calculate similarities between molecular structures, measures based on the maximum common subgraph are frequently applied. For the comparison of protein binding sites, these measures are not fully appropriate since graphs representing binding sites on a detailed atomic level tend to get very large. In combination with an NP-hard problem, a large graph leads to a computationally demanding task. Therefore, for the comparison of binding sites, a less detailed coarse graph model is used building upon so-called pseudocenters. Consistently, a loss of structural data is caused since many atoms are discarded and no information about the shape of the binding site is considered. This is usually resolved by performing subsequent calculations based on additional information. These steps are usually quite expensive, making the whole approach very slow. The main drawback of a graph-based model solely based on pseudocenters, however, is the loss of information about the shape of the protein surface. In this study, we propose a novel and efficient modeling formalism that does not increase the size of the graph model compared to the original approach, but leads to graphs containing considerably more information assigned to the nodes. More specifically, additional descriptors considering surface characteristics are extracted from the local surface and attributed to the pseudocenters stored in Cavbase. These properties are evaluated as additional node labels, which lead to a gain of information and allow for much faster but still very accurate comparisons between different structures.

  1. Topographic Spreading Analysis of an Empirical Sex Workers' Network

    NASA Astrophysics Data System (ADS)

    Bjell, Johannes; Canright, Geoffrey; Engø-Monsen, Kenth; Remple, Valencia P.

    The problem of epidemic spreading over networks has received considerable attention in recent years, due both to its intrinsic intellectual challenge and to its practical importance. A good recent summary of such work may be found in Newman (8), while (9) gives an outstanding example of a non-trivial prediction which is obtained from explicitly modeling the network in the epidemic spreading. In the language of mathematicians and computer scientists, a network of nodes connected by edges is called a graph. Most work on epidemic spreading over networks focuses on whole-graph properties, such as the percentage of infected nodes at long time. Two of us have, in contrast, focused on understanding the spread of an infection over time and space (the network) (61; 63; 62). This work involves decomposing any given network into subgraphs called regions (61). Regions are precisely defined as disjoint subgraphs which may be viewed as coarse-grained units of infection—in that, once one node in a region is infected, the progress of the infection over the remainder of the region is relatively fast and predictable (63). We note that this approach is based on the ‘Susceptible-Infected’ (SI) model of infection, in which nodes, once infected, are never cured. This model is reasonable for some infections, such as HIV—which is one of the diseases studied here. We also study gonorrhea and chlamydia, for which a more appropriate model is Susceptible-Infected-Susceptible (SIS) (67) (since nodes can be cured); we discuss the limitations of our approach for these cases below.

  2. An efficient, large-scale, non-lattice-detection algorithm for exhaustive structural auditing of biomedical ontologies.

    PubMed

    Zhang, Guo-Qiang; Xing, Guangming; Cui, Licong

    2018-04-01

    One of the basic challenges in developing structural methods for systematic audition on the quality of biomedical ontologies is the computational cost usually involved in exhaustive sub-graph analysis. We introduce ANT-LCA, a new algorithm for computing all non-trivial lowest common ancestors (LCA) of each pair of concepts in the hierarchical order induced by an ontology. The computation of LCA is a fundamental step for non-lattice approach for ontology quality assurance. Distinct from existing approaches, ANT-LCA only computes LCAs for non-trivial pairs, those having at least one common ancestor. To skip all trivial pairs that may be of no practical interest, ANT-LCA employs a simple but innovative algorithmic strategy combining topological order and dynamic programming to keep track of non-trivial pairs. We provide correctness proofs and demonstrate a substantial reduction in computational time for two largest biomedical ontologies: SNOMED CT and Gene Ontology (GO). ANT-LCA achieved an average computation time of 30 and 3 sec per version for SNOMED CT and GO, respectively, about 2 orders of magnitude faster than the best known approaches. Our algorithm overcomes a fundamental computational barrier in sub-graph based structural analysis of large ontological systems. It enables the implementation of a new breed of structural auditing methods that not only identifies potential problematic areas, but also automatically suggests changes to fix the issues. Such structural auditing methods can lead to more effective tools supporting ontology quality assurance work. Copyright © 2018 Elsevier Inc. All rights reserved.

  3. A Simulation Study Comparing Epidemic Dynamics on Exponential Random Graph and Edge-Triangle Configuration Type Contact Network Models

    PubMed Central

    Rolls, David A.; Wang, Peng; McBryde, Emma; Pattison, Philippa; Robins, Garry

    2015-01-01

    We compare two broad types of empirically grounded random network models in terms of their abilities to capture both network features and simulated Susceptible-Infected-Recovered (SIR) epidemic dynamics. The types of network models are exponential random graph models (ERGMs) and extensions of the configuration model. We use three kinds of empirical contact networks, chosen to provide both variety and realistic patterns of human contact: a highly clustered network, a bipartite network and a snowball sampled network of a “hidden population”. In the case of the snowball sampled network we present a novel method for fitting an edge-triangle model. In our results, ERGMs consistently capture clustering as well or better than configuration-type models, but the latter models better capture the node degree distribution. Despite the additional computational requirements to fit ERGMs to empirical networks, the use of ERGMs provides only a slight improvement in the ability of the models to recreate epidemic features of the empirical network in simulated SIR epidemics. Generally, SIR epidemic results from using configuration-type models fall between those from a random network model (i.e., an Erdős-Rényi model) and an ERGM. The addition of subgraphs of size four to edge-triangle type models does improve agreement with the empirical network for smaller densities in clustered networks. Additional subgraphs do not make a noticeable difference in our example, although we would expect the ability to model cliques to be helpful for contact networks exhibiting household structure. PMID:26555701

  4. Derivatives in discrete mathematics: a novel graph-theoretical invariant for generating new 2/3D molecular descriptors. I. Theory and QSPR application

    NASA Astrophysics Data System (ADS)

    Marrero-Ponce, Yovani; Santiago, Oscar Martínez; López, Yoan Martínez; Barigye, Stephen J.; Torrens, Francisco

    2012-11-01

    In this report, we present a new mathematical approach for describing chemical structures of organic molecules at atomic-molecular level, proposing for the first time the use of the concept of the derivative ( partial ) of a molecular graph (MG) with respect to a given event ( E), to obtain a new family of molecular descriptors (MDs). With this purpose, a new matrix representation of the MG, which generalizes graph's theory's traditional incidence matrix, is introduced. This matrix, denominated the generalized incidence matrix, Q, arises from the Boolean representation of molecular sub- graphs that participate in the formation of the graph molecular skeleton MG and could be complete (representing all possible connected sub-graphs) or constitute sub-graphs of determined orders or types as well as a combination of these. The Q matrix is a non-quadratic and unsymmetrical in nature, its columns ( n) and rows ( m) are conditions (letters) and collection of conditions (words) with which the event occurs. This non-quadratic and unsymmetrical matrix is transformed, by algebraic manipulation, to a quadratic and symmetric matrix known as relations frequency matrix, F, which characterizes the participation intensity of the conditions (letters) in the events (words). With F, we calculate the derivative over a pair of atomic nuclei. The local index for the atomic nuclei i, Δ i , can therefore be obtained as a linear combination of all the pair derivatives of the atomic nuclei i with all the rest of the j's atomic nuclei. Here, we also define new strategies that generalize the present form of obtaining global or local (group or atom-type) invariants from atomic contributions (local vertex invariants, LOVIs). In respect to this, metric (norms), means and statistical invariants are introduced. These invariants are applied to a vector whose components are the values Δ i for the atomic nuclei of the molecule or its fragments. Moreover, with the purpose of differentiating among different atoms, an atomic weighting scheme (atom-type labels) is used in the formation of the matrix Q or in LOVIs state. The obtained indices were utilized to describe the partition coefficient (Log P) and the reactivity index (Log K) of the 34 derivatives of 2-furylethylenes. In all the cases, our MDs showed better statistical results than those previously obtained using some of the most used families of MDs in chemometric practice. Therefore, it has been demonstrated to that the proposed MDs are useful in molecular design and permit obtaining easier and robust mathematical models than the majority of those reported in the literature. All this range of mentioned possibilities open "the doors" to the creation of a new family of MDs, using the graph derivative, and avail a new tool for QSAR/QSPR and molecular diversity/similarity studies.

  5. Adaptive random walks on the class of Web graphs

    NASA Astrophysics Data System (ADS)

    Tadić, B.

    2001-09-01

    We study random walk with adaptive move strategies on a class of directed graphs with variable wiring diagram. The graphs are grown from the evolution rules compatible with the dynamics of the world-wide Web [B. Tadić, Physica A 293, 273 (2001)], and are characterized by a pair of power-law distributions of out- and in-degree for each value of the parameter β, which measures the degree of rewiring in the graph. The walker adapts its move strategy according to locally available information both on out-degree of the visited node and in-degree of target node. A standard random walk, on the other hand, uses the out-degree only. We compute the distribution of connected subgraphs visited by an ensemble of walkers, the average access time and survival probability of the walks. We discuss these properties of the walk dynamics relative to the changes in the global graph structure when the control parameter β is varied. For β≥ 3, corresponding to the world-wide Web, the access time of the walk to a given level of hierarchy on the graph is much shorter compared to the standard random walk on the same graph. By reducing the amount of rewiring towards rigidity limit β↦βc≲ 0.1, corresponding to the range of naturally occurring biochemical networks, the survival probability of adaptive and standard random walk become increasingly similar. The adaptive random walk can be used as an efficient message-passing algorithm on this class of graphs for large degree of rewiring.

  6. A graph theoretic approach to scene matching

    NASA Technical Reports Server (NTRS)

    Ranganath, Heggere S.; Chipman, Laure J.

    1991-01-01

    The ability to match two scenes is a fundamental requirement in a variety of computer vision tasks. A graph theoretic approach to inexact scene matching is presented which is useful in dealing with problems due to imperfect image segmentation. A scene is described by a set of graphs, with nodes representing objects and arcs representing relationships between objects. Each node has a set of values representing the relations between pairs of objects, such as angle, adjacency, or distance. With this method of scene representation, the task in scene matching is to match two sets of graphs. Because of segmentation errors, variations in camera angle, illumination, and other conditions, an exact match between the sets of observed and stored graphs is usually not possible. In the developed approach, the problem is represented as an association graph, in which each node represents a possible mapping of an observed region to a stored object, and each arc represents the compatibility of two mappings. Nodes and arcs have weights indicating the merit or a region-object mapping and the degree of compatibility between two mappings. A match between the two graphs corresponds to a clique, or fully connected subgraph, in the association graph. The task is to find the clique that represents the best match. Fuzzy relaxation is used to update the node weights using the contextual information contained in the arcs and neighboring nodes. This simplifies the evaluation of cliques. A method of handling oversegmentation and undersegmentation problems is also presented. The approach is tested with a set of realistic images which exhibit many types of sementation errors.

  7. From near to eternity: Spin-glass planting, tiling puzzles, and constraint-satisfaction problems

    NASA Astrophysics Data System (ADS)

    Hamze, Firas; Jacob, Darryl C.; Ochoa, Andrew J.; Perera, Dilina; Wang, Wenlong; Katzgraber, Helmut G.

    2018-04-01

    We present a methodology for generating Ising Hamiltonians of tunable complexity and with a priori known ground states based on a decomposition of the model graph into edge-disjoint subgraphs. The idea is illustrated with a spin-glass model defined on a cubic lattice, where subproblems, whose couplers are restricted to the two values {-1 ,+1 } , are specified on unit cubes and are parametrized by their local degeneracy. The construction is shown to be equivalent to a type of three-dimensional constraint-satisfaction problem known as the tiling puzzle. By varying the proportions of subproblem types, the Hamiltonian can span a dramatic range of typical computational complexity, from fairly easy to many orders of magnitude more difficult than prototypical bimodal and Gaussian spin glasses in three space dimensions. We corroborate this behavior via experiments with different algorithms and discuss generalizations and extensions to different types of graphs.

  8. Quantify spatial relations to discover handwritten graphical symbols

    NASA Astrophysics Data System (ADS)

    Li, Jinpeng; Mouchère, Harold; Viard-Gaudin, Christian

    2012-01-01

    To model a handwritten graphical language, spatial relations describe how the strokes are positioned in the 2-dimensional space. Most of existing handwriting recognition systems make use of some predefined spatial relations. However, considering a complex graphical language, it is hard to express manually all the spatial relations. Another possibility would be to use a clustering technique to discover the spatial relations. In this paper, we discuss how to create a relational graph between strokes (nodes) labeled with graphemes in a graphical language. Then we vectorize spatial relations (edges) for clustering and quantization. As the targeted application, we extract the repetitive sub-graphs (graphical symbols) composed of graphemes and learned spatial relations. On two handwriting databases, a simple mathematical expression database and a complex flowchart database, the unsupervised spatial relations outperform the predefined spatial relations. In addition, we visualize the frequent patterns on two text-lines containing Chinese characters.

  9. A binary-decision-diagram-based two-bit arithmetic logic unit on a GaAs-based regular nanowire network with hexagonal topology.

    PubMed

    Zhao, Hong-Quan; Kasai, Seiya; Shiratori, Yuta; Hashizume, Tamotsu

    2009-06-17

    A two-bit arithmetic logic unit (ALU) was successfully fabricated on a GaAs-based regular nanowire network with hexagonal topology. This fundamental building block of central processing units can be implemented on a regular nanowire network structure with simple circuit architecture based on graphical representation of logic functions using a binary decision diagram and topology control of the graph. The four-instruction ALU was designed by integrating subgraphs representing each instruction, and the circuitry was implemented by transferring the logical graph structure to a GaAs-based nanowire network formed by electron beam lithography and wet chemical etching. A path switching function was implemented in nodes by Schottky wrap gate control of nanowires. The fabricated circuit integrating 32 node devices exhibits the correct output waveforms at room temperature allowing for threshold voltage variation.

  10. G-Hash: Towards Fast Kernel-based Similarity Search in Large Graph Databases.

    PubMed

    Wang, Xiaohong; Smalter, Aaron; Huan, Jun; Lushington, Gerald H

    2009-01-01

    Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search in graph databases has emerged as an important research topic. Graph similarity search has applications in a wide range of domains including cheminformatics, bioinformatics, sensor network management, social network management, and XML documents, among others.Most of the current graph indexing methods focus on subgraph query processing, i.e. determining the set of database graphs that contains the query graph and hence do not directly support similarity search. In data mining and machine learning, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models for supervised learning, graph kernel functions have (i) high computational complexity and (ii) non-trivial difficulty to be indexed in a graph database.Our objective is to bridge graph kernel function and similarity search in graph databases by proposing (i) a novel kernel-based similarity measurement and (ii) an efficient indexing structure for graph data management. Our method of similarity measurement builds upon local features extracted from each node and their neighboring nodes in graphs. A hash table is utilized to support efficient storage and fast search of the extracted local features. Using the hash table, a graph kernel function is defined to capture the intrinsic similarity of graphs and for fast similarity query processing. We have implemented our method, which we have named G-hash, and have demonstrated its utility on large chemical graph databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Most importantly, the new similarity measurement and the index structure is scalable to large database with smaller indexing size, faster indexing construction time, and faster query processing time as compared to state-of-the-art indexing methods such as C-tree, gIndex, and GraphGrep.

  11. Discrete geometric analysis of message passing algorithm on graphs

    NASA Astrophysics Data System (ADS)

    Watanabe, Yusuke

    2010-04-01

    We often encounter probability distributions given as unnormalized products of non-negative functions. The factorization structures are represented by hypergraphs called factor graphs. Such distributions appear in various fields, including statistics, artificial intelligence, statistical physics, error correcting codes, etc. Given such a distribution, computations of marginal distributions and the normalization constant are often required. However, they are computationally intractable because of their computational costs. One successful approximation method is Loopy Belief Propagation (LBP) algorithm. The focus of this thesis is an analysis of the LBP algorithm. If the factor graph is a tree, i.e. having no cycle, the algorithm gives the exact quantities. If the factor graph has cycles, however, the LBP algorithm does not give exact results and possibly exhibits oscillatory and non-convergent behaviors. The thematic question of this thesis is "How the behaviors of the LBP algorithm are affected by the discrete geometry of the factor graph?" The primary contribution of this thesis is the discovery of a formula that establishes the relation between the LBP, the Bethe free energy and the graph zeta function. This formula provides new techniques for analysis of the LBP algorithm, connecting properties of the graph and of the LBP and the Bethe free energy. We demonstrate applications of the techniques to several problems including (non) convexity of the Bethe free energy, the uniqueness and stability of the LBP fixed point. We also discuss the loop series initiated by Chertkov and Chernyak. The loop series is a subgraph expansion of the normalization constant, or partition function, and reflects the graph geometry. We investigate theoretical natures of the series. Moreover, we show a partial connection between the loop series and the graph zeta function.

  12. Uncovering the overlapping community structure of complex networks by maximal cliques

    NASA Astrophysics Data System (ADS)

    Li, Junqiu; Wang, Xingyuan; Cui, Yaozu

    2014-12-01

    In this paper, a unique algorithm is proposed to detect overlapping communities in the un-weighted and weighted networks with considerable accuracy. The maximal cliques, overlapping vertex, bridge vertex and isolated vertex are introduced. First, all the maximal cliques are extracted by the algorithm based on the deep and bread searching. Then two maximal cliques can be merged into a larger sub-graph by some given rules. In addition, the proposed algorithm successfully finds overlapping vertices and bridge vertices between communities. Experimental results using some real-world networks data show that the performance of the proposed algorithm is satisfactory.

  13. The Communications and Networks Collaborative Technology Alliance Publication Network: A Case Study on Graph and Simplicial Complex Analysis

    DTIC Science & Technology

    2015-05-01

    Amer 26 19 Adarshpal S Sethi 24 20 Sunil Samtani 22 21 Alenka G Zajic 22 7 We study 2 important types of subgraphs on the C&N CTA network: 1) the...Maria Striki 0.0841 20 Sunil Samtani at least 4 times the end-of-program mean degree (4.8185). These authors also have at least 7 times the end-of...Lee Tarek N Saadawi 19 2 Mariusz A Fecko Sunil Samtani 18 3 Maitreya Natu Adarshpal S Sethi 17 3 Anthony J McAuley Raquel Morera 17 5 Richard Gopaul

  14. FINAL REPORT (MILESTONE DATE 9/30/11) FOR SUBCONTRACT NO. B594099 NUMERICAL METHODS FOR LARGE-SCALE DATA FACTORIZATION

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    De Sterck, H

    2011-10-18

    The following work has been performed by PI Hans De Sterck and graduate student Manda Winlaw for the required tasks 1-5 (as listed in the Statement of Work). Graduate student Manda Winlaw has visited LLNL January 31-March 11, 2011 and May 23-August 19, 2010, working with Van Henson and Mike O'Hara on non-negative matrix factorizations (NMF). She has investigated the dense subgraph clustering algorithm from 'Finding Dense Subgraphs for Sparse Undirected, Directed, and Bipartite Graphs' by Chen and Saad, testing this method on several term-document matrices and adapting it to cluster based on the rank of the subgraphs instead ofmore » the density. Manda Winlaw was awarded a first prize in the annual LLNL summer student poster competition for a poster on her NMF research. PI Hans De Sterck has developed a new adaptive algebraic multigrid algorithm for computing a few dominant or minimal singular triplets of sparse rectangular matrices. This work builds on adaptive algebraic multigrid methods that were further developed by the PI and collaborators (including Sanders and Henson) for Markov chains. The method also combines and extends existing multigrid algorithms for the symmetric eigenproblem. The PI has visited LLNL February 22-25, 2011, and has given a CASC seminar 'Algebraic Multigrid for the Singular Value Problem' on this work on February 23, 2011. During his visit, he has discussed this work and related topics with Van Henson, Geoffrey Sanders, Panayot Vassilevski, and others. He has tested the algorithm on PDE matrices and on a term-document matrix, with promising initial results. Manda Winlaw has also started to work, with O'Hara, on estimating probability distributions over undirected graph edges. The goal is to estimate probabilistic models from sets of undirected graph edges for the purpose of prediction, anomaly detection and support to supervised learning. Graduate student Manda Winlaw is writing a paper on the results obtained with O'Hara which will be submitted some time later in 2011 to a data mining conference. PI Hans De Sterck has developed a new optimization algorithm for canonical tensor approximation, formulating an extension of the nonlinear GMRES method to optimization problems. Numerical results for tensors with up to 8 modes show that this new method is efficient for sparse and dense tensors. He has written a paper on this which has been submitted to the SIAM Journal on Scientific Computing. PI Hans De Sterck has further developed his new optimization algorithm for canonical tensor approximation, formulating an extension in terms of steepest-descent preconditioning, which makes the approach generally applicable for nonlinear optimization. He has written a paper on this extension which has been submitted to Numerical Linear Algebra with Applications.« less

  15. RAG-3D: A search tool for RNA 3D substructures

    DOE PAGES

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; ...

    2015-08-24

    In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally describedmore » in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.« less

  16. Using Graph Components Derived from an Associative Concept Dictionary to Predict fMRI Neural Activation Patterns that Represent the Meaning of Nouns.

    PubMed

    Akama, Hiroyuki; Miyake, Maki; Jung, Jaeyoung; Murphy, Brian

    2015-01-01

    In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients through the application of linguistic graph information for a neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.

  17. RAG-3D: a search tool for RNA 3D substructures

    PubMed Central

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef; Schlick, Tamar

    2015-01-01

    To address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally described in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding. PMID:26304547

  18. RAG-3D: A search tool for RNA 3D substructures

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zahran, Mai; Sevim Bayrak, Cigdem; Elmetwaly, Shereef

    In this study, to address many challenges in RNA structure/function prediction, the characterization of RNA's modular architectural units is required. Using the RNA-As-Graphs (RAG) database, we have previously explored the existence of secondary structure (2D) submotifs within larger RNA structures. Here we present RAG-3D—a dataset of RNA tertiary (3D) structures and substructures plus a web-based search tool—designed to exploit graph representations of RNAs for the goal of searching for similar 3D structural fragments. The objects in RAG-3D consist of 3D structures translated into 3D graphs, cataloged based on the connectivity between their secondary structure elements. Each graph is additionally describedmore » in terms of its subgraph building blocks. The RAG-3D search tool then compares a query RNA 3D structure to those in the database to obtain structurally similar structures and substructures. This comparison reveals conserved 3D RNA features and thus may suggest functional connections. Though RNA search programs based on similarity in sequence, 2D, and/or 3D structural elements are available, our graph-based search tool may be advantageous for illuminating similarities that are not obvious; using motifs rather than sequence space also reduces search times considerably. Ultimately, such substructuring could be useful for RNA 3D structure prediction, structure/function inference and inverse folding.« less

  19. L-GRAAL: Lagrangian graphlet-based network aligner.

    PubMed

    Malod-Dognin, Noël; Pržulj, Nataša

    2015-07-01

    Discovering and understanding patterns in networks of protein-protein interactions (PPIs) is a central problem in systems biology. Alignments between these networks aid functional understanding as they uncover important information, such as evolutionary conserved pathways, protein complexes and functional orthologs. A few methods have been proposed for global PPI network alignments, but because of NP-completeness of underlying sub-graph isomorphism problem, producing topologically and biologically accurate alignments remains a challenge. We introduce a novel global network alignment tool, Lagrangian GRAphlet-based ALigner (L-GRAAL), which directly optimizes both the protein and the interaction functional conservations, using a novel alignment search heuristic based on integer programming and Lagrangian relaxation. We compare L-GRAAL with the state-of-the-art network aligners on the largest available PPI networks from BioGRID and observe that L-GRAAL uncovers the largest common sub-graphs between the networks, as measured by edge-correctness and symmetric sub-structures scores, which allow transferring more functional information across networks. We assess the biological quality of the protein mappings using the semantic similarity of their Gene Ontology annotations and observe that L-GRAAL best uncovers functionally conserved proteins. Furthermore, we introduce for the first time a measure of the semantic similarity of the mapped interactions and show that L-GRAAL also uncovers best functionally conserved interactions. In addition, we illustrate on the PPI networks of baker's yeast and human the ability of L-GRAAL to predict new PPIs. Finally, L-GRAAL's results are the first to show that topological information is more important than sequence information for uncovering functionally conserved interactions. L-GRAAL is coded in C++. Software is available at: http://bio-nets.doc.ic.ac.uk/L-GRAAL/. n.malod-dognin@imperial.ac.uk Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  20. Towards comprehensive structural motif mining for better fold annotation in the "twilight zone" of sequence dissimilarity

    PubMed Central

    Jia, Yi; Huan, Jun; Buhr, Vincent; Zhang, Jintao; Carayannopoulos, Leonidas N

    2009-01-01

    Background Automatic identification of structure fingerprints from a group of diverse protein structures is challenging, especially for proteins whose divergent amino acid sequences may fall into the "twilight-" or "midnight-" zones where pair-wise sequence identities to known sequences fall below 25% and sequence-based functional annotations often fail. Results Here we report a novel graph database mining method and demonstrate its application to protein structure pattern identification and structure classification. The biologic motivation of our study is to recognize common structure patterns in "immunoevasins", proteins mediating virus evasion of host immune defense. Our experimental study, using both viral and non-viral proteins, demonstrates the efficiency and efficacy of the proposed method. Conclusion We present a theoretic framework, offer a practical software implementation for incorporating prior domain knowledge, such as substitution matrices as studied here, and devise an efficient algorithm to identify approximate matched frequent subgraphs. By doing so, we significantly expanded the analytical power of sophisticated data mining algorithms in dealing with large volume of complicated and noisy protein structure data. And without loss of generality, choice of appropriate compatibility matrices allows our method to be easily employed in domains where subgraph labels have some uncertainty. PMID:19208148

  1. Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text

    PubMed Central

    Xin, Yu; Hochberg, Ephraim; Joshi, Rohit; Uzuner, Ozlem; Szolovits, Peter

    2015-01-01

    Objective Extracting medical knowledge from electronic medical records requires automated approaches to combat scalability limitations and selection biases. However, existing machine learning approaches are often regarded by clinicians as black boxes. Moreover, training data for these automated approaches at often sparsely annotated at best. The authors target unsupervised learning for modeling clinical narrative text, aiming at improving both accuracy and interpretability. Methods The authors introduce a novel framework named subgraph augmented non-negative tensor factorization (SANTF). In addition to relying on atomic features (e.g., words in clinical narrative text), SANTF automatically mines higher-order features (e.g., relations of lymphoid cells expressing antigens) from clinical narrative text by converting sentences into a graph representation and identifying important subgraphs. The authors compose a tensor using patients, higher-order features, and atomic features as its respective modes. We then apply non-negative tensor factorization to cluster patients, and simultaneously identify latent groups of higher-order features that link to patient clusters, as in clinical guidelines where a panel of immunophenotypic features and laboratory results are used to specify diagnostic criteria. Results and Conclusion SANTF demonstrated over 10% improvement in averaged F-measure on patient clustering compared to widely used non-negative matrix factorization (NMF) and k-means clustering methods. Multiple baselines were established by modeling patient data using patient-by-features matrices with different feature configurations and then performing NMF or k-means to cluster patients. Feature analysis identified latent groups of higher-order features that lead to medical insights. We also found that the latent groups of atomic features help to better correlate the latent groups of higher-order features. PMID:25862765

  2. Supercomputer Environments

    DTIC Science & Technology

    1990-01-09

    data structures can easily be presented to the user interface. An emphasis of the Graph Browser was the realization of graph views and graph animation ... animation of the graph. Anima- tion of the graph includes changing node shapes, changing node and arc colors, changing node and arc text, and making...many graphs tend to be tree-like. Animtion of a graph is a useful feature. One of the primary goals of GMB was to support animated graphs. For animation

  3. Subtractive procedure for calculating the anomalous electron magnetic moment in QED and its application for numerical calculation at the three-loop level

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Volkov, S. A., E-mail: volkoff-sergey@mail.ru

    2016-06-15

    A new subtractive procedure for canceling ultraviolet and infrared divergences in the Feynman integrals described here is developed for calculating QED corrections to the electron anomalous magnetic moment. The procedure formulated in the form of a forest expression with linear operators applied to Feynman amplitudes of UV-diverging subgraphs makes it possible to represent the contribution of each Feynman graph containing only electron and photon propagators in the form of a converging integral with respect to Feynman parameters. The application of the developed method for numerical calculation of two- and threeloop contributions is described.

  4. Paving the Way Towards Reactive Planar Spanner Construction in Wireless Networks

    NASA Astrophysics Data System (ADS)

    Frey, Hannes; Rührup, Stefan

    A spanner is a subgraph of a given graph that supports the original graph's shortest path lengths up to a constant factor. Planar spanners and their distributed construction are of particular interest for geographic routing, which is an efficient localized routing scheme for wireless ad hoc and sensor networks. Planarity of the network graph is a key criterion for guaranteed delivery, while the spanner property supports efficiency in terms of path length. We consider the problem of reactive local spanner construction, where a node's local topology is determined on demand. Known message-efficient reactive planarization algorithms do not preserve the spanner property, while reactive spanner constructions with a low message overhead have not been described so far. We introduce the concept of direct planarization which may be an enabler of efficient reactive spanner construction. Given an edge, nodes check for all incident intersecting edges a certain geometric criterion and withdraw the edge if this criterion is not satisfied. We use this concept to derive a generic reactive topology control mechanism and consider two geometric criteria. Simulation results show that direct planarization increases the performance of localized geographic routing by providing shorter paths than existing reactive approaches.

  5. A shortest-path graph kernel for estimating gene product semantic similarity.

    PubMed

    Alvarez, Marco A; Qi, Xiaojun; Yan, Changhui

    2011-07-29

    Existing methods for calculating semantic similarity between gene products using the Gene Ontology (GO) often rely on external resources, which are not part of the ontology. Consequently, changes in these external resources like biased term distribution caused by shifting of hot research topics, will affect the calculation of semantic similarity. One way to avoid this problem is to use semantic methods that are "intrinsic" to the ontology, i.e. independent of external knowledge. We present a shortest-path graph kernel (spgk) method that relies exclusively on the GO and its structure. In spgk, a gene product is represented by an induced subgraph of the GO, which consists of all the GO terms annotating it. Then a shortest-path graph kernel is used to compute the similarity between two graphs. In a comprehensive evaluation using a benchmark dataset, spgk compares favorably with other methods that depend on external resources. Compared with simUI, a method that is also intrinsic to GO, spgk achieves slightly better results on the benchmark dataset. Statistical tests show that the improvement is significant when the resolution and EC similarity correlation coefficient are used to measure the performance, but is insignificant when the Pfam similarity correlation coefficient is used. Spgk uses a graph kernel method in polynomial time to exploit the structure of the GO to calculate semantic similarity between gene products. It provides an alternative to both methods that use external resources and "intrinsic" methods with comparable performance.

  6. Alignment of Tractograms As Graph Matching.

    PubMed

    Olivetti, Emanuele; Sharmin, Nusrat; Avesani, Paolo

    2016-01-01

    The white matter pathways of the brain can be reconstructed as 3D polylines, called streamlines, through the analysis of diffusion magnetic resonance imaging (dMRI) data. The whole set of streamlines is called tractogram and represents the structural connectome of the brain. In multiple applications, like group-analysis, segmentation, or atlasing, tractograms of different subjects need to be aligned. Typically, this is done with registration methods, that transform the tractograms in order to increase their similarity. In contrast with transformation-based registration methods, in this work we propose the concept of tractogram correspondence, whose aim is to find which streamline of one tractogram corresponds to which streamline in another tractogram, i.e., a map from one tractogram to another. As a further contribution, we propose to use the relational information of each streamline, i.e., its distances from the other streamlines in its own tractogram, as the building block to define the optimal correspondence. We provide an operational procedure to find the optimal correspondence through a combinatorial optimization problem and we discuss its similarity to the graph matching problem. In this work, we propose to represent tractograms as graphs and we adopt a recent inexact sub-graph matching algorithm to approximate the solution of the tractogram correspondence problem. On tractograms generated from the Human Connectome Project dataset, we report experimental evidence that tractogram correspondence, implemented as graph matching, provides much better alignment than affine registration and comparable if not better results than non-linear registration of volumes.

  7. Multi-phase simultaneous segmentation of tumor in lung 4D-CT data with context information.

    PubMed

    Shen, Zhengwen; Wang, Huafeng; Xi, Weiwen; Deng, Xiaogang; Chen, Jin; Zhang, Yu

    2017-01-01

    Lung 4D computed tomography (4D-CT) plays an important role in high-precision radiotherapy because it characterizes respiratory motion, which is crucial for accurate target definition. However, the manual segmentation of a lung tumor is a heavy workload for doctors because of the large number of lung 4D-CT data slices. Meanwhile, tumor segmentation is still a notoriously challenging problem in computer-aided diagnosis. In this paper, we propose a new method based on an improved graph cut algorithm with context information constraint to find a convenient and robust approach of lung 4D-CT tumor segmentation. We combine all phases of the lung 4D-CT into a global graph, and construct a global energy function accordingly. The sub-graph is first constructed for each phase. A context cost term is enforced to achieve segmentation results in every phase by adding a context constraint between neighboring phases. A global energy function is finally constructed by combining all cost terms. The optimization is achieved by solving a max-flow/min-cut problem, which leads to simultaneous and robust segmentation of the tumor in all the lung 4D-CT phases. The effectiveness of our approach is validated through experiments on 10 different lung 4D-CT cases. The comparison with the graph cut without context constraint, the level set method and the graph cut with star shape prior demonstrates that the proposed method obtains more accurate and robust segmentation results.

  8. Auditing SNOMED CT hierarchical relations based on lexical features of concepts in non-lattice subgraphs.

    PubMed

    Cui, Licong; Bodenreider, Olivier; Shi, Jay; Zhang, Guo-Qiang

    2018-02-01

    We introduce a structural-lexical approach for auditing SNOMED CT using a combination of non-lattice subgraphs of the underlying hierarchical relations and enriched lexical attributes of fully specified concept names. Our goal is to develop a scalable and effective approach that automatically identifies missing hierarchical IS-A relations. Our approach involves 3 stages. In stage 1, all non-lattice subgraphs of SNOMED CT's IS-A hierarchical relations are extracted. In stage 2, lexical attributes of fully-specified concept names in such non-lattice subgraphs are extracted. For each concept in a non-lattice subgraph, we enrich its set of attributes with attributes from its ancestor concepts within the non-lattice subgraph. In stage 3, subset inclusion relations between the lexical attribute sets of each pair of concepts in each non-lattice subgraph are compared to existing IS-A relations in SNOMED CT. For concept pairs within each non-lattice subgraph, if a subset relation is identified but an IS-A relation is not present in SNOMED CT IS-A transitive closure, then a missing IS-A relation is reported. The September 2017 release of SNOMED CT (US edition) was used in this investigation. A total of 14,380 non-lattice subgraphs were extracted, from which we suggested a total of 41,357 missing IS-A relations. For evaluation purposes, 200 non-lattice subgraphs were randomly selected from 996 smaller subgraphs (of size 4, 5, or 6) within the "Clinical Finding" and "Procedure" sub-hierarchies. Two domain experts confirmed 185 (among 223) suggested missing IS-A relations, a precision of 82.96%. Our results demonstrate that analyzing the lexical features of concepts in non-lattice subgraphs is an effective approach for auditing SNOMED CT. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Game Theory in Fleet Management

    NASA Astrophysics Data System (ADS)

    Dulai, Tibor; Jaskó, Szilárd; Muhi, Dániel

    2008-11-01

    In this survey we attempt to apply the results of cooperative game theory on fleet management problems. We deal with the aspect of a fleet where the members have their own goal, however the fleet has a common purpose too. These goals are to reach all destinations and get back to the center as quickly as possible. If we draw the map of the area-which contains the destination points and their environment-as a graph, we should determinate circles in it for each member of the fleet. Separating the nodes for each member, we should find Hamilton-circles of the sub-graphs. How to separate the destination points between the fleet members? How to route the members? What happens if there is an accident on a road which changes the way of a member? It may influence the other members' route too. What to communicate for getting the relevant information? How to change the routes in real time? We use cooperative game theory to find the solution.

  10. The finite body triangulation: algorithms, subgraphs, homogeneity estimation and application.

    PubMed

    Carson, Cantwell G; Levine, Jonathan S

    2016-09-01

    The concept of a finite body Dirichlet tessellation has been extended to that of a finite body Delaunay 'triangulation' to provide a more meaningful description of the spatial distribution of nonspherical secondary phase bodies in 2- and 3-dimensional images. A finite body triangulation (FBT) consists of a network of minimum edge-to-edge distances between adjacent objects in a microstructure. From this is also obtained the characteristic object chords formed by the intersection of the object boundary with the finite body tessellation. These two sets of distances form the basis of a parsimonious homogeneity estimation. The characteristics of the spatial distribution are then evaluated with respect to the distances between objects and the distances within them. Quantitative analysis shows that more physically representative distributions can be obtained by selecting subgraphs, such as the relative neighbourhood graph and the minimum spanning tree, from the finite body tessellation. To demonstrate their potential, we apply these methods to 3-dimensional X-ray computed tomographic images of foamed cement and their 2-dimensional cross sections. The Python computer code used to estimate the FBT is made available. Other applications for the algorithm - such as porous media transport and crack-tip propagation - are also discussed. © 2016 The Authors Journal of Microscopy © 2016 Royal Microscopical Society.

  11. Robust cell tracking in epithelial tissues through identification of maximum common subgraphs.

    PubMed

    Kursawe, Jochen; Bardenet, Rémi; Zartman, Jeremiah J; Baker, Ruth E; Fletcher, Alexander G

    2016-11-01

    Tracking of cells in live-imaging microscopy videos of epithelial sheets is a powerful tool for investigating fundamental processes in embryonic development. Characterizing cell growth, proliferation, intercalation and apoptosis in epithelia helps us to understand how morphogenetic processes such as tissue invagination and extension are locally regulated and controlled. Accurate cell tracking requires correctly resolving cells entering or leaving the field of view between frames, cell neighbour exchanges, cell removals and cell divisions. However, current tracking methods for epithelial sheets are not robust to large morphogenetic deformations and require significant manual interventions. Here, we present a novel algorithm for epithelial cell tracking, exploiting the graph-theoretic concept of a 'maximum common subgraph' to track cells between frames of a video. Our algorithm does not require the adjustment of tissue-specific parameters, and scales in sub-quadratic time with tissue size. It does not rely on precise positional information, permitting large cell movements between frames and enabling tracking in datasets acquired at low temporal resolution due to experimental constraints such as phototoxicity. To demonstrate the method, we perform tracking on the Drosophila embryonic epidermis and compare cell-cell rearrangements to previous studies in other tissues. Our implementation is open source and generally applicable to epithelial tissues. © 2016 The Authors.

  12. Faster Parameterized Algorithms for Minor Containment

    NASA Astrophysics Data System (ADS)

    Adler, Isolde; Dorn, Frederic; Fomin, Fedor V.; Sau, Ignasi; Thilikos, Dimitrios M.

    The theory of Graph Minors by Robertson and Seymour is one of the deepest and significant theories in modern Combinatorics. This theory has also a strong impact on the recent development of Algorithms, and several areas, like Parameterized Complexity, have roots in Graph Minors. Until very recently it was a common belief that Graph Minors Theory is mainly of theoretical importance. However, it appears that many deep results from Robertson and Seymour's theory can be also used in the design of practical algorithms. Minor containment testing is one of algorithmically most important and technical parts of the theory, and minor containment in graphs of bounded branchwidth is a basic ingredient of this algorithm. In order to implement minor containment testing on graphs of bounded branchwidth, Hicks [NETWORKS 04] described an algorithm, that in time O(3^{k^2}\\cdot (h+k-1)!\\cdot m) decides if a graph G with m edges and branchwidth k, contains a fixed graph H on h vertices as a minor. That algorithm follows the ideas introduced by Robertson and Seymour in [J'CTSB 95]. In this work we improve the dependence on k of Hicks' result by showing that checking if H is a minor of G can be done in time O(2^{(2k +1 )\\cdot log k} \\cdot h^{2k} \\cdot 2^{2h^2} \\cdot m). Our approach is based on a combinatorial object called rooted packing, which captures the properties of the potential models of subgraphs of H that we seek in our dynamic programming algorithm. This formulation with rooted packings allows us to speed up the algorithm when G is embedded in a fixed surface, obtaining the first single-exponential algorithm for minor containment testing. Namely, it runs in time 2^{O(k)} \\cdot h^{2k} \\cdot 2^{O(h)} \\cdot n, with n = |V(G)|. Finally, we show that slight modifications of our algorithm permit to solve some related problems within the same time bounds, like induced minor or contraction minor containment.

  13. Coordination of networked systems on digraphs with multiple leaders via pinning control

    NASA Astrophysics Data System (ADS)

    Chen, Gang; Lewis, Frank L.

    2012-02-01

    It is well known that achieving consensus among a group of multi-vehicle systems by local distributed control is feasible if and only if all nodes in the communication digraph are reachable from a single (root) node. In this article, we take into account a more general case that the communication digraph of the networked multi-vehicle systems is weakly connected and has two or more zero-in-degree and strongly connected subgraphs, i.e. there are two or more leader groups. Based on the pinning control strategy, the feasibility problem of achieving second-order controlled consensus is studied. At first, a necessary and sufficient condition is given when the topology is fixed. Then the method to design the controller and the rule to choose the pinned vehicles are discussed. The proposed approach allows us to extend several existing results for undirected graphs to directed balanced graphs. A sufficient condition is proposed in the case where the coupling topology is variable. As an illustrative example, a second-order controlled consensus scheme is applied to coordinate the movement of networked multiple mobile robots.

  14. Efficient reordering of PROLOG programs

    NASA Technical Reports Server (NTRS)

    Gooley, Markian M.; Wah, Benjamin W.

    1989-01-01

    PROLOG programs are often inefficient: execution corresponds to a depth-first traversal of an AND/OR graph; traversing subgraphs in another order can be less expensive. It is shown how the reordering of clauses within PROLOG predicates, and especially of goals within clauses, can prevent unnecessary search. The characterization and detection of restrictions on reordering is discussed. A system of calling modes for PROLOG, geared to reordering, is proposed, and ways to infer them automatically are discussed. The information needed for safe reordering is summarized, and which types can be inferred automatically and which must be provided by the user are considered. An improved method for determining a good order for the goals of PROLOG clauses is presented and used as the basis for a reordering system.

  15. A novel sub-shot segmentation method for user-generated video

    NASA Astrophysics Data System (ADS)

    Lei, Zhuo; Zhang, Qian; Zheng, Chi; Qiu, Guoping

    2018-04-01

    With the proliferation of the user-generated videos, temporal segmentation is becoming a challengeable problem. Traditional video temporal segmentation methods like shot detection are not able to work on unedited user-generated videos, since they often only contain one single long shot. We propose a novel temporal segmentation framework for user-generated video. It finds similar frames with a tree partitioning min-Hash technique, constructs sparse temporal constrained affinity sub-graphs, and finally divides the video into sub-shot-level segments with a dense-neighbor-based clustering method. Experimental results show that our approach outperforms all the other related works. Furthermore, it is indicated that the proposed approach is able to segment user-generated videos at an average human level.

  16. An Algorithm to Automatically Generate the Combinatorial Orbit Counting Equations

    PubMed Central

    Melckenbeeck, Ine; Audenaert, Pieter; Michoel, Tom; Colle, Didier; Pickavet, Mario

    2016-01-01

    Graphlets are small subgraphs, usually containing up to five vertices, that can be found in a larger graph. Identification of the graphlets that a vertex in an explored graph touches can provide useful information about the local structure of the graph around that vertex. Actually finding all graphlets in a large graph can be time-consuming, however. As the graphlets grow in size, more different graphlets emerge and the time needed to find each graphlet also scales up. If it is not needed to find each instance of each graphlet, but knowing the number of graphlets touching each node of the graph suffices, the problem is less hard. Previous research shows a way to simplify counting the graphlets: instead of looking for the graphlets needed, smaller graphlets are searched, as well as the number of common neighbors of vertices. Solving a system of equations then gives the number of times a vertex is part of each graphlet of the desired size. However, until now, equations only exist to count graphlets with 4 or 5 nodes. In this paper, two new techniques are presented. The first allows to generate the equations needed in an automatic way. This eliminates the tedious work needed to do so manually each time an extra node is added to the graphlets. The technique is independent on the number of nodes in the graphlets and can thus be used to count larger graphlets than previously possible. The second technique gives all graphlets a unique ordering which is easily extended to name graphlets of any size. Both techniques were used to generate equations to count graphlets with 4, 5 and 6 vertices, which extends all previous results. Code can be found at https://github.com/IneMelckenbeeck/equation-generator and https://github.com/IneMelckenbeeck/graphlet-naming. PMID:26797021

  17. Knowledge-based understanding of aerial surveillance video

    NASA Astrophysics Data System (ADS)

    Cheng, Hui; Butler, Darren

    2006-05-01

    Aerial surveillance has long been used by the military to locate, monitor and track the enemy. Recently, its scope has expanded to include law enforcement activities, disaster management and commercial applications. With the ever-growing amount of aerial surveillance video acquired daily, there is an urgent need for extracting actionable intelligence in a timely manner. Furthermore, to support high-level video understanding, this analysis needs to go beyond current approaches and consider the relationships, motivations and intentions of the objects in the scene. In this paper we propose a system for interpreting aerial surveillance videos that automatically generates a succinct but meaningful description of the observed regions, objects and events. For a given video, the semantics of important regions and objects, and the relationships between them, are summarised into a semantic concept graph. From this, a textual description is derived that provides new search and indexing options for aerial video and enables the fusion of aerial video with other information modalities, such as human intelligence, reports and signal intelligence. Using a Mixture-of-Experts video segmentation algorithm an aerial video is first decomposed into regions and objects with predefined semantic meanings. The objects are then tracked and coerced into a semantic concept graph and the graph is summarized spatially, temporally and semantically using ontology guided sub-graph matching and re-writing. The system exploits domain specific knowledge and uses a reasoning engine to verify and correct the classes, identities and semantic relationships between the objects. This approach is advantageous because misclassifications lead to knowledge contradictions and hence they can be easily detected and intelligently corrected. In addition, the graph representation highlights events and anomalies that a low-level analysis would overlook.

  18. Decompositions of large-scale biological systems based on dynamical properties.

    PubMed

    Soranzo, Nicola; Ramezani, Fahimeh; Iacono, Giovanni; Altafini, Claudio

    2012-01-01

    Given a large-scale biological network represented as an influence graph, in this article we investigate possible decompositions of the network aimed at highlighting specific dynamical properties. The first decomposition we study consists in finding a maximal directed acyclic subgraph of the network, which dynamically corresponds to searching for a maximal open-loop subsystem of the given system. Another dynamical property investigated is strong monotonicity. We propose two methods to deal with this property, both aimed at decomposing the system into strongly monotone subsystems, but with different structural characteristics: one method tends to produce a single large strongly monotone component, while the other typically generates a set of smaller disjoint strongly monotone subsystems. Original heuristics for the methods investigated are described in the article. altafini@sissa.it

  19. Faster Mass Spectrometry-based Protein Inference: Junction Trees are More Efficient than Sampling and Marginalization by Enumeration

    PubMed Central

    Serang, Oliver; Noble, William Stafford

    2012-01-01

    The problem of identifying the proteins in a complex mixture using tandem mass spectrometry can be framed as an inference problem on a graph that connects peptides to proteins. Several existing protein identification methods make use of statistical inference methods for graphical models, including expectation maximization, Markov chain Monte Carlo, and full marginalization coupled with approximation heuristics. We show that, for this problem, the majority of the cost of inference usually comes from a few highly connected subgraphs. Furthermore, we evaluate three different statistical inference methods using a common graphical model, and we demonstrate that junction tree inference substantially improves rates of convergence compared to existing methods. The python code used for this paper is available at http://noble.gs.washington.edu/proj/fido. PMID:22331862

  20. Kirchhoff's rule for quantum wires

    NASA Astrophysics Data System (ADS)

    Kostrykin, V.; Schrader, R.

    1999-01-01

    We formulate and discuss one-particle quantum scattering theory on an arbitrary finite graph with n open ends and where we define the Hamiltonian to be (minus) the Laplace operator with general boundary conditions at the vertices. This results in a scattering theory with n channels. The corresponding on-shell S-matrix formed by the reflection and transmission amplitudes for incoming plane waves of energy E>0 is given explicitly in terms of the boundary conditions and the lengths of the internal lines. It is shown to be unitary, which may be viewed as the quantum version of Kirchhoff's law. We exhibit covariance and symmetry properties. It is symmetric if the boundary conditions are real. Also there is a duality transformation on the set of boundary conditions and the lengths of the internal lines such that the low-energy behaviour of one theory gives the high-energy behaviour of the transformed theory. Finally, we provide a composition rule by which the on-shell S-matrix of a graph is factorizable in terms of the S-matrices of its subgraphs. All proofs use only known facts from the theory of self-adjoint extensions, standard linear algebra, complex function theory and elementary arguments from the theory of Hermitian symplectic forms.

  1. Streaming data analytics via message passing with application to graph algorithms

    DOE PAGES

    Plimpton, Steven J.; Shead, Tim

    2014-05-06

    The need to process streaming data, which arrives continuously at high-volume in real-time, arises in a variety of contexts including data produced by experiments, collections of environmental or network sensors, and running simulations. Streaming data can also be formulated as queries or transactions which operate on a large dynamic data store, e.g. a distributed database. We describe a lightweight, portable framework named PHISH which enables a set of independent processes to compute on a stream of data in a distributed-memory parallel manner. Datums are routed between processes in patterns defined by the application. PHISH can run on top of eithermore » message-passing via MPI or sockets via ZMQ. The former means streaming computations can be run on any parallel machine which supports MPI; the latter allows them to run on a heterogeneous, geographically dispersed network of machines. We illustrate how PHISH can support streaming MapReduce operations, and describe streaming versions of three algorithms for large, sparse graph analytics: triangle enumeration, subgraph isomorphism matching, and connected component finding. Lastly, we also provide benchmark timings for MPI versus socket performance of several kernel operations useful in streaming algorithms.« less

  2. Overlapping communities from dense disjoint and high total degree clusters

    NASA Astrophysics Data System (ADS)

    Zhang, Hongli; Gao, Yang; Zhang, Yue

    2018-04-01

    Community plays an important role in the field of sociology, biology and especially in domains of computer science, where systems are often represented as networks. And community detection is of great importance in the domains. A community is a dense subgraph of the whole graph with more links between its members than between its members to the outside nodes, and nodes in the same community probably share common properties or play similar roles in the graph. Communities overlap when nodes in a graph belong to multiple communities. A vast variety of overlapping community detection methods have been proposed in the literature, and the local expansion method is one of the most successful techniques dealing with large networks. The paper presents a density-based seeding method, in which dense disjoint local clusters are searched and selected as seeds. The proposed method selects a seed by the total degree and density of local clusters utilizing merely local structures of the network. Furthermore, this paper proposes a novel community refining phase via minimizing the conductance of each community, through which the quality of identified communities is largely improved in linear time. Experimental results in synthetic networks show that the proposed seeding method outperforms other seeding methods in the state of the art and the proposed refining method largely enhances the quality of the identified communities. Experimental results in real graphs with ground-truth communities show that the proposed approach outperforms other state of the art overlapping community detection algorithms, in particular, it is more than two orders of magnitude faster than the existing global algorithms with higher quality, and it obtains much more accurate community structure than the current local algorithms without any priori information.

  3. MMKG: An approach to generate metallic materials knowledge graph based on DBpedia and Wikipedia

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaoming; Liu, Xin; Li, Xin; Pan, Dongyu

    2017-02-01

    The research and development of metallic materials are playing an important role in today's society, and in the meanwhile lots of metallic materials knowledge is generated and available on the Web (e.g., Wikipedia) for materials experts. However, due to the diversity and complexity of metallic materials knowledge, the knowledge utilization may encounter much inconvenience. The idea of knowledge graph (e.g., DBpedia) provides a good way to organize the knowledge into a comprehensive entity network. Therefore, the motivation of our work is to generate a metallic materials knowledge graph (MMKG) using available knowledge on the Web. In this paper, an approach is proposed to build MMKG based on DBpedia and Wikipedia. First, we use an algorithm based on directly linked sub-graph semantic distance (DLSSD) to preliminarily extract metallic materials entities from DBpedia according to some predefined seed entities; then based on the results of the preliminary extraction, we use an algorithm, which considers both semantic distance and string similarity (SDSS), to achieve the further extraction. Second, due to the absence of materials properties in DBpedia, we use an ontology-based method to extract properties knowledge from the HTML tables of corresponding Wikipedia Web pages for enriching MMKG. Materials ontology is used to locate materials properties tables as well as to identify the structure of the tables. The proposed approach is evaluated by precision, recall, F1 and time performance, and meanwhile the appropriate thresholds for the algorithms in our approach are determined through experiments. The experimental results show that our approach returns expected performance. A tool prototype is also designed to facilitate the process of building the MMKG as well as to demonstrate the effectiveness of our approach.

  4. Quality Assurance of NCI Thesaurus by Mining Structural-Lexical Patterns

    PubMed Central

    Abeysinghe, Rashmie; Brooks, Michael A.; Talbert, Jeffery; Licong, Cui

    2017-01-01

    Quality assurance of biomedical terminologies such as the National Cancer Institute (NCI) Thesaurus is an essential part of the terminology management lifecycle. We investigate a structural-lexical approach based on non-lattice subgraphs to automatically identify missing hierarchical relations and missing concepts in the NCI Thesaurus. We mine six structural-lexical patterns exhibiting in non-lattice subgraphs: containment, union, intersection, union-intersection, inference-contradiction, and inference union. Each pattern indicates a potential specific type of error and suggests a potential type of remediation. We found 809 non-lattice subgraphs with these patterns in the NCI Thesaurus (version 16.12d). Domain experts evaluated a random sample of 50 small non-lattice subgraphs, of which 33 were confirmed to contain errors and make correct suggestions (33/50 = 66%). Of the 25 evaluated subgraphs revealing multiple patterns, 22 were verified correct (22/25 = 88%). This shows the effectiveness of our structurallexical-pattern-based approach in detecting errors and suggesting remediations in the NCI Thesaurus. PMID:29854100

  5. Motifs in triadic random graphs based on Steiner triple systems

    NASA Astrophysics Data System (ADS)

    Winkler, Marco; Reichardt, Jörg

    2013-08-01

    Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.

  6. Vehicle Animation Software (VAS) to Animate Results Obtained from Vehicle Handling and Rollover Simulations and Tests

    DOT National Transportation Integrated Search

    1991-04-01

    Results from vehicle computer simulations usually take the form of numeric data or graphs. While these graphs provide the investigator with the insight into vehicle behavior, it may be difficult to use these graphs to assess complex vehicle motion. C...

  7. Multi-scale structure and topological anomaly detection via a new network statistic: The onion decomposition.

    PubMed

    Hébert-Dufresne, Laurent; Grochow, Joshua A; Allard, Antoine

    2016-08-18

    We introduce a network statistic that measures structural properties at the micro-, meso-, and macroscopic scales, while still being easy to compute and interpretable at a glance. Our statistic, the onion spectrum, is based on the onion decomposition, which refines the k-core decomposition, a standard network fingerprinting method. The onion spectrum is exactly as easy to compute as the k-cores: It is based on the stages at which each vertex gets removed from a graph in the standard algorithm for computing the k-cores. Yet, the onion spectrum reveals much more information about a network, and at multiple scales; for example, it can be used to quantify node heterogeneity, degree correlations, centrality, and tree- or lattice-likeness. Furthermore, unlike the k-core decomposition, the combined degree-onion spectrum immediately gives a clear local picture of the network around each node which allows the detection of interesting subgraphs whose topological structure differs from the global network organization. This local description can also be leveraged to easily generate samples from the ensemble of networks with a given joint degree-onion distribution. We demonstrate the utility of the onion spectrum for understanding both static and dynamic properties on several standard graph models and on many real-world networks.

  8. Clique-based data mining for related genes in a biomedical database.

    PubMed

    Matsunaga, Tsutomu; Yonemori, Chikara; Tomita, Etsuji; Muramatsu, Masaaki

    2009-07-01

    Progress in the life sciences cannot be made without integrating biomedical knowledge on numerous genes in order to help formulate hypotheses on the genetic mechanisms behind various biological phenomena, including diseases. There is thus a strong need for a way to automatically and comprehensively search from biomedical databases for related genes, such as genes in the same families and genes encoding components of the same pathways. Here we address the extraction of related genes by searching for densely-connected subgraphs, which are modeled as cliques, in a biomedical relational graph. We constructed a graph whose nodes were gene or disease pages, and edges were the hyperlink connections between those pages in the Online Mendelian Inheritance in Man (OMIM) database. We obtained over 20,000 sets of related genes (called 'gene modules') by enumerating cliques computationally. The modules included genes in the same family, genes for proteins that form a complex, and genes for components of the same signaling pathway. The results of experiments using 'metabolic syndrome'-related gene modules show that the gene modules can be used to get a coherent holistic picture helpful for interpreting relations among genes. We presented a data mining approach extracting related genes by enumerating cliques. The extracted gene sets provide a holistic picture useful for comprehending complex disease mechanisms.

  9. Probabilistic graphlet transfer for photo cropping.

    PubMed

    Zhang, Luming; Song, Mingli; Zhao, Qi; Liu, Xiao; Bu, Jiajun; Chen, Chun

    2013-02-01

    As one of the most basic photo manipulation processes, photo cropping is widely used in the printing, graphic design, and photography industries. In this paper, we introduce graphlets (i.e., small connected subgraphs) to represent a photo's aesthetic features, and propose a probabilistic model to transfer aesthetic features from the training photo onto the cropped photo. In particular, by segmenting each photo into a set of regions, we construct a region adjacency graph (RAG) to represent the global aesthetic feature of each photo. Graphlets are then extracted from the RAGs, and these graphlets capture the local aesthetic features of the photos. Finally, we cast photo cropping as a candidate-searching procedure on the basis of a probabilistic model, and infer the parameters of the cropped photos using Gibbs sampling. The proposed method is fully automatic. Subjective evaluations have shown that it is preferred over a number of existing approaches.

  10. Significant Scales in Community Structure

    NASA Astrophysics Data System (ADS)

    Traag, V. A.; Krings, G.; van Dooren, P.

    2013-10-01

    Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolutions in one such method. Additionally, we introduce the notion of ``significance'' of a partition, based on subgraph probabilities. Significance is independent of the exact method used, so could also be applied in other methods, and can be interpreted as the gain in encoding a graph by making use of a partition. Using significance, we can determine ``good'' resolution parameters, which we demonstrate on benchmark networks. Moreover, optimizing significance itself also shows excellent performance. We demonstrate our method on voting data from the European Parliament. Our analysis suggests the European Parliament has become increasingly ideologically divided and that nationality plays no role.

  11. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    PubMed

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of fast and effectively searching genomes for RNA secondary structures have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformation graphs (e.g., t = 2 for stem loops and only a slight increase for pseudo-knots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k(t)N(2)), where k is a small parameter; and N is the size of the projiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a Covariance model with a significant reduction in computation time. In particular; very accurate searches of tmRNAs in bacteria genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site h t t p ://w.uga.edu/RNA-informatics/software/index.php.

  12. Exploiting graph kernels for high performance biomedical relation extraction.

    PubMed

    Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri

    2018-01-30

    Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task specific heuristic or rules. In comparison, the state of the art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule based system employing task specific post processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with APG kernel that attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than APG kernel for the BioInfer dataset, in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate a high performance Chemical Induced Disease relation extraction, without employing external knowledge sources or task specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel as effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as the CID-sentence level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.

  13. A Graph-Based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications

    PubMed Central

    Cameron, Delroy; Bodenreider, Olivier; Yalamanchili, Hima; Danh, Tu; Vallabhaneni, Sreeram; Thirunarayan, Krishnaprasad; Sheth, Amit P.; Rindflesch, Thomas C.

    2014-01-01

    Objectives This paper presents a methodology for recovering and decomposing Swanson’s Raynaud Syndrome–Fish Oil Hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson’s manually intensive techniques can be undertaken semi-automatically, paves the way for fully automatic semantics-based hypothesis generation from scientific literature. Methods Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson has been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson has been developed. Results Our methodology not only recovered the 3 associations commonly recognized as Swanson’s Hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson’s Hypothesis has never been attempted. Conclusion In this work therefore, we presented a methodology for semi- automatically recovering and decomposing Swanson’s RS-DFO Hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD). These suggest that three critical aspects of LBD include: 1) the need for more expressive representations beyond Swanson’s ABC model; 2) an ability to accurately extract semantic information from text; and 3) the semantic integration of scientific literature with structured background knowledge. PMID:23026233

  14. Image Analysis and Modeling

    DTIC Science & Technology

    1975-05-01

    place "subgraphs" with more complicated subgraphs. There have beer many re- sults reported which extend string-grammar theorems to web grammar...W. Bacus and E. E. Gose , "Leukocyte Pattern Recognition," IEEE SMC-2. No. 2, pp. 513-536, September 1972. [4] J. K. Hawkins

  15. Equilibrium statistical mechanics on correlated random graphs

    NASA Astrophysics Data System (ADS)

    Barra, Adriano; Agliari, Elena

    2011-02-01

    Biological and social networks have recently attracted great attention from physicists. Among several aspects, two main ones may be stressed: a non-trivial topology of the graph describing the mutual interactions between agents and, typically, imitative, weighted, interactions. Despite such aspects being widely accepted and empirically confirmed, the schemes currently exploited in order to generate the expected topology are based on a priori assumptions and, in most cases, implement constant intensities for links. Here we propose a simple shift [-1,+1]\\to [0,+1] in the definition of patterns in a Hopfield model: a straightforward effect is the conversion of frustration into dilution. In fact, we show that by varying the bias of pattern distribution, the network topology (generated by the reciprocal affinities among agents, i.e. the Hebbian rule) crosses various well-known regimes, ranging from fully connected, to an extreme dilution scenario, then to completely disconnected. These features, as well as small-world properties, are, in this context, emergent and no longer imposed a priori. The model is throughout investigated also from a thermodynamics perspective: the Ising model defined on the resulting graph is analytically solved (at a replica symmetric level) by extending the double stochastic stability technique, and presented together with its fluctuation theory for a picture of criticality. Overall, our findings show that, at least at equilibrium, dilution (of whatever kind) simply decreases the strength of the coupling felt by the spins, but leaves the paramagnetic/ferromagnetic flavors unchanged. The main difference with respect to previous investigations is that, within our approach, replicas do not appear: instead of (multi)-overlaps as order parameters, we introduce a class of magnetizations on all the possible subgraphs belonging to the main one investigated: as a consequence, for these objects a closure for a self-consistent relation is achieved.

  16. An Automated Method for Identifying Inconsistencies within Diagrammatic Software Requirements Specifications

    NASA Technical Reports Server (NTRS)

    Zhang, Zhong

    1997-01-01

    The development of large-scale, composite software in a geographically distributed environment is an evolutionary process. Often, in such evolving systems, striving for consistency is complicated by many factors, because development participants have various locations, skills, responsibilities, roles, opinions, languages, terminology and different degrees of abstraction they employ. This naturally leads to many partial specifications or viewpoints. These multiple views on the system being developed usually overlap. From another aspect, these multiple views give rise to the potential for inconsistency. Existing CASE tools do not efficiently manage inconsistencies in distributed development environment for a large-scale project. Based on the ViewPoints framework the WHERE (Web-Based Hypertext Environment for requirements Evolution) toolkit aims to tackle inconsistency management issues within geographically distributed software development projects. Consequently, WHERE project helps make more robust software and support software assurance process. The long term goal of WHERE tools aims to the inconsistency analysis and management in requirements specifications. A framework based on Graph Grammar theory and TCMJAVA toolkit is proposed to detect inconsistencies among viewpoints. This systematic approach uses three basic operations (UNION, DIFFERENCE, INTERSECTION) to study the static behaviors of graphic and tabular notations. From these operations, subgraphs Query, Selection, Merge, Replacement operations can be derived. This approach uses graph PRODUCTIONS (rewriting rules) to study the dynamic transformations of graphs. We discuss the feasibility of implementation these operations. Also, We present the process of porting original TCM (Toolkit for Conceptual Modeling) project from C++ to Java programming language in this thesis. A scenario based on NASA International Space Station Specification is discussed to show the applicability of our approach. Finally, conclusion and future work about inconsistency management issues in WHERE project will be summarized.

  17. What can we learn from the network approach in finance?

    NASA Astrophysics Data System (ADS)

    Janos, Kertesz

    2005-03-01

    Correlations between variations of stock prices reveal information about relationships between companies. Different methods of analysis have been applied to such data in order to uncover the taxonomy of the market. We use Mantegna's miminum spanning tree (MST) method for daily data in a dynamic way: By introducing a moving window we study the temporal changes in the structure of the network defined by this ``asset tree.'' The MST is scale free with a significantly changing exponent of the degree distribution for crash periods, which demonstrates the restructuring of the network due to the enhancement of correlations. This approach is compared to that based on what we call ``asset graphs:'' We start from an empty graph with no edges where the vertices correspond to stocks and then, one by one, we insert edges between the vertices according to the rank of their correlation strength. We study the properties of the creatred (weighted) networks, such as topologically different growth types, number and size of clusters and clustering coefficient. Furthermore, we define new tools like subgraph intensity and coherence to describe the role of the weights. We also investigate the time shifted cross correlation functions for high frequency data and find a characteristic time delay in many cases representing that some stocks lead the price changes while others follow them. These data can be used to construct a directed network of influence.

  18. Structure-preserving model reduction of large-scale logistics networks. Applications for supply chains

    NASA Astrophysics Data System (ADS)

    Scholz-Reiter, B.; Wirth, F.; Dashkovskiy, S.; Makuschewitz, T.; Schönlein, M.; Kosmykov, M.

    2011-12-01

    We investigate the problem of model reduction with a view to large-scale logistics networks, specifically supply chains. Such networks are modeled by means of graphs, which describe the structure of material flow. An aim of the proposed model reduction procedure is to preserve important features within the network. As a new methodology we introduce the LogRank as a measure for the importance of locations, which is based on the structure of the flows within the network. We argue that these properties reflect relative importance of locations. Based on the LogRank we identify subgraphs of the network that can be neglected or aggregated. The effect of this is discussed for a few motifs. Using this approach we present a meta algorithm for structure-preserving model reduction that can be adapted to different mathematical modeling frameworks. The capabilities of the approach are demonstrated with a test case, where a logistics network is modeled as a Jackson network, i.e., a particular type of queueing network.

  19. A system biology approach for understanding the miRNA regulatory network in colon rectal cancer.

    PubMed

    Pradhan, Meeta; Nagulapalli, Kshithija; Ledford, Lakenvia; Pandit, Yogesh; Palakal, Mathew

    2015-01-01

    In this paper we present a systems biology approach to the understanding of the miRNA-regulatory network in colon rectal cancer. An initial set of significant genes in Colon Rectal Cancer (CRC) were obtained by mining relevant literature. An initial set of cancer-related miRNAs were obtained from three databases: miRBase, miRWalk, Targetscan and GEO microarray experiment. First principle methods were then used to generate the global miRNA-gene network. Significant miRNAs and associated transcription factors in the global miRNA-gene network were identified using topological and sub-graph analyses. Eleven novel miRNAs were identified and three of the novel miRNAs, hsa-miR-630, hsa-miR-100 and hsa-miR-99a, were further analysed to elucidate their role in CRC. The proposed methodology effectively made use of literature data and was able to show novel, significant miRNA-transcription associations in CRC.

  20. Spatial fingerprints of community structure in human interaction network for an extensive set of large-scale regions.

    PubMed

    Kallus, Zsófia; Barankai, Norbert; Szüle, János; Vattay, Gábor

    2015-01-01

    Human interaction networks inferred from country-wide telephone activity recordings were recently used to redraw political maps by projecting their topological partitions into geographical space. The results showed remarkable spatial cohesiveness of the network communities and a significant overlap between the redrawn and the administrative borders. Here we present a similar analysis based on one of the most popular online social networks represented by the ties between more than 5.8 million of its geo-located users. The worldwide coverage of their measured activity allowed us to analyze the large-scale regional subgraphs of entire continents and an extensive set of examples for single countries. We present results for North and South America, Europe and Asia. In our analysis we used the well-established method of modularity clustering after an aggregation of the individual links into a weighted graph connecting equal-area geographical pixels. Our results show fingerprints of both of the opposing forces of dividing local conflicts and of uniting cross-cultural trends of globalization.

  1. Playing hide and seek with repeats in local and global de novo transcriptome assembly of short RNA-seq reads.

    PubMed

    Lima, Leandro; Sinaimeri, Blerina; Sacomoto, Gustavo; Lopez-Maestre, Helene; Marchet, Camille; Miele, Vincent; Sagot, Marie-France; Lacroix, Vincent

    2017-01-01

    The main challenge in de novo genome assembly of DNA-seq data is certainly to deal with repeats that are longer than the reads. In de novo transcriptome assembly of RNA-seq reads, on the other hand, this problem has been underestimated so far. Even though we have fewer and shorter repeated sequences in transcriptomics, they do create ambiguities and confuse assemblers if not addressed properly. Most transcriptome assemblers of short reads are based on de Bruijn graphs (DBG) and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them. The results of this work are threefold. First, we introduce a formal model for representing high copy-number and low-divergence repeats in RNA-seq data and exploit its properties to infer a combinatorial characteristic of repeat-associated subgraphs. We show that the problem of identifying such subgraphs in a DBG is NP-complete. Second, we show that in the specific case of local assembly of alternative splicing (AS) events, we can implicitly avoid such subgraphs, and we present an efficient algorithm to enumerate AS events that are not included in repeats. Using simulated data, we show that this strategy is significantly more sensitive and precise than the previous version of KisSplice (Sacomoto et al. in WABI, pp 99-111, 1), Trinity (Grabherr et al. in Nat Biotechnol 29(7):644-652, 2), and Oases (Schulz et al. in Bioinformatics 28(8):1086-1092, 3), for the specific task of calling AS events. Third, we turn our focus to full-length transcriptome assembly, and we show that exploring the topology of DBGs can improve de novo transcriptome evaluation methods. Based on the observation that repeats create complicated regions in a DBG, and when assemblers try to traverse these regions, they can infer erroneous transcripts, we propose a measure to flag transcripts traversing such troublesome regions, thereby giving a confidence level for each transcript. The originality of our work when compared to other transcriptome evaluation methods is that we use only the topology of the DBG, and not read nor coverage information. We show that our simple method gives better results than Rsem-Eval (Li et al. in Genome Biol 15(12):553, 4) and TransRate (Smith-Unna et al. in Genome Res 26(8):1134-1144, 5) on both real and simulated datasets for detecting chimeras, and therefore is able to capture assembly errors missed by these methods.

  2. Symmetry compression method for discovering network motifs.

    PubMed

    Wang, Jianxin; Huang, Yuannan; Wu, Fang-Xiang; Pan, Yi

    2012-01-01

    Discovering network motifs could provide a significant insight into systems biology. Interestingly, many biological networks have been found to have a high degree of symmetry (automorphism), which is inherent in biological network topologies. The symmetry due to the large number of basic symmetric subgraphs (BSSs) causes a certain redundant calculation in discovering network motifs. Therefore, we compress all basic symmetric subgraphs before extracting compressed subgraphs and propose an efficient decompression algorithm to decompress all compressed subgraphs without loss of any information. In contrast to previous approaches, the novel Symmetry Compression method for Motif Detection, named as SCMD, eliminates most redundant calculations caused by widespread symmetry of biological networks. We use SCMD to improve three notable exact algorithms and two efficient sampling algorithms. Results of all exact algorithms with SCMD are the same as those of the original algorithms, since SCMD is a lossless method. The sampling results show that the use of SCMD almost does not affect the quality of sampling results. For highly symmetric networks, we find that SCMD used in both exact and sampling algorithms can help get a remarkable speedup. Furthermore, SCMD enables us to find larger motifs in biological networks with notable symmetry than previously possible.

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Guo Qiang; Luo, Lingyun; Ogbuji, Chime

    The interaction of multiple types of relationships among anatomical classes in the Foundational Model of Anatomy (FMA) can provide inferred information valuable for quality assurance. This paper introduces a method called Motif Checking (MOCH) to study the effects of such multi-relation type interactions. MOCH represents patterns of multitype interaction as small labeled sub-graph motifs, whose nodes represent class variables, and labeled edges represent relational types. By representing FMA as an RDF graph and motifs as SPARQL queries, fragments of FMA are automatically obtained as auditing candidates. Leveraging the scalability and reconfigurability of Semantic Web Technology (OWL, RDF and SPARQL) andmore » Virtuoso, we performed exhaustive analyses of three 2-node motifs, resulting in 638 matching FMA configurations; twelve 3-node motifs, resulting in 202,960 configurations. Using the Principal Ideal Explorer (PIE) methodology as an extension of MOCH, we were able to identify 755 root nodes with 4,100 respective descendants with opposing antonyms in their class names for arbitrary-length motifs. With possible disjointness implied by antonyms, we performed manual inspection of a subset of the resulting FMA fragments and tracked down a source of abnormal inferred conclusions (captured by the motifs), coming from a gender-neutral class being modeled as a part of gender-specific class, such as “Urinary system” is a part of “Female human body.” Our results demonstrate that MOCH and PIE provide a unique source of valuable information for quality assurance. Since our approach is general, it is applicable to any ontological system with an OWL representation.« less

  4. Epidemics in networks: a master equation approach

    NASA Astrophysics Data System (ADS)

    Cotacallapa, M.; Hase, M. O.

    2016-02-01

    A problem closely related to epidemiology, where a subgraph of ‘infected’ links is defined inside a larger network, is investigated. This subgraph is generated from the underlying network by a random variable, which decides whether a link is able to propagate a disease/information. The relaxation timescale of this random variable is examined in both annealed and quenched limits, and the effectiveness of propagation of disease/information is analyzed. The dynamics of the model is governed by a master equation and two types of underlying network are considered: one is scale-free and the other has exponential degree distribution. We have shown that the relaxation timescale of the contagion variable has a major influence on the topology of the subgraph of infected links, which determines the efficiency of spreading of disease/information over the network.

  5. On the star partition dimension of comb product of cycle and complete graph

    NASA Astrophysics Data System (ADS)

    Alfarisi, Ridho; Darmaji; Dafik

    2017-06-01

    Let G = (V, E) be a connected graphs with vertex set V (G), edge set E(G) and S ⊆ V (G). For an ordered partition Π = {S 1, S 2, S 3, …, Sk } of V (G), the representation of a vertex v ∈ V (G) with respect to Π is the k-vectors r(v|Π) = (d(v, S 1), d(v, S 2), …, d(v, Sk )), where d(v, Sk ) represents the distance between the vertex v and the set Sk , defined by d(v, Sk ) = min{d(v, x)|x ∈ Sk}. The partition Π of V (G) is a resolving partition if the k-vektors r(v|Π), v ∈ V (G) are distinct. The minimum resolving partition Π is a partition dimension of G, denoted by pd(G). The resolving partition Π = {S 1, S 2, S 3, …, Sk} is called a star resolving partition for G if it is a resolving partition and each subgraph induced by Si , 1 ≤ i ≤ k, is a star. The minimum k for which there exists a star resolving partition of V (G) is the star partition dimension of G, denoted by spd(G). Finding a star partition dimension of G is classified to be a NP-Hard problem. Furthermore, the comb product between G and H, denoted by G ⊲ H, is a graph obtained by taking one copy of G and |V (G)| copies of H and grafting the i-th copy of H at the vertex o to the i-th vertex of G. By definition of comb product, we can say that V (G ⊲ H) = {(a, u)|a ∈ V (G), u ∈ V (H)} and (a, u)(b, v) ∈ E(G ⊲ H) whenever a = b and uv ∈ E(H), or ab ∈ E(G) and u = v = o. In this paper, we will study the star partition dimension of comb product of cycle and complete graph, namely Cn ⊲ Km and Km ⊲ Cn for n ≥ 3 and m ≥ 3.

  6. A Graph Based Interface for Representing Volume Visualization Results

    NASA Technical Reports Server (NTRS)

    Patten, James M.; Ma, Kwan-Liu

    1998-01-01

    This paper discusses a graph based user interface for representing the results of the volume visualization process. As images are rendered, they are connected to other images in a graph based on their rendering parameters. The user can take advantage of the information in this graph to understand how certain rendering parameter changes affect a dataset, making the visualization process more efficient. Because the graph contains more information than is contained in an unstructured history of images, the image graph is also helpful for collaborative visualization and animation.

  7. Matching CCD images to a stellar catalog using locality-sensitive hashing

    NASA Astrophysics Data System (ADS)

    Liu, Bo; Yu, Jia-Zong; Peng, Qing-Yu

    2018-02-01

    The usage of a subset of observed stars in a CCD image to find their corresponding matched stars in a stellar catalog is an important issue in astronomical research. Subgraph isomorphic-based algorithms are the most widely used methods in star catalog matching. When more subgraph features are provided, the CCD images are recognized better. However, when the navigation feature database is large, the method requires more time to match the observing model. To solve this problem, this study investigates further and improves subgraph isomorphic matching algorithms. We present an algorithm based on a locality-sensitive hashing technique, which allocates quadrilateral models in the navigation feature database into different hash buckets and reduces the search range to the bucket in which the observed quadrilateral model is located. Experimental results indicate the effectivity of our method.

  8. Development of antibiotic regimens using graph based evolutionary algorithms.

    PubMed

    Corns, Steven M; Ashlock, Daniel A; Bryden, Kenneth M

    2013-12-01

    This paper examines the use of evolutionary algorithms in the development of antibiotic regimens given to production animals. A model is constructed that combines the lifespan of the animal and the bacteria living in the animal's gastro-intestinal tract from the early finishing stage until the animal reaches market weight. This model is used as the fitness evaluation for a set of graph based evolutionary algorithms to assess the impact of diversity control on the evolving antibiotic regimens. The graph based evolutionary algorithms have two objectives: to find an antibiotic treatment regimen that maintains the weight gain and health benefits of antibiotic use and to reduce the risk of spreading antibiotic resistant bacteria. This study examines different regimens of tylosin phosphate use on bacteria populations divided into Gram positive and Gram negative types, with a focus on Campylobacter spp. Treatment regimens were found that provided decreased antibiotic resistance relative to conventional methods while providing nearly the same benefits as conventional antibiotic regimes. By using a graph to control the information flow in the evolutionary algorithm, a variety of solutions along the Pareto front can be found automatically for this and other multi-objective problems. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  9. Docking pose selection by interaction pattern graph similarity: application to the D3R grand challenge 2015

    NASA Astrophysics Data System (ADS)

    Slynko, Inna; Da Silva, Franck; Bret, Guillaume; Rognan, Didier

    2016-09-01

    High affinity ligands for a given target tend to share key molecular interactions with important anchoring amino acids and therefore often present quite conserved interaction patterns. This simple concept was formalized in a topological knowledge-based scoring function (GRIM) for selecting the most appropriate docking poses from previously X-rayed interaction patterns. GRIM first converts protein-ligand atomic coordinates (docking poses) into a simple 3D graph describing the corresponding interaction pattern. In a second step, proposed graphs are compared to that found from template structures in the Protein Data Bank. Last, all docking poses are rescored according to an empirical score (GRIMscore) accounting for overlap of maximum common subgraphs. Taking the opportunity of the public D3R Grand Challenge 2015, GRIM was used to rescore docking poses for 36 ligands (6 HSP90α inhibitors, 30 MAP4K4 inhibitors) prior to the release of the corresponding protein-ligand X-ray structures. When applied to the HSP90α dataset, for which many protein-ligand X-ray structures are already available, GRIM provided very high quality solutions (mean rmsd = 1.06 Å, n = 6) as top-ranked poses, and significantly outperformed a state-of-the-art scoring function. In the case of MAP4K4 inhibitors, for which preexisting 3D knowledge is scarce and chemical diversity is much larger, the accuracy of GRIM poses decays (mean rmsd = 3.18 Å, n = 30) although GRIM still outperforms an energy-based scoring function. GRIM rescoring appears to be quite robust with comparison to the other approaches competing for the same challenge (42 submissions for the HSP90 dataset, 27 for the MAP4K4 dataset) as it ranked 3rd and 2nd respectively, for the two investigated datasets. The rescoring method is quite simple to implement, independent on a docking engine, and applicable to any target for which at least one holo X-ray structure is available.

  10. Markov Dynamics as a Zooming Lens for Multiscale Community Detection: Non Clique-Like Communities and the Field-of-View Limit

    PubMed Central

    Schaub, Michael T.; Delvenne, Jean-Charles; Yaliraki, Sophia N.; Barahona, Mauricio

    2012-01-01

    In recent years, there has been a surge of interest in community detection algorithms for complex networks. A variety of computational heuristics, some with a long history, have been proposed for the identification of communities or, alternatively, of good graph partitions. In most cases, the algorithms maximize a particular objective function, thereby finding the ‘right’ split into communities. Although a thorough comparison of algorithms is still lacking, there has been an effort to design benchmarks, i.e., random graph models with known community structure against which algorithms can be evaluated. However, popular community detection methods and benchmarks normally assume an implicit notion of community based on clique-like subgraphs, a form of community structure that is not always characteristic of real networks. Specifically, networks that emerge from geometric constraints can have natural non clique-like substructures with large effective diameters, which can be interpreted as long-range communities. In this work, we show that long-range communities escape detection by popular methods, which are blinded by a restricted ‘field-of-view’ limit, an intrinsic upper scale on the communities they can detect. The field-of-view limit means that long-range communities tend to be overpartitioned. We show how by adopting a dynamical perspective towards community detection [1], [2], in which the evolution of a Markov process on the graph is used as a zooming lens over the structure of the network at all scales, one can detect both clique- or non clique-like communities without imposing an upper scale to the detection. Consequently, the performance of algorithms on inherently low-diameter, clique-like benchmarks may not always be indicative of equally good results in real networks with local, sparser connectivity. We illustrate our ideas with constructive examples and through the analysis of real-world networks from imaging, protein structures and the power grid, where a multiscale structure of non clique-like communities is revealed. PMID:22384178

  11. Parallelization of a Fully-Distributed Hydrologic Model using Sub-basin Partitioning

    NASA Astrophysics Data System (ADS)

    Vivoni, E. R.; Mniszewski, S.; Fasel, P.; Springer, E.; Ivanov, V. Y.; Bras, R. L.

    2005-12-01

    A primary obstacle towards advances in watershed simulations has been the limited computational capacity available to most models. The growing trend of model complexity, data availability and physical representation has not been matched by adequate developments in computational efficiency. This situation has created a serious bottleneck which limits existing distributed hydrologic models to small domains and short simulations. In this study, we present novel developments in the parallelization of a fully-distributed hydrologic model. Our work is based on the TIN-based Real-time Integrated Basin Simulator (tRIBS), which provides continuous hydrologic simulation using a multiple resolution representation of complex terrain based on a triangulated irregular network (TIN). While the use of TINs reduces computational demand, the sequential version of the model is currently limited over large basins (>10,000 km2) and long simulation periods (>1 year). To address this, a parallel MPI-based version of the tRIBS model has been implemented and tested using high performance computing resources at Los Alamos National Laboratory. Our approach utilizes domain decomposition based on sub-basin partitioning of the watershed. A stream reach graph based on the channel network structure is used to guide the sub-basin partitioning. Individual sub-basins or sub-graphs of sub-basins are assigned to separate processors to carry out internal hydrologic computations (e.g. rainfall-runoff transformation). Routed streamflow from each sub-basin forms the major hydrologic data exchange along the stream reach graph. Individual sub-basins also share subsurface hydrologic fluxes across adjacent boundaries. We demonstrate how the sub-basin partitioning provides computational feasibility and efficiency for a set of test watersheds in northeastern Oklahoma. We compare the performance of the sequential and parallelized versions to highlight the efficiency gained as the number of processors increases. We also discuss how the coupled use of TINs and parallel processing can lead to feasible long-term simulations in regional watersheds while preserving basin properties at high-resolution.

  12. Meraculous: De Novo Genome Assembly with Short Paired-End Reads

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chapman, Jarrod A.; Ho, Isaac; Sunkara, Sirisha

    2011-08-18

    We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (deBruijn) graph of oligonucleotides with unique high quality extensions inmore » the dataset, avoiding an explicit error correction step as used in other short-read assemblers. A novel memory-efficient hashing scheme is introduced. The resulting contigs are ordered and oriented using paired reads separated by ~280 bp or ~3.2 kbp, and many gaps between contigs can be closed using paired-end placements. Practical issues with the dataset are described, and prospects for assembling larger genomes are discussed.« less

  13. Convergence analysis of directed signed networks via an M-matrix approach

    NASA Astrophysics Data System (ADS)

    Meng, Deyuan

    2018-04-01

    This paper aims at solving convergence problems on directed signed networks with multiple nodes, where interactions among nodes are described by signed digraphs. The convergence analysis is achieved by matrix-theoretic and graph-theoretic tools, in which M-matrices play a central role. The fundamental digon sign-symmetry assumption upon signed digraphs can be removed with the proposed analysis approach. Furthermore, necessary and sufficient conditions are established for semi-positive and positive stabilities of Laplacian matrices of signed digraphs, respectively. A benefit of this result is that given strong connectivity, a directed signed network can achieve bipartite consensus (or state stability) if and only if the signed digraph associated with it is structurally balanced (or unbalanced). If the interactions between nodes are described by a signed digraph only with spanning trees, a directed signed network can achieve interval bipartite consensus (or state stability) if and only if the signed digraph contains a structurally balanced (or unbalanced) rooted subgraph. Simulations are given to illustrate the developed results by considering signed networks associated with digon sign-unsymmetric signed digraphs.

  14. Spatial Fingerprints of Community Structure in Human Interaction Network for an Extensive Set of Large-Scale Regions

    PubMed Central

    Kallus, Zsófia; Barankai, Norbert; Szüle, János; Vattay, Gábor

    2015-01-01

    Human interaction networks inferred from country-wide telephone activity recordings were recently used to redraw political maps by projecting their topological partitions into geographical space. The results showed remarkable spatial cohesiveness of the network communities and a significant overlap between the redrawn and the administrative borders. Here we present a similar analysis based on one of the most popular online social networks represented by the ties between more than 5.8 million of its geo-located users. The worldwide coverage of their measured activity allowed us to analyze the large-scale regional subgraphs of entire continents and an extensive set of examples for single countries. We present results for North and South America, Europe and Asia. In our analysis we used the well-established method of modularity clustering after an aggregation of the individual links into a weighted graph connecting equal-area geographical pixels. Our results show fingerprints of both of the opposing forces of dividing local conflicts and of uniting cross-cultural trends of globalization. PMID:25993329

  15. Service-based analysis of biological pathways

    PubMed Central

    Zheng, George; Bouguettaya, Athman

    2009-01-01

    Background Computer-based pathway discovery is concerned with two important objectives: pathway identification and analysis. Conventional mining and modeling approaches aimed at pathway discovery are often effective at achieving either objective, but not both. Such limitations can be effectively tackled leveraging a Web service-based modeling and mining approach. Results Inspired by molecular recognitions and drug discovery processes, we developed a Web service mining tool, named PathExplorer, to discover potentially interesting biological pathways linking service models of biological processes. The tool uses an innovative approach to identify useful pathways based on graph-based hints and service-based simulation verifying user's hypotheses. Conclusion Web service modeling of biological processes allows the easy access and invocation of these processes on the Web. Web service mining techniques described in this paper enable the discovery of biological pathways linking these process service models. Algorithms presented in this paper for automatically highlighting interesting subgraph within an identified pathway network enable the user to formulate hypothesis, which can be tested out using our simulation algorithm that are also described in this paper. PMID:19796403

  16. Partial Information Community Detection in a Multilayer Network

    DTIC Science & Technology

    2016-06-01

    Network was taken from the CORE Lab at the Naval Postgraduate School [27]. Facebook dataset We will use a subgraph of the Facebook network to build a...larger synthetic multilayer network. We want to use this Facebook data as a way to introduce a real world example of a network into our synthetic network...This data is provided by the Standford Large Network Dataset Collection [28]. This is a large anonymous subgraph of Facebook . It contains over 4,000

  17. Do motifs reflect evolved function?--No convergent evolution of genetic regulatory network subgraph topologies.

    PubMed

    Knabe, Johannes F; Nehaniv, Chrystopher L; Schilstra, Maria J

    2008-01-01

    Methods that analyse the topological structure of networks have recently become quite popular. Whether motifs (subgraph patterns that occur more often than in randomized networks) have specific functions as elementary computational circuits has been cause for debate. As the question is difficult to resolve with currently available biological data, we approach the issue using networks that abstractly model natural genetic regulatory networks (GRNs) which are evolved to show dynamical behaviors. Specifically one group of networks was evolved to be capable of exhibiting two different behaviors ("differentiation") in contrast to a group with a single target behavior. In both groups we find motif distribution differences within the groups to be larger than differences between them, indicating that evolutionary niches (target functions) do not necessarily mold network structure uniquely. These results show that variability operators can have a stronger influence on network topologies than selection pressures, especially when many topologies can create similar dynamics. Moreover, analysis of motif functional relevance by lesioning did not suggest that motifs were of greater importance to the functioning of the network than arbitrary subgraph patterns. Only when drastically restricting network size, so that one motif corresponds to a whole functionally evolved network, was preference for particular connection patterns found. This suggests that in non-restricted, bigger networks, entanglement with the rest of the network hinders topological subgraph analysis.

  18. Topology of Innovation Spaces in the Knowledge Networks Emerging through Questions-And-Answers

    PubMed Central

    Andjelković, Miroslav; Tadić, Bosiljka; Mitrović Dankulov, Marija; Rajković, Milan; Melnik, Roderick

    2016-01-01

    The communication processes of knowledge creation represent a particular class of human dynamics where the expertise of individuals plays a substantial role, thus offering a unique possibility to study the structure of knowledge networks from online data. Here, we use the empirical evidence from questions-and-answers in mathematics to analyse the emergence of the network of knowledge contents (or tags) as the individual experts use them in the process. After removing extra edges from the network-associated graph, we apply the methods of algebraic topology of graphs to examine the structure of higher-order combinatorial spaces in networks for four consecutive time intervals. We find that the ranking distributions of the suitably scaled topological dimensions of nodes fall into a unique curve for all time intervals and filtering levels, suggesting a robust architecture of knowledge networks. Moreover, these networks preserve the logical structure of knowledge within emergent communities of nodes, labeled according to a standard mathematical classification scheme. Further, we investigate the appearance of new contents over time and their innovative combinations, which expand the knowledge network. In each network, we identify an innovation channel as a subgraph of triangles and larger simplices to which new tags attach. Our results show that the increasing topological complexity of the innovation channels contributes to network’s architecture over different time periods, and is consistent with temporal correlations of the occurrence of new tags. The methodology applies to a wide class of data with the suitable temporal resolution and clearly identified knowledge-content units. PMID:27171149

  19. Flexible sampling large-scale social networks by self-adjustable random walk

    NASA Astrophysics Data System (ADS)

    Xu, Xiao-Ke; Zhu, Jonathan J. H.

    2016-12-01

    Online social networks (OSNs) have become an increasingly attractive gold mine for academic and commercial researchers. However, research on OSNs faces a number of difficult challenges. One bottleneck lies in the massive quantity and often unavailability of OSN population data. Sampling perhaps becomes the only feasible solution to the problems. How to draw samples that can represent the underlying OSNs has remained a formidable task because of a number of conceptual and methodological reasons. Especially, most of the empirically-driven studies on network sampling are confined to simulated data or sub-graph data, which are fundamentally different from real and complete-graph OSNs. In the current study, we propose a flexible sampling method, called Self-Adjustable Random Walk (SARW), and test it against with the population data of a real large-scale OSN. We evaluate the strengths of the sampling method in comparison with four prevailing methods, including uniform, breadth-first search (BFS), random walk (RW), and revised RW (i.e., MHRW) sampling. We try to mix both induced-edge and external-edge information of sampled nodes together in the same sampling process. Our results show that the SARW sampling method has been able to generate unbiased samples of OSNs with maximal precision and minimal cost. The study is helpful for the practice of OSN research by providing a highly needed sampling tools, for the methodological development of large-scale network sampling by comparative evaluations of existing sampling methods, and for the theoretical understanding of human networks by highlighting discrepancies and contradictions between existing knowledge/assumptions of large-scale real OSN data.

  20. Graphical Understanding of Simple Feedback Systems.

    ERIC Educational Resources Information Center

    Janvier, Claude; Garancon, Maurice

    1989-01-01

    Shows that graphs can reveal much about feedback systems that formula conceal, especially as microcomputers can provide complex graphs presented as animations and allow students to interact easily with them. Describes feedback systems, evolution of the system, and phase diagram. (YP)

  1. FREQUENT SUBGRAPH MINING OF PERSONALIZED SIGNALING PATHWAY NETWORKS GROUPS PATIENTS WITH FREQUENTLY DYSREGULATED DISEASE PATHWAYS AND PREDICTS PROGNOSIS.

    PubMed

    Durmaz, Arda; Henderson, Tim A D; Brubaker, Douglas; Bebek, Gurkan

    2017-01-01

    Large scale genomics studies have generated comprehensive molecular characterization of numerous cancer types. Subtypes for many tumor types have been established; however, these classifications are based on molecular characteristics of a small gene sets with limited power to detect dysregulation at the patient level. We hypothesize that frequent graph mining of pathways to gather pathways functionally relevant to tumors can characterize tumor types and provide opportunities for personalized therapies. In this study we present an integrative omics approach to group patients based on their altered pathway characteristics and show prognostic differences within breast cancer (p < 9:57E - 10) and glioblastoma multiforme (p < 0:05) patients. We were able validate this approach in secondary RNA-Seq datasets with p < 0:05 and p < 0:01 respectively. We also performed pathway enrichment analysis to further investigate the biological relevance of dysregulated pathways. We compared our approach with network-based classifier algorithms and showed that our unsupervised approach generates more robust and biologically relevant clustering whereas previous approaches failed to report specific functions for similar patient groups or classify patients into prognostic groups. These results could serve as a means to improve prognosis for future cancer patients, and to provide opportunities for improved treatment options and personalized interventions. The proposed novel graph mining approach is able to integrate PPI networks with gene expression in a biologically sound approach and cluster patients in to clinically distinct groups. We have utilized breast cancer and glioblastoma multiforme datasets from microarray and RNA-Seq platforms and identified disease mechanisms differentiating samples. Supplementary methods, figures, tables and code are available at https://github.com/bebeklab/dysprog.

  2. Human Rights Event Detection from Heterogeneous Social Media Graphs.

    PubMed

    Chen, Feng; Neill, Daniel B

    2015-03-01

    Human rights organizations are increasingly monitoring social media for identification, verification, and documentation of human rights violations. Since manual extraction of events from the massive amount of online social network data is difficult and time-consuming, we propose an approach for automated, large-scale discovery and analysis of human rights-related events. We apply our recently developed Non-Parametric Heterogeneous Graph Scan (NPHGS), which models social media data such as Twitter as a heterogeneous network (with multiple different node types, features, and relationships) and detects emerging patterns in the network, to identify and characterize human rights events. NPHGS efficiently maximizes a nonparametric scan statistic (an aggregate measure of anomalousness) over connected subgraphs of the heterogeneous network to identify the most anomalous network clusters. It summarizes each event with information such as type of event, geographical locations, time, and participants, and provides documentation such as links to videos and news reports. Building on our previous work that demonstrates the utility of NPHGS for civil unrest prediction and rare disease outbreak detection, we present an analysis of human rights events detected by NPHGS using two years of Twitter data from Mexico. NPHGS was able to accurately detect relevant clusters of human rights-related tweets prior to international news sources, and in some cases, prior to local news reports. Analysis of social media using NPHGS could enhance the information-gathering missions of human rights organizations by pinpointing specific abuses, revealing events and details that may be blocked from traditional media sources, and providing evidence of emerging patterns of human rights violations. This could lead to more timely, targeted, and effective advocacy, as well as other potential interventions.

  3. A Natural Language Interface Concordant with a Knowledge Base.

    PubMed

    Han, Yong-Jin; Park, Seong-Bae; Park, Se-Young

    2016-01-01

    The discordance between expressions interpretable by a natural language interface (NLI) system and those answerable by a knowledge base is a critical problem in the field of NLIs. In order to solve this discordance problem, this paper proposes a method to translate natural language questions into formal queries that can be generated from a graph-based knowledge base. The proposed method considers a subgraph of a knowledge base as a formal query. Thus, all formal queries corresponding to a concept or a predicate in the knowledge base can be generated prior to query time and all possible natural language expressions corresponding to each formal query can also be collected in advance. A natural language expression has a one-to-one mapping with a formal query. Hence, a natural language question is translated into a formal query by matching the question with the most appropriate natural language expression. If the confidence of this matching is not sufficiently high the proposed method rejects the question and does not answer it. Multipredicate queries are processed by regarding them as a set of collected expressions. The experimental results show that the proposed method thoroughly handles answerable questions from the knowledge base and rejects unanswerable ones effectively.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Madduri, Kamesh; Wu, Kesheng

    The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scienti c data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multi-graph. SPARQL is a query language for RDF that expresses subgraph pattern- nding queries on this implicit multigraph in a SQL- like syntax. SPARQL queries generate complex intermediate join queries; to compute these joins e ciently, we propose a new strategy based on bitmap indexes. We store the RDF data in column-oriented structures as compressed bitmaps along with two dictionaries. This papermore » makes three new contributions. (i) We present an e cient parallel strategy for parsing the raw RDF data, building dictionaries of unique entities, and creating compressed bitmap indexes of the data. (ii) We utilize the constructed bitmap indexes to e ciently answer SPARQL queries, simplifying the join evaluations. (iii) To quantify the performance impact of using bitmap indexes, we compare our approach to the state-of-the-art triple-store RDF-3X. We nd that our bitmap index-based approach to answering queries is up to an order of magnitude faster for a variety of SPARQL queries, on gigascale RDF data sets.« less

  5. Network Approach to Disease Diagnosis

    NASA Astrophysics Data System (ADS)

    Sharma, Amitabh; Bashan, Amir; Barabasi, Alber-Laszlo

    2014-03-01

    Human diseases could be viewed as perturbations of the underlying biological system. A thorough understanding of the topological and dynamical properties of the biological system is crucial to explain the mechanisms of many complex diseases. Recently network-based approaches have provided a framework for integrating multi-dimensional biological data that results in a better understanding of the pathophysiological state of complex diseases. Here we provide a network-based framework to improve the diagnosis of complex diseases. This framework is based on the integration of transcriptomics and the interactome. We analyze the overlap between the differentially expressed (DE) genes and disease genes (DGs) based on their locations in the molecular interaction network (''interactome''). Disease genes and their protein products tend to be much more highly connected than random, hence defining a disease sub-graph (called disease module) in the interactome. DE genes, even though different from the known set of DGs, may be significantly associated with the disease when considering their closeness to the disease module in the interactome. This new network approach holds the promise to improve the diagnosis of patients who cannot be diagnosed using conventional tools. Support was provided by HL066289 and HL105339 grants from the U.S. National Institutes of Health.

  6. Beyond Aztec Castles: Toric Cascades in the dP 3 Quiver

    NASA Astrophysics Data System (ADS)

    Lai, Tri; Musiker, Gregg

    2017-12-01

    Given one of an infinite class of supersymmetric quiver gauge theories, string theorists can associate a corresponding toric variety (which is a Calabi-Yau 3-fold) as well as an associated combinatorial model known as a brane tiling. In combinatorial language, a brane tiling is a bipartite graph on a torus and its perfect matchings are of interest to both combinatorialists and physicists alike. A cluster algebra may also be associated to such quivers and in this paper we study the generators of this algebra, known as cluster variables, for the quiver associated to the cone over the del Pezzo surface d P 3. In particular, mutation sequences involving mutations exclusively at vertices with two in-coming arrows and two out-going arrows are referred to as toric cascades in the string theory literature. Such toric cascades give rise to interesting discrete integrable systems on the level of cluster variable dynamics. We provide an explicit algebraic formula for all cluster variables that are reachable by toric cascades as well as a combinatorial interpretation involving perfect matchings of subgraphs of the d P 3 brane tiling for these formulas in most cases.

  7. Interactive Web Graphs for Economic Principles.

    ERIC Educational Resources Information Center

    Kaufman, Dennis A.; Kaufman, Rebecca S.

    2002-01-01

    Describes a Web site with animation and interactive activities containing graphs and basic economics concepts. Features changes in supply and market equilibrium, the construction of the long-run average cost curve, short-run profit maximization, long-run market equilibrium, and changes in aggregate demand and aggregate supply. States the…

  8. Docking pose selection by interaction pattern graph similarity: application to the D3R grand challenge 2015.

    PubMed

    Slynko, Inna; Da Silva, Franck; Bret, Guillaume; Rognan, Didier

    2016-09-01

    High affinity ligands for a given target tend to share key molecular interactions with important anchoring amino acids and therefore often present quite conserved interaction patterns. This simple concept was formalized in a topological knowledge-based scoring function (GRIM) for selecting the most appropriate docking poses from previously X-rayed interaction patterns. GRIM first converts protein-ligand atomic coordinates (docking poses) into a simple 3D graph describing the corresponding interaction pattern. In a second step, proposed graphs are compared to that found from template structures in the Protein Data Bank. Last, all docking poses are rescored according to an empirical score (GRIMscore) accounting for overlap of maximum common subgraphs. Taking the opportunity of the public D3R Grand Challenge 2015, GRIM was used to rescore docking poses for 36 ligands (6 HSP90α inhibitors, 30 MAP4K4 inhibitors) prior to the release of the corresponding protein-ligand X-ray structures. When applied to the HSP90α dataset, for which many protein-ligand X-ray structures are already available, GRIM provided very high quality solutions (mean rmsd = 1.06 Å, n = 6) as top-ranked poses, and significantly outperformed a state-of-the-art scoring function. In the case of MAP4K4 inhibitors, for which preexisting 3D knowledge is scarce and chemical diversity is much larger, the accuracy of GRIM poses decays (mean rmsd = 3.18 Å, n = 30) although GRIM still outperforms an energy-based scoring function. GRIM rescoring appears to be quite robust with comparison to the other approaches competing for the same challenge (42 submissions for the HSP90 dataset, 27 for the MAP4K4 dataset) as it ranked 3rd and 2nd respectively, for the two investigated datasets. The rescoring method is quite simple to implement, independent on a docking engine, and applicable to any target for which at least one holo X-ray structure is available.

  9. Animal Adaptation and Acclimatization to Cold

    ERIC Educational Resources Information Center

    Phillips, R. E.; Watson, C. A.

    1977-01-01

    This article examines the mechanisms of adaptation and acclimatization of animals to cold temperatures. The differences between the two processes are explained and terms are defined. Graphs and drawings illustrate the article. (MA)

  10. Contact tracing for the control of infectious disease epidemics: Chronic Wasting Disease in deer farms.

    PubMed

    Rorres, Chris; Romano, Maria; Miller, Jennifer A; Mossey, Jana M; Grubesic, Tony H; Zellner, David E; Smith, Gary

    2018-06-01

    Contact tracing is a crucial component of the control of many infectious diseases, but is an arduous and time consuming process. Procedures that increase the efficiency of contact tracing increase the chance that effective controls can be implemented sooner and thus reduce the magnitude of the epidemic. We illustrate a procedure using Graph Theory in the context of infectious disease epidemics of farmed animals in which the epidemics are driven mainly by the shipment of animals between farms. Specifically, we created a directed graph of the recorded shipments of deer between deer farms in Pennsylvania over a timeframe and asked how the properties of the graph could be exploited to make contact tracing more efficient should Chronic Wasting Disease (a prion disease of deer) be discovered in one of the farms. We show that the presence of a large strongly connected component in the graph has a significant impact on the number of contacts that can arise. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  11. Disease management research using event graphs.

    PubMed

    Allore, H G; Schruben, L W

    2000-08-01

    Event Graphs, conditional representations of stochastic relationships between discrete events, simulate disease dynamics. In this paper, we demonstrate how Event Graphs, at an appropriate abstraction level, also extend and organize scientific knowledge about diseases. They can identify promising treatment strategies and directions for further research and provide enough detail for testing combinations of new medicines and interventions. Event Graphs can be enriched to incorporate and validate data and test new theories to reflect an expanding dynamic scientific knowledge base and establish performance criteria for the economic viability of new treatments. To illustrate, an Event Graph is developed for mastitis, a costly dairy cattle disease, for which extensive scientific literature exists. With only a modest amount of imagination, the methodology presented here can be seen to apply modeling to any disease, human, plant, or animal. The Event Graph simulation presented here is currently being used in research and in a new veterinary epidemiology course. Copyright 2000 Academic Press.

  12. A novel information cascade model in online social networks

    NASA Astrophysics Data System (ADS)

    Tong, Chao; He, Wenbo; Niu, Jianwei; Xie, Zhongyu

    2016-02-01

    The spread and diffusion of information has become one of the hot issues in today's social network analysis. To analyze the spread of online social network information and the attribute of cascade, in this paper, we discuss the spread of two kinds of users' decisions for city-wide activities, namely the "want to take part in the activity" and "be interested in the activity", based on the users' attention in "DouBan" and the data of the city-wide activities. We analyze the characteristics of the activity-decision's spread in these aspects: the scale and scope of the cascade subgraph, the structure characteristic of the cascade subgraph, the topological attribute of spread tree, and the occurrence frequency of cascade subgraph. On this basis, we propose a new information spread model. Based on the classical independent diffusion model, we introduce three mechanisms, equal probability, similarity of nodes, and popularity of nodes, which can generate and affect the spread of information. Besides, by conducting the experiments in six different kinds of network data set, we compare the effects of three mechanisms above mentioned, totally six specific factors, on the spread of information, and put forward that the node's popularity plays an important role in the information spread.

  13. Locating overlapping dense subgraphs in gene (protein) association networks and predicting novel protein functional groups among these subgraphs

    NASA Astrophysics Data System (ADS)

    Palla, Gergely; Derenyi, Imre; Farkas, Illes J.; Vicsek, Tamas

    2006-03-01

    Most tasks in a cell are performed not by individual proteins, but by functional groups of proteins (either physically interacting with each other or associated in other ways). In gene (protein) association networks these groups show up as sets of densely connected nodes. In the yeast, Saccharomyces cerevisiae, known physically interacting groups of proteins (called protein complexes) strongly overlap: the total number of proteins contained by these complexes by far underestimates the sum of their sizes (2750 vs. 8932). Thus, most functional groups of proteins, both physically interacting and other, are likely to share many of their members with other groups. However, current algorithms searching for dense groups of nodes in networks usually exclude overlaps. With the aim to discover both novel functions of individual proteins and novel protein functional groups we combine in protein association networks (i) a search for overlapping dense subgraphs based on the Clique Percolation Method (CPM) (Palla, G., et.al. Nature 435, 814-818 (2005), http://angel.elte.hu/clustering), which explicitly allows for overlaps among the groups, and (ii) a verification and characterization of the identified groups of nodes (proteins) with the help of standard annotation databases listing known functions.

  14. Perspective on Models in Theoretical and Practical Traditions of Knowledge: The Example of Otto Engine Animations

    ERIC Educational Resources Information Center

    Haglund, Jesper; Stromdahl, Helge

    2012-01-01

    Nineteen informants (n = 19) were asked to study and comment two computer animations of the Otto combustion engine. One animation was non-interactive and realistic in the sense of depicting a physical engine. The other animation was more idealised, interactive and synchronised with a dynamic PV-graph. The informants represented practical and…

  15. SimGraph: A Flight Simulation Data Visualization Workstation

    NASA Technical Reports Server (NTRS)

    Kaplan, Joseph A.; Kenney, Patrick S.

    1997-01-01

    Today's modern flight simulation research produces vast amounts of time sensitive data, making a qualitative analysis of the data difficult while it remains in a numerical representation. Therefore, a method of merging related data together and presenting it to the user in a more comprehensible format is necessary. Simulation Graphics (SimGraph) is an object-oriented data visualization software package that presents simulation data in animated graphical displays for easy interpretation. Data produced from a flight simulation is presented by SimGraph in several different formats, including: 3-Dimensional Views, Cockpit Control Views, Heads-Up Displays, Strip Charts, and Status Indicators. SimGraph can accommodate the addition of new graphical displays to allow the software to be customized to each user s particular environment. A new display can be developed and added to SimGraph without having to design a new application, allowing the graphics programmer to focus on the development of the graphical display. The SimGraph framework can be reused for a wide variety of visualization tasks. Although it was created for the flight simulation facilities at NASA Langley Research Center, SimGraph can be reconfigured to almost any data visualization environment. This paper describes the capabilities and operations of SimGraph.

  16. Assortativity and leadership emerge from anti-preferential attachment in heterogeneous networks.

    PubMed

    Sendiña-Nadal, I; Danziger, M M; Wang, Z; Havlin, S; Boccaletti, S

    2016-02-18

    Real-world networks have distinct topologies, with marked deviations from purely random networks. Many of them exhibit degree-assortativity, with nodes of similar degree more likely to link to one another. Though microscopic mechanisms have been suggested for the emergence of other topological features, assortativity has proven elusive. Assortativity can be artificially implanted in a network via degree-preserving link permutations, however this destroys the graph's hierarchical clustering and does not correspond to any microscopic mechanism. Here, we propose the first generative model which creates heterogeneous networks with scale-free-like properties in degree and clustering distributions and tunable realistic assortativity. Two distinct populations of nodes are incrementally added to an initial network by selecting a subgraph to connect to at random. One population (the followers) follows preferential attachment, while the other population (the potential leaders) connects via anti-preferential attachment: they link to lower degree nodes when added to the network. By selecting the lower degree nodes, the potential leader nodes maintain high visibility during the growth process, eventually growing into hubs. The evolution of links in Facebook empirically validates the connection between the initial anti-preferential attachment and long term high degree. In this way, our work sheds new light on the structure and evolution of social networks.

  17. A tensorial approach to access cognitive workload related to mental arithmetic from EEG functional connectivity estimates.

    PubMed

    Dimitriadis, S I; Sun, Yu; Kwok, K; Laskaris, N A; Bezerianos, A

    2013-01-01

    The association of functional connectivity patterns with particular cognitive tasks has long been a topic of interest in neuroscience, e.g., studies of functional connectivity have demonstrated its potential use for decoding various brain states. However, the high-dimensionality of the pairwise functional connectivity limits its usefulness in some real-time applications. In the present study, the methodology of tensor subspace analysis (TSA) is used to reduce the initial high-dimensionality of the pairwise coupling in the original functional connectivity network to a space of condensed descriptive power, which would significantly decrease the computational cost and facilitate the differentiation of brain states. We assess the feasibility of the proposed method on EEG recordings when the subject was performing mental arithmetic task which differ only in the difficulty level (easy: 1-digit addition v.s. 3-digit additions). Two different cortical connective networks were detected, and by comparing the functional connectivity networks in different work states, it was found that the task-difficulty is best reflected in the connectivity structure of sub-graphs extending over parietooccipital sites. Incorporating this data-driven information within original TSA methodology, we succeeded in predicting the difficulty level from connectivity patterns in an efficient way that can be implemented so as to work in real-time.

  18. Robust cell tracking in epithelial tissues through identification of maximum common subgraphs

    PubMed Central

    Bardenet, Rémi; Zartman, Jeremiah J.; Baker, Ruth E.

    2016-01-01

    Tracking of cells in live-imaging microscopy videos of epithelial sheets is a powerful tool for investigating fundamental processes in embryonic development. Characterizing cell growth, proliferation, intercalation and apoptosis in epithelia helps us to understand how morphogenetic processes such as tissue invagination and extension are locally regulated and controlled. Accurate cell tracking requires correctly resolving cells entering or leaving the field of view between frames, cell neighbour exchanges, cell removals and cell divisions. However, current tracking methods for epithelial sheets are not robust to large morphogenetic deformations and require significant manual interventions. Here, we present a novel algorithm for epithelial cell tracking, exploiting the graph-theoretic concept of a ‘maximum common subgraph’ to track cells between frames of a video. Our algorithm does not require the adjustment of tissue-specific parameters, and scales in sub-quadratic time with tissue size. It does not rely on precise positional information, permitting large cell movements between frames and enabling tracking in datasets acquired at low temporal resolution due to experimental constraints such as phototoxicity. To demonstrate the method, we perform tracking on the Drosophila embryonic epidermis and compare cell–cell rearrangements to previous studies in other tissues. Our implementation is open source and generally applicable to epithelial tissues. PMID:28334699

  19. Characterization of known protein complexes using k-connectivity and other topological measures

    PubMed Central

    Gallagher, Suzanne R; Goldberg, Debra S

    2015-01-01

    Many protein complexes are densely packed, so proteins within complexes often interact with several other proteins in the complex. Steric constraints prevent most proteins from simultaneously binding more than a handful of other proteins, regardless of the number of proteins in the complex. Because of this, as complex size increases, several measures of the complex decrease within protein-protein interaction networks. However, k-connectivity, the number of vertices or edges that need to be removed in order to disconnect a graph, may be consistently high for protein complexes. The property of k-connectivity has been little used previously in the investigation of protein-protein interactions. To understand the discriminative power of k-connectivity and other topological measures for identifying unknown protein complexes, we characterized these properties in known Saccharomyces cerevisiae protein complexes in networks generated both from highly accurate X-ray crystallography experiments which give an accurate model of each complex, and also as the complexes appear in high-throughput yeast 2-hybrid studies in which new complexes may be discovered. We also computed these properties for appropriate random subgraphs.We found that clustering coefficient, mutual clustering coefficient, and k-connectivity are better indicators of known protein complexes than edge density, degree, or betweenness. This suggests new directions for future protein complex-finding algorithms. PMID:26913183

  20. Theoretic derivation of directed acyclic subgraph algorithm and comparisons with message passing algorithm

    NASA Astrophysics Data System (ADS)

    Ha, Jeongmok; Jeong, Hong

    2016-07-01

    This study investigates the directed acyclic subgraph (DAS) algorithm, which is used to solve discrete labeling problems much more rapidly than other Markov-random-field-based inference methods but at a competitive accuracy. However, the mechanism by which the DAS algorithm simultaneously achieves competitive accuracy and fast execution speed, has not been elucidated by a theoretical derivation. We analyze the DAS algorithm by comparing it with a message passing algorithm. Graphical models, inference methods, and energy-minimization frameworks are compared between DAS and message passing algorithms. Moreover, the performances of DAS and other message passing methods [sum-product belief propagation (BP), max-product BP, and tree-reweighted message passing] are experimentally compared.

  1. Animals & Livestock | National Agricultural Library

    Science.gov Websites

    Skip to main content Home National Agricultural Library United States Department of Agriculture Ag (maps, tables, graphs), Agricultural Products html National Animal Nutrition Program (NANP) Feed | Agricultural Research Service | Plain Language | FOIA | Accessibility Statement | Information Quality | Privacy

  2. A Weighted and Directed Interareal Connectivity Matrix for Macaque Cerebral Cortex

    PubMed Central

    Markov, N. T.; Ercsey-Ravasz, M. M.; Ribeiro Gomes, A. R.; Lamy, C.; Magrou, L.; Vezoli, J.; Misery, P.; Falchier, A.; Quilodran, R.; Gariel, M. A.; Sallet, J.; Gamanut, R.; Huissoud, C.; Clavagnier, S.; Giroud, P.; Sappey-Marinier, D.; Barone, P.; Dehay, C.; Toroczkai, Z.; Knoblauch, K.; Van Essen, D. C.; Kennedy, H.

    2014-01-01

    Retrograde tracer injections in 29 of the 91 areas of the macaque cerebral cortex revealed 1,615 interareal pathways, a third of which have not previously been reported. A weight index (extrinsic fraction of labeled neurons [FLNe]) was determined for each area-to-area pathway. Newly found projections were weaker on average compared with the known projections; nevertheless, the 2 sets of pathways had extensively overlapping weight distributions. Repeat injections across individuals revealed modest FLNe variability given the range of FLNe values (standard deviation <1 log unit, range 5 log units). The connectivity profile for each area conformed to a lognormal distribution, where a majority of projections are moderate or weak in strength. In the G29 × 29 interareal subgraph, two-thirds of the connections that can exist do exist. Analysis of the smallest set of areas that collects links from all 91 nodes of the G29 × 91 subgraph (dominating set analysis) confirms the dense (66%) structure of the cortical matrix. The G29 × 29 subgraph suggests an unexpectedly high incidence of unidirectional links. The directed and weighted G29 × 91 connectivity matrix for the macaque will be valuable for comparison with connectivity analyses in other species, including humans. It will also inform future modeling studies that explore the regularities of cortical networks. PMID:23010748

  3. Event-based criteria in GT-STAF information indices: theory, exploratory diversity analysis and QSPR applications.

    PubMed

    Barigye, S J; Marrero-Ponce, Y; Martínez López, Y; Martínez Santiago, O; Torrens, F; García Domenech, R; Galvez, J

    2013-01-01

    Versatile event-based approaches for the definition of novel information theory-based indices (IFIs) are presented. An event in this context is the criterion followed in the "discovery" of molecular substructures, which in turn serve as basis for the construction of the generalized incidence and relations frequency matrices, Q and F, respectively. From the resultant F, Shannon's, mutual, conditional and joint entropy-based IFIs are computed. In previous reports, an event named connected subgraphs was presented. The present study is an extension of this notion, in which we introduce other events, namely: terminal paths, vertex path incidence, quantum subgraphs, walks of length k, Sach's subgraphs, MACCs, E-state and substructure fingerprints and, finally, Ghose and Crippen atom-types for hydrophobicity and refractivity. Moreover, we define magnitude-based IFIs, introducing the use of the magnitude criterion in the definition of mutual, conditional and joint entropy-based IFIs. We also discuss the use of information-theoretic parameters as a measure of the dissimilarity of codified structural information of molecules. Finally, a comparison of the statistics for QSPR models obtained with the proposed IFIs and DRAGON's molecular descriptors for two physicochemical properties log P and log K of 34 derivatives of 2-furylethylenes demonstrates similar to better predictive ability than the latter.

  4. Using animated computer-generated text and graphics to depict the risks and benefits of medical treatment.

    PubMed

    Tait, Alan R; Voepel-Lewis, Terri; Brennan-Martinez, Colleen; McGonegal, Maureen; Levine, Robert

    2012-11-01

    Conventional print materials for presenting risks and benefits of treatment are often difficult to understand. This study was undertaken to evaluate and compare subjects' understanding and perceptions of risks and benefits presented using animated computerized text and graphics. Adult subjects were randomized to receive identical risk/benefit information regarding taking statins that was presented on an iPad (Apple Corp, Cupertino, Calif) in 1 of 4 different animated formats: text/numbers, pie chart, bar graph, and pictograph. Subjects completed a questionnaire regarding their preferences and perceptions of the message delivery together with their understanding of the information. Health literacy, numeracy, and need for cognition were measured using validated instruments. There were no differences in subject understanding based on the different formats. However, significantly more subjects preferred graphs (82.5%) compared with text (17.5%, P<.001). Specifically, subjects preferred pictographs (32.0%) and bar graphs (31.0%) over pie charts (19.5%) and text (17.5%). Subjects whose preference for message delivery matched their randomly assigned format (preference match) had significantly greater understanding and satisfaction compared with those assigned to something other than their preference. Results showed that computer-animated depictions of risks and benefits offer an effective means to describe medical risk/benefit statistics. That understanding and satisfaction were significantly better when the format matched the individual's preference for message delivery is important and reinforces the value of "tailoring" information to the individual's needs and preferences. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. Math Description Engine Software Development Kit

    NASA Technical Reports Server (NTRS)

    Shelton, Robert O.; Smith, Stephanie L.; Dexter, Dan E.; Hodgson, Terry R.

    2010-01-01

    The Math Description Engine Software Development Kit (MDE SDK) can be used by software developers to make computer-rendered graphs more accessible to blind and visually-impaired users. The MDE SDK generates alternative graph descriptions in two forms: textual descriptions and non-verbal sound renderings, or sonification. It also enables display of an animated trace of a graph sonification on a visual graph component, with color and line-thickness options for users having low vision or color-related impairments. A set of accessible graphical user interface widgets is provided for operation by end users and for control of accessible graph displays. Version 1.0 of the MDE SDK generates text descriptions for 2D graphs commonly seen in math and science curriculum (and practice). The mathematically rich text descriptions can also serve as a virtual math and science assistant for blind and sighted users, making graphs more accessible for everyone. The MDE SDK has a simple application programming interface (API) that makes it easy for programmers and Web-site developers to make graphs accessible with just a few lines of code. The source code is written in Java for cross-platform compatibility and to take advantage of Java s built-in support for building accessible software application interfaces. Compiled-library and NASA Open Source versions are available with API documentation and Programmer s Guide at http:/ / prim e.jsc.n asa. gov.

  6. Supporting Multimedia Learning with Visual Signalling and Animated Pedagogical Agent: Moderating Effects of Prior Knowledge

    ERIC Educational Resources Information Center

    Johnson, A. M.; Ozogul, G.; Reisslein, M.

    2015-01-01

    An experiment examined the effects of visual signalling to relevant information in multiple external representations and the visual presence of an animated pedagogical agent (APA). Students learned electric circuit analysis using a computer-based learning environment that included Cartesian graphs, equations and electric circuit diagrams. The…

  7. How the borderless science of electropenetrography (EPG) can benefit animal-disease vector research: overview of EPG history, principles, and applications

    USDA-ARS?s Scientific Manuscript database

    Studying feeding, host injury, and transmission of plant or animal pathogens by vectors is challenging. Vector piercing-sucking mouthparts are probed into opaque tissues, precluding direct observation. Over fifty years ago, this challenge was overcome by development of electrical penetration graph t...

  8. Trend Motif: A Graph Mining Approach for Analysis of Dynamic Complex Networks

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jin, R; McCallen, S; Almaas, E

    2007-05-28

    Complex networks have been used successfully in scientific disciplines ranging from sociology to microbiology to describe systems of interacting units. Until recently, studies of complex networks have mainly focused on their network topology. However, in many real world applications, the edges and vertices have associated attributes that are frequently represented as vertex or edge weights. Furthermore, these weights are often not static, instead changing with time and forming a time series. Hence, to fully understand the dynamics of the complex network, we have to consider both network topology and related time series data. In this work, we propose a motifmore » mining approach to identify trend motifs for such purposes. Simply stated, a trend motif describes a recurring subgraph where each of its vertices or edges displays similar dynamics over a userdefined period. Given this, each trend motif occurrence can help reveal significant events in a complex system; frequent trend motifs may aid in uncovering dynamic rules of change for the system, and the distribution of trend motifs may characterize the global dynamics of the system. Here, we have developed efficient mining algorithms to extract trend motifs. Our experimental validation using three disparate empirical datasets, ranging from the stock market, world trade, to a protein interaction network, has demonstrated the efficiency and effectiveness of our approach.« less

  9. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks.

    PubMed

    Li, Min; Li, Dongyan; Tang, Yu; Wu, Fangxiang; Wang, Jianxin

    2017-08-31

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster.

  10. CytoCluster: A Cytoscape Plugin for Cluster Analysis and Visualization of Biological Networks

    PubMed Central

    Li, Min; Li, Dongyan; Tang, Yu; Wang, Jianxin

    2017-01-01

    Nowadays, cluster analysis of biological networks has become one of the most important approaches to identifying functional modules as well as predicting protein complexes and network biomarkers. Furthermore, the visualization of clustering results is crucial to display the structure of biological networks. Here we present CytoCluster, a cytoscape plugin integrating six clustering algorithms, HC-PIN (Hierarchical Clustering algorithm in Protein Interaction Networks), OH-PIN (identifying Overlapping and Hierarchical modules in Protein Interaction Networks), IPCA (Identifying Protein Complex Algorithm), ClusterONE (Clustering with Overlapping Neighborhood Expansion), DCU (Detecting Complexes based on Uncertain graph model), IPC-MCE (Identifying Protein Complexes based on Maximal Complex Extension), and BinGO (the Biological networks Gene Ontology) function. Users can select different clustering algorithms according to their requirements. The main function of these six clustering algorithms is to detect protein complexes or functional modules. In addition, BinGO is used to determine which Gene Ontology (GO) categories are statistically overrepresented in a set of genes or a subgraph of a biological network. CytoCluster can be easily expanded, so that more clustering algorithms and functions can be added to this plugin. Since it was created in July 2013, CytoCluster has been downloaded more than 9700 times in the Cytoscape App store and has already been applied to the analysis of different biological networks. CytoCluster is available from http://apps.cytoscape.org/apps/cytocluster. PMID:28858211

  11. Machine-Learning Classifier for Patients with Major Depressive Disorder: Multifeature Approach Based on a High-Order Minimum Spanning Tree Functional Brain Network.

    PubMed

    Guo, Hao; Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie

    2017-01-01

    High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%.

  12. Machine-Learning Classifier for Patients with Major Depressive Disorder: Multifeature Approach Based on a High-Order Minimum Spanning Tree Functional Brain Network

    PubMed Central

    Qin, Mengna; Chen, Junjie; Xu, Yong; Xiang, Jie

    2017-01-01

    High-order functional connectivity networks are rich in time information that can reflect dynamic changes in functional connectivity between brain regions. Accordingly, such networks are widely used to classify brain diseases. However, traditional methods for processing high-order functional connectivity networks generally include the clustering method, which reduces data dimensionality. As a result, such networks cannot be effectively interpreted in the context of neurology. Additionally, due to the large scale of high-order functional connectivity networks, it can be computationally very expensive to use complex network or graph theory to calculate certain topological properties. Here, we propose a novel method of generating a high-order minimum spanning tree functional connectivity network. This method increases the neurological significance of the high-order functional connectivity network, reduces network computing consumption, and produces a network scale that is conducive to subsequent network analysis. To ensure the quality of the topological information in the network structure, we used frequent subgraph mining technology to capture the discriminative subnetworks as features and combined this with quantifiable local network features. Then we applied a multikernel learning technique to the corresponding selected features to obtain the final classification results. We evaluated our proposed method using a data set containing 38 patients with major depressive disorder and 28 healthy controls. The experimental results showed a classification accuracy of up to 97.54%. PMID:29387141

  13. Comparison of Point Matching Techniques for Road Network Matching

    NASA Astrophysics Data System (ADS)

    Hackeloeer, A.; Klasing, K.; Krisp, J. M.; Meng, L.

    2013-05-01

    Map conflation investigates the unique identification of geographical entities across different maps depicting the same geographic region. It involves a matching process which aims to find commonalities between geographic features. A specific subdomain of conflation called Road Network Matching establishes correspondences between road networks of different maps on multiple layers of abstraction, ranging from elementary point locations to high-level structures such as road segments or even subgraphs derived from the induced graph of a road network. The process of identifying points located on different maps by means of geometrical, topological and semantical information is called point matching. This paper provides an overview of various techniques for point matching, which is a fundamental requirement for subsequent matching steps focusing on complex high-level entities in geospatial networks. Common point matching approaches as well as certain combinations of these are described, classified and evaluated. Furthermore, a novel similarity metric called the Exact Angular Index is introduced, which considers both topological and geometrical aspects. The results offer a basis for further research on a bottom-up matching process for complex map features, which must rely upon findings derived from suitable point matching algorithms. In the context of Road Network Matching, reliable point matches provide an immediate starting point for finding matches between line segments describing the geometry and topology of road networks, which may in turn be used for performing a structural high-level matching on the network level.

  14. Unified Photo Enhancement by Discovering Aesthetic Communities From Flickr.

    PubMed

    Hong, Richang; Zhang, Luming; Tao, Dacheng

    2016-03-01

    Photo enhancement refers to the process of increasing the aesthetic appeal of a photo, such as changing the photo aspect ratio and spatial recomposition. It is a widely used technique in the printing industry, graphic design, and cinematography. In this paper, we propose a unified and socially aware photo enhancement framework which can leverage the experience of photographers with various aesthetic topics (e.g., portrait and landscape). We focus on photos from the image hosting site Flickr, which has 87 million users and to which more than 3.5 million photos are uploaded daily. First, a tagwise regularized topic model is proposed to describe the aesthetic topic of each Flickr user, and coherent and interpretable topics are discovered by leveraging both the visual features and tags of photos. Next, a graph is constructed to describe the similarities in aesthetic topics between the users. Noticeably, densely connected users have similar aesthetic topics, which are categorized into different communities by a dense subgraph mining algorithm. Finally, a probabilistic model is exploited to enhance the aesthetic attractiveness of a test photo by leveraging the photographic experiences of Flickr users from the corresponding communities of that photo. Paired-comparison-based user studies show that our method performs competitively on photo retargeting and recomposition. Moreover, our approach accurately detects aesthetic communities in a photo set crawled from nearly 100000 Flickr users.

  15. Estimating the Size of a Large Network and its Communities from a Random Sample

    PubMed Central

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W.

    2017-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = (V, E) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G(W) be the induced subgraph in G of the vertices in W. In addition to G(W), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K, and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios. PMID:28867924

  16. Estimating the Size of a Large Network and its Communities from a Random Sample.

    PubMed

    Chen, Lin; Karbasi, Amin; Crawford, Forrest W

    2016-01-01

    Most real-world networks are too large to be measured or studied directly and there is substantial interest in estimating global network properties from smaller sub-samples. One of the most important global properties is the number of vertices/nodes in the network. Estimating the number of vertices in a large network is a major challenge in computer science, epidemiology, demography, and intelligence analysis. In this paper we consider a population random graph G = ( V, E ) from the stochastic block model (SBM) with K communities/blocks. A sample is obtained by randomly choosing a subset W ⊆ V and letting G ( W ) be the induced subgraph in G of the vertices in W . In addition to G ( W ), we observe the total degree of each sampled vertex and its block membership. Given this partial information, we propose an efficient PopULation Size Estimation algorithm, called PULSE, that accurately estimates the size of the whole population as well as the size of each community. To support our theoretical analysis, we perform an exhaustive set of experiments to study the effects of sample size, K , and SBM model parameters on the accuracy of the estimates. The experimental results also demonstrate that PULSE significantly outperforms a widely-used method called the network scale-up estimator in a wide variety of scenarios.

  17. Learning about learning: Mining human brain sub-network biomarkers from fMRI data

    PubMed Central

    Dereli, Nazli; Dang, Xuan-Hong; Bassett, Danielle S.; Wymbs, Nicholas F.; Grafton, Scott T.; Singh, Ambuj K.

    2017-01-01

    Modeling the brain as a functional network can reveal the relationship between distributed neurophysiological processes and functional interactions between brain structures. Existing literature on functional brain networks focuses mainly on a battery of network properties in “resting state” employing, for example, modularity, clustering, or path length among regions. In contrast, we seek to uncover functionally connected subnetworks that predict or correlate with cohort differences and are conserved within the subjects within a cohort. We focus on differences in both the rate of learning as well as overall performance in a sensorimotor task across subjects and develop a principled approach for the discovery of discriminative subgraphs of functional connectivity based on imaging acquired during practice. We discover two statistically significant subgraph regions: one involving multiple regions in the visual cortex and another involving the parietal operculum and planum temporale. High functional coherence in the former characterizes sessions in which subjects take longer to perform the task, while high coherence in the latter is associated with high learning rate (performance improvement across trials). Our proposed methodology is general, in that it can be applied to other cognitive tasks, to study learning or to differentiate between healthy patients and patients with neurological disorders, by revealing the salient interactions among brain regions associated with the observed global state. The discovery of such significant discriminative subgraphs promises a better data-driven understanding of the dynamic brain processes associated with high-level cognitive functions. PMID:29016686

  18. Learning about learning: Mining human brain sub-network biomarkers from fMRI data.

    PubMed

    Bogdanov, Petko; Dereli, Nazli; Dang, Xuan-Hong; Bassett, Danielle S; Wymbs, Nicholas F; Grafton, Scott T; Singh, Ambuj K

    2017-01-01

    Modeling the brain as a functional network can reveal the relationship between distributed neurophysiological processes and functional interactions between brain structures. Existing literature on functional brain networks focuses mainly on a battery of network properties in "resting state" employing, for example, modularity, clustering, or path length among regions. In contrast, we seek to uncover functionally connected subnetworks that predict or correlate with cohort differences and are conserved within the subjects within a cohort. We focus on differences in both the rate of learning as well as overall performance in a sensorimotor task across subjects and develop a principled approach for the discovery of discriminative subgraphs of functional connectivity based on imaging acquired during practice. We discover two statistically significant subgraph regions: one involving multiple regions in the visual cortex and another involving the parietal operculum and planum temporale. High functional coherence in the former characterizes sessions in which subjects take longer to perform the task, while high coherence in the latter is associated with high learning rate (performance improvement across trials). Our proposed methodology is general, in that it can be applied to other cognitive tasks, to study learning or to differentiate between healthy patients and patients with neurological disorders, by revealing the salient interactions among brain regions associated with the observed global state. The discovery of such significant discriminative subgraphs promises a better data-driven understanding of the dynamic brain processes associated with high-level cognitive functions.

  19. A simple rule for the evolution of cooperation on graphs and social networks.

    PubMed

    Ohtsuki, Hisashi; Hauert, Christoph; Lieberman, Erez; Nowak, Martin A

    2006-05-25

    A fundamental aspect of all biological systems is cooperation. Cooperative interactions are required for many levels of biological organization ranging from single cells to groups of animals. Human society is based to a large extent on mechanisms that promote cooperation. It is well known that in unstructured populations, natural selection favours defectors over cooperators. There is much current interest, however, in studying evolutionary games in structured populations and on graphs. These efforts recognize the fact that who-meets-whom is not random, but determined by spatial relationships or social networks. Here we describe a surprisingly simple rule that is a good approximation for all graphs that we have analysed, including cycles, spatial lattices, random regular graphs, random graphs and scale-free networks: natural selection favours cooperation, if the benefit of the altruistic act, b, divided by the cost, c, exceeds the average number of neighbours, k, which means b/c > k. In this case, cooperation can evolve as a consequence of 'social viscosity' even in the absence of reputation effects or strategic complexity.

  20. Paint a Playground

    ERIC Educational Resources Information Center

    Modern Schools, 1977

    1977-01-01

    A new graph transfer method will enable schools to increase the use of recreational surfaces by color coating them with maps of the United States, the world, and whimsical animal game designs. (Author/MLF)

  1. Development of a Flight Simulation Data Visualization Workstation

    NASA Technical Reports Server (NTRS)

    Kaplan, Joseph A.; Chen, Ronnie; Kenney, Patrick S.; Koval, Christopher M.; Hutchinson, Brian K.

    1996-01-01

    Today's moderm flight simulation research produces vast amounts of time sensitive data. The meaning of this data can be difficult to assess while in its raw format . Therefore, a method of breaking the data down and presenting it to the user in a graphical format is necessary. Simulation Graphics (SimGraph) is intended as a data visualization software package that will incorporate simulation data into a variety of animated graphical displays for easy interpretation by the simulation researcher. Although it was created for the flight simulation facilities at NASA Langley Research Center, SimGraph can be reconfigured to almost any data visualization environment. This paper traces the design, development and implementation of the SimGraph program, and is intended to be a programmer's reference guide.

  2. Prediction of missing common genes for disease pairs using network based module separation on incomplete human interactome.

    PubMed

    Akram, Pakeeza; Liao, Li

    2017-12-06

    Identification of common genes associated with comorbid diseases can be critical in understanding their pathobiological mechanism. This work presents a novel method to predict missing common genes associated with a disease pair. Searching for missing common genes is formulated as an optimization problem to minimize network based module separation from two subgraphs produced by mapping genes associated with disease onto the interactome. Using cross validation on more than 600 disease pairs, our method achieves significantly higher average receiver operating characteristic ROC Score of 0.95 compared to a baseline ROC score 0.60 using randomized data. Missing common genes prediction is aimed to complete gene set associated with comorbid disease for better understanding of biological intervention. It will also be useful for gene targeted therapeutics related to comorbid diseases. This method can be further considered for prediction of missing edges to complete the subgraph associated with disease pair.

  3. Using new edges for anomaly detection in computer networks

    DOEpatents

    Neil, Joshua Charles

    2017-07-04

    Creation of new edges in a network may be used as an indication of a potential attack on the network. Historical data of a frequency with which nodes in a network create and receive new edges may be analyzed. Baseline models of behavior among the edges in the network may be established based on the analysis of the historical data. A new edge that deviates from a respective baseline model by more than a predetermined threshold during a time window may be detected. The new edge may be flagged as potentially anomalous when the deviation from the respective baseline model is detected. Probabilities for both new and existing edges may be obtained for all edges in a path or other subgraph. The probabilities may then be combined to obtain a score for the path or other subgraph. A threshold may be obtained by calculating an empirical distribution of the scores under historical conditions.

  4. Using new edges for anomaly detection in computer networks

    DOEpatents

    Neil, Joshua Charles

    2015-05-19

    Creation of new edges in a network may be used as an indication of a potential attack on the network. Historical data of a frequency with which nodes in a network create and receive new edges may be analyzed. Baseline models of behavior among the edges in the network may be established based on the analysis of the historical data. A new edge that deviates from a respective baseline model by more than a predetermined threshold during a time window may be detected. The new edge may be flagged as potentially anomalous when the deviation from the respective baseline model is detected. Probabilities for both new and existing edges may be obtained for all edges in a path or other subgraph. The probabilities may then be combined to obtain a score for the path or other subgraph. A threshold may be obtained by calculating an empirical distribution of the scores under historical conditions.

  5. Loops in hierarchical channel networks

    NASA Astrophysics Data System (ADS)

    Katifori, Eleni; Magnasco, Marcelo

    2012-02-01

    Nature provides us with many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture. Although a number of methods have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated and natural graphs extracted from digitized images of dicotyledonous leaves and animal vasculature. We calculate various metrics on the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs.

  6. Applying network theory to animal movements to identify properties of landscape space use.

    PubMed

    Bastille-Rousseau, Guillaume; Douglas-Hamilton, Iain; Blake, Stephen; Northrup, Joseph M; Wittemyer, George

    2018-04-01

    Network (graph) theory is a popular analytical framework to characterize the structure and dynamics among discrete objects and is particularly effective at identifying critical hubs and patterns of connectivity. The identification of such attributes is a fundamental objective of animal movement research, yet network theory has rarely been applied directly to animal relocation data. We develop an approach that allows the analysis of movement data using network theory by defining occupied pixels as nodes and connection among these pixels as edges. We first quantify node-level (local) metrics and graph-level (system) metrics on simulated movement trajectories to assess the ability of these metrics to pull out known properties in movement paths. We then apply our framework to empirical data from African elephants (Loxodonta africana), giant Galapagos tortoises (Chelonoidis spp.), and mule deer (Odocoileous hemionus). Our results indicate that certain node-level metrics, namely degree, weight, and betweenness, perform well in capturing local patterns of space use, such as the definition of core areas and paths used for inter-patch movement. These metrics were generally applicable across data sets, indicating their robustness to assumptions structuring analysis or strategies of movement. Other metrics capture local patterns effectively, but were sensitive to specified graph properties, indicating case specific applications. Our analysis indicates that graph-level metrics are unlikely to outperform other approaches for the categorization of general movement strategies (central place foraging, migration, nomadism). By identifying critical nodes, our approach provides a robust quantitative framework to identify local properties of space use that can be used to evaluate the effect of the loss of specific nodes on range wide connectivity. Our network approach is intuitive, and can be implemented across imperfectly sampled or large-scale data sets efficiently, providing a framework for conservationists to analyze movement data. Functions created for the analyses are available within the R package moveNT. © 2018 by the Ecological Society of America.

  7. Multilayer motif analysis of brain networks

    NASA Astrophysics Data System (ADS)

    Battiston, Federico; Nicosia, Vincenzo; Chavez, Mario; Latora, Vito

    2017-04-01

    In the last decade, network science has shed new light both on the structural (anatomical) and on the functional (correlations in the activity) connectivity among the different areas of the human brain. The analysis of brain networks has made possible to detect the central areas of a neural system and to identify its building blocks by looking at overabundant small subgraphs, known as motifs. However, network analysis of the brain has so far mainly focused on anatomical and functional networks as separate entities. The recently developed mathematical framework of multi-layer networks allows us to perform an analysis of the human brain where the structural and functional layers are considered together. In this work, we describe how to classify the subgraphs of a multiplex network, and we extend the motif analysis to networks with an arbitrary number of layers. We then extract multi-layer motifs in brain networks of healthy subjects by considering networks with two layers, anatomical and functional, respectively, obtained from diffusion and functional magnetic resonance imaging. Results indicate that subgraphs in which the presence of a physical connection between brain areas (links at the structural layer) coexists with a non-trivial positive correlation in their activities are statistically overabundant. Finally, we investigate the existence of a reinforcement mechanism between the two layers by looking at how the probability to find a link in one layer depends on the intensity of the connection in the other one. Showing that functional connectivity is non-trivially constrained by the underlying anatomical network, our work contributes to a better understanding of the interplay between the structure and function in the human brain.

  8. VPV--The velocity profile viewer user manual

    USGS Publications Warehouse

    Donovan, John M.

    2004-01-01

    The Velocity Profile Viewer (VPV) is a tool for visualizing time series of velocity profiles developed by the U.S. Geological Survey (USGS). The USGS uses VPV to preview and present measured velocity data from acoustic Doppler current profilers and simulated velocity data from three-dimensional estuarine, river, and lake hydrodynamic models. The data can be viewed as an animated three-dimensional profile or as a stack of time-series graphs that each represents a location in the water column. The graphically displayed data are shown at each time step like frames of animation. The animation can play at several different speeds or can be suspended on one frame. The viewing angle and time can be manipulated using mouse interaction. A number of options control the appearance of the profile and the graphs. VPV cannot edit or save data, but it can create a Post-Script file showing the velocity profile in three dimensions. This user manual describes how to use each of these features. VPV is available and can be downloaded for free from the World Wide Web at http://ca.water.usgs.gov/program/sfbay/vpv.

  9. Trapped surfaces and emergent curved space in the Bose-Hubbard model

    NASA Astrophysics Data System (ADS)

    Caravelli, Francesco; Hamma, Alioscia; Markopoulou, Fotini; Riera, Arnau

    2012-02-01

    A Bose-Hubbard model on a dynamical lattice was introduced in previous work as a spin system analogue of emergent geometry and gravity. Graphs with regions of high connectivity in the lattice were identified as candidate analogues of spacetime geometries that contain trapped surfaces. We carry out a detailed study of these systems and show explicitly that the highly connected subgraphs trap matter. We do this by solving the model in the limit of no back-reaction of the matter on the lattice, and for states with certain symmetries that are natural for our problem. We find that in this case the problem reduces to a one-dimensional Hubbard model on a lattice with variable vertex degree and multiple edges between the same two vertices. In addition, we obtain a (discrete) differential equation for the evolution of the probability density of particles which is closed in the classical regime. This is a wave equation in which the vertex degree is related to the local speed of propagation of probability. This allows an interpretation of the probability density of particles similar to that in analogue gravity systems: matter inside this analogue system sees a curved spacetime. We verify our analytic results by numerical simulations. Finally, we analyze the dependence of localization on a gradual, rather than abrupt, falloff of the vertex degree on the boundary of the highly connected region and find that matter is localized in and around that region.

  10. Ground States of Random Spanning Trees on a D-Wave 2X

    NASA Astrophysics Data System (ADS)

    Hall, J. S.; Hobl, L.; Novotny, M. A.; Michielsen, Kristel

    The performances of two D-Wave 2 machines (476 and 496 qubits) and of a 1097-qubit D-Wave 2X were investigated. Each chip has a Chimera interaction graph calG . Problem input consists of values for the fields hj and for the two-qubit interactions Ji , j of an Ising spin-glass problem formulated on calG . Output is returned in terms of a spin configuration {sj } , with sj = +/- 1 . We generated random spanning trees (RSTs) uniformly distributed over all spanning trees of calG . On the 476-qubit D-Wave 2, RSTs were generated on the full chip with Ji , j = - 1 and hj = 0 and solved one thousand times. The distribution of solution energies and the average magnetization of each qubit were determined. On both the 476- and 1097-qubit machines, four identical spanning trees were generated on each quadrant of the chip. The statistical independence of these regions was investigated. In another study, on the D-Wave 2X, one hundred RSTs with random Ji , j ∈ { - 1 , 1 } and hj = 0 were generated on the full chip. Each RST problem was solved one hundred times and the number of times the ground state energy was found was recorded. This procedure was repeated for square subgraphs, with dimensions ranging from 7 ×7 to 11 ×11. Supported in part by NSF Grants DGE-0947419 and DMR-1206233. D-Wave time provided by D-Wave Systems and by the USRA Quantum Artificial Intelligence Laboratory Research Opportunity.

  11. Being around and knowing the players: networks of influence in health policy.

    PubMed

    Lewis, Jenny M

    2006-05-01

    The accumulation and use of power is crucial to the health policy process. This paper examines the power of the medical profession in the health policy arena, by analysing which actors are perceived as influential, and how influence is structured in health policy. It combines an analysis of policy networks and social networks, to examine positional and personal influence in health policy in the state of Victoria, Australia. In the sub-graph of the influence network examined here, those most widely regarded as influential are academics, medically qualified and male. Positional actors (the top politician, political advisor and bureaucrat in health and the top nursing official) form part of a core group within this network structure. A second central group consists of medical influentials working in academia, research institutes and health-related NGOs. In this network locale overall, medical academics appear to combine positional and personal influence, and play significant intermediary roles across the network. While many claim that the medical profession has lost power in health policy and politics, this analysis yields few signs that the power of medicine to shape the health policy process has been greatly diminished in Victoria. Medical expertise is a potent embedded resource connecting actors through ties of association, making it difficult for actors with other resources and different knowledge to be considered influential. The network concepts and analytical techniques used here provide a novel means for uncovering different types of influence in health policy.

  12. Communication efficiency and congestion of signal traffic in large-scale brain networks.

    PubMed

    Mišić, Bratislav; Sporns, Olaf; McIntosh, Anthony R

    2014-01-01

    The complex connectivity of the cerebral cortex suggests that inter-regional communication is a primary function. Using computational modeling, we show that anatomical connectivity may be a major determinant for global information flow in brain networks. A macaque brain network was implemented as a communication network in which signal units flowed between grey matter nodes along white matter paths. Compared to degree-matched surrogate networks, information flow on the macaque brain network was characterized by higher loss rates, faster transit times and lower throughput, suggesting that neural connectivity may be optimized for speed rather than fidelity. Much of global communication was mediated by a "rich club" of hub regions: a sub-graph comprised of high-degree nodes that are more densely interconnected with each other than predicted by chance. First, macaque communication patterns most closely resembled those observed for a synthetic rich club network, but were less similar to those seen in a synthetic small world network, suggesting that the former is a more fundamental feature of brain network topology. Second, rich club regions attracted the most signal traffic and likewise, connections between rich club regions carried more traffic than connections between non-rich club regions. Third, a number of rich club regions were significantly under-congested, suggesting that macaque connectivity actively shapes information flow, funneling traffic towards some nodes and away from others. Together, our results indicate a critical role of the rich club of hub nodes in dynamic aspects of global brain communication.

  13. Lost in the city: revisiting Milgram's experiment in the age of social networks.

    PubMed

    Szüle, János; Kondor, Dániel; Dobos, László; Csabai, István; Vattay, Gábor

    2014-01-01

    As more and more users access social network services from smart devices with GPS receivers, the available amount of geo-tagged information makes repeating classical experiments possible on global scales and with unprecedented precision. Inspired by the original experiments of Milgram, we simulated message routing within a representative sub-graph of the network of Twitter users with about 6 million geo-located nodes and 122 million edges. We picked pairs of users from two distant metropolitan areas and tried to find a route between them using local geographic information only; our method was to forward messages to a friend living closest to the target. We found that the examined network is navigable on large scales, but navigability breaks down at the city scale and the network becomes unnavigable on intra-city distances. This means that messages usually arrived to the close proximity of the target in only 3-6 steps, but only in about 20% of the cases was it possible to find a route all the way to the recipient, in spite of the network being connected. This phenomenon is supported by the distribution of link lengths; on larger scales the distribution behaves approximately as P(d) ≈ 1/d, which was found earlier by Kleinberg to allow efficient navigation, while on smaller scales, a fractal structure becomes apparent. The intra-city correlation dimension of the network was found to be D2 = 1.25, less than the dimension D2 = 1.78 of the distribution of the population.

  14. Communication Efficiency and Congestion of Signal Traffic in Large-Scale Brain Networks

    PubMed Central

    Mišić, Bratislav; Sporns, Olaf; McIntosh, Anthony R.

    2014-01-01

    The complex connectivity of the cerebral cortex suggests that inter-regional communication is a primary function. Using computational modeling, we show that anatomical connectivity may be a major determinant for global information flow in brain networks. A macaque brain network was implemented as a communication network in which signal units flowed between grey matter nodes along white matter paths. Compared to degree-matched surrogate networks, information flow on the macaque brain network was characterized by higher loss rates, faster transit times and lower throughput, suggesting that neural connectivity may be optimized for speed rather than fidelity. Much of global communication was mediated by a “rich club” of hub regions: a sub-graph comprised of high-degree nodes that are more densely interconnected with each other than predicted by chance. First, macaque communication patterns most closely resembled those observed for a synthetic rich club network, but were less similar to those seen in a synthetic small world network, suggesting that the former is a more fundamental feature of brain network topology. Second, rich club regions attracted the most signal traffic and likewise, connections between rich club regions carried more traffic than connections between non-rich club regions. Third, a number of rich club regions were significantly under-congested, suggesting that macaque connectivity actively shapes information flow, funneling traffic towards some nodes and away from others. Together, our results indicate a critical role of the rich club of hub nodes in dynamic aspects of global brain communication. PMID:24415931

  15. EMR-based medical knowledge representation and inference via Markov random fields and distributed representation learning.

    PubMed

    Zhao, Chao; Jiang, Jingchi; Guan, Yi; Guo, Xitong; He, Bin

    2018-05-01

    Electronic medical records (EMRs) contain medical knowledge that can be used for clinical decision support (CDS). Our objective is to develop a general system that can extract and represent knowledge contained in EMRs to support three CDS tasks-test recommendation, initial diagnosis, and treatment plan recommendation-given the condition of a patient. We extracted four kinds of medical entities from records and constructed an EMR-based medical knowledge network (EMKN), in which nodes are entities and edges reflect their co-occurrence in a record. Three bipartite subgraphs (bigraphs) were extracted from the EMKN, one to support each task. One part of the bigraph was the given condition (e.g., symptoms), and the other was the condition to be inferred (e.g., diseases). Each bigraph was regarded as a Markov random field (MRF) to support the inference. We proposed three graph-based energy functions and three likelihood-based energy functions. Two of these functions are based on knowledge representation learning and can provide distributed representations of medical entities. Two EMR datasets and three metrics were utilized to evaluate the performance. As a whole, the evaluation results indicate that the proposed system outperformed the baseline methods. The distributed representation of medical entities does reflect similarity relationships with respect to knowledge level. Combining EMKN and MRF is an effective approach for general medical knowledge representation and inference. Different tasks, however, require individually designed energy functions. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. Phenotypical resistance correlation networks for 10 non-typhoidal Salmonella subpopulations in an active antimicrobial surveillance programme.

    PubMed

    Love, W J; Zawack, K A; Booth, J G; Gröhn, Y T; Lanzas, C

    2018-06-01

    Antimicrobials play a critical role in treating cases of invasive non-typhoidal salmonellosis (iNTS) and other diseases, but efficacy is hindered by resistant pathogens. Selection for phenotypical resistance may occur via several mechanisms. The current study aims to identify correlations that would allow indirect selection of increased resistance to ceftriaxone, ciprofloxacin and azithromycin to improve antimicrobial stewardship. These are medically important antibiotics for treating iNTS, but these resistances persist in non-Typhi Salmonella serotypes even though they are not licensed for use in US food animals. A set of 2875 Salmonella enterica isolates collected from animal sources by the National Antimicrobial Resistance Monitoring System were stratified in to 10 subpopulations based on serotype and host species. Collateral resistances in each subpopulation were estimated as network models of minimum inhibitory concentration partial correlations. Ceftriaxone sensitivity was correlated with other β-lactam resistances, and less commonly resistances to tetracycline, trimethoprim-sulfamethoxazole or kanamycin. Azithromycin resistance was frequently correlated with chloramphenicol resistance. Indirect selection for ciprofloxacin resistance via collateral selection appears unlikely. Density of the ACSSuT subgraph resistance aligned well with the phenotypical frequency. The current study identifies several important resistances in iNTS serotypes and further research is needed to identify the causative genetic correlations.

  17. gProcess and ESIP Platforms for Satellite Imagery Processing over the Grid

    NASA Astrophysics Data System (ADS)

    Bacu, Victor; Gorgan, Dorian; Rodila, Denisa; Pop, Florin; Neagu, Gabriel; Petcu, Dana

    2010-05-01

    The Environment oriented Satellite Data Processing Platform (ESIP) is developed through the SEE-GRID-SCI (SEE-GRID eInfrastructure for regional eScience) co-funded by the European Commission through FP7 [1]. The gProcess Platform [2] is a set of tools and services supporting the development and the execution over the Grid of the workflow based processing, and particularly the satelite imagery processing. The ESIP [3], [4] is build on top of the gProcess platform by adding a set of satellite image processing software modules and meteorological algorithms. The satellite images can reveal and supply important information on earth surface parameters, climate data, pollution level, weather conditions that can be used in different research areas. Generally, the processing algorithms of the satellite images can be decomposed in a set of modules that forms a graph representation of the processing workflow. Two types of workflows can be defined in the gProcess platform: abstract workflow (PDG - Process Description Graph), in which the user defines conceptually the algorithm, and instantiated workflow (iPDG - instantiated PDG), which is the mapping of the PDG pattern on particular satellite image and meteorological data [5]. The gProcess platform allows the definition of complex workflows by combining data resources, operators, services and sub-graphs. The gProcess platform is developed for the gLite middleware that is available in EGEE and SEE-GRID infrastructures [6]. gProcess exposes the specific functionality through web services [7]. The Editor Web Service retrieves information on available resources that are used to develop complex workflows (available operators, sub-graphs, services, supported resources, etc.). The Manager Web Service deals with resources management (uploading new resources such as workflows, operators, services, data, etc.) and in addition retrieves information on workflows. The Executor Web Service manages the execution of the instantiated workflows on the Grid infrastructure. In addition, this web service monitors the execution and generates statistical data that are important to evaluate performances and to optimize execution. The Viewer Web Service allows access to input and output data. To prove and to validate the utility of the gProcess and ESIP platforms there were developed the GreenView and GreenLand applications. The GreenView related functionality includes the refinement of some meteorological data such as temperature, and the calibration of the satellite images based on field measurements. The GreenLand application performs the classification of the satellite images by using a set of vegetation indices. The gProcess and ESIP platforms are used as well in GiSHEO project [8] to support the processing of Earth Observation data over the Grid in eGLE (GiSHEO eLearning Environment). Experiments of performance assessment were conducted and they have revealed that the workflow-based execution could improve the execution time of a satellite image processing algorithm [9]. It is not a reliable solution to execute all the workflow nodes on different machines. The execution of some nodes can be more time consuming and they will be performed in a longer time than other nodes. The total execution time will be affected because some nodes will slow down the execution. It is important to correctly balance the workflow nodes. Based on some optimization strategy the workflow nodes can be grouped horizontally, vertically or in a hybrid approach. In this way, those operators will be executed on one machine and also the data transfer between workflow nodes will be lower. The dynamic nature of the Grid infrastructure makes it more exposed to the occurrence of failures. These failures can occur at worker node, services availability, storage element, etc. Currently gProcess has support for some basic error prevention and error management solutions. In future, some more advanced error prevention and management solutions will be integrated in the gProcess platform. References [1] SEE-GRID-SCI Project, http://www.see-grid-sci.eu/ [2] Bacu V., Stefanut T., Rodila D., Gorgan D., Process Description Graph Composition by gProcess Platform. HiPerGRID - 3rd International Workshop on High Performance Grid Middleware, 28 May, Bucharest. Proceedings of CSCS-17 Conference, Vol.2., ISSN 2066-4451, pp. 423-430, (2009). [3] ESIP Platform, http://wiki.egee-see.org/index.php/JRA1_Commonalities [4] Gorgan D., Bacu V., Rodila D., Pop Fl., Petcu D., Experiments on ESIP - Environment oriented Satellite Data Processing Platform. SEE-GRID-SCI User Forum, 9-10 Dec 2009, Bogazici University, Istanbul, Turkey, ISBN: 978-975-403-510-0, pp. 157-166 (2009). [5] Radu, A., Bacu, V., Gorgan, D., Diagrammatic Description of Satellite Image Processing Workflow. Workshop on Grid Computing Applications Development (GridCAD) at the SYNASC Symposium, 28 September 2007, Timisoara, IEEE Computer Press, ISBN 0-7695-3078-8, 2007, pp. 341-348 (2007). [6] Gorgan D., Bacu V., Stefanut T., Rodila D., Mihon D., Grid based Satellite Image Processing Platform for Earth Observation Applications Development. IDAACS'2009 - IEEE Fifth International Workshop on "Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications", 21-23 September, Cosenza, Italy, IEEE Published in Computer Press, 247-252 (2009). [7] Rodila D., Bacu V., Gorgan D., Integration of Satellite Image Operators as Workflows in the gProcess Application. Proceedings of ICCP2009 - IEEE 5th International Conference on Intelligent Computer Communication and Processing, 27-29 Aug, 2009 Cluj-Napoca. ISBN: 978-1-4244-5007-7, pp. 355-358 (2009). [8] GiSHEO consortium, Project site, http://gisheo.info.uvt.ro [9] Bacu V., Gorgan D., Graph Based Evaluation of Satellite Imagery Processing over Grid. ISPDC 2008 - 7th International Symposium on Parallel and Distributed Computing, July 1-5, 2008, Krakow, Poland. IEEE Computer Society 2008, ISBN: 978-0-7695-3472-5, pp. 147-154.

  18. Exploring the structure and function of temporal networks with dynamic graphlets

    PubMed Central

    Hulovatyy, Y.; Chen, H.; Milenković, T.

    2015-01-01

    Motivation: With increasing availability of temporal real-world networks, how to efficiently study these data? One can model a temporal network as a single aggregate static network, or as a series of time-specific snapshots, each being an aggregate static network over the corresponding time window. Then, one can use established methods for static analysis on the resulting aggregate network(s), but losing in the process valuable temporal information either completely, or at the interface between different snapshots, respectively. Here, we develop a novel approach for studying a temporal network more explicitly, by capturing inter-snapshot relationships. Results: We base our methodology on well-established graphlets (subgraphs), which have been proven in numerous contexts in static network research. We develop new theory to allow for graphlet-based analyses of temporal networks. Our new notion of dynamic graphlets is different from existing dynamic network approaches that are based on temporal motifs (statistically significant subgraphs). The latter have limitations: their results depend on the choice of a null network model that is required to evaluate the significance of a subgraph, and choosing a good null model is non-trivial. Our dynamic graphlets overcome the limitations of the temporal motifs. Also, when we aim to characterize the structure and function of an entire temporal network or of individual nodes, our dynamic graphlets outperform the static graphlets. Clearly, accounting for temporal information helps. We apply dynamic graphlets to temporal age-specific molecular network data to deepen our limited knowledge about human aging. Availability and implementation: http://www.nd.edu/∼cone/DG. Contact: tmilenko@nd.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26072480

  19. Molybdenum in the environment

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jarrell, W.M.; Page, A.L.; Elseewi, A.A.

    1980-01-01

    While molybdenum is an essential element for both plants and animals, it becomes toxic above certain critical levels. Reviewed are the natural supply of molybdenum in the environment, the molybdenum cycle, the importance of molybdenum in industry and agriculture, and potential hazards that may occur when excessive levels of molybdenum occur in the environment. Although the potential of molybdenum toxicity to humans and non-ruminant animals appears to be low, the enrichment of the environment with molybdenum from modern mining, agricultural, and industrial activities has potentially hazardous implications for ruminant animal health. (3 graphs, numerous references, 16 tables)

  20. Multimedia and Understanding: Expert and Novice Responses To Different Representations of Chemical Phenomena.

    ERIC Educational Resources Information Center

    Kozma, Robert B.; Russell, Joel

    1997-01-01

    Examines how professional chemists and undergraduate chemistry students respond to chemistry-related video segments, graphs, animations, and equations. Discusses the role that surface features of representations play in the understanding of chemistry. Contains 36 references. (DDR)

  1. Graph Theoretical Analysis Reveals: Women's Brains Are Better Connected than Men's.

    PubMed

    Szalkai, Balázs; Varga, Bálint; Grolmusz, Vince

    2015-01-01

    Deep graph-theoretic ideas in the context with the graph of the World Wide Web led to the definition of Google's PageRank and the subsequent rise of the most popular search engine to date. Brain graphs, or connectomes, are being widely explored today. We believe that non-trivial graph theoretic concepts, similarly as it happened in the case of the World Wide Web, will lead to discoveries enlightening the structural and also the functional details of the animal and human brains. When scientists examine large networks of tens or hundreds of millions of vertices, only fast algorithms can be applied because of the size constraints. In the case of diffusion MRI-based structural human brain imaging, the effective vertex number of the connectomes, or brain graphs derived from the data is on the scale of several hundred today. That size facilitates applying strict mathematical graph algorithms even for some hard-to-compute (or NP-hard) quantities like vertex cover or balanced minimum cut. In the present work we have examined brain graphs, computed from the data of the Human Connectome Project, recorded from male and female subjects between ages 22 and 35. Significant differences were found between the male and female structural brain graphs: we show that the average female connectome has more edges, is a better expander graph, has larger minimal bisection width, and has more spanning trees than the average male connectome. Since the average female brain weighs less than the brain of males, these properties show that the female brain has better graph theoretical properties, in a sense, than the brain of males. It is known that the female brain has a smaller gray matter/white matter ratio than males, that is, a larger white matter/gray matter ratio than the brain of males; this observation is in line with our findings concerning the number of edges, since the white matter consists of myelinated axons, which, in turn, roughly correspond to the connections in the brain graph. We have also found that the minimum bisection width, normalized with the edge number, is also significantly larger in the right and the left hemispheres in females: therefore, the differing bisection widths are independent from the difference in the number of edges.

  2. Graph Theoretical Analysis Reveals: Women’s Brains Are Better Connected than Men’s

    PubMed Central

    Szalkai, Balázs; Varga, Bálint; Grolmusz, Vince

    2015-01-01

    Deep graph-theoretic ideas in the context with the graph of the World Wide Web led to the definition of Google’s PageRank and the subsequent rise of the most popular search engine to date. Brain graphs, or connectomes, are being widely explored today. We believe that non-trivial graph theoretic concepts, similarly as it happened in the case of the World Wide Web, will lead to discoveries enlightening the structural and also the functional details of the animal and human brains. When scientists examine large networks of tens or hundreds of millions of vertices, only fast algorithms can be applied because of the size constraints. In the case of diffusion MRI-based structural human brain imaging, the effective vertex number of the connectomes, or brain graphs derived from the data is on the scale of several hundred today. That size facilitates applying strict mathematical graph algorithms even for some hard-to-compute (or NP-hard) quantities like vertex cover or balanced minimum cut. In the present work we have examined brain graphs, computed from the data of the Human Connectome Project, recorded from male and female subjects between ages 22 and 35. Significant differences were found between the male and female structural brain graphs: we show that the average female connectome has more edges, is a better expander graph, has larger minimal bisection width, and has more spanning trees than the average male connectome. Since the average female brain weighs less than the brain of males, these properties show that the female brain has better graph theoretical properties, in a sense, than the brain of males. It is known that the female brain has a smaller gray matter/white matter ratio than males, that is, a larger white matter/gray matter ratio than the brain of males; this observation is in line with our findings concerning the number of edges, since the white matter consists of myelinated axons, which, in turn, roughly correspond to the connections in the brain graph. We have also found that the minimum bisection width, normalized with the edge number, is also significantly larger in the right and the left hemispheres in females: therefore, the differing bisection widths are independent from the difference in the number of edges. PMID:26132764

  3. Efficient Generation of Dancing Animation Synchronizing with Music Based on Meta Motion Graphs

    NASA Astrophysics Data System (ADS)

    Xu, Jianfeng; Takagi, Koichi; Sakazawa, Shigeyuki

    This paper presents a system for automatic generation of dancing animation that is synchronized with a piece of music by re-using motion capture data. Basically, the dancing motion is synthesized according to the rhythm and intensity features of music. For this purpose, we propose a novel meta motion graph structure to embed the necessary features including both rhythm and intensity, which is constructed on the motion capture database beforehand. In this paper, we consider two scenarios for non-streaming music and streaming music, where global search and local search are required respectively. In the case of the former, once a piece of music is input, the efficient dynamic programming algorithm can be employed to globally search a best path in the meta motion graph, where an objective function is properly designed by measuring the quality of beat synchronization, intensity matching, and motion smoothness. In the case of the latter, the input music is stored in a buffer in a streaming mode, then an efficient search method is presented for a certain amount of music data (called a segment) in the buffer with the same objective function, resulting in a segment-based search approach. For streaming applications, we define an additional property in the above meta motion graph to deal with the unpredictable future music, which guarantees that there is some motion to match the unknown remaining music. A user study with totally 60 subjects demonstrates that our system outperforms the stat-of-the-art techniques in both scenarios. Furthermore, our system improves the synthesis speed greatly (maximal speedup is more than 500 times), which is essential for mobile applications. We have implemented our system on commercially available smart phones and confirmed that it works well on these mobile phones.

  4. A new forecast presentation tool for offshore contractors

    NASA Astrophysics Data System (ADS)

    Jørgensen, M.

    2009-09-01

    Contractors working off shore are often very sensitive to both sea and weather conditions, and it's essential that they have easy access to reliable information on coming conditions to enable planning of when to start or shut down offshore operations to avoid loss of life and materials. Danish Meteorological Institute, DMI, recently, in cooperation with business partners in the field, developed a new application to accommodate that need. The "Marine Forecast Service” is a browser based forecast presentation tool. It provides an interface for the user to enable easy and quick access to all relevant meteorological and oceanographic forecasts and observations for a given area of interest. Each customer gains access to the application via a standard login/password procedure. Once logged in, the user can inspect animated forecast maps of parameters like wind, gust, wave height, swell and current among others. Supplementing the general maps, the user can choose to look at forecast graphs for each of the locations where the user is running operations. These forecast graphs can also be overlaid with the user's own in situ observations, if such exist. Furthermore, the data from the graphs can be exported as data files that the customer can use in his own applications as he desires. As part of the application, a forecaster's view on the current and near future weather situation is presented to the user as well, adding further value to the information presented through maps and graphs. Among other features of the product, animated radar and satellite images could be mentioned. And finally the application provides the possibility of a "second opinion” through traditional weather charts from another recognized provider of weather forecasts. The presentation will provide more detailed insights into the contents of the applications as well as some of the experiences with the product.

  5. Top-K Interesting Subgraph Discovery in Information Networks

    DTIC Science & Technology

    2014-03-03

    Integrative Biomarker Discovery for Breast Cancer Metastasis from Gene Expression and Protein Interaction Data Using Error-tolerant Pattern Mining” at...Jiawei Han¶ ∗Microsoft, India . Email: gmanish@microsoft.com †State University of New York at Buffalo. Email: jing@buffalo.edu ‡University of California

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Demeure, I.M.

    The research presented here is concerned with representation techniques and tools to support the design, prototyping, simulation, and evaluation of message-based parallel, distributed computations. The author describes ParaDiGM-Parallel, Distributed computation Graph Model-a visual representation technique for parallel, message-based distributed computations. ParaDiGM provides several views of a computation depending on the aspect of concern. It is made of two complementary submodels, the DCPG-Distributed Computing Precedence Graph-model, and the PAM-Process Architecture Model-model. DCPGs are precedence graphs used to express the functionality of a computation in terms of tasks, message-passing, and data. PAM graphs are used to represent the partitioning of a computationmore » into schedulable units or processes, and the pattern of communication among those units. There is a natural mapping between the two models. He illustrates the utility of ParaDiGM as a representation technique by applying it to various computations (e.g., an adaptive global optimization algorithm, the client-server model). ParaDiGM representations are concise. They can be used in documenting the design and the implementation of parallel, distributed computations, in describing such computations to colleagues, and in comparing and contrasting various implementations of the same computation. He then describes VISA-VISual Assistant, a software tool to support the design, prototyping, and simulation of message-based parallel, distributed computations. VISA is based on the ParaDiGM model. In particular, it supports the editing of ParaDiGM graphs to describe the computations of interest, and the animation of these graphs to provide visual feedback during simulations. The graphs are supplemented with various attributes, simulation parameters, and interpretations which are procedures that can be executed by VISA.« less

  7. Halloween High Jinks.

    ERIC Educational Resources Information Center

    Andrews, Doreen; And Others

    1992-01-01

    Presents a collection of fall and Halloween activities for elementary students, including pumpkin poetry, batty bulletin boards (graphing), vegetable variety art, old time radio mysteries, paper doll Halloween safety, career dress-up day, imaginative Halloween writing, and matching animals with foods they eat. A student page offers a Dracula…

  8. Cardiovascular responses to hypogravic environments

    NASA Technical Reports Server (NTRS)

    Sandler, H.

    1983-01-01

    The cardiovascular deconditioning observed during and after space flight is characterized in a review of human space and simulation studies and animal simulations. The various simulation techniques (horizontal bed rest, head-down tilt, and water immersion in man, and immobilization of animals) are examined, and sample results are presented in graphs. Countermeasures such as exercise regimens, fluid replacement, drugs, venous pooling, G-suits, oscillating beds, electrostimulation of muscles, lower-body negative pressure, body-surface cooling, and hypoxia are reviewed and found to be generally ineffective or unreliable. The need for future space experimentation in both humans and animals is indicated.

  9. Accurate Classification of RNA Structures Using Topological Fingerprints

    PubMed Central

    Li, Kejie; Gribskov, Michael

    2016-01-01

    While RNAs are well known to possess complex structures, functionally similar RNAs often have little sequence similarity. While the exact size and spacing of base-paired regions vary, functionally similar RNAs have pronounced similarity in the arrangement, or topology, of base-paired stems. Furthermore, predicted RNA structures often lack pseudoknots (a crucial aspect of biological activity), and are only partially correct, or incomplete. A topological approach addresses all of these difficulties. In this work we describe each RNA structure as a graph that can be converted to a topological spectrum (RNA fingerprint). The set of subgraphs in an RNA structure, its RNA fingerprint, can be compared with the fingerprints of other RNA structures to identify and correctly classify functionally related RNAs. Topologically similar RNAs can be identified even when a large fraction, up to 30%, of the stems are omitted, indicating that highly accurate structures are not necessary. We investigate the performance of the RNA fingerprint approach on a set of eight highly curated RNA families, with diverse sizes and functions, containing pseudoknots, and with little sequence similarity–an especially difficult test set. In spite of the difficult test set, the RNA fingerprint approach is very successful (ROC AUC > 0.95). Due to the inclusion of pseudoknots, the RNA fingerprint approach both covers a wider range of possible structures than methods based only on secondary structure, and its tolerance for incomplete structures suggests that it can be applied even to predicted structures. Source code is freely available at https://github.rcac.purdue.edu/mgribsko/XIOS_RNA_fingerprint. PMID:27755571

  10. Probabilistic Graphical Model Representation in Phylogenetics

    PubMed Central

    Höhna, Sebastian; Heath, Tracy A.; Boussau, Bastien; Landis, Michael J.; Ronquist, Fredrik; Huelsenbeck, John P.

    2014-01-01

    Recent years have seen a rapid expansion of the model space explored in statistical phylogenetics, emphasizing the need for new approaches to statistical model representation and software development. Clear communication and representation of the chosen model is crucial for: (i) reproducibility of an analysis, (ii) model development, and (iii) software design. Moreover, a unified, clear and understandable framework for model representation lowers the barrier for beginners and nonspecialists to grasp complex phylogenetic models, including their assumptions and parameter/variable dependencies. Graphical modeling is a unifying framework that has gained in popularity in the statistical literature in recent years. The core idea is to break complex models into conditionally independent distributions. The strength lies in the comprehensibility, flexibility, and adaptability of this formalism, and the large body of computational work based on it. Graphical models are well-suited to teach statistical models, to facilitate communication among phylogeneticists and in the development of generic software for simulation and statistical inference. Here, we provide an introduction to graphical models for phylogeneticists and extend the standard graphical model representation to the realm of phylogenetics. We introduce a new graphical model component, tree plates, to capture the changing structure of the subgraph corresponding to a phylogenetic tree. We describe a range of phylogenetic models using the graphical model framework and introduce modules to simplify the representation of standard components in large and complex models. Phylogenetic model graphs can be readily used in simulation, maximum likelihood inference, and Bayesian inference using, for example, Metropolis–Hastings or Gibbs sampling of the posterior distribution. [Computation; graphical models; inference; modularization; statistical phylogenetics; tree plate.] PMID:24951559

  11. Transfer matrix computation of generalized critical polynomials in percolation

    DOE PAGES

    Scullard, Christian R.; Jacobsen, Jesper Lykke

    2012-09-27

    Percolation thresholds have recently been studied by means of a graph polynomial PB(p), henceforth referred to as the critical polynomial, that may be defined on any periodic lattice. The polynomial depends on a finite subgraph B, called the basis, and the way in which the basis is tiled to form the lattice. The unique root of P B(p) in [0, 1] either gives the exact percolation threshold for the lattice, or provides an approximation that becomes more accurate with appropriately increasing size of B. Initially P B(p) was defined by a contraction-deletion identity, similar to that satisfied by the Tuttemore » polynomial. Here, we give an alternative probabilistic definition of P B(p), which allows for much more efficient computations, by using the transfer matrix, than was previously possible with contraction-deletion. We present bond percolation polynomials for the (4, 82), kagome, and (3, 122) lattices for bases of up to respectively 96, 162, and 243 edges, much larger than the previous limit of 36 edges using contraction-deletion. We discuss in detail the role of the symmetries and the embedding of B. For the largest bases, we obtain the thresholds p c(4, 82) = 0.676 803 329 · · ·, p c(kagome) = 0.524 404 998 · · ·, p c(3, 122) = 0.740 420 798 · · ·, comparable to the best simulation results. We also show that the alternative definition of P B(p) can be applied to study site percolation problems.« less

  12. On the star partition dimension of comb product of cycle and path

    NASA Astrophysics Data System (ADS)

    Alfarisi, Ridho; Darmaji

    2017-08-01

    Let G = (V, E) be a connected graphs with vertex set V(G), edge set E(G) and S ⊆ V(G). Given an ordered partition Π = {S1, S2, S3, …, Sk} of the vertex set V of G, the representation of a vertex v ∈ V with respect to Π is the vector r(v|Π) = (d(v, S1), d(v, S2), …, d(v, Sk)), where d(v, Sk) represents the distance between the vertex v and the set Sk and d(v, Sk) = min{d(v, x)|x ∈ Sk }. A partition Π of V(G) is a resolving partition if different vertices of G have distinct representations, i.e., for every pair of vertices u, v ∈ V(G), r(u|Π) ≠ r(v|Π). The minimum k of Π resolving partition is a partition dimension of G, denoted by pd(G). The resolving partition Π = {S1, S2, S3, …, Sk } is called a star resolving partition for G if it is a resolving partition and each subgraph induced by Si, 1 ≤ i ≤ k, is a star. The minimum k for which there exists a star resolving partition of V(G) is the star partition dimension of G, denoted by spd(G). Finding the star partition dimension of G is classified to be a NP-Hard problem. In this paper, we will show that the partition dimension of comb product of cycle and path namely Cm⊳Pn and Pn⊳Cm for n ≥ 2 and m ≥ 3.

  13. Using Zipf-Mandelbrot law and graph theory to evaluate animal welfare

    NASA Astrophysics Data System (ADS)

    de Oliveira, Caprice G. L.; Miranda, José G. V.; Japyassú, Hilton F.; El-Hani, Charbel N.

    2018-02-01

    This work deals with the construction and testing of metrics of welfare based on behavioral complexity, using assumptions derived from Zipf-Mandelbrot law and graph theory. To test these metrics we compared yellow-breasted capuchins (Sapajus xanthosternos) (Wied-Neuwied, 1826) (PRIMATES CEBIDAE) found in two institutions, subjected to different captive conditions: a Zoobotanical Garden (hereafter, ZOO; n = 14), in good welfare condition, and a Wildlife Rescue Center (hereafter, WRC; n = 8), in poor welfare condition. In the Zipf-Mandelbrot-based analysis, the power law exponent was calculated using behavior frequency values versus behavior rank value. These values allow us to evaluate variations in individual behavioral complexity. For each individual we also constructed a graph using the sequence of behavioral units displayed in each recording (average recording time per individual: 4 h 26 min in the ZOO, 4 h 30 min in the WRC). Then, we calculated the values of the main graph attributes, which allowed us to analyze the complexity of the connectivity of the behaviors displayed in the individuals' behavioral sequences. We found significant differences between the two groups for the slope values in the Zipf-Mandelbrot analysis. The slope values for the ZOO individuals approached -1, with graphs representing a power law, while the values for the WRC individuals diverged from -1, differing from a power law pattern. Likewise, we found significant differences for the graph attributes average degree, weighted average degree, and clustering coefficient when comparing the ZOO and WRC individual graphs. However, no significant difference was found for the attributes modularity and average path length. Both analyses were effective in detecting differences between the patterns of behavioral complexity in the two groups. The slope values for the ZOO individuals indicated a higher behavioral complexity when compared to the WRC individuals. Similarly, graph construction and the calculation of its attributes values allowed us to show that the complexity of the connectivity among the behaviors was higher in the ZOO than in the WRC individual graphs. These results show that the two measuring approaches introduced and tested in this paper were capable of capturing the differences in welfare levels between the two conditions, as shown by differences in behavioral complexity.

  14. Agricultural Products | National Agricultural Library

    Science.gov Websites

    Skip to main content Home National Agricultural Library United States Department of Agriculture Ag News Contact Us Search  Log inRegister Home Home Agricultural Products NEWT: National Extension Web , tables, graphs), Agricultural Products html National Animal Nutrition Program (NANP) Feed Composition

  15. MetaMapp: mapping and visualizing metabolomic data by integrating information from biochemical pathways and chemical and mass spectral similarity

    PubMed Central

    2012-01-01

    Background Exposure to environmental tobacco smoke (ETS) leads to higher rates of pulmonary diseases and infections in children. To study the biochemical changes that may precede lung diseases, metabolomic effects on fetal and maternal lungs and plasma from rats exposed to ETS were compared to filtered air control animals. Genome- reconstructed metabolic pathways may be used to map and interpret dysregulation in metabolic networks. However, mass spectrometry-based non-targeted metabolomics datasets often comprise many metabolites for which links to enzymatic reactions have not yet been reported. Hence, network visualizations that rely on current biochemical databases are incomplete and also fail to visualize novel, structurally unidentified metabolites. Results We present a novel approach to integrate biochemical pathway and chemical relationships to map all detected metabolites in network graphs (MetaMapp) using KEGG reactant pair database, Tanimoto chemical and NIST mass spectral similarity scores. In fetal and maternal lungs, and in maternal blood plasma from pregnant rats exposed to environmental tobacco smoke (ETS), 459 unique metabolites comprising 179 structurally identified compounds were detected by gas chromatography time of flight mass spectrometry (GC-TOF MS) and BinBase data processing. MetaMapp graphs in Cytoscape showed much clearer metabolic modularity and complete content visualization compared to conventional biochemical mapping approaches. Cytoscape visualization of differential statistics results using these graphs showed that overall, fetal lung metabolism was more impaired than lungs and blood metabolism in dams. Fetuses from ETS-exposed dams expressed lower lipid and nucleotide levels and higher amounts of energy metabolism intermediates than control animals, indicating lower biosynthetic rates of metabolites for cell division, structural proteins and lipids that are critical for in lung development. Conclusions MetaMapp graphs efficiently visualizes mass spectrometry based metabolomics datasets as network graphs in Cytoscape, and highlights metabolic alterations that can be associated with higher rate of pulmonary diseases and infections in children prenatally exposed to ETS. The MetaMapp scripts can be accessed at http://metamapp.fiehnlab.ucdavis.edu. PMID:22591066

  16. Clinical Development of Gamitrinib, a Novel Mitochondrial-Targeted Small Molecule Hsp90 Inhibitor

    DTIC Science & Technology

    2015-09-01

    Group 2 and Group 3 animals examined at the end of the 7-repeated doses was comparable to those in control Group 1 animals (Figure 2). (7) Despite... posttest (for more than two- group comparisons) using a GraphPad software package (Prism 6.0) for Windows. Data are expressed as mean ± SD or mean ± SEM...Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc Series B Stat

  17. Approximate Subgraph Isomorphism for Image Localization (Author’s Manuscript)

    DTIC Science & Technology

    2016-02-18

    a working database for feature matching methods is nearly impossible to generate. In a proof of feasibility, Bansal et. al. [2] claim that overhead...of images in mountainous terrain. In Computer Vision–ECCV 2012, pages 517–530. Springer, 2012. 1 [2] M. Bansal , H. S. Sawhney, H. Cheng, and K

  18. Exploiting semantic patterns over biomedical knowledge graphs for predicting treatment and causative relations.

    PubMed

    Bakal, Gokhan; Talari, Preetham; Kakani, Elijah V; Kavuluru, Ramakanth

    2018-06-01

    Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since all candidate drugs cannot be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is also critical to understand biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach. To build high accuracy supervised predictive models to predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs. We used 7000 treats and 2918 causes hand-curated relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns connecting biomedical entities as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model relatively more realistic scenarios. Our models predict treats and causes relations with high F-scores of 99% and 90% respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. We are also able to predict some new plausible relations based on false positives that our models scored highly based on our collaborations with two physician co-authors. Finally, our decision tree models are able to retrieve over 50% of treatment relations from a recently created external dataset. We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence in direct prediction of biomedical relations based on graph features. Our work complements lexical pattern based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction. Copyright © 2018 Elsevier Inc. All rights reserved.

  19. Effects of Multimedia Vocabulary Annotations on Vocabulary Learning and Text Comprehension in ESP Classrooms

    ERIC Educational Resources Information Center

    Lin, Huifen

    2012-01-01

    For the past few decades, instructional materials enriched with multimedia elements have enjoyed increasing popularity. Multimedia-based instruction incorporating stimulating visuals, authentic audios, and interactive animated graphs of different kinds all provide additional and valuable opportunities for students to learn beyond what conventional…

  20. Custom Orthotics Changed My Life

    ERIC Educational Resources Information Center

    Holeton, Richard

    2010-01-01

    The narrator relates his life's downward spiral and miraculous rebound from severe foot problems using animated bullet points, images, charts, and graphs. "Custom Orthotics Changed My Life" is a work of presentation fiction, or slideshow fiction, in the form of a video with an original soundtrack. The music was composed by David Kettler, a…

  1. Mining continuous activity patterns from animal trajectory data

    USGS Publications Warehouse

    Wang, Y.; Luo, Ze; Baoping, Yan; Takekawa, John Y.; Prosser, Diann J.; Newman, Scott H.

    2014-01-01

    The increasing availability of animal tracking data brings us opportunities and challenges to intuitively understand the mechanisms of animal activities. In this paper, we aim to discover animal movement patterns from animal trajectory data. In particular, we propose a notion of continuous activity pattern as the concise representation of underlying similar spatio-temporal movements, and develop an extension and refinement framework to discover the patterns. We first preprocess the trajectories into significant semantic locations with time property. Then, we apply a projection-based approach to generate candidate patterns and refine them to generate true patterns. A sequence graph structure and a simple and effective processing strategy is further developed to reduce the computational overhead. The proposed approaches are extensively validated on both real GPS datasets and large synthetic datasets.

  2. Discriminating tests of information and topological indices. Animals and trees.

    PubMed

    Konstantinova, Elena V; Vidyuk, Maxim V

    2003-01-01

    In this paper we consider 13 information and topological indices based on the distance in a molecular graph with respect to their discrimination power. The numerical results of discriminating tests on 3490528 trees up to 21 vertices are given. The indices of the highest sensitivity are listed on the set of 1528775 alkane trees. The discrimination powers of indices are also examined on the classes of 849285 hexagonal, 298382 square, and 295365 triangular simply connected animals. The first class of animals corresponds to the structural formulas of planar benzenoid hydrocarbons. The values of all indices were calculated for all classes of animals as well as for the united set of 1443032 animals. The inspection of the data indicates the great sensitivity of four information indices and one topological index.

  3. Two's company, three (or more) is a simplex : Algebraic-topological tools for understanding higher-order structure in neural data.

    PubMed

    Giusti, Chad; Ghrist, Robert; Bassett, Danielle S

    2016-08-01

    The language of graph theory, or network science, has proven to be an exceptional tool for addressing myriad problems in neuroscience. Yet, the use of networks is predicated on a critical simplifying assumption: that the quintessential unit of interest in a brain is a dyad - two nodes (neurons or brain regions) connected by an edge. While rarely mentioned, this fundamental assumption inherently limits the types of neural structure and function that graphs can be used to model. Here, we describe a generalization of graphs that overcomes these limitations, thereby offering a broad range of new possibilities in terms of modeling and measuring neural phenomena. Specifically, we explore the use of simplicial complexes: a structure developed in the field of mathematics known as algebraic topology, of increasing applicability to real data due to a rapidly growing computational toolset. We review the underlying mathematical formalism as well as the budding literature applying simplicial complexes to neural data, from electrophysiological recordings in animal models to hemodynamic fluctuations in humans. Based on the exceptional flexibility of the tools and recent ground-breaking insights into neural function, we posit that this framework has the potential to eclipse graph theory in unraveling the fundamental mysteries of cognition.

  4. Just-in-Time Algebra: A Problem Solving Approach Including Multimedia and Animation.

    ERIC Educational Resources Information Center

    Hofmann, Roseanne S.; Hunter, Walter R.

    2003-01-01

    Describes a beginning algebra course that places stronger emphasis on learning to solve problems and introduces topics using real world applications. Students learn estimating, graphing, and algebraic algorithms for the purpose of solving problems. Indicates that applications motivate students by appearing to be a more relevant topic as well as…

  5. Investigating the Influence Relationship Models for Stocks in Indian Equity Market: A Weighted Network Modelling Study

    PubMed Central

    Acharjee, Animesh

    2016-01-01

    The socio-economic systems today possess high levels of both interconnectedness and interdependencies, and such system-level relationships behave very dynamically. In such situations, it is all around perceived that influence is a perplexing power that has an overseeing part in affecting the dynamics and behaviours of involved ones. As a result of the force & direction of influence, the transformative change of one entity has a cogent aftereffect on the other entities in the system. The current study employs directed weighted networks for investigating the influential relationship patterns existent in a typical equity market as an outcome of inter-stock interactions happening at the market level, the sectorial level and the industrial level. The study dataset is derived from 335 constituent stocks of ‘Standard & Poor Bombay Stock Exchange 500 index’ and study period is 1st June 2005 to 30th June 2015. The study identifies the set of most dynamically influential stocks & their respective temporal pattern at three hierarchical levels: the complete equity market, different sectors, and constituting industry segments of those sectors. A detailed influence relationship analysis is performed for the sectorial level network of the construction sector, and it was found that stocks belonging to the cement industry possessed high influence within this sector. Also, the detailed network analysis of construction sector revealed that it follows scale-free characteristics and power law distribution. In the industry specific influence relationship analysis for cement industry, methods based on threshold filtering and minimum spanning tree were employed to derive a set of sub-graphs having temporally stable high-correlation structure over this ten years period. PMID:27846251

  6. Semi-Metric Topology of the Human Connectome: Sensitivity and Specificity to Autism and Major Depressive Disorder.

    PubMed

    Simas, Tiago; Chattopadhyay, Shayanti; Hagan, Cindy; Kundu, Prantik; Patel, Ameera; Holt, Rosemary; Floris, Dorothea; Graham, Julia; Ooi, Cinly; Tait, Roger; Spencer, Michael; Baron-Cohen, Simon; Sahakian, Barbara; Bullmore, Ed; Goodyer, Ian; Suckling, John

    2015-01-01

    The human functional connectome is a graphical representation, consisting of nodes connected by edges, of the inter-relationships of blood oxygenation-level dependent (BOLD) time-series measured by MRI from regions encompassing the cerebral cortices and, often, the cerebellum. Semi-metric analysis of the weighted, undirected connectome distinguishes an edge as either direct (metric), such that there is no alternative path that is accumulatively stronger, or indirect (semi-metric), where one or more alternative paths exist that have greater strength than the direct edge. The sensitivity and specificity of this method of analysis is illustrated by two case-control analyses with independent, matched groups of adolescents with autism spectrum conditions (ASC) and major depressive disorder (MDD). Significance differences in the global percentage of semi-metric edges was observed in both groups, with increases in ASC and decreases in MDD relative to controls. Furthermore, MDD was associated with regional differences in left frontal and temporal lobes, the right limbic system and cerebellum. In contrast, ASC had a broadly increased percentage of semi-metric edges with a more generalised distribution of effects and some areas of reduction. In summary, MDD was characterised by localised, large reductions in the percentage of semi-metric edges, whilst ASC is characterised by more generalised, subtle increases. These differences were corroborated in greater detail by inspection of the semi-metric backbone for each group; that is, the sub-graph of semi-metric edges present in >90% of participants, and by nodal degree differences in the semi-metric connectome. These encouraging results, in what we believe is the first application of semi-metric analysis to neuroimaging data, raise confidence in the methodology as potentially capable of detection and characterisation of a range of neurodevelopmental and psychiatric disorders.

  7. Coding gains and error rates from the Big Viterbi Decoder

    NASA Technical Reports Server (NTRS)

    Onyszchuk, I. M.

    1991-01-01

    A prototype hardware Big Viterbi Decoder (BVD) was completed for an experiment with the Galileo Spacecraft. Searches for new convolutional codes, studies of Viterbi decoder hardware designs and architectures, mathematical formulations, and decompositions of the deBruijn graph into identical and hierarchical subgraphs, and very large scale integration (VLSI) chip design are just a few examples of tasks completed for this project. The BVD bit error rates (BER), measured from hardware and software simulations, are plotted as a function of bit signal to noise ratio E sub b/N sub 0 on the additive white Gaussian noise channel. Using the constraint length 15, rate 1/4, experimental convolutional code for the Galileo mission, the BVD gains 1.5 dB over the NASA standard (7,1/2) Maximum Likelihood Convolution Decoder (MCD) at a BER of 0.005. At this BER, the same gain results when the (255,233) NASA standard Reed-Solomon decoder is used, which yields a word error rate of 2.1 x 10(exp -8) and a BER of 1.4 x 10(exp -9). The (15, 1/6) code to be used by the Cometary Rendezvous Asteroid Flyby (CRAF)/Cassini Missions yields 1.7 dB of coding gain. These gains are measured with respect to symbols input to the BVD and increase with decreasing BER. Also, 8-bit input symbol quantization makes the BVD resistant to demodulated signal-level variations which may cause higher bandwidth than the NASA (7,1/2) code, these gains are offset by about 0.1 dB of expected additional receiver losses. Coding gains of several decibels are possible by compressing all spacecraft data.

  8. Investigating the Influence Relationship Models for Stocks in Indian Equity Market: A Weighted Network Modelling Study.

    PubMed

    Bhattacharjee, Biplab; Shafi, Muhammad; Acharjee, Animesh

    2016-01-01

    The socio-economic systems today possess high levels of both interconnectedness and interdependencies, and such system-level relationships behave very dynamically. In such situations, it is all around perceived that influence is a perplexing power that has an overseeing part in affecting the dynamics and behaviours of involved ones. As a result of the force & direction of influence, the transformative change of one entity has a cogent aftereffect on the other entities in the system. The current study employs directed weighted networks for investigating the influential relationship patterns existent in a typical equity market as an outcome of inter-stock interactions happening at the market level, the sectorial level and the industrial level. The study dataset is derived from 335 constituent stocks of 'Standard & Poor Bombay Stock Exchange 500 index' and study period is 1st June 2005 to 30th June 2015. The study identifies the set of most dynamically influential stocks & their respective temporal pattern at three hierarchical levels: the complete equity market, different sectors, and constituting industry segments of those sectors. A detailed influence relationship analysis is performed for the sectorial level network of the construction sector, and it was found that stocks belonging to the cement industry possessed high influence within this sector. Also, the detailed network analysis of construction sector revealed that it follows scale-free characteristics and power law distribution. In the industry specific influence relationship analysis for cement industry, methods based on threshold filtering and minimum spanning tree were employed to derive a set of sub-graphs having temporally stable high-correlation structure over this ten years period.

  9. Semi-Metric Topology of the Human Connectome: Sensitivity and Specificity to Autism and Major Depressive Disorder

    PubMed Central

    Simas, Tiago; Chattopadhyay, Shayanti; Hagan, Cindy; Kundu, Prantik; Patel, Ameera; Holt, Rosemary; Floris, Dorothea; Graham, Julia; Ooi, Cinly; Tait, Roger; Spencer, Michael; Baron-Cohen, Simon; Sahakian, Barbara; Bullmore, Ed; Goodyer, Ian; Suckling, John

    2015-01-01

    Introduction The human functional connectome is a graphical representation, consisting of nodes connected by edges, of the inter-relationships of blood oxygenation-level dependent (BOLD) time-series measured by MRI from regions encompassing the cerebral cortices and, often, the cerebellum. Semi-metric analysis of the weighted, undirected connectome distinguishes an edge as either direct (metric), such that there is no alternative path that is accumulatively stronger, or indirect (semi-metric), where one or more alternative paths exist that have greater strength than the direct edge. The sensitivity and specificity of this method of analysis is illustrated by two case-control analyses with independent, matched groups of adolescents with autism spectrum conditions (ASC) and major depressive disorder (MDD). Results Significance differences in the global percentage of semi-metric edges was observed in both groups, with increases in ASC and decreases in MDD relative to controls. Furthermore, MDD was associated with regional differences in left frontal and temporal lobes, the right limbic system and cerebellum. In contrast, ASC had a broadly increased percentage of semi-metric edges with a more generalised distribution of effects and some areas of reduction. In summary, MDD was characterised by localised, large reductions in the percentage of semi-metric edges, whilst ASC is characterised by more generalised, subtle increases. These differences were corroborated in greater detail by inspection of the semi-metric backbone for each group; that is, the sub-graph of semi-metric edges present in >90% of participants, and by nodal degree differences in the semi-metric connectome. Conclusion These encouraging results, in what we believe is the first application of semi-metric analysis to neuroimaging data, raise confidence in the methodology as potentially capable of detection and characterisation of a range of neurodevelopmental and psychiatric disorders. PMID:26308854

  10. Finite Correlation Length Implies Efficient Preparation of Quantum Thermal States

    NASA Astrophysics Data System (ADS)

    Brandão, Fernando G. S. L.; Kastoryano, Michael J.

    2018-05-01

    Preparing quantum thermal states on a quantum computer is in general a difficult task. We provide a procedure to prepare a thermal state on a quantum computer with a logarithmic depth circuit of local quantum channels assuming that the thermal state correlations satisfy the following two properties: (i) the correlations between two regions are exponentially decaying in the distance between the regions, and (ii) the thermal state is an approximate Markov state for shielded regions. We require both properties to hold for the thermal state of the Hamiltonian on any induced subgraph of the original lattice. Assumption (ii) is satisfied for all commuting Gibbs states, while assumption (i) is satisfied for every model above a critical temperature. Both assumptions are satisfied in one spatial dimension. Moreover, both assumptions are expected to hold above the thermal phase transition for models without any topological order at finite temperature. As a building block, we show that exponential decay of correlation (for thermal states of Hamiltonians on all induced subgraphs) is sufficient to efficiently estimate the expectation value of a local observable. Our proof uses quantum belief propagation, a recent strengthening of strong sub-additivity, and naturally breaks down for states with topological order.

  11. A New Analysis of Resting State Connectivity and Graph Theory Reveals Distinctive Short-Term Modulations due to Whisker Stimulation in Rats.

    PubMed

    Kreitz, Silke; de Celis Alonso, Benito; Uder, Michael; Hess, Andreas

    2018-01-01

    Resting state (RS) connectivity has been increasingly studied in healthy and diseased brains in humans and animals. This paper presents a new method to analyze RS data from fMRI that combines multiple seed correlation analysis with graph-theory (MSRA). We characterize and evaluate this new method in relation to two other graph-theoretical methods and ICA. The graph-theoretical methods calculate cross-correlations of regional average time-courses, one using seed regions of the same size (SRCC) and the other using whole brain structure regions (RCCA). We evaluated the reproducibility, power, and capacity of these methods to characterize short-term RS modulation to unilateral physiological whisker stimulation in rats. Graph-theoretical networks found with the MSRA approach were highly reproducible, and their communities showed large overlaps with ICA components. Additionally, MSRA was the only one of all tested methods that had the power to detect significant RS modulations induced by whisker stimulation that are controlled by family-wise error rate (FWE). Compared to the reduced resting state network connectivity during task performance, these modulations implied decreased connectivity strength in the bilateral sensorimotor and entorhinal cortex. Additionally, the contralateral ventromedial thalamus (part of the barrel field related lemniscal pathway) and the hypothalamus showed reduced connectivity. Enhanced connectivity was observed in the amygdala, especially the contralateral basolateral amygdala (involved in emotional learning processes). In conclusion, MSRA is a powerful analytical approach that can reliably detect tiny modulations of RS connectivity. It shows a great promise as a method for studying RS dynamics in healthy and pathological conditions.

  12. A New Analysis of Resting State Connectivity and Graph Theory Reveals Distinctive Short-Term Modulations due to Whisker Stimulation in Rats

    PubMed Central

    Kreitz, Silke; de Celis Alonso, Benito; Uder, Michael; Hess, Andreas

    2018-01-01

    Resting state (RS) connectivity has been increasingly studied in healthy and diseased brains in humans and animals. This paper presents a new method to analyze RS data from fMRI that combines multiple seed correlation analysis with graph-theory (MSRA). We characterize and evaluate this new method in relation to two other graph-theoretical methods and ICA. The graph-theoretical methods calculate cross-correlations of regional average time-courses, one using seed regions of the same size (SRCC) and the other using whole brain structure regions (RCCA). We evaluated the reproducibility, power, and capacity of these methods to characterize short-term RS modulation to unilateral physiological whisker stimulation in rats. Graph-theoretical networks found with the MSRA approach were highly reproducible, and their communities showed large overlaps with ICA components. Additionally, MSRA was the only one of all tested methods that had the power to detect significant RS modulations induced by whisker stimulation that are controlled by family-wise error rate (FWE). Compared to the reduced resting state network connectivity during task performance, these modulations implied decreased connectivity strength in the bilateral sensorimotor and entorhinal cortex. Additionally, the contralateral ventromedial thalamus (part of the barrel field related lemniscal pathway) and the hypothalamus showed reduced connectivity. Enhanced connectivity was observed in the amygdala, especially the contralateral basolateral amygdala (involved in emotional learning processes). In conclusion, MSRA is a powerful analytical approach that can reliably detect tiny modulations of RS connectivity. It shows a great promise as a method for studying RS dynamics in healthy and pathological conditions. PMID:29875622

  13. Using graph theory to quantify coarse sediment connectivity in alpine geosystems

    NASA Astrophysics Data System (ADS)

    Heckmann, Tobias; Thiel, Markus; Schwanghart, Wolfgang; Haas, Florian; Becht, Michael

    2010-05-01

    Networks are a common object of study in various disciplines. Among others, informatics, sociology, transportation science, economics and ecology frequently deal with objects which are linked with other objects to form a network. Despite this wide thematic range, a coherent formal basis to represent, measure and model the relational structure of models exists. The mathematical model for networks of all kinds is a graph which can be analysed using the tools of mathematical graph theory. In a graph model of a generic system, system components are represented by graph nodes, and the linkages between them are formed by graph edges. The latter may represent all kinds of linkages, from matter or energy fluxes to functional relations. To some extent, graph theory has been used in geosciences and related disciplines; in hydrology and fluvial geomorphology, for example, river networks have been modeled and analysed as graphs. An important issue in hydrology is the hydrological connectivity which determines if runoff generated on some area reaches the channel network. In ecology, a number of graph-theoretical indices is applicable to describing the influence of habitat distribution and landscape fragmentation on population structure and species mobility. In these examples, the mobility of matter (water, sediment, animals) through a system is an important consequence of system structure, i.e. the location and topology of its components as well as of properties of linkages between them. In geomorphology, sediment connectivity relates to the potential of sediment particles to move through the catchment. As a system property, connectivity depends, for example, on the degree to which hillslopes within a catchment are coupled to the channel system (lateral coupling), and to which channel reaches are coupled to each other (longitudinal coupling). In the present study, numerical GIS-based models are used to investigate the coupling of geomorphic process units by delineating the process domains of important geomorphic processes in a high-mountain environment (rockfall, slope-type debris flows, slope aquatic and fluvial processes). The results are validated by field mapping; they show that only small parts of a catchment are actually coupled to its outlet with respect to coarse (bedload) sediment. The models not only generate maps of the spatial extent and geomorphic activity of the aforementioned processes, they also output so-called edge lists that can be converted to adjacency matrices and graphs. Graph theory is then employed to explore ‘local' (i.e. referring to single nodes or edges) and ‘global' (i.e. system-wide, referring to the whole graph) measures that can be used to quantify coarse sediment connectivity. Such a quantification will complement the mainly qualitative appraisal of coupling and connectivity; the effect of connectivity on catchment properties such as specific sediment yield and catchment sensitivity will then be studied on the basis of quantitative measures.

  14. Glassy nature of hierarchical organizations.

    PubMed

    Zamani, Maryam; Vicsek, Tamas

    2017-05-03

    The question of why and how animal and human groups form temporarily stable hierarchical organizations has long been a great challenge from the point of quantitative interpretations. The prevailing observation/consensus is that a hierarchical social or technological structure is optimal considering a variety of aspects. Here we introduce a simple quantitative interpretation of this situation using a statistical mechanics-type approach. We look for the optimum of the efficiency function [Formula: see text] with J ij denoting the nature of the interaction between the units i and j and a i standing for the ability of member i to contribute to the efficiency of the system. Notably, this expression for E eff has a similar structure to that of the energy as defined for spin-glasses. Unconventionally, we assume that J ij -s can have the values 0 (no interaction), +1 and -1; furthermore, a direction is associated with each edge. The essential and novel feature of our approach is that instead of optimizing the state of the nodes of a pre-defined network, we search for extrema for given a i -s in the complex efficiency landscape by finding locally optimal network topologies for a given number of edges of the subgraphs considered.

  15. The dorsal striatum and the dynamics of the consensus connectomes in the frontal lobe of the human brain.

    PubMed

    Kerepesi, Csaba; Varga, Bálint; Szalkai, Balázs; Grolmusz, Vince

    2018-04-23

    In the applications of the graph theory, it is unusual that one considers numerous, pairwise different graphs on the very same set of vertices. In the case of human braingraphs or connectomes, however, this is the standard situation: the nodes correspond to anatomically identified cerebral regions, and two vertices are connected by an edge if a diffusion MRI-based workflow identifies a fiber of axons, running between the two regions, corresponding to the two vertices. Therefore, if we examine the braingraphs of n subjects, then we have n graphs on the very same, anatomically identified vertex set. It is a natural idea to describe the k-frequently appearing edges in these graphs: the edges that are present between the same two vertices in at least k out of the n graphs. Based on the NIH-funded large Human Connectome Project's public data release, we have reported the construction of the Budapest Reference Connectome Server http://www.connectome.pitgroup.org that generates and visualizes these k-frequently appearing edges. We call the graphs of the k-frequently appearing edges "k-consensus connectomes" since an edge could be included only if it is present in at least k graphs out of n. Considering the whole human brain, we have reported a surprising property of these consensus connectomes earlier. In the present work we are focusing on the frontal lobe of the brain, and we report here a similarly surprising dynamical property of the consensus connectomes when k is gradually changed from k = n to k = 1: the connections between the nodes of the frontal lobe are seemingly emanating from those nodes that were connected to sub-cortical structures of the dorsal striatum: the caudate nucleus, and the putamen. We hypothesize that this dynamic behavior copies the axonal fiber development of the frontal lobe. An animation of the phenomenon is presented at https://youtu.be/wBciB2eW6_8. Copyright © 2018 Elsevier B.V. All rights reserved.

  16. 3DScapeCS: application of three dimensional, parallel, dynamic network visualization in Cytoscape

    PubMed Central

    2013-01-01

    Background The exponential growth of gigantic biological data from various sources, such as protein-protein interaction (PPI), genome sequences scaffolding, Mass spectrometry (MS) molecular networking and metabolic flux, demands an efficient way for better visualization and interpretation beyond the conventional, two-dimensional visualization tools. Results We developed a 3D Cytoscape Client/Server (3DScapeCS) plugin, which adopted Cytoscape in interpreting different types of data, and UbiGraph for three-dimensional visualization. The extra dimension is useful in accommodating, visualizing, and distinguishing large-scale networks with multiple crossed connections in five case studies. Conclusions Evaluation on several experimental data using 3DScapeCS and its special features, including multilevel graph layout, time-course data animation, and parallel visualization has proven its usefulness in visualizing complex data and help to make insightful conclusions. PMID:24225050

  17. Employee reactions and adjustment to euthanasia-related work: identifying turning-point events through retrospective narratives.

    PubMed

    Reeve, Charlie L; Spitzmuller, Christiane; Rogelberg, Steven G; Walker, Alan; Schultz, Lisa; Clark, Olga

    2004-01-01

    This study used a retrospective narrative procedure to examine the critical events that influence reactions and adjustment to euthanasia-related work of 35 employees who have stayed in the animal care and welfare field for at least 2 years. The study analyzed adjustment trajectory graphs and interview notes to identify turning-point events that spurred either a positive or negative change in shelter workers' psychological well-being. Analysis of the identified turning-point events revealed 10 common event themes that have implications for a range of work, personnel, and organizational practices. The article discusses implications for shelter, employee, and animal welfare.

  18. JPRS Report, Epidemiology.

    DTIC Science & Technology

    1989-07-19

    diagnoses as "endemic complacentitis." Chicken pox has persisted well past its seasonal limits. Since it is a non-notifiable disease, exact...Diseases comparative figures for the first four months of 1988 and 1989 show a worri- some graph. The hospital admitted 1,935 chicken pox patients last...PAULO, 20 Jun 89] 13 Leprosy Incidence Fourth Highest in World [Rio de Janiero O GLOBO, 3 May 89] 14 MEXICO African Bees Kill Farm Animals in

  19. ETHOWATCHER: validation of a tool for behavioral and video-tracking analysis in laboratory animals.

    PubMed

    Crispim Junior, Carlos Fernando; Pederiva, Cesar Nonato; Bose, Ricardo Chessini; Garcia, Vitor Augusto; Lino-de-Oliveira, Cilene; Marino-Neto, José

    2012-02-01

    We present a software (ETHOWATCHER(®)) developed to support ethography, object tracking and extraction of kinematic variables from digital video files of laboratory animals. The tracking module allows controlled segmentation of the target from the background, extracting image attributes used to calculate the distance traveled, orientation, length, area and a path graph of the experimental animal. The ethography module allows recording of catalog-based behaviors from environment or from video files continuously or frame-by-frame. The output reports duration, frequency and latency of each behavior and the sequence of events in a time-segmented format, set by the user. Validation tests were conducted on kinematic measurements and on the detection of known behavioral effects of drugs. This software is freely available at www.ethowatcher.ufsc.br. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Motif structure and cooperation in real-world complex networks

    NASA Astrophysics Data System (ADS)

    Salehi, Mostafa; Rabiee, Hamid R.; Jalili, Mahdi

    2010-12-01

    Networks of dynamical nodes serve as generic models for real-world systems in many branches of science ranging from mathematics to physics, technology, sociology and biology. Collective behavior of agents interacting over complex networks is important in many applications. The cooperation between selfish individuals is one of the most interesting collective phenomena. In this paper we address the interplay between the motifs’ cooperation properties and their abundance in a number of real-world networks including yeast protein-protein interaction, human brain, protein structure, email communication, dolphins’ social interaction, Zachary karate club and Net-science coauthorship networks. First, the amount of cooperativity for all possible undirected subgraphs with three to six nodes is calculated. To this end, the evolutionary dynamics of the Prisoner’s Dilemma game is considered and the cooperativity of each subgraph is calculated as the percentage of cooperating agents at the end of the simulation time. Then, the three- to six-node motifs are extracted for each network. The significance of the abundance of a motif, represented by a Z-value, is obtained by comparing them with some properly randomized versions of the original network. We found that there is always a group of motifs showing a significant inverse correlation between their cooperativity amount and Z-value, i.e. the more the Z-value the less the amount of cooperativity. This suggests that networks composed of well-structured units do not have good cooperativity properties.

  1. Biological network motif detection and evaluation

    PubMed Central

    2011-01-01

    Background Molecular level of biological data can be constructed into system level of data as biological networks. Network motifs are defined as over-represented small connected subgraphs in networks and they have been used for many biological applications. Since network motif discovery involves computationally challenging processes, previous algorithms have focused on computational efficiency. However, we believe that the biological quality of network motifs is also very important. Results We define biological network motifs as biologically significant subgraphs and traditional network motifs are differentiated as structural network motifs in this paper. We develop five algorithms, namely, EDGEGO-BNM, EDGEBETWEENNESS-BNM, NMF-BNM, NMFGO-BNM and VOLTAGE-BNM, for efficient detection of biological network motifs, and introduce several evaluation measures including motifs included in complex, motifs included in functional module and GO term clustering score in this paper. Experimental results show that EDGEGO-BNM and EDGEBETWEENNESS-BNM perform better than existing algorithms and all of our algorithms are applicable to find structural network motifs as well. Conclusion We provide new approaches to finding network motifs in biological networks. Our algorithms efficiently detect biological network motifs and further improve existing algorithms to find high quality structural network motifs, which would be impossible using existing algorithms. The performances of the algorithms are compared based on our new evaluation measures in biological contexts. We believe that our work gives some guidelines of network motifs research for the biological networks. PMID:22784624

  2. Studying the Immunomodulatory Effects of Small Molecule Ras Inhibitors in Animal Models of Rheumatoid Arthritis

    DTIC Science & Technology

    2017-10-01

    methotrexate (MTX) provides a superior protective effect compared to monotherapy. (iii) In the AIA model the FTS derivative, F-FTS, showed higher...therapeutic efficacy compared to FTS. (iv) The functional genomics studies showed that FTS therapy inhibits the in vivo TH17 immune response. (v) FTS semi... Description shall include pertinent data and graphs in sufficient detail to explain any significant results achieved. A succinct description of the

  3. Transfer matrix computation of critical polynomials for two-dimensional Potts models

    DOE PAGES

    Jacobsen, Jesper Lykke; Scullard, Christian R.

    2013-02-04

    We showed, In our previous work, that critical manifolds of the q-state Potts model can be studied by means of a graph polynomial P B(q, v), henceforth referred to as the critical polynomial. This polynomial may be defined on any periodic two-dimensional lattice. It depends on a finite subgraph B, called the basis, and the manner in which B is tiled to construct the lattice. The real roots v = e K — 1 of P B(q, v) either give the exact critical points for the lattice, or provide approximations that, in principle, can be made arbitrarily accurate by increasingmore » the size of B in an appropriate way. In earlier work, P B(q, v) was defined by a contraction-deletion identity, similar to that satisfied by the Tutte polynomial. Here, we give a probabilistic definition of P B(q, v), which facilitates its computation, using the transfer matrix, on much larger B than was previously possible.We present results for the critical polynomial on the (4, 8 2), kagome, and (3, 12 2) lattices for bases of up to respectively 96, 162, and 243 edges, compared to the limit of 36 edges with contraction-deletion. We discuss in detail the role of the symmetries and the embedding of B. The critical temperatures v c obtained for ferromagnetic (v > 0) Potts models are at least as precise as the best available results from Monte Carlo simulations or series expansions. For instance, with q = 3 we obtain v c(4, 8 2) = 3.742 489 (4), v c(kagome) = 1.876 459 7 (2), and v c(3, 12 2) = 5.033 078 49 (4), the precision being comparable or superior to the best simulation results. More generally, we trace the critical manifolds in the real (q, v) plane and discuss the intricate structure of the phase diagram in the antiferromagnetic (v < 0) region.« less

  4. Tracing Actual Causes

    DTIC Science & Technology

    2016-08-08

    actual values for variables in the SEM ), and an event e with M ,~u |= e, our definition answers the question : Which paths of the causal network G( M ...for each variable and a directed edge from vari- able X to Y if the equation for computing X uses Y . Given an SEM M , a context ~u (that supplies the...caused the event e1? Our definition answers this question as a set of causal slices, where each causal slice is a subgraph of G( M ). All paths in each

  5. Identification of informative subgraphs in brain networks

    NASA Astrophysics Data System (ADS)

    Marinazzo, D.; Wu, G.; Pellicoro, M.; Stramaglia, S.

    2013-01-01

    Measuring directed interactions in the brain in terms of information flow is a promising approach, mathematically treatable and amenable to encompass several methods. Here we present a formal expansion of the transfer entropy to put in evidence irreducible sets of variables which provide information for the future state of each assigned target. Multiplets characterized by a large contribution to the expansion are associated to informational circuits present in the system, with an informational character (synergetic or redundant) which can be inferred from the sign of the contribution.

  6. Relationship of trade patterns of the Danish swine industry animal movements network to potential disease spread.

    PubMed

    Bigras-Poulin, Michel; Barfod, Kristen; Mortensen, Sten; Greiner, Matthias

    2007-07-16

    The movements of animals were analysed under the conceptual framework of graph theory in mathematics. The swine production related premises of Denmark were considered to constitute the nodes of a network and the links were the animal movements. In this framework, each farm will have a network of other premises to which it will be linked. A premise was a farm (breeding, rearing or slaughter pig), an abattoir or a trade market. The overall network was divided in premise specific subnets that linked the other premises from and to which animals were moved. This approach allowed us to visualise and analyse the three levels of organization related to animal movements that existed in the Danish swine production registers: the movement of animals between two premises, the premise specific networks, and the industry network. The analyses of animal movements were done using these three levels of organisation. The movements of swine were studied for the period September 30, 2002 to May 22, 2003. For daily movements of swine between two slaughter pig premises, the median number of pigs moved was 130 pigs with a maximum of 3306. For movements between a slaughter pig premise and an abattoir, the median number of pigs was 24. The largest percentage of movements was from farm to abattoir (82.5%); the median number of pigs per movement was 24 and the maximum number was 2018. For the whole period the median and maximum Euclidean distances observed in farm-to-farm movements were 22 km and 289 km respectively, while in the farm-to-abattoir movements, they were 36.2 km and 285 km. The network related to one specific premise showed that the median number of premises was mainly away from slaughter pig farms (3) or breeder farms (26) and mainly to an abattoir (1535). The assumption that animal movements can be randomly generated on the basis of farm density of the surrounding area of any farm is not correct since the patterns of animal movements have the topology of a scale-free network with a large degree of heterogeneity. This supported the opinion that the disease spread software assuming homogeneity in farm-to-farm relationship should only be used for large-scale interpretation and for epidemic preparedness. The network approach, based on graph theory, can be used efficiently to express more precisely, on a local scale (premise), the heterogeneity of animal movements. This approach, by providing network knowledge to the local veterinarian in charge of controlling disease spread, should also be evaluated as a potential tool to manage epidemics during the crisis. Geographic information systems could also be linked in the approach to produce knowledge about local transmission of disease.

  7. Eb&D: A new clustering approach for signed social networks based on both edge-betweenness centrality and density of subgraphs

    NASA Astrophysics Data System (ADS)

    Qi, Xingqin; Song, Huimin; Wu, Jianliang; Fuller, Edgar; Luo, Rong; Zhang, Cun-Quan

    2017-09-01

    Clustering algorithms for unsigned social networks which have only positive edges have been studied intensively. However, when a network has like/dislike, love/hate, respect/disrespect, or trust/distrust relationships, unsigned social networks with only positive edges are inadequate. Thus we model such kind of networks as signed networks which can have both negative and positive edges. Detecting the cluster structures of signed networks is much harder than for unsigned networks, because it not only requires that positive edges within clusters are as many as possible, but also requires that negative edges between clusters are as many as possible. Currently, we have few clustering algorithms for signed networks, and most of them requires the number of final clusters as an input while it is actually hard to predict beforehand. In this paper, we will propose a novel clustering algorithm called Eb &D for signed networks, where both the betweenness of edges and the density of subgraphs are used to detect cluster structures. A hierarchically nested system will be constructed to illustrate the inclusion relationships of clusters. To show the validity and efficiency of Eb &D, we test it on several classical social networks and also hundreds of synthetic data sets, and all obtain better results compared with other methods. The biggest advantage of Eb &D compared with other methods is that the number of clusters do not need to be known prior.

  8. Adaption of the temporal correlation coefficient calculation for temporal networks (applied to a real-world pig trade network).

    PubMed

    Büttner, Kathrin; Salau, Jennifer; Krieter, Joachim

    2016-01-01

    The average topological overlap of two graphs of two consecutive time steps measures the amount of changes in the edge configuration between the two snapshots. This value has to be zero if the edge configuration changes completely and one if the two consecutive graphs are identical. Current methods depend on the number of nodes in the network or on the maximal number of connected nodes in the consecutive time steps. In the first case, this methodology breaks down if there are nodes with no edges. In the second case, it fails if the maximal number of active nodes is larger than the maximal number of connected nodes. In the following, an adaption of the calculation of the temporal correlation coefficient and of the topological overlap of the graph between two consecutive time steps is presented, which shows the expected behaviour mentioned above. The newly proposed adaption uses the maximal number of active nodes, i.e. the number of nodes with at least one edge, for the calculation of the topological overlap. The three methods were compared with the help of vivid example networks to reveal the differences between the proposed notations. Furthermore, these three calculation methods were applied to a real-world network of animal movements in order to detect influences of the network structure on the outcome of the different methods.

  9. The Double Star Orbit Initial Value Problem

    NASA Astrophysics Data System (ADS)

    Hensley, Hagan

    2018-04-01

    Many precise algorithms exist to find a best-fit orbital solution for a double star system given a good enough initial value. Desmos is an online graphing calculator tool with extensive capabilities to support animations and defining functions. It can provide a useful visual means of analyzing double star data to arrive at a best guess approximation of the orbital solution. This is a necessary requirement before using a gradient-descent algorithm to find the best-fit orbital solution for a binary system.

  10. MMB-4 Inhibition of Aceylcholinesterase Is Similar across Species

    DTIC Science & Technology

    2014-11-01

    version 5.4). An IC50 value was determined for AChE from each animal species by fitting the percent of AChE activity with respect to MMB 4 concentration...in GraphPad Prism (version 5) using a nonlinear regression dose response model for inhibition (normalized response with variable slope). Assessing the...Therefore, AChE activity and inhibition studies were carried out at 435 nm to reduce interference from MMB 4. Comparison of IC50 Values for MMB 4 with AChE

  11. Interactive Multiple-Representation Editing of Physically-based 3D Animation

    DTIC Science & Technology

    1994-05-29

    view similar to that of FrameMaker [46], but with a stronger model of structured editing, and a language window, which displays a formal description of...used. This is partly a result of the power of a good direct manipulation editor { most users of good editors like FrameMaker don’t seem to miss having a...re nement. In SIG- GRAPH, pages 205{212, August 1988. [46] Frame Technology Inc. Using FrameMaker , 1990. [47] Bjorn N. Freeman-Benson and John Maloney

  12. User's Guide for Flight Simulation Data Visualization Workstation

    NASA Technical Reports Server (NTRS)

    Kaplan, Joseph A.; Chen, Ronnie; Kenney, Patrick S.; Koval, Christopher M.; Hutchinson, Brian K.

    1996-01-01

    Today's modern flight simulation research produces vast amounts of time sensitive data. The meaning of this data can be difficult to assess while in its raw format. Therefore, a method of breaking the data down and presenting it to the user in a graphical format is necessary. Simulation Graphics (SimGraph) is intended as a data visualization software package that will incorporate simulation data into a variety of animated graphical displays for easy interpretation by the simulation researcher. This document is intended as an end user's guide.

  13. Couple Graph Based Label Propagation Method for Hyperspectral Remote Sensing Data Classification

    NASA Astrophysics Data System (ADS)

    Wang, X. P.; Hu, Y.; Chen, J.

    2018-04-01

    Graph based semi-supervised classification method are widely used for hyperspectral image classification. We present a couple graph based label propagation method, which contains both the adjacency graph and the similar graph. We propose to construct the similar graph by using the similar probability, which utilize the label similarity among examples probably. The adjacency graph was utilized by a common manifold learning method, which has effective improve the classification accuracy of hyperspectral data. The experiments indicate that the couple graph Laplacian which unite both the adjacency graph and the similar graph, produce superior classification results than other manifold Learning based graph Laplacian and Sparse representation based graph Laplacian in label propagation framework.

  14. Multi-Centrality Graph Spectral Decompositions and Their Application to Cyber Intrusion Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Pin-Yu; Choudhury, Sutanay; Hero, Alfred

    Many modern datasets can be represented as graphs and hence spectral decompositions such as graph principal component analysis (PCA) can be useful. Distinct from previous graph decomposition approaches based on subspace projection of a single topological feature, e.g., the centered graph adjacency matrix (graph Laplacian), we propose spectral decomposition approaches to graph PCA and graph dictionary learning that integrate multiple features, including graph walk statistics, centrality measures and graph distances to reference nodes. In this paper we propose a new PCA method for single graph analysis, called multi-centrality graph PCA (MC-GPCA), and a new dictionary learning method for ensembles ofmore » graphs, called multi-centrality graph dictionary learning (MC-GDL), both based on spectral decomposition of multi-centrality matrices. As an application to cyber intrusion detection, MC-GPCA can be an effective indicator of anomalous connectivity pattern and MC-GDL can provide discriminative basis for attack classification.« less

  15. Graphs, matrices, and the GraphBLAS: Seven good reasons

    DOE PAGES

    Kepner, Jeremy; Bader, David; Buluç, Aydın; ...

    2015-01-01

    The analysis of graphs has become increasingly important to a wide range of applications. Graph analysis presents a number of unique challenges in the areas of (1) software complexity, (2) data complexity, (3) security, (4) mathematical complexity, (5) theoretical analysis, (6) serial performance, and (7) parallel performance. Implementing graph algorithms using matrix-based approaches provides a number of promising solutions to these challenges. The GraphBLAS standard (istcbigdata.org/GraphBlas) is being developed to bring the potential of matrix based graph algorithms to the broadest possible audience. The GraphBLAS mathematically defines a core set of matrix-based graph operations that can be used to implementmore » a wide class of graph algorithms in a wide range of programming environments. This paper provides an introduction to the GraphBLAS and describes how the GraphBLAS can be used to address many of the challenges associated with analysis of graphs.« less

  16. Adjusting protein graphs based on graph entropy.

    PubMed

    Peng, Sheng-Lung; Tsay, Yu-Wei

    2014-01-01

    Measuring protein structural similarity attempts to establish a relationship of equivalence between polymer structures based on their conformations. In several recent studies, researchers have explored protein-graph remodeling, instead of looking a minimum superimposition for pairwise proteins. When graphs are used to represent structured objects, the problem of measuring object similarity become one of computing the similarity between graphs. Graph theory provides an alternative perspective as well as efficiency. Once a protein graph has been created, its structural stability must be verified. Therefore, a criterion is needed to determine if a protein graph can be used for structural comparison. In this paper, we propose a measurement for protein graph remodeling based on graph entropy. We extend the concept of graph entropy to determine whether a graph is suitable for representing a protein. The experimental results suggest that when applied, graph entropy helps a conformational on protein graph modeling. Furthermore, it indirectly contributes to protein structural comparison if a protein graph is solid.

  17. Adjusting protein graphs based on graph entropy

    PubMed Central

    2014-01-01

    Measuring protein structural similarity attempts to establish a relationship of equivalence between polymer structures based on their conformations. In several recent studies, researchers have explored protein-graph remodeling, instead of looking a minimum superimposition for pairwise proteins. When graphs are used to represent structured objects, the problem of measuring object similarity become one of computing the similarity between graphs. Graph theory provides an alternative perspective as well as efficiency. Once a protein graph has been created, its structural stability must be verified. Therefore, a criterion is needed to determine if a protein graph can be used for structural comparison. In this paper, we propose a measurement for protein graph remodeling based on graph entropy. We extend the concept of graph entropy to determine whether a graph is suitable for representing a protein. The experimental results suggest that when applied, graph entropy helps a conformational on protein graph modeling. Furthermore, it indirectly contributes to protein structural comparison if a protein graph is solid. PMID:25474347

  18. Characterizing Containment and Related Classes of Graphs,

    DTIC Science & Technology

    1985-01-01

    Math . to appear. [G2] Golumbic,. Martin C., D. Rotem and J. Urrutia. "Comparability graphs and intersection graphs" Discrete Math . 43 (1983) 37-40. [G3...intersection classes of graphs" Discrete Math . to appear. [S2] Scheinerman, Edward R. Intersection Classes and Multiple Intersection Parameters of Graphs...graphs and of interval graphs" Canad. Jour. of blath. 16 (1964) 539-548. [G1] Golumbic, Martin C. "Containment graphs: and. intersection graphs" Discrete

  19. A Collection of Features for Semantic Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Eliassi-Rad, T; Fodor, I K; Gallagher, B

    2007-05-02

    Semantic graphs are commonly used to represent data from one or more data sources. Such graphs extend traditional graphs by imposing types on both nodes and links. This type information defines permissible links among specified nodes and can be represented as a graph commonly referred to as an ontology or schema graph. Figure 1 depicts an ontology graph for data from National Association of Securities Dealers. Each node type and link type may also have a list of attributes. To capture the increased complexity of semantic graphs, concepts derived for standard graphs have to be extended. This document explains brieflymore » features commonly used to characterize graphs, and their extensions to semantic graphs. This document is divided into two sections. Section 2 contains the feature descriptions for static graphs. Section 3 extends the features for semantic graphs that vary over time.« less

  20. Graphing the order of the sexes: constructing, recalling, interpreting, and putting the self in gender difference graphs.

    PubMed

    Hegarty, Peter; Lemieux, Anthony F; McQueen, Grant

    2010-03-01

    Graphs seem to connote facts more than words or tables do. Consequently, they seem unlikely places to spot implicit sexism at work. Yet, in 6 studies (N = 741), women and men constructed (Study 1) and recalled (Study 2) gender difference graphs with men's data first, and graphed powerful groups (Study 3) and individuals (Study 4) ahead of weaker ones. Participants who interpreted graph order as evidence of author "bias" inferred that the author graphed his or her own gender group first (Study 5). Women's, but not men's, preferences to graph men first were mitigated when participants graphed a difference between themselves and an opposite-sex friend prior to graphing gender differences (Study 6). Graph production and comprehension are affected by beliefs and suppositions about the groups represented in graphs to a greater degree than cognitive models of graph comprehension or realist models of scientific thinking have yet acknowledged.

  1. BiNGO: a Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks.

    PubMed

    Maere, Steven; Heymans, Karel; Kuiper, Martin

    2005-08-15

    The Biological Networks Gene Ontology tool (BiNGO) is an open-source Java tool to determine which Gene Ontology (GO) terms are significantly overrepresented in a set of genes. BiNGO can be used either on a list of genes, pasted as text, or interactively on subgraphs of biological networks visualized in Cytoscape. BiNGO maps the predominant functional themes of the tested gene set on the GO hierarchy, and takes advantage of Cytoscape's versatile visualization environment to produce an intuitive and customizable visual representation of the results.

  2. Graphing with "LogoWriter."

    ERIC Educational Resources Information Center

    Yoder, Sharon K.

    This book discusses four kinds of graphs that are taught in mathematics at the middle school level: pictographs, bar graphs, line graphs, and circle graphs. The chapters on each of these types of graphs contain information such as starting, scaling, drawing, labeling, and finishing the graphs using "LogoWriter." The final chapter of the…

  3. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Chung Wong, Pak

    Effectively visualizing large graphs and capturing the statistical properties are two challenging tasks. To aid in these two tasks, many sampling approaches for graph simplification have been proposed, falling into three categories: node sampling, edge sampling, and traversal-based sampling. It is still unknown which approach is the best. We evaluate commonly used graph sampling methods through a combined visual and statistical comparison of graphs sampled at various rates. We conduct our evaluation on three graph models: random graphs, small-world graphs, and scale-free graphs. Initial results indicate that the effectiveness of a sampling method is dependent on the graph model, themore » size of the graph, and the desired statistical property. This benchmark study can be used as a guideline in choosing the appropriate method for a particular graph sampling task, and the results presented can be incorporated into graph visualization and analysis tools.« less

  4. Tissue responses to fractional transient heating with sinusoidal heat flux condition on skin surface.

    PubMed

    Ezzat, Magdy A; El-Bary, Alaa A; Al-Sowayan, Noorah S

    2016-10-01

    A fractional model of Bioheat equation for describing quantitatively the thermal responses of skin tissue under sinusoidal heat flux conditions on skin surface is given. Laplace transform technique is used to obtain the solution in a closed form. The resulting formulation is applied to one-dimensional application to investigate the temperature distribution in skin with instantaneous surface heating for different cases. According to the numerical results and its graphs, conclusion about the fractional bioheat transfer equation has been constructed. Sensitivity analysis is performed to explore the thermal effects of various control parameters on tissue temperature. The comparisons are made with the results obtained in the case of the absence of time-fractional order. © 2016 Japanese Society of Animal Science. © 2016 Japanese Society of Animal Science.

  5. Introducing Seismic Tomography with Computational Modeling

    NASA Astrophysics Data System (ADS)

    Neves, R.; Neves, M. L.; Teodoro, V.

    2011-12-01

    Learning seismic tomography principles and techniques involves advanced physical and computational knowledge. In depth learning of such computational skills is a difficult cognitive process that requires a strong background in physics, mathematics and computer programming. The corresponding learning environments and pedagogic methodologies should then involve sets of computational modelling activities with computer software systems which allow students the possibility to improve their mathematical or programming knowledge and simultaneously focus on the learning of seismic wave propagation and inverse theory. To reduce the level of cognitive opacity associated with mathematical or programming knowledge, several computer modelling systems have already been developed (Neves & Teodoro, 2010). Among such systems, Modellus is particularly well suited to achieve this goal because it is a domain general environment for explorative and expressive modelling with the following main advantages: 1) an easy and intuitive creation of mathematical models using just standard mathematical notation; 2) the simultaneous exploration of images, tables, graphs and object animations; 3) the attribution of mathematical properties expressed in the models to animated objects; and finally 4) the computation and display of mathematical quantities obtained from the analysis of images and graphs. Here we describe virtual simulations and educational exercises which enable students an easy grasp of the fundamental of seismic tomography. The simulations make the lecture more interactive and allow students the possibility to overcome their lack of advanced mathematical or programming knowledge and focus on the learning of seismological concepts and processes taking advantage of basic scientific computation methods and tools.

  6. Mathematical foundations of the GraphBLAS

    DOE PAGES

    Kepner, Jeremy; Aaltonen, Peter; Bader, David; ...

    2016-12-01

    The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This study provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, themore » two are easily connected by matrix multiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of a small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Finally, performance measurements of prototype GraphBLAS implementations indicate that the overhead is low.« less

  7. Probing Factors Influencing Students' Graph Comprehension Regarding Four Operations in Kinematics Graphs

    ERIC Educational Resources Information Center

    Phage, Itumeleng B.; Lemmer, Miriam; Hitge, Mariette

    2017-01-01

    Students' graph comprehension may be affected by the background of the students who are the readers or interpreters of the graph, their knowledge of the context in which the graph is set, and the inferential processes required by the graph operation. This research study investigated these aspects of graph comprehension for 152 first year…

  8. Comparison and Enumeration of Chemical Graphs

    PubMed Central

    Akutsu, Tatsuya; Nagamochi, Hiroshi

    2013-01-01

    Chemical compounds are usually represented as graph structured data in computers. In this review article, we overview several graph classes relevant to chemical compounds and the computational complexities of several fundamental problems for these graph classes. In particular, we consider the following problems: determining whether two chemical graphs are identical, determining whether one input chemical graph is a part of the other input chemical graph, finding a maximum common part of two input graphs, finding a reaction atom mapping, enumerating possible chemical graphs, and enumerating stereoisomers. We also discuss the relationship between the fifth problem and kernel functions for chemical compounds. PMID:24688697

  9. Mean square cordial labelling related to some acyclic graphs and its rough approximations

    NASA Astrophysics Data System (ADS)

    Dhanalakshmi, S.; Parvathi, N.

    2018-04-01

    In this paper we investigate that the path Pn, comb graph Pn⊙K1, n-centipede graph,centipede graph (n,2) and star Sn admits mean square cordial labeling. Also we proved that the induced sub graph obtained by the upper approximation of any sub graph H of the above acyclic graphs admits mean square cordial labeling.

  10. Relating zeta functions of discrete and quantum graphs

    NASA Astrophysics Data System (ADS)

    Harrison, Jonathan; Weyand, Tracy

    2018-02-01

    We write the spectral zeta function of the Laplace operator on an equilateral metric graph in terms of the spectral zeta function of the normalized Laplace operator on the corresponding discrete graph. To do this, we apply a relation between the spectrum of the Laplacian on a discrete graph and that of the Laplacian on an equilateral metric graph. As a by-product, we determine how the multiplicity of eigenvalues of the quantum graph, that are also in the spectrum of the graph with Dirichlet conditions at the vertices, depends on the graph geometry. Finally we apply the result to calculate the vacuum energy and spectral determinant of a complete bipartite graph and compare our results with those for a star graph, a graph in which all vertices are connected to a central vertex by a single edge.

  11. Topic Model for Graph Mining.

    PubMed

    Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng

    2015-12-01

    Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as, documents, chemical molecular structures, and images. However, an issue in relation to current research on graphs is that they cannot adequately discover the topics hidden in graph-structured data which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-word" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs.

  12. Aligning Metabolic Pathways Exploiting Binary Relation of Reactions.

    PubMed

    Huang, Yiran; Zhong, Cheng; Lin, Hai Xiang; Huang, Jing

    2016-01-01

    Metabolic pathway alignment has been widely used to find one-to-one and/or one-to-many reaction mappings to identify the alternative pathways that have similar functions through different sets of reactions, which has important applications in reconstructing phylogeny and understanding metabolic functions. The existing alignment methods exhaustively search reaction sets, which may become infeasible for large pathways. To address this problem, we present an effective alignment method for accurately extracting reaction mappings between two metabolic pathways. We show that connected relation between reactions can be formalized as binary relation of reactions in metabolic pathways, and the multiplications of zero-one matrices for binary relations of reactions can be accomplished in finite steps. By utilizing the multiplications of zero-one matrices for binary relation of reactions, we efficiently obtain reaction sets in a small number of steps without exhaustive search, and accurately uncover biologically relevant reaction mappings. Furthermore, we introduce a measure of topological similarity of nodes (reactions) by comparing the structural similarity of the k-neighborhood subgraphs of the nodes in aligning metabolic pathways. We employ this similarity metric to improve the accuracy of the alignments. The experimental results on the KEGG database show that when compared with other state-of-the-art methods, in most cases, our method obtains better performance in the node correctness and edge correctness, and the number of the edges of the largest common connected subgraph for one-to-one reaction mappings, and the number of correct one-to-many reaction mappings. Our method is scalable in finding more reaction mappings with better biological relevance in large metabolic pathways.

  13. Scale-free models for the structure of business firm networks.

    PubMed

    Kitsak, Maksim; Riccaboni, Massimo; Havlin, Shlomo; Pammolli, Fabio; Stanley, H Eugene

    2010-03-01

    We study firm collaborations in the life sciences and the information and communication technology sectors. We propose an approach to characterize industrial leadership using k -shell decomposition, with top-ranking firms in terms of market value in higher k -shell layers. We find that the life sciences industry network consists of three distinct components: a "nucleus," which is a small well-connected subgraph, "tendrils," which are small subgraphs consisting of small degree nodes connected exclusively to the nucleus, and a "bulk body," which consists of the majority of nodes. Industrial leaders, i.e., the largest companies in terms of market value, are in the highest k -shells of both networks. The nucleus of the life sciences sector is very stable: once a firm enters the nucleus, it is likely to stay there for a long time. At the same time we do not observe the above three components in the information and communication technology sector. We also conduct a systematic study of these three components in random scale-free networks. Our results suggest that the sizes of the nucleus and the tendrils in scale-free networks decrease as the exponent of the power-law degree distribution lambda increases, and disappear for lambda>or=3 . We compare the k -shell structure of random scale-free model networks with two real-world business firm networks in the life sciences and in the information and communication technology sectors. We argue that the observed behavior of the k -shell structure in the two industries is consistent with the coexistence of both preferential and random agreements in the evolution of industrial networks.

  14. Fault tolerance in protein interaction networks: stable bipartite subgraphs and redundant pathways.

    PubMed

    Brady, Arthur; Maxwell, Kyle; Daniels, Noah; Cowen, Lenore J

    2009-01-01

    As increasing amounts of high-throughput data for the yeast interactome become available, more system-wide properties are uncovered. One interesting question concerns the fault tolerance of protein interaction networks: whether there exist alternative pathways that can perform some required function if a gene essential to the main mechanism is defective, absent or suppressed. A signature pattern for redundant pathways is the BPM (between-pathway model) motif, introduced by Kelley and Ideker. Past methods proposed to search the yeast interactome for BPM motifs have had several important limitations. First, they have been driven heuristically by local greedy searches, which can lead to the inclusion of extra genes that may not belong in the motif; second, they have been validated solely by functional coherence of the putative pathways using GO enrichment, making it difficult to evaluate putative BPMs in the absence of already known biological annotation. We introduce stable bipartite subgraphs, and show they form a clean and efficient way of generating meaningful BPMs which naturally discard extra genes included by local greedy methods. We show by GO enrichment measures that our BPM set outperforms previous work, covering more known complexes and functional pathways. Perhaps most importantly, since our BPMs are initially generated by examining the genetic-interaction network only, the location of edges in the protein-protein physical interaction network can then be used to statistically validate each candidate BPM, even with sparse GO annotation (or none at all). We uncover some interesting biological examples of previously unknown putative redundant pathways in such areas as vesicle-mediated transport and DNA repair.

  15. Fault Tolerance in Protein Interaction Networks: Stable Bipartite Subgraphs and Redundant Pathways

    PubMed Central

    Brady, Arthur; Maxwell, Kyle; Daniels, Noah; Cowen, Lenore J.

    2009-01-01

    As increasing amounts of high-throughput data for the yeast interactome become available, more system-wide properties are uncovered. One interesting question concerns the fault tolerance of protein interaction networks: whether there exist alternative pathways that can perform some required function if a gene essential to the main mechanism is defective, absent or suppressed. A signature pattern for redundant pathways is the BPM (between-pathway model) motif, introduced by Kelley and Ideker. Past methods proposed to search the yeast interactome for BPM motifs have had several important limitations. First, they have been driven heuristically by local greedy searches, which can lead to the inclusion of extra genes that may not belong in the motif; second, they have been validated solely by functional coherence of the putative pathways using GO enrichment, making it difficult to evaluate putative BPMs in the absence of already known biological annotation. We introduce stable bipartite subgraphs, and show they form a clean and efficient way of generating meaningful BPMs which naturally discard extra genes included by local greedy methods. We show by GO enrichment measures that our BPM set outperforms previous work, covering more known complexes and functional pathways. Perhaps most importantly, since our BPMs are initially generated by examining the genetic-interaction network only, the location of edges in the protein-protein physical interaction network can then be used to statistically validate each candidate BPM, even with sparse GO annotation (or none at all). We uncover some interesting biological examples of previously unknown putative redundant pathways in such areas as vesicle-mediated transport and DNA repair. PMID:19399174

  16. Preserving Differential Privacy in Degree-Correlation based Graph Generation

    PubMed Central

    Wang, Yue; Wu, Xintao

    2014-01-01

    Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then generate the graphs for releasing using the graph model with the private parameters. In particular, we develop a differential privacy preserving graph generator based on the dK-graph generation model. We first derive from the original graph various parameters (i.e., degree correlations) used in the dK-graph model, then enforce edge differential privacy on the learned parameters, and finally use the dK-graph model with the perturbed parameters to generate graphs. For the 2K-graph model, we enforce the edge differential privacy by calibrating noise based on the smooth sensitivity, rather than the global sensitivity. By doing this, we achieve the strict differential privacy guarantee with smaller magnitude noise. We conduct experiments on four real networks and compare the performance of our private dK-graph models with the stochastic Kronecker graph generation model in terms of utility and privacy tradeoff. Empirical evaluations show the developed private dK-graph generation models significantly outperform the approach based on the stochastic Kronecker generation model. PMID:24723987

  17. PREFACE: Complex Networks: from Biology to Information Technology

    NASA Astrophysics Data System (ADS)

    Barrat, A.; Boccaletti, S.; Caldarelli, G.; Chessa, A.; Latora, V.; Motter, A. E.

    2008-06-01

    The field of complex networks is one of the most active areas in contemporary statistical physics. Ten years after seminal work initiated the modern study of networks, interest in the field is in fact still growing, as indicated by the ever increasing number of publications in network science. The reason for such a resounding success is most likely the simplicity and broad significance of the approach that, through graph theory, allows researchers to address a variety of different complex systems within a common framework. This special issue comprises a selection of contributions presented at the workshop 'Complex Networks: from Biology to Information Technology' held in July 2007 in Pula (Cagliari), Italy as a satellite of the general conference STATPHYS23. The contributions cover a wide range of problems that are currently among the most important questions in the area of complex networks and that are likely to stimulate future research. The issue is organised into four sections. The first two sections describe 'methods' to study the structure and the dynamics of complex networks, respectively. After this methodological part, the issue proceeds with a section on applications to biological systems. The issue closes with a section concentrating on applications to the study of social and technological networks. The first section, entitled Methods: The Structure, consists of six contributions focused on the characterisation and analysis of structural properties of complex networks: The paper Motif-based communities in complex networks by Arenas et al is a study of the occurrence of characteristic small subgraphs in complex networks. These subgraphs, known as motifs, are used to define general classes of nodes and their communities by extending the mathematical expression of the Newman-Girvan modularity. The same line of research, aimed at characterising network structure through the analysis of particular subgraphs, is explored by Bianconi and Gulbahce in Algorithm for counting large directed loops. This work proposes a belief-propagation algorithm for counting long loops in directed networks, which is then applied to networks of different sizes and loop structure. In The anatomy of a large query graph, Baeza-Yates and Tiberi show that scale invariance is present also in the structure of a graph derived from query logs. This graph is determined not only by the queries but also by the subsequent actions of the users. The graph analysed in this study is generated by more than twenty million queries and is less sparse than suggested by previous studies. A different class of networks is considered by Travençolo and da F Costa in Hierarchical spatial organisation of geographical networks. This work proposes a hierarchical extension of the polygonality index as a means to characterise geographical planar networks and, in particular, to obtain more complete information about the spatial order of the network at progressive spatial scales. The paper Border trees of complex networks by Villas Boas et al focuses instead on the statistical properties of the boundary of graphs, constituted by the vertices of degree one (the leaves of border trees). The authors study the local properties, the depth, and the number of leaves of these border trees, finding that in some real networks more than half of the nodes belong to the border trees. The last contribution to the first section is The generation of random directed networks with prescribed 1-node and 2-node degree correlations by Zamora-López et al. This study deals with the generation of random directed networks and shows that often a large number of links cannot be 'randomised' without altering the degree correlations. This permits fast generation of ensembles of maximally random networks. In the section Methods: The Dynamics, significant attention is given to the study of synchronisation processes on networks: Díaz-Guilera's contribution Dynamics towards synchronisation in hierarchical networks consists of an overview of recent studies on hierarchical networks of phase oscillators. By analysing the evolution of the synchronous dynamics, one can infer details about the underlying network topology. Thus a connection between the dynamical and topological properties of the system is established. The paper Network synchronisation: optimal and pessimal scale-free topologies by Donetti et al explores an optimisation algorithm to study the properties of optimally synchronisable unweighted networks with scale-free degree distribution. It is shown that optimisation leads to a tendency towards disassortativity while networks that are optimally 'un-synchronisable' have a highly assortative string-like structure. The paper Critical line in undirected Kauffman Boolean networks—the role of percolation by Fronczak and Fronczak demonstrates that the percolation underlying the process of damage spreading impacts the position of the critical line in random boolean networks. The critical line results from the fact that the ordered behaviour of small clusters shields the chaotic behaviour of the giant component. In Impact of the updating scheme on stationary states of networks, Radicchi et al explore an interpolation between synchronous and asynchronous updating in a one-dimensional chain of Ising spins to locate a phase transition between phases with an absorbing and a fluctuating stationary state. The properties of attractors in the yeast cell-cycle network are also shown to depend sensitively on the updating mode. As this last contribution shows, a large part of the theoretical activity in the field can be applied to the study of biological systems. The section Biological Applications brings together the following contributions: In Applying weighted network measures to microarray distance matrices, Ahnert et al present a new approach to the analysis of weighted networks, which provides a generalisation to any network measure defined on unweighted networks. The clustering coefficient constructed using this approach is used to identify a number of biologically significant genes in data sets from microarray experiments. The paper Quantifying the taxonomic diversity in real species communities by Caretta Cartozo et al reports on universal statistical properties in taxonomic trees. The results, which are obtained by sampling a large pool of species from all over the world, suggest that it is possible to quantitatively distinguish real species assemblage from random collections. In the contribution Insights into biological information processing: structural and dynamical analysis of a human protein signalling network, de la Fuente et al investigate the dynamical properties of a human protein signalling network while accounting for edge directionality and topological properties both at the local and global scale. The relationship between the node degrees and the distribution of signals through the network is characterised using degree correlation profiles. A study of a brain network is presented by de Vico Fallani et al in Persistent patterns of interconnection in time-varying cortical networks estimated from high-resolution EEG recordings in humans during a simple motor act. The authors introduce an approach based on the estimate of time-varying graph indexes that allows the capture of schemes of communication within the network. The method is applied to a set of high resolution EEG data recorded from a group of subjects performing a simple foot movement. The last section, devoted to Social and Technological Applications, includes nine contributions in the broad area of infrastructure, economic, and social systems: The paper Uncovering individual and collective human dynamics from mobile phone records by Cándia et al explores extensive phone records resolved in both time and space to study collective behaviour and the occurrence of anomalous events. At the individual level, it is shown that the distribution of time intervals between consecutive calls is heavy tailed, which agrees with results previously reported on other human activities. In Mining the inner structure of the Web graph, Donato et al present a series of measurements of the Web, which offer a better understanding of the individual components of its bow-tie structure. The scale-free properties permeate all bow-tie components although they do not exhibit self-similarity and their inner structure is quite distinct. Effects of network topology on wealth distributions, by Garlaschelli and Loffredo, shows that a networked economic system self-organises towards a stationary state whose associated wealth distribution depends crucially on the underlying interaction network. In particular, this study implies that first-order topological properties alone (such as the scale-free property) are not enough to explain the emergence of the empirically observed mixed form of the wealth distribution. In the paper Resource allocation pattern in infrastructure networks, Kim and Motter show that real communication and transportation networks tend to exhibit larger load-to-capacity ratio in nodes and links with larger capacities. This surprising pattern, which is a consequence of decentralised evolution and network traffic fluctuations, suggests that infrastructure networks have evolved to prevent local failures but not necessarily large-scale failures that can be caused by cascading processes. The paper Consensus formation on coevolving networks: groups' formation and structure by Kozma and Barrat addresses the effect of adaptivity on a social model of opinion dynamics and consensus formation. The authors find that on adaptive networks the rewiring process fosters group formation by enhancing communication between agents of similar opinion, though it also makes possible the division of clusters. This result is significantly different from the percolation phenomena observed to govern the process in static networks. Capocci and Caldarelli, in the paper Folksonomies and clustering in the collaborative system CiteULike, analyse an online collaborative tagging system where users bookmark and annotate scientific papers. Such a system can be naturally represented as a tripartite graph whose nodes represent papers, users and tags connected by individual tag assignments. The semantics of tags is studied in order to uncover hidden relationships between tags. The authors find that the clustering coefficient reflects the semantical patterns among tags. Lambiotte's contribution, Majority rule on heterogeneous networks, focuses on the majority rule model for opinion formation when the agents interact through a complex network. It is shown that on networks with modular structures the system may exhibit an asymmetric regime, where nodes in different communities reach opposite average opinions. In addition, the node degree heterogeneity is shown to play an important role in the emergence of collective behaviour. In Structural analysis of behavioural networks from the Internet, Meiss et al analyse the structure of the Internet. The authors present a characterisation of the properties of the behavioural networks generated by several million users of the Abilene (Internet2) network. Structural features of these networks offer new insights into scaling properties of network activity and ways of distinguishing particular patterns of traffic. The final contribution, A social network's changing statistical properties and the quality of human innovation by Uzzi, is an analysis of the collaboration network of artists that made Broadway musicals in the post World War II period. It is shown that when the clustering coefficient in this network is low or high, the financial and artistic success of the industry is low while an intermediate level of clustering is associated with successful shows. We hope that this special issue will serve as a reference of the state of the knowledge in this exciting area of interdisciplinary research and that it will appeal to both experts and newcomers to the field. Finally, we would like to thank all participants of the workshop for their very significant contributions and the IOP Publishing team, particularly Rebecca Gillan, for the careful production of this special issue.

  18. Bipartite separability and nonlocal quantum operations on graphs

    NASA Astrophysics Data System (ADS)

    Dutta, Supriyo; Adhikari, Bibhas; Banerjee, Subhashish; Srikanth, R.

    2016-07-01

    In this paper we consider the separability problem for bipartite quantum states arising from graphs. Earlier it was proved that the degree criterion is the graph-theoretic counterpart of the familiar positive partial transpose criterion for separability, although there are entangled states with positive partial transpose for which the degree criterion fails. Here we introduce the concept of partially symmetric graphs and degree symmetric graphs by using the well-known concept of partial transposition of a graph and degree criteria, respectively. Thus, we provide classes of bipartite separable states of dimension m ×n arising from partially symmetric graphs. We identify partially asymmetric graphs that lack the property of partial symmetry. We develop a combinatorial procedure to create a partially asymmetric graph from a given partially symmetric graph. We show that this combinatorial operation can act as an entanglement generator for mixed states arising from partially symmetric graphs.

  19. On the local edge antimagicness of m-splitting graphs

    NASA Astrophysics Data System (ADS)

    Albirri, E. R.; Dafik; Slamin; Agustin, I. H.; Alfarisi, R.

    2018-04-01

    Let G be a connected and simple graph. A split graph is a graph derived by adding new vertex v‧ in every vertex v‧ such that v‧ adjacent to v in graph G. An m-splitting graph is a graph which has m v‧-vertices, denoted by mSpl(G). A local edge antimagic coloring in G = (V, E) graph is a bijection f:V (G)\\to \\{1,2,3,\\ldots,|V(G)|\\} in which for any two adjacent edges e 1 and e 2 satisfies w({e}1)\

  20. Survey of Approaches to Generate Realistic Synthetic Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lim, Seung-Hwan; Lee, Sangkeun; Powers, Sarah S

    A graph is a flexible data structure that can represent relationships between entities. As with other data analysis tasks, the use of realistic graphs is critical to obtaining valid research results. Unfortunately, using the actual ("real-world") graphs for research and new algorithm development is difficult due to the presence of sensitive information in the data or due to the scale of data. This results in practitioners developing algorithms and systems that employ synthetic graphs instead of real-world graphs. Generating realistic synthetic graphs that provide reliable statistical confidence to algorithmic analysis and system evaluation involves addressing technical hurdles in a broadmore » set of areas. This report surveys the state of the art in approaches to generate realistic graphs that are derived from fitted graph models on real-world graphs.« less

  1. Self-organizing maps for learning the edit costs in graph matching.

    PubMed

    Neuhaus, Michel; Bunke, Horst

    2005-06-01

    Although graph matching and graph edit distance computation have become areas of intensive research recently, the automatic inference of the cost of edit operations has remained an open problem. In the present paper, we address the issue of learning graph edit distance cost functions for numerically labeled graphs from a corpus of sample graphs. We propose a system of self-organizing maps (SOMs) that represent the distance measuring spaces of node and edge labels. Our learning process is based on the concept of self-organization. It adapts the edit costs in such a way that the similarity of graphs from the same class is increased, whereas the similarity of graphs from different classes decreases. The learning procedure is demonstrated on two different applications involving line drawing graphs and graphs representing diatoms, respectively.

  2. Apparatuses and Methods for Producing Runtime Architectures of Computer Program Modules

    NASA Technical Reports Server (NTRS)

    Abi-Antoun, Marwan Elia (Inventor); Aldrich, Jonathan Erik (Inventor)

    2013-01-01

    Apparatuses and methods for producing run-time architectures of computer program modules. One embodiment includes creating an abstract graph from the computer program module and from containment information corresponding to the computer program module, wherein the abstract graph has nodes including types and objects, and wherein the abstract graph relates an object to a type, and wherein for a specific object the abstract graph relates the specific object to a type containing the specific object; and creating a runtime graph from the abstract graph, wherein the runtime graph is a representation of the true runtime object graph, wherein the runtime graph represents containment information such that, for a specific object, the runtime graph relates the specific object to another object that contains the specific object.

  3. GraphReduce: Processing Large-Scale Graphs on Accelerator-Based Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sengupta, Dipanjan; Song, Shuaiwen; Agarwal, Kapil

    2015-11-15

    Recent work on real-world graph analytics has sought to leverage the massive amount of parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device’s internal memory capacity. GraphReduce adopts a combination of edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degrees of parallelism in GPUs with efficient graph data movement between the host andmore » device.« less

  4. Comparing Internet Probing Methodologies Through an Analysis of Large Dynamic Graphs

    DTIC Science & Technology

    2014-06-01

    comparable Internet topologies in less time. We compare these by modeling union of traceroute outputs as graphs, and using standard graph theoretical...topologies in less time. We compare these by modeling union of traceroute outputs as graphs, and using standard graph theoretical measurements as well...We compare these by modeling union of traceroute outputs as graphs, and study the graphs by using vertex and edge count, average vertex degree

  5. GraphBench

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sukumar, Sreenivas R.; Hong, Seokyong; Lee, Sangkeun

    2016-06-01

    GraphBench is a benchmark suite for graph pattern mining and graph analysis systems. The benchmark suite is a significant addition to conducting apples-apples comparison of graph analysis software (databases, in-memory tools, triple stores, etc.)

  6. Asymptote Misconception on Graphing Functions: Does Graphing Software Resolve It?

    ERIC Educational Resources Information Center

    Öçal, Mehmet Fatih

    2017-01-01

    Graphing function is an important issue in mathematics education due to its use in various areas of mathematics and its potential roles for students to enhance learning mathematics. The use of some graphing software assists students' learning during graphing functions. However, the display of graphs of functions that students sketched by hand may…

  7. Generalized graph states based on Hadamard matrices

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cui, Shawn X.; Yu, Nengkun; Department of Mathematics and Statistics, University of Guelph, Guelph, Ontario N1G 2W1

    2015-07-15

    Graph states are widely used in quantum information theory, including entanglement theory, quantum error correction, and one-way quantum computing. Graph states have a nice structure related to a certain graph, which is given by either a stabilizer group or an encoding circuit, both can be directly given by the graph. To generalize graph states, whose stabilizer groups are abelian subgroups of the Pauli group, one approach taken is to study non-abelian stabilizers. In this work, we propose to generalize graph states based on the encoding circuit, which is completely determined by the graph and a Hadamard matrix. We study themore » entanglement structures of these generalized graph states and show that they are all maximally mixed locally. We also explore the relationship between the equivalence of Hadamard matrices and local equivalence of the corresponding generalized graph states. This leads to a natural generalization of the Pauli (X, Z) pairs, which characterizes the local symmetries of these generalized graph states. Our approach is also naturally generalized to construct graph quantum codes which are beyond stabilizer codes.« less

  8. Graph processing platforms at scale: practices and experiences

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lim, Seung-Hwan; Lee, Sangkeun; Brown, Tyler C

    2015-01-01

    Graph analysis unveils hidden associations of data in many phenomena and artifacts, such as road network, social networks, genomic information, and scientific collaboration. Unfortunately, a wide diversity in the characteristics of graphs and graph operations make it challenging to find a right combination of tools and implementation of algorithms to discover desired knowledge from the target data set. This study presents an extensive empirical study of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmarked each platform using three popular graph operations, degree distribution,more » connected components, and PageRank over a variety of real-world graphs. Our experiments show that each graph processing platform shows different strength, depending the type of graph operations. While Urika performs the best in non-iterative operations like degree distribution, GraphX outputforms iterative operations like connected components and PageRank. In addition, we discuss challenges to optimize the performance of each platform over large scale real world graphs.« less

  9. A fast algorithm for vertex-frequency representations of signals on graphs

    PubMed Central

    Jestrović, Iva; Coyle, James L.; Sejdić, Ervin

    2016-01-01

    The windowed Fourier transform (short time Fourier transform) and the S-transform are widely used signal processing tools for extracting frequency information from non-stationary signals. Previously, the windowed Fourier transform had been adopted for signals on graphs and has been shown to be very useful for extracting vertex-frequency information from graphs. However, high computational complexity makes these algorithms impractical. We sought to develop a fast windowed graph Fourier transform and a fast graph S-transform requiring significantly shorter computation time. The proposed schemes have been tested with synthetic test graph signals and real graph signals derived from electroencephalography recordings made during swallowing. The results showed that the proposed schemes provide significantly lower computation time in comparison with the standard windowed graph Fourier transform and the fast graph S-transform. Also, the results showed that noise has no effect on the results of the algorithm for the fast windowed graph Fourier transform or on the graph S-transform. Finally, we showed that graphs can be reconstructed from the vertex-frequency representations obtained with the proposed algorithms. PMID:28479645

  10. Graph 500 on OpenSHMEM: Using a Practical Survey of Past Work to Motivate Novel Algorithmic Developments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grossman, Max; Pritchard Jr., Howard Porter; Budimlic, Zoran

    2016-12-22

    Graph500 [14] is an effort to offer a standardized benchmark across large-scale distributed platforms which captures the behavior of common communicationbound graph algorithms. Graph500 differs from other large-scale benchmarking efforts (such as HPL [6] or HPGMG [7]) primarily in the irregularity of its computation and data access patterns. The core computational kernel of Graph500 is a breadth-first search (BFS) implemented on an undirected graph. The output of Graph500 is a spanning tree of the input graph, usually represented by a predecessor mapping for every node in the graph. The Graph500 benchmark defines several pre-defined input sizes for implementers to testmore » against. This report summarizes investigation into implementing the Graph500 benchmark on OpenSHMEM, and focuses on first building a strong and practical understanding of the strengths and limitations of past work before proposing and developing novel extensions.« less

  11. Graphing trillions of triangles.

    PubMed

    Burkhardt, Paul

    2017-07-01

    The increasing size of Big Data is often heralded but how data are transformed and represented is also profoundly important to knowledge discovery, and this is exemplified in Big Graph analytics. Much attention has been placed on the scale of the input graph but the product of a graph algorithm can be many times larger than the input. This is true for many graph problems, such as listing all triangles in a graph. Enabling scalable graph exploration for Big Graphs requires new approaches to algorithms, architectures, and visual analytics. A brief tutorial is given to aid the argument for thoughtful representation of data in the context of graph analysis. Then a new algebraic method to reduce the arithmetic operations in counting and listing triangles in graphs is introduced. Additionally, a scalable triangle listing algorithm in the MapReduce model will be presented followed by a description of the experiments with that algorithm that led to the current largest and fastest triangle listing benchmarks to date. Finally, a method for identifying triangles in new visual graph exploration technologies is proposed.

  12. Multiple graph regularized protein domain ranking.

    PubMed

    Wang, Jim Jing-Yan; Bensmail, Halima; Gao, Xin

    2012-11-19

    Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications.

  13. Evolutionary graph theory: breaking the symmetry between interaction and replacement

    PubMed Central

    Ohtsuki, Hisashi; Pacheco, Jorge M.; Nowak, Martin A.

    2008-01-01

    We study evolutionary dynamics in a population whose structure is given by two graphs: the interaction graph determines who plays with whom in an evolutionary game; the replacement graph specifies the geometry of evolutionary competition and updating. First, we calculate the fixation probabilities of frequency dependent selection between two strategies or phenotypes. We consider three different update mechanisms: birth-death, death-birth and imitation. Then, as a particular example, we explore the evolution of cooperation. Suppose the interaction graph is a regular graph of degree h, the replacement graph is a regular graph of degree g and the overlap between the two graphs is a regular graph of degree l. We show that cooperation is favored by natural selection if b/c > hg/l. Here, b and c denote the benefit and cost of the altruistic act. This result holds for death-birth updating, weak selection and large population size. Note that the optimum population structure for cooperators is given by maximum overlap between the interaction and the replacement graph (g = h = l), which means that the two graphs are identical. We also prove that a modified replicator equation can describe how the expected values of the frequencies of an arbitrary number of strategies change on replacement and interaction graphs: the two graphs induce a transformation of the payoff matrix. PMID:17350049

  14. Multiple graph regularized protein domain ranking

    PubMed Central

    2012-01-01

    Background Protein domain ranking is a fundamental task in structural biology. Most protein domain ranking methods rely on the pairwise comparison of protein domains while neglecting the global manifold structure of the protein domain database. Recently, graph regularized ranking that exploits the global structure of the graph defined by the pairwise similarities has been proposed. However, the existing graph regularized ranking methods are very sensitive to the choice of the graph model and parameters, and this remains a difficult problem for most of the protein domain ranking methods. Results To tackle this problem, we have developed the Multiple Graph regularized Ranking algorithm, MultiG-Rank. Instead of using a single graph to regularize the ranking scores, MultiG-Rank approximates the intrinsic manifold of protein domain distribution by combining multiple initial graphs for the regularization. Graph weights are learned with ranking scores jointly and automatically, by alternately minimizing an objective function in an iterative algorithm. Experimental results on a subset of the ASTRAL SCOP protein domain database demonstrate that MultiG-Rank achieves a better ranking performance than single graph regularized ranking methods and pairwise similarity based ranking methods. Conclusion The problem of graph model and parameter selection in graph regularized protein domain ranking can be solved effectively by combining multiple graphs. This aspect of generalization introduces a new frontier in applying multiple graphs to solving protein domain ranking applications. PMID:23157331

  15. Directable weathering of concave rock using curvature estimation.

    PubMed

    Jones, Michael D; Farley, McKay; Butler, Joseph; Beardall, Matthew

    2010-01-01

    We address the problem of directable weathering of exposed concave rock for use in computer-generated animation or games. Previous weathering models that admit concave surfaces are computationally inefficient and difficult to control. In nature, the spheroidal and cavernous weathering rates depend on the surface curvature. Spheroidal weathering is fastest in areas with large positive mean curvature and cavernous weathering is fastest in areas with large negative mean curvature. We simulate both processes using an approximation of mean curvature on a voxel grid. Both weathering rates are also influenced by rock durability. The user controls rock durability by editing a durability graph before and during weathering simulation. Simulations of rockfall and colluvium deposition further improve realism. The profile of the final weathered rock matches the shape of the durability graph up to the effects of weathering and colluvium deposition. We demonstrate the top-down directability and visual plausibility of the resulting model through a series of screenshots and rendered images. The results include the weathering of a cube into a sphere and of a sheltered inside corner into a cavern as predicted by the underlying geomorphological models.

  16. Alternative Fuels Data Center: Maps and Data

    Science.gov Websites

    fleet type from 1992-2014 Last update August 2016 View Graph Graph Download Data Generated_thumb20160830 Trend of S&FP AFV acquisitions by fuel type from 1992-2015 Last update August 2016 View Graph Graph transactions from 1997-2014 Last update August 2016 View Graph Graph Download Data Biofuelsatlas BioFuels Atlas

  17. GraphReduce: Large-Scale Graph Analytics on Accelerator-Based HPC Systems

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sengupta, Dipanjan; Agarwal, Kapil; Song, Shuaiwen

    2015-09-30

    Recent work on real-world graph analytics has sought to leverage the massive amount of parallelism offered by GPU devices, but challenges remain due to the inherent irregularity of graph algorithms and limitations in GPU-resident memory for storing large graphs. We present GraphReduce, a highly efficient and scalable GPU-based framework that operates on graphs that exceed the device’s internal memory capacity. GraphReduce adopts a combination of both edge- and vertex-centric implementations of the Gather-Apply-Scatter programming model and operates on multiple asynchronous GPU streams to fully exploit the high degrees of parallelism in GPUs with efficient graph data movement between the hostmore » and the device.« less

  18. GrouseFlocks: steerable exploration of graph hierarchy space.

    PubMed

    Archambault, Daniel; Munzner, Tamara; Auber, David

    2008-01-01

    Several previous systems allow users to interactively explore a large input graph through cuts of a superimposed hierarchy. This hierarchy is often created using clustering algorithms or topological features present in the graph. However, many graphs have domain-specific attributes associated with the nodes and edges, which could be used to create many possible hierarchies providing unique views of the input graph. GrouseFlocks is a system for the exploration of this graph hierarchy space. By allowing users to see several different possible hierarchies on the same graph, the system helps users investigate graph hierarchy space instead of a single fixed hierarchy. GrouseFlocks provides a simple set of operations so that users can create and modify their graph hierarchies based on selections. These selections can be made manually or based on patterns in the attribute data provided with the graph. It provides feedback to the user within seconds, allowing interactive exploration of this space.

  19. Spectral partitioning in equitable graphs.

    PubMed

    Barucca, Paolo

    2017-06-01

    Graph partitioning problems emerge in a wide variety of complex systems, ranging from biology to finance, but can be rigorously analyzed and solved only for a few graph ensembles. Here, an ensemble of equitable graphs, i.e., random graphs with a block-regular structure, is studied, for which analytical results can be obtained. In particular, the spectral density of this ensemble is computed exactly for a modular and bipartite structure. Kesten-McKay's law for random regular graphs is found analytically to apply also for modular and bipartite structures when blocks are homogeneous. An exact solution to graph partitioning for two equal-sized communities is proposed and verified numerically, and a conjecture on the absence of an efficient recovery detectability transition in equitable graphs is suggested. A final discussion summarizes results and outlines their relevance for the solution of graph partitioning problems in other graph ensembles, in particular for the study of detectability thresholds and resolution limits in stochastic block models.

  20. Spectral partitioning in equitable graphs

    NASA Astrophysics Data System (ADS)

    Barucca, Paolo

    2017-06-01

    Graph partitioning problems emerge in a wide variety of complex systems, ranging from biology to finance, but can be rigorously analyzed and solved only for a few graph ensembles. Here, an ensemble of equitable graphs, i.e., random graphs with a block-regular structure, is studied, for which analytical results can be obtained. In particular, the spectral density of this ensemble is computed exactly for a modular and bipartite structure. Kesten-McKay's law for random regular graphs is found analytically to apply also for modular and bipartite structures when blocks are homogeneous. An exact solution to graph partitioning for two equal-sized communities is proposed and verified numerically, and a conjecture on the absence of an efficient recovery detectability transition in equitable graphs is suggested. A final discussion summarizes results and outlines their relevance for the solution of graph partitioning problems in other graph ensembles, in particular for the study of detectability thresholds and resolution limits in stochastic block models.

  1. Well-Covered Graphs: A Survey

    DTIC Science & Technology

    1991-01-01

    critical G’s/# G’s -) 0 as IV(G)I -- oo? References [B1] C. Berge, Regularizable graphs, Ann. Discrete Math ., 3, 1978, 11-19. [B2] _ _, Some common...Springer-Verlag, Berlin, 1980, 108-123. [B3] _ _, Some common properties for regularizable graphs, edge-critical graphs, and B-graphs, Ann. Discrete Math ., 12...graphs - an extension of the K6nig-Egervgiry theorem, Discrete Math ., 27, 1979, 23-33. [ER] M.N Ellingham and G.F. Royle, Well-covered cubic graphs

  2. Study of Chromatic parameters of Line, Total, Middle graphs and Graph operators of Bipartite graph

    NASA Astrophysics Data System (ADS)

    Nagarathinam, R.; Parvathi, N.

    2018-04-01

    Chromatic parameters have been explored on the basis of graph coloring process in which a couple of adjacent nodes receives different colors. But the Grundy and b-coloring executes maximum colors under certain restrictions. In this paper, Chromatic, b-chromatic and Grundy number of some graph operators of bipartite graph has been investigat

  3. GBS 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2010-09-30

    The Umbra gbs (Graph-Based Search) library provides implementations of graph-based search/planning algorithms that can be applied to legacy graph data structures. Unlike some other graph algorithm libraries, this one does not require your graph class to inherit from a specific base class. Implementations of Dijkstra's Algorithm and A-Star search are included and can be used with graphs that are lazily-constructed.

  4. Information visualisation based on graph models

    NASA Astrophysics Data System (ADS)

    Kasyanov, V. N.; Kasyanova, E. V.

    2013-05-01

    Information visualisation is a key component of support tools for many applications in science and engineering. A graph is an abstract structure that is widely used to model information for its visualisation. In this paper, we consider practical and general graph formalism called hierarchical graphs and present the Higres and Visual Graph systems aimed at supporting information visualisation on the base of hierarchical graph models.

  5. Polysemy in the Domain-Specific Pedagogical Use of Graphs in Science Textbooks: The Case of an Electrocardiogram

    ERIC Educational Resources Information Center

    van Eijck, Michiel; Goedhart, Martin J.; Ellermeijer, Ton

    2011-01-01

    Polysemy in graph-related practices is the phenomenon that a single graph can sustain different meanings assigned to it. Considerable research has been done on polysemy in graph-related practices in school science in which graphs are rather used as scientific tools. However, graphs in science textbooks are also used rather pedagogically to…

  6. Lamplighter groups, de Brujin graphs, spider-web graphs and their spectra

    NASA Astrophysics Data System (ADS)

    Grigorchuk, R.; Leemann, P.-H.; Nagnibeda, T.

    2016-05-01

    We study the infinite family of spider-web graphs \\{{{ S }}k,N,M\\}, k≥slant 2, N≥slant 0 and M≥slant 1, initiated in the 50s in the context of network theory. It was later shown in physical literature that these graphs have remarkable percolation and spectral properties. We provide a mathematical explanation of these properties by putting the spider-web graphs in the context of group theory and algebraic graph theory. Namely, we realize them as tensor products of the well-known de Bruijn graphs \\{{{ B }}k,N\\} with cyclic graphs \\{{C}M\\} and show that these graphs are described by the action of the lamplighter group {{ L }}k={Z}/k{Z}\\wr {Z} on the infinite binary tree. Our main result is the identification of the infinite limit of \\{{{ S }}k,N,M\\}, as N,M\\to ∞ , with the Cayley graph of the lamplighter group {{ L }}k which, in turn, is one of the famous Diestel-Leader graphs {{DL}}k,k. As an application we compute the spectra of all spider-web graphs and show their convergence to the discrete spectral distribution associated with the Laplacian on the lamplighter group.

  7. How to Direct the Edges of the Connectomes: Dynamics of the Consensus Connectomes and the Development of the Connections in the Human Brain.

    PubMed

    Kerepesi, Csaba; Szalkai, Balázs; Varga, Bálint; Grolmusz, Vince

    2016-01-01

    The human braingraph or the connectome is the object of an intensive research today. The advantage of the graph-approach to brain science is that the rich structures, algorithms and definitions of graph theory can be applied to the anatomical networks of the connections of the human brain. In these graphs, the vertices correspond to the small (1-1.5 cm2) areas of the gray matter, and two vertices are connected by an edge, if a diffusion-MRI based workflow finds fibers of axons, running between those small gray matter areas in the white matter of the brain. One main question of the field today is discovering the directions of the connections between the small gray matter areas. In a previous work we have reported the construction of the Budapest Reference Connectome Server http://connectome.pitgroup.org from the data recorded in the Human Connectome Project of the NIH. The server generates the consensus braingraph of 96 subjects in Version 2, and of 418 subjects in Version 3, according to selectable parameters. After the Budapest Reference Connectome Server had been published, we recognized a surprising and unforeseen property of the server. The server can generate the braingraph of connections that are present in at least k graphs out of the 418, for any value of k = 1, 2, …, 418. When the value of k is changed from k = 418 through 1 by moving a slider at the webserver from right to left, certainly more and more edges appear in the consensus graph. The astonishing observation is that the appearance of the new edges is not random: it is similar to a growing shrub. We refer to this phenomenon as the Consensus Connectome Dynamics. We hypothesize that this movement of the slider in the webserver may copy the development of the connections in the human brain in the following sense: the connections that are present in all subjects are the oldest ones, and those that are present only in a decreasing fraction of the subjects are gradually the newer connections in the individual brain development. An animation on the phenomenon is available at https://youtu.be/yxlyudPaVUE. Based on this observation and the related hypothesis, we can assign directions to some of the edges of the connectome as follows: Let Gk + 1 denote the consensus connectome where each edge is present in at least k+1 graphs, and let Gk denote the consensus connectome where each edge is present in at least k graphs. Suppose that vertex v is not connected to any other vertices in Gk+1, and becomes connected to a vertex u in Gk, where u was connected to other vertices already in Gk+1. Then we direct this (v, u) edge from v to u.

  8. New methods for analyzing semantic graph based assessments in science education

    NASA Astrophysics Data System (ADS)

    Vikaros, Lance Steven

    This research investigated how the scoring of semantic graphs (known by many as concept maps) could be improved and automated in order to address issues of inter-rater reliability and scalability. As part of the NSF funded SENSE-IT project to introduce secondary school science students to sensor networks (NSF Grant No. 0833440), semantic graphs illustrating how temperature change affects water ecology were collected from 221 students across 16 schools. The graphing task did not constrain students' use of terms, as is often done with semantic graph based assessment due to coding and scoring concerns. The graphing software used provided real-time feedback to help students learn how to construct graphs, stay on topic and effectively communicate ideas. The collected graphs were scored by human raters using assessment methods expected to boost reliability, which included adaptations of traditional holistic and propositional scoring methods, use of expert raters, topical rubrics, and criterion graphs. High levels of inter-rater reliability were achieved, demonstrating that vocabulary constraints may not be necessary after all. To investigate a new approach to automating the scoring of graphs, thirty-two different graph features characterizing graphs' structure, semantics, configuration and process of construction were then used to predict human raters' scoring of graphs in order to identify feature patterns correlated to raters' evaluations of graphs' topical accuracy and complexity. Results led to the development of a regression model able to predict raters' scoring with 77% accuracy, with 46% accuracy expected when used to score new sets of graphs, as estimated via cross-validation tests. Although such performance is comparable to other graph and essay based scoring systems, cross-context testing of the model and methods used to develop it would be needed before it could be recommended for widespread use. Still, the findings suggest techniques for improving the reliability and scalability of semantic graph based assessments without requiring constraint of how ideas are expressed.

  9. Biometric Subject Verification Based on Electrocardiographic Signals

    NASA Technical Reports Server (NTRS)

    Dusan, Sorin V. (Inventor); Jorgensen, Charles C. (Inventor)

    2014-01-01

    A method of authenticating or declining to authenticate an asserted identity of a candidate-person. In an enrollment phase, a reference PQRST heart action graph is provided or constructed from information obtained from a plurality of graphs that resemble each other for a known reference person, using a first graph comparison metric. In a verification phase, a candidate-person asserts his/her identity and presents a plurality of his/her heart cycle graphs. If a sufficient number of the candidate-person's measured graphs resemble each other, a representative composite graph is constructed from the candidate-person's graphs and is compared with a composite reference graph, for the person whose identity is asserted, using a second graph comparison metric. When the second metric value lies in a selected range, the candidate-person's assertion of identity is accepted.

  10. EvoGraph: On-The-Fly Efficient Mining of Evolving Graphs on GPU

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sengupta, Dipanjan; Song, Shuaiwen

    With the prevalence of the World Wide Web and social networks, there has been a growing interest in high performance analytics for constantly-evolving dynamic graphs. Modern GPUs provide massive AQ1 amount of parallelism for efficient graph processing, but the challenges remain due to their lack of support for the near real-time streaming nature of dynamic graphs. Specifically, due to the current high volume and velocity of graph data combined with the complexity of user queries, traditional processing methods by first storing the updates and then repeatedly running static graph analytics on a sequence of versions or snapshots are deemed undesirablemore » and computational infeasible on GPU. We present EvoGraph, a highly efficient and scalable GPU- based dynamic graph analytics framework.« less

  11. Genome alignment with graph data structures: a comparison

    PubMed Central

    2014-01-01

    Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools. PMID:24712884

  12. Efficient dynamic graph construction for inductive semi-supervised learning.

    PubMed

    Dornaika, F; Dahbi, R; Bosaghzadeh, A; Ruichek, Y

    2017-10-01

    Most of graph construction techniques assume a transductive setting in which the whole data collection is available at construction time. Addressing graph construction for inductive setting, in which data are coming sequentially, has received much less attention. For inductive settings, constructing the graph from scratch can be very time consuming. This paper introduces a generic framework that is able to make any graph construction method incremental. This framework yields an efficient and dynamic graph construction method that adds new samples (labeled or unlabeled) to a previously constructed graph. As a case study, we use the recently proposed Two Phase Weighted Regularized Least Square (TPWRLS) graph construction method. The paper has two main contributions. First, we use the TPWRLS coding scheme to represent new sample(s) with respect to an existing database. The representative coefficients are then used to update the graph affinity matrix. The proposed method not only appends the new samples to the graph but also updates the whole graph structure by discovering which nodes are affected by the introduction of new samples and by updating their edge weights. The second contribution of the article is the application of the proposed framework to the problem of graph-based label propagation using multiple observations for vision-based recognition tasks. Experiments on several image databases show that, without any significant loss in the accuracy of the final classification, the proposed dynamic graph construction is more efficient than the batch graph construction. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. JavaGenes: Evolving Graphs with Crossover

    NASA Technical Reports Server (NTRS)

    Globus, Al; Atsatt, Sean; Lawton, John; Wipke, Todd

    2000-01-01

    Genetic algorithms usually use string or tree representations. We have developed a novel crossover operator for a directed and undirected graph representation, and used this operator to evolve molecules and circuits. Unlike strings or trees, a single point in the representation cannot divide every possible graph into two parts, because graphs may contain cycles. Thus, the crossover operator is non-trivial. A steady-state, tournament selection genetic algorithm code (JavaGenes) was written to implement and test the graph crossover operator. All runs were executed by cycle-scavagging on networked workstations using the Condor batch processing system. The JavaGenes code has evolved pharmaceutical drug molecules and simple digital circuits. Results to date suggest that JavaGenes can evolve moderate sized drug molecules and very small circuits in reasonable time. The algorithm has greater difficulty with somewhat larger circuits, suggesting that directed graphs (circuits) are more difficult to evolve than undirected graphs (molecules), although necessary differences in the crossover operator may also explain the results. In principle, JavaGenes should be able to evolve other graph-representable systems, such as transportation networks, metabolic pathways, and computer networks. However, large graphs evolve significantly slower than smaller graphs, presumably because the space-of-all-graphs explodes combinatorially with graph size. Since the representation strongly affects genetic algorithm performance, adding graphs to the evolutionary programmer's bag-of-tricks should be beneficial. Also, since graph evolution operates directly on the phenotype, the genotype-phenotype translation step, common in genetic algorithm work, is eliminated.

  14. Chemical Applications of Graph Theory: Part II. Isomer Enumeration.

    ERIC Educational Resources Information Center

    Hansen, Peter J.; Jurs, Peter C.

    1988-01-01

    Discusses the use of graph theory to aid in the depiction of organic molecular structures. Gives a historical perspective of graph theory and explains graph theory terminology with organic examples. Lists applications of graph theory to current research projects. (ML)

  15. Graphing trillions of triangles

    PubMed Central

    Burkhardt, Paul

    2016-01-01

    The increasing size of Big Data is often heralded but how data are transformed and represented is also profoundly important to knowledge discovery, and this is exemplified in Big Graph analytics. Much attention has been placed on the scale of the input graph but the product of a graph algorithm can be many times larger than the input. This is true for many graph problems, such as listing all triangles in a graph. Enabling scalable graph exploration for Big Graphs requires new approaches to algorithms, architectures, and visual analytics. A brief tutorial is given to aid the argument for thoughtful representation of data in the context of graph analysis. Then a new algebraic method to reduce the arithmetic operations in counting and listing triangles in graphs is introduced. Additionally, a scalable triangle listing algorithm in the MapReduce model will be presented followed by a description of the experiments with that algorithm that led to the current largest and fastest triangle listing benchmarks to date. Finally, a method for identifying triangles in new visual graph exploration technologies is proposed. PMID:28690426

  16. Exploring Text and Icon Graph Interpretation in Students with Dyslexia: An Eye-tracking Study.

    PubMed

    Kim, Sunjung; Wiseheart, Rebecca

    2017-02-01

    A growing body of research suggests that individuals with dyslexia struggle to use graphs efficiently. Given the persistence of orthographic processing deficits in dyslexia, this study tested whether graph interpretation deficits in dyslexia are directly related to difficulties processing the orthographic components of graphs (i.e. axes and legend labels). Participants were 80 college students with and without dyslexia. Response times and eye movements were recorded as students answered comprehension questions about simple data displayed in bar graphs. Axes and legends were labelled either with words (mixed-modality graphs) or icons (orthography-free graphs). Students also answered informationally equivalent questions presented in sentences (orthography-only condition). Response times were slower in the dyslexic group only for processing sentences. However, eye tracking data revealed group differences for processing mixed-modality graphs, whereas no group differences were found for the orthography-free graphs. When processing bar graphs, students with dyslexia differ from their able reading peers only when graphs contain orthographic features. Implications for processing informational text are discussed. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  17. Learning about Science Graphs and Word Games. Superific Science Book V. A Good Apple Science Activity Book for Grades 5-8+.

    ERIC Educational Resources Information Center

    Conway, Lorraine

    This packet of student materials contains a variety of worksheet activities dealing with science graphs and science word games. These reproducible materials deal with: (1) bar graphs; (2) line graphs; (3) circle graphs; (4) pictographs; (5) histograms; (6) artgraphs; (7) designing your own graphs; (8) medical prefixes; (9) color prefixes; (10)…

  18. TrajGraph: A Graph-Based Visual Analytics Approach to Studying Urban Network Centralities Using Taxi Trajectory Data.

    PubMed

    Huang, Xiaoke; Zhao, Ye; Yang, Jing; Zhang, Chong; Ma, Chao; Ye, Xinyue

    2016-01-01

    We propose TrajGraph, a new visual analytics method, for studying urban mobility patterns by integrating graph modeling and visual analysis with taxi trajectory data. A special graph is created to store and manifest real traffic information recorded by taxi trajectories over city streets. It conveys urban transportation dynamics which can be discovered by applying graph analysis algorithms. To support interactive, multiscale visual analytics, a graph partitioning algorithm is applied to create region-level graphs which have smaller size than the original street-level graph. Graph centralities, including Pagerank and betweenness, are computed to characterize the time-varying importance of different urban regions. The centralities are visualized by three coordinated views including a node-link graph view, a map view and a temporal information view. Users can interactively examine the importance of streets to discover and assess city traffic patterns. We have implemented a fully working prototype of this approach and evaluated it using massive taxi trajectories of Shenzhen, China. TrajGraph's capability in revealing the importance of city streets was evaluated by comparing the calculated centralities with the subjective evaluations from a group of drivers in Shenzhen. Feedback from a domain expert was collected. The effectiveness of the visual interface was evaluated through a formal user study. We also present several examples and a case study to demonstrate the usefulness of TrajGraph in urban transportation analysis.

  19. Graphing Polar Curves

    ERIC Educational Resources Information Center

    Lawes, Jonathan F.

    2013-01-01

    Graphing polar curves typically involves a combination of three traditional techniques, all of which can be time-consuming and tedious. However, an alternative method--graphing the polar function on a rectangular plane--simplifies graphing, increases student understanding of the polar coordinate system, and reinforces graphing techniques learned…

  20. On the locating-chromatic number for graphs with two homogenous components

    NASA Astrophysics Data System (ADS)

    Welyyanti, Des; Baskoro, Edy Tri; Simajuntak, Rinovia; Uttunggadewa, Saladin

    2017-10-01

    The locating-chromatic number of a graph was introduced by Chartrand et al. in 2002. The concept of the locating-chromatic number is a marriage between graph coloring and the notion of graph partition dimension. This concept is only for connected graphs. In [8], we extended this concept also for disconnected graphs. In this paper, we determine the locating- chromatic number of a graph with two components. In particular, we determine such values if the components are homogeneous and each component has locating-chromatic number 3.

  1. The impact of home care nurses' numeracy and graph literacy on comprehension of visual display information: implications for dashboard design.

    PubMed

    Dowding, Dawn; Merrill, Jacqueline A; Onorato, Nicole; Barrón, Yolanda; Rosati, Robert J; Russell, David

    2018-02-01

    To explore home care nurses' numeracy and graph literacy and their relationship to comprehension of visualized data. A multifactorial experimental design using online survey software. Nurses were recruited from 2 Medicare-certified home health agencies. Numeracy and graph literacy were measured using validated scales. Nurses were randomized to 1 of 4 experimental conditions. Each condition displayed data for 1 of 4 quality indicators, in 1 of 4 different visualized formats (bar graph, line graph, spider graph, table). A mixed linear model measured the impact of numeracy, graph literacy, and display format on data understanding. In all, 195 nurses took part in the study. They were slightly more numerate and graph literate than the general population. Overall, nurses understood information presented in bar graphs most easily (88% correct), followed by tables (81% correct), line graphs (77% correct), and spider graphs (41% correct). Individuals with low numeracy and low graph literacy had poorer comprehension of information displayed across all formats. High graph literacy appeared to enhance comprehension of data regardless of numeracy capabilities. Clinical dashboards are increasingly used to provide information to clinicians in visualized format, under the assumption that visual display reduces cognitive workload. Results of this study suggest that nurses' comprehension of visualized information is influenced by their numeracy, graph literacy, and the display format of the data. Individual differences in numeracy and graph literacy skills need to be taken into account when designing dashboard technology. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  2. Molecular graph convolutions: moving beyond fingerprints.

    PubMed

    Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick

    2016-08-01

    Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph-atoms, bonds, distances, etc.-which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.

  3. Mutual proximity graphs for improved reachability in music recommendation.

    PubMed

    Flexer, Arthur; Stevens, Jeff

    2018-01-01

    This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness.

  4. Mutual proximity graphs for improved reachability in music recommendation

    PubMed Central

    Flexer, Arthur; Stevens, Jeff

    2018-01-01

    This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness. PMID:29348779

  5. Constructing compact and effective graphs for recommender systems via node and edge aggregations

    DOE PAGES

    Lee, Sangkeun; Kahng, Minsuk; Lee, Sang-goo

    2014-12-10

    Exploiting graphs for recommender systems has great potential to flexibly incorporate heterogeneous information for producing better recommendation results. As our baseline approach, we first introduce a naive graph-based recommendation method, which operates with a heterogeneous log-metadata graph constructed from user log and content metadata databases. Although the na ve graph-based recommendation method is simple, it allows us to take advantages of heterogeneous information and shows promising flexibility and recommendation accuracy. However, it often leads to extensive processing time due to the sheer size of the graphs constructed from entire user log and content metadata databases. In this paper, we proposemore » node and edge aggregation approaches to constructing compact and e ective graphs called Factor-Item bipartite graphs by aggregating nodes and edges of a log-metadata graph. Furthermore, experimental results using real world datasets indicate that our approach can significantly reduce the size of graphs exploited for recommender systems without sacrificing the recommendation quality.« less

  6. graphkernels: R and Python packages for graph comparison

    PubMed Central

    Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten

    2018-01-01

    Abstract Summary Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. Availability and implementation The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. Contact mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch Supplementary information Supplementary data are available online at Bioinformatics. PMID:29028902

  7. Detecting labor using graph theory on connectivity matrices of uterine EMG.

    PubMed

    Al-Omar, S; Diab, A; Nader, N; Khalil, M; Karlsson, B; Marque, C

    2015-08-01

    Premature labor is one of the most serious health problems in the developed world. One of the main reasons for this is that no good way exists to distinguish true labor from normal pregnancy contractions. The aim of this paper is to investigate if the application of graph theory techniques to multi-electrode uterine EMG signals can improve the discrimination between pregnancy contractions and labor. To test our methods we first applied them to synthetic graphs where we detected some differences in the parameters results and changes in the graph model from pregnancy-like graphs to labor-like graphs. Then, we applied the same methods to real signals. We obtained the best differentiation between pregnancy and labor through the same parameters. Major improvements in differentiating between pregnancy and labor were obtained using a low pass windowing preprocessing step. Results show that real graphs generally became more organized when moving from pregnancy, where the graph showed random characteristics, to labor where the graph became a more small-world like graph.

  8. Stationary waves on nonlinear quantum graphs. II. Application of canonical perturbation theory in basic graph structures.

    PubMed

    Gnutzmann, Sven; Waltner, Daniel

    2016-12-01

    We consider exact and asymptotic solutions of the stationary cubic nonlinear Schrödinger equation on metric graphs. We focus on some basic example graphs. The asymptotic solutions are obtained using the canonical perturbation formalism developed in our earlier paper [S. Gnutzmann and D. Waltner, Phys. Rev. E 93, 032204 (2016)2470-004510.1103/PhysRevE.93.032204]. For closed example graphs (interval, ring, star graph, tadpole graph), we calculate spectral curves and show how the description of spectra reduces to known characteristic functions of linear quantum graphs in the low-intensity limit. Analogously for open examples, we show how nonlinear scattering of stationary waves arises and how it reduces to known linear scattering amplitudes at low intensities. In the short-wavelength asymptotics we discuss how genuine nonlinear effects may be described using the leading order of canonical perturbation theory: bifurcation of spectral curves (and the corresponding solutions) in closed graphs and multistability in open graphs.

  9. A Visual Analytics Paradigm Enabling Trillion-Edge Graph Exploration

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wong, Pak C.; Haglin, David J.; Gillen, David S.

    We present a visual analytics paradigm and a system prototype for exploring web-scale graphs. A web-scale graph is described as a graph with ~one trillion edges and ~50 billion vertices. While there is an aggressive R&D effort in processing and exploring web-scale graphs among internet vendors such as Facebook and Google, visualizing a graph of that scale still remains an underexplored R&D area. The paper describes a nontraditional peek-and-filter strategy that facilitates the exploration of a graph database of unprecedented size for visualization and analytics. We demonstrate that our system prototype can 1) preprocess a graph with ~25 billion edgesmore » in less than two hours and 2) support database query and visualization on the processed graph database afterward. Based on our computational performance results, we argue that we most likely will achieve the one trillion edge mark (a computational performance improvement of 40 times) for graph visual analytics in the near future.« less

  10. graphkernels: R and Python packages for graph comparison.

    PubMed

    Sugiyama, Mahito; Ghisu, M Elisabetta; Llinares-López, Felipe; Borgwardt, Karsten

    2018-02-01

    Measuring the similarity of graphs is a fundamental step in the analysis of graph-structured data, which is omnipresent in computational biology. Graph kernels have been proposed as a powerful and efficient approach to this problem of graph comparison. Here we provide graphkernels, the first R and Python graph kernel libraries including baseline kernels such as label histogram based kernels, classic graph kernels such as random walk based kernels, and the state-of-the-art Weisfeiler-Lehman graph kernel. The core of all graph kernels is implemented in C ++ for efficiency. Using the kernel matrices computed by the package, we can easily perform tasks such as classification, regression and clustering on graph-structured samples. The R and Python packages including source code are available at https://CRAN.R-project.org/package=graphkernels and https://pypi.python.org/pypi/graphkernels. mahito@nii.ac.jp or elisabetta.ghisu@bsse.ethz.ch. Supplementary data are available online at Bioinformatics. © The Author(s) 2017. Published by Oxford University Press.

  11. Continuous-time quantum walks on star graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Salimi, S.

    2009-06-15

    In this paper, we investigate continuous-time quantum walk on star graphs. It is shown that quantum central limit theorem for a continuous-time quantum walk on star graphs for N-fold star power graph, which are invariant under the quantum component of adjacency matrix, converges to continuous-time quantum walk on K{sub 2} graphs (complete graph with two vertices) and the probability of observing walk tends to the uniform distribution.

  12. Matching Extension in Regular Graphs

    DTIC Science & Technology

    1989-01-01

    Plummer, Matching Theory, Ann. Discrete Math . 29, North- Holland, Amsterdam, 1986. [101 , The matching structure of graphs: some recent re- sults...maximums d’un graphe, These, Dr. troisieme cycle, Univ. Grenoble, 1978. [12 ] D. Naddef and W.R. Pulleyblank, Matching in regular graphs, Discrete Math . 34...1981, 283-291. [13 1 M.D. Plummer, On n-extendable graphs, Discrete Math . 31, 1980, 201-210. . [ 141 ,Matching extension in planar graphs IV

  13. Convex Graph Invariants

    DTIC Science & Technology

    2010-12-02

    Motzkin, T. and Straus, E. (1965). Maxima for graphs and a new proof of a theorem of Turan . Canad. J. Math. 17 533–540. [33] Rendl, F. and Sotirov, R...Convex Graph Invariants Venkat Chandrasekaran, Pablo A . Parrilo, and Alan S. Willsky ∗ Laboratory for Information and Decision Systems Department of...this paper we study convex graph invariants, which are graph invariants that are convex functions of the adjacency matrix of a graph. Some examples

  14. Graph characterization via Ihara coefficients.

    PubMed

    Ren, Peng; Wilson, Richard C; Hancock, Edwin R

    2011-02-01

    The novel contributions of this paper are twofold. First, we demonstrate how to characterize unweighted graphs in a permutation-invariant manner using the polynomial coefficients from the Ihara zeta function, i.e., the Ihara coefficients. Second, we generalize the definition of the Ihara coefficients to edge-weighted graphs. For an unweighted graph, the Ihara zeta function is the reciprocal of a quasi characteristic polynomial of the adjacency matrix of the associated oriented line graph. Since the Ihara zeta function has poles that give rise to infinities, the most convenient numerically stable representation is to work with the coefficients of the quasi characteristic polynomial. Moreover, the polynomial coefficients are invariant to vertex order permutations and also convey information concerning the cycle structure of the graph. To generalize the representation to edge-weighted graphs, we make use of the reduced Bartholdi zeta function. We prove that the computation of the Ihara coefficients for unweighted graphs is a special case of our proposed method for unit edge weights. We also present a spectral analysis of the Ihara coefficients and indicate their advantages over other graph spectral methods. We apply the proposed graph characterization method to capturing graph-class structure and clustering graphs. Experimental results reveal that the Ihara coefficients are more effective than methods based on Laplacian spectra.

  15. What Would a Graph Look Like in this Layout? A Machine Learning Approach to Large Graph Visualization.

    PubMed

    Kwon, Oh-Hyun; Crnovrsanin, Tarik; Ma, Kwan-Liu

    2018-01-01

    Using different methods for laying out a graph can lead to very different visual appearances, with which the viewer perceives different information. Selecting a "good" layout method is thus important for visualizing a graph. The selection can be highly subjective and dependent on the given task. A common approach to selecting a good layout is to use aesthetic criteria and visual inspection. However, fully calculating various layouts and their associated aesthetic metrics is computationally expensive. In this paper, we present a machine learning approach to large graph visualization based on computing the topological similarity of graphs using graph kernels. For a given graph, our approach can show what the graph would look like in different layouts and estimate their corresponding aesthetic metrics. An important contribution of our work is the development of a new framework to design graph kernels. Our experimental study shows that our estimation calculation is considerably faster than computing the actual layouts and their aesthetic metrics. Also, our graph kernels outperform the state-of-the-art ones in both time and accuracy. In addition, we conducted a user study to demonstrate that the topological similarity computed with our graph kernel matches perceptual similarity assessed by human users.

  16. Knowledge Representation Issues in Semantic Graphs for Relationship Detection

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Barthelemy, M; Chow, E; Eliassi-Rad, T

    2005-02-02

    An important task for Homeland Security is the prediction of threat vulnerabilities, such as through the detection of relationships between seemingly disjoint entities. A structure used for this task is a ''semantic graph'', also known as a ''relational data graph'' or an ''attributed relational graph''. These graphs encode relationships as typed links between a pair of typed nodes. Indeed, semantic graphs are very similar to semantic networks used in AI. The node and link types are related through an ontology graph (also known as a schema). Furthermore, each node has a set of attributes associated with it (e.g., ''age'' maymore » be an attribute of a node of type ''person''). Unfortunately, the selection of types and attributes for both nodes and links depends on human expertise and is somewhat subjective and even arbitrary. This subjectiveness introduces biases into any algorithm that operates on semantic graphs. Here, we raise some knowledge representation issues for semantic graphs and provide some possible solutions using recently developed ideas in the field of complex networks. In particular, we use the concept of transitivity to evaluate the relevance of individual links in the semantic graph for detecting relationships. We also propose new statistical measures for semantic graphs and illustrate these semantic measures on graphs constructed from movies and terrorism data.« less

  17. Simple graph models of information spread in finite populations

    PubMed Central

    Voorhees, Burton; Ryder, Bergerud

    2015-01-01

    We consider several classes of simple graphs as potential models for information diffusion in a structured population. These include biases cycles, dual circular flows, partial bipartite graphs and what we call ‘single-link’ graphs. In addition to fixation probabilities, we study structure parameters for these graphs, including eigenvalues of the Laplacian, conductances, communicability and expected hitting times. In several cases, values of these parameters are related, most strongly so for partial bipartite graphs. A measure of directional bias in cycles and circular flows arises from the non-zero eigenvalues of the antisymmetric part of the Laplacian and another measure is found for cycles as the value of the transition probability for which hitting times going in either direction of the cycle are equal. A generalization of circular flow graphs is used to illustrate the possibility of tuning edge weights to match pre-specified values for graph parameters; in particular, we show that generalizations of circular flows can be tuned to have fixation probabilities equal to the Moran probability for a complete graph by tuning vertex temperature profiles. Finally, single-link graphs are introduced as an example of a graph involving a bottleneck in the connection between two components and these are compared to the partial bipartite graphs. PMID:26064661

  18. A Semantic Graph Query Language

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kaplan, I L

    2006-10-16

    Semantic graphs can be used to organize large amounts of information from a number of sources into one unified structure. A semantic query language provides a foundation for extracting information from the semantic graph. The graph query language described here provides a simple, powerful method for querying semantic graphs.

  19. An efficient auto TPT stitch guidance generation for optimized standard cell design

    NASA Astrophysics Data System (ADS)

    Samboju, Nagaraj C.; Choi, Soo-Han; Arikati, Srini; Cilingir, Erdem

    2015-03-01

    As the technology continues to shrink below 14nm, triple patterning lithography (TPT) is a worthwhile lithography methodology for printing dense layers such as Metal1. However, this increases the complexity of standard cell design, as it is very difficult to develop a TPT compliant layout without compromising on the area. Hence, this emphasizes the importance to have an accurate stitch generation methodology to meet the standard cell area requirement as defined by the technology shrink factor. In this paper, we present an efficient auto TPT stitch guidance generation technique for optimized standard cell design. The basic idea here is to first identify the conflicting polygons based on the Fix Guidance [1] solution developed by Synopsys. Fix Guidance is a reduced sub-graph containing minimum set of edges along with the connecting polygons; by eliminating these edges in a design 3-color conflicts can be resolved. Once the conflicting polygons are identified using this method, they are categorized into four types [2] - (Type 1 to 4). The categorization is based on number of interactions a polygon has with the coloring links and the triangle loops of fix guidance. For each type a certain criteria for keep-out region is defined, based on which the final stitch guidance locations are generated. This technique provides various possible stitch locations to the user and helps the user to select the best stitch location considering both design flexibility (max. pin access/small area) and process-preferences. Based on this technique, a standard cell library for place and route (P and R) can be developed with colorless data and a stitch marker defined by designer using our proposed method. After P and R, the full chip (block) would contain the colorless data and standard cell stitch markers only. These stitch markers are considered as "must be stitch" candidates. Hence during full chip decomposition it is not required to generate and select the stitch markers again for the complete data; therefore, the proposed method reduces the decomposition time significantly.

  20. Constructing Dense Graphs with Unique Hamiltonian Cycles

    ERIC Educational Resources Information Center

    Lynch, Mark A. M.

    2012-01-01

    It is not difficult to construct dense graphs containing Hamiltonian cycles, but it is difficult to generate dense graphs that are guaranteed to contain a unique Hamiltonian cycle. This article presents an algorithm for generating arbitrarily large simple graphs containing "unique" Hamiltonian cycles. These graphs can be turned into dense graphs…

  1. Dynamic graph of an oxy-fuel combustion system using autocatalytic set model

    NASA Astrophysics Data System (ADS)

    Harish, Noor Ainy; Bakar, Sumarni Abu

    2017-08-01

    Evaporation process is one of the main processes besides combustion process in an oxy-combustion boiler system. An Autocatalytic Set (ASC) Model has successfully applied in developing graphical representation of the chemical reactions that occurs in the evaporation process in the system. Seventeen variables identified in the process are represented as nodes and the catalytic relationships are represented as edges in the graph. In addition, in this paper graph dynamics of ACS is further investigated. By using Dynamic Autocatalytic Set Graph Algorithm (DAGA), the adjacency matrix for each of the graphs and its relations to Perron-Frobenius Theorem is investigated. The dynamic graph obtained is further investigated where the connection of the graph to fuzzy graph Type 1 is established.

  2. A Weight-Adaptive Laplacian Embedding for Graph-Based Clustering.

    PubMed

    Cheng, De; Nie, Feiping; Sun, Jiande; Gong, Yihong

    2017-07-01

    Graph-based clustering methods perform clustering on a fixed input data graph. Thus such clustering results are sensitive to the particular graph construction. If this initial construction is of low quality, the resulting clustering may also be of low quality. We address this drawback by allowing the data graph itself to be adaptively adjusted in the clustering procedure. In particular, our proposed weight adaptive Laplacian (WAL) method learns a new data similarity matrix that can adaptively adjust the initial graph according to the similarity weight in the input data graph. We develop three versions of these methods based on the L2-norm, fuzzy entropy regularizer, and another exponential-based weight strategy, that yield three new graph-based clustering objectives. We derive optimization algorithms to solve these objectives. Experimental results on synthetic data sets and real-world benchmark data sets exhibit the effectiveness of these new graph-based clustering methods.

  3. Genus Ranges of 4-Regular Rigid Vertex Graphs

    PubMed Central

    Buck, Dorothy; Dolzhenko, Egor; Jonoska, Nataša; Saito, Masahico; Valencia, Karin

    2016-01-01

    A rigid vertex of a graph is one that has a prescribed cyclic order of its incident edges. We study orientable genus ranges of 4-regular rigid vertex graphs. The (orientable) genus range is a set of genera values over all orientable surfaces into which a graph is embedded cellularly, and the embeddings of rigid vertex graphs are required to preserve the prescribed cyclic order of incident edges at every vertex. The genus ranges of 4-regular rigid vertex graphs are sets of consecutive integers, and we address two questions: which intervals of integers appear as genus ranges of such graphs, and what types of graphs realize a given genus range. For graphs with 2n vertices (n > 1), we prove that all intervals [a, b] for all a < b ≤ n, and singletons [h, h] for some h ≤ n, are realized as genus ranges. For graphs with 2n − 1 vertices (n ≥ 1), we prove that all intervals [a, b] for all a < b ≤ n except [0, n], and [h, h] for some h ≤ n, are realized as genus ranges. We also provide constructions of graphs that realize these ranges. PMID:27807395

  4. Ringo: Interactive Graph Analytics on Big-Memory Machines

    PubMed Central

    Perez, Yonathan; Sosič, Rok; Banerjee, Arijit; Puttagunta, Rohan; Raison, Martin; Shah, Pararth; Leskovec, Jure

    2016-01-01

    We present Ringo, a system for analysis of large graphs. Graphs provide a way to represent and analyze systems of interacting objects (people, proteins, webpages) with edges between the objects denoting interactions (friendships, physical interactions, links). Mining graphs provides valuable insights about individual objects as well as the relationships among them. In building Ringo, we take advantage of the fact that machines with large memory and many cores are widely available and also relatively affordable. This allows us to build an easy-to-use interactive high-performance graph analytics system. Graphs also need to be built from input data, which often resides in the form of relational tables. Thus, Ringo provides rich functionality for manipulating raw input data tables into various kinds of graphs. Furthermore, Ringo also provides over 200 graph analytics functions that can then be applied to constructed graphs. We show that a single big-memory machine provides a very attractive platform for performing analytics on all but the largest graphs as it offers excellent performance and ease of use as compared to alternative approaches. With Ringo, we also demonstrate how to integrate graph analytics with an iterative process of trial-and-error data exploration and rapid experimentation, common in data mining workloads. PMID:27081215

  5. Computing Information Value from RDF Graph Properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    al-Saffar, Sinan; Heileman, Gregory

    2010-11-08

    Information value has been implicitly utilized and mostly non-subjectively computed in information retrieval (IR) systems. We explicitly define and compute the value of an information piece as a function of two parameters, the first is the potential semantic impact the target information can subjectively have on its recipient's world-knowledge, and the second parameter is trust in the information source. We model these two parameters as properties of RDF graphs. Two graphs are constructed, a target graph representing the semantics of the target body of information and a context graph representing the context of the consumer of that information. We computemore » information value subjectively as a function of both potential change to the context graph (impact) and the overlap between the two graphs (trust). Graph change is computed as a graph edit distance measuring the dissimilarity between the context graph before and after the learning of the target graph. A particular application of this subjective information valuation is in the construction of a personalized ranking component in Web search engines. Based on our method, we construct a Web re-ranking system that personalizes the information experience for the information-consumer.« less

  6. Ringo: Interactive Graph Analytics on Big-Memory Machines.

    PubMed

    Perez, Yonathan; Sosič, Rok; Banerjee, Arijit; Puttagunta, Rohan; Raison, Martin; Shah, Pararth; Leskovec, Jure

    2015-01-01

    We present Ringo, a system for analysis of large graphs. Graphs provide a way to represent and analyze systems of interacting objects (people, proteins, webpages) with edges between the objects denoting interactions (friendships, physical interactions, links). Mining graphs provides valuable insights about individual objects as well as the relationships among them. In building Ringo, we take advantage of the fact that machines with large memory and many cores are widely available and also relatively affordable. This allows us to build an easy-to-use interactive high-performance graph analytics system. Graphs also need to be built from input data, which often resides in the form of relational tables. Thus, Ringo provides rich functionality for manipulating raw input data tables into various kinds of graphs. Furthermore, Ringo also provides over 200 graph analytics functions that can then be applied to constructed graphs. We show that a single big-memory machine provides a very attractive platform for performing analytics on all but the largest graphs as it offers excellent performance and ease of use as compared to alternative approaches. With Ringo, we also demonstrate how to integrate graph analytics with an iterative process of trial-and-error data exploration and rapid experimentation, common in data mining workloads.

  7. Reflecting on Graphs: Attributes of Graph Choice and Construction Practices in Biology

    PubMed Central

    Angra, Aakanksha; Gardner, Stephanie M.

    2017-01-01

    Undergraduate biology education reform aims to engage students in scientific practices such as experimental design, experimentation, and data analysis and communication. Graphs are ubiquitous in the biological sciences, and creating effective graphical representations involves quantitative and disciplinary concepts and skills. Past studies document student difficulties with graphing within the contexts of classroom or national assessments without evaluating student reasoning. Operating under the metarepresentational competence framework, we conducted think-aloud interviews to reveal differences in reasoning and graph quality between undergraduate biology students, graduate students, and professors in a pen-and-paper graphing task. All professors planned and thought about data before graph construction. When reflecting on their graphs, professors and graduate students focused on the function of graphs and experimental design, while most undergraduate students relied on intuition and data provided in the task. Most undergraduate students meticulously plotted all data with scaled axes, while professors and some graduate students transformed the data, aligned the graph with the research question, and reflected on statistics and sample size. Differences in reasoning and approaches taken in graph choice and construction corroborate and extend previous findings and provide rich targets for undergraduate and graduate instruction. PMID:28821538

  8. Comparing brain graphs in which nodes are regions of interest or independent components: A simulation study.

    PubMed

    Yu, Qingbao; Du, Yuhui; Chen, Jiayu; He, Hao; Sui, Jing; Pearlson, Godfrey; Calhoun, Vince D

    2017-11-01

    A key challenge in building a brain graph using fMRI data is how to define the nodes. Spatial brain components estimated by independent components analysis (ICA) and regions of interest (ROIs) determined by brain atlas are two popular methods to define nodes in brain graphs. It is difficult to evaluate which method is better in real fMRI data. Here we perform a simulation study and evaluate the accuracies of a few graph metrics in graphs with nodes of ICA components, ROIs, or modified ROIs in four simulation scenarios. Graph measures with ICA nodes are more accurate than graphs with ROI nodes in all cases. Graph measures with modified ROI nodes are modulated by artifacts. The correlations of graph metrics across subjects between graphs with ICA nodes and ground truth are higher than the correlations between graphs with ROI nodes and ground truth in scenarios with large overlapped spatial sources. Moreover, moving the location of ROIs would largely decrease the correlations in all scenarios. Evaluating graphs with different nodes is promising in simulated data rather than real data because different scenarios can be simulated and measures of different graphs can be compared with a known ground truth. Since ROIs defined using brain atlas may not correspond well to real functional boundaries, overall findings of this work suggest that it is more appropriate to define nodes using data-driven ICA than ROI approaches in real fMRI data. Copyright © 2017 Elsevier B.V. All rights reserved.

  9. Directed random walks and constraint programming reveal active pathways in hepatocyte growth factor signaling.

    PubMed

    Kittas, Aristotelis; Delobelle, Aurélien; Schmitt, Sabrina; Breuhahn, Kai; Guziolowski, Carito; Grabe, Niels

    2016-01-01

    An effective means to analyze mRNA expression data is to take advantage of established knowledge from pathway databases, using methods such as pathway-enrichment analyses. However, pathway databases are not case-specific and expression data could be used to infer gene-regulation patterns in the context of specific pathways. In addition, canonical pathways may not always describe the signaling mechanisms properly, because interactions can frequently occur between genes in different pathways. Relatively few methods have been proposed to date for generating and analyzing such networks, preserving the causality between gene interactions and reasoning over the qualitative logic of regulatory effects. We present an algorithm (MCWalk) integrated with a logic programming approach, to discover subgraphs in large-scale signaling networks by random walks in a fully automated pipeline. As an exemplary application, we uncover the signal transduction mechanisms in a gene interaction network describing hepatocyte growth factor-stimulated cell migration and proliferation from gene-expression measured with microarray and RT-qPCR using in-house perturbation experiments in a keratinocyte-fibroblast co-culture. The resulting subgraphs illustrate possible associations of hepatocyte growth factor receptor c-Met nodes, differentially expressed genes and cellular states. Using perturbation experiments and Answer Set programming, we are able to select those which are more consistent with the experimental data. We discover key regulator nodes by measuring the frequency with which they are traversed when connecting signaling between receptors and significantly regulated genes and predict their expression-shift consistently with the measured data. The Java implementation of MCWalk is publicly available under the MIT license at: https://bitbucket.org/akittas/biosubg. © 2015 FEBS.

  10. Functional Module Analysis for Gene Coexpression Networks with Network Integration.

    PubMed

    Zhang, Shuqin; Zhao, Hongyu; Ng, Michael K

    2015-01-01

    Network has been a general tool for studying the complex interactions between different genes, proteins, and other small molecules. Module as a fundamental property of many biological networks has been widely studied and many computational methods have been proposed to identify the modules in an individual network. However, in many cases, a single network is insufficient for module analysis due to the noise in the data or the tuning of parameters when building the biological network. The availability of a large amount of biological networks makes network integration study possible. By integrating such networks, more informative modules for some specific disease can be derived from the networks constructed from different tissues, and consistent factors for different diseases can be inferred. In this paper, we have developed an effective method for module identification from multiple networks under different conditions. The problem is formulated as an optimization model, which combines the module identification in each individual network and alignment of the modules from different networks together. An approximation algorithm based on eigenvector computation is proposed. Our method outperforms the existing methods, especially when the underlying modules in multiple networks are different in simulation studies. We also applied our method to two groups of gene coexpression networks for humans, which include one for three different cancers, and one for three tissues from the morbidly obese patients. We identified 13 modules with three complete subgraphs, and 11 modules with two complete subgraphs, respectively. The modules were validated through Gene Ontology enrichment and KEGG pathway enrichment analysis. We also showed that the main functions of most modules for the corresponding disease have been addressed by other researchers, which may provide the theoretical basis for further studying the modules experimentally.

  11. Scale-free models for the structure of business firm networks

    NASA Astrophysics Data System (ADS)

    Kitsak, Maksim; Riccaboni, Massimo; Havlin, Shlomo; Pammolli, Fabio; Stanley, H. Eugene

    2010-03-01

    We study firm collaborations in the life sciences and the information and communication technology sectors. We propose an approach to characterize industrial leadership using k -shell decomposition, with top-ranking firms in terms of market value in higher k -shell layers. We find that the life sciences industry network consists of three distinct components: a “nucleus,” which is a small well-connected subgraph, “tendrils,” which are small subgraphs consisting of small degree nodes connected exclusively to the nucleus, and a “bulk body,” which consists of the majority of nodes. Industrial leaders, i.e., the largest companies in terms of market value, are in the highest k -shells of both networks. The nucleus of the life sciences sector is very stable: once a firm enters the nucleus, it is likely to stay there for a long time. At the same time we do not observe the above three components in the information and communication technology sector. We also conduct a systematic study of these three components in random scale-free networks. Our results suggest that the sizes of the nucleus and the tendrils in scale-free networks decrease as the exponent of the power-law degree distribution λ increases, and disappear for λ≥3 . We compare the k -shell structure of random scale-free model networks with two real-world business firm networks in the life sciences and in the information and communication technology sectors. We argue that the observed behavior of the k -shell structure in the two industries is consistent with the coexistence of both preferential and random agreements in the evolution of industrial networks.

  12. A distributed query execution engine of big attributed graphs.

    PubMed

    Batarfi, Omar; Elshawi, Radwa; Fayoumi, Ayman; Barnawi, Ahmed; Sakr, Sherif

    2016-01-01

    A graph is a popular data model that has become pervasively used for modeling structural relationships between objects. In practice, in many real-world graphs, the graph vertices and edges need to be associated with descriptive attributes. Such type of graphs are referred to as attributed graphs. G-SPARQL has been proposed as an expressive language, with a centralized execution engine, for querying attributed graphs. G-SPARQL supports various types of graph querying operations including reachability, pattern matching and shortest path where any G-SPARQL query may include value-based predicates on the descriptive information (attributes) of the graph edges/vertices in addition to the structural predicates. In general, a main limitation of centralized systems is that their vertical scalability is always restricted by the physical limits of computer systems. This article describes the design, implementation in addition to the performance evaluation of DG-SPARQL, a distributed, hybrid and adaptive parallel execution engine of G-SPARQL queries. In this engine, the topology of the graph is distributed over the main memory of the underlying nodes while the graph data are maintained in a relational store which is replicated on the disk of each of the underlying nodes. DG-SPARQL evaluates parts of the query plan via SQL queries which are pushed to the underlying relational stores while other parts of the query plan, as necessary, are evaluated via indexless memory-based graph traversal algorithms. Our experimental evaluation shows the efficiency and the scalability of DG-SPARQL on querying massive attributed graph datasets in addition to its ability to outperform the performance of Apache Giraph, a popular distributed graph processing system, by orders of magnitudes.

  13. Does Guiding Toward Task-Relevant Information Help Improve Graph Processing and Graph Comprehension of Individuals with Low or High Numeracy? An Eye-Tracker Experiment.

    PubMed

    Keller, Carmen; Junghans, Alex

    2017-11-01

    Individuals with low numeracy have difficulties with understanding complex graphs. Combining the information-processing approach to numeracy with graph comprehension and information-reduction theories, we examined whether high numerates' better comprehension might be explained by their closer attention to task-relevant graphical elements, from which they would expect numerical information to understand the graph. Furthermore, we investigated whether participants could be trained in improving their attention to task-relevant information and graph comprehension. In an eye-tracker experiment ( N = 110) involving a sample from the general population, we presented participants with 2 hypothetical scenarios (stomach cancer, leukemia) showing survival curves for 2 treatments. In the training condition, participants received written instructions on how to read the graph. In the control condition, participants received another text. We tracked participants' eye movements while they answered 9 knowledge questions. The sum constituted graph comprehension. We analyzed visual attention to task-relevant graphical elements by using relative fixation durations and relative fixation counts. The mediation analysis revealed a significant ( P < 0.05) indirect effect of numeracy on graph comprehension through visual attention to task-relevant information, which did not differ between the 2 conditions. Training had a significant main effect on visual attention ( P < 0.05) but not on graph comprehension ( P < 0.07). Individuals with high numeracy have better graph comprehension due to their greater attention to task-relevant graphical elements than individuals with low numeracy. With appropriate instructions, both groups can be trained to improve their graph-processing efficiency. Future research should examine (e.g., motivational) mediators between visual attention and graph comprehension to develop appropriate instructions that also result in higher graph comprehension.

  14. Differentials on graph complexes II: hairy graphs

    NASA Astrophysics Data System (ADS)

    Khoroshkin, Anton; Willwacher, Thomas; Živković, Marko

    2017-10-01

    We study the cohomology of the hairy graph complexes which compute the rational homotopy of embedding spaces, generalizing the Vassiliev invariants of knot theory. We provide spectral sequences converging to zero whose first pages contain the hairy graph cohomology. Our results yield a way to construct many nonzero hairy graph cohomology classes out of (known) non-hairy classes by studying the cancellations in those sequences. This provide a first glimpse at the tentative global structure of the hairy graph cohomology.

  15. Alternative Fuels Data Center: Maps and Data

    Science.gov Websites

    Fuel Standard Volumes by Year Generated_thumb20150904-8240-13hgnxh Last update August 2012 View Graph product or destination Last update August 2015 View Graph Graph Download Data Custom_thumb U.S. Ethanol , from 1866-2014 Last update August 2015 View Graph Graph Download Data Generated_thumb20160920-21993

  16. Helping Students Make Sense of Graphs: An Experimental Trial of SmartGraphs Software

    ERIC Educational Resources Information Center

    Zucker, Andrew; Kay, Rachel; Staudt, Carolyn

    2014-01-01

    Graphs are commonly used in science, mathematics, and social sciences to convey important concepts; yet students at all ages demonstrate difficulties interpreting graphs. This paper reports on an experimental study of free, Web-based software called SmartGraphs that is specifically designed to help students overcome their misconceptions regarding…

  17. A Novel Graph Constructor for Semisupervised Discriminant Analysis: Combined Low-Rank and k-Nearest Neighbor Graph

    PubMed Central

    Pan, Yongke; Niu, Wenjia

    2017-01-01

    Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm, which can easily resolve the out-of-sample problem. Relative works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. Different from these relative works, the regularized graph construction is researched here, which is important in the graph-based semisupervised learning methods. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, which is called combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space and then the kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines. PMID:28316616

  18. Computing Role Assignments of Proper Interval Graphs in Polynomial Time

    NASA Astrophysics Data System (ADS)

    Heggernes, Pinar; van't Hof, Pim; Paulusma, Daniël

    A homomorphism from a graph G to a graph R is locally surjective if its restriction to the neighborhood of each vertex of G is surjective. Such a homomorphism is also called an R-role assignment of G. Role assignments have applications in distributed computing, social network theory, and topological graph theory. The Role Assignment problem has as input a pair of graphs (G,R) and asks whether G has an R-role assignment. This problem is NP-complete already on input pairs (G,R) where R is a path on three vertices. So far, the only known non-trivial tractable case consists of input pairs (G,R) where G is a tree. We present a polynomial time algorithm that solves Role Assignment on all input pairs (G,R) where G is a proper interval graph. Thus we identify the first graph class other than trees on which the problem is tractable. As a complementary result, we show that the problem is Graph Isomorphism-hard on chordal graphs, a superclass of proper interval graphs and trees.

  19. A binary linear programming formulation of the graph edit distance.

    PubMed

    Justice, Derek; Hero, Alfred

    2006-08-01

    A binary linear programming formulation of the graph edit distance for unweighted, undirected graphs with vertex attributes is derived and applied to a graph recognition problem. A general formulation for editing graphs is used to derive a graph edit distance that is proven to be a metric, provided the cost function for individual edit operations is a metric. Then, a binary linear program is developed for computing this graph edit distance, and polynomial time methods for determining upper and lower bounds on the solution of the binary program are derived by applying solution methods for standard linear programming and the assignment problem. A recognition problem of comparing a sample input graph to a database of known prototype graphs in the context of a chemical information system is presented as an application of the new method. The costs associated with various edit operations are chosen by using a minimum normalized variance criterion applied to pairwise distances between nearest neighbors in the database of prototypes. The new metric is shown to perform quite well in comparison to existing metrics when applied to a database of chemical graphs.

  20. Practice of the Education for the Principle of Otto Cycle by the E-Learning CG-Content

    NASA Astrophysics Data System (ADS)

    Sato, Tomoaki; Nagaoka, Keizo; Oguchi, Kosei

    A CG-animation content which supports the learning of the Otto cycle was developed. This content has a piston assembly and the diagrams of PV, VS, TP and TS. The each diagram has a pointer which moves along the line of the graph and they are synchronized with the movement of the piston. The learners can operate this content directly on the e-learning system. While watching the movements of the piston assembly, the learners can confirm the state of the engine about temperature, pressure, volume, and entropy by the synchronized pointer on the diagrams. This content was used for the class of the machining practice exercise. The learning effect of the content was examined by the score of the short test. As the result of this examination, the CG-animation content was effective in the learning of the Otto cycle.

  1. Application of a minicomputer-based system in measuring intraocular fluid dynamics

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bronzino, J.D.; D'Amato, D.P.; O'Rourke, J.

    A complete, computerized system has been developed to automate and display radionuclide clearance studies in an ophthalmology clinical laboratory. The system is based on a PDP-8E computer with a 16-k core memory and includes a dual-drive Decassette system and an interactive display terminal. The software controls the acquisition of data from an NIM scaler, times the procedures, and analyzes and simultaneously displays logarithmically converted data on a fully annotated graph. Animal studies and clinical experiments are presented to illustrate the nature of these displays and the results obtained using this automated eye physiometer.

  2. Jupiter Environment Tool

    NASA Technical Reports Server (NTRS)

    Sturm, Erick J.; Monahue, Kenneth M.; Biehl, James P.; Kokorowski, Michael; Ngalande, Cedrick,; Boedeker, Jordan

    2012-01-01

    The Jupiter Environment Tool (JET) is a custom UI plug-in for STK that provides an interface to Jupiter environment models for visualization and analysis. Users can visualize the different magnetic field models of Jupiter through various rendering methods, which are fully integrated within STK s 3D Window. This allows users to take snapshots and make animations of their scenarios with magnetic field visualizations. Analytical data can be accessed in the form of custom vectors. Given these custom vectors, users have access to magnetic field data in custom reports, graphs, access constraints, coverage analysis, and anywhere else vectors are used within STK.

  3. Quantum walk on a chimera graph

    NASA Astrophysics Data System (ADS)

    Xu, Shu; Sun, Xiangxiang; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum; Sanders, Barry C.

    2018-05-01

    We analyse a continuous-time quantum walk on a chimera graph, which is a graph of choice for designing quantum annealers, and we discover beautiful quantum walk features such as localization that starkly distinguishes classical from quantum behaviour. Motivated by technological thrusts, we study continuous-time quantum walk on enhanced variants of the chimera graph and on diminished chimera graph with a random removal of vertices. We explain the quantum walk by constructing a generating set for a suitable subgroup of graph isomorphisms and corresponding symmetry operators that commute with the quantum walk Hamiltonian; the Hamiltonian and these symmetry operators provide a complete set of labels for the spectrum and the stationary states. Our quantum walk characterization of the chimera graph and its variants yields valuable insights into graphs used for designing quantum-annealers.

  4. On Edge Exchangeable Random Graphs

    NASA Astrophysics Data System (ADS)

    Janson, Svante

    2017-06-01

    We study a recent model for edge exchangeable random graphs introduced by Crane and Dempsey; in particular we study asymptotic properties of the random simple graph obtained by merging multiple edges. We study a number of examples, and show that the model can produce dense, sparse and extremely sparse random graphs. One example yields a power-law degree distribution. We give some examples where the random graph is dense and converges a.s. in the sense of graph limit theory, but also an example where a.s. every graph limit is the limit of some subsequence. Another example is sparse and yields convergence to a non-integrable generalized graphon defined on (0,∞).

  5. Interval Graph Limits

    PubMed Central

    Diaconis, Persi; Holmes, Susan; Janson, Svante

    2015-01-01

    We work out a graph limit theory for dense interval graphs. The theory developed departs from the usual description of a graph limit as a symmetric function W (x, y) on the unit square, with x and y uniform on the interval (0, 1). Instead, we fix a W and change the underlying distribution of the coordinates x and y. We find choices such that our limits are continuous. Connections to random interval graphs are given, including some examples. We also show a continuity result for the chromatic number and clique number of interval graphs. Some results on uniqueness of the limit description are given for general graph limits. PMID:26405368

  6. Chemical graphs, molecular matrices and topological indices in chemoinformatics and quantitative structure-activity relationships.

    PubMed

    Ivanciuc, Ovidiu

    2013-06-01

    Chemical and molecular graphs have fundamental applications in chemoinformatics, quantitative structureproperty relationships (QSPR), quantitative structure-activity relationships (QSAR), virtual screening of chemical libraries, and computational drug design. Chemoinformatics applications of graphs include chemical structure representation and coding, database search and retrieval, and physicochemical property prediction. QSPR, QSAR and virtual screening are based on the structure-property principle, which states that the physicochemical and biological properties of chemical compounds can be predicted from their chemical structure. Such structure-property correlations are usually developed from topological indices and fingerprints computed from the molecular graph and from molecular descriptors computed from the three-dimensional chemical structure. We present here a selection of the most important graph descriptors and topological indices, including molecular matrices, graph spectra, spectral moments, graph polynomials, and vertex topological indices. These graph descriptors are used to define several topological indices based on molecular connectivity, graph distance, reciprocal distance, distance-degree, distance-valency, spectra, polynomials, and information theory concepts. The molecular descriptors and topological indices can be developed with a more general approach, based on molecular graph operators, which define a family of graph indices related by a common formula. Graph descriptors and topological indices for molecules containing heteroatoms and multiple bonds are computed with weighting schemes based on atomic properties, such as the atomic number, covalent radius, or electronegativity. The correlation in QSPR and QSAR models can be improved by optimizing some parameters in the formula of topological indices, as demonstrated for structural descriptors based on atomic connectivity and graph distance.

  7. Spectral fluctuations of quantum graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Pluhař, Z.; Weidenmüller, H. A.

    We prove the Bohigas-Giannoni-Schmit conjecture in its most general form for completely connected simple graphs with incommensurate bond lengths. We show that for graphs that are classically mixing (i.e., graphs for which the spectrum of the classical Perron-Frobenius operator possesses a finite gap), the generating functions for all (P,Q) correlation functions for both closed and open graphs coincide (in the limit of infinite graph size) with the corresponding expressions of random-matrix theory, both for orthogonal and for unitary symmetry.

  8. 2-Extendability in Two Classes of Claw-Free Graphs

    DTIC Science & Technology

    1992-01-01

    extendability of planar graphs, Discrete Math ., 96, 1991, 81-99. [Lai M. Las Verguas, A note on matchings in graphs, Colloque sur la Thiorie des Graphes...43, 1987, 187-222. [LP L. Loviss and M.D. Plummet, Matching Theory, Ann. Discrete Math . 29, North-Holland, Amsterdam, 1986. [P11 M.D. Plummer, On n...extendable graphs, Discrete Math . 31, 1960, 201-210. [P21 Extending matchinp in planar graphs IV, Proc. of the Conference in honor of Cert Sabidussi, Ann

  9. A Visual Evaluation Study of Graph Sampling Techniques

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang, Fangyan; Zhang, Song; Wong, Pak C.

    2017-01-29

    We evaluate a dozen prevailing graph-sampling techniques with an ultimate goal to better visualize and understand big and complex graphs that exhibit different properties and structures. The evaluation uses eight benchmark datasets with four different graph types collected from Stanford Network Analysis Platform and NetworkX to give a comprehensive comparison of various types of graphs. The study provides a practical guideline for visualizing big graphs of different sizes and structures. The paper discusses results and important observations from the study.

  10. Methods of visualizing graphs

    DOEpatents

    Wong, Pak C.; Mackey, Patrick S.; Perrine, Kenneth A.; Foote, Harlan P.; Thomas, James J.

    2008-12-23

    Methods for visualizing a graph by automatically drawing elements of the graph as labels are disclosed. In one embodiment, the method comprises receiving node information and edge information from an input device and/or communication interface, constructing a graph layout based at least in part on that information, wherein the edges are automatically drawn as labels, and displaying the graph on a display device according to the graph layout. In some embodiments, the nodes are automatically drawn as labels instead of, or in addition to, the label-edges.

  11. A Ranking Approach on Large-Scale Graph With Multidimensional Heterogeneous Information.

    PubMed

    Wei, Wei; Gao, Bin; Liu, Tie-Yan; Wang, Taifeng; Li, Guohui; Li, Hang

    2016-04-01

    Graph-based ranking has been extensively studied and frequently applied in many applications, such as webpage ranking. It aims at mining potentially valuable information from the raw graph-structured data. Recently, with the proliferation of rich heterogeneous information (e.g., node/edge features and prior knowledge) available in many real-world graphs, how to effectively and efficiently leverage all information to improve the ranking performance becomes a new challenging problem. Previous methods only utilize part of such information and attempt to rank graph nodes according to link-based methods, of which the ranking performances are severely affected by several well-known issues, e.g., over-fitting or high computational complexity, especially when the scale of graph is very large. In this paper, we address the large-scale graph-based ranking problem and focus on how to effectively exploit rich heterogeneous information of the graph to improve the ranking performance. Specifically, we propose an innovative and effective semi-supervised PageRank (SSP) approach to parameterize the derived information within a unified semi-supervised learning framework (SSLF-GR), then simultaneously optimize the parameters and the ranking scores of graph nodes. Experiments on the real-world large-scale graphs demonstrate that our method significantly outperforms the algorithms that consider such graph information only partially.

  12. An asynchronous traversal engine for graph-based rich metadata management

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dai, Dong; Carns, Philip; Ross, Robert B.

    Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenancemore » data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains; but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.« less

  13. An asynchronous traversal engine for graph-based rich metadata management

    DOE PAGES

    Dai, Dong; Carns, Philip; Ross, Robert B.; ...

    2016-06-23

    Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenancemore » data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains; but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.« less

  14. Expanding our understanding of students' use of graphs for learning physics

    NASA Astrophysics Data System (ADS)

    Laverty, James T.

    It is generally agreed that the ability to visualize functional dependencies or physical relationships as graphs is an important step in modeling and learning. However, several studies in Physics Education Research (PER) have shown that many students in fact do not master this form of representation and even have misconceptions about the meaning of graphs that impede learning physics concepts. Working with graphs in classroom settings has been shown to improve student abilities with graphs, particularly when the students can interact with them. We introduce a novel problem type in an online homework system, which requires students to construct the graphs themselves in free form, and requires no hand-grading by instructors. A study of pre/post-test data using the Test of Understanding Graphs in Kinematics (TUG-K) over several semesters indicates that students learn significantly more from these graph construction problems than from the usual graph interpretation problems, and that graph interpretation alone may not have any significant effect. The interpretation of graphs, as well as the representation translation between textual, mathematical, and graphical representations of physics scenarios, are frequently listed among the higher order thinking skills we wish to convey in an undergraduate course. But to what degree do we succeed? Do students indeed employ higher order thinking skills when working through graphing exercises? We investigate students working through a variety of graph problems, and, using a think-aloud protocol, aim to reconstruct the cognitive processes that the students go through. We find that to a certain degree, these problems become commoditized and do not trigger the desired higher order thinking processes; simply translating ``textbook-like'' problems into the graphical realm will not achieve any additional educational goals. Whether the students have to interpret or construct a graph makes very little difference in the methods used by the students. We will also look at the results of using graph problems in an online learning environment. We will show evidence that construction problems lead to a higher degree of difficulty and degree of discrimination than other graph problems and discuss the influence the course has on these variables.

  15. Alternative Fuels Data Center: Maps and Data

    Science.gov Websites

    -1paywcu Last update August 2014 View Graph Graph Download Data State & Alt Fuel Providers -kgi9ks Trend of S&FP AFV acquisitions by fleet type from 1992-2014 Last update August 2016 View Graph -2015 Last update August 2016 View Graph Graph Download Data Generated_thumb20160907-12999-119sgvk

  16. Aspects of Performance on Line Graph Description Tasks: Influenced by Graph Familiarity and Different Task Features

    ERIC Educational Resources Information Center

    Xi, Xiaoming

    2010-01-01

    Motivated by cognitive theories of graph comprehension, this study systematically manipulated characteristics of a line graph description task in a speaking test in ways to mitigate the influence of graph familiarity, a potential source of construct-irrelevant variance. It extends Xi (2005), which found that the differences in holistic scores on…

  17. Building Scalable Knowledge Graphs for Earth Science

    NASA Technical Reports Server (NTRS)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.

  18. Global dynamics for switching systems and their extensions by linear differential equations

    NASA Astrophysics Data System (ADS)

    Huttinga, Zane; Cummins, Bree; Gedeon, Tomáš; Mischaikow, Konstantin

    2018-03-01

    Switching systems use piecewise constant nonlinearities to model gene regulatory networks. This choice provides advantages in the analysis of behavior and allows the global description of dynamics in terms of Morse graphs associated to nodes of a parameter graph. The parameter graph captures spatial characteristics of a decomposition of parameter space into domains with identical Morse graphs. However, there are many cellular processes that do not exhibit threshold-like behavior and thus are not well described by a switching system. We consider a class of extensions of switching systems formed by a mixture of switching interactions and chains of variables governed by linear differential equations. We show that the parameter graphs associated to the switching system and any of its extensions are identical. For each parameter graph node, there is an order-preserving map from the Morse graph of the switching system to the Morse graph of any of its extensions. We provide counterexamples that show why possible stronger relationships between the Morse graphs are not valid.

  19. Global dynamics for switching systems and their extensions by linear differential equations.

    PubMed

    Huttinga, Zane; Cummins, Bree; Gedeon, Tomáš; Mischaikow, Konstantin

    2018-03-15

    Switching systems use piecewise constant nonlinearities to model gene regulatory networks. This choice provides advantages in the analysis of behavior and allows the global description of dynamics in terms of Morse graphs associated to nodes of a parameter graph. The parameter graph captures spatial characteristics of a decomposition of parameter space into domains with identical Morse graphs. However, there are many cellular processes that do not exhibit threshold-like behavior and thus are not well described by a switching system. We consider a class of extensions of switching systems formed by a mixture of switching interactions and chains of variables governed by linear differential equations. We show that the parameter graphs associated to the switching system and any of its extensions are identical. For each parameter graph node, there is an order-preserving map from the Morse graph of the switching system to the Morse graph of any of its extensions. We provide counterexamples that show why possible stronger relationships between the Morse graphs are not valid.

  20. Text categorization of biomedical data sets using graph kernels and a controlled vocabulary.

    PubMed

    Bleik, Said; Mishra, Meenakshi; Huan, Jun; Song, Min

    2013-01-01

    Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.

  1. Supermanifolds from Feynman graphs

    NASA Astrophysics Data System (ADS)

    Marcolli, Matilde; Rej, Abhijnan

    2008-08-01

    We generalize the computation of Feynman integrals of log divergent graphs in terms of the Kirchhoff polynomial to the case of graphs with both fermionic and bosonic edges, to which we assign a set of ordinary and Grassmann variables. This procedure gives a computation of the Feynman integrals in terms of a period on a supermanifold, for graphs admitting a basis of the first homology satisfying a condition generalizing the log divergence in this context. The analog in this setting of the graph hypersurfaces is a graph supermanifold given by the divisor of zeros and poles of the Berezinian of a matrix associated with the graph, inside a superprojective space. We introduce a Grothendieck group for supermanifolds and identify the subgroup generated by the graph supermanifolds. This can be seen as a general procedure for constructing interesting classes of supermanifolds with associated periods.

  2. Function plot response: A scalable system for teaching kinematics graphs

    NASA Astrophysics Data System (ADS)

    Laverty, James; Kortemeyer, Gerd

    2012-08-01

    Understanding and interpreting graphs are essential skills in all sciences. While students are mostly proficient in plotting given functions and reading values off graphs, they frequently lack the ability to construct and interpret graphs in a meaningful way. Students can use graphs as representations of value pairs, but often fail to interpret them as the representation of functions, and mostly fail to use them as representations of physical reality. Working with graphs in classroom settings has been shown to improve student abilities with graphs, particularly when the students can interact with them. We introduce a novel problem type in an online homework system, which requires students to construct the graphs themselves in free form, and requires no hand-grading by instructors. Initial experiences using the new problem type in an introductory physics course are reported.

  3. Many-core graph analytics using accelerated sparse linear algebra routines

    NASA Astrophysics Data System (ADS)

    Kozacik, Stephen; Paolini, Aaron L.; Fox, Paul; Kelmelis, Eric

    2016-05-01

    Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Largescale graph analytics frameworks provide a convenient and highly-scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and Tinkerpop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is be backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run-time for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without the requirement for the customer to make any changes to their analytics code, thanks to the compatibility with existing graph APIs.

  4. Dynamical density delay maps: simple, new method for visualising the behaviour of complex systems

    PubMed Central

    2014-01-01

    Background Physiologic signals, such as cardiac interbeat intervals, exhibit complex fluctuations. However, capturing important dynamical properties, including nonstationarities may not be feasible from conventional time series graphical representations. Methods We introduce a simple-to-implement visualisation method, termed dynamical density delay mapping (“D3-Map” technique) that provides an animated representation of a system’s dynamics. The method is based on a generalization of conventional two-dimensional (2D) Poincaré plots, which are scatter plots where each data point, x(n), in a time series is plotted against the adjacent one, x(n + 1). First, we divide the original time series, x(n) (n = 1,…, N), into a sequence of segments (windows). Next, for each segment, a three-dimensional (3D) Poincaré surface plot of x(n), x(n + 1), h[x(n),x(n + 1)] is generated, in which the third dimension, h, represents the relative frequency of occurrence of each (x(n),x(n + 1)) point. This 3D Poincaré surface is then chromatised by mapping the relative frequency h values onto a colour scheme. We also generate a colourised 2D contour plot from each time series segment using the same colourmap scheme as for the 3D Poincaré surface. Finally, the original time series graph, the colourised 3D Poincaré surface plot, and its projection as a colourised 2D contour map for each segment, are animated to create the full “D3-Map.” Results We first exemplify the D3-Map method using the cardiac interbeat interval time series from a healthy subject during sleeping hours. The animations uncover complex dynamical changes, such as transitions between states, and the relative amount of time the system spends in each state. We also illustrate the utility of the method in detecting hidden temporal patterns in the heart rate dynamics of a patient with atrial fibrillation. The videos, as well as the source code, are made publicly available. Conclusions Animations based on density delay maps provide a new way of visualising dynamical properties of complex systems not apparent in time series graphs or standard Poincaré plot representations. Trainees in a variety of fields may find the animations useful as illustrations of fundamental but challenging concepts, such as nonstationarity and multistability. For investigators, the method may facilitate data exploration. PMID:24438439

  5. On Effective Graphic Communication of Health Inequality: Considerations for Health Policy Researchers.

    PubMed

    Asada, Yukiko; Abel, Hannah; Skedgel, Chris; Warner, Grace

    2017-12-01

    Policy Points: Effective graphs can be a powerful tool in communicating health inequality. The choice of graphs is often based on preferences and familiarity rather than science. According to the literature on graph perception, effective graphs allow human brains to decode visual cues easily. Dot charts are easier to decode than bar charts, and thus they are more effective. Dot charts are a flexible and versatile way to display information about health inequality. Consistent with the health risk communication literature, the captions accompanying health inequality graphs should provide a numerical, explicitly calculated description of health inequality, expressed in absolute and relative terms, from carefully thought-out perspectives. Graphs are an essential tool for communicating health inequality, a key health policy concern. The choice of graphs is often driven by personal preferences and familiarity. Our article is aimed at health policy researchers developing health inequality graphs for policy and scientific audiences and seeks to (1) raise awareness of the effective use of graphs in communicating health inequality; (2) advocate for a particular type of graph (ie, dot charts) to depict health inequality; and (3) suggest key considerations for the captions accompanying health inequality graphs. Using composite review methods, we selected the prevailing recommendations for improving graphs in scientific reporting. To find the origins of these recommendations, we reviewed the literature on graph perception and then applied what we learned to the context of health inequality. In addition, drawing from the numeracy literature in health risk communication, we examined numeric and verbal formats to explain health inequality graphs. Many disciplines offer commonsense recommendations for visually presenting quantitative data. The literature on graph perception, which defines effective graphs as those allowing the easy decoding of visual cues in human brains, shows that with their more accurate and easier-to-decode visual cues, dot charts are more effective than bar charts. Dot charts can flexibly present a large amount of information in limited space. They also can easily accommodate typical health inequality information to describe a health variable (eg, life expectancy) by an inequality domain (eg, income) with domain groups (eg, poor and rich) in a population (eg, Canada) over time periods (eg, 2010 and 2017). The numeracy literature suggests that a health inequality graph's caption should provide a numerical, explicitly calculated description of health inequality expressed in absolute and relative terms, from carefully thought-out perspectives. Given the ubiquity of graphs, the health inequality field should learn from the vibrant multidisciplinary literature how to construct effective graphic communications, especially by considering to use dot charts. © 2017 Milbank Memorial Fund.

  6. The Quantity and Quality of Scientific Graphs in Pharmaceutical Advertisements

    PubMed Central

    Cooper, Richelle J; Schriger, David L; Wallace, Roger C; Mikulich, Vladislav J; Wilkes, Michael S

    2003-01-01

    We characterized the quantity and quality of graphs in all pharmaceutical advertisements, in the 10 U.S. medical journals. Four hundred eighty-four unique advertisements (of 3,185 total advertisements) contained 836 glossy and 455 small-print pages. Forty-nine percent of glossy page area was nonscientific figures/images, 0.4% tables, and 1.6% scientific graphs (74 graphs in 64 advertisements). All 74 graphs were univariate displays, 4% were distributions, and 4% contained confidence intervals for summary measures. Extraneous decoration (66%) and redundancy (46%) were common. Fifty-eight percent of graphs presented an outcome relevant to the drug's indication. Numeric distortion, specifically prohibited by FDA regulations, occurred in 36% of graphs. PMID:12709097

  7. Molecular graph convolutions: moving beyond fingerprints

    NASA Astrophysics Data System (ADS)

    Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick

    2016-08-01

    Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.

  8. Alternative Fuels Data Center: Maps and Data

    Science.gov Websites

    acquisitions by fleet type from 1992-2014 Last update August 2016 View Graph Graph Download Data -m8i0e0 Trend of S&FP AFV acquisitions by fuel type from 1992-2015 Last update August 2016 View Graph transactions from 1997-2014 Last update August 2016 View Graph Graph Download Data Generated_thumb20160907

  9. PuLP/XtraPuLP : Partitioning Tools for Extreme-Scale Graphs

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slota, George M; Rajamanickam, Sivasankaran; Madduri, Kamesh

    2017-09-21

    PuLP/XtraPulp is software for partitioning graphs from several real-world problems. Graphs occur in several places in real world from road networks, social networks and scientific simulations. For efficient parallel processing these graphs have to be partitioned (split) with respect to metrics such as computation and communication costs. Our software allows such partitioning for massive graphs.

  10. Dynamics on Networks of Manifolds

    NASA Astrophysics Data System (ADS)

    DeVille, Lee; Lerman, Eugene

    2015-03-01

    We propose a precise definition of a continuous time dynamical system made up of interacting open subsystems. The interconnections of subsystems are coded by directed graphs. We prove that the appropriate maps of graphs called graph fibrations give rise to maps of dynamical systems. Consequently surjective graph fibrations give rise to invariant subsystems and injective graph fibrations give rise to projections of dynamical systems.

  11. Graph edit distance from spectral seriation.

    PubMed

    Robles-Kelly, Antonio; Hancock, Edwin R

    2005-03-01

    This paper is concerned with computing graph edit distance. One of the criticisms that can be leveled at existing methods for computing graph edit distance is that they lack some of the formality and rigor of the computation of string edit distance. Hence, our aim is to convert graphs to string sequences so that string matching techniques can be used. To do this, we use a graph spectral seriation method to convert the adjacency matrix into a string or sequence order. We show how the serial ordering can be established using the leading eigenvector of the graph adjacency matrix. We pose the problem of graph-matching as a maximum a posteriori probability (MAP) alignment of the seriation sequences for pairs of graphs. This treatment leads to an expression in which the edit cost is the negative logarithm of the a posteriori sequence alignment probability. We compute the edit distance by finding the sequence of string edit operations which minimizes the cost of the path traversing the edit lattice. The edit costs are determined by the components of the leading eigenvectors of the adjacency matrix and by the edge densities of the graphs being matched. We demonstrate the utility of the edit distance on a number of graph clustering problems.

  12. NEFI: Network Extraction From Images

    PubMed Central

    Dirnberger, M.; Kehl, T.; Neumann, A.

    2015-01-01

    Networks are amongst the central building blocks of many systems. Given a graph of a network, methods from graph theory enable a precise investigation of its properties. Software for the analysis of graphs is widely available and has been applied to study various types of networks. In some applications, graph acquisition is relatively simple. However, for many networks data collection relies on images where graph extraction requires domain-specific solutions. Here we introduce NEFI, a tool that extracts graphs from images of networks originating in various domains. Regarding previous work on graph extraction, theoretical results are fully accessible only to an expert audience and ready-to-use implementations for non-experts are rarely available or insufficiently documented. NEFI provides a novel platform allowing practitioners to easily extract graphs from images by combining basic tools from image processing, computer vision and graph theory. Thus, NEFI constitutes an alternative to tedious manual graph extraction and special purpose tools. We anticipate NEFI to enable time-efficient collection of large datasets. The analysis of these novel datasets may open up the possibility to gain new insights into the structure and function of various networks. NEFI is open source and available at http://nefi.mpi-inf.mpg.de. PMID:26521675

  13. Visual graph query formulation and exploration: a new perspective on information retrieval at the edge

    NASA Astrophysics Data System (ADS)

    Kase, Sue E.; Vanni, Michelle; Knight, Joanne A.; Su, Yu; Yan, Xifeng

    2016-05-01

    Within operational environments decisions must be made quickly based on the information available. Identifying an appropriate knowledge base and accurately formulating a search query are critical tasks for decision-making effectiveness in dynamic situations. The spreading of graph data management tools to access large graph databases is a rapidly emerging research area of potential benefit to the intelligence community. A graph representation provides a natural way of modeling data in a wide variety of domains. Graph structures use nodes, edges, and properties to represent and store data. This research investigates the advantages of information search by graph query initiated by the analyst and interactively refined within the contextual dimensions of the answer space toward a solution. The paper introduces SLQ, a user-friendly graph querying system enabling the visual formulation of schemaless and structureless graph queries. SLQ is demonstrated with an intelligence analyst information search scenario focused on identifying individuals responsible for manufacturing a mosquito-hosted deadly virus. The scenario highlights the interactive construction of graph queries without prior training in complex query languages or graph databases, intuitive navigation through the problem space, and visualization of results in graphical format.

  14. Metric learning with spectral graph convolutions on brain connectivity networks.

    PubMed

    Ktena, Sofia Ira; Parisot, Sarah; Ferrante, Enzo; Rajchl, Martin; Lee, Matthew; Glocker, Ben; Rueckert, Daniel

    2018-04-01

    Graph representations are often used to model structured data at an individual or population level and have numerous applications in pattern recognition problems. In the field of neuroscience, where such representations are commonly used to model structural or functional connectivity between a set of brain regions, graphs have proven to be of great importance. This is mainly due to the capability of revealing patterns related to brain development and disease, which were previously unknown. Evaluating similarity between these brain connectivity networks in a manner that accounts for the graph structure and is tailored for a particular application is, however, non-trivial. Most existing methods fail to accommodate the graph structure, discarding information that could be beneficial for further classification or regression analyses based on these similarities. We propose to learn a graph similarity metric using a siamese graph convolutional neural network (s-GCN) in a supervised setting. The proposed framework takes into consideration the graph structure for the evaluation of similarity between a pair of graphs, by employing spectral graph convolutions that allow the generalisation of traditional convolutions to irregular graphs and operates in the graph spectral domain. We apply the proposed model on two datasets: the challenging ABIDE database, which comprises functional MRI data of 403 patients with autism spectrum disorder (ASD) and 468 healthy controls aggregated from multiple acquisition sites, and a set of 2500 subjects from UK Biobank. We demonstrate the performance of the method for the tasks of classification between matching and non-matching graphs, as well as individual subject classification and manifold learning, showing that it leads to significantly improved results compared to traditional methods. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Proving relations between modular graph functions

    NASA Astrophysics Data System (ADS)

    Basu, Anirban

    2016-12-01

    We consider modular graph functions that arise in the low energy expansion of the four graviton amplitude in type II string theory. The vertices of these graphs are the positions of insertions of vertex operators on the toroidal worldsheet, while the links are the scalar Green functions connecting the vertices. Graphs with four and five links satisfy several non-trivial relations, which have been proved recently. We prove these relations by using elementary properties of Green functions and the details of the graphs. We also prove a relation between modular graph functions with six links.

  16. Panconnectivity of Locally Connected K(1,3)-Free Graphs

    DTIC Science & Technology

    1989-10-15

    Graph Theory, 3 (1979) p. 351-356. 22 7. Cun-Quan Zhang, Cycles of Given Lengths in KI, 3-Free Graphs, Discrete Math ., (1988) to appear. I. f 2.f 𔃽. AA A V V / (S. ...Locally Connected and Hamiltonian-Connected Graphs, Isreal J. Math., 33 (1979) p. 5-8. 4. V. Chvatal and P. Erd6s, A Note on Hamiltonian Circuits, Discrete ... Math ., 2 (1972) p. 111-113. 5. S. V. Kanetkar and P. R. Rao, Connected and Locally 2- Connected, K1.3-Free Graphs are Panconnected, J. Graph Theory, 8

  17. What energy functions can be minimized via graph cuts?

    PubMed

    Kolmogorov, Vladimir; Zabih, Ramin

    2004-02-01

    In the last few years, several new algorithms based on graph cuts have been developed to solve energy minimization problems in computer vision. Each of these techniques constructs a graph such that the minimum cut on the graph also minimizes the energy. Yet, because these graph constructions are complex and highly specific to a particular energy function, graph cuts have seen limited application to date. In this paper, we give a characterization of the energy functions that can be minimized by graph cuts. Our results are restricted to functions of binary variables. However, our work generalizes many previous constructions and is easily applicable to vision problems that involve large numbers of labels, such as stereo, motion, image restoration, and scene reconstruction. We give a precise characterization of what energy functions can be minimized using graph cuts, among the energy functions that can be written as a sum of terms containing three or fewer binary variables. We also provide a general-purpose construction to minimize such an energy function. Finally, we give a necessary condition for any energy function of binary variables to be minimized by graph cuts. Researchers who are considering the use of graph cuts to optimize a particular energy function can use our results to determine if this is possible and then follow our construction to create the appropriate graph. A software implementation is freely available.

  18. Connectivity: Performance Portable Algorithms for graph connectivity v. 0.1

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Slota, George; Rajamanickam, Sivasankaran; Madduri, Kamesh

    Graphs occur in several places in real world from road networks, social networks and scientific simulations. Connectivity is a graph analysis software to graph connectivity in modern architectures like multicore CPUs, Xeon Phi and GPUs.

  19. Graph wavelet alignment kernels for drug virtual screening.

    PubMed

    Smalter, Aaron; Huan, Jun; Lushington, Gerald

    2009-06-01

    In this paper, we introduce a novel statistical modeling technique for target property prediction, with applications to virtual screening and drug design. In our method, we use graphs to model chemical structures and apply a wavelet analysis of graphs to summarize features capturing graph local topology. We design a novel graph kernel function to utilize the topology features to build predictive models for chemicals via Support Vector Machine classifier. We call the new graph kernel a graph wavelet-alignment kernel. We have evaluated the efficacy of the wavelet-alignment kernel using a set of chemical structure-activity prediction benchmarks. Our results indicate that the use of the kernel function yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art chemical classification approaches. In addition, our results also show that the use of wavelet functions significantly decreases the computational costs for graph kernel computation with more than ten fold speedup.

  20. Flexibility in data interpretation: effects of representational format.

    PubMed

    Braithwaite, David W; Goldstone, Robert L

    2013-01-01

    Graphs and tables differentially support performance on specific tasks. For tasks requiring reading off single data points, tables are as good as or better than graphs, while for tasks involving relationships among data points, graphs often yield better performance. However, the degree to which graphs and tables support flexibility across a range of tasks is not well-understood. In two experiments, participants detected main and interaction effects in line graphs and tables of bivariate data. Graphs led to more efficient performance, but also lower flexibility, as indicated by a larger discrepancy in performance across tasks. In particular, detection of main effects of variables represented in the graph legend was facilitated relative to detection of main effects of variables represented in the x-axis. Graphs may be a preferable representational format when the desired task or analytical perspective is known in advance, but may also induce greater interpretive bias than tables, necessitating greater care in their use and design.

Top