Efficient Extraction of High Centrality Vertices in Distributed Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumbhare, Alok; Frincu, Marc; Raghavendra, Cauligi S.
2014-09-09
Betweenness centrality (BC) is an important measure for identifying high value or critical vertices in graphs, in variety of domains such as communication networks, road networks, and social graphs. However, calculating betweenness values is prohibitively expensive and, more often, domain experts are interested only in the vertices with the highest centrality values. In this paper, we first propose a partition-centric algorithm (MS-BC) to calculate BC for a large distributed graph that optimizes resource utilization and improves overall performance. Further, we extend the notion of approximate BC by pruning the graph and removing a subset of edges and vertices that contributemore » the least to the betweenness values of other vertices (MSL-BC), which further improves the runtime performance. We evaluate the proposed algorithms using a mix of real-world and synthetic graphs on an HPC cluster and analyze its strengths and weaknesses. The experimental results show an improvement in performance of upto 12x for large sparse graphs as compared to the state-of-the-art, and at the same time highlights the need for better partitioning methods to enable a balanced workload across partitions for unbalanced graphs such as small-world or power-law graphs.« less
PuLP/XtraPuLP : Partitioning Tools for Extreme-Scale Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Slota, George M; Rajamanickam, Sivasankaran; Madduri, Kamesh
2017-09-21
PuLP/XtraPulp is software for partitioning graphs from several real-world problems. Graphs occur in several places in real world from road networks, social networks and scientific simulations. For efficient parallel processing these graphs have to be partitioned (split) with respect to metrics such as computation and communication costs. Our software allows such partitioning for massive graphs.
Minimum nonuniform graph partitioning with unrelated weights
NASA Astrophysics Data System (ADS)
Makarychev, K. S.; Makarychev, Yu S.
2017-12-01
We give a bi-criteria approximation algorithm for the Minimum Nonuniform Graph Partitioning problem, recently introduced by Krauthgamer, Naor, Schwartz and Talwar. In this problem, we are given a graph G=(V,E) and k numbers ρ_1,\\dots, ρ_k. The goal is to partition V into k disjoint sets (bins) P_1,\\dots, P_k satisfying \\vert P_i\\vert≤ ρi \\vert V\\vert for all i, so as to minimize the number of edges cut by the partition. Our bi-criteria algorithm gives an O(\\sqrt{log \\vert V\\vert log k}) approximation for the objective function in general graphs and an O(1) approximation in graphs excluding a fixed minor. The approximate solution satisfies the relaxed capacity constraints \\vert P_i\\vert ≤ (5+ \\varepsilon)ρi \\vert V\\vert. This algorithm is an improvement upon the O(log \\vert V\\vert)-approximation algorithm by Krauthgamer, Naor, Schwartz and Talwar. We extend our results to the case of 'unrelated weights' and to the case of 'unrelated d-dimensional weights'. A preliminary version of this work was presented at the 41st International Colloquium on Automata, Languages and Programming (ICALP 2014). Bibliography: 7 titles.
Task-specific image partitioning.
Kim, Sungwoong; Nowozin, Sebastian; Kohli, Pushmeet; Yoo, Chang D
2013-02-01
Image partitioning is an important preprocessing step for many of the state-of-the-art algorithms used for performing high-level computer vision tasks. Typically, partitioning is conducted without regard to the task in hand. We propose a task-specific image partitioning framework to produce a region-based image representation that will lead to a higher task performance than that reached using any task-oblivious partitioning framework and existing supervised partitioning framework, albeit few in number. The proposed method partitions the image by means of correlation clustering, maximizing a linear discriminant function defined over a superpixel graph. The parameters of the discriminant function that define task-specific similarity/dissimilarity among superpixels are estimated based on structured support vector machine (S-SVM) using task-specific training data. The S-SVM learning leads to a better generalization ability while the construction of the superpixel graph used to define the discriminant function allows a rich set of features to be incorporated to improve discriminability and robustness. We evaluate the learned task-aware partitioning algorithms on three benchmark datasets. Results show that task-aware partitioning leads to better labeling performance than the partitioning computed by the state-of-the-art general-purpose and supervised partitioning algorithms. We believe that the task-specific image partitioning paradigm is widely applicable to improving performance in high-level image understanding tasks.
Fully Decomposable Split Graphs
NASA Astrophysics Data System (ADS)
Broersma, Hajo; Kratsch, Dieter; Woeginger, Gerhard J.
We discuss various questions around partitioning a split graph into connected parts. Our main result is a polynomial time algorithm that decides whether a given split graph is fully decomposable, i.e., whether it can be partitioned into connected parts of order α 1,α 2,...,α k for every α 1,α 2,...,α k summing up to the order of the graph. In contrast, we show that the decision problem whether a given split graph can be partitioned into connected parts of order α 1,α 2,...,α k for a given partition α 1,α 2,...,α k of the order of the graph, is NP-hard.
Multi-A Graph Patrolling and Partitioning
NASA Astrophysics Data System (ADS)
Elor, Y.; Bruckstein, A. M.
2012-12-01
We introduce a novel multi agent patrolling algorithm inspired by the behavior of gas filled balloons. Very low capability ant-like agents are considered with the task of patrolling an unknown area modeled as a graph. While executing the proposed algorithm, the agents dynamically partition the graph between them using simple local interactions, every agent assuming the responsibility for patrolling his subgraph. Balanced graph partition is an emergent behavior due to the local interactions between the agents in the swarm. Extensive simulations on various graphs (environments) showed that the average time to reach a balanced partition is linear with the graph size. The simulations yielded a convincing argument for conjecturing that if the graph being patrolled contains a balanced partition, the agents will find it. However, we could not prove this. Nevertheless, we have proved that if a balanced partition is reached, the maximum time lag between two successive visits to any vertex using the proposed strategy is at most twice the optimal so the patrol quality is at least half the optimal. In case of weighted graphs the patrol quality is at least (1)/(2){lmin}/{lmax} of the optimal where lmax (lmin) is the longest (shortest) edge in the graph.
Spectral partitioning in equitable graphs.
Barucca, Paolo
2017-06-01
Graph partitioning problems emerge in a wide variety of complex systems, ranging from biology to finance, but can be rigorously analyzed and solved only for a few graph ensembles. Here, an ensemble of equitable graphs, i.e., random graphs with a block-regular structure, is studied, for which analytical results can be obtained. In particular, the spectral density of this ensemble is computed exactly for a modular and bipartite structure. Kesten-McKay's law for random regular graphs is found analytically to apply also for modular and bipartite structures when blocks are homogeneous. An exact solution to graph partitioning for two equal-sized communities is proposed and verified numerically, and a conjecture on the absence of an efficient recovery detectability transition in equitable graphs is suggested. A final discussion summarizes results and outlines their relevance for the solution of graph partitioning problems in other graph ensembles, in particular for the study of detectability thresholds and resolution limits in stochastic block models.
Spectral partitioning in equitable graphs
NASA Astrophysics Data System (ADS)
Barucca, Paolo
2017-06-01
Graph partitioning problems emerge in a wide variety of complex systems, ranging from biology to finance, but can be rigorously analyzed and solved only for a few graph ensembles. Here, an ensemble of equitable graphs, i.e., random graphs with a block-regular structure, is studied, for which analytical results can be obtained. In particular, the spectral density of this ensemble is computed exactly for a modular and bipartite structure. Kesten-McKay's law for random regular graphs is found analytically to apply also for modular and bipartite structures when blocks are homogeneous. An exact solution to graph partitioning for two equal-sized communities is proposed and verified numerically, and a conjecture on the absence of an efficient recovery detectability transition in equitable graphs is suggested. A final discussion summarizes results and outlines their relevance for the solution of graph partitioning problems in other graph ensembles, in particular for the study of detectability thresholds and resolution limits in stochastic block models.
Elmetwaly, Shereef; Schlick, Tamar
2014-01-01
Graph representations have been widely used to analyze and design various economic, social, military, political, and biological networks. In systems biology, networks of cells and organs are useful for understanding disease and medical treatments and, in structural biology, structures of molecules can be described, including RNA structures. In our RNA-As-Graphs (RAG) framework, we represent RNA structures as tree graphs by translating unpaired regions into vertices and helices into edges. Here we explore the modularity of RNA structures by applying graph partitioning known in graph theory to divide an RNA graph into subgraphs. To our knowledge, this is the first application of graph partitioning to biology, and the results suggest a systematic approach for modular design in general. The graph partitioning algorithms utilize mathematical properties of the Laplacian eigenvector (µ2) corresponding to the second eigenvalues (λ2) associated with the topology matrix defining the graph: λ2 describes the overall topology, and the sum of µ2′s components is zero. The three types of algorithms, termed median, sign, and gap cuts, divide a graph by determining nodes of cut by median, zero, and largest gap of µ2′s components, respectively. We apply these algorithms to 45 graphs corresponding to all solved RNA structures up through 11 vertices (∼220 nucleotides). While we observe that the median cut divides a graph into two similar-sized subgraphs, the sign and gap cuts partition a graph into two topologically-distinct subgraphs. We find that the gap cut produces the best biologically-relevant partitioning for RNA because it divides RNAs at less stable connections while maintaining junctions intact. The iterative gap cuts suggest basic modules and assembly protocols to design large RNA structures. Our graph substructuring thus suggests a systematic approach to explore the modularity of biological networks. In our applications to RNA structures, subgraphs also suggest design strategies for novel RNA motifs. PMID:25188578
Improving Unstructured Mesh Partitions for Multiple Criteria Using Mesh Adjacencies
Smith, Cameron W.; Rasquin, Michel; Ibanez, Dan; ...
2018-02-13
The scalability of unstructured mesh based applications depends on partitioning methods that quickly balance the computational work while reducing communication costs. Zhou et al. [SIAM J. Sci. Comput., 32 (2010), pp. 3201{3227; J. Supercomput., 59 (2012), pp. 1218{1228] demonstrated the combination of (hyper)graph methods with vertex and element partition improvement for PHASTA CFD scaling to hundreds of thousands of processes. Our work generalizes partition improvement to support balancing combinations of all the mesh entity dimensions (vertices, edges, faces, regions) in partitions with imbalances exceeding 70%. Improvement results are then presented for multiple entity dimensions on up to one million processesmore » on meshes with over 12 billion tetrahedral elements.« less
Improving Unstructured Mesh Partitions for Multiple Criteria Using Mesh Adjacencies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Smith, Cameron W.; Rasquin, Michel; Ibanez, Dan
The scalability of unstructured mesh based applications depends on partitioning methods that quickly balance the computational work while reducing communication costs. Zhou et al. [SIAM J. Sci. Comput., 32 (2010), pp. 3201{3227; J. Supercomput., 59 (2012), pp. 1218{1228] demonstrated the combination of (hyper)graph methods with vertex and element partition improvement for PHASTA CFD scaling to hundreds of thousands of processes. Our work generalizes partition improvement to support balancing combinations of all the mesh entity dimensions (vertices, edges, faces, regions) in partitions with imbalances exceeding 70%. Improvement results are then presented for multiple entity dimensions on up to one million processesmore » on meshes with over 12 billion tetrahedral elements.« less
A Partitioning Algorithm for Block-Diagonal Matrices With Overlap
DOE Office of Scientific and Technical Information (OSTI.GOV)
Guy Antoine Atenekeng Kahou; Laura Grigori; Masha Sosonkina
2008-02-02
We present a graph partitioning algorithm that aims at partitioning a sparse matrix into a block-diagonal form, such that any two consecutive blocks overlap. We denote this form of the matrix as the overlapped block-diagonal matrix. The partitioned matrix is suitable for applying the explicit formulation of Multiplicative Schwarz preconditioner (EFMS) described in [3]. The graph partitioning algorithm partitions the graph of the input matrix into K partitions, such that every partition {Omega}{sub i} has at most two neighbors {Omega}{sub i-1} and {Omega}{sub i+1}. First, an ordering algorithm, such as the reverse Cuthill-McKee algorithm, that reduces the matrix profile ismore » performed. An initial overlapped block-diagonal partition is obtained from the profile of the matrix. An iterative strategy is then used to further refine the partitioning by allowing nodes to be transferred between neighboring partitions. Experiments are performed on matrices arising from real-world applications to show the feasibility and usefulness of this approach.« less
A Novel Coarsening Method for Scalable and Efficient Mesh Generation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoo, A; Hysom, D; Gunney, B
2010-12-02
In this paper, we propose a novel mesh coarsening method called brick coarsening method. The proposed method can be used in conjunction with any graph partitioners and scales to very large meshes. This method reduces problem space by decomposing the original mesh into fixed-size blocks of nodes called bricks, layered in a similar way to conventional brick laying, and then assigning each node of the original mesh to appropriate brick. Our experiments indicate that the proposed method scales to very large meshes while allowing simple RCB partitioner to produce higher-quality partitions with significantly less edge cuts. Our results further indicatemore » that the proposed brick-coarsening method allows more complicated partitioners like PT-Scotch to scale to very large problem size while still maintaining good partitioning performance with relatively good edge-cut metric. Graph partitioning is an important problem that has many scientific and engineering applications in such areas as VLSI design, scientific computing, and resource management. Given a graph G = (V,E), where V is the set of vertices and E is the set of edges, (k-way) graph partitioning problem is to partition the vertices of the graph (V) into k disjoint groups such that each group contains roughly equal number of vertices and the number of edges connecting vertices in different groups is minimized. Graph partitioning plays a key role in large scientific computing, especially in mesh-based computations, as it is used as a tool to minimize the volume of communication and to ensure well-balanced load across computing nodes. The impact of graph partitioning on the reduction of communication can be easily seen, for example, in different iterative methods to solve a sparse system of linear equation. Here, a graph partitioning technique is applied to the matrix, which is basically a graph in which each edge is a non-zero entry in the matrix, to allocate groups of vertices to processors in such a way that many of matrix-vector multiplication can be performed locally on each processor and hence to minimize communication. Furthermore, a good graph partitioning scheme ensures the equal amount of computation performed on each processor. Graph partitioning is a well known NP-complete problem, and thus the most commonly used graph partitioning algorithms employ some forms of heuristics. These algorithms vary in terms of their complexity, partition generation time, and the quality of partitions, and they tend to trade off these factors. A significant challenge we are currently facing at the Lawrence Livermore National Laboratory is how to partition very large meshes on massive-size distributed memory machines like IBM BlueGene/P, where scalability becomes a big issue. For example, we have found that the ParMetis, a very popular graph partitioning tool, can only scale to 16K processors. An ideal graph partitioning method on such an environment should be fast and scale to very large meshes, while producing high quality partitions. This is an extremely challenging task, as to scale to that level, the partitioning algorithm should be simple and be able to produce partitions that minimize inter-processor communications and balance the load imposed on the processors. Our goals in this work are two-fold: (1) To develop a new scalable graph partitioning method with good load balancing and communication reduction capability. (2) To study the performance of the proposed partitioning method on very large parallel machines using actual data sets and compare the performance to that of existing methods. The proposed method achieves the desired scalability by reducing the mesh size. For this, it coarsens an input mesh into a smaller size mesh by coalescing the vertices and edges of the original mesh into a set of mega-vertices and mega-edges. A new coarsening method called brick algorithm is developed in this research. In the brick algorithm, the zones in a given mesh are first grouped into fixed size blocks called bricks. These brick are then laid in a way similar to conventional brick laying technique, which reduces the number of neighboring blocks each block needs to communicate. Contributions of this research are as follows: (1) We have developed a novel method that scales to a really large problem size while producing high quality mesh partitions; (2) We measured the performance and scalability of the proposed method on a machine of massive size using a set of actual large complex data sets, where we have scaled to a mesh with 110 million zones using our method. To the best of our knowledge, this is the largest complex mesh that a partitioning method is successfully applied to; and (3) We have shown that proposed method can reduce the number of edge cuts by as much as 65%.« less
The partition dimension of cycle books graph
NASA Astrophysics Data System (ADS)
Santoso, Jaya; Darmaji
2018-03-01
Let G be a nontrivial and connected graph with vertex set V(G), edge set E(G) and S ⊆ V(G) with v ∈ V(G), the distance between v and S is d(v,S) = min{d(v,x)|x ∈ S}. For an ordered partition ∏ = {S 1, S 2, S 3,…, Sk } of V(G), the representation of v with respect to ∏ is defined by r(v|∏) = (d(v, S 1), d(v, S 2),…, d(v, Sk )). The partition ∏ is called a resolving partition of G if all representations of vertices are distinct. The partition dimension pd(G) is the smallest integer k such that G has a resolving partition set with k members. In this research, we will determine the partition dimension of Cycle Books {B}{Cr,m}. Cycle books graph {B}{Cr,m} is a graph consisting of m copies cycle Cr with the common path P 2. It is shown that the partition dimension of cycle books graph, pd({B}{C3,m}) is 3 for m = 2, 3, and m for m ≥ 4. pd({B}{C4,m}) is 3 + 2k for m = 3k + 2, 4 + 2(k ‑ 1) for m = 3k + 1, and 3 + 2(k ‑ 1) for m = 3k. pd({B}{C5,m}) is m + 1.
NASA Astrophysics Data System (ADS)
Murni, Bustamam, A.; Ernastuti, Handhika, T.; Kerami, D.
2017-07-01
Calculation of the matrix-vector multiplication in the real-world problems often involves large matrix with arbitrary size. Therefore, parallelization is needed to speed up the calculation process that usually takes a long time. Graph partitioning techniques that have been discussed in the previous studies cannot be used to complete the parallelized calculation of matrix-vector multiplication with arbitrary size. This is due to the assumption of graph partitioning techniques that can only solve the square and symmetric matrix. Hypergraph partitioning techniques will overcome the shortcomings of the graph partitioning technique. This paper addresses the efficient parallelization of matrix-vector multiplication through hypergraph partitioning techniques using CUDA GPU-based parallel computing. CUDA (compute unified device architecture) is a parallel computing platform and programming model that was created by NVIDIA and implemented by the GPU (graphics processing unit).
The partition dimension of subdivision of a graph
NASA Astrophysics Data System (ADS)
Amrullah, Baskoro, Edy Tri; Uttunggadewa, Saladin; Simanjuntak, Rinovia
2016-02-01
Let G = (V,E) be a connected graph, u,v ∈ V (G), e = uv ∈ E(G) and k be a positive integer. A k-subdivision of an edge e is a replacement of e = uv with a path u, x1, x2, x ..., xk, v. A graph G with a k-subdivided edge is denoted with S(G(e; k)). Let p be a positive integer and Π = {L1, L2, L3, …, Lp} be a p-partition of V (G). The representation of a vertex v with respect to Π, r(v|Π), is the vector (d(v, L1), d(v, L2), d(v, L3),…, d(v, Lp)) where d(v, Li) for i ∈ [1, p] is the minimum distance between v and the vertices of Li. The partition Π is called a resolving partition of G if r(w|Π) ≠ r(v|Π) for all w ≠ v ∈ V (G). The partition dimension, pd(G), of G is the smallest integer p such that G has a resolving p-partition. In this paper, we present sharp upper and lower bounds of the partition dimension of S(G(e; k)) for any graph G.
Optimal Clustering in Graphs with Weighted Edges: A Unified Approach to the Threshold Problem.
ERIC Educational Resources Information Center
Goetschel, Roy; Voxman, William
1987-01-01
Relations on a finite set V are viewed as weighted graphs. Using the language of graph theory, two methods of partitioning V are examined: selecting threshold values and applying them to a maximal weighted spanning forest, and using a parametric linear program to obtain a most adhesive partition. (Author/EM)
Graph Partitioning for Parallel Applications in Heterogeneous Grid Environments
NASA Technical Reports Server (NTRS)
Bisws, Rupak; Kumar, Shailendra; Das, Sajal K.; Biegel, Bryan (Technical Monitor)
2002-01-01
The problem of partitioning irregular graphs and meshes for parallel computations on homogeneous systems has been extensively studied. However, these partitioning schemes fail when the target system architecture exhibits heterogeneity in resource characteristics. With the emergence of technologies such as the Grid, it is imperative to study the partitioning problem taking into consideration the differing capabilities of such distributed heterogeneous systems. In our model, the heterogeneous system consists of processors with varying processing power and an underlying non-uniform communication network. We present in this paper a novel multilevel partitioning scheme for irregular graphs and meshes, that takes into account issues pertinent to Grid computing environments. Our partitioning algorithm, called MiniMax, generates and maps partitions onto a heterogeneous system with the objective of minimizing the maximum execution time of the parallel distributed application. For experimental performance study, we have considered both a realistic mesh problem from NASA as well as synthetic workloads. Simulation results demonstrate that MiniMax generates high quality partitions for various classes of applications targeted for parallel execution in a distributed heterogeneous environment.
Partitioning Algorithms for Simultaneously Balancing Iterative and Direct Methods
2004-03-03
is defined as 57698&:&;=<$>?8A@B8 DC E & /F <G8H IJ0 K L 012 1NM? which is the ratio of the highest partition weight over the average...OQPSR , 57698T:;=<$>U8T@B8 DC E & /VXWZYK[\\O , and E :^] E_CU`4ab /V is minimized. The load imbalance is the constraint we have to satisfy, and...that the initial partitioning can be improved [16, 19, 20]. 3 Problem Definition and Challenges Consider a graph )c2 with d e f vertices
Weights and topology: a study of the effects of graph construction on 3D image segmentation.
Grady, Leo; Jolly, Marie-Pierre
2008-01-01
Graph-based algorithms have become increasingly popular for medical image segmentation. The fundamental process for each of these algorithms is to use the image content to generate a set of weights for the graph and then set conditions for an optimal partition of the graph with respect to these weights. To date, the heuristics used for generating the weighted graphs from image intensities have largely been ignored, while the primary focus of attention has been on the details of providing the partitioning conditions. In this paper we empirically study the effects of graph connectivity and weighting function on the quality of the segmentation results. To control for algorithm-specific effects, we employ both the Graph Cuts and Random Walker algorithms in our experiments.
A GRAPH PARTITIONING APPROACH TO PREDICTING PATTERNS IN LATERAL INHIBITION SYSTEMS
RUFINO FERREIRA, ANA S.; ARCAK, MURAT
2017-01-01
We analyze spatial patterns on networks of cells where adjacent cells inhibit each other through contact signaling. We represent the network as a graph where each vertex represents the dynamics of identical individual cells and where graph edges represent cell-to-cell signaling. To predict steady-state patterns we find equitable partitions of the graph vertices and assign them into disjoint classes. We then use results from monotone systems theory to prove the existence of patterns that are structured in such a way that all the cells in the same class have the same final fate. To study the stability properties of these patterns, we rely on the graph partition to perform a block decomposition of the system. Then, to guarantee stability, we provide a small-gain type criterion that depends on the input-output properties of each cell in the reduced system. Finally, we discuss pattern formation in stochastic models. With the help of a modal decomposition we show that noise can enhance the parameter region where patterning occurs. PMID:29225552
Some trees with partition dimension three
NASA Astrophysics Data System (ADS)
Fredlina, Ketut Queena; Baskoro, Edy Tri
2016-02-01
The concept of partition dimension of a graph was introduced by Chartrand, E. Salehi and P. Zhang (1998) [2]. Let G(V, E) be a connected graph. For S ⊆ V (G) and v ∈ V (G), define the distance d(v, S) from v to S is min{d(v, x)|x ∈ S}. Let Π be an ordered partition of V (G) and Π = {S1, S2, ..., Sk }. The representation r(v|Π) of vertex v with respect to Π is (d(v, S1), d(v, S2), ..., d(v, Sk)). If the representations of all vertices are distinct, then the partition Π is called a resolving partition of G. The partition dimension of G is the minimum k such that G has a resolving partition with k partition classes. In this paper, we characterize some classes of trees with partition dimension three, namely olive trees, weeds, and centipedes.
Tensor Spectral Clustering for Partitioning Higher-order Network Structures.
Benson, Austin R; Gleich, David F; Leskovec, Jure
2015-01-01
Spectral graph theory-based methods represent an important class of tools for studying the structure of networks. Spectral methods are based on a first-order Markov chain derived from a random walk on the graph and thus they cannot take advantage of important higher-order network substructures such as triangles, cycles, and feed-forward loops. Here we propose a Tensor Spectral Clustering (TSC) algorithm that allows for modeling higher-order network structures in a graph partitioning framework. Our TSC algorithm allows the user to specify which higher-order network structures (cycles, feed-forward loops, etc.) should be preserved by the network clustering. Higher-order network structures of interest are represented using a tensor, which we then partition by developing a multilinear spectral method. Our framework can be applied to discovering layered flows in networks as well as graph anomaly detection, which we illustrate on synthetic networks. In directed networks, a higher-order structure of particular interest is the directed 3-cycle, which captures feedback loops in networks. We demonstrate that our TSC algorithm produces large partitions that cut fewer directed 3-cycles than standard spectral clustering algorithms.
Tensor Spectral Clustering for Partitioning Higher-order Network Structures
Benson, Austin R.; Gleich, David F.; Leskovec, Jure
2016-01-01
Spectral graph theory-based methods represent an important class of tools for studying the structure of networks. Spectral methods are based on a first-order Markov chain derived from a random walk on the graph and thus they cannot take advantage of important higher-order network substructures such as triangles, cycles, and feed-forward loops. Here we propose a Tensor Spectral Clustering (TSC) algorithm that allows for modeling higher-order network structures in a graph partitioning framework. Our TSC algorithm allows the user to specify which higher-order network structures (cycles, feed-forward loops, etc.) should be preserved by the network clustering. Higher-order network structures of interest are represented using a tensor, which we then partition by developing a multilinear spectral method. Our framework can be applied to discovering layered flows in networks as well as graph anomaly detection, which we illustrate on synthetic networks. In directed networks, a higher-order structure of particular interest is the directed 3-cycle, which captures feedback loops in networks. We demonstrate that our TSC algorithm produces large partitions that cut fewer directed 3-cycles than standard spectral clustering algorithms. PMID:27812399
Solving Multi-variate Polynomial Equations in a Finite Field
2013-06-01
Algebraic Background In this section, some algebraic definitions and basics are discussed as they pertain to this re- search. For a more detailed...definitions and basics are discussed as they pertain to this research. For a more detailed treatment, consult a graph theory text such as [10]. A graph G...graph if V(G) can be partitioned into k subsets V1,V2, ...,Vk such that uv is only an edge of G if u and v belong to different partite sets. If, in
Architecture Aware Partitioning Algorithms
2006-01-19
follows: Given a graph G = (V, E ), where V is the set of vertices, n = |V | is the number of vertices, and E is the set of edges in the graph, partition the...communication link l(pi, pj) is associated with a graph edge weight e ∗(pi, pj) that represents the communication cost per unit of communication between...one that is local for each one. For our model we assume that communication in either direction across a given link is the same, therefore e ∗(pi, pj
Controlling bi-partite entanglement in multi-qubit systems
NASA Astrophysics Data System (ADS)
Plesch, Martin; Novotný, Jaroslav; Dzuráková, Zuzana; Buzek, Vladimír
2004-02-01
Bi-partite entanglement in multi-qubit systems cannot be shared freely. The rules of quantum mechanics impose bounds on how multi-qubit systems can be correlated. In this paper, we utilize a concept of entangled graphs with weighted edges in order to analyse pure quantum states of multi-qubit systems. Here qubits are represented by vertexes of the graph, while the presence of bi-partite entanglement is represented by an edge between corresponding vertexes. The weight of each edge is defined to be the entanglement between the two qubits connected by the edge, as measured by the concurrence. We prove that each entangled graph with entanglement bounded by a specific value of the concurrence can be represented by a pure multi-qubit state. In addition, we present a logic network with O(N2) elementary gates that can be used for preparation of the weighted entangled graphs of N qubits.
Liu, Yuangang; Guo, Qingsheng; Sun, Yageng; Ma, Xiaoya
2014-01-01
Scale reduction from source to target maps inevitably leads to conflicts of map symbols in cartography and geographic information systems (GIS). Displacement is one of the most important map generalization operators and it can be used to resolve the problems that arise from conflict among two or more map objects. In this paper, we propose a combined approach based on constraint Delaunay triangulation (CDT) skeleton and improved elastic beam algorithm for automated building displacement. In this approach, map data sets are first partitioned. Then the displacement operation is conducted in each partition as a cyclic and iterative process of conflict detection and resolution. In the iteration, the skeleton of the gap spaces is extracted using CDT. It then serves as an enhanced data model to detect conflicts and construct the proximity graph. Then, the proximity graph is adjusted using local grouping information. Under the action of forces derived from the detected conflicts, the proximity graph is deformed using the improved elastic beam algorithm. In this way, buildings are displaced to find an optimal compromise between related cartographic constraints. To validate this approach, two topographic map data sets (i.e., urban and suburban areas) were tested. The results were reasonable with respect to each constraint when the density of the map was not extremely high. In summary, the improvements include (1) an automated parameter-setting method for elastic beams, (2) explicit enforcement regarding the positional accuracy constraint, added by introducing drag forces, (3) preservation of local building groups through displacement over an adjusted proximity graph, and (4) an iterative strategy that is more likely to resolve the proximity conflicts than the one used in the existing elastic beam algorithm. PMID:25470727
On the star partition dimension of comb product of cycle and complete graph
NASA Astrophysics Data System (ADS)
Alfarisi, Ridho; Darmaji; Dafik
2017-06-01
Let G = (V, E) be a connected graphs with vertex set V (G), edge set E(G) and S ⊆ V (G). For an ordered partition Π = {S 1, S 2, S 3, …, Sk } of V (G), the representation of a vertex v ∈ V (G) with respect to Π is the k-vectors r(v|Π) = (d(v, S 1), d(v, S 2), …, d(v, Sk )), where d(v, Sk ) represents the distance between the vertex v and the set Sk , defined by d(v, Sk ) = min{d(v, x)|x ∈ Sk}. The partition Π of V (G) is a resolving partition if the k-vektors r(v|Π), v ∈ V (G) are distinct. The minimum resolving partition Π is a partition dimension of G, denoted by pd(G). The resolving partition Π = {S 1, S 2, S 3, …, Sk} is called a star resolving partition for G if it is a resolving partition and each subgraph induced by Si , 1 ≤ i ≤ k, is a star. The minimum k for which there exists a star resolving partition of V (G) is the star partition dimension of G, denoted by spd(G). Finding a star partition dimension of G is classified to be a NP-Hard problem. Furthermore, the comb product between G and H, denoted by G ⊲ H, is a graph obtained by taking one copy of G and |V (G)| copies of H and grafting the i-th copy of H at the vertex o to the i-th vertex of G. By definition of comb product, we can say that V (G ⊲ H) = {(a, u)|a ∈ V (G), u ∈ V (H)} and (a, u)(b, v) ∈ E(G ⊲ H) whenever a = b and uv ∈ E(H), or ab ∈ E(G) and u = v = o. In this paper, we will study the star partition dimension of comb product of cycle and complete graph, namely Cn ⊲ Km and Km ⊲ Cn for n ≥ 3 and m ≥ 3.
NASA Astrophysics Data System (ADS)
Kruglov, V. E.; Malyshev, D. S.; Pochinka, O. V.
2018-01-01
Studying the dynamics of a flow on surfaces by partitioning the phase space into cells with the same limit behaviour of trajectories within a cell goes back to the classical papers of Andronov, Pontryagin, Leontovich and Maier. The types of cells (the number of which is finite) and how the cells adjoin one another completely determine the topological equivalence class of a flow with finitely many special trajectories. If one trajectory is chosen in every cell of a rough flow without periodic orbits, then the cells are partitioned into so-called triangular regions of the same type. A combinatorial description of such a partition gives rise to the three-colour Oshemkov-Sharko graph, the vertices of which correspond to the triangular regions, and the edges to separatrices connecting them. Oshemkov and Sharko proved that such flows are topologically equivalent if and only if the three-colour graphs of the flows are isomorphic, and described an algorithm of distinguishing three-colour graphs. But their algorithm is not efficient with respect to graph theory. In the present paper, we describe the dynamics of Ω-stable flows without periodic trajectories on surfaces in the language of four-colour graphs, present an efficient algorithm for distinguishing such graphs, and develop a realization of a flow from some abstract graph. Bibliography: 17 titles.
On the partition dimension of comb product of path and complete graph
NASA Astrophysics Data System (ADS)
Darmaji, Alfarisi, Ridho
2017-08-01
For a vertex v of a connected graph G(V, E) with vertex set V(G), edge set E(G) and S ⊆ V(G). Given an ordered partition Π = {S1, S2, S3, …, Sk} of the vertex set V of G, the representation of a vertex v ∈ V with respect to Π is the vector r(v|Π) = (d(v, S1), d(v, S2), …, d(v, Sk)), where d(v, Sk) represents the distance between the vertex v and the set Sk and d(v, Sk) = min{d(v, x)|x ∈ Sk}. A partition Π of V(G) is a resolving partition if different vertices of G have distinct representations, i.e., for every pair of vertices u, v ∈ V(G), r(u|Π) ≠ r(v|Π). The minimum k of Π resolving partition is a partition dimension of G, denoted by pd(G). Finding the partition dimension of G is classified to be a NP-Hard problem. In this paper, we will show that the partition dimension of comb product of path and complete graph. The results show that comb product of complete grapph Km and path Pn namely p d (Km⊳Pn)=m where m ≥ 3 and n ≥ 2 and p d (Pn⊳Km)=m where m ≥ 3, n ≥ 2 and m ≥ n.
Persona, Marek; Kutarov, Vladimir V; Kats, Boris M; Persona, Andrzej; Marczewska, Barbara
2007-01-01
The paper describes the new prediction method of octanol-water partition coefficient, which is based on molecular graph theory. The results obtained using the new method are well correlated with experimental values. These results were compared with the ones obtained by use of ten other structure correlated methods. The comparison shows that graph theory can be very useful in structure correlation research.
Applying graph partitioning methods in measurement-based dynamic load balancing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bhatele, Abhinav; Fourestier, Sebastien; Menon, Harshitha
Load imbalance leads to an increasing waste of resources as an application is scaled to more and more processors. Achieving the best parallel efficiency for a program requires optimal load balancing which is a NP-hard problem. However, finding near-optimal solutions to this problem for complex computational science and engineering applications is becoming increasingly important. Charm++, a migratable objects based programming model, provides a measurement-based dynamic load balancing framework. This framework instruments and then migrates over-decomposed objects to balance computational load and communication at runtime. This paper explores the use of graph partitioning algorithms, traditionally used for partitioning physical domains/meshes, formore » measurement-based dynamic load balancing of parallel applications. In particular, we present repartitioning methods developed in a graph partitioning toolbox called SCOTCH that consider the previous mapping to minimize migration costs. We also discuss a new imbalance reduction algorithm for graphs with irregular load distributions. We compare several load balancing algorithms using microbenchmarks on Intrepid and Ranger and evaluate the effect of communication, number of cores and number of objects on the benefit achieved from load balancing. New algorithms developed in SCOTCH lead to better performance compared to the METIS partitioners for several cases, both in terms of the application execution time and fewer number of objects migrated.« less
Dynamic Load Balancing for Adaptive Computations on Distributed-Memory Machines
NASA Technical Reports Server (NTRS)
1999-01-01
Dynamic load balancing is central to adaptive mesh-based computations on large-scale parallel computers. The principal investigator has investigated various issues on the dynamic load balancing problem under NASA JOVE and JAG rants. The major accomplishments of the project are two graph partitioning algorithms and a load balancing framework. The S-HARP dynamic graph partitioner is known to be the fastest among the known dynamic graph partitioners to date. It can partition a graph of over 100,000 vertices in 0.25 seconds on a 64- processor Cray T3E distributed-memory multiprocessor while maintaining the scalability of over 16-fold speedup. Other known and widely used dynamic graph partitioners take over a second or two while giving low scalability of a few fold speedup on 64 processors. These results have been published in journals and peer-reviewed flagship conferences.
Song, Youyi; Zhang, Ling; Chen, Siping; Ni, Dong; Lei, Baiying; Wang, Tianfu
2015-10-01
In this paper, a multiscale convolutional network (MSCN) and graph-partitioning-based method is proposed for accurate segmentation of cervical cytoplasm and nuclei. Specifically, deep learning via the MSCN is explored to extract scale invariant features, and then, segment regions centered at each pixel. The coarse segmentation is refined by an automated graph partitioning method based on the pretrained feature. The texture, shape, and contextual information of the target objects are learned to localize the appearance of distinctive boundary, which is also explored to generate markers to split the touching nuclei. For further refinement of the segmentation, a coarse-to-fine nucleus segmentation framework is developed. The computational complexity of the segmentation is reduced by using superpixel instead of raw pixels. Extensive experimental results demonstrate that the proposed cervical nucleus cell segmentation delivers promising results and outperforms existing methods.
Residuals-Based Subgraph Detection with Cue Vertices
2015-11-30
Workshop, 2012, pp. 129–132. [5] M. E. J. Newman , “Finding community structure in networks using the eigenvectors of matrices,” Phys. Rev. E, vol. 74, no...from Data, vol. 1, no. 1, 2007. [7] M. W. Mahoney , L. Orecchia, and N. K. Vishnoi, “A spectral algorithm for improving graph partitions,” CoRR, vol. abs
Repartitioning Strategies for Massively Parallel Simulation of Reacting Flow
NASA Astrophysics Data System (ADS)
Pisciuneri, Patrick; Zheng, Angen; Givi, Peyman; Labrinidis, Alexandros; Chrysanthis, Panos
2015-11-01
The majority of parallel CFD simulators partition the domain into equal regions and assign the calculations for a particular region to a unique processor. This type of domain decomposition is vital to the efficiency of the solver. However, as the simulation develops, the workload among the partitions often become uneven (e.g. by adaptive mesh refinement, or chemically reacting regions) and a new partition should be considered. The process of repartitioning adjusts the current partition to evenly distribute the load again. We compare two repartitioning tools: Zoltan, an architecture-agnostic graph repartitioner developed at the Sandia National Laboratories; and Paragon, an architecture-aware graph repartitioner developed at the University of Pittsburgh. The comparative assessment is conducted via simulation of the Taylor-Green vortex flow with chemical reaction.
Sun, Peng; Guo, Jiong; Baumbach, Jan
2012-07-17
The explosion of biological data has largely influenced the focus of today’s biology research. Integrating and analysing large quantity of data to provide meaningful insights has become the main challenge to biologists and bioinformaticians. One major problem is the combined data analysis of data from different types, such as phenotypes and genotypes. This data is modelled as bi-partite graphs where nodes correspond to the different data points, mutations and diseases for instance, and weighted edges relate to associations between them. Bi-clustering is a special case of clustering designed for partitioning two different types of data simultaneously. We present a bi-clustering approach that solves the NP-hard weighted bi-cluster editing problem by transforming a given bi-partite graph into a disjoint union of bi-cliques. Here we contribute with an exact algorithm that is based on fixed-parameter tractability. We evaluated its performance on artificial graphs first. Afterwards we exemplarily applied our Java implementation to data of genome-wide association studies (GWAS) data aiming for discovering new, previously unobserved geno-to-pheno associations. We believe that our results will serve as guidelines for further wet lab investigations. Generally our software can be applied to any kind of data that can be modelled as bi-partite graphs. To our knowledge it is the fastest exact method for weighted bi-cluster editing problem.
Sun, Peng; Guo, Jiong; Baumbach, Jan
2012-06-01
The explosion of biological data has largely influenced the focus of today's biology research. Integrating and analysing large quantity of data to provide meaningful insights has become the main challenge to biologists and bioinformaticians. One major problem is the combined data analysis of data from different types, such as phenotypes and genotypes. This data is modelled as bi-partite graphs where nodes correspond to the different data points, mutations and diseases for instance, and weighted edges relate to associations between them. Bi-clustering is a special case of clustering designed for partitioning two different types of data simultaneously. We present a bi-clustering approach that solves the NP-hard weighted bi-cluster editing problem by transforming a given bi-partite graph into a disjoint union of bi-cliques. Here we contribute with an exact algorithm that is based on fixed-parameter tractability. We evaluated its performance on artificial graphs first. Afterwards we exemplarily applied our Java implementation to data of genome-wide association studies (GWAS) data aiming for discovering new, previously unobserved geno-to-pheno associations. We believe that our results will serve as guidelines for further wet lab investigations. Generally our software can be applied to any kind of data that can be modelled as bi-partite graphs. To our knowledge it is the fastest exact method for weighted bi-cluster editing problem.
Overlapping clusters for distributed computation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mirrokni, Vahab; Andersen, Reid; Gleich, David F.
2010-11-01
Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initialmore » partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al, Nature 2009; Mishra et al. WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability {rho}{infinity}; and (ii) the communication volume of a parallel PageRank solution (link-following {alpha} = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decreases.« less
On the locating-chromatic number for graphs with two homogenous components
NASA Astrophysics Data System (ADS)
Welyyanti, Des; Baskoro, Edy Tri; Simajuntak, Rinovia; Uttunggadewa, Saladin
2017-10-01
The locating-chromatic number of a graph was introduced by Chartrand et al. in 2002. The concept of the locating-chromatic number is a marriage between graph coloring and the notion of graph partition dimension. This concept is only for connected graphs. In [8], we extended this concept also for disconnected graphs. In this paper, we determine the locating- chromatic number of a graph with two components. In particular, we determine such values if the components are homogeneous and each component has locating-chromatic number 3.
Graph partitions and cluster synchronization in networks of oscillators
Schaub, Michael T.; O’Clery, Neave; Billeh, Yazan N.; Delvenne, Jean-Charles; Lambiotte, Renaud; Barahona, Mauricio
2017-01-01
Synchronization over networks depends strongly on the structure of the coupling between the oscillators. When the coupling presents certain regularities, the dynamics can be coarse-grained into clusters by means of External Equitable Partitions of the network graph and their associated quotient graphs. We exploit this graph-theoretical concept to study the phenomenon of cluster synchronization, in which different groups of nodes converge to distinct behaviors. We derive conditions and properties of networks in which such clustered behavior emerges, and show that the ensuing dynamics is the result of the localization of the eigenvectors of the associated graph Laplacians linked to the existence of invariant subspaces. The framework is applied to both linear and non-linear models, first for the standard case of networks with positive edges, before being generalized to the case of signed networks with both positive and negative interactions. We illustrate our results with examples of both signed and unsigned graphs for consensus dynamics and for partial synchronization of oscillator networks under the master stability function as well as Kuramoto oscillators. PMID:27781454
A Random Walk Approach to Query Informative Constraints for Clustering.
Abin, Ahmad Ali
2017-08-09
This paper presents a random walk approach to the problem of querying informative constraints for clustering. The proposed method is based on the properties of the commute time, that is the expected time taken for a random walk to travel between two nodes and return, on the adjacency graph of data. Commute time has the nice property of that, the more short paths connect two given nodes in a graph, the more similar those nodes are. Since computing the commute time takes the Laplacian eigenspectrum into account, we use this property in a recursive fashion to query informative constraints for clustering. At each recursion, the proposed method constructs the adjacency graph of data and utilizes the spectral properties of the commute time matrix to bipartition the adjacency graph. Thereafter, the proposed method benefits from the commute times distance on graph to query informative constraints between partitions. This process iterates for each partition until the stop condition becomes true. Experiments on real-world data show the efficiency of the proposed method for constraints selection.
Evolving bipartite authentication graph partitions
Pope, Aaron Scott; Tauritz, Daniel Remy; Kent, Alexander D.
2017-01-16
As large scale enterprise computer networks become more ubiquitous, finding the appropriate balance between user convenience and user access control is an increasingly challenging proposition. Suboptimal partitioning of users’ access and available services contributes to the vulnerability of enterprise networks. Previous edge-cut partitioning methods unduly restrict users’ access to network resources. This paper introduces a novel method of network partitioning superior to the current state-of-the-art which minimizes user impact by providing alternate avenues for access that reduce vulnerability. Networks are modeled as bipartite authentication access graphs and a multi-objective evolutionary algorithm is used to simultaneously minimize the size of largemore » connected components while minimizing overall restrictions on network users. Lastly, results are presented on a real world data set that demonstrate the effectiveness of the introduced method compared to previous naive methods.« less
Evolving bipartite authentication graph partitions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pope, Aaron Scott; Tauritz, Daniel Remy; Kent, Alexander D.
As large scale enterprise computer networks become more ubiquitous, finding the appropriate balance between user convenience and user access control is an increasingly challenging proposition. Suboptimal partitioning of users’ access and available services contributes to the vulnerability of enterprise networks. Previous edge-cut partitioning methods unduly restrict users’ access to network resources. This paper introduces a novel method of network partitioning superior to the current state-of-the-art which minimizes user impact by providing alternate avenues for access that reduce vulnerability. Networks are modeled as bipartite authentication access graphs and a multi-objective evolutionary algorithm is used to simultaneously minimize the size of largemore » connected components while minimizing overall restrictions on network users. Lastly, results are presented on a real world data set that demonstrate the effectiveness of the introduced method compared to previous naive methods.« less
Convergence of Mayer and Virial expansions and the Penrose tree-graph identity
NASA Astrophysics Data System (ADS)
Procacci, Aldo; Yuhjtman, Sergio A.
2017-01-01
We establish new lower bounds for the convergence radius of the Mayer series and the Virial series of a continuous particle system interacting via a stable and tempered pair potential. Our bounds considerably improve those given by Penrose (J Math Phys 4:1312, 1963) and Ruelle (Ann Phys 5:109-120, 1963) for the Mayer series and by Lebowitz and Penrose (J Math Phys 7:841-847, 1964) for the Virial series. To get our results, we exploit the tree-graph identity given by Penrose (Statistical mechanics: foundations and applications. Benjamin, New York, 1967) using a new partition scheme based on minimum spanning trees.
Li, Qu; Yao, Min; Yang, Jianhua; Xu, Ning
2014-01-01
Online friend recommendation is a fast developing topic in web mining. In this paper, we used SVD matrix factorization to model user and item feature vector and used stochastic gradient descent to amend parameter and improve accuracy. To tackle cold start problem and data sparsity, we used KNN model to influence user feature vector. At the same time, we used graph theory to partition communities with fairly low time and space complexity. What is more, matrix factorization can combine online and offline recommendation. Experiments showed that the hybrid recommendation algorithm is able to recommend online friends with good accuracy.
Convergence Analysis of the Graph Allen-Cahn Scheme
2016-02-01
CONVERGENCE ANALYSIS OF THE GRAPH ALLEN-CAHN SCHEME ∗ XIYANG LUO† AND ANDREA L. BERTOZZI† Abstract. Graph partitioning problems have a wide range of...optimization, convergence and monotonicity are shown for a class of schemes under a graph-independent timestep restriction. We also analyze the effects of...spectral truncation, a common technique used to save computational cost. Convergence of the scheme with spectral truncation is also proved under a
Dynamic airspace configuration algorithms for next generation air transportation system
NASA Astrophysics Data System (ADS)
Wei, Jian
The National Airspace System (NAS) is under great pressure to safely and efficiently handle the record-high air traffic volume nowadays, and will face even greater challenge to keep pace with the steady increase of future air travel demand, since the air travel demand is projected to increase to two to three times the current level by 2025. The inefficiency of traffic flow management initiatives causes severe airspace congestion and frequent flight delays, which cost billions of economic losses every year. To address the increasingly severe airspace congestion and delays, the Next Generation Air Transportation System (NextGen) is proposed to transform the current static and rigid radar based system to a dynamic and flexible satellite based system. New operational concepts such as Dynamic Airspace Configuration (DAC) have been under development to allow more flexibility required to mitigate the demand-capacity imbalances in order to increase the throughput of the entire NAS. In this dissertation, we address the DAC problem in the en route and terminal airspace under the framework of NextGen. We develop a series of algorithms to facilitate the implementation of innovative concepts relevant with DAC in both the en route and terminal airspace. We also develop a performance evaluation framework for comprehensive benefit analyses on different aspects of future sector design algorithms. First, we complete a graph based sectorization algorithm for DAC in the en route airspace, which models the underlying air route network with a weighted graph, converts the sectorization problem into the graph partition problem, partitions the weighted graph with an iterative spectral bipartition method, and constructs the sectors from the partitioned graph. The algorithm uses a graph model to accurately capture the complex traffic patterns of the real flights, and generates sectors with high efficiency while evenly distributing the workload among the generated sectors. We further improve the robustness and efficiency of the graph based DAC algorithm by incorporating the Multilevel Graph Partitioning (MGP) method into the graph model, and develop a MGP based sectorization algorithm for DAC in the en route airspace. In a comprehensive benefit analysis, the performance of the proposed algorithms are tested in numerical simulations with Enhanced Traffic Management System (ETMS) data. Simulation results demonstrate that the algorithmically generated sectorizations outperform the current sectorizations in different sectors for different time periods. Secondly, based on our experience with DAC in the en route airspace, we further study the sectorization problem for DAC in the terminal airspace. The differences between the en route and terminal airspace are identified, and their influence on the terminal sectorization is analyzed. After adjusting the graph model to better capture the unique characteristics of the terminal airspace and the requirements of terminal sectorization, we develop a graph based geometric sectorization algorithm for DAC in the terminal airspace. Moreover, the graph based model is combined with the region based sector design method to better handle the complicated geometric and operational constraints in the terminal sectorization problem. In the benefit analysis, we identify the contributing factors to terminal controller workload, define evaluation metrics, and develop a bebefit analysis framework for terminal sectorization evaluation. With the evaluation framework developed, we demonstrate the improvements on the current sectorizations with real traffic data collected from several major international airports in the U.S., and conduct a detailed analysis on the potential benefits of dynamic reconfiguration in the terminal airspace. Finally, in addition to the research on the macroscopic behavior of a large number of aircraft, we also study the dynamical behavior of individual aircraft from the perspective of traffic flow management. We formulate the mode-confusion problem as hybrid estimation problem, and develop a state estimation algorithm for the linear hybrid system with continuous-state-dependent transitions based on sparse observations. We also develop an estimated time of arrival prediction algorithm based on the state-dependent transition hybrid estimation algorithm, whose performance is demonstrated with simulations on the landing procedure following the Continuous Descend Approach (CDA) profile.
Reducing vertices in property graphs
Pąk, Karol
2018-01-01
Graph databases are constantly growing, and, at the same time, some of their data is the same or similar. Our experience with the management of the existing databases, especially the bigger ones, shows that certain vertices are particularly replicated there numerous times. Eliminating repetitive or even very similar data speeds up the access to database resources. We present a modification of this approach, where similarly we group together vertices of identical properties, but then additionally we join together groups of data that are located in distant parts of a graph. The second part of our approach is non-trivial. We show that the search for a partition of a given graph where each member of the partition has only pairwise distant vertices is NP-hard. We indicate a group of heuristics that try to solve our difficult computational problems and then we apply them to check the the effectiveness of our approach. PMID:29444127
Improving Design Efficiency for Large-Scale Heterogeneous Circuits
NASA Astrophysics Data System (ADS)
Gregerson, Anthony
Despite increases in logic density, many Big Data applications must still be partitioned across multiple computing devices in order to meet their strict performance requirements. Among the most demanding of these applications is high-energy physics (HEP), which uses complex computing systems consisting of thousands of FPGAs and ASICs to process the sensor data created by experiments at particles accelerators such as the Large Hadron Collider (LHC). Designing such computing systems is challenging due to the scale of the systems, the exceptionally high-throughput and low-latency performance constraints that necessitate application-specific hardware implementations, the requirement that algorithms are efficiently partitioned across many devices, and the possible need to update the implemented algorithms during the lifetime of the system. In this work, we describe our research to develop flexible architectures for implementing such large-scale circuits on FPGAs. In particular, this work is motivated by (but not limited in scope to) high-energy physics algorithms for the Compact Muon Solenoid (CMS) experiment at the LHC. To make efficient use of logic resources in multi-FPGA systems, we introduce Multi-Personality Partitioning, a novel form of the graph partitioning problem, and present partitioning algorithms that can significantly improve resource utilization on heterogeneous devices while also reducing inter-chip connections. To reduce the high communication costs of Big Data applications, we also introduce Information-Aware Partitioning, a partitioning method that analyzes the data content of application-specific circuits, characterizes their entropy, and selects circuit partitions that enable efficient compression of data between chips. We employ our information-aware partitioning method to improve the performance of the hardware validation platform for evaluating new algorithms for the CMS experiment. Together, these research efforts help to improve the efficiency and decrease the cost of the developing large-scale, heterogeneous circuits needed to enable large-scale application in high-energy physics and other important areas.
Yan, Bo; Pan, Chongle; Olman, Victor N; Hettich, Robert L; Xu, Ying
2004-01-01
Mass spectrometry is one of the most popular analytical techniques for identification of individual proteins in a protein mixture, one of the basic problems in proteomics. It identifies a protein through identifying its unique mass spectral pattern. While the problem is theoretically solvable, it remains a challenging problem computationally. One of the key challenges comes from the difficulty in distinguishing the N- and C-terminus ions, mostly b- and y-ions respectively. In this paper, we present a graph algorithm for solving the problem of separating bfrom y-ions in a set of mass spectra. We represent each spectral peak as a node and consider two types of edges: a type-1 edge connects two peaks possibly of the same ion types and a type-2 edge connects two peaks possibly of different ion types, predicted based on local information. The ion-separation problem is then formulated and solved as a graph partition problem, which is to partition the graph into three subgraphs, namely b-, y-ions and others respectively, so to maximize the total weight of type-1 edges while minimizing the total weight of type-2 edges within each subgraph. We have developed a dynamic programming algorithm for rigorously solving this graph partition problem and implemented it as a computer program PRIME. We have tested PRIME on 18 data sets of high accurate FT-ICR tandem mass spectra and found that it achieved ~90% accuracy for separation of b- and y- ions.
NASA Astrophysics Data System (ADS)
Gutin, Gregory; Kim, Eun Jung; Soleimanfallah, Arezou; Szeider, Stefan; Yeo, Anders
The NP-hard general factor problem asks, given a graph and for each vertex a list of integers, whether the graph has a spanning subgraph where each vertex has a degree that belongs to its assigned list. The problem remains NP-hard even if the given graph is bipartite with partition U ⊎ V, and each vertex in U is assigned the list {1}; this subproblem appears in the context of constraint programming as the consistency problem for the extended global cardinality constraint. We show that this subproblem is fixed-parameter tractable when parameterized by the size of the second partite set V. More generally, we show that the general factor problem for bipartite graphs, parameterized by |V |, is fixed-parameter tractable as long as all vertices in U are assigned lists of length 1, but becomes W[1]-hard if vertices in U are assigned lists of length at most 2. We establish fixed-parameter tractability by reducing the problem instance to a bounded number of acyclic instances, each of which can be solved in polynomial time by dynamic programming.
NASA Technical Reports Server (NTRS)
Bokhari, Shahid H.; Crockett, Thomas W.; Nicol, David M.
1993-01-01
Binary dissection is widely used to partition non-uniform domains over parallel computers. This algorithm does not consider the perimeter, surface area, or aspect ratio of the regions being generated and can yield decompositions that have poor communication to computation ratio. Parametric Binary Dissection (PBD) is a new algorithm in which each cut is chosen to minimize load + lambda x(shape). In a 2 (or 3) dimensional problem, load is the amount of computation to be performed in a subregion and shape could refer to the perimeter (respectively surface) of that subregion. Shape is a measure of communication overhead and the parameter permits us to trade off load imbalance against communication overhead. When A is zero, the algorithm reduces to plain binary dissection. This algorithm can be used to partition graphs embedded in 2 or 3-d. Load is the number of nodes in a subregion, shape the number of edges that leave that subregion, and lambda the ratio of time to communicate over an edge to the time to compute at a node. An algorithm is presented that finds the depth d parametric dissection of an embedded graph with n vertices and e edges in O(max(n log n, de)) time, which is an improvement over the O(dn log n) time of plain binary dissection. Parallel versions of this algorithm are also presented; the best of these requires O((n/p) log(sup 3)p) time on a p processor hypercube, assuming graphs of bounded degree. How PBD is applied to 3-d unstructured meshes and yields partitions that are better than those obtained by plain dissection is described. Its application to the color image quantization problem is also discussed, in which samples in a high-resolution color space are mapped onto a lower resolution space in a way that minimizes the color error.
Visibility graphs and symbolic dynamics
NASA Astrophysics Data System (ADS)
Lacasa, Lucas; Just, Wolfram
2018-07-01
Visibility algorithms are a family of geometric and ordering criteria by which a real-valued time series of N data is mapped into a graph of N nodes. This graph has been shown to often inherit in its topology nontrivial properties of the series structure, and can thus be seen as a combinatorial representation of a dynamical system. Here we explore in some detail the relation between visibility graphs and symbolic dynamics. To do that, we consider the degree sequence of horizontal visibility graphs generated by the one-parameter logistic map, for a range of values of the parameter for which the map shows chaotic behaviour. Numerically, we observe that in the chaotic region the block entropies of these sequences systematically converge to the Lyapunov exponent of the time series. Hence, Pesin's identity suggests that these block entropies are converging to the Kolmogorov-Sinai entropy of the physical measure, which ultimately suggests that the algorithm is implicitly and adaptively constructing phase space partitions which might have the generating property. To give analytical insight, we explore the relation k(x) , x ∈ [ 0 , 1 ] that, for a given datum with value x, assigns in graph space a node with degree k. In the case of the out-degree sequence, such relation is indeed a piece-wise constant function. By making use of explicit methods and tools from symbolic dynamics we are able to analytically show that the algorithm indeed performs an effective partition of the phase space and that such partition is naturally expressed as a countable union of subintervals, where the endpoints of each subinterval are related to the fixed point structure of the iterates of the map and the subinterval enumeration is associated with particular ordering structures that we called motifs.
ERIC Educational Resources Information Center
Brusco, Michael; Steinley, Douglas
2010-01-01
Structural balance theory (SBT) has maintained a venerable status in the psychological literature for more than 5 decades. One important problem pertaining to SBT is the approximation of structural or generalized balance via the partitioning of the vertices of a signed graph into "K" clusters. This "K"-balance partitioning problem also has more…
Hajnal, A.
1971-01-01
If the continuum hypothesis is assumed, there is a graph G whose vertices form an ordered set of type ω12; G does not contain triangles or complete even graphs of form [[unk]0,[unk]0], and there is no independent subset of vertices of type ω12. PMID:16591893
Acceleration of Binding Site Comparisons by Graph Partitioning.
Krotzky, Timo; Klebe, Gerhard
2015-08-01
The comparison of protein binding sites is a prominent task in computational chemistry and has been studied in many different ways. For the automatic detection and comparison of putative binding cavities the Cavbase system has been developed which uses a coarse-grained set of pseudocenters to represent the physicochemical properties of a binding site and employs a graph-based procedure to calculate similarities between two binding sites. However, the comparison of two graphs is computationally quite demanding which makes large-scale studies such as the rapid screening of entire databases hardly feasible. In a recent work, we proposed the method Local Cliques (LC) for the efficient comparison of Cavbase binding sites. It employs a clique heuristic to detect the maximum common subgraph of two binding sites and an extended graph model to additionally compare the shape of individual surface patches. In this study, we present an alternative to further accelerate the LC method by partitioning the binding-site graphs into disjoint components prior to their comparisons. The pseudocenter sets are split with regard to their assigned phyiscochemical type, which leads to seven much smaller graphs than the original one. Applying this approach on the same test scenarios as in the former comprehensive way results in a significant speed-up without sacrificing accuracy. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Venous tree separation in the liver: graph partitioning using a non-ising model.
O'Donnell, Thomas; Kaftan, Jens N; Schuh, Andreas; Tietjen, Christian; Soza, Grzegorz; Aach, Til
2011-01-01
Entangled tree-like vascular systems are commonly found in the body (e.g., in the peripheries and lungs). Separation of these systems in medical images may be formulated as a graph partitioning problem given an imperfect segmentation and specification of the tree roots. In this work, we show that the ubiquitous Ising-model approaches (e.g., Graph Cuts, Random Walker) are not appropriate for tackling this problem and propose a novel method based on recursive minimal paths for doing so. To motivate our method, we focus on the intertwined portal and hepatic venous systems in the liver. Separation of these systems is critical for liver intervention planning, in particular when resection is involved. We apply our method to 34 clinical datasets, each containing well over a hundred vessel branches, demonstrating its effectiveness.
NASA Technical Reports Server (NTRS)
Collins, Oliver (Inventor); Dolinar, Jr., Samuel J. (Inventor); Hus, In-Shek (Inventor); Bozzola, Fabrizio P. (Inventor); Olson, Erlend M. (Inventor); Statman, Joseph I. (Inventor); Zimmerman, George A. (Inventor)
1991-01-01
A method of formulating and packaging decision-making elements into a long constraint length Viterbi decoder which involves formulating the decision-making processors as individual Viterbi butterfly processors that are interconnected in a deBruijn graph configuration. A fully distributed architecture, which achieves high decoding speeds, is made feasible by novel wiring and partitioning of the state diagram. This partitioning defines universal modules, which can be used to build any size decoder, such that a large number of wires is contained inside each module, and a small number of wires is needed to connect modules. The total system is modular and hierarchical, and it implements a large proportion of the required wiring internally within modules and may include some external wiring to fully complete the deBruijn graph. pg,14.
A graph based algorithm for adaptable dynamic airspace configuration for NextGen
NASA Astrophysics Data System (ADS)
Savai, Mehernaz P.
The National Airspace System (NAS) is a complicated large-scale aviation network, consisting of many static sectors wherein each sector is controlled by one or more controllers. The main purpose of the NAS is to enable safe and prompt air travel in the U.S. However, such static configuration of sectors will not be able to handle the continued growth of air travel which is projected to be more than double the current traffic by 2025. Under the initiative of the Next Generation of Air Transportation system (NextGen), the main objective of Adaptable Dynamic Airspace Configuration (ADAC) is that the sectors should change to the changing traffic so as to reduce the controller workload variance with time while increasing the throughput. Change in the resectorization should be such that there is a minimal increase in exchange of air traffic among controllers. The benefit of a new design (improvement in workload balance, etc.) should sufficiently exceed the transition cost, in order to deserve a change. This leads to the analysis of the concept of transition workload which is the cost associated with a transition from one sectorization to another. Given two airspace configurations, a transition workload metric which considers the air traffic as well as the geometry of the airspace is proposed. A solution to reduce this transition workload is also discussed. The algorithm is specifically designed to be implemented for the Dynamic Airspace Configuration (DAC) Algorithm. A graph model which accurately represents the air route structure and air traffic in the NAS is used to formulate the airspace configuration problem. In addition, a multilevel graph partitioning algorithm is developed for Dynamic Airspace Configuration which partitions the graph model of airspace with given user defined constraints and hence provides the user more flexibility and control over various partitions. In terms of air traffic management, vertices represent airports and waypoints. Some of the major (busy) airports need to be given more importance and hence treated separately. Thus the algorithm takes into account the air route structure while finding a balance between sector workloads. The performance of the proposed algorithms and performance metrics is validated with the Enhanced Traffic Management System (ETMS) air traffic data.
F-RAG: Generating Atomic Coordinates from RNA Graphs by Fragment Assembly.
Jain, Swati; Schlick, Tamar
2017-11-24
Coarse-grained models represent attractive approaches to analyze and simulate ribonucleic acid (RNA) molecules, for example, for structure prediction and design, as they simplify the RNA structure to reduce the conformational search space. Our structure prediction protocol RAGTOP (RNA-As-Graphs Topology Prediction) represents RNA structures as tree graphs and samples graph topologies to produce candidate graphs. However, for a more detailed study and analysis, construction of atomic from coarse-grained models is required. Here we present our graph-based fragment assembly algorithm (F-RAG) to convert candidate three-dimensional (3D) tree graph models, produced by RAGTOP into atomic structures. We use our related RAG-3D utilities to partition graphs into subgraphs and search for structurally similar atomic fragments in a data set of RNA 3D structures. The fragments are edited and superimposed using common residues, full atomic models are scored using RAGTOP's knowledge-based potential, and geometries of top scoring models is optimized. To evaluate our models, we assess all-atom RMSDs and Interaction Network Fidelity (a measure of residue interactions) with respect to experimentally solved structures and compare our results to other fragment assembly programs. For a set of 50 RNA structures, we obtain atomic models with reasonable geometries and interactions, particularly good for RNAs containing junctions. Additional improvements to our protocol and databases are outlined. These results provide a good foundation for further work on RNA structure prediction and design applications. Copyright © 2017 Elsevier Ltd. All rights reserved.
Feynman graphs and the large dimensional limit of multipartite entanglement
NASA Astrophysics Data System (ADS)
Di Martino, Sara; Facchi, Paolo; Florio, Giuseppe
2018-01-01
In this paper, we extend the analysis of multipartite entanglement, based on techniques from classical statistical mechanics, to a system composed of n d-level parties (qudits). We introduce a suitable partition function at a fictitious temperature with the average local purity of the system as Hamiltonian. In particular, we analyze the high-temperature expansion of this partition function, prove the convergence of the series, and study its asymptotic behavior as d → ∞. We make use of a diagrammatic technique, classify the graphs, and study their degeneracy. We are thus able to evaluate their contributions and estimate the moments of the distribution of the local purity.
High performance genetic algorithm for VLSI circuit partitioning
NASA Astrophysics Data System (ADS)
Dinu, Simona
2016-12-01
Partitioning is one of the biggest challenges in computer-aided design for VLSI circuits (very large-scale integrated circuits). This work address the min-cut balanced circuit partitioning problem- dividing the graph that models the circuit into almost equal sized k sub-graphs while minimizing the number of edges cut i.e. minimizing the number of edges connecting the sub-graphs. The problem may be formulated as a combinatorial optimization problem. Experimental studies in the literature have shown the problem to be NP-hard and thus it is important to design an efficient heuristic algorithm to solve it. The approach proposed in this study is a parallel implementation of a genetic algorithm, namely an island model. The information exchange between the evolving subpopulations is modeled using a fuzzy controller, which determines an optimal balance between exploration and exploitation of the solution space. The results of simulations show that the proposed algorithm outperforms the standard sequential genetic algorithm both in terms of solution quality and convergence speed. As a direction for future study, this research can be further extended to incorporate local search operators which should include problem-specific knowledge. In addition, the adaptive configuration of mutation and crossover rates is another guidance for future research.
An asynchronous traversal engine for graph-based rich metadata management
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Dong; Carns, Philip; Ross, Robert B.
Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenancemore » data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains; but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.« less
An asynchronous traversal engine for graph-based rich metadata management
Dai, Dong; Carns, Philip; Ross, Robert B.; ...
2016-06-23
Rich metadata in high-performance computing (HPC) systems contains extended information about users, jobs, data files, and their relationships. Property graphs are a promising data model to represent heterogeneous rich metadata flexibly. Specifically, a property graph can use vertices to represent different entities and edges to record the relationships between vertices with unique annotations. The high-volume HPC use case, with millions of entities and relationships, naturally requires an out-of-core distributed property graph database, which must support live updates (to ingest production information in real time), low-latency point queries (for frequent metadata operations such as permission checking), and large-scale traversals (for provenancemore » data mining). Among these needs, large-scale property graph traversals are particularly challenging for distributed graph storage systems. Most existing graph systems implement a "level synchronous" breadth-first search algorithm that relies on global synchronization in each traversal step. This performs well in many problem domains; but a rich metadata management system is characterized by imbalanced graphs, long traversal lengths, and concurrent workloads, each of which has the potential to introduce or exacerbate stragglers (i.e., abnormally slow steps or servers in a graph traversal) that lead to low overall throughput for synchronous traversal algorithms. Previous research indicated that the straggler problem can be mitigated by using asynchronous traversal algorithms, and many graph-processing frameworks have successfully demonstrated this approach. Such systems require the graph to be loaded into a separate batch-processing framework instead of being iteratively accessed, however. In this work, we investigate a general asynchronous graph traversal engine that can operate atop a rich metadata graph in its native format. We outline a traversal-aware query language and key optimizations (traversal-affiliate caching and execution merging) necessary for efficient performance. We further explore the effect of different graph partitioning strategies on the traversal performance for both synchronous and asynchronous traversal engines. Our experiments show that the asynchronous graph traversal engine is more efficient than its synchronous counterpart in the case of HPC rich metadata processing, where more servers are involved and larger traversals are needed. Furthermore, the asynchronous traversal engine is more adaptive to different graph partitioning strategies.« less
P-HARP: A parallel dynamic spectral partitioner
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sohn, A.; Biswas, R.; Simon, H.D.
1997-05-01
Partitioning unstructured graphs is central to the parallel solution of problems in computational science and engineering. The authors have introduced earlier the sequential version of an inertial spectral partitioner called HARP which maintains the quality of recursive spectral bisection (RSB) while forming the partitions an order of magnitude faster than RSB. The serial HARP is known to be the fastest spectral partitioner to date, three to four times faster than similar partitioners on a variety of meshes. This paper presents a parallel version of HARP, called P-HARP. Two types of parallelism have been exploited: loop level parallelism and recursive parallelism.more » P-HARP has been implemented in MPI on the SGI/Cray T3E and the IBM SP2. Experimental results demonstrate that P-HARP can partition a mesh of over 100,000 vertices into 256 partitions in 0.25 seconds on a 64-processor T3E. Experimental results further show that P-HARP can give nearly a 20-fold speedup on 64 processors. These results indicate that graph partitioning is no longer a major bottleneck that hinders the advancement of computational science and engineering for dynamically-changing real-world applications.« less
Multidimensional spectral load balancing
Hendrickson, Bruce A.; Leland, Robert W.
1996-12-24
A method of and apparatus for graph partitioning involving the use of a plurality of eigenvectors of the Laplacian matrix of the graph of the problem for which load balancing is desired. The invention is particularly useful for optimizing parallel computer processing of a problem and for minimizing total pathway lengths of integrated circuits in the design stage.
Accelerating semantic graph databases on commodity clusters
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morari, Alessandro; Castellana, Vito G.; Haglin, David J.
We are developing a full software system for accelerating semantic graph databases on commodity cluster that scales to hundreds of nodes while maintaining constant query throughput. Our framework comprises a SPARQL to C++ compiler, a library of parallel graph methods and a custom multithreaded runtime layer, which provides a Partitioned Global Address Space (PGAS) programming model with fork/join parallelism and automatic load balancing over a commodity clusters. We present preliminary results for the compiler and for the runtime.
Partitioning Rectangular and Structurally Nonsymmetric Sparse Matrices for Parallel Processing
DOE Office of Scientific and Technical Information (OSTI.GOV)
B. Hendrickson; T.G. Kolda
1998-09-01
A common operation in scientific computing is the multiplication of a sparse, rectangular or structurally nonsymmetric matrix and a vector. In many applications the matrix- transpose-vector product is also required. This paper addresses the efficient parallelization of these operations. We show that the problem can be expressed in terms of partitioning bipartite graphs. We then introduce several algorithms for this partitioning problem and compare their performance on a set of test matrices.
Path similarity skeleton graph matching.
Bai, Xiang; Latecki, Longin Jan
2008-07-01
This paper presents a novel framework to for shape recognition based on object silhouettes. The main idea is to match skeleton graphs by comparing the shortest paths between skeleton endpoints. In contrast to typical tree or graph matching methods, we completely ignore the topological graph structure. Our approach is motivated by the fact that visually similar skeleton graphs may have completely different topological structures. The proposed comparison of shortest paths between endpoints of skeleton graphs yields correct matching results in such cases. The skeletons are pruned by contour partitioning with Discrete Curve Evolution, which implies that the endpoints of skeleton branches correspond to visual parts of the objects. The experimental results demonstrate that our method is able to produce correct results in the presence of articulations, stretching, and occlusion.
HARP: A Dynamic Inertial Spectral Partitioner
NASA Technical Reports Server (NTRS)
Simon, Horst D.; Sohn, Andrew; Biswas, Rupak
1997-01-01
Partitioning unstructured graphs is central to the parallel solution of computational science and engineering problems. Spectral partitioners, such recursive spectral bisection (RSB), have proven effecfive in generating high-quality partitions of realistically-sized meshes. The major problem which hindered their wide-spread use was their long execution times. This paper presents a new inertial spectral partitioner, called HARP. The main objective of the proposed approach is to quickly partition the meshes at runtime in a manner that works efficiently for real applications in the context of distributed-memory machines. The underlying principle of HARP is to find the eigenvectors of the unpartitioned vertices and then project them onto the eigerivectors of the original mesh. Results for various meshes ranging in size from 1000 to 100,000 vertices indicate that HARP can indeed partition meshes rapidly at runtime. Experimental results show that our largest mesh can be partitioned sequentially in only a few seconds on an SP2 which is several times faster than other spectral partitioners while maintaining the solution quality of the proven RSB method. A parallel WI version of HARP has also been implemented on IBM SP2 and Cray T3E. Parallel HARP, running on 64 processors SP2 and T3E, can partition a mesh containing more than 100,000 vertices into 64 subgrids in about half a second. These results indicate that graph partitioning can now be truly embedded in dynamically-changing real-world applications.
Enhancing data locality by using terminal propagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hendrickson, B.; Leland, R.; Van Driessche, R.
1995-12-31
Terminal propagation is a method developed in the circuit placement community for adding constraints to graph partitioning problems. This paper adapts and expands this idea, and applies it to the problem of partitioning data structures among the processors of a parallel computer. We show how the constraints in terminal propagation can be used to encourage partitions in which messages are communicated only between architecturally near processors. We then show how these constraints can be handled in two important partitioning algorithms, spectral bisection and multilevel-KL. We compare the quality of partitions generated by these algorithms to each other and to Partitionsmore » generated by more familiar techniques.« less
Hierarchical image feature extraction by an irregular pyramid of polygonal partitions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skurikhin, Alexei N
2008-01-01
We present an algorithmic framework for hierarchical image segmentation and feature extraction. We build a successive fine-to-coarse hierarchy of irregular polygonal partitions of the original image. This multiscale hierarchy forms the basis for object-oriented image analysis. The framework incorporates the Gestalt principles of visual perception, such as proximity and closure, and exploits spectral and textural similarities of polygonal partitions, while iteratively grouping them until dissimilarity criteria are exceeded. Seed polygons are built upon a triangular mesh composed of irregular sized triangles, whose spatial arrangement is adapted to the image content. This is achieved by building the triangular mesh on themore » top of detected spectral discontinuities (such as edges), which form a network of constraints for the Delaunay triangulation. The image is then represented as a spatial network in the form of a graph with vertices corresponding to the polygonal partitions and edges reflecting their relations. The iterative agglomeration of partitions into object-oriented segments is formulated as Minimum Spanning Tree (MST) construction. An important characteristic of the approach is that the agglomeration of polygonal partitions is constrained by the detected edges; thus the shapes of agglomerated partitions are more likely to correspond to the outlines of real-world objects. The constructed partitions and their spatial relations are characterized using spectral, textural and structural features based on proximity graphs. The framework allows searching for object-oriented features of interest across multiple levels of details of the built hierarchy and can be generalized to the multi-criteria MST to account for multiple criteria important for an application.« less
A Multi-Objective Partition Method for Marine Sensor Networks Based on Degree of Event Correlation.
Huang, Dongmei; Xu, Chenyixuan; Zhao, Danfeng; Song, Wei; He, Qi
2017-09-21
Existing marine sensor networks acquire data from sea areas that are geographically divided, and store the data independently in their affiliated sea area data centers. In the case of marine events across multiple sea areas, the current network structure needs to retrieve data from multiple data centers, and thus severely affects real-time decision making. In this study, in order to provide a fast data retrieval service for a marine sensor network, we use all the marine sensors as the vertices, establish the edge based on marine events, and abstract the marine sensor network as a graph. Then, we construct a multi-objective balanced partition method to partition the abstract graph into multiple regions and store them in the cloud computing platform. This method effectively increases the correlation of the sensors and decreases the retrieval cost. On this basis, an incremental optimization strategy is designed to dynamically optimize existing partitions when new sensors are added into the network. Experimental results show that the proposed method can achieve the optimal layout for distributed storage in the process of disaster data retrieval in the China Sea area, and effectively optimize the result of partitions when new buoys are deployed, which eventually will provide efficient data access service for marine events.
Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
ERIC Educational Resources Information Center
Trivedi, Shubhendu; Pardos, Zachary A.; Sarkozy, Gabor N.; Heffernan, Neil T.
2012-01-01
Learning a more distributed representation of the input feature space is a powerful method to boost the performance of a given predictor. Often this is accomplished by partitioning the data into homogeneous groups by clustering so that separate models could be trained on each cluster. Intuitively each such predictor is a better representative of…
An Efficient Algorithm for Partitioning and Authenticating Problem-Solutions of eLeaming Contents
ERIC Educational Resources Information Center
Dewan, Jahangir; Chowdhury, Morshed; Batten, Lynn
2013-01-01
Content authenticity and correctness is one of the important challenges in eLearning as there can be many solutions to one specific problem in cyber space. Therefore, the authors feel it is necessary to map problems to solutions using graph partition and weighted bipartite matching. This article proposes an efficient algorithm to partition…
Monkey search algorithm for ECE components partitioning
NASA Astrophysics Data System (ADS)
Kuliev, Elmar; Kureichik, Vladimir; Kureichik, Vladimir, Jr.
2018-05-01
The paper considers one of the important design problems – a partitioning of electronic computer equipment (ECE) components (blocks). It belongs to the NP-hard class of problems and has a combinatorial and logic nature. In the paper, a partitioning problem formulation can be found as a partition of graph into parts. To solve the given problem, the authors suggest using a bioinspired approach based on a monkey search algorithm. Based on the developed software, computational experiments were carried out that show the algorithm efficiency, as well as its recommended settings for obtaining more effective solutions in comparison with a genetic algorithm.
Multiple Semantic Matching on Augmented N-partite Graph for Object Co-segmentation.
Wang, Chuan; Zhang, Hua; Yang, Liang; Cao, Xiaochun; Xiong, Hongkai
2017-09-08
Recent methods for object co-segmentation focus on discovering single co-occurring relation of candidate regions representing the foreground of multiple images. However, region extraction based only on low and middle level information often occupies a large area of background without the help of semantic context. In addition, seeking single matching solution very likely leads to discover local parts of common objects. To cope with these deficiencies, we present a new object cosegmentation framework, which takes advantages of semantic information and globally explores multiple co-occurring matching cliques based on an N-partite graph structure. To this end, we first propose to incorporate candidate generation with semantic context. Based on the regions extracted from semantic segmentation of each image, we design a merging mechanism to hierarchically generate candidates with high semantic responses. Secondly, all candidates are taken into consideration to globally formulate multiple maximum weighted matching cliques, which complements the discovery of part of the common objects induced by a single clique. To facilitate the discovery of multiple matching cliques, an N-partite graph, which inherently excludes intralinks between candidates from the same image, is constructed to separate multiple cliques without additional constraints. Further, we augment the graph with an additional virtual node in each part to handle irrelevant matches when the similarity between two candidates is too small. Finally, with the explored multiple cliques, we statistically compute pixel-wise co-occurrence map for each image. Experimental results on two benchmark datasets, i.e., iCoseg and MSRC datasets, achieve desirable performance and demonstrate the effectiveness of our proposed framework.
Hearing the shape of the Ising model with a programmable superconducting-flux annealer.
Vinci, Walter; Markström, Klas; Boixo, Sergio; Roy, Aidan; Spedalieri, Federico M; Warburton, Paul A; Severini, Simone
2014-07-16
Two objects can be distinguished if they have different measurable properties. Thus, distinguishability depends on the Physics of the objects. In considering graphs, we revisit the Ising model as a framework to define physically meaningful spectral invariants. In this context, we introduce a family of refinements of the classical spectrum and consider the quantum partition function. We demonstrate that the energy spectrum of the quantum Ising Hamiltonian is a stronger invariant than the classical one without refinements. For the purpose of implementing the related physical systems, we perform experiments on a programmable annealer with superconducting flux technology. Departing from the paradigm of adiabatic computation, we take advantage of a noisy evolution of the device to generate statistics of low energy states. The graphs considered in the experiments have the same classical partition functions, but different quantum spectra. The data obtained from the annealer distinguish non-isomorphic graphs via information contained in the classical refinements of the functions but not via the differences in the quantum spectra.
NASA Astrophysics Data System (ADS)
Cahyaningrum, Rosalia D.; Bustamam, Alhadi; Siswantining, Titin
2017-03-01
Technology of microarray became one of the imperative tools in life science to observe the gene expression levels, one of which is the expression of the genes of people with carcinoma. Carcinoma is a cancer that forms in the epithelial tissue. These data can be analyzed such as the identification expressions hereditary gene and also build classifications that can be used to improve diagnosis of carcinoma. Microarray data usually served in large dimension that most methods require large computing time to do the grouping. Therefore, this study uses spectral clustering method which allows to work with any object for reduces dimension. Spectral clustering method is a method based on spectral decomposition of the matrix which is represented in the form of a graph. After the data dimensions are reduced, then the data are partitioned. One of the famous partition method is Partitioning Around Medoids (PAM) which is minimize the objective function with exchanges all the non-medoid points into medoid point iteratively until converge. Objectivity of this research is to implement methods spectral clustering and partitioning algorithm PAM to obtain groups of 7457 genes with carcinoma based on the similarity value. The result in this study is two groups of genes with carcinoma.
Julián-Ortiz, Jesus V de; Gozalbes, Rafael; Besalú, Emili
2016-01-01
The search for new drug candidates in databases is of paramount importance in pharmaceutical chemistry. The selection of molecular subsets is greatly optimized and much more promising when potential drug-like molecules are detected a priori. In this work, about one hundred thousand molecules are ranked following a new methodology: a drug/non-drug classifier constructed by a consensual set of classification trees. The classification trees arise from the stochastic generation of training sets, which in turn are used to estimate probability factors of test molecules to be drug-like compounds. Molecules were represented by Topological Quantum Similarity Indices and their Graph Theoretical counterparts. The contribution of the present paper consists of presenting an effective ranking method able to improve the probability of finding drug-like substances by using these types of molecular descriptors.
ERIC Educational Resources Information Center
Brusco, Michael J.; Kohn, Hans-Friedrich
2009-01-01
The clique partitioning problem (CPP) requires the establishment of an equivalence relation for the vertices of a graph such that the sum of the edge costs associated with the relation is minimized. The CPP has important applications for the social sciences because it provides a framework for clustering objects measured on a collection of nominal…
A local search for a graph clustering problem
NASA Astrophysics Data System (ADS)
Navrotskaya, Anna; Il'ev, Victor
2016-10-01
In the clustering problems one has to partition a given set of objects (a data set) into some subsets (called clusters) taking into consideration only similarity of the objects. One of most visual formalizations of clustering is graph clustering, that is grouping the vertices of a graph into clusters taking into consideration the edge structure of the graph whose vertices are objects and edges represent similarities between the objects. In the graph k-clustering problem the number of clusters does not exceed k and the goal is to minimize the number of edges between clusters and the number of missing edges within clusters. This problem is NP-hard for any k ≥ 2. We propose a polynomial time (2k-1)-approximation algorithm for graph k-clustering. Then we apply a local search procedure to the feasible solution found by this algorithm and hold experimental research of obtained heuristics.
Applying Graph Theory to Problems in Air Traffic Management
NASA Technical Reports Server (NTRS)
Farrahi, Amir Hossein; Goldbert, Alan; Bagasol, Leonard Neil; Jung, Jaewoo
2017-01-01
Graph theory is used to investigate three different problems arising in air traffic management. First, using a polynomial reduction from a graph partitioning problem, it is shown that both the airspace sectorization problem and its incremental counterpart, the sector combination problem are NP-hard, in general, under several simple workload models. Second, using a polynomial time reduction from maximum independent set in graphs, it is shown that for any fixed e, the problem of finding a solution to the minimum delay scheduling problem in traffic flow management that is guaranteed to be within n1-e of the optimal, where n is the number of aircraft in the problem instance, is NP-hard. Finally, a problem arising in precision arrival scheduling is formulated and solved using graph reachability. These results demonstrate that graph theory provides a powerful framework for modeling, reasoning about, and devising algorithmic solutions to diverse problems arising in air traffic management.
Applying Graph Theory to Problems in Air Traffic Management
NASA Technical Reports Server (NTRS)
Farrahi, Amir H.; Goldberg, Alan T.; Bagasol, Leonard N.; Jung, Jaewoo
2017-01-01
Graph theory is used to investigate three different problems arising in air traffic management. First, using a polynomial reduction from a graph partitioning problem, it isshown that both the airspace sectorization problem and its incremental counterpart, the sector combination problem are NP-hard, in general, under several simple workload models. Second, using a polynomial time reduction from maximum independent set in graphs, it is shown that for any fixed e, the problem of finding a solution to the minimum delay scheduling problem in traffic flow management that is guaranteed to be within n1-e of the optimal, where n is the number of aircraft in the problem instance, is NP-hard. Finally, a problem arising in precision arrival scheduling is formulated and solved using graph reachability. These results demonstrate that graph theory provides a powerful framework for modeling, reasoning about, and devising algorithmic solutions to diverse problems arising in air traffic management.
COLA: Optimizing Stream Processing Applications via Graph Partitioning
NASA Astrophysics Data System (ADS)
Khandekar, Rohit; Hildrum, Kirsten; Parekh, Sujay; Rajan, Deepak; Wolf, Joel; Wu, Kun-Lung; Andrade, Henrique; Gedik, Buğra
In this paper, we describe an optimization scheme for fusing compile-time operators into reasonably-sized run-time software units called processing elements (PEs). Such PEs are the basic deployable units in System S, a highly scalable distributed stream processing middleware system. Finding a high quality fusion significantly benefits the performance of streaming jobs. In order to maximize throughput, our solution approach attempts to minimize the processing cost associated with inter-PE stream traffic while simultaneously balancing load across the processing hosts. Our algorithm computes a hierarchical partitioning of the operator graph based on a minimum-ratio cut subroutine. We also incorporate several fusion constraints in order to support real-world System S jobs. We experimentally compare our algorithm with several other reasonable alternative schemes, highlighting the effectiveness of our approach.
Normalized Cut Algorithm for Automated Assignment of Protein Domains
NASA Technical Reports Server (NTRS)
Samanta, M. P.; Liang, S.; Zha, H.; Biegel, Bryan A. (Technical Monitor)
2002-01-01
We present a novel computational method for automatic assignment of protein domains from structural data. At the core of our algorithm lies a recently proposed clustering technique that has been very successful for image-partitioning applications. This grap.,l-theory based clustering method uses the notion of a normalized cut to partition. an undirected graph into its strongly-connected components. Computer implementation of our method tested on the standard comparison set of proteins from the literature shows a high success rate (84%), better than most existing alternative In addition, several other features of our algorithm, such as reliance on few adjustable parameters, linear run-time with respect to the size of the protein and reduced complexity compared to other graph-theory based algorithms, would make it an attractive tool for structural biologists.
Fast Decentralized Averaging via Multi-scale Gossip
NASA Astrophysics Data System (ADS)
Tsianos, Konstantinos I.; Rabbat, Michael G.
We are interested in the problem of computing the average consensus in a distributed fashion on random geometric graphs. We describe a new algorithm called Multi-scale Gossip which employs a hierarchical decomposition of the graph to partition the computation into tractable sub-problems. Using only pairwise messages of fixed size that travel at most O(n^{1/3}) hops, our algorithm is robust and has communication cost of O(n loglogn logɛ - 1) transmissions, which is order-optimal up to the logarithmic factor in n. Simulated experiments verify the good expected performance on graphs of many thousands of nodes.
Constructing Temporally Extended Actions through Incremental Community Detection
Li, Ge
2018-01-01
Hierarchical reinforcement learning works on temporally extended actions or skills to facilitate learning. How to automatically form such abstraction is challenging, and many efforts tackle this issue in the options framework. While various approaches exist to construct options from different perspectives, few of them concentrate on options' adaptability during learning. This paper presents an algorithm to create options and enhance their quality online. Both aspects operate on detected communities of the learning environment's state transition graph. We first construct options from initial samples as the basis of online learning. Then a rule-based community revision algorithm is proposed to update graph partitions, based on which existing options can be continuously tuned. Experimental results in two problems indicate that options from initial samples may perform poorly in more complex environments, and our presented strategy can effectively improve options and get better results compared with flat reinforcement learning. PMID:29849543
Design of a Recommendation System for Adding Support in the Treatment of Chronic Patients.
Torkar, Simon; Benedik, Peter; Rajkovič, Uroš; Šušteršič, Olga; Rajkovič, Vladislav
2016-01-01
Rapid growth of chronic disease cases around the world is adding pressure on healthcare providers to ensure a structured patent follow-up during chronic disease management process. In response to the increasing demand for better chronic disease management and improved health care efficiency, nursing roles have been specialized or enhanced in the primary health care setting. Nurses become key players in chronic disease management process. Study describes a system to help nurses manage the care process of patient with chronic disease. It supports focusing nurse's attention on those resources/solutions that are likely to be most relevant to their particular situation/problem in nursing domain. System is based on multi-relational property graph representing a flexible modeling construct. Graph allows modeling a nursing ontology and the indices that partition domain into an efficient, searchable space where the solution to a problem is seen as abstractly defined traversals through its vertices and edges.
Frog: Asynchronous Graph Processing on GPU with Hybrid Coloring Model
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shi, Xuanhua; Luo, Xuan; Liang, Junling
GPUs have been increasingly used to accelerate graph processing for complicated computational problems regarding graph theory. Many parallel graph algorithms adopt the asynchronous computing model to accelerate the iterative convergence. Unfortunately, the consistent asynchronous computing requires locking or atomic operations, leading to significant penalties/overheads when implemented on GPUs. As such, coloring algorithm is adopted to separate the vertices with potential updating conflicts, guaranteeing the consistency/correctness of the parallel processing. Common coloring algorithms, however, may suffer from low parallelism because of a large number of colors generally required for processing a large-scale graph with billions of vertices. We propose a light-weightmore » asynchronous processing framework called Frog with a preprocessing/hybrid coloring model. The fundamental idea is based on Pareto principle (or 80-20 rule) about coloring algorithms as we observed through masses of realworld graph coloring cases. We find that a majority of vertices (about 80%) are colored with only a few colors, such that they can be read and updated in a very high degree of parallelism without violating the sequential consistency. Accordingly, our solution separates the processing of the vertices based on the distribution of colors. In this work, we mainly answer three questions: (1) how to partition the vertices in a sparse graph with maximized parallelism, (2) how to process large-scale graphs that cannot fit into GPU memory, and (3) how to reduce the overhead of data transfers on PCIe while processing each partition. We conduct experiments on real-world data (Amazon, DBLP, YouTube, RoadNet-CA, WikiTalk and Twitter) to evaluate our approach and make comparisons with well-known non-preprocessed (such as Totem, Medusa, MapGraph and Gunrock) and preprocessed (Cusha) approaches, by testing four classical algorithms (BFS, PageRank, SSSP and CC). On all the tested applications and datasets, Frog is able to significantly outperform existing GPU-based graph processing systems except Gunrock and MapGraph. MapGraph gets better performance than Frog when running BFS on RoadNet-CA. The comparison between Gunrock and Frog is inconclusive. Frog can outperform Gunrock more than 1.04X when running PageRank and SSSP, while the advantage of Frog is not obvious when running BFS and CC on some datasets especially for RoadNet-CA.« less
Spatial-temporal causal modeling: a data centric approach to climate change attribution (Invited)
NASA Astrophysics Data System (ADS)
Lozano, A. C.
2010-12-01
Attribution of climate change has been predominantly based on simulations using physical climate models. These approaches rely heavily on the employed models and are thus subject to their shortcomings. Given the physical models’ limitations in describing the complex system of climate, we propose an alternative approach to climate change attribution that is data centric in the sense that it relies on actual measurements of climate variables and human and natural forcing factors. We present a novel class of methods to infer causality from spatial-temporal data, as well as a procedure to incorporate extreme value modeling into our methodology in order to address the attribution of extreme climate events. We develop a collection of causal modeling methods using spatio-temporal data that combine graphical modeling techniques with the notion of Granger causality. “Granger causality” is an operational definition of causality from econometrics, which is based on the premise that if a variable causally affects another, then the past values of the former should be helpful in predicting the future values of the latter. In its basic version, our methodology makes use of the spatial relationship between the various data points, but treats each location as being identically distributed and builds a unique causal graph that is common to all locations. A more flexible framework is then proposed that is less restrictive than having a single causal graph common to all locations, while avoiding the brittleness due to data scarcity that might arise if one were to independently learn a different graph for each location. The solution we propose can be viewed as finding a middle ground by partitioning the locations into subsets that share the same causal structures and pooling the observations from all the time series belonging to the same subset in order to learn more robust causal graphs. More precisely, we make use of relationships between locations (e.g. neighboring relationship) by defining a relational graph in which related locations are connected (note that this relational graph, which represents relationships among the different locations, is distinct from the causal graph, which represents causal relationships among the individual variables - e.g. temperature, pressure- within a multivariate time series). We then define a hidden Markov Random Field (hMRF), assigning a hidden state to each node (location), with the state assignment guided by the prior information encoded in the relational graph. Nodes that share the same state in the hMRF model will have the same causal graph. State assignment can thus shed light on unknown relations among locations (e.g. teleconnection). While the model has been described in terms of hard location partitioning to facilitate its exposition, in fact a soft partitioning is maintained throughout learning. This leads to a form of transfer learning, which makes our model applicable even in situations where partitioning the locations might not seem appropriate. We first validate the effectiveness of our methodology on synthetic datasets, and then apply it to actual climate measurement data. The experimental results show that our approach offers a useful alternative to the simulation-based approach for climate modeling and attribution, and has the capability to provide valuable scientific insights from a new perspective.
Efficient Synthesis of Graph Methods: a Dynamically Scheduled Architecture
DOE Office of Scientific and Technical Information (OSTI.GOV)
Minutoli, Marco; Castellana, Vito G.; Tumeo, Antonino
RDF databases naturally map to a graph representation and employ languages, such as SPARQL, that implements queries as graph pattern matching routines. Graph methods exhibit an irregular behavior: they present unpredictable, fine-grained data accesses, and are synchronization inten- sive. Graph data structures expose large amounts of dy- namic parallelism, but are difficult to partition without gen- erating load unbalance. In this paper, we present a novel ar- chitecture to improve the synthesis of graph methods. Our design addresses the issues of these algorithms with two com- ponents: a Dynamic Task Scheduler (DTS), which reduces load unbalance and maximize resource utilization,more » and a Hi- erarchical Memory Interface controller (HMI), which pro- vides support for concurrent memory operations on multi- ported/multi-banked shared memories. We evaluate our ap- proach by generating the accelerators for a set of SPARQL queries from the Lehigh University Benchmark (LUBM). We first analyze the load unbalance of these queries, showing that execution time among tasks can differ even of order of magnitudes. We then synthesize the queries and com- pare the performance of the resulting accelerators against the current state of the art. Experimental results show that our solution provides a speedup over the serial implementa- tion close to the theoretical maximum and a speedup up to 3.45 over a baseline parallel implementation. We conclude our study by exploring the design space to achieve maximum memory channels utilization. The best design used at least three of the four memory channels for more than 90% of the execution time.« less
Eigenvector synchronization, graph rigidity and the molecule problemR
Cucuringu, Mihai; Singer, Amit; Cowburn, David
2013-01-01
The graph realization problem has received a great deal of attention in recent years, due to its importance in applications such as wireless sensor networks and structural biology. In this paper, we extend the previous work and propose the 3D-As-Synchronized-As-Possible (3D-ASAP) algorithm, for the graph realization problem in ℝ3, given a sparse and noisy set of distance measurements. 3D-ASAP is a divide and conquer, non-incremental and non-iterative algorithm, which integrates local distance information into a global structure determination. Our approach starts with identifying, for every node, a subgraph of its 1-hop neighborhood graph, which can be accurately embedded in its own coordinate system. In the noise-free case, the computed coordinates of the sensors in each patch must agree with their global positioning up to some unknown rigid motion, that is, up to translation, rotation and possibly reflection. In other words, to every patch, there corresponds an element of the Euclidean group, Euc(3), of rigid transformations in ℝ3, and the goal was to estimate the group elements that will properly align all the patches in a globally consistent way. Furthermore, 3D-ASAP successfully incorporates information specific to the molecule problem in structural biology, in particular information on known substructures and their orientation. In addition, we also propose 3D-spectral-partitioning (SP)-ASAP, a faster version of 3D-ASAP, which uses a spectral partitioning algorithm as a pre-processing step for dividing the initial graph into smaller subgraphs. Our extensive numerical simulations show that 3D-ASAP and 3D-SP-ASAP are very robust to high levels of noise in the measured distances and to sparse connectivity in the measurement graph, and compare favorably with similar state-of-the-art localization algorithms. PMID:24432187
Yoink: An interaction-based partitioning API.
Zheng, Min; Waller, Mark P
2018-05-15
Herein, we describe the implementation details of our interaction-based partitioning API (application programming interface) called Yoink for QM/MM modeling and fragment-based quantum chemistry studies. Interactions are detected by computing density descriptors such as reduced density gradient, density overlap regions indicator, and single exponential decay detector. Only molecules having an interaction with a user-definable QM core are added to the QM region of a hybrid QM/MM calculation. Moreover, a set of molecule pairs having density-based interactions within a molecular system can be computed in Yoink, and an interaction graph can then be constructed. Standard graph clustering methods can then be applied to construct fragments for further quantum chemical calculations. The Yoink API is licensed under Apache 2.0 and can be accessed via yoink.wallerlab.org. © 2018 Wiley Periodicals, Inc. © 2018 Wiley Periodicals, Inc.
Semi-supervised clustering for parcellating brain regions based on resting state fMRI data
NASA Astrophysics Data System (ADS)
Cheng, Hewei; Fan, Yong
2014-03-01
Many unsupervised clustering techniques have been adopted for parcellating brain regions of interest into functionally homogeneous subregions based on resting state fMRI data. However, the unsupervised clustering techniques are not able to take advantage of exiting knowledge of the functional neuroanatomy readily available from studies of cytoarchitectonic parcellation or meta-analysis of the literature. In this study, we propose a semi-supervised clustering method for parcellating amygdala into functionally homogeneous subregions based on resting state fMRI data. Particularly, the semi-supervised clustering is implemented under the framework of graph partitioning, and adopts prior information and spatial consistent constraints to obtain a spatially contiguous parcellation result. The graph partitioning problem is solved using an efficient algorithm similar to the well-known weighted kernel k-means algorithm. Our method has been validated for parcellating amygdala into 3 subregions based on resting state fMRI data of 28 subjects. The experiment results have demonstrated that the proposed method is more robust than unsupervised clustering and able to parcellate amygdala into centromedial, laterobasal, and superficial parts with improved functionally homogeneity compared with the cytoarchitectonic parcellation result. The validity of the parcellation results is also supported by distinctive functional and structural connectivity patterns of the subregions and high consistency between coactivation patterns derived from a meta-analysis and functional connectivity patterns of corresponding subregions.
Traxl, Dominik; Boers, Niklas; Kurths, Jürgen
2016-06-01
Network theory has proven to be a powerful tool in describing and analyzing systems by modelling the relations between their constituent objects. Particularly in recent years, a great progress has been made by augmenting "traditional" network theory in order to account for the multiplex nature of many networks, multiple types of connections between objects, the time-evolution of networks, networks of networks and other intricacies. However, existing network representations still lack crucial features in order to serve as a general data analysis tool. These include, most importantly, an explicit association of information with possibly heterogeneous types of objects and relations, and a conclusive representation of the properties of groups of nodes as well as the interactions between such groups on different scales. In this paper, we introduce a collection of definitions resulting in a framework that, on the one hand, entails and unifies existing network representations (e.g., network of networks and multilayer networks), and on the other hand, generalizes and extends them by incorporating the above features. To implement these features, we first specify the nodes and edges of a finite graph as sets of properties (which are permitted to be arbitrary mathematical objects). Second, the mathematical concept of partition lattices is transferred to the network theory in order to demonstrate how partitioning the node and edge set of a graph into supernodes and superedges allows us to aggregate, compute, and allocate information on and between arbitrary groups of nodes. The derived partition lattice of a graph, which we denote by deep graph, constitutes a concise, yet comprehensive representation that enables the expression and analysis of heterogeneous properties, relations, and interactions on all scales of a complex system in a self-contained manner. Furthermore, to be able to utilize existing network-based methods and models, we derive different representations of multilayer networks from our framework and demonstrate the advantages of our representation. On the basis of the formal framework described here, we provide a rich, fully scalable (and self-explanatory) software package that integrates into the PyData ecosystem and offers interfaces to popular network packages, making it a powerful, general-purpose data analysis toolkit. We exemplify an application of deep graphs using a real world dataset, comprising 16 years of satellite-derived global precipitation measurements. We deduce a deep graph representation of these measurements in order to track and investigate local formations of spatio-temporal clusters of extreme precipitation events.
Huang, Xiaoke; Zhao, Ye; Yang, Jing; Zhang, Chong; Ma, Chao; Ye, Xinyue
2016-01-01
We propose TrajGraph, a new visual analytics method, for studying urban mobility patterns by integrating graph modeling and visual analysis with taxi trajectory data. A special graph is created to store and manifest real traffic information recorded by taxi trajectories over city streets. It conveys urban transportation dynamics which can be discovered by applying graph analysis algorithms. To support interactive, multiscale visual analytics, a graph partitioning algorithm is applied to create region-level graphs which have smaller size than the original street-level graph. Graph centralities, including Pagerank and betweenness, are computed to characterize the time-varying importance of different urban regions. The centralities are visualized by three coordinated views including a node-link graph view, a map view and a temporal information view. Users can interactively examine the importance of streets to discover and assess city traffic patterns. We have implemented a fully working prototype of this approach and evaluated it using massive taxi trajectories of Shenzhen, China. TrajGraph's capability in revealing the importance of city streets was evaluated by comparing the calculated centralities with the subjective evaluations from a group of drivers in Shenzhen. Feedback from a domain expert was collected. The effectiveness of the visual interface was evaluated through a formal user study. We also present several examples and a case study to demonstrate the usefulness of TrajGraph in urban transportation analysis.
Interactions of information transfer along separable causal paths
NASA Astrophysics Data System (ADS)
Jiang, Peishi; Kumar, Praveen
2018-04-01
Complex systems arise as a result of interdependences between multiple variables, whose causal interactions can be visualized in a time-series graph. Transfer entropy and information partitioning approaches have been used to characterize such dependences. However, these approaches capture net information transfer occurring through a multitude of pathways involved in the interaction and as a result mask our ability to discern the causal interaction within a subgraph of interest through specific pathways. We build on recent developments of momentary information transfer along causal paths proposed by Runge [Phys. Rev. E 92, 062829 (2015), 10.1103/PhysRevE.92.062829] to develop a framework for quantifying information partitioning along separable causal paths. Momentary information transfer along causal paths captures the amount of information transfer between any two variables lagged at two specific points in time. Our approach expands this concept to characterize the causal interaction in terms of synergistic, unique, and redundant information transfer through separable causal paths. Through a graphical model, we analyze the impact of the separable and nonseparable causal paths and the causality structure embedded in the graph as well as the noise effect on information partitioning by using synthetic data generated from two coupled logistic equation models. Our approach can provide a valuable reference for an autonomous information partitioning along separable causal paths which form a causal subgraph influencing a target.
Breast histopathology image segmentation using spatio-colour-texture based graph partition method.
Belsare, A D; Mushrif, M M; Pangarkar, M A; Meshram, N
2016-06-01
This paper proposes a novel integrated spatio-colour-texture based graph partitioning method for segmentation of nuclear arrangement in tubules with a lumen or in solid islands without a lumen from digitized Hematoxylin-Eosin stained breast histology images, in order to automate the process of histology breast image analysis to assist the pathologists. We propose a new similarity based super pixel generation method and integrate it with texton representation to form spatio-colour-texture map of Breast Histology Image. Then a new weighted distance based similarity measure is used for generation of graph and final segmentation using normalized cuts method is obtained. The extensive experiments carried shows that the proposed algorithm can segment nuclear arrangement in normal as well as malignant duct in breast histology tissue image. For evaluation of the proposed method the ground-truth image database of 100 malignant and nonmalignant breast histology images is created with the help of two expert pathologists and the quantitative evaluation of proposed breast histology image segmentation has been performed. It shows that the proposed method outperforms over other methods. © 2015 The Authors Journal of Microscopy © 2015 Royal Microscopical Society.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rivasseau, Vincent, E-mail: vincent.rivasseau@th.u-psud.fr, E-mail: adrian.tanasa@ens-lyon.org; Tanasa, Adrian, E-mail: vincent.rivasseau@th.u-psud.fr, E-mail: adrian.tanasa@ens-lyon.org
The Loop Vertex Expansion (LVE) is a quantum field theory (QFT) method which explicitly computes the Borel sum of Feynman perturbation series. This LVE relies in a crucial way on symmetric tree weights which define a measure on the set of spanning trees of any connected graph. In this paper we generalize this method by defining new tree weights. They depend on the choice of a partition of a set of vertices of the graph, and when the partition is non-trivial, they are no longer symmetric under permutation of vertices. Nevertheless we prove they have the required positivity property tomore » lead to a convergent LVE; in fact we formulate this positivity property precisely for the first time. Our generalized tree weights are inspired by the Brydges-Battle-Federbush work on cluster expansions and could be particularly suited to the computation of connected functions in QFT. Several concrete examples are explicitly given.« less
Quantum Walk Schemes for Universal Quantum Computation
NASA Astrophysics Data System (ADS)
Underwood, Michael S.
Random walks are a powerful tool for the efficient implementation of algorithms in classical computation. Their quantum-mechanical analogues, called quantum walks, hold similar promise. Quantum walks provide a model of quantum computation that has recently been shown to be equivalent in power to the standard circuit model. As in the classical case, quantum walks take place on graphs and can undergo discrete or continuous evolution, though quantum evolution is unitary and therefore deterministic until a measurement is made. This thesis considers the usefulness of continuous-time quantum walks to quantum computation from the perspectives of both their fundamental power under various formulations, and their applicability in practical experiments. In one extant scheme, logical gates are effected by scattering processes. The results of an exhaustive search for single-qubit operations in this model are presented. It is shown that the number of distinct operations increases exponentially with the number of vertices in the scattering graph. A catalogue of all graphs on up to nine vertices that implement single-qubit unitaries at a specific set of momenta is included in an appendix. I develop a novel scheme for universal quantum computation called the discontinuous quantum walk, in which a continuous-time quantum walker takes discrete steps of evolution via perfect quantum state transfer through small 'widget' graphs. The discontinuous quantum-walk scheme requires an exponentially sized graph, as do prior discrete and continuous schemes. To eliminate the inefficient vertex resource requirement, a computation scheme based on multiple discontinuous walkers is presented. In this model, n interacting walkers inhabiting a graph with 2n vertices can implement an arbitrary quantum computation on an input of length n, an exponential savings over previous universal quantum walk schemes. This is the first quantum walk scheme that allows for the application of quantum error correction. The many-particle quantum walk can be viewed as a single quantum walk undergoing perfect state transfer on a larger weighted graph, obtained via equitable partitioning. I extend this formalism to non-simple graphs. Examples of the application of equitable partitioning to the analysis of quantum walks and many-particle quantum systems are discussed.
Communication-Efficient Arbitration Models for Low-Resolution Data Flow Computing
1988-12-01
phase can be formally described as follows: Graph Partitioning Problem NP-complete: (Garey & Johnson) Given graph G = (V, E), weights w (v) for each v e V...Technical Report, MIT/LCS/TR-218, Cambridge, Mass. Agerwala, Tilak, February 1982, "Data Flow Systems", Computer, pp. 10-13. Babb, Robert G ., July 1984...34Parallel Processing with Large-Grain Data Flow Techniques," IEEE Computer 17, 7, pp. 55-61. Babb, Robert G ., II, Lise Storc, and William C. Ragsdale
Language in the brain at rest: new insights from resting state data and graph theoretical analysis
Muller, Angela M.; Meyer, Martin
2014-01-01
In humans, the most obvious functional lateralization is the specialization of the left hemisphere for language. Therefore, the involvement of the right hemisphere in language is one of the most remarkable findings during the last two decades of fMRI research. However, the importance of this finding continues to be underestimated. We examined the interaction between the two hemispheres and also the role of the right hemisphere in language. From two seeds representing Broca's area, we conducted a seed correlation analysis (SCA) of resting state fMRI data and could identify a resting state network (RSN) overlapping to significant extent with a language network that was generated by an automated meta-analysis tool. To elucidate the relationship between the clusters of this RSN, we then performed graph theoretical analyses (GTA) using the same resting state dataset. We show that the right hemisphere is clearly involved in language. A modularity analysis revealed that the interaction between the two hemispheres is mediated by three partitions: A bilateral frontal partition consists of nodes representing the classical left sided language regions as well as two right-sided homologs. The second bilateral partition consists of nodes from the right frontal, the left inferior parietal cortex as well as of two nodes within the posterior cerebellum. The third partition is also bilateral and comprises five regions from the posterior midline parts of the brain to the temporal and frontal cortex, two of the nodes are prominent default mode nodes. The involvement of this last partition in a language relevant function is a novel finding. PMID:24808843
Dynamic connectivity regression: Determining state-related changes in brain connectivity
Cribben, Ivor; Haraldsdottir, Ragnheidur; Atlas, Lauren Y.; Wager, Tor D.; Lindquist, Martin A.
2014-01-01
Most statistical analyses of fMRI data assume that the nature, timing and duration of the psychological processes being studied are known. However, often it is hard to specify this information a priori. In this work we introduce a data-driven technique for partitioning the experimental time course into distinct temporal intervals with different multivariate functional connectivity patterns between a set of regions of interest (ROIs). The technique, called Dynamic Connectivity Regression (DCR), detects temporal change points in functional connectivity and estimates a graph, or set of relationships between ROIs, for data in the temporal partition that falls between pairs of change points. Hence, DCR allows for estimation of both the time of change in connectivity and the connectivity graph for each partition, without requiring prior knowledge of the nature of the experimental design. Permutation and bootstrapping methods are used to perform inference on the change points. The method is applied to various simulated data sets as well as to an fMRI data set from a study (N=26) of a state anxiety induction using a socially evaluative threat challenge. The results illustrate the method’s ability to observe how the networks between different brain regions changed with subjects’ emotional state. PMID:22484408
Parallelization of a Fully-Distributed Hydrologic Model using Sub-basin Partitioning
NASA Astrophysics Data System (ADS)
Vivoni, E. R.; Mniszewski, S.; Fasel, P.; Springer, E.; Ivanov, V. Y.; Bras, R. L.
2005-12-01
A primary obstacle towards advances in watershed simulations has been the limited computational capacity available to most models. The growing trend of model complexity, data availability and physical representation has not been matched by adequate developments in computational efficiency. This situation has created a serious bottleneck which limits existing distributed hydrologic models to small domains and short simulations. In this study, we present novel developments in the parallelization of a fully-distributed hydrologic model. Our work is based on the TIN-based Real-time Integrated Basin Simulator (tRIBS), which provides continuous hydrologic simulation using a multiple resolution representation of complex terrain based on a triangulated irregular network (TIN). While the use of TINs reduces computational demand, the sequential version of the model is currently limited over large basins (>10,000 km2) and long simulation periods (>1 year). To address this, a parallel MPI-based version of the tRIBS model has been implemented and tested using high performance computing resources at Los Alamos National Laboratory. Our approach utilizes domain decomposition based on sub-basin partitioning of the watershed. A stream reach graph based on the channel network structure is used to guide the sub-basin partitioning. Individual sub-basins or sub-graphs of sub-basins are assigned to separate processors to carry out internal hydrologic computations (e.g. rainfall-runoff transformation). Routed streamflow from each sub-basin forms the major hydrologic data exchange along the stream reach graph. Individual sub-basins also share subsurface hydrologic fluxes across adjacent boundaries. We demonstrate how the sub-basin partitioning provides computational feasibility and efficiency for a set of test watersheds in northeastern Oklahoma. We compare the performance of the sequential and parallelized versions to highlight the efficiency gained as the number of processors increases. We also discuss how the coupled use of TINs and parallel processing can lead to feasible long-term simulations in regional watersheds while preserving basin properties at high-resolution.
Nonlinear optimization simplified by hypersurface deformation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stillinger, F.H.; Weber, T.A.
1988-09-01
A general strategy is advanced for simplifying nonlinear optimization problems, the ant-lion method. This approach exploits shape modifications of the cost-function hypersurface which distend basins surrounding low-lying minima (including global minima). By intertwining hypersurface deformations with steepest-descent displacements, the search is concentrated on a small relevant subset of all minima. Specific calculations demonstrating the value of this method are reported for the partitioning of two classes of irregular but nonrandom graphs, the prime-factor graphs and the pi graphs. We also indicate how this approach can be applied to the traveling salesman problem and to design layout optimization, and that itmore » may be useful in combination with simulated annealing strategies.« less
Graph Partitioning by Eigenvectors,
1987-01-01
the extremal nature of eigenvalues of symmetric matrices, the interlacing theorem, monotonicity of spectral radius of nonnegative matrices, Perron ... Frobenius theory, etc. (See Varga (1962) and Lancaster and Tismenetsky (1985).) Most of the results of this paper depend on the following lemma. ABSTRACT
Memoryless cooperative graph search based on the simulated annealing algorithm
NASA Astrophysics Data System (ADS)
Hou, Jian; Yan, Gang-Feng; Fan, Zhen
2011-04-01
We have studied the problem of reaching a globally optimal segment for a graph-like environment with a single or a group of autonomous mobile agents. Firstly, two efficient simulated-annealing-like algorithms are given for a single agent to solve the problem in a partially known environment and an unknown environment, respectively. It shows that under both proposed control strategies, the agent will eventually converge to a globally optimal segment with probability 1. Secondly, we use multi-agent searching to simultaneously reduce the computation complexity and accelerate convergence based on the algorithms we have given for a single agent. By exploiting graph partition, a gossip-consensus method based scheme is presented to update the key parameter—radius of the graph, ensuring that the agents spend much less time finding a globally optimal segment.
On k-ary n-cubes: Theory and applications
NASA Technical Reports Server (NTRS)
Mao, Weizhen; Nicol, David M.
1994-01-01
Many parallel processing networks can be viewed as graphs called k-ary n-cubes, whose special cases include rings, hypercubes and toruses. In this paper, combinatorial properties of k-ary n-cubes are explored. In particular, the problem of characterizing the subgraph of a given number of nodes with the maximum edge count is studied. These theoretical results are then used to compute a lower bounding function in branch-and-bound partitioning algorithms and to establish the optimality of some irregular partitions.
Stroganov, Oleg V; Novikov, Fedor N; Zeifman, Alexey A; Stroylov, Viktor S; Chilov, Ghermes G
2011-09-01
A new graph-theoretical approach called thermodynamic sampling of amino acid residues (TSAR) has been elaborated to explicitly account for the protein side chain flexibility in modeling conformation-dependent protein properties. In TSAR, a protein is viewed as a graph whose nodes correspond to structurally independent groups and whose edges connect the interacting groups. Each node has its set of states describing conformation and ionization of the group, and each edge is assigned an array of pairwise interaction potentials between the adjacent groups. By treating the obtained graph as a belief-network-a well-established mathematical abstraction-the partition function of each node is found. In the current work we used TSAR to calculate partition functions of the ionized forms of protein residues. A simplified version of a semi-empirical molecular mechanical scoring function, borrowed from our Lead Finder docking software, was used for energy calculations. The accuracy of the resulting model was validated on a set of 486 experimentally determined pK(a) values of protein residues. The average correlation coefficient (R) between calculated and experimental pK(a) values was 0.80, ranging from 0.95 (for Tyr) to 0.61 (for Lys). It appeared that the hydrogen bond interactions and the exhaustiveness of side chain sampling made the most significant contribution to the accuracy of pK(a) calculations. Copyright © 2011 Wiley-Liss, Inc.
A genetic graph-based approach for partitional clustering.
Menéndez, Héctor D; Barrero, David F; Camacho, David
2014-05-01
Clustering is one of the most versatile tools for data analysis. In the recent years, clustering that seeks the continuity of data (in opposition to classical centroid-based approaches) has attracted an increasing research interest. It is a challenging problem with a remarkable practical interest. The most popular continuity clustering method is the spectral clustering (SC) algorithm, which is based on graph cut: It initially generates a similarity graph using a distance measure and then studies its graph spectrum to find the best cut. This approach is sensitive to the parameters of the metric, and a correct parameter choice is critical to the quality of the cluster. This work proposes a new algorithm, inspired by SC, that reduces the parameter dependency while maintaining the quality of the solution. The new algorithm, named genetic graph-based clustering (GGC), takes an evolutionary approach introducing a genetic algorithm (GA) to cluster the similarity graph. The experimental validation shows that GGC increases robustness of SC and has competitive performance in comparison with classical clustering methods, at least, in the synthetic and real dataset used in the experiments.
Deep graphs—A general framework to represent and analyze heterogeneous complex systems across scales
NASA Astrophysics Data System (ADS)
Traxl, Dominik; Boers, Niklas; Kurths, Jürgen
2016-06-01
Network theory has proven to be a powerful tool in describing and analyzing systems by modelling the relations between their constituent objects. Particularly in recent years, a great progress has been made by augmenting "traditional" network theory in order to account for the multiplex nature of many networks, multiple types of connections between objects, the time-evolution of networks, networks of networks and other intricacies. However, existing network representations still lack crucial features in order to serve as a general data analysis tool. These include, most importantly, an explicit association of information with possibly heterogeneous types of objects and relations, and a conclusive representation of the properties of groups of nodes as well as the interactions between such groups on different scales. In this paper, we introduce a collection of definitions resulting in a framework that, on the one hand, entails and unifies existing network representations (e.g., network of networks and multilayer networks), and on the other hand, generalizes and extends them by incorporating the above features. To implement these features, we first specify the nodes and edges of a finite graph as sets of properties (which are permitted to be arbitrary mathematical objects). Second, the mathematical concept of partition lattices is transferred to the network theory in order to demonstrate how partitioning the node and edge set of a graph into supernodes and superedges allows us to aggregate, compute, and allocate information on and between arbitrary groups of nodes. The derived partition lattice of a graph, which we denote by deep graph, constitutes a concise, yet comprehensive representation that enables the expression and analysis of heterogeneous properties, relations, and interactions on all scales of a complex system in a self-contained manner. Furthermore, to be able to utilize existing network-based methods and models, we derive different representations of multilayer networks from our framework and demonstrate the advantages of our representation. On the basis of the formal framework described here, we provide a rich, fully scalable (and self-explanatory) software package that integrates into the PyData ecosystem and offers interfaces to popular network packages, making it a powerful, general-purpose data analysis toolkit. We exemplify an application of deep graphs using a real world dataset, comprising 16 years of satellite-derived global precipitation measurements. We deduce a deep graph representation of these measurements in order to track and investigate local formations of spatio-temporal clusters of extreme precipitation events.
Weighted graph cuts without eigenvectors a multilevel approach.
Dhillon, Inderjit S; Guan, Yuqiang; Kulis, Brian
2007-11-01
A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods--in particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation for graph clustering problems, which can be prohibitive for very large graphs. Previous multilevel graph partitioning methods, such as Metis, have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis and gene network analysis.
GraphMeta: Managing HPC Rich Metadata in Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Dai, Dong; Chen, Yong; Carns, Philip
High-performance computing (HPC) systems face increasingly critical metadata management challenges, especially in the approaching exascale era. These challenges arise not only from exploding metadata volumes, but also from increasingly diverse metadata, which contains data provenance and arbitrary user-defined attributes in addition to traditional POSIX metadata. This ‘rich’ metadata is becoming critical to supporting advanced data management functionality such as data auditing and validation. In our prior work, we identified a graph-based model as a promising solution to uniformly manage HPC rich metadata due to its flexibility and generality. However, at the same time, graph-based HPC rich metadata anagement also introducesmore » significant challenges to the underlying infrastructure. In this study, we first identify the challenges on the underlying infrastructure to support scalable, high-performance rich metadata management. Based on that, we introduce GraphMeta, a graphbased engine designed for this use case. It achieves performance scalability by introducing a new graph partitioning algorithm and a write-optimal storage engine. We evaluate GraphMeta under both synthetic and real HPC metadata workloads, compare it with other approaches, and demonstrate its advantages in terms of efficiency and usability for rich metadata management in HPC systems.« less
Multiscale 3D Shape Analysis using Spherical Wavelets
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen
2013-01-01
Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data. PMID:16685992
Multiscale 3D shape analysis using spherical wavelets.
Nain, Delphine; Haker, Steven; Bobick, Aaron; Tannenbaum, Allen R
2005-01-01
Shape priors attempt to represent biological variations within a population. When variations are global, Principal Component Analysis (PCA) can be used to learn major modes of variation, even from a limited training set. However, when significant local variations exist, PCA typically cannot represent such variations from a small training set. To address this issue, we present a novel algorithm that learns shape variations from data at multiple scales and locations using spherical wavelets and spectral graph partitioning. Our results show that when the training set is small, our algorithm significantly improves the approximation of shapes in a testing set over PCA, which tends to oversmooth data.
Partitioning sparse matrices with eigenvectors of graphs
NASA Technical Reports Server (NTRS)
Pothen, Alex; Simon, Horst D.; Liou, Kang-Pu
1990-01-01
The problem of computing a small vertex separator in a graph arises in the context of computing a good ordering for the parallel factorization of sparse, symmetric matrices. An algebraic approach for computing vertex separators is considered in this paper. It is shown that lower bounds on separator sizes can be obtained in terms of the eigenvalues of the Laplacian matrix associated with a graph. The Laplacian eigenvectors of grid graphs can be computed from Kronecker products involving the eigenvectors of path graphs, and these eigenvectors can be used to compute good separators in grid graphs. A heuristic algorithm is designed to compute a vertex separator in a general graph by first computing an edge separator in the graph from an eigenvector of the Laplacian matrix, and then using a maximum matching in a subgraph to compute the vertex separator. Results on the quality of the separators computed by the spectral algorithm are presented, and these are compared with separators obtained from other algorithms for computing separators. Finally, the time required to compute the Laplacian eigenvector is reported, and the accuracy with which the eigenvector must be computed to obtain good separators is considered. The spectral algorithm has the advantage that it can be implemented on a medium-size multiprocessor in a straightforward manner.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boman, Erik G.; Catalyurek, Umit V.; Chevalier, Cedric
2015-01-16
This final progress report summarizes the work accomplished at the Combinatorial Scientific Computing and Petascale Simulations Institute. We developed Zoltan, a parallel mesh partitioning library that made use of accurate hypergraph models to provide load balancing in mesh-based computations. We developed several graph coloring algorithms for computing Jacobian and Hessian matrices and organized them into a software package called ColPack. We developed parallel algorithms for graph coloring and graph matching problems, and also designed multi-scale graph algorithms. Three PhD students graduated, six more are continuing their PhD studies, and four postdoctoral scholars were advised. Six of these students and Fellowsmore » have joined DOE Labs (Sandia, Berkeley), as staff scientists or as postdoctoral scientists. We also organized the SIAM Workshop on Combinatorial Scientific Computing (CSC) in 2007, 2009, and 2011 to continue to foster the CSC community.« less
Heuristic-driven graph wavelet modeling of complex terrain
NASA Astrophysics Data System (ADS)
Cioacǎ, Teodor; Dumitrescu, Bogdan; Stupariu, Mihai-Sorin; Pǎtru-Stupariu, Ileana; Nǎpǎrus, Magdalena; Stoicescu, Ioana; Peringer, Alexander; Buttler, Alexandre; Golay, François
2015-03-01
We present a novel method for building a multi-resolution representation of large digital surface models. The surface points coincide with the nodes of a planar graph which can be processed using a critically sampled, invertible lifting scheme. To drive the lazy wavelet node partitioning, we employ an attribute aware cost function based on the generalized quadric error metric. The resulting algorithm can be applied to multivariate data by storing additional attributes at the graph's nodes. We discuss how the cost computation mechanism can be coupled with the lifting scheme and examine the results by evaluating the root mean square error. The algorithm is experimentally tested using two multivariate LiDAR sets representing terrain surface and vegetation structure with different sampling densities.
Chen, Wenbin; Hendrix, William; Samatova, Nagiza F
2017-12-01
The problem of aligning multiple metabolic pathways is one of very challenging problems in computational biology. A metabolic pathway consists of three types of entities: reactions, compounds, and enzymes. Based on similarities between enzymes, Tohsato et al. gave an algorithm for aligning multiple metabolic pathways. However, the algorithm given by Tohsato et al. neglects the similarities among reactions, compounds, enzymes, and pathway topology. How to design algorithms for the alignment problem of multiple metabolic pathways based on the similarity of reactions, compounds, and enzymes? It is a difficult computational problem. In this article, we propose an algorithm for the problem of aligning multiple metabolic pathways based on the similarities among reactions, compounds, enzymes, and pathway topology. First, we compute a weight between each pair of like entities in different input pathways based on the entities' similarity score and topological structure using Ay et al.'s methods. We then construct a weighted k-partite graph for the reactions, compounds, and enzymes. We extract a mapping between these entities by solving the maximum-weighted k-partite matching problem by applying a novel heuristic algorithm. By analyzing the alignment results of multiple pathways in different organisms, we show that the alignments found by our algorithm correctly identify common subnetworks among multiple pathways.
Protein and gene model inference based on statistical modeling in k-partite graphs.
Gerster, Sarah; Qeli, Ermir; Ahrens, Christian H; Bühlmann, Peter
2010-07-06
One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.
Should Science Be Used to Teach Mathematical Skills?
ERIC Educational Resources Information Center
Kren, Sandra R.; Huntsberger, John P.
1977-01-01
Studies elementary school childrens' abilities in (1) measuring and constructing angles, and (2) interpreting and constructing linear graphs as a result of instructional formats. Partitioned into instructional treatments of (1) science, (2) science-mathematics, (3) mathematics, and (4) control were 161 fourth- and fifth-grade children. Mathematics…
Significant Scales in Community Structure
NASA Astrophysics Data System (ADS)
Traag, V. A.; Krings, G.; van Dooren, P.
2013-10-01
Many complex networks show signs of modular structure, uncovered by community detection. Although many methods succeed in revealing various partitions, it remains difficult to detect at what scale some partition is significant. This problem shows foremost in multi-resolution methods. We here introduce an efficient method for scanning for resolutions in one such method. Additionally, we introduce the notion of ``significance'' of a partition, based on subgraph probabilities. Significance is independent of the exact method used, so could also be applied in other methods, and can be interpreted as the gain in encoding a graph by making use of a partition. Using significance, we can determine ``good'' resolution parameters, which we demonstrate on benchmark networks. Moreover, optimizing significance itself also shows excellent performance. We demonstrate our method on voting data from the European Parliament. Our analysis suggests the European Parliament has become increasingly ideologically divided and that nationality plays no role.
Spectral Upscaling for Graph Laplacian Problems with Application to Reservoir Simulation
Barker, Andrew T.; Lee, Chak S.; Vassilevski, Panayot S.
2017-10-26
Here, we consider coarsening procedures for graph Laplacian problems written in a mixed saddle-point form. In that form, in addition to the original (vertex) degrees of freedom (dofs), we also have edge degrees of freedom. We extend previously developed aggregation-based coarsening procedures applied to both sets of dofs to now allow more than one coarse vertex dof per aggregate. Those dofs are selected as certain eigenvectors of local graph Laplacians associated with each aggregate. Additionally, we coarsen the edge dofs by using traces of the discrete gradients of the already constructed coarse vertex dofs. These traces are defined on themore » interface edges that connect any two adjacent aggregates. The overall procedure is a modification of the spectral upscaling procedure developed in for the mixed finite element discretization of diffusion type PDEs which has the important property of maintaining inf-sup stability on coarse levels and having provable approximation properties. We consider applications to partitioning a general graph and to a finite volume discretization interpreted as a graph Laplacian, developing consistent and accurate coarse-scale models of a fine-scale problem.« less
Local Higher-Order Graph Clustering
Yin, Hao; Benson, Austin R.; Leskovec, Jure; Gleich, David F.
2018-01-01
Local graph clustering methods aim to find a cluster of nodes by exploring a small region of the graph. These methods are attractive because they enable targeted clustering around a given seed node and are faster than traditional global graph clustering methods because their runtime does not depend on the size of the input graph. However, current local graph partitioning methods are not designed to account for the higher-order structures crucial to the network, nor can they effectively handle directed networks. Here we introduce a new class of local graph clustering methods that address these issues by incorporating higher-order network information captured by small subgraphs, also called network motifs. We develop the Motif-based Approximate Personalized PageRank (MAPPR) algorithm that finds clusters containing a seed node with minimal motif conductance, a generalization of the conductance metric for network motifs. We generalize existing theory to prove the fast running time (independent of the size of the graph) and obtain theoretical guarantees on the cluster quality (in terms of motif conductance). We also develop a theory of node neighborhoods for finding sets that have small motif conductance, and apply these results to the case of finding good seed nodes to use as input to the MAPPR algorithm. Experimental validation on community detection tasks in both synthetic and real-world networks, shows that our new framework MAPPR outperforms the current edge-based personalized PageRank methodology. PMID:29770258
Overlapping communities detection based on spectral analysis of line graphs
NASA Astrophysics Data System (ADS)
Gui, Chun; Zhang, Ruisheng; Hu, Rongjing; Huang, Guoming; Wei, Jiaxuan
2018-05-01
Community in networks are often overlapping where one vertex belongs to several clusters. Meanwhile, many networks show hierarchical structure such that community is recursively grouped into hierarchical organization. In order to obtain overlapping communities from a global hierarchy of vertices, a new algorithm (named SAoLG) is proposed to build the hierarchical organization along with detecting the overlap of community structure. SAoLG applies the spectral analysis into line graphs to unify the overlap and hierarchical structure of the communities. In order to avoid the limitation of absolute distance such as Euclidean distance, SAoLG employs Angular distance to compute the similarity between vertices. Furthermore, we make a micro-improvement partition density to evaluate the quality of community structure and use it to obtain the more reasonable and sensible community numbers. The proposed SAoLG algorithm achieves a balance between overlap and hierarchy by applying spectral analysis to edge community detection. The experimental results on one standard network and six real-world networks show that the SAoLG algorithm achieves higher modularity and reasonable community number values than those generated by Ahn's algorithm, the classical CPM and GN ones.
Superpixel Cut for Figure-Ground Image Segmentation
NASA Astrophysics Data System (ADS)
Yang, Michael Ying; Rosenhahn, Bodo
2016-06-01
Figure-ground image segmentation has been a challenging problem in computer vision. Apart from the difficulties in establishing an effective framework to divide the image pixels into meaningful groups, the notions of figure and ground often need to be properly defined by providing either user inputs or object models. In this paper, we propose a novel graph-based segmentation framework, called superpixel cut. The key idea is to formulate foreground segmentation as finding a subset of superpixels that partitions a graph over superpixels. The problem is formulated as Min-Cut. Therefore, we propose a novel cost function that simultaneously minimizes the inter-class similarity while maximizing the intra-class similarity. This cost function is optimized using parametric programming. After a small learning step, our approach is fully automatic and fully bottom-up, which requires no high-level knowledge such as shape priors and scene content. It recovers coherent components of images, providing a set of multiscale hypotheses for high-level reasoning. We evaluate our proposed framework by comparing it to other generic figure-ground segmentation approaches. Our method achieves improved performance on state-of-the-art benchmark databases.
On the star partition dimension of comb product of cycle and path
NASA Astrophysics Data System (ADS)
Alfarisi, Ridho; Darmaji
2017-08-01
Let G = (V, E) be a connected graphs with vertex set V(G), edge set E(G) and S ⊆ V(G). Given an ordered partition Π = {S1, S2, S3, …, Sk} of the vertex set V of G, the representation of a vertex v ∈ V with respect to Π is the vector r(v|Π) = (d(v, S1), d(v, S2), …, d(v, Sk)), where d(v, Sk) represents the distance between the vertex v and the set Sk and d(v, Sk) = min{d(v, x)|x ∈ Sk }. A partition Π of V(G) is a resolving partition if different vertices of G have distinct representations, i.e., for every pair of vertices u, v ∈ V(G), r(u|Π) ≠ r(v|Π). The minimum k of Π resolving partition is a partition dimension of G, denoted by pd(G). The resolving partition Π = {S1, S2, S3, …, Sk } is called a star resolving partition for G if it is a resolving partition and each subgraph induced by Si, 1 ≤ i ≤ k, is a star. The minimum k for which there exists a star resolving partition of V(G) is the star partition dimension of G, denoted by spd(G). Finding the star partition dimension of G is classified to be a NP-Hard problem. In this paper, we will show that the partition dimension of comb product of cycle and path namely Cm⊳Pn and Pn⊳Cm for n ≥ 2 and m ≥ 3.
NASA Technical Reports Server (NTRS)
Drake, Michael J.; Boynton, William V.
1988-01-01
The effect of oxygen fugacity on the partitioning of REEs between hibonite and silicate melt is investigated in hibonite-growth experiments at 1470 C. The experimental procedures and apparatus are described, and the results are presented in extensive tables and graphs and characterized in detail. The absolute activity coefficients in hibonite are estimated as 330 for La, 1200 for Eu(3+), and 24,000 for Yb. It is inferred that ideal solution behavior cannot be assumed when calculating REE condensation temperatures for (Ca, Al)-rich inclusions in carbonaceous chondrites.
Lohse, Christian; Bassett, Danielle S; Lim, Kelvin O; Carlson, Jean M
2014-10-01
Human brain anatomy and function display a combination of modular and hierarchical organization, suggesting the importance of both cohesive structures and variable resolutions in the facilitation of healthy cognitive processes. However, tools to simultaneously probe these features of brain architecture require further development. We propose and apply a set of methods to extract cohesive structures in network representations of brain connectivity using multi-resolution techniques. We employ a combination of soft thresholding, windowed thresholding, and resolution in community detection, that enable us to identify and isolate structures associated with different weights. One such mesoscale structure is bipartivity, which quantifies the extent to which the brain is divided into two partitions with high connectivity between partitions and low connectivity within partitions. A second, complementary mesoscale structure is modularity, which quantifies the extent to which the brain is divided into multiple communities with strong connectivity within each community and weak connectivity between communities. Our methods lead to multi-resolution curves of these network diagnostics over a range of spatial, geometric, and structural scales. For statistical comparison, we contrast our results with those obtained for several benchmark null models. Our work demonstrates that multi-resolution diagnostic curves capture complex organizational profiles in weighted graphs. We apply these methods to the identification of resolution-specific characteristics of healthy weighted graph architecture and altered connectivity profiles in psychiatric disease.
NASA Astrophysics Data System (ADS)
He, Xianjin; Zhang, Xinchang; Xin, Qinchuan
2018-02-01
Recognition of building group patterns (i.e., the arrangement and form exhibited by a collection of buildings at a given mapping scale) is important to the understanding and modeling of geographic space and is hence essential to a wide range of downstream applications such as map generalization. Most of the existing methods develop rigid rules based on the topographic relationships between building pairs to identify building group patterns and thus their applications are often limited. This study proposes a method to identify a variety of building group patterns that allow for map generalization. The method first identifies building group patterns from potential building clusters based on a machine-learning algorithm and further partitions the building clusters with no recognized patterns based on the graph partitioning method. The proposed method is applied to the datasets of three cities that are representative of the complex urban environment in Southern China. Assessment of the results based on the reference data suggests that the proposed method is able to recognize both regular (e.g., the collinear, curvilinear, and rectangular patterns) and irregular (e.g., the L-shaped, H-shaped, and high-density patterns) building group patterns well, given that the correctness values are consistently nearly 90% and the completeness values are all above 91% for three study areas. The proposed method shows promises in automated recognition of building group patterns that allows for map generalization.
NASA Astrophysics Data System (ADS)
Kriz, Igor; Loebl, Martin; Somberg, Petr
2013-05-01
We study various mathematical aspects of discrete models on graphs, specifically the Dimer and the Ising models. We focus on proving gluing formulas for individual summands of the partition function. We also obtain partial results regarding conjectured limits realized by fermions in rational conformal field theories.
S-HARP: A parallel dynamic spectral partitioner
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sohn, A.; Simon, H.
1998-01-01
Computational science problems with adaptive meshes involve dynamic load balancing when implemented on parallel machines. This dynamic load balancing requires fast partitioning of computational meshes at run time. The authors present in this report a fast parallel dynamic partitioner, called S-HARP. The underlying principles of S-HARP are the fast feature of inertial partitioning and the quality feature of spectral partitioning. S-HARP partitions a graph from scratch, requiring no partition information from previous iterations. Two types of parallelism have been exploited in S-HARP, fine grain loop level parallelism and coarse grain recursive parallelism. The parallel partitioner has been implemented in Messagemore » Passing Interface on Cray T3E and IBM SP2 for portability. Experimental results indicate that S-HARP can partition a mesh of over 100,000 vertices into 256 partitions in 0.2 seconds on a 64 processor Cray T3E. S-HARP is much more scalable than other dynamic partitioners, giving over 15 fold speedup on 64 processors while ParaMeTiS1.0 gives a few fold speedup. Experimental results demonstrate that S-HARP is three to 10 times faster than the dynamic partitioners ParaMeTiS and Jostle on six computational meshes of size over 100,000 vertices.« less
a Super Voxel-Based Riemannian Graph for Multi Scale Segmentation of LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Li, Minglei
2018-04-01
Automatically segmenting LiDAR points into respective independent partitions has become a topic of great importance in photogrammetry, remote sensing and computer vision. In this paper, we cast the problem of point cloud segmentation as a graph optimization problem by constructing a Riemannian graph. The scale space of the observed scene is explored by an octree-based over-segmentation with different depths. The over-segmentation produces many super voxels which restrict the structure of the scene and will be used as nodes of the graph. The Kruskal coordinates are used to compute edge weights that are proportional to the geodesic distance between nodes. Then we compute the edge-weight matrix in which the elements reflect the sectional curvatures associated with the geodesic paths between super voxel nodes on the scene surface. The final segmentation results are generated by clustering similar super voxels and cutting off the weak edges in the graph. The performance of this method was evaluated on LiDAR point clouds for both indoor and outdoor scenes. Additionally, extensive comparisons to state of the art techniques show that our algorithm outperforms on many metrics.
Exactly solved models on planar graphs with vertices in {Z}^3
NASA Astrophysics Data System (ADS)
Kels, Andrew P.
2017-12-01
It is shown how exactly solved edge interaction models on the square lattice, may be extended onto more general planar graphs, with edges connecting a subset of next nearest neighbour vertices of {Z}3 . This is done by using local deformations of the square lattice, that arise through the use of the star-triangle relation. Similar to Baxter’s Z-invariance property, these local deformations leave the partition function invariant up to some simple factors coming from the star-triangle relation. The deformations used here extend the usual formulation of Z-invariance, by requiring the introduction of oriented rapidity lines which form directed closed paths in the rapidity graph of the model. The quasi-classical limit is also considered, in which case the deformations imply a classical Z-invariance property, as well as a related local closure relation, for the action functional of a system of classical discrete Laplace equations.
FPFH-based graph matching for 3D point cloud registration
NASA Astrophysics Data System (ADS)
Zhao, Jiapeng; Li, Chen; Tian, Lihua; Zhu, Jihua
2018-04-01
Correspondence detection is a vital step in point cloud registration and it can help getting a reliable initial alignment. In this paper, we put forward an advanced point feature-based graph matching algorithm to solve the initial alignment problem of rigid 3D point cloud registration with partial overlap. Specifically, Fast Point Feature Histograms are used to determine the initial possible correspondences firstly. Next, a new objective function is provided to make the graph matching more suitable for partially overlapping point cloud. The objective function is optimized by the simulated annealing algorithm for final group of correct correspondences. Finally, we present a novel set partitioning method which can transform the NP-hard optimization problem into a O(n3)-solvable one. Experiments on the Stanford and UWA public data sets indicates that our method can obtain better result in terms of both accuracy and time cost compared with other point cloud registration methods.
Finding minimum-quotient cuts in planar graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, J.K.; Phillips, C.A.
Given a graph G = (V, E) where each vertex v {element_of} V is assigned a weight w(v) and each edge e {element_of} E is assigned a cost c(e), the quotient of a cut partitioning the vertices of V into sets S and {bar S} is c(S, {bar S})/min{l_brace}w(S), w(S){r_brace}, where c(S, {bar S}) is the sum of the costs of the edges crossing the cut and w(S) and w({bar S}) are the sum of the weights of the vertices in S and {bar S}, respectively. The problem of finding a cut whose quotient is minimum for a graph hasmore » in recent years attracted considerable attention, due in large part to the work of Rao and Leighton and Rao. They have shown that an algorithm (exact or approximation) for the minimum-quotient-cut problem can be used to obtain an approximation algorithm for the more famous minimumb-balanced-cut problem, which requires finding a cut (S,{bar S}) minimizing c(S,{bar S}) subject to the constraint bW {le} w(S) {le} (1 {minus} b)W, where W is the total vertex weight and b is some fixed balance in the range 0 < b {le} {1/2}. Unfortunately, the minimum-quotient-cut problem is strongly NP-hard for general graphs, and the best polynomial-time approximation algorithm known for the general problem guarantees only a cut whose quotient is at mostO(lg n) times optimal, where n is the size of the graph. However, for planar graphs, the minimum-quotient-cut problem appears more tractable, as Rao has developed several efficient approximation algorithms for the planar version of the problem capable of finding a cut whose quotient is at most some constant times optimal. In this paper, we improve Rao`s algorithms, both in terms of accuracy and speed. As our first result, we present two pseudopolynomial-time exact algorithms for the planar minimum-quotient-cut problem. As Rao`s most accurate approximation algorithm for the problem -- also a pseudopolynomial-time algorithm -- guarantees only a 1.5-times-optimal cut, our algorithms represent a significant advance.« less
Finding minimum-quotient cuts in planar graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, J.K.; Phillips, C.A.
Given a graph G = (V, E) where each vertex v [element of] V is assigned a weight w(v) and each edge e [element of] E is assigned a cost c(e), the quotient of a cut partitioning the vertices of V into sets S and [bar S] is c(S, [bar S])/min[l brace]w(S), w(S)[r brace], where c(S, [bar S]) is the sum of the costs of the edges crossing the cut and w(S) and w([bar S]) are the sum of the weights of the vertices in S and [bar S], respectively. The problem of finding a cut whose quotient is minimummore » for a graph has in recent years attracted considerable attention, due in large part to the work of Rao and Leighton and Rao. They have shown that an algorithm (exact or approximation) for the minimum-quotient-cut problem can be used to obtain an approximation algorithm for the more famous minimumb-balanced-cut problem, which requires finding a cut (S,[bar S]) minimizing c(S,[bar S]) subject to the constraint bW [le] w(S) [le] (1 [minus] b)W, where W is the total vertex weight and b is some fixed balance in the range 0 < b [le] [1/2]. Unfortunately, the minimum-quotient-cut problem is strongly NP-hard for general graphs, and the best polynomial-time approximation algorithm known for the general problem guarantees only a cut whose quotient is at mostO(lg n) times optimal, where n is the size of the graph. However, for planar graphs, the minimum-quotient-cut problem appears more tractable, as Rao has developed several efficient approximation algorithms for the planar version of the problem capable of finding a cut whose quotient is at most some constant times optimal. In this paper, we improve Rao's algorithms, both in terms of accuracy and speed. As our first result, we present two pseudopolynomial-time exact algorithms for the planar minimum-quotient-cut problem. As Rao's most accurate approximation algorithm for the problem -- also a pseudopolynomial-time algorithm -- guarantees only a 1.5-times-optimal cut, our algorithms represent a significant advance.« less
Discrete geometric analysis of message passing algorithm on graphs
NASA Astrophysics Data System (ADS)
Watanabe, Yusuke
2010-04-01
We often encounter probability distributions given as unnormalized products of non-negative functions. The factorization structures are represented by hypergraphs called factor graphs. Such distributions appear in various fields, including statistics, artificial intelligence, statistical physics, error correcting codes, etc. Given such a distribution, computations of marginal distributions and the normalization constant are often required. However, they are computationally intractable because of their computational costs. One successful approximation method is Loopy Belief Propagation (LBP) algorithm. The focus of this thesis is an analysis of the LBP algorithm. If the factor graph is a tree, i.e. having no cycle, the algorithm gives the exact quantities. If the factor graph has cycles, however, the LBP algorithm does not give exact results and possibly exhibits oscillatory and non-convergent behaviors. The thematic question of this thesis is "How the behaviors of the LBP algorithm are affected by the discrete geometry of the factor graph?" The primary contribution of this thesis is the discovery of a formula that establishes the relation between the LBP, the Bethe free energy and the graph zeta function. This formula provides new techniques for analysis of the LBP algorithm, connecting properties of the graph and of the LBP and the Bethe free energy. We demonstrate applications of the techniques to several problems including (non) convexity of the Bethe free energy, the uniqueness and stability of the LBP fixed point. We also discuss the loop series initiated by Chertkov and Chernyak. The loop series is a subgraph expansion of the normalization constant, or partition function, and reflects the graph geometry. We investigate theoretical natures of the series. Moreover, we show a partial connection between the loop series and the graph zeta function.
Some cycle-supermagic labelings of the calendula graphs
NASA Astrophysics Data System (ADS)
Pradipta, T. R.; Salman, A. N. M.
2018-01-01
In this paper, we introduce a calendula graph, denoted by Clm,n . It is a graph constructed from a cycle on m vertices Cm and m copies of Cn which are Cn1 , Cn2 , ⋯, Cnm and grafting the i-th edge of Cm to an edge of in Cni for each i ∈ {1,2,⋯,m}. A graph G = (V, E) admits a Cn -covering, if every edge e ∈ E(G) belongs to a subgraph of G isomorphic to Cn . The graph G is called cycle-magic, if there exists a total labeling ϕ: V ∪ E → {1,2,…,|V|+|E|} such that for every subgraph Cn ‧ = (V‧,E‧) of G isomorphic to Cn has the same weight. In this case, the weight of Cn , denoted by ϕ(Cn ’), is defined as ∑ v∈V(C’n ) ϕ(v) + ∑ e∈E(C’n ) ϕ(e). Furthermore, G is called cycle-supermagic, if ϕ:V→{1,2,…,|V|}. In this paper, we provide some cycle-supermagic labelings of calendula graphs. In order to prove it, we develop a technique, to make a partition of a multiset into m sub-multisets with the same cardinality such that the sum of all elements of each sub-multiset is same. The technique is called an m-balanced multiset.
On spatial coalescents with multiple mergers in two dimensions.
Heuer, Benjamin; Sturm, Anja
2013-08-01
We consider the genealogy of a sample of individuals taken from a spatially structured population when the variance of the offspring distribution is relatively large. The space is structured into discrete sites of a graph G. If the population size at each site is large, spatial coalescents with multiple mergers, so called spatial Λ-coalescents, for which ancestral lines migrate in space and coalesce according to some Λ-coalescent mechanism, are shown to be appropriate approximations to the genealogy of a sample of individuals. We then consider as the graph G the two dimensional torus with side length 2L+1 and show that as L tends to infinity, and time is rescaled appropriately, the partition structure of spatial Λ-coalescents of individuals sampled far enough apart converges to the partition structure of a non-spatial Kingman coalescent. From a biological point of view this means that in certain circumstances both the spatial structure as well as larger variances of the underlying offspring distribution are harder to detect from the sample. However, supplemental simulations show that for moderately large L the different structure is still evident. Copyright © 2012 Elsevier Inc. All rights reserved.
Experimental quantum annealing: case study involving the graph isomorphism problem.
Zick, Kenneth M; Shehab, Omar; French, Matthew
2015-06-08
Quantum annealing is a proposed combinatorial optimization technique meant to exploit quantum mechanical effects such as tunneling and entanglement. Real-world quantum annealing-based solvers require a combination of annealing and classical pre- and post-processing; at this early stage, little is known about how to partition and optimize the processing. This article presents an experimental case study of quantum annealing and some of the factors involved in real-world solvers, using a 504-qubit D-Wave Two machine and the graph isomorphism problem. To illustrate the role of classical pre-processing, a compact Hamiltonian is presented that enables a reduced Ising model for each problem instance. On random N-vertex graphs, the median number of variables is reduced from N(2) to fewer than N log2 N and solvable graph sizes increase from N = 5 to N = 13. Additionally, error correction via classical post-processing majority voting is evaluated. While the solution times are not competitive with classical approaches to graph isomorphism, the enhanced solver ultimately classified correctly every problem that was mapped to the processor and demonstrated clear advantages over the baseline approach. The results shed some light on the nature of real-world quantum annealing and the associated hybrid classical-quantum solvers.
Experimental quantum annealing: case study involving the graph isomorphism problem
Zick, Kenneth M.; Shehab, Omar; French, Matthew
2015-01-01
Quantum annealing is a proposed combinatorial optimization technique meant to exploit quantum mechanical effects such as tunneling and entanglement. Real-world quantum annealing-based solvers require a combination of annealing and classical pre- and post-processing; at this early stage, little is known about how to partition and optimize the processing. This article presents an experimental case study of quantum annealing and some of the factors involved in real-world solvers, using a 504-qubit D-Wave Two machine and the graph isomorphism problem. To illustrate the role of classical pre-processing, a compact Hamiltonian is presented that enables a reduced Ising model for each problem instance. On random N-vertex graphs, the median number of variables is reduced from N2 to fewer than N log2 N and solvable graph sizes increase from N = 5 to N = 13. Additionally, error correction via classical post-processing majority voting is evaluated. While the solution times are not competitive with classical approaches to graph isomorphism, the enhanced solver ultimately classified correctly every problem that was mapped to the processor and demonstrated clear advantages over the baseline approach. The results shed some light on the nature of real-world quantum annealing and the associated hybrid classical-quantum solvers. PMID:26053973
In-Memory Graph Databases for Web-Scale Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Morari, Alessandro; Weaver, Jesse R.
RDF databases have emerged as one of the most relevant way for organizing, integrating, and managing expo- nentially growing, often heterogeneous, and not rigidly structured data for a variety of scientific and commercial fields. In this paper we discuss the solutions integrated in GEMS (Graph database Engine for Multithreaded Systems), a software framework for implementing RDF databases on commodity, distributed-memory high-performance clusters. Unlike the majority of current RDF databases, GEMS has been designed from the ground up to primarily employ graph-based methods. This is reflected in all the layers of its stack. The GEMS framework is composed of: a SPARQL-to-C++more » compiler, a library of data structures and related methods to access and modify them, and a custom runtime providing lightweight software multithreading, network messages aggregation and a partitioned global address space. We provide an overview of the framework, detailing its component and how they have been closely designed and customized to address issues of graph methods applied to large-scale datasets on clusters. We discuss in details the principles that enable automatic translation of the queries (expressed in SPARQL, the query language of choice for RDF databases) to graph methods, and identify differences with respect to other RDF databases.« less
NASA Astrophysics Data System (ADS)
Vivoni, Enrique R.; Mascaro, Giuseppe; Mniszewski, Susan; Fasel, Patricia; Springer, Everett P.; Ivanov, Valeriy Y.; Bras, Rafael L.
2011-10-01
SummaryA major challenge in the use of fully-distributed hydrologic models has been the lack of computational capabilities for high-resolution, long-term simulations in large river basins. In this study, we present the parallel model implementation and real-world hydrologic assessment of the Triangulated Irregular Network (TIN)-based Real-time Integrated Basin Simulator (tRIBS). Our parallelization approach is based on the decomposition of a complex watershed using the channel network as a directed graph. The resulting sub-basin partitioning divides effort among processors and handles hydrologic exchanges across boundaries. Through numerical experiments in a set of nested basins, we quantify parallel performance relative to serial runs for a range of processors, simulation complexities and lengths, and sub-basin partitioning methods, while accounting for inter-run variability on a parallel computing system. In contrast to serial simulations, the parallel model speed-up depends on the variability of hydrologic processes. Load balancing significantly improves parallel speed-up with proportionally faster runs as simulation complexity (domain resolution and channel network extent) increases. The best strategy for large river basins is to combine a balanced partitioning with an extended channel network, with potential savings through a lower TIN resolution. Based on these advances, a wider range of applications for fully-distributed hydrologic models are now possible. This is illustrated through a set of ensemble forecasts that account for precipitation uncertainty derived from a statistical downscaling model.
Attribute-based Decision Graphs: A framework for multiclass data classification.
Bertini, João Roberto; Nicoletti, Maria do Carmo; Zhao, Liang
2017-01-01
Graph-based algorithms have been successfully applied in machine learning and data mining tasks. A simple but, widely used, approach to build graphs from vector-based data is to consider each data instance as a vertex and connecting pairs of it using a similarity measure. Although this abstraction presents some advantages, such as arbitrary shape representation of the original data, it is still tied to some drawbacks, for example, it is dependent on the choice of a pre-defined distance metric and is biased by the local information among data instances. Aiming at exploring alternative ways to build graphs from data, this paper proposes an algorithm for constructing a new type of graph, called Attribute-based Decision Graph-AbDG. Given a vector-based data set, an AbDG is built by partitioning each data attribute range into disjoint intervals and representing each interval as a vertex. The edges are then established between vertices from different attributes according to a pre-defined pattern. Classification is performed through a matching process among the attribute values of the new instance and AbDG. Moreover, AbDG provides an inner mechanism to handle missing attribute values, which contributes for expanding its applicability. Results of classification tasks have shown that AbDG is a competitive approach when compared to well-known multiclass algorithms. The main contribution of the proposed framework is the combination of the advantages of attribute-based and graph-based techniques to perform robust pattern matching data classification, while permitting the analysis the input data considering only a subset of its attributes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Super (a*, d*)-ℋ-antimagic total covering of second order of shackle graphs
NASA Astrophysics Data System (ADS)
Hesti Agustin, Ika; Dafik; Nisviasari, Rosanita; Prihandini, R. M.
2017-12-01
Let H be a simple and connected graph. A shackle of graph H, denoted by G = shack(H, v, n), is a graph G constructed by non-trivial graphs H 1, H 2, …, H n such that, for every 1 ≤ s, t ≤ n, H s and Ht have no a common vertex with |s - t| ≥ 2 and for every 1 ≤ i ≤ n - 1, Hi and H i+1 share exactly one common vertex v, called connecting vertex, and those k - 1 connecting vertices are all distinct. The graph G is said to be an (a*, d*)-H-antimagic total graph of second order if there exist a bijective function f : V(G) ∪ E(G) → {1, 2, …, |V(G)| + |E(G)|} such that for all subgraphs isomorphic to H, the total H-weights W(H)=\\displaystyle {\\sum }v\\in V(H)f(v)+\\displaystyle {\\sum }e\\in E(H)f(e) form an arithmetic sequence of second order of \\{a* ,a* +d* ,a* +3d* ,a* +6d* ,\\ldots ,a* +(\\frac{{n}2-n}{2})d* \\}, where a* and d* are positive integers and n is the number of all subgraphs isomorphic to H. An (a*, d*)-H-antimagic total labeling of second order f is called super if the smallest labels appear in the vertices. In this paper, we study a super (a*, d*)-H antimagic total labeling of second order of G = shack(H, v, n) by using a partition technique of second order.
Enhanced Vaccine Control of Epidemics in Adaptive Networks
2010-04-29
than random vaccination 32. When vaccine is very limited, outbreaks can be minimized by fragmenting the network via a graph partitioning strategy...Although an outbreak could transiently decrease network connectivity in a real social network, long term reductions in average connectivity are...37, cholera and typhoid fever 35, as well as pertussis 38. Although not all of the above diseases currently possess a vaccine, research is ongo
Spadafore, Maxwell; Najarian, Kayvan; Boyle, Alan P
2017-11-29
Transcription factors (TFs) form a complex regulatory network within the cell that is crucial to cell functioning and human health. While methods to establish where a TF binds to DNA are well established, these methods provide no information describing how TFs interact with one another when they do bind. TFs tend to bind the genome in clusters, and current methods to identify these clusters are either limited in scope, unable to detect relationships beyond motif similarity, or not applied to TF-TF interactions. Here, we present a proximity-based graph clustering approach to identify TF clusters using either ChIP-seq or motif search data. We use TF co-occurrence to construct a filtered, normalized adjacency matrix and use the Markov Clustering Algorithm to partition the graph while maintaining TF-cluster and cluster-cluster interactions. We then apply our graph structure beyond clustering, using it to increase the accuracy of motif-based TFBS searching for an example TF. We show that our method produces small, manageable clusters that encapsulate many known, experimentally validated transcription factor interactions and that our method is capable of capturing interactions that motif similarity methods might miss. Our graph structure is able to significantly increase the accuracy of motif TFBS searching, demonstrating that the TF-TF connections within the graph correlate with biological TF-TF interactions. The interactions identified by our method correspond to biological reality and allow for fast exploration of TF clustering and regulatory dynamics.
Wang, Yang; Wu, Lin
2018-07-01
Low-Rank Representation (LRR) is arguably one of the most powerful paradigms for Multi-view spectral clustering, which elegantly encodes the multi-view local graph/manifold structures into an intrinsic low-rank self-expressive data similarity embedded in high-dimensional space, to yield a better graph partition than their single-view counterparts. In this paper we revisit it with a fundamentally different perspective by discovering LRR as essentially a latent clustered orthogonal projection based representation winged with an optimized local graph structure for spectral clustering; each column of the representation is fundamentally a cluster basis orthogonal to others to indicate its members, which intuitively projects the view-specific feature representation to be the one spanned by all orthogonal basis to characterize the cluster structures. Upon this finding, we propose our technique with the following: (1) We decompose LRR into latent clustered orthogonal representation via low-rank matrix factorization, to encode the more flexible cluster structures than LRR over primal data objects; (2) We convert the problem of LRR into that of simultaneously learning orthogonal clustered representation and optimized local graph structure for each view; (3) The learned orthogonal clustered representations and local graph structures enjoy the same magnitude for multi-view, so that the ideal multi-view consensus can be readily achieved. The experiments over multi-view datasets validate its superiority, especially over recent state-of-the-art LRR models. Copyright © 2018 Elsevier Ltd. All rights reserved.
Complete graph model for community detection
NASA Astrophysics Data System (ADS)
Sun, Peng Gang; Sun, Xiya
2017-04-01
Community detection brings plenty of considerable problems, which has attracted more attention for many years. This paper develops a new framework, which tries to measure the interior and the exterior of a community based on a same metric, complete graph model. In particular, the exterior is modeled as a complete bipartite. We partition a network into subnetworks by maximizing the difference between the interior and the exterior of the subnetworks. In addition, we compare our approach with some state of the art methods on computer-generated networks based on the LFR benchmark as well as real-world networks. The experimental results indicate that our approach obtains better results for community detection, is capable of splitting irregular networks and achieves perfect results on the karate network and the dolphin network.
Leveraging unsupervised training sets for multi-scale compartmentalization in renal pathology
NASA Astrophysics Data System (ADS)
Lutnick, Brendon; Tomaszewski, John E.; Sarder, Pinaki
2017-03-01
Clinical pathology relies on manual compartmentalization and quantification of biological structures, which is time consuming and often error-prone. Application of computer vision segmentation algorithms to histopathological image analysis, in contrast, can offer fast, reproducible, and accurate quantitative analysis to aid pathologists. Algorithms tunable to different biologically relevant structures can allow accurate, precise, and reproducible estimates of disease states. In this direction, we have developed a fast, unsupervised computational method for simultaneously separating all biologically relevant structures from histopathological images in multi-scale. Segmentation is achieved by solving an energy optimization problem. Representing the image as a graph, nodes (pixels) are grouped by minimizing a Potts model Hamiltonian, adopted from theoretical physics, modeling interacting electron spins. Pixel relationships (modeled as edges) are used to update the energy of the partitioned graph. By iteratively improving the clustering, the optimal number of segments is revealed. To reduce computational time, the graph is simplified using a Cantor pairing function to intelligently reduce the number of included nodes. The classified nodes are then used to train a multiclass support vector machine to apply the segmentation over the full image. Accurate segmentations of images with as many as 106 pixels can be completed only in 5 sec, allowing for attainable multi-scale visualization. To establish clinical potential, we employed our method in renal biopsies to quantitatively visualize for the first time scale variant compartments of heterogeneous intra- and extraglomerular structures simultaneously. Implications of the utility of our method extend to fields such as oncology, genomics, and non-biological problems.
Analyzing locomotion synthesis with feature-based motion graphs.
Mahmudi, Mentar; Kallmann, Marcelo
2013-05-01
We propose feature-based motion graphs for realistic locomotion synthesis among obstacles. Among several advantages, feature-based motion graphs achieve improved results in search queries, eliminate the need of postprocessing for foot skating removal, and reduce the computational requirements in comparison to traditional motion graphs. Our contributions are threefold. First, we show that choosing transitions based on relevant features significantly reduces graph construction time and leads to improved search performances. Second, we employ a fast channel search method that confines the motion graph search to a free channel with guaranteed clearance among obstacles, achieving faster and improved results that avoid expensive collision checking. Lastly, we present a motion deformation model based on Inverse Kinematics applied over the transitions of a solution branch. Each transition is assigned a continuous deformation range that does not exceed the original transition cost threshold specified by the user for the graph construction. The obtained deformation improves the reachability of the feature-based motion graph and in turn also reduces the time spent during search. The results obtained by the proposed methods are evaluated and quantified, and they demonstrate significant improvements in comparison to traditional motion graph techniques.
Network immunization under limited budget using graph spectra
NASA Astrophysics Data System (ADS)
Zahedi, R.; Khansari, M.
2016-03-01
In this paper, we propose a new algorithm that minimizes the worst expected growth of an epidemic by reducing the size of the largest connected component (LCC) of the underlying contact network. The proposed algorithm is applicable to any level of available resources and, despite the greedy approaches of most immunization strategies, selects nodes simultaneously. In each iteration, the proposed method partitions the LCC into two groups. These are the best candidates for communities in that component, and the available resources are sufficient to separate them. Using Laplacian spectral partitioning, the proposed method performs community detection inference with a time complexity that rivals that of the best previous methods. Experiments show that our method outperforms targeted immunization approaches in both real and synthetic networks.
Extremal Optimization: Methods Derived from Co-Evolution
DOE Office of Scientific and Technical Information (OSTI.GOV)
Boettcher, S.; Percus, A.G.
1999-07-13
We describe a general-purpose method for finding high-quality solutions to hard optimization problems, inspired by self-organized critical models of co-evolution such as the Bak-Sneppen model. The method, called Extremal Optimization, successively eliminates extremely undesirable components of sub-optimal solutions, rather than ''breeding'' better components. In contrast to Genetic Algorithms which operate on an entire ''gene-pool'' of possible solutions, Extremal Optimization improves on a single candidate solution by treating each of its components as species co-evolving according to Darwinian principles. Unlike Simulated Annealing, its non-equilibrium approach effects an algorithm requiring few parameters to tune. With only one adjustable parameter, its performance provesmore » competitive with, and often superior to, more elaborate stochastic optimization procedures. We demonstrate it here on two classic hard optimization problems: graph partitioning and the traveling salesman problem.« less
Mapping Computation with No Memory
NASA Astrophysics Data System (ADS)
Burckel, Serge; Gioan, Emeric; Thomé, Emmanuel
We investigate the computation of mappings from a set S n to itself with in situ programs, that is using no extra variables than the input, and performing modifications of one component at a time. We consider several types of mappings and obtain effective computation and decomposition methods, together with upper bounds on the program length (number of assignments). Our technique is combinatorial and algebraic (graph coloration, partition ordering, modular arithmetics).
An integer programming formulation of the parsimonious loss of heterozygosity problem.
Catanzaro, Daniele; Labbé, Martine; Halldórsson, Bjarni V
2013-01-01
A loss of heterozygosity (LOH) event occurs when, by the laws of Mendelian inheritance, an individual should be heterozygote at a given site but, due to a deletion polymorphism, is not. Deletions play an important role in human disease and their detection could provide fundamental insights for the development of new diagnostics and treatments. In this paper, we investigate the parsimonious loss of heterozygosity problem (PLOHP), i.e., the problem of partitioning suspected polymorphisms from a set of individuals into a minimum number of deletion areas. Specifically, we generalize Halldórsson et al.'s work by providing a more general formulation of the PLOHP and by showing how one can incorporate different recombination rates and prior knowledge about the locations of deletions. Moreover, we show that the PLOHP can be formulated as a specific version of the clique partition problem in a particular class of graphs called undirected catch-point interval graphs and we prove its general $({\\cal NP})$-hardness. Finally, we provide a state-of-the-art integer programming (IP) formulation and strengthening valid inequalities to exactly solve real instances of the PLOHP containing up to 9,000 individuals and 3,000 SNPs. Our results give perspectives on the mathematics of the PLOHP and suggest new directions on the development of future efficient exact solution approaches.
Equal Graph Partitioning on Estimated Infection Network as an Effective Epidemic Mitigation Measure
Hadidjojo, Jeremy; Cheong, Siew Ann
2011-01-01
Controlling severe outbreaks remains the most important problem in infectious disease area. With time, this problem will only become more severe as population density in urban centers grows. Social interactions play a very important role in determining how infectious diseases spread, and organization of people along social lines gives rise to non-spatial networks in which the infections spread. Infection networks are different for diseases with different transmission modes, but are likely to be identical or highly similar for diseases that spread the same way. Hence, infection networks estimated from common infections can be useful to contain epidemics of a more severe disease with the same transmission mode. Here we present a proof-of-concept study demonstrating the effectiveness of epidemic mitigation based on such estimated infection networks. We first generate artificial social networks of different sizes and average degrees, but with roughly the same clustering characteristic. We then start SIR epidemics on these networks, censor the simulated incidences, and use them to reconstruct the infection network. We then efficiently fragment the estimated network by removing the smallest number of nodes identified by a graph partitioning algorithm. Finally, we demonstrate the effectiveness of this targeted strategy, by comparing it against traditional untargeted strategies, in slowing down and reducing the size of advancing epidemics. PMID:21799777
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demeure, I.M.
The research presented here is concerned with representation techniques and tools to support the design, prototyping, simulation, and evaluation of message-based parallel, distributed computations. The author describes ParaDiGM-Parallel, Distributed computation Graph Model-a visual representation technique for parallel, message-based distributed computations. ParaDiGM provides several views of a computation depending on the aspect of concern. It is made of two complementary submodels, the DCPG-Distributed Computing Precedence Graph-model, and the PAM-Process Architecture Model-model. DCPGs are precedence graphs used to express the functionality of a computation in terms of tasks, message-passing, and data. PAM graphs are used to represent the partitioning of a computationmore » into schedulable units or processes, and the pattern of communication among those units. There is a natural mapping between the two models. He illustrates the utility of ParaDiGM as a representation technique by applying it to various computations (e.g., an adaptive global optimization algorithm, the client-server model). ParaDiGM representations are concise. They can be used in documenting the design and the implementation of parallel, distributed computations, in describing such computations to colleagues, and in comparing and contrasting various implementations of the same computation. He then describes VISA-VISual Assistant, a software tool to support the design, prototyping, and simulation of message-based parallel, distributed computations. VISA is based on the ParaDiGM model. In particular, it supports the editing of ParaDiGM graphs to describe the computations of interest, and the animation of these graphs to provide visual feedback during simulations. The graphs are supplemented with various attributes, simulation parameters, and interpretations which are procedures that can be executed by VISA.« less
Identification of structural domains in proteins by a graph heuristic.
Wernisch, L; Hunting, M; Wodak, S J
1999-05-15
A novel automatic procedure for identifying domains from protein atomic coordinates is presented. The procedure, termed STRUDL (STRUctural Domain Limits), does not take into account information on secondary structures and handles any number of domains made up of contiguous or non-contiguous chain segments. The core algorithm uses the Kernighan-Lin graph heuristic to partition the protein into residue sets which display minimum interactions between them. These interactions are deduced from the weighted Voronoi diagram. The generated partitions are accepted or rejected on the basis of optimized criteria, representing basic expected physical properties of structural domains. The graph heuristic approach is shown to be very effective, it approximates closely the exact solution provided by a branch and bound algorithm for a number of test proteins. In addition, the overall performance of STRUDL is assessed on a set of 787 representative proteins from the Protein Data Bank by comparison to domain definitions in the CATH protein classification. The domains assigned by STRUDL agree with the CATH assignments in at least 81% of the tested proteins. This result is comparable to that obtained previously using PUU (Holm and Sander, Proteins 1994;9:256-268), the only other available algorithm designed to identify domains with any number of non-contiguous chain segments. A detailed discussion of the structures for which our assignments differ from those in CATH brings to light some clear inconsistencies between the concept of structural domains based on minimizing inter-domain interactions and that of delimiting structural motifs that represent acceptable folding topologies or architectures. Considering both concepts as complementary and combining them in a layered approach might be the way forward.
Zhang, Pan; Moore, Cristopher
2014-01-01
Modularity is a popular measure of community structure. However, maximizing the modularity can lead to many competing partitions, with almost the same modularity, that are poorly correlated with each other. It can also produce illusory ‘‘communities’’ in random graphs where none exist. We address this problem by using the modularity as a Hamiltonian at finite temperature and using an efficient belief propagation algorithm to obtain the consensus of many partitions with high modularity, rather than looking for a single partition that maximizes it. We show analytically and numerically that the proposed algorithm works all of the way down to the detectability transition in networks generated by the stochastic block model. It also performs well on real-world networks, revealing large communities in some networks where previous work has claimed no communities exist. Finally we show that by applying our algorithm recursively, subdividing communities until no statistically significant subcommunities can be found, we can detect hierarchical structure in real-world networks more efficiently than previous methods. PMID:25489096
A brief history of partitions of numbers, partition functions and their modern applications
NASA Astrophysics Data System (ADS)
Debnath, Lokenath
2016-04-01
K-Partite RNA Secondary Structures
NASA Astrophysics Data System (ADS)
Jiang, Minghui; Tejada, Pedro J.; Lasisi, Ramoni O.; Cheng, Shanhong; Fechser, D. Scott
RNA secondary structure prediction is a fundamental problem in structural bioinformatics. The prediction problem is difficult because RNA secondary structures may contain pseudoknots formed by crossing base pairs. We introduce k-partite secondary structures as a simple classification of RNA secondary structures with pseudoknots. An RNA secondary structure is k-partite if it is the union of k pseudoknot-free sub-structures. Most known RNA secondary structures are either bipartite or tripartite. We show that there exists a constant number k such that any secondary structure can be modified into a k-partite secondary structure with approximately the same free energy. This offers a partial explanation of the prevalence of k-partite secondary structures with small k. We give a complete characterization of the computational complexities of recognizing k-partite secondary structures for all k ≥ 2, and show that this recognition problem is essentially the same as the k-colorability problem on circle graphs. We present two simple heuristics, iterated peeling and first-fit packing, for finding k-partite RNA secondary structures. For maximizing the number of base pair stackings, our iterated peeling heuristic achieves a constant approximation ratio of at most k for 2 ≤ k ≤ 5, and at most frac6{1-(1-6/k)^k} le frac6{1-e^{-6}} < 6.01491 for k ≥ 6. Experiment on sequences from PseudoBase shows that our first-fit packing heuristic outperforms the leading method HotKnots in predicting RNA secondary structures with pseudoknots. Source code, data set, and experimental results are available at
Resolution of ranking hierarchies in directed networks.
Letizia, Elisa; Barucca, Paolo; Lillo, Fabrizio
2018-01-01
Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit.
Resolution of ranking hierarchies in directed networks
Barucca, Paolo; Lillo, Fabrizio
2018-01-01
Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit. PMID:29394278
Folksonomies and clustering in the collaborative system CiteULike
NASA Astrophysics Data System (ADS)
Capocci, Andrea; Caldarelli, Guido
2008-06-01
We analyze CiteULike, an online collaborative tagging system where users bookmark and annotate scientific papers. Such a system can be naturally represented as a tri-partite graph whose nodes represent papers, users and tags connected by individual tag assignments. The semantics of tags is studied here, in order to uncover the hidden relationships between tags. We find that the clustering coefficient can be used to analyze the semantical patterns among tags.
Man-made objects cuing in satellite imagery
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skurikhin, Alexei N
2009-01-01
We present a multi-scale framework for man-made structures cuing in satellite image regions. The approach is based on a hierarchical image segmentation followed by structural analysis. A hierarchical segmentation produces an image pyramid that contains a stack of irregular image partitions, represented as polygonized pixel patches, of successively reduced levels of detail (LOOs). We are jumping off from the over-segmented image represented by polygons attributed with spectral and texture information. The image is represented as a proximity graph with vertices corresponding to the polygons and edges reflecting polygon relations. This is followed by the iterative graph contraction based on Boruvka'smore » Minimum Spanning Tree (MST) construction algorithm. The graph contractions merge the patches based on their pairwise spectral and texture differences. Concurrently with the construction of the irregular image pyramid, structural analysis is done on the agglomerated patches. Man-made object cuing is based on the analysis of shape properties of the constructed patches and their spatial relations. The presented framework can be used as pre-scanning tool for wide area monitoring to quickly guide the further analysis to regions of interest.« less
Mutual proximity graphs for improved reachability in music recommendation.
Flexer, Arthur; Stevens, Jeff
2018-01-01
This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness.
Mutual proximity graphs for improved reachability in music recommendation
Flexer, Arthur; Stevens, Jeff
2018-01-01
This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness. PMID:29348779
Segmentation and classification of cell cycle phases in fluorescence imaging.
Ersoy, Ilker; Bunyak, Filiz; Chagin, Vadim; Cardoso, M Christina; Palaniappan, Kannappan
2009-01-01
Current chemical biology methods for studying spatiotemporal correlation between biochemical networks and cell cycle phase progression in live-cells typically use fluorescence-based imaging of fusion proteins. Stable cell lines expressing fluorescently tagged protein GFP-PCNA produce rich, dynamically varying sub-cellular foci patterns characterizing the cell cycle phases, including the progress during the S-phase. Variable fluorescence patterns, drastic changes in SNR, shape and position changes and abundance of touching cells require sophisticated algorithms for reliable automatic segmentation and cell cycle classification. We extend the recently proposed graph partitioning active contours (GPAC) for fluorescence-based nucleus segmentation using regional density functions and dramatically improve its efficiency, making it scalable for high content microscopy imaging. We utilize surface shape properties of GFP-PCNA intensity field to obtain descriptors of foci patterns and perform automated cell cycle phase classification, and give quantitative performance by comparing our results to manually labeled data.
Bunyak, Filiz; Palaniappan, Kannappan; Chagin, Vadim; Cardoso, M
2009-01-01
Fluorescently tagged proteins such as GFP-PCNA produce rich dynamically varying textural patterns of foci distributed in the nucleus. This enables the behavioral study of sub-cellular structures during different phases of the cell cycle. The varying punctuate patterns of fluorescence, drastic changes in SNR, shape and position during mitosis and abundance of touching cells, however, require more sophisticated algorithms for reliable automatic cell segmentation and lineage analysis. Since the cell nuclei are non-uniform in appearance, a distribution-based modeling of foreground classes is essential. The recently proposed graph partitioning active contours (GPAC) algorithm supports region descriptors and flexible distance metrics. We extend GPAC for fluorescence-based cell segmentation using regional density functions and dramatically improve its efficiency for segmentation from O(N(4)) to O(N(2)), for an image with N(2) pixels, making it practical and scalable for high throughput microscopy imaging studies.
Sy, B K; Deller, J R
1989-05-01
An intelligent communication device is developed to assist the nonverbal, motor disabled in the generation of written and spoken messages. The device is centered on a knowledge base of the grammatical rules and message elements. A "belief" reasoning scheme based on both the information from external sources and the embedded knowledge is used to optimize the process of message search. The search for the message elements is conceptualized as a path search in the language graph, and a special frame architecture is used to construct and to partition the graph. Bayesian "belief" reasoning from the Dempster-Shafer theory of evidence is augmented to cope with time-varying evidence. An "information fusion" strategy is also introduced to integrate various forms of external information. Experimental testing of the prototype system is discussed.
Weakly supervised image semantic segmentation based on clustering superpixels
NASA Astrophysics Data System (ADS)
Yan, Xiong; Liu, Xiaohua
2018-04-01
In this paper, we propose an image semantic segmentation model which is trained from image-level labeled images. The proposed model starts with superpixel segmenting, and features of the superpixels are extracted by trained CNN. We introduce a superpixel-based graph followed by applying the graph partition method to group correlated superpixels into clusters. For the acquisition of inter-label correlations between the image-level labels in dataset, we not only utilize label co-occurrence statistics but also exploit visual contextual cues simultaneously. At last, we formulate the task of mapping appropriate image-level labels to the detected clusters as a problem of convex minimization. Experimental results on MSRC-21 dataset and LableMe dataset show that the proposed method has a better performance than most of the weakly supervised methods and is even comparable to fully supervised methods.
Tracking Multiple People Online and in Real Time
2015-12-21
NO. 0704-0188 3. DATES COVERED (From - To) - UU UU UU UU 21-12-2015 Approved for public release; distribution is unlimited. Tracking multiple people ...online and in real time We cast the problem of tracking several people as a graph partitioning problem that takes the form of an NP-hard binary...PLEASE DO NOT RETURN YOUR FORM TO THE ABOVE ADDRESS. Duke University 2200 West Main Street Suite 710 Durham, NC 27705 -4010 ABSTRACT Tracking multiple
GoFFish: Graph-Oriented Framework for Foresight and Insight Using Scalable Heuristics
2015-09-01
series of research challenges that arise from the above and related to: scalability, data partitioning, memory representation and storage, exe - cution...job with varied deadlines in Jan, 2013. Further, we also build an SPSP model using Jan, 2013 data post facto as an ideal case. Fig. 7 reports the...Advanced Studies on Collaborative Research . Riverton, NJ, USA: IBM Corp., 2009, pp. 101–111. [42] G. J. Nutt, “The evolution towards flexible workflow
NASA Astrophysics Data System (ADS)
Brazhnik, Olga D.; Freed, Karl F.
1996-07-01
The lattice cluster theory (LCT) is extended to enable inclusion of longer range correlation contributions to the partition function of lattice model polymers in the athermal limit. A diagrammatic technique represents the expansion of the partition function in powers of the inverse lattice coordination number. Graph theory is applied to sort, classify, and evaluate the numerous diagrams appearing in higher orders. New general theorems are proven that provide a significant reduction in the computational labor required to evaluate the contributions from higher order correlations. The new algorithm efficiently generates the correction to the Flory mean field approximation from as many as eight sterically interacting bonds. While the new results contain the essential ingredients for treating a system of flexible chains with arbitrary lengths and concentrations, the complexity of our new algorithm motivates us to test the theory here for the simplest case of a system of lattice dimers by comparison to the dimer packing entropies from the work of Gaunt. This comparison demonstrates that the eight bond LCT is exact through order φ5 for dimers in one through three dimensions, where φ is the volume fraction of dimers. A subsequent work will use the contracted diagrams, derived and tested here, to treat the packing entropy for a system of flexible N-mers at a volume fraction of φ on hypercubic lattices.
Detecting labor using graph theory on connectivity matrices of uterine EMG.
Al-Omar, S; Diab, A; Nader, N; Khalil, M; Karlsson, B; Marque, C
2015-08-01
Premature labor is one of the most serious health problems in the developed world. One of the main reasons for this is that no good way exists to distinguish true labor from normal pregnancy contractions. The aim of this paper is to investigate if the application of graph theory techniques to multi-electrode uterine EMG signals can improve the discrimination between pregnancy contractions and labor. To test our methods we first applied them to synthetic graphs where we detected some differences in the parameters results and changes in the graph model from pregnancy-like graphs to labor-like graphs. Then, we applied the same methods to real signals. We obtained the best differentiation between pregnancy and labor through the same parameters. Major improvements in differentiating between pregnancy and labor were obtained using a low pass windowing preprocessing step. Results show that real graphs generally became more organized when moving from pregnancy, where the graph showed random characteristics, to labor where the graph became a more small-world like graph.
Edge connectivity and the spectral gap of combinatorial and quantum graphs
NASA Astrophysics Data System (ADS)
Berkolaiko, Gregory; Kennedy, James B.; Kurasov, Pavel; Mugnolo, Delio
2017-09-01
We derive a number of upper and lower bounds for the first nontrivial eigenvalue of Laplacians on combinatorial and quantum graph in terms of the edge connectivity, i.e. the minimal number of edges which need to be removed to make the graph disconnected. On combinatorial graphs, one of the bounds corresponds to a well-known inequality of Fiedler, of which we give a new variational proof. On quantum graphs, the corresponding bound generalizes a recent result of Band and Lévy. All proofs are general enough to yield corresponding estimates for the p-Laplacian and allow us to identify the minimizers. Based on the Betti number of the graph, we also derive upper and lower bounds on all eigenvalues which are ‘asymptotically correct’, i.e. agree with the Weyl asymptotics for the eigenvalues of the quantum graph. In particular, the lower bounds improve the bounds of Friedlander on any given graph for all but finitely many eigenvalues, while the upper bounds improve recent results of Ariturk. Our estimates are also used to derive bounds on the eigenvalues of the normalized Laplacian matrix that improve known bounds of spectral graph theory.
Combinatorial approximation algorithms for MAXCUT using random walks.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seshadhri, Comandur; Kale, Satyen
We give the first combinatorial approximation algorithm for MaxCut that beats the trivial 0.5 factor by a constant. The main partitioning procedure is very intuitive, natural, and easily described. It essentially performs a number of random walks and aggregates the information to provide the partition. We can control the running time to get an approximation factor-running time tradeoff. We show that for any constant b > 1.5, there is an {tilde O}(n{sup b}) algorithm that outputs a (0.5 + {delta})-approximation for MaxCut, where {delta} = {delta}(b) is some positive constant. One of the components of our algorithm is a weakmore » local graph partitioning procedure that may be of independent interest. Given a starting vertex i and a conductance parameter {phi}, unless a random walk of length {ell} = O(log n) starting from i mixes rapidly (in terms of {phi} and {ell}), we can find a cut of conductance at most {phi} close to the vertex. The work done per vertex found in the cut is sublinear in n.« less
Clustering Financial Time Series by Network Community Analysis
NASA Astrophysics Data System (ADS)
Piccardi, Carlo; Calatroni, Lisa; Bertoni, Fabio
In this paper, we describe a method for clustering financial time series which is based on community analysis, a recently developed approach for partitioning the nodes of a network (graph). A network with N nodes is associated to the set of N time series. The weight of the link (i, j), which quantifies the similarity between the two corresponding time series, is defined according to a metric based on symbolic time series analysis, which has recently proved effective in the context of financial time series. Then, searching for network communities allows one to identify groups of nodes (and then time series) with strong similarity. A quantitative assessment of the significance of the obtained partition is also provided. The method is applied to two distinct case-studies concerning the US and Italy Stock Exchange, respectively. In the US case, the stability of the partitions over time is also thoroughly investigated. The results favorably compare with those obtained with the standard tools typically used for clustering financial time series, such as the minimal spanning tree and the hierarchical tree.
Text categorization of biomedical data sets using graph kernels and a controlled vocabulary.
Bleik, Said; Mishra, Meenakshi; Huan, Jun; Song, Min
2013-01-01
Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.
Keller, Carmen; Junghans, Alex
2017-11-01
Individuals with low numeracy have difficulties with understanding complex graphs. Combining the information-processing approach to numeracy with graph comprehension and information-reduction theories, we examined whether high numerates' better comprehension might be explained by their closer attention to task-relevant graphical elements, from which they would expect numerical information to understand the graph. Furthermore, we investigated whether participants could be trained in improving their attention to task-relevant information and graph comprehension. In an eye-tracker experiment ( N = 110) involving a sample from the general population, we presented participants with 2 hypothetical scenarios (stomach cancer, leukemia) showing survival curves for 2 treatments. In the training condition, participants received written instructions on how to read the graph. In the control condition, participants received another text. We tracked participants' eye movements while they answered 9 knowledge questions. The sum constituted graph comprehension. We analyzed visual attention to task-relevant graphical elements by using relative fixation durations and relative fixation counts. The mediation analysis revealed a significant ( P < 0.05) indirect effect of numeracy on graph comprehension through visual attention to task-relevant information, which did not differ between the 2 conditions. Training had a significant main effect on visual attention ( P < 0.05) but not on graph comprehension ( P < 0.07). Individuals with high numeracy have better graph comprehension due to their greater attention to task-relevant graphical elements than individuals with low numeracy. With appropriate instructions, both groups can be trained to improve their graph-processing efficiency. Future research should examine (e.g., motivational) mediators between visual attention and graph comprehension to develop appropriate instructions that also result in higher graph comprehension.
Go With the Flow, on Jupiter and Snow. Coherence from Model-Free Video Data Without Trajectories
NASA Astrophysics Data System (ADS)
AlMomani, Abd AlRahman R.; Bollt, Erik
2018-06-01
Viewing a data set such as the clouds of Jupiter, coherence is readily apparent to human observers, especially the Great Red Spot, but also other great storms and persistent structures. There are now many different definitions and perspectives mathematically describing coherent structures, but we will take an image processing perspective here. We describe an image processing perspective inference of coherent sets from a fluidic system directly from image data, without attempting to first model underlying flow fields, related to a concept in image processing called motion tracking. In contrast to standard spectral methods for image processing which are generally related to a symmetric affinity matrix, leading to standard spectral graph theory, we need a not symmetric affinity which arises naturally from the underlying arrow of time. We develop an anisotropic, directed diffusion operator corresponding to flow on a directed graph, from a directed affinity matrix developed with coherence in mind, and corresponding spectral graph theory from the graph Laplacian. Our methodology is not offered as more accurate than other traditional methods of finding coherent sets, but rather our approach works with alternative kinds of data sets, in the absence of vector field. Our examples will include partitioning the weather and cloud structures of Jupiter, and a local to Potsdam, NY, lake effect snow event on Earth, as well as the benchmark test double-gyre system.
Unsupervised segmentation of MRI knees using image partition forests
NASA Astrophysics Data System (ADS)
Marčan, Marija; Voiculescu, Irina
2016-03-01
Nowadays many people are affected by arthritis, a condition of the joints with limited prevention measures, but with various options of treatment the most radical of which is surgical. In order for surgery to be successful, it can make use of careful analysis of patient-based models generated from medical images, usually by manual segmentation. In this work we show how to automate the segmentation of a crucial and complex joint -- the knee. To achieve this goal we rely on our novel way of representing a 3D voxel volume as a hierarchical structure of partitions which we have named Image Partition Forest (IPF). The IPF contains several partition layers of increasing coarseness, with partitions nested across layers in the form of adjacency graphs. On the basis of a set of properties (size, mean intensity, coordinates) of each node in the IPF we classify nodes into different features. Values indicating whether or not any particular node belongs to the femur or tibia are assigned through node filtering and node-based region growing. So far we have evaluated our method on 15 MRI knee images. Our unsupervised segmentation compared against a hand-segmented gold standard has achieved an average Dice similarity coefficient of 0.95 for femur and 0.93 for tibia, and an average symmetric surface distance of 0.98 mm for femur and 0.73 mm for tibia. The paper also discusses ways to introduce stricter morphological and spatial conditioning in the bone labelling process.
Shem-Tov, Doron; Halperin, Eran
2014-06-01
Recent technological improvements in the field of genetic data extraction give rise to the possibility of reconstructing the historical pedigrees of entire populations from the genotypes of individuals living today. Current methods are still not practical for real data scenarios as they have limited accuracy and assume unrealistic assumptions of monogamy and synchronized generations. In order to address these issues, we develop a new method for pedigree reconstruction, [Formula: see text], which is based on formulations of the pedigree reconstruction problem as variants of graph coloring. The new formulation allows us to consider features that were overlooked by previous methods, resulting in a reconstruction of up to 5 generations back in time, with an order of magnitude improvement of false-negatives rates over the state of the art, while keeping a lower level of false positive rates. We demonstrate the accuracy of [Formula: see text] compared to previous approaches using simulation studies over a range of population sizes, including inbred and outbred populations, monogamous and polygamous mating patterns, as well as synchronous and asynchronous mating.
Clique-Based Neural Associative Memories with Local Coding and Precoding.
Mofrad, Asieh Abolpour; Parker, Matthew G; Ferdosi, Zahra; Tadayon, Mohammad H
2016-08-01
Techniques from coding theory are able to improve the efficiency of neuroinspired and neural associative memories by forcing some construction and constraints on the network. In this letter, the approach is to embed coding techniques into neural associative memory in order to increase their performance in the presence of partial erasures. The motivation comes from recent work by Gripon, Berrou, and coauthors, which revisited Willshaw networks and presented a neural network with interacting neurons that partitioned into clusters. The model introduced stores patterns as small-size cliques that can be retrieved in spite of partial error. We focus on improving the success of retrieval by applying two techniques: doing a local coding in each cluster and then applying a precoding step. We use a slightly different decoding scheme, which is appropriate for partial erasures and converges faster. Although the ideas of local coding and precoding are not new, the way we apply them is different. Simulations show an increase in the pattern retrieval capacity for both techniques. Moreover, we use self-dual additive codes over field [Formula: see text], which have very interesting properties and a simple-graph representation.
Communication Optimal Parallel Multiplication of Sparse Random Matrices
2013-02-21
Definition 2.1), and (2) the algorithm is sparsity- independent, where the computation is statically partitioned to processors independent of the sparsity...struc- ture of the input matrices (see Definition 2.5). The second assumption applies to nearly all existing al- gorithms for general sparse matrix-matrix...where A and B are n× n ER(d) matrices: Definition 2.1 An ER(d) matrix is an adjacency matrix of an Erdős-Rényi graph with parameters n and d/n. That
SPROC: A multiple-processor DSP IC
NASA Technical Reports Server (NTRS)
Davis, R.
1991-01-01
A large, single-chip, multiple-processor, digital signal processing (DSP) integrated circuit (IC) fabricated in HP-Cmos34 is presented. The innovative architecture is best suited for analog and real-time systems characterized by both parallel signal data flows and concurrent logic processing. The IC is supported by a powerful development system that transforms graphical signal flow graphs into production-ready systems in minutes. Automatic compiler partitioning of tasks among four on-chip processors gives the IC the signal processing power of several conventional DSP chips.
Couple Graph Based Label Propagation Method for Hyperspectral Remote Sensing Data Classification
NASA Astrophysics Data System (ADS)
Wang, X. P.; Hu, Y.; Chen, J.
2018-04-01
Graph based semi-supervised classification method are widely used for hyperspectral image classification. We present a couple graph based label propagation method, which contains both the adjacency graph and the similar graph. We propose to construct the similar graph by using the similar probability, which utilize the label similarity among examples probably. The adjacency graph was utilized by a common manifold learning method, which has effective improve the classification accuracy of hyperspectral data. The experiments indicate that the couple graph Laplacian which unite both the adjacency graph and the similar graph, produce superior classification results than other manifold Learning based graph Laplacian and Sparse representation based graph Laplacian in label propagation framework.
Efficient parallel architecture for highly coupled real-time linear system applications
NASA Technical Reports Server (NTRS)
Carroll, Chester C.; Homaifar, Abdollah; Barua, Soumavo
1988-01-01
A systematic procedure is developed for exploiting the parallel constructs of computation in a highly coupled, linear system application. An overall top-down design approach is adopted. Differential equations governing the application under consideration are partitioned into subtasks on the basis of a data flow analysis. The interconnected task units constitute a task graph which has to be computed in every update interval. Multiprocessing concepts utilizing parallel integration algorithms are then applied for efficient task graph execution. A simple scheduling routine is developed to handle task allocation while in the multiprocessor mode. Results of simulation and scheduling are compared on the basis of standard performance indices. Processor timing diagrams are developed on the basis of program output accruing to an optimal set of processors. Basic architectural attributes for implementing the system are discussed together with suggestions for processing element design. Emphasis is placed on flexible architectures capable of accommodating widely varying application specifics.
Statistical mechanics of high-density bond percolation
NASA Astrophysics Data System (ADS)
Timonin, P. N.
2018-05-01
High-density (HD) percolation describes the percolation of specific κ -clusters, which are the compact sets of sites each connected to κ nearest filled sites at least. It takes place in the classical patterns of independently distributed sites or bonds in which the ordinary percolation transition also exists. Hence, the study of series of κ -type HD percolations amounts to the description of classical clusters' structure for which κ -clusters constitute κ -cores nested one into another. Such data are needed for description of a number of physical, biological, and information properties of complex systems on random lattices, graphs, and networks. They range from magnetic properties of semiconductor alloys to anomalies in supercooled water and clustering in biological and social networks. Here we present the statistical mechanics approach to study HD bond percolation on an arbitrary graph. It is shown that the generating function for κ -clusters' size distribution can be obtained from the partition function of the specific q -state Potts-Ising model in the q →1 limit. Using this approach we find exact κ -clusters' size distributions for the Bethe lattice and Erdos-Renyi graph. The application of the method to Euclidean lattices is also discussed.
Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions
Giaro, Krzysztof
2017-01-01
Abstract Ability to quantify dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the set of metrics described so far in the literature, the most commonly used seems to be the Robinson–Foulds distance. In this article, we define a new metric for rooted trees—the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding the HGT events. PMID:28177699
Comparing Phylogenetic Trees by Matching Nodes Using the Transfer Distance Between Partitions.
Bogdanowicz, Damian; Giaro, Krzysztof
2017-05-01
Ability to quantify dissimilarity of different phylogenetic trees describing the relationship between the same group of taxa is required in various types of phylogenetic studies. For example, such metrics are used to assess the quality of phylogeny construction methods, to define optimization criteria in supertree building algorithms, or to find horizontal gene transfer (HGT) events. Among the set of metrics described so far in the literature, the most commonly used seems to be the Robinson-Foulds distance. In this article, we define a new metric for rooted trees-the Matching Pair (MP) distance. The MP metric uses the concept of the minimum-weight perfect matching in a complete bipartite graph constructed from partitions of all pairs of leaves of the compared phylogenetic trees. We analyze the properties of the MP metric and present computational experiments showing its potential applicability in tasks related to finding the HGT events.
Cascading Failures in Bi-partite Graphs: Model for Systemic Risk Propagation
Huang, Xuqing; Vodenska, Irena; Havlin, Shlomo; Stanley, H. Eugene
2013-01-01
As economic entities become increasingly interconnected, a shock in a financial network can provoke significant cascading failures throughout the system. To study the systemic risk of financial systems, we create a bi-partite banking network model composed of banks and bank assets and propose a cascading failure model to describe the risk propagation process during crises. We empirically test the model with 2007 US commercial banks balance sheet data and compare the model prediction of the failed banks with the real failed banks after 2007. We find that our model efficiently identifies a significant portion of the actual failed banks reported by Federal Deposit Insurance Corporation. The results suggest that this model could be useful for systemic risk stress testing for financial systems. The model also identifies that commercial rather than residential real estate assets are major culprits for the failure of over 350 US commercial banks during 2008–2011. PMID:23386974
Non-crystallographic nets: characterization and first steps towards a classification.
Moreira de Oliveira, Montauban; Eon, Jean Guillaume
2014-05-01
Non-crystallographic (NC) nets are periodic nets characterized by the existence of non-trivial bounded automorphisms. Such automorphisms cannot be associated with any crystallographic symmetry in realizations of the net by crystal structures. It is shown that bounded automorphisms of finite order form a normal subgroup F(N) of the automorphism group of NC nets (N, T). As a consequence, NC nets are unstable nets (they display vertex collisions in any barycentric representation) and, conversely, stable nets are crystallographic nets. The labelled quotient graphs of NC nets are characterized by the existence of an equivoltage partition (a partition of the vertex set that preserves label vectors over edges between cells). A classification of NC nets is proposed on the basis of (i) their relationship to the crystallographic net with a homeomorphic barycentric representation and (ii) the structure of the subgroup F(N).
Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm
NASA Astrophysics Data System (ADS)
Frisca, Bustamam, Alhadi; Siswantining, Titin
2017-03-01
Clustering is one of data analysis methods that aims to classify data which have similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering method needs partitioning algorithm. There are some partitioning methods including PAM, SOM, Fuzzy c-means, and k-means. Based on the research that has been done by Capital and Choudhury in 2013, when using Euclidian distance k-means algorithm provide better accuracy than PAM algorithm. So in this paper we use k-means as our partition algorithm. The major advantage of spectral clustering is in reducing data dimension, especially in this case to reduce the dimension of large microarray dataset. Microarray data is a small-sized chip made of a glass plate containing thousands and even tens of thousands kinds of genes in the DNA fragments derived from doubling cDNA. Application of microarray data is widely used to detect cancer, for the example is carcinoma, in which cancer cells express the abnormalities in his genes. The purpose of this research is to classify the data that have high similarity in the same group and the data that have low similarity in the others. In this research, Carcinoma microarray data using 7457 genes. The result of partitioning using k-means algorithm is two clusters.
A Ranking Approach on Large-Scale Graph With Multidimensional Heterogeneous Information.
Wei, Wei; Gao, Bin; Liu, Tie-Yan; Wang, Taifeng; Li, Guohui; Li, Hang
2016-04-01
Graph-based ranking has been extensively studied and frequently applied in many applications, such as webpage ranking. It aims at mining potentially valuable information from the raw graph-structured data. Recently, with the proliferation of rich heterogeneous information (e.g., node/edge features and prior knowledge) available in many real-world graphs, how to effectively and efficiently leverage all information to improve the ranking performance becomes a new challenging problem. Previous methods only utilize part of such information and attempt to rank graph nodes according to link-based methods, of which the ranking performances are severely affected by several well-known issues, e.g., over-fitting or high computational complexity, especially when the scale of graph is very large. In this paper, we address the large-scale graph-based ranking problem and focus on how to effectively exploit rich heterogeneous information of the graph to improve the ranking performance. Specifically, we propose an innovative and effective semi-supervised PageRank (SSP) approach to parameterize the derived information within a unified semi-supervised learning framework (SSLF-GR), then simultaneously optimize the parameters and the ranking scores of graph nodes. Experiments on the real-world large-scale graphs demonstrate that our method significantly outperforms the algorithms that consider such graph information only partially.
Correspondence between spanning trees and the Ising model on a square lattice
NASA Astrophysics Data System (ADS)
Viswanathan, G. M.
2017-06-01
An important problem in statistical physics concerns the fascinating connections between partition functions of lattice models studied in equilibrium statistical mechanics on the one hand and graph theoretical enumeration problems on the other hand. We investigate the nature of the relationship between the number of spanning trees and the partition function of the Ising model on the square lattice. The spanning tree generating function T (z ) gives the spanning tree constant when evaluated at z =1 , while giving the lattice green function when differentiated. It is known that for the infinite square lattice the partition function Z (K ) of the Ising model evaluated at the critical temperature K =Kc is related to T (1 ) . Here we show that this idea in fact generalizes to all real temperatures. We prove that [Z(K ) s e c h 2 K ] 2=k exp[T (k )] , where k =2 tanh(2 K )s e c h (2 K ) . The identical Mahler measure connects the two seemingly disparate quantities T (z ) and Z (K ) . In turn, the Mahler measure is determined by the random walk structure function. Finally, we show that the the above correspondence does not generalize in a straightforward manner to nonplanar lattices.
Harnessing the Bethe free energy†
Bapst, Victor
2016-01-01
ABSTRACT A wide class of problems in combinatorics, computer science and physics can be described along the following lines. There are a large number of variables ranging over a finite domain that interact through constraints that each bind a few variables and either encourage or discourage certain value combinations. Examples include the k‐SAT problem or the Ising model. Such models naturally induce a Gibbs measure on the set of assignments, which is characterised by its partition function. The present paper deals with the partition function of problems where the interactions between variables and constraints are induced by a sparse random (hyper)graph. According to physics predictions, a generic recipe called the “replica symmetric cavity method” yields the correct value of the partition function if the underlying model enjoys certain properties [Krzkala et al., PNAS (2007) 10318–10323]. Guided by this conjecture, we prove general sufficient conditions for the success of the cavity method. The proofs are based on a “regularity lemma” for probability measures on sets of the form Ωn for a finite Ω and a large n that may be of independent interest. © 2016 Wiley Periodicals, Inc. Random Struct. Alg., 49, 694–741, 2016 PMID:28035178
KmL3D: a non-parametric algorithm for clustering joint trajectories.
Genolini, C; Pingault, J B; Driss, T; Côté, S; Tremblay, R E; Vitaro, F; Arnaud, C; Falissard, B
2013-01-01
In cohort studies, variables are measured repeatedly and can be considered as trajectories. A classic way to work with trajectories is to cluster them in order to detect the existence of homogeneous patterns of evolution. Since cohort studies usually measure a large number of variables, it might be interesting to study the joint evolution of several variables (also called joint-variable trajectories). To date, the only way to cluster joint-trajectories is to cluster each trajectory independently, then to cross the partitions obtained. This approach is unsatisfactory because it does not take into account a possible co-evolution of variable-trajectories. KmL3D is an R package that implements a version of k-means dedicated to clustering joint-trajectories. It provides facilities for the management of missing values, offers several quality criteria and its graphic interface helps the user to select the best partition. KmL3D can work with any number of joint-variable trajectories. In the restricted case of two joint trajectories, it proposes 3D tools to visualize the partitioning and then export 3D dynamic rotating-graphs to PDF format. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
Graph state generation with noisy mirror-inverting spin chains
NASA Astrophysics Data System (ADS)
Clark, Stephen R.; Klein, Alexander; Bruderer, Martin; Jaksch, Dieter
2007-06-01
We investigate the influence of noise on a graph state generation scheme which exploits a mirror inverting spin chain. Within this scheme the spin chain is used repeatedly as an entanglement bus (EB) to create multi-partite entanglement. The noise model we consider comprises of each spin of this EB being exposed to independent local noise which degrades the capabilities of the EB. Here we concentrate on quantifying its performance as a single-qubit channel and as a mediator of a two-qubit entangling gate, since these are basic operations necessary for graph state generation using the EB. In particular, for the single-qubit case we numerically calculate the average channel fidelity and whether the channel becomes entanglement breaking, i.e. expunges any entanglement the transferred qubit may have with other external qubits. We find that neither local decay nor dephasing noise cause entanglement breaking. This is in contrast to local thermal and depolarizing noise where we determine a critical length and critical noise coupling, respectively, at which entanglement breaking occurs. The critical noise coupling for local depolarizing noise is found to exhibit a power-law dependence on the chain length. For two-qubits we similarly compute the average gate fidelity and whether the ability for this gate to create entanglement is maintained. The concatenation of these noisy gates for the construction of a five-qubit linear cluster state and a Greenberger Horne Zeilinger state indicates that the level of noise that can be tolerated for graph state generation is tightly constrained.
Guturu, Parthasarathy; Dantu, Ram
2008-06-01
Many graph- and set-theoretic problems, because of their tremendous application potential and theoretical appeal, have been well investigated by the researchers in complexity theory and were found to be NP-hard. Since the combinatorial complexity of these problems does not permit exhaustive searches for optimal solutions, only near-optimal solutions can be explored using either various problem-specific heuristic strategies or metaheuristic global-optimization methods, such as simulated annealing, genetic algorithms, etc. In this paper, we propose a unified evolutionary algorithm (EA) to the problems of maximum clique finding, maximum independent set, minimum vertex cover, subgraph and double subgraph isomorphism, set packing, set partitioning, and set cover. In the proposed approach, we first map these problems onto the maximum clique-finding problem (MCP), which is later solved using an evolutionary strategy. The proposed impatient EA with probabilistic tabu search (IEA-PTS) for the MCP integrates the best features of earlier successful approaches with a number of new heuristics that we developed to yield a performance that advances the state of the art in EAs for the exploration of the maximum cliques in a graph. Results of experimentation with the 37 DIMACS benchmark graphs and comparative analyses with six state-of-the-art algorithms, including two from the smaller EA community and four from the larger metaheuristics community, indicate that the IEA-PTS outperforms the EAs with respect to a Pareto-lexicographic ranking criterion and offers competitive performance on some graph instances when individually compared to the other heuristic algorithms. It has also successfully set a new benchmark on one graph instance. On another benchmark suite called Benchmarks with Hidden Optimal Solutions, IEA-PTS ranks second, after a very recent algorithm called COVER, among its peers that have experimented with this suite.
A density-based clustering model for community detection in complex networks
NASA Astrophysics Data System (ADS)
Zhao, Xiang; Li, Yantao; Qu, Zehui
2018-04-01
Network clustering (or graph partitioning) is an important technique for uncovering the underlying community structures in complex networks, which has been widely applied in various fields including astronomy, bioinformatics, sociology, and bibliometric. In this paper, we propose a density-based clustering model for community detection in complex networks (DCCN). The key idea is to find group centers with a higher density than their neighbors and a relatively large integrated-distance from nodes with higher density. The experimental results indicate that our approach is efficient and effective for community detection of complex networks.
Communication-Efficient Arbitration Models for Low-Resolution Data Flow Computing
1988-12-01
Given graph G = (V, E), weights w (v) for each v e V and L (e) for each e c E, and positive integers B and J, find a partition of V into disjoint...MIT/LCS/TR-218, Cambridge, Mass. Agerwala, Tilak, February 1982, "Data Flow Systems", Computer, pp. 10-13. Babb, Robert G ., July 1984, "Parallel...Processing with Large-Grain Data Flow Techniques," IEEE Computer 17, 7, pp. 55-61. Babb, Robert G ., II, Lise Storc, and William C. Ragsdale, 1986, "A Large
Can Comparison of Contrastive Examples Facilitate Graph Understanding?
ERIC Educational Resources Information Center
Smith, Linsey A.; Gentner, Dedre
2011-01-01
The authors explore the role of comparison in improving graph fluency. The ability to use graphs fluently is crucial for STEM achievement, but graphs are challenging to interpret and produce because they often involve integration of multiple variables, continuous change in variables over time, and omission of certain details in order to highlight…
ERIC Educational Resources Information Center
Deniz, Hasan; Dulger, Mehmet F.
2012-01-01
This study examined to what extent inquiry-based instruction supported with real-time graphing technology improves fourth grader's ability to interpret graphs as representations of physical science concepts such as motion and temperature. This study also examined whether there is any difference between inquiry-based instruction supported with…
Graphs in Kinematics--A Need for Adherence to Principles of Algebraic Functions
ERIC Educational Resources Information Center
Sokolowski, Andrzej
2017-01-01
Graphs in physics are central to the analysis of phenomena and to learning about a system's behavior. The ways students handle graphs are frequently researched. Students' misconceptions are highlighted, and methods of improvement suggested. While kinematics graphs are to represent a real motion, they are also algebraic entities that must satisfy…
Distributed Sensing and Processing: A Graphical Model Approach
2005-11-30
that Ramanujan graph toplogies maximize the convergence rate of distributed detection consensus algorithms, improving over three orders of...small world type network designs. 14. SUBJECT TERMS Ramanujan graphs, sensor network topology, sensor network...that Ramanujan graphs, for which there are explicit algebraic constructions, have large eigenratios, converging much faster than structured graphs
Improved visibility graph fractality with application for the diagnosis of Autism Spectrum Disorder
NASA Astrophysics Data System (ADS)
Ahmadlou, Mehran; Adeli, Hojjat; Adeli, Amir
2012-10-01
Recently, the visibility graph (VG) algorithm was proposed for mapping a time series to a graph to study complexity and fractality of the time series through investigation of the complexity of its graph. The visibility graph algorithm converts a fractal time series to a scale-free graph. VG has been used for the investigation of fractality in the dynamic behavior of both artificial and natural complex systems. However, robustness and performance of the power of scale-freeness of VG (PSVG) as an effective method for measuring fractality has not been investigated. Since noise is unavoidable in real life time series, the robustness of a fractality measure is of paramount importance. To improve the accuracy and robustness of PSVG to noise for measurement of fractality of time series in biological time-series, an improved PSVG is presented in this paper. The proposed method is evaluated using two examples: a synthetic benchmark time series and a complicated real life Electroencephalograms (EEG)-based diagnostic problem, that is distinguishing autistic children from non-autistic children. It is shown that the proposed improved PSVG is less sensitive to noise and therefore more robust compared with PSVG. Further, it is shown that using improved PSVG in the wavelet-chaos neural network model of Adeli and c-workers in place of the Katz fractality dimension results in a more accurate diagnosis of autism, a complicated neurological and psychiatric disorder.
New methods for analyzing semantic graph based assessments in science education
NASA Astrophysics Data System (ADS)
Vikaros, Lance Steven
This research investigated how the scoring of semantic graphs (known by many as concept maps) could be improved and automated in order to address issues of inter-rater reliability and scalability. As part of the NSF funded SENSE-IT project to introduce secondary school science students to sensor networks (NSF Grant No. 0833440), semantic graphs illustrating how temperature change affects water ecology were collected from 221 students across 16 schools. The graphing task did not constrain students' use of terms, as is often done with semantic graph based assessment due to coding and scoring concerns. The graphing software used provided real-time feedback to help students learn how to construct graphs, stay on topic and effectively communicate ideas. The collected graphs were scored by human raters using assessment methods expected to boost reliability, which included adaptations of traditional holistic and propositional scoring methods, use of expert raters, topical rubrics, and criterion graphs. High levels of inter-rater reliability were achieved, demonstrating that vocabulary constraints may not be necessary after all. To investigate a new approach to automating the scoring of graphs, thirty-two different graph features characterizing graphs' structure, semantics, configuration and process of construction were then used to predict human raters' scoring of graphs in order to identify feature patterns correlated to raters' evaluations of graphs' topical accuracy and complexity. Results led to the development of a regression model able to predict raters' scoring with 77% accuracy, with 46% accuracy expected when used to score new sets of graphs, as estimated via cross-validation tests. Although such performance is comparable to other graph and essay based scoring systems, cross-context testing of the model and methods used to develop it would be needed before it could be recommended for widespread use. Still, the findings suggest techniques for improving the reliability and scalability of semantic graph based assessments without requiring constraint of how ideas are expressed.
ERIC Educational Resources Information Center
Gültepe, Nejla
2016-01-01
Graphing subjects in chemistry has been used to provide alternatives to verbal and algorithmic descriptions of a subject by handing students another way of improving their manipulation of concepts. Teachers should therefore know the level of students' graphing skills. Studies have identified that students have difficulty making connections with…
Molecular graph convolutions: moving beyond fingerprints.
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-08-01
Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph-atoms, bonds, distances, etc.-which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity.
Diykh, Mohammed; Li, Yan; Wen, Peng
2016-11-01
The electroencephalogram (EEG) signals are commonly used in diagnosing and treating sleep disorders. Many existing methods for sleep stages classification mainly depend on the analysis of EEG signals in time or frequency domain to obtain a high classification accuracy. In this paper, the statistical features in time domain, the structural graph similarity and the K-means (SGSKM) are combined to identify six sleep stages using single channel EEG signals. Firstly, each EEG segment is partitioned into sub-segments. The size of a sub-segment is determined empirically. Secondly, statistical features are extracted, sorted into different sets of features and forwarded to the SGSKM to classify EEG sleep stages. We have also investigated the relationships between sleep stages and the time domain features of the EEG data used in this paper. The experimental results show that the proposed method yields better classification results than other four existing methods and the support vector machine (SVM) classifier. A 95.93% average classification accuracy is achieved by using the proposed method.
Approximate inference on planar graphs using loop calculus and belief progagation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chertkov, Michael; Gomez, Vicenc; Kappen, Hilbert
We introduce novel results for approximate inference on planar graphical models using the loop calculus framework. The loop calculus (Chertkov and Chernyak, 2006b) allows to express the exact partition function Z of a graphical model as a finite sum of terms that can be evaluated once the belief propagation (BP) solution is known. In general, full summation over all correction terms is intractable. We develop an algorithm for the approach presented in Chertkov et al. (2008) which represents an efficient truncation scheme on planar graphs and a new representation of the series in terms of Pfaffians of matrices. We analyzemore » in detail both the loop series and the Pfaffian series for models with binary variables and pairwise interactions, and show that the first term of the Pfaffian series can provide very accurate approximations. The algorithm outperforms previous truncation schemes of the loop series and is competitive with other state-of-the-art methods for approximate inference.« less
NASA Astrophysics Data System (ADS)
Arévalo, Germán. V.; Hincapié, Roberto C.; Sierra, Javier E.
2015-09-01
UDWDM PON is a leading technology oriented to provide ultra-high bandwidth to final users while profiting the physical channels' capability. One of the main drawbacks of UDWDM technique is the fact that the nonlinear effects, like FWM, become stronger due to the close spectral proximity among channels. This work proposes a model for the optimal deployment of this type of networks taking into account the fiber length limitations imposed by physical restrictions related with the fiber's data transmission as well as the users' asymmetric distribution in a provided region. The proposed model employs the data transmission related effects in UDWDM PON as restrictions in the optimization problem and also considers the user's asymmetric clustering and the subdivision of the users region though a Voronoi geometric partition technique. Here it is considered de Voronoi dual graph, it is the Delaunay Triangulation, as the planar graph for resolving the problem related with the minimum weight of the fiber links.
Study on probability distributions for evolution in modified extremal optimization
NASA Astrophysics Data System (ADS)
Zeng, Guo-Qiang; Lu, Yong-Zai; Mao, Wei-Jie; Chu, Jian
2010-05-01
It is widely believed that the power-law is a proper probability distribution being effectively applied for evolution in τ-EO (extremal optimization), a general-purpose stochastic local-search approach inspired by self-organized criticality, and its applications in some NP-hard problems, e.g., graph partitioning, graph coloring, spin glass, etc. In this study, we discover that the exponential distributions or hybrid ones (e.g., power-laws with exponential cutoff) being popularly used in the research of network sciences may replace the original power-laws in a modified τ-EO method called self-organized algorithm (SOA), and provide better performances than other statistical physics oriented methods, such as simulated annealing, τ-EO and SOA etc., from the experimental results on random Euclidean traveling salesman problems (TSP) and non-uniform instances. From the perspective of optimization, our results appear to demonstrate that the power-law is not the only proper probability distribution for evolution in EO-similar methods at least for TSP, the exponential and hybrid distributions may be other choices.
GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics1
DOE Office of Scientific and Technical Information (OSTI.GOV)
Simmhan, Yogesh; Kumbhare, Alok; Wickramaarachchi, Charith
2014-08-25
Large scale graph processing is a major research area for Big Data exploration. Vertex centric programming models like Pregel are gaining traction due to their simple abstraction that allows for scalable execution on distributed systems naturally. However, there are limitations to this approach which cause vertex centric algorithms to under-perform due to poor compute to communication overhead ratio and slow convergence of iterative superstep. In this paper we introduce GoFFish a scalable sub-graph centric framework co-designed with a distributed persistent graph storage for large scale graph analytics on commodity clusters. We introduce a sub-graph centric programming abstraction that combines themore » scalability of a vertex centric approach with the flexibility of shared memory sub-graph computation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation. We map Connected Components, SSSP and PageRank algorithms to this model to illustrate its flexibility. Further, we empirically analyze GoFFish using several real world graphs and demonstrate its significant performance improvement, orders of magnitude in some cases, compared to Apache Giraph, the leading open source vertex centric implementation.« less
Dim target detection method based on salient graph fusion
NASA Astrophysics Data System (ADS)
Hu, Ruo-lan; Shen, Yi-yan; Jiang, Jun
2018-02-01
Dim target detection is one key problem in digital image processing field. With development of multi-spectrum imaging sensor, it becomes a trend to improve the performance of dim target detection by fusing the information from different spectral images. In this paper, one dim target detection method based on salient graph fusion was proposed. In the method, Gabor filter with multi-direction and contrast filter with multi-scale were combined to construct salient graph from digital image. And then, the maximum salience fusion strategy was designed to fuse the salient graph from different spectral images. Top-hat filter was used to detect dim target from the fusion salient graph. Experimental results show that proposal method improved the probability of target detection and reduced the probability of false alarm on clutter background images.
NASA Astrophysics Data System (ADS)
Zagouras, Athanassios; Argiriou, Athanassios A.; Flocas, Helena A.; Economou, George; Fotopoulos, Spiros
2012-11-01
Classification of weather maps at various isobaric levels as a methodological tool is used in several problems related to meteorology, climatology, atmospheric pollution and to other fields for many years. Initially the classification was performed manually. The criteria used by the person performing the classification are features of isobars or isopleths of geopotential height, depending on the type of maps to be classified. Although manual classifications integrate the perceptual experience and other unquantifiable qualities of the meteorology specialists involved, these are typically subjective and time consuming. Furthermore, during the last years different approaches of automated methods for atmospheric circulation classification have been proposed, which present automated and so-called objective classifications. In this paper a new method of atmospheric circulation classification of isobaric maps is presented. The method is based on graph theory. It starts with an intelligent prototype selection using an over-partitioning mode of fuzzy c-means (FCM) algorithm, proceeds to a graph formulation for the entire dataset and produces the clusters based on the contemporary dominant sets clustering method. Graph theory is a novel mathematical approach, allowing a more efficient representation of spatially correlated data, compared to the classical Euclidian space representation approaches, used in conventional classification methods. The method has been applied to the classification of 850 hPa atmospheric circulation over the Eastern Mediterranean. The evaluation of the automated methods is performed by statistical indexes; results indicate that the classification is adequately comparable with other state-of-the-art automated map classification methods, for a variable number of clusters.
Visibility graph network analysis of natural gas price: The case of North American market
NASA Astrophysics Data System (ADS)
Sun, Mei; Wang, Yaqi; Gao, Cuixia
2016-11-01
Fluctuations in prices of natural gas significantly affect global economy. Therefore, the research on the characteristics of natural gas price fluctuations, turning points and its influencing cycle on the subsequent price series is of great significance. Global natural gas trade concentrates on three regional markets: the North American market, the European market and the Asia-Pacific market, with North America having the most developed natural gas financial market. In addition, perfect legal supervision and coordinated regulations make the North American market more open and more competitive. This paper focuses on the North American natural gas market specifically. The Henry Hub natural gas spot price time series is converted to a visibility graph network which provides a new direction for macro analysis of time series, and several indicators are investigated: degree and degree distribution, the average shortest path length and community structure. The internal mechanisms underlying price fluctuations are explored through the indicators. The results show that the natural gas prices visibility graph network (NGP-VGN) is of small-world and scale-free properties simultaneously. After random rearrangement of original price time series, the degree distribution of network becomes exponential distribution, different from the original ones. This means that, the original price time series is of long-range negative correlation fractal characteristic. In addition, nodes with large degree correspond to significant geopolitical or economic events. Communities correspond to time cycles in visibility graph network. The cycles of time series and the impact scope of hubs can be found by community structure partition.
Connectomics and neuroticism: an altered functional network organization.
Servaas, Michelle N; Geerligs, Linda; Renken, Remco J; Marsman, Jan-Bernard C; Ormel, Johan; Riese, Harriëtte; Aleman, André
2015-01-01
The personality trait neuroticism is a potent risk marker for psychopathology. Although the neurobiological basis remains unclear, studies have suggested that alterations in connectivity may underlie it. Therefore, the aim of the current study was to shed more light on the functional network organization in neuroticism. To this end, we applied graph theory on resting-state functional magnetic resonance imaging (fMRI) data in 120 women selected based on their neuroticism score. Binary and weighted brain-wide graphs were constructed to examine changes in the functional network structure and functional connectivity strength. Furthermore, graphs were partitioned into modules to specifically investigate connectivity within and between functional subnetworks related to emotion processing and cognitive control. Subsequently, complex network measures (ie, efficiency and modularity) were calculated on the brain-wide graphs and modules, and correlated with neuroticism scores. Compared with low neurotic individuals, high neurotic individuals exhibited a whole-brain network structure resembling more that of a random network and had overall weaker functional connections. Furthermore, in these high neurotic individuals, functional subnetworks could be delineated less clearly and the majority of these subnetworks showed lower efficiency, while the affective subnetwork showed higher efficiency. In addition, the cingulo-operculum subnetwork demonstrated more ties with other functional subnetworks in association with neuroticism. In conclusion, the 'neurotic brain' has a less than optimal functional network organization and shows signs of functional disconnectivity. Moreover, in high compared with low neurotic individuals, emotion and salience subnetworks have a more prominent role in the information exchange, while sensory(-motor) and cognitive control subnetworks have a less prominent role.
Graph Theory Approach for Studying Food Webs
NASA Astrophysics Data System (ADS)
Longjas, A.; Tejedor, A.; Foufoula-Georgiou, E.
2017-12-01
Food webs are complex networks of feeding interactions among species in ecological communities. Metrics describing food web structure have been proposed to compare and classify food webs ranging from food chain length, connectance, degree distribution, centrality measures, to the presence of motifs (distinct compartments), among others. However, formal methodologies for studying both food web topology and the dynamic processes operating on them are still lacking. Here, we utilize a quantitative framework using graph theory within which a food web is represented by a directed graph, i.e., a collection of vertices (species or trophic species defined as sets of species sharing the same predators and prey) and directed edges (predation links). This framework allows us to identify apex (environmental "source" node) to outlet (top predators) subnetworks and compute the steady-state flux (e.g., carbon, nutrients, energy etc.) in the food web. We use this framework to (1) construct vulnerability maps that quantify the relative change of flux delivery to the top predators in response to perturbations in prey species (2) identify keystone species, whose loss would precipitate further species extinction, and (3) introduce a suite of graph-theoretic metrics to quantify the topologic (imposed by food web connectivity) and dynamic (dictated by the flux partitioning and distribution) components of a food web's complexity. By projecting food webs into a 2D Topodynamic Complexity Space whose coordinates are given by Number of alternative paths (topologic) and Leakage Index (dynamic), we show that this space provides a basis for food web comparison and provide physical insights into their dynamic behavior.
Molecular graph convolutions: moving beyond fingerprints
NASA Astrophysics Data System (ADS)
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-08-01
Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng
2015-12-01
Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as, documents, chemical molecular structures, and images. However, an issue in relation to current research on graphs is that they cannot adequately discover the topics hidden in graph-structured data which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-word" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs.
Automated problem scheduling and reduction of synchronization delay effects
NASA Technical Reports Server (NTRS)
Saltz, Joel H.
1987-01-01
It is anticipated that in order to make effective use of many future high performance architectures, programs will have to exhibit at least a medium grained parallelism. A framework is presented for partitioning very sparse triangular systems of linear equations that is designed to produce favorable preformance results in a wide variety of parallel architectures. Efficient methods for solving these systems are of interest because: (1) they provide a useful model problem for use in exploring heuristics for the aggregation, mapping and scheduling of relatively fine grained computations whose data dependencies are specified by directed acrylic graphs, and (2) because such efficient methods can find direct application in the development of parallel algorithms for scientific computation. Simple expressions are derived that describe how to schedule computational work with varying degrees of granularity. The Encore Multimax was used as a hardware simulator to investigate the performance effects of using the partitioning techniques presented in shared memory architectures with varying relative synchronization costs.
Distributed state-space generation of discrete-state stochastic models
NASA Technical Reports Server (NTRS)
Ciardo, Gianfranco; Gluckman, Joshua; Nicol, David
1995-01-01
High-level formalisms such as stochastic Petri nets can be used to model complex systems. Analysis of logical and numerical properties of these models of ten requires the generation and storage of the entire underlying state space. This imposes practical limitations on the types of systems which can be modeled. Because of the vast amount of memory consumed, we investigate distributed algorithms for the generation of state space graphs. The distributed construction allows us to take advantage of the combined memory readily available on a network of workstations. The key technical problem is to find effective methods for on-the-fly partitioning, so that the state space is evenly distributed among processors. In this paper we report on the implementation of a distributed state-space generator that may be linked to a number of existing system modeling tools. We discuss partitioning strategies in the context of Petri net models, and report on performance observed on a network of workstations, as well as on a distributed memory multi-computer.
Congenital blindness is associated with large-scale reorganization of anatomical networks.
Hasson, Uri; Andric, Michael; Atilgan, Hicret; Collignon, Olivier
2016-03-01
Blindness is a unique model for understanding the role of experience in the development of the brain's functional and anatomical architecture. Documenting changes in the structure of anatomical networks for this population would substantiate the notion that the brain's core network-level organization may undergo neuroplasticity as a result of life-long experience. To examine this issue, we compared whole-brain networks of regional cortical-thickness covariance in early blind and matched sighted individuals. This covariance is thought to reflect signatures of integration between systems involved in similar perceptual/cognitive functions. Using graph-theoretic metrics, we identified a unique mode of anatomical reorganization in the blind that differed from that found for sighted. This was seen in that network partition structures derived from subgroups of blind were more similar to each other than they were to partitions derived from sighted. Notably, after deriving network partitions, we found that language and visual regions tended to reside within separate modules in sighted but showed a pattern of merging into shared modules in the blind. Our study demonstrates that early visual deprivation triggers a systematic large-scale reorganization of whole-brain cortical-thickness networks, suggesting changes in how occipital regions interface with other functional networks in the congenitally blind. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Iterative cross section sequence graph for handwritten character segmentation.
Dawoud, Amer
2007-08-01
The iterative cross section sequence graph (ICSSG) is an algorithm for handwritten character segmentation. It expands the cross section sequence graph concept by applying it iteratively at equally spaced thresholds. The iterative thresholding reduces the effect of information loss associated with image binarization. ICSSG preserves the characters' skeletal structure by preventing the interference of pixels that causes flooding of adjacent characters' segments. Improving the structural quality of the characters' skeleton facilitates better feature extraction and classification, which improves the overall performance of optical character recognition (OCR). Experimental results showed significant improvements in OCR recognition rates compared to other well-established segmentation algorithms.
Modular structure of functional networks in olfactory memory.
Meunier, David; Fonlupt, Pierre; Saive, Anne-Lise; Plailly, Jane; Ravel, Nadine; Royet, Jean-Pierre
2014-07-15
Graph theory enables the study of systems by describing those systems as a set of nodes and edges. Graph theory has been widely applied to characterize the overall structure of data sets in the social, technological, and biological sciences, including neuroscience. Modular structure decomposition enables the definition of sub-networks whose components are gathered in the same module and work together closely, while working weakly with components from other modules. This processing is of interest for studying memory, a cognitive process that is widely distributed. We propose a new method to identify modular structure in task-related functional magnetic resonance imaging (fMRI) networks. The modular structure was obtained directly from correlation coefficients and thus retained information about both signs and weights. The method was applied to functional data acquired during a yes-no odor recognition memory task performed by young and elderly adults. Four response categories were explored: correct (Hit) and incorrect (False alarm, FA) recognition and correct and incorrect rejection. We extracted time series data for 36 areas as a function of response categories and age groups and calculated condition-based weighted correlation matrices. Overall, condition-based modular partitions were more homogeneous in young than elderly subjects. Using partition similarity-based statistics and a posteriori statistical analyses, we demonstrated that several areas, including the hippocampus, caudate nucleus, and anterior cingulate gyrus, belonged to the same module more frequently during Hit than during all other conditions. Modularity values were negatively correlated with memory scores in the Hit condition and positively correlated with bias scores (liberal/conservative attitude) in the Hit and FA conditions. We further demonstrated that the proportion of positive and negative links between areas of different modules (i.e., the proportion of correlated and anti-correlated areas) accounted for most of the observed differences in signed modularity. Taken together, our results provided some evidence that the neural networks involved in odor recognition memory are organized into modules and that these modular partitions are linked to behavioral performance and individual strategies. Copyright © 2014 Elsevier Inc. All rights reserved.
Independence polynomial and matching polynomial of the Koch network
NASA Astrophysics Data System (ADS)
Liao, Yunhua; Xie, Xiaoliang
2015-11-01
The lattice gas model and the monomer-dimer model are two classical models in statistical mechanics. It is well known that the partition functions of these two models are associated with the independence polynomial and the matching polynomial in graph theory, respectively. Both polynomials have been shown to belong to the “#P-complete” class, which indicate the problems are computationally “intractable”. We consider these two polynomials of the Koch networks which are scale-free with small-world effects. Explicit recurrences are derived, and explicit formulae are presented for the number of independent sets of a certain type.
Data Wrangling Within Different Astronomy Career Trajectories
NASA Astrophysics Data System (ADS)
Guillen, Reynal; Gu, D.; Holbrook, J.; Murillo, L.; Traweek, S.
2012-01-01
Five kinds of astronomers work with large data sets: cosmologists, data analysts, instrumentation people, observers, and numerical theorists. Each of these career trajectories can diverge and converge in and out of collaborations with each other and perform different kinds of work. Nonetheless, each group defines and wrangles data differently. This poster characterizes their different meanings of data, analytic skills, techniques, and technologies. It also identifies some sites and patterns of convergence. We plot these collaborative relationships in bi-partite graphs. These emergent characteristics of the astronomy workforce have implications for curricula, pedagogies, and the division of labor in research collaborations.
ALI: A CSSL/multiprocessor software interface
DOE Office of Scientific and Technical Information (OSTI.GOV)
Makoui, A.; Karplus, W.J.
ALI (A Language Interface) is a software package which translates simulation models expressed in one of the higher-level languages, CSSL-IV or ACSL, into sequences of instructions for each processor of a network of microprocessors. The partitioning of the source program among the processors is automatically accomplished. The code is converted into a data flow graph, analyzed and divided among the processors to minimize the overall execution time in the presence of interprocessor communication delays. This paper describes ALI from the user's point of view and includes a detailed example of the application of ALI to a specific dynamic system simulation.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brost, Randolph C.; McLendon, William Clarence,
2013-01-01
Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report amore » preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.« less
Molecular graph convolutions: moving beyond fingerprints
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-01-01
Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement. PMID:27558503
A Visual Analytics Paradigm Enabling Trillion-Edge Graph Exploration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, Pak C.; Haglin, David J.; Gillen, David S.
We present a visual analytics paradigm and a system prototype for exploring web-scale graphs. A web-scale graph is described as a graph with ~one trillion edges and ~50 billion vertices. While there is an aggressive R&D effort in processing and exploring web-scale graphs among internet vendors such as Facebook and Google, visualizing a graph of that scale still remains an underexplored R&D area. The paper describes a nontraditional peek-and-filter strategy that facilitates the exploration of a graph database of unprecedented size for visualization and analytics. We demonstrate that our system prototype can 1) preprocess a graph with ~25 billion edgesmore » in less than two hours and 2) support database query and visualization on the processed graph database afterward. Based on our computational performance results, we argue that we most likely will achieve the one trillion edge mark (a computational performance improvement of 40 times) for graph visual analytics in the near future.« less
Function plot response: A scalable system for teaching kinematics graphs
NASA Astrophysics Data System (ADS)
Laverty, James; Kortemeyer, Gerd
2012-08-01
Understanding and interpreting graphs are essential skills in all sciences. While students are mostly proficient in plotting given functions and reading values off graphs, they frequently lack the ability to construct and interpret graphs in a meaningful way. Students can use graphs as representations of value pairs, but often fail to interpret them as the representation of functions, and mostly fail to use them as representations of physical reality. Working with graphs in classroom settings has been shown to improve student abilities with graphs, particularly when the students can interact with them. We introduce a novel problem type in an online homework system, which requires students to construct the graphs themselves in free form, and requires no hand-grading by instructors. Initial experiences using the new problem type in an introductory physics course are reported.
Ghita, Ovidiu; Dietlmeier, Julia; Whelan, Paul F
2014-10-01
In this paper, we investigate the segmentation of closed contours in subcellular data using a framework that primarily combines the pairwise affinity grouping principles with a graph partitioning contour searching approach. One salient problem that precluded the application of these methods to large scale segmentation problems is the onerous computational complexity required to generate comprehensive representations that include all pairwise relationships between all pixels in the input data. To compensate for this problem, a practical solution is to reduce the complexity of the input data by applying an over-segmentation technique prior to the application of the computationally demanding strands of the segmentation process. This approach opens the opportunity to build specific shape and intensity models that can be successfully employed to extract the salient structures in the input image which are further processed to identify the cycles in an undirected graph. The proposed framework has been applied to the segmentation of mitochondria membranes in electron microscopy data which are characterized by low contrast and low signal-to-noise ratio. The algorithm has been quantitatively evaluated using two datasets where the segmentation results have been compared with the corresponding manual annotations. The performance of the proposed algorithm has been measured using standard metrics, such as precision and recall, and the experimental results indicate a high level of segmentation accuracy.
Path Network Recovery Using Remote Sensing Data and Geospatial-Temporal Semantic Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
William C. McLendon III; Brost, Randy C.
Remote sensing systems produce large volumes of high-resolution images that are difficult to search. The GeoGraphy (pronounced Geo-Graph-y) framework [2, 20] encodes remote sensing imagery into a geospatial-temporal semantic graph representation to enable high level semantic searches to be performed. Typically scene objects such as buildings and trees tend to be shaped like blocks with few holes, but other shapes generated from path networks tend to have a large number of holes and can span a large geographic region due to their connectedness. For example, we have a dataset covering the city of Philadelphia in which there is a singlemore » road network node spanning a 6 mile x 8 mile region. Even a simple question such as "find two houses near the same street" might give unexpected results. More generally, nodes arising from networks of paths (roads, sidewalks, trails, etc.) require additional processing to make them useful for searches in GeoGraphy. We have assigned the term Path Network Recovery to this process. Path Network Recovery is a three-step process involving (1) partitioning the network node into segments, (2) repairing broken path segments interrupted by occlusions or sensor noise, and (3) adding path-aware search semantics into GeoQuestions. This report covers the path network recovery process, how it is used, and some example use cases of the current capabilities.« less
Holographic hierarchy in the Gaussian matrix model via the fuzzy sphere
NASA Astrophysics Data System (ADS)
Garner, David; Ramgoolam, Sanjaye
2013-10-01
The Gaussian Hermitian matrix model was recently proposed to have a dual string description with worldsheets mapping to a sphere target space. The correlators were written as sums over holomorphic (Belyi) maps from worldsheets to the two-dimensional sphere, branched over three points. We express the matrix model correlators by using the fuzzy sphere construction of matrix algebras, which can be interpreted as a string field theory description of the Belyi strings. This gives the correlators in terms of trivalent ribbon graphs that represent the couplings of irreducible representations of su(2), which can be evaluated in terms of 3j and 6j symbols. The Gaussian model perturbed by a cubic potential is then recognised as a generating function for Ponzano-Regge partition functions for 3-manifolds having the worldsheet as boundary, and equipped with boundary data determined by the ribbon graphs. This can be viewed as a holographic extension of the Belyi string worldsheets to membrane worldvolumes, forming part of a holographic hierarchy linking, via the large N expansion, the zero-dimensional QFT of the Matrix model to 2D strings and 3D membranes. Note that if, after removing the white vertices, the graph contains a blue edge connecting to the same black vertex at both ends, then the triangulation generated from the black edges will contain faces that resemble cut discs. These faces are triangles with two of the edges identified.
Loop series for discrete statistical models on graphs
NASA Astrophysics Data System (ADS)
Chertkov, Michael; Chernyak, Vladimir Y.
2006-06-01
In this paper we present the derivation details, logic, and motivation for the three loop calculus introduced in Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R)). Generating functions for each of the three interrelated discrete statistical models are expressed in terms of a finite series. The first term in the series corresponds to the Bethe-Peierls belief-propagation (BP) contribution; the other terms are labelled by loops on the factor graph. All loop contributions are simple rational functions of spin correlation functions calculated within the BP approach. We discuss two alternative derivations of the loop series. One approach implements a set of local auxiliary integrations over continuous fields with the BP contribution corresponding to an integrand saddle-point value. The integrals are replaced by sums in the complementary approach, briefly explained in Chertkov and Chernyak (2006 Phys. Rev. E 73 065102(R)). Local gauge symmetry transformations that clarify an important invariant feature of the BP solution are revealed in both approaches. The individual terms change under the gauge transformation while the partition function remains invariant. The requirement for all individual terms to be nonzero only for closed loops in the factor graph (as opposed to paths with loose ends) is equivalent to fixing the first term in the series to be exactly equal to the BP contribution. Further applications of the loop calculus to problems in statistical physics, computer and information sciences are discussed.
Chung, Dongjun; Kim, Hang J; Zhao, Hongyu
2017-02-01
Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. However, identification of risk variants associated with complex diseases remains challenging as they are often affected by many genetic variants with small or moderate effects. There has been accumulating evidence suggesting that different complex traits share common risk basis, namely pleiotropy. Recently, several statistical methods have been developed to improve statistical power to identify risk variants for complex traits through a joint analysis of multiple GWAS datasets by leveraging pleiotropy. While these methods were shown to improve statistical power for association mapping compared to separate analyses, they are still limited in the number of phenotypes that can be integrated. In order to address this challenge, in this paper, we propose a novel statistical framework, graph-GPA, to integrate a large number of GWAS datasets for multiple phenotypes using a hidden Markov random field approach. Application of graph-GPA to a joint analysis of GWAS datasets for 12 phenotypes shows that graph-GPA improves statistical power to identify risk variants compared to statistical methods based on smaller number of GWAS datasets. In addition, graph-GPA also promotes better understanding of genetic mechanisms shared among phenotypes, which can potentially be useful for the development of improved diagnosis and therapeutics. The R implementation of graph-GPA is currently available at https://dongjunchung.github.io/GGPA/.
Improved parallel data partitioning by nested dissection with applications to information retrieval.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wolf, Michael M.; Chevalier, Cedric; Boman, Erik Gunnar
The computational work in many information retrieval and analysis algorithms is based on sparse linear algebra. Sparse matrix-vector multiplication is a common kernel in many of these computations. Thus, an important related combinatorial problem in parallel computing is how to distribute the matrix and the vectors among processors so as to minimize the communication cost. We focus on minimizing the total communication volume while keeping the computation balanced across processes. In [1], the first two authors presented a new 2D partitioning method, the nested dissection partitioning algorithm. In this paper, we improve on that algorithm and show that it ismore » a good option for data partitioning in information retrieval. We also show partitioning time can be substantially reduced by using the SCOTCH software, and quality improves in some cases, too.« less
Zhang, Junhua; Wang, Yuanyuan; Shi, Xinling
2009-12-01
A modified graph cut was proposed under the elliptical shape constraint to segment cervical lymph nodes on sonograms, and its effect on the measurement of short axis to long axis ratio (S/L) was investigated by using the relative ultimate measurement accuracy (RUMA). Under the same user inputs, the proposed algorithm successfully segmented all 60 sonograms tested, while the traditional graph cut failed. The mean RUMA resulted from the developed method was comparable to that resulted from the manual segmentation. Results indicated that utilizing the elliptical shape prior could appreciably improve the graph cut for nodes segmentation, and the proposed method satisfied the accuracy requirement of S/L measurement.
Cognitive Aids for Guiding Graph Comprehension
ERIC Educational Resources Information Center
Mautone, Patricia D.; Mayer, Richard E.
2007-01-01
This study sought to improve students' comprehension of scientific graphs by adapting scaffolding techniques used to aid text comprehension. In 3 experiments involving 121 female and 88 male college students, some students were shown cognitive aids prior to viewing 4 geography graphs whereas others were not; all students were then asked to write a…
Solving the scalability issue in quantum-based refinement: Q|R#1.
Zheng, Min; Moriarty, Nigel W; Xu, Yanting; Reimers, Jeffrey R; Afonine, Pavel V; Waller, Mark P
2017-12-01
Accurately refining biomacromolecules using a quantum-chemical method is challenging because the cost of a quantum-chemical calculation scales approximately as n m , where n is the number of atoms and m (≥3) is based on the quantum method of choice. This fundamental problem means that quantum-chemical calculations become intractable when the size of the system requires more computational resources than are available. In the development of the software package called Q|R, this issue is referred to as Q|R#1. A divide-and-conquer approach has been developed that fragments the atomic model into small manageable pieces in order to solve Q|R#1. Firstly, the atomic model of a crystal structure is analyzed to detect noncovalent interactions between residues, and the results of the analysis are represented as an interaction graph. Secondly, a graph-clustering algorithm is used to partition the interaction graph into a set of clusters in such a way as to minimize disruption to the noncovalent interaction network. Thirdly, the environment surrounding each individual cluster is analyzed and any residue that is interacting with a particular cluster is assigned to the buffer region of that particular cluster. A fragment is defined as a cluster plus its buffer region. The gradients for all atoms from each of the fragments are computed, and only the gradients from each cluster are combined to create the total gradients. A quantum-based refinement is carried out using the total gradients as chemical restraints. In order to validate this interaction graph-based fragmentation approach in Q|R, the entire atomic model of an amyloid cross-β spine crystal structure (PDB entry 2oNA) was refined.
A mesh partitioning algorithm for preserving spatial locality in arbitrary geometries
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nivarti, Girish V., E-mail: g.nivarti@alumni.ubc.ca; Salehi, M. Mahdi; Bushe, W. Kendal
2015-01-15
Highlights: •An algorithm for partitioning computational meshes is proposed. •The Morton order space-filling curve is modified to achieve improved locality. •A spatial locality metric is defined to compare results with existing approaches. •Results indicate improved performance of the algorithm in complex geometries. -- Abstract: A space-filling curve (SFC) is a proximity preserving linear mapping of any multi-dimensional space and is widely used as a clustering tool. Equi-sized partitioning of an SFC ignores the loss in clustering quality that occurs due to inaccuracies in the mapping. Often, this results in poor locality within partitions, especially for the conceptually simple, Morton ordermore » curves. We present a heuristic that improves partition locality in arbitrary geometries by slicing a Morton order curve at points where spatial locality is sacrificed. In addition, we develop algorithms that evenly distribute points to the extent possible while maintaining spatial locality. A metric is defined to estimate relative inter-partition contact as an indicator of communication in parallel computing architectures. Domain partitioning tests have been conducted on geometries relevant to turbulent reactive flow simulations. The results obtained highlight the performance of our method as an unsupervised and computationally inexpensive domain partitioning tool.« less
SING: Subgraph search In Non-homogeneous Graphs
2010-01-01
Background Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs. PMID:20170516
NASA Astrophysics Data System (ADS)
Chandramouli, Bharadwaj; Jang, Myoseon; Kamens, Richard M.
The partitioning of a diverse set of semivolatile organic compounds (SOCs) on a variety of organic aerosols was studied using smog chamber experimental data. Existing data on the partitioning of SOCs on aerosols from wood combustion, diesel combustion, and the α-pinene-O 3 reaction was augmented by carrying out smog chamber partitioning experiments on aerosols from meat cooking, and catalyzed and uncatalyzed gasoline engine exhaust. Model compositions for aerosols from meat cooking and gasoline combustion emissions were used to calculate activity coefficients for the SOCs in the organic aerosols and the Pankow absorptive gas/particle partitioning model was used to calculate the partitioning coefficient Kp and quantitate the predictive improvements of using the activity coefficient. The slope of the log K p vs. log p L0 correlation for partitioning on aerosols from meat cooking improved from -0.81 to -0.94 after incorporation of activity coefficients iγ om. A stepwise regression analysis of the partitioning model revealed that for the data set used in this study, partitioning predictions on α-pinene-O 3 secondary aerosol and wood combustion aerosol showed statistically significant improvement after incorporation of iγ om, which can be attributed to their overall polarity. The partitioning model was sensitive to changes in aerosol composition when updated compositions for α-pinene-O 3 aerosol and wood combustion aerosol were used. The octanol-air partitioning coefficient's ( KOA) effectiveness as a partitioning correlator over a variety of aerosol types was evaluated. The slope of the log K p- log K OA correlation was not constant over the aerosol types and SOCs used in the study and the use of KOA for partitioning correlations can potentially lead to significant deviations, especially for polar aerosols.
Dimer geometry, amoebae and a vortex dimer model
NASA Astrophysics Data System (ADS)
Nash, Charles; O'Connor, Denjoe
2017-09-01
We present a geometrical approach and introduce a connection for dimer problems on bipartite and non-bipartite graphs. In the bipartite case the connection is flat but has non-trivial {Z}2 holonomy round certain curves. This holonomy has the universality property that it does not change as the number of vertices in the fundamental domain of the graph is increased. It is argued that the K-theory of the torus, with or without punctures, is the appropriate underlying invariant. In the non-bipartite case the connection has non-zero curvature as well as non-zero Chern number. The curvature does not require the introduction of a magnetic field. The phase diagram of these models is captured by what is known as an amoeba. We introduce a dimer model with negative edge weights which correspond to vortices. The amoebae for various models are studied with particular emphasis on the case of negative edge weights. Vortices give rise to new kinds of amoebae with certain singular structures which we investigate. On the amoeba of the vortex full hexagonal lattice we find the partition function corresponds to that of a massless Dirac doublet.
A large-grain mapping approach for multiprocessor systems through data flow model. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Kim, Hwa-Soo
1991-01-01
A large-grain level mapping method is presented of numerical oriented applications onto multiprocessor systems. The method is based on the large-grain data flow representation of the input application and it assumes a general interconnection topology of the multiprocessor system. The large-grain data flow model was used because such representation best exhibits inherited parallelism in many important applications, e.g., CFD models based on partial differential equations can be presented in large-grain data flow format, very effectively. A generalized interconnection topology of the multiprocessor architecture is considered, including such architectural issues as interprocessor communication cost, with the aim to identify the 'best matching' between the application and the multiprocessor structure. The objective is to minimize the total execution time of the input algorithm running on the target system. The mapping strategy consists of the following: (1) large-grain data flow graph generation from the input application using compilation techniques; (2) data flow graph partitioning into basic computation blocks; and (3) physical mapping onto the target multiprocessor using a priority allocation scheme for the computation blocks.
Markov chain aggregation and its applications to combinatorial reaction networks.
Ganguly, Arnab; Petrov, Tatjana; Koeppl, Heinz
2014-09-01
We consider a continuous-time Markov chain (CTMC) whose state space is partitioned into aggregates, and each aggregate is assigned a probability measure. A sufficient condition for defining a CTMC over the aggregates is presented as a variant of weak lumpability, which also characterizes that the measure over the original process can be recovered from that of the aggregated one. We show how the applicability of de-aggregation depends on the initial distribution. The application section is devoted to illustrate how the developed theory aids in reducing CTMC models of biochemical systems particularly in connection to protein-protein interactions. We assume that the model is written by a biologist in form of site-graph-rewrite rules. Site-graph-rewrite rules compactly express that, often, only a local context of a protein (instead of a full molecular species) needs to be in a certain configuration in order to trigger a reaction event. This observation leads to suitable aggregate Markov chains with smaller state spaces, thereby providing sufficient reduction in computational complexity. This is further exemplified in two case studies: simple unbounded polymerization and early EGFR/insulin crosstalk.
Pan, Yongke; Niu, Wenjia
2017-01-01
Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm, which can easily resolve the out-of-sample problem. Relative works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. Different from these relative works, the regularized graph construction is researched here, which is important in the graph-based semisupervised learning methods. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, which is called combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space and then the kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines. PMID:28316616
NASA Astrophysics Data System (ADS)
Mason, J. M.; Fahy, F. J.
1986-10-01
The effectiveness of tuned Helmholtz resonators connected to the partition cavity in double-leaf partitions utilized in situations requiring low weight structures with high transmission loss is investigated as a method of improving sound transmission loss. This is demonstrated by a simple theoretical model and then experimentally verified. Results show that substantial improvements may be obtained at and around the mass-air-mass frequency for a total resonator volume 15 percent of the cavity volume.
ERIC Educational Resources Information Center
Lo, Ya-yu; Starling, A. Leyf Peirce
2009-01-01
This study examined the effects of a graphing task analysis using the Microsoft[R] Office Excel 2007 program on the single-subject multiple baseline graphing skills of three university graduate students. Using a multiple probe across participants design, the study demonstrated a functional relationship between the number of correct graphing…
OPEX: Optimized Eccentricity Computation in Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Henderson, Keith
2011-11-14
Real-world graphs have many properties of interest, but often these properties are expensive to compute. We focus on eccentricity, radius and diameter in this work. These properties are useful measures of the global connectivity patterns in a graph. Unfortunately, computing eccentricity for all nodes is O(n2) for a graph with n nodes. We present OPEX, a novel combination of optimizations which improves computation time of these properties by orders of magnitude in real-world experiments on graphs of many different sizes. We run OPEX on graphs with up to millions of links. OPEX gives either exact results or bounded approximations, unlikemore » its competitors which give probabilistic approximations or sacrifice node-level information (eccentricity) to compute graphlevel information (diameter).« less
Auction dynamics: A volume constrained MBO scheme
NASA Astrophysics Data System (ADS)
Jacobs, Matt; Merkurjev, Ekaterina; Esedoǧlu, Selim
2018-02-01
We show how auction algorithms, originally developed for the assignment problem, can be utilized in Merriman, Bence, and Osher's threshold dynamics scheme to simulate multi-phase motion by mean curvature in the presence of equality and inequality volume constraints on the individual phases. The resulting algorithms are highly efficient and robust, and can be used in simulations ranging from minimal partition problems in Euclidean space to semi-supervised machine learning via clustering on graphs. In the case of the latter application, numerous experimental results on benchmark machine learning datasets show that our approach exceeds the performance of current state-of-the-art methods, while requiring a fraction of the computation time.
Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
Overlapped partitioning for ensemble classifiers of P300-based brain-computer interfaces.
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance.
Transfer-Efficient Face Routing Using the Planar Graphs of Neighbors in High Density WSNs
Kim, Sang-Ha
2017-01-01
Face routing has been adopted in wireless sensor networks (WSNs) where topological changes occur frequently or maintaining full network information is difficult. For message forwarding in networks, a planar graph is used to prevent looping, and because long edges are removed by planarization and the resulting planar graph is composed of short edges, and messages are forwarded along multiple nodes connected by them even though they can be forwarded directly. To solve this, face routing using information on all nodes within 2-hop range was adopted to forward messages directly to the farthest node within radio range. However, as the density of the nodes increases, network performance plunges because message transfer nodes receive and process increased node information. To deal with this problem, we propose a new face routing using the planar graphs of neighboring nodes to improve transfer efficiency. It forwards a message directly to the farthest neighbor and reduces loads and processing time by distributing network graph construction and planarization to the neighbors. It also decreases the amount of location information to be transmitted by sending information on the planar graph nodes rather than on all neighboring nodes. Simulation results show that it significantly improves transfer efficiency. PMID:29053623
On the ordinary quiver of the symmetric group over a field of characteristic 2
NASA Astrophysics Data System (ADS)
Martin, Stuart; Russell, Lee
1997-11-01
Let [fraktur S]n and [fraktur A]n denote the symmetric and alternating groups of degree n[set membership][open face N] respectively. Let p be a prime number and let F be an arbitrary field of characteristic p. We say that a partition of n is p-regular if no p (non-zero) parts of it are equal; otherwise we call it p-singular. Let S[lambda]F denote the Specht module corresponding to [lambda]. For [lambda] a p-regular partition of n let D[lambda]F denote the unique irreducible top factor of S[lambda]F. Denote by [Delta][lambda]F =D[lambda]F [downward arrow][fraktur A]n its restriction to [fraktur A]n. Recall also that, over F, the ordinary quiver of the modular group algebra FG is a finite directed graph defined as follows: the vertices are labelled by the set of all simple FG-modules, L1, [ctdot], Lr, and the number of arrows from Li to Lj equals dimFExtFG(Li, Lj). The quiver gives important information about the block structure of G.
Compact localized states and flat bands from local symmetry partitioning
NASA Astrophysics Data System (ADS)
Röntgen, M.; Morfonios, C. V.; Schmelcher, P.
2018-01-01
We propose a framework for the connection between local symmetries of discrete Hamiltonians and the design of compact localized states. Such compact localized states are used for the creation of tunable, local symmetry-induced bound states in an energy continuum and flat energy bands for periodically repeated local symmetries in one- and two-dimensional lattices. The framework is based on very recent theorems in graph theory which are here employed to obtain a block partitioning of the Hamiltonian induced by the symmetry of a given system under local site permutations. The diagonalization of the Hamiltonian is thereby reduced to finding the eigenspectra of smaller matrices, with eigenvectors automatically divided into compact localized and extended states. We distinguish between local symmetry operations which commute with the Hamiltonian, and those which do not commute due to an asymmetric coupling to the surrounding sites. While valuable as a computational tool for versatile discrete systems with locally symmetric structures, the approach provides in particular a unified, intuitive, and efficient route to the flexible design of compact localized states at desired energies.
Thermodynamic characterization of networks using graph polynomials
NASA Astrophysics Data System (ADS)
Ye, Cheng; Comin, César H.; Peron, Thomas K. DM.; Silva, Filipi N.; Rodrigues, Francisco A.; Costa, Luciano da F.; Torsello, Andrea; Hancock, Edwin R.
2015-09-01
In this paper, we present a method for characterizing the evolution of time-varying complex networks by adopting a thermodynamic representation of network structure computed from a polynomial (or algebraic) characterization of graph structure. Commencing from a representation of graph structure based on a characteristic polynomial computed from the normalized Laplacian matrix, we show how the polynomial is linked to the Boltzmann partition function of a network. This allows us to compute a number of thermodynamic quantities for the network, including the average energy and entropy. Assuming that the system does not change volume, we can also compute the temperature, defined as the rate of change of entropy with energy. All three thermodynamic variables can be approximated using low-order Taylor series that can be computed using the traces of powers of the Laplacian matrix, avoiding explicit computation of the normalized Laplacian spectrum. These polynomial approximations allow a smoothed representation of the evolution of networks to be constructed in the thermodynamic space spanned by entropy, energy, and temperature. We show how these thermodynamic variables can be computed in terms of simple network characteristics, e.g., the total number of nodes and node degree statistics for nodes connected by edges. We apply the resulting thermodynamic characterization to real-world time-varying networks representing complex systems in the financial and biological domains. The study demonstrates that the method provides an efficient tool for detecting abrupt changes and characterizing different stages in network evolution.
Motifs in triadic random graphs based on Steiner triple systems
NASA Astrophysics Data System (ADS)
Winkler, Marco; Reichardt, Jörg
2013-08-01
Conventionally, pairwise relationships between nodes are considered to be the fundamental building blocks of complex networks. However, over the last decade, the overabundance of certain subnetwork patterns, i.e., the so-called motifs, has attracted much attention. It has been hypothesized that these motifs, instead of links, serve as the building blocks of network structures. Although the relation between a network's topology and the general properties of the system, such as its function, its robustness against perturbations, or its efficiency in spreading information, is the central theme of network science, there is still a lack of sound generative models needed for testing the functional role of subgraph motifs. Our work aims to overcome this limitation. We employ the framework of exponential random graph models (ERGMs) to define models based on triadic substructures. The fact that only a small portion of triads can actually be set independently poses a challenge for the formulation of such models. To overcome this obstacle, we use Steiner triple systems (STSs). These are partitions of sets of nodes into pair-disjoint triads, which thus can be specified independently. Combining the concepts of ERGMs and STSs, we suggest generative models capable of generating ensembles of networks with nontrivial triadic Z-score profiles. Further, we discover inevitable correlations between the abundance of triad patterns, which occur solely for statistical reasons and need to be taken into account when discussing the functional implications of motif statistics. Moreover, we calculate the degree distributions of our triadic random graphs analytically.
Automatic classification of protein structures relying on similarities between alignments
2012-01-01
Background Identification of protein structural cores requires isolation of sets of proteins all sharing a same subset of structural motifs. In the context of an ever growing number of available 3D protein structures, standard and automatic clustering algorithms require adaptations so as to allow for efficient identification of such sets of proteins. Results When considering a pair of 3D structures, they are stated as similar or not according to the local similarities of their matching substructures in a structural alignment. This binary relation can be represented in a graph of similarities where a node represents a 3D protein structure and an edge states that two 3D protein structures are similar. Therefore, classifying proteins into structural families can be viewed as a graph clustering task. Unfortunately, because such a graph encodes only pairwise similarity information, clustering algorithms may include in the same cluster a subset of 3D structures that do not share a common substructure. In order to overcome this drawback we first define a ternary similarity on a triple of 3D structures as a constraint to be satisfied by the graph of similarities. Such a ternary constraint takes into account similarities between pairwise alignments, so as to ensure that the three involved protein structures do have some common substructure. We propose hereunder a modification algorithm that eliminates edges from the original graph of similarities and gives a reduced graph in which no ternary constraints are violated. Our approach is then first to build a graph of similarities, then to reduce the graph according to the modification algorithm, and finally to apply to the reduced graph a standard graph clustering algorithm. Such method was used for classifying ASTRAL-40 non-redundant protein domains, identifying significant pairwise similarities with Yakusa, a program devised for rapid 3D structure alignments. Conclusions We show that filtering similarities prior to standard graph based clustering process by applying ternary similarity constraints i) improves the separation of proteins of different classes and consequently ii) improves the classification quality of standard graph based clustering algorithms according to the reference classification SCOP. PMID:22974051
NASA Astrophysics Data System (ADS)
Mason, J. M.; Fahy, F. J.
1988-07-01
Double-leaf partitions are often utilized in situations requiring low weight structures with high transmission loss, an example of current interest being the fuselage walls of propeller-driven aircraft. In this case, acoustic excitation is periodic and, if one of the frequencies of excitation lies in the region of the fundamental mass-air-mass frequency of the partition, insulation performance is considerably less than desired. The potential effectiveness of tuned Helmholtz resonators connected to the partition cavity is investigated as a method of improving transmission loss. This is demonstrated by a simple theoretical model and then experimentally verified. Results show that substantial improvements may be obtained at and around the mass-air-mass frequency for a total resonator volume 15 percent of the cavity volume.
ERIC Educational Resources Information Center
Stewart, Kelise K.; Carr, James E.; Brandt, Charles W.; McHenry, Meade M.
2007-01-01
The present study evaluated the effects of both a traditional lecture and the conservative dual-criterion (CDC) judgment aid on the ability of 6 university students to visually inspect AB-design line graphs. The traditional lecture reliably failed to improve visual inspection accuracy, whereas the CDC method substantially improved the performance…
Multiply scaled constrained nonlinear equation solvers. [for nonlinear heat conduction problems
NASA Technical Reports Server (NTRS)
Padovan, Joe; Krishna, Lala
1986-01-01
To improve the numerical stability of nonlinear equation solvers, a partitioned multiply scaled constraint scheme is developed. This scheme enables hierarchical levels of control for nonlinear equation solvers. To complement the procedure, partitioned convergence checks are established along with self-adaptive partitioning schemes. Overall, such procedures greatly enhance the numerical stability of the original solvers. To demonstrate and motivate the development of the scheme, the problem of nonlinear heat conduction is considered. In this context the main emphasis is given to successive substitution-type schemes. To verify the improved numerical characteristics associated with partitioned multiply scaled solvers, results are presented for several benchmark examples.
Modeling flow and transport in fracture networks using graphs
NASA Astrophysics Data System (ADS)
Karra, S.; O'Malley, D.; Hyman, J. D.; Viswanathan, H. S.; Srinivasan, G.
2018-03-01
Fractures form the main pathways for flow in the subsurface within low-permeability rock. For this reason, accurately predicting flow and transport in fractured systems is vital for improving the performance of subsurface applications. Fracture sizes in these systems can range from millimeters to kilometers. Although modeling flow and transport using the discrete fracture network (DFN) approach is known to be more accurate due to incorporation of the detailed fracture network structure over continuum-based methods, capturing the flow and transport in such a wide range of scales is still computationally intractable. Furthermore, if one has to quantify uncertainty, hundreds of realizations of these DFN models have to be run. To reduce the computational burden, we solve flow and transport on a graph representation of a DFN. We study the accuracy of the graph approach by comparing breakthrough times and tracer particle statistical data between the graph-based and the high-fidelity DFN approaches, for fracture networks with varying number of fractures and degree of heterogeneity. Due to our recent developments in capabilities to perform DFN high-fidelity simulations on fracture networks with large number of fractures, we are in a unique position to perform such a comparison. We show that the graph approach shows a consistent bias with up to an order of magnitude slower breakthrough when compared to the DFN approach. We show that this is due to graph algorithm's underprediction of the pressure gradients across intersections on a given fracture, leading to slower tracer particle speeds between intersections and longer travel times. We present a bias correction methodology to the graph algorithm that reduces the discrepancy between the DFN and graph predictions. We show that with this bias correction, the graph algorithm predictions significantly improve and the results are very accurate. The good accuracy and the low computational cost, with O (104) times lower times than the DFN, makes the graph algorithm an ideal technique to incorporate in uncertainty quantification methods.
Modeling flow and transport in fracture networks using graphs.
Karra, S; O'Malley, D; Hyman, J D; Viswanathan, H S; Srinivasan, G
2018-03-01
Fractures form the main pathways for flow in the subsurface within low-permeability rock. For this reason, accurately predicting flow and transport in fractured systems is vital for improving the performance of subsurface applications. Fracture sizes in these systems can range from millimeters to kilometers. Although modeling flow and transport using the discrete fracture network (DFN) approach is known to be more accurate due to incorporation of the detailed fracture network structure over continuum-based methods, capturing the flow and transport in such a wide range of scales is still computationally intractable. Furthermore, if one has to quantify uncertainty, hundreds of realizations of these DFN models have to be run. To reduce the computational burden, we solve flow and transport on a graph representation of a DFN. We study the accuracy of the graph approach by comparing breakthrough times and tracer particle statistical data between the graph-based and the high-fidelity DFN approaches, for fracture networks with varying number of fractures and degree of heterogeneity. Due to our recent developments in capabilities to perform DFN high-fidelity simulations on fracture networks with large number of fractures, we are in a unique position to perform such a comparison. We show that the graph approach shows a consistent bias with up to an order of magnitude slower breakthrough when compared to the DFN approach. We show that this is due to graph algorithm's underprediction of the pressure gradients across intersections on a given fracture, leading to slower tracer particle speeds between intersections and longer travel times. We present a bias correction methodology to the graph algorithm that reduces the discrepancy between the DFN and graph predictions. We show that with this bias correction, the graph algorithm predictions significantly improve and the results are very accurate. The good accuracy and the low computational cost, with O(10^{4}) times lower times than the DFN, makes the graph algorithm an ideal technique to incorporate in uncertainty quantification methods.
Modeling flow and transport in fracture networks using graphs
Karra, S.; O'Malley, D.; Hyman, J. D.; ...
2018-03-09
Fractures form the main pathways for flow in the subsurface within low-permeability rock. For this reason, accurately predicting flow and transport in fractured systems is vital for improving the performance of subsurface applications. Fracture sizes in these systems can range from millimeters to kilometers. Although modeling flow and transport using the discrete fracture network (DFN) approach is known to be more accurate due to incorporation of the detailed fracture network structure over continuum-based methods, capturing the flow and transport in such a wide range of scales is still computationally intractable. Furthermore, if one has to quantify uncertainty, hundreds of realizationsmore » of these DFN models have to be run. To reduce the computational burden, we solve flow and transport on a graph representation of a DFN. We study the accuracy of the graph approach by comparing breakthrough times and tracer particle statistical data between the graph-based and the high-fidelity DFN approaches, for fracture networks with varying number of fractures and degree of heterogeneity. Due to our recent developments in capabilities to perform DFN high-fidelity simulations on fracture networks with large number of fractures, we are in a unique position to perform such a comparison. We show that the graph approach shows a consistent bias with up to an order of magnitude slower breakthrough when compared to the DFN approach. We show that this is due to graph algorithm's underprediction of the pressure gradients across intersections on a given fracture, leading to slower tracer particle speeds between intersections and longer travel times. We present a bias correction methodology to the graph algorithm that reduces the discrepancy between the DFN and graph predictions. We show that with this bias correction, the graph algorithm predictions significantly improve and the results are very accurate. In conclusion, the good accuracy and the low computational cost, with O(10 4) times lower times than the DFN, makes the graph algorithm an ideal technique to incorporate in uncertainty quantification methods.« less
Modeling flow and transport in fracture networks using graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Karra, S.; O'Malley, D.; Hyman, J. D.
Fractures form the main pathways for flow in the subsurface within low-permeability rock. For this reason, accurately predicting flow and transport in fractured systems is vital for improving the performance of subsurface applications. Fracture sizes in these systems can range from millimeters to kilometers. Although modeling flow and transport using the discrete fracture network (DFN) approach is known to be more accurate due to incorporation of the detailed fracture network structure over continuum-based methods, capturing the flow and transport in such a wide range of scales is still computationally intractable. Furthermore, if one has to quantify uncertainty, hundreds of realizationsmore » of these DFN models have to be run. To reduce the computational burden, we solve flow and transport on a graph representation of a DFN. We study the accuracy of the graph approach by comparing breakthrough times and tracer particle statistical data between the graph-based and the high-fidelity DFN approaches, for fracture networks with varying number of fractures and degree of heterogeneity. Due to our recent developments in capabilities to perform DFN high-fidelity simulations on fracture networks with large number of fractures, we are in a unique position to perform such a comparison. We show that the graph approach shows a consistent bias with up to an order of magnitude slower breakthrough when compared to the DFN approach. We show that this is due to graph algorithm's underprediction of the pressure gradients across intersections on a given fracture, leading to slower tracer particle speeds between intersections and longer travel times. We present a bias correction methodology to the graph algorithm that reduces the discrepancy between the DFN and graph predictions. We show that with this bias correction, the graph algorithm predictions significantly improve and the results are very accurate. In conclusion, the good accuracy and the low computational cost, with O(10 4) times lower times than the DFN, makes the graph algorithm an ideal technique to incorporate in uncertainty quantification methods.« less
PiTS-1: Carbon Partitioning in Loblolly Pine after 13C Labeling and Shade Treatments
Warren, J. M.; Iversen, C. M.; Garten, Jr., C. T.; Norby, R. J.; Childs, J.; Brice, D.; Evans, R. M.; Gu, L.; Thornton, P.; Weston, D. J.
2013-01-01
The PiTS task was established with the objective of improving the C partitioning routines in existing ecosystem models by exploring mechanistic model representations of partitioning tested against field observations. We used short-term field manipulations of C flow, through 13CO2 labeling, canopy shading and stem girdling, to dramatically alter C partitioning, and resultant data are being used to test model representation of C partitioning processes in the Community Land Model (CLM4 or CLM4.5).
Improved segmentation of abnormal cervical nuclei using a graph-search based approach
NASA Astrophysics Data System (ADS)
Zhang, Ling; Liu, Shaoxiong; Wang, Tianfu; Chen, Siping; Sonka, Milan
2015-03-01
Reliable segmentation of abnormal nuclei in cervical cytology is of paramount importance in automation-assisted screening techniques. This paper presents a general method for improving the segmentation of abnormal nuclei using a graph-search based approach. More specifically, the proposed method focuses on the improvement of coarse (initial) segmentation. The improvement relies on a transform that maps round-like border in the Cartesian coordinate system into lines in the polar coordinate system. The costs consisting of nucleus-specific edge and region information are assigned to the nodes. The globally optimal path in the constructed graph is then identified by dynamic programming. We have tested the proposed method on abnormal nuclei from two cervical cell image datasets, Herlev and H and E stained liquid-based cytology (HELBC), and the comparative experiments with recent state-of-the-art approaches demonstrate the superior performance of the proposed method.
High-Performance Data Analytics Beyond the Relational and Graph Data Models with GEMS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Minutoli, Marco; Bhatt, Shreyansh
Graphs represent an increasingly popular data model for data-analytics, since they can naturally represent relationships and interactions between entities. Relational databases and their pure table-based data model are not well suitable to store and process sparse data. Consequently, graph databases have gained interest in the last few years and the Resource Description Framework (RDF) became the standard data model for graph data. Nevertheless, while RDF is well suited to analyze the relationships between the entities, it is not efficient in representing their attributes and properties. In this work we propose the adoption of a new hybrid data model, based onmore » attributed graphs, that aims at overcoming the limitations of the pure relational and graph data models. We present how we have re-designed the GEMS data-analytics framework to fully take advantage of the proposed hybrid data model. To improve analysts productivity, in addition to a C++ API for applications development, we adopt GraQL as input query language. We validate our approach implementing a set of queries on net-flow data and we compare our framework performance against Neo4j. Experimental results show significant performance improvement over Neo4j, up to several orders of magnitude when increasing the size of the input data.« less
Key-Node-Separated Graph Clustering and Layouts for Human Relationship Graph Visualization.
Itoh, Takayuki; Klein, Karsten
2015-01-01
Many graph-drawing methods apply node-clustering techniques based on the density of edges to find tightly connected subgraphs and then hierarchically visualize the clustered graphs. However, users may want to focus on important nodes and their connections to groups of other nodes for some applications. For this purpose, it is effective to separately visualize the key nodes detected based on adjacency and attributes of the nodes. This article presents a graph visualization technique for attribute-embedded graphs that applies a graph-clustering algorithm that accounts for the combination of connections and attributes. The graph clustering step divides the nodes according to the commonality of connected nodes and similarity of feature value vectors. It then calculates the distances between arbitrary pairs of clusters according to the number of connecting edges and the similarity of feature value vectors and finally places the clusters based on the distances. Consequently, the technique separates important nodes that have connections to multiple large clusters and improves the visibility of such nodes' connections. To test this technique, this article presents examples with human relationship graph datasets, including a coauthorship and Twitter communication network dataset.
Caetano, Tibério S; McAuley, Julian J; Cheng, Li; Le, Quoc V; Smola, Alex J
2009-06-01
As a fundamental problem in pattern recognition, graph matching has applications in a variety of fields, from computer vision to computational biology. In graph matching, patterns are modeled as graphs and pattern recognition amounts to finding a correspondence between the nodes of different graphs. Many formulations of this problem can be cast in general as a quadratic assignment problem, where a linear term in the objective function encodes node compatibility and a quadratic term encodes edge compatibility. The main research focus in this theme is about designing efficient algorithms for approximately solving the quadratic assignment problem, since it is NP-hard. In this paper we turn our attention to a different question: how to estimate compatibility functions such that the solution of the resulting graph matching problem best matches the expected solution that a human would manually provide. We present a method for learning graph matching: the training examples are pairs of graphs and the 'labels' are matches between them. Our experimental results reveal that learning can substantially improve the performance of standard graph matching algorithms. In particular, we find that simple linear assignment with such a learning scheme outperforms Graduated Assignment with bistochastic normalisation, a state-of-the-art quadratic assignment relaxation algorithm.
Scarselli, Franco; Tsoi, Ah Chung; Hagenbuchner, Markus; Noi, Lucia Di
2013-12-01
This paper proposes the combination of two state-of-the-art algorithms for processing graph input data, viz., the probabilistic mapping graph self organizing map, an unsupervised learning approach, and the graph neural network, a supervised learning approach. We organize these two algorithms in a cascade architecture containing a probabilistic mapping graph self organizing map, and a graph neural network. We show that this combined approach helps us to limit the long-term dependency problem that exists when training the graph neural network resulting in an overall improvement in performance. This is demonstrated in an application to a benchmark problem requiring the detection of spam in a relatively large set of web sites. It is found that the proposed method produces results which reach the state of the art when compared with some of the best results obtained by others using quite different approaches. A particular strength of our method is its applicability towards any input domain which can be represented as a graph. Copyright © 2013 Elsevier Ltd. All rights reserved.
Mathematics of Web science: structure, dynamics and incentives.
Chayes, Jennifer
2013-03-28
Dr Chayes' talk described how, to a discrete mathematician, 'all the world's a graph, and all the people and domains merely vertices'. A graph is represented as a set of vertices V and a set of edges E, so that, for instance, in the World Wide Web, V is the set of pages and E the directed hyperlinks; in a social network, V is the people and E the set of relationships; and in the autonomous system Internet, V is the set of autonomous systems (such as AOL, Yahoo! and MSN) and E the set of connections. This means that mathematics can be used to study the Web (and other large graphs in the online world) in the following way: first, we can model online networks as large finite graphs; second, we can sample pieces of these graphs; third, we can understand and then control processes on these graphs; and fourth, we can develop algorithms for these graphs and apply them to improve the online experience.
Use of graph theory measures to identify errors in record linkage.
Randall, Sean M; Boyd, James H; Ferrante, Anna M; Bauer, Jacqueline K; Semmens, James B
2014-07-01
Ensuring high linkage quality is important in many record linkage applications. Current methods for ensuring quality are manual and resource intensive. This paper seeks to determine the effectiveness of graph theory techniques in identifying record linkage errors. A range of graph theory techniques was applied to two linked datasets, with known truth sets. The ability of graph theory techniques to identify groups containing errors was compared to a widely used threshold setting technique. This methodology shows promise; however, further investigations into graph theory techniques are required. The development of more efficient and effective methods of improving linkage quality will result in higher quality datasets that can be delivered to researchers in shorter timeframes. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Community structure from spectral properties in complex networks
NASA Astrophysics Data System (ADS)
Servedio, V. D. P.; Colaiori, F.; Capocci, A.; Caldarelli, G.
2005-06-01
We analyze the spectral properties of complex networks focusing on their relation to the community structure, and develop an algorithm based on correlations among components of different eigenvectors. The algorithm applies to general weighted networks, and, in a suitably modified version, to the case of directed networks. Our method allows to correctly detect communities in sharply partitioned graphs, however it is useful to the analysis of more complex networks, without a well defined cluster structure, as social and information networks. As an example, we test the algorithm on a large scale data-set from a psychological experiment of free word association, where it proves to be successful both in clustering words, and in uncovering mental association patterns.
Hydromagnetic Rarefied Fluid Flow over a Wedge in the Presence of Surface Slip and Thermal Radiation
NASA Astrophysics Data System (ADS)
Das, K.; Sharma, R. P.; Duari, P. R.
2017-12-01
An analysis is presented to investigate the effects of thermal radiation on a convective slip flow of an electrically conducting slightly rarefied fluid, having temperature dependent fluid properties, over a wedge with a thermal jump at the surface of the boundary in the presence of a transverse magnetic field. The reduced equations are solved numerically using the finite difference code that implements the 3-stage Lobatto IIIa formula for the partitioned Runge-Kutta method. Numerical results for the dimensionless velocity and temperature as well as for the skin friction coefficient and the Nusselt number are presented through graphs and tables for pertinent parameters to show interesting aspects of the solution.
Runtime support and compilation methods for user-specified data distributions
NASA Technical Reports Server (NTRS)
Ponnusamy, Ravi; Saltz, Joel; Choudhury, Alok; Hwang, Yuan-Shin; Fox, Geoffrey
1993-01-01
This paper describes two new ideas by which an HPF compiler can deal with irregular computations effectively. The first mechanism invokes a user specified mapping procedure via a set of compiler directives. The directives allow use of program arrays to describe graph connectivity, spatial location of array elements, and computational load. The second mechanism is a simple conservative method that in many cases enables a compiler to recognize that it is possible to reuse previously computed information from inspectors (e.g. communication schedules, loop iteration partitions, information that associates off-processor data copies with on-processor buffer locations). We present performance results for these mechanisms from a Fortran 90D compiler implementation.
Many-core graph analytics using accelerated sparse linear algebra routines
NASA Astrophysics Data System (ADS)
Kozacik, Stephen; Paolini, Aaron L.; Fox, Paul; Kelmelis, Eric
2016-05-01
Graph analytics is a key component in identifying emerging trends and threats in many real-world applications. Largescale graph analytics frameworks provide a convenient and highly-scalable platform for developing algorithms to analyze large datasets. Although conceptually scalable, these techniques exhibit poor performance on modern computational hardware. Another model of graph computation has emerged that promises improved performance and scalability by using abstract linear algebra operations as the basis for graph analysis as laid out by the GraphBLAS standard. By using sparse linear algebra as the basis, existing highly efficient algorithms can be adapted to perform computations on the graph. This approach, however, is often less intuitive to graph analytics experts, who are accustomed to vertex-centric APIs such as Giraph, GraphX, and Tinkerpop. We are developing an implementation of the high-level operations supported by these APIs in terms of linear algebra operations. This implementation is be backed by many-core implementations of the fundamental GraphBLAS operations required, and offers the advantages of both the intuitive programming model of a vertex-centric API and the performance of a sparse linear algebra implementation. This technology can reduce the number of nodes required, as well as the run-time for a graph analysis problem, enabling customers to perform more complex analysis with less hardware at lower cost. All of this can be accomplished without the requirement for the customer to make any changes to their analytics code, thanks to the compatibility with existing graph APIs.
Approximate Computing Techniques for Iterative Graph Algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panyala, Ajay R.; Subasi, Omer; Halappanavar, Mahantesh
Approximate computing enables processing of large-scale graphs by trading off quality for performance. Approximate computing techniques have become critical not only due to the emergence of parallel architectures but also the availability of large scale datasets enabling data-driven discovery. Using two prototypical graph algorithms, PageRank and community detection, we present several approximate computing heuristics to scale the performance with minimal loss of accuracy. We present several heuristics including loop perforation, data caching, incomplete graph coloring and synchronization, and evaluate their efficiency. We demonstrate performance improvements of up to 83% for PageRank and up to 450x for community detection, with lowmore » impact of accuracy for both the algorithms. We expect the proposed approximate techniques will enable scalable graph analytics on data of importance to several applications in science and their subsequent adoption to scale similar graph algorithms.« less
A Research Graph dataset for connecting research data repositories using RD-Switchboard.
Aryani, Amir; Poblet, Marta; Unsworth, Kathryn; Wang, Jingbo; Evans, Ben; Devaraju, Anusuriya; Hausstein, Brigitte; Klas, Claus-Peter; Zapilko, Benjamin; Kaplun, Samuele
2018-05-29
This paper describes the open access graph dataset that shows the connections between Dryad, CERN, ANDS and other international data repositories to publications and grants across multiple research data infrastructures. The graph dataset was created using the Research Graph data model and the Research Data Switchboard (RD-Switchboard), a collaborative project by the Research Data Alliance DDRI Working Group (DDRI WG) with the aim to discover and connect the related research datasets based on publication co-authorship or jointly funded grants. The graph dataset allows researchers to trace and follow the paths to understanding a body of work. By mapping the links between research datasets and related resources, the graph dataset improves both their discovery and visibility, while avoiding duplicate efforts in data creation. Ultimately, the linked datasets may spur novel ideas, facilitate reproducibility and re-use in new applications, stimulate combinatorial creativity, and foster collaborations across institutions.
Ivanciuc, Ovidiu
2013-06-01
Chemical and molecular graphs have fundamental applications in chemoinformatics, quantitative structureproperty relationships (QSPR), quantitative structure-activity relationships (QSAR), virtual screening of chemical libraries, and computational drug design. Chemoinformatics applications of graphs include chemical structure representation and coding, database search and retrieval, and physicochemical property prediction. QSPR, QSAR and virtual screening are based on the structure-property principle, which states that the physicochemical and biological properties of chemical compounds can be predicted from their chemical structure. Such structure-property correlations are usually developed from topological indices and fingerprints computed from the molecular graph and from molecular descriptors computed from the three-dimensional chemical structure. We present here a selection of the most important graph descriptors and topological indices, including molecular matrices, graph spectra, spectral moments, graph polynomials, and vertex topological indices. These graph descriptors are used to define several topological indices based on molecular connectivity, graph distance, reciprocal distance, distance-degree, distance-valency, spectra, polynomials, and information theory concepts. The molecular descriptors and topological indices can be developed with a more general approach, based on molecular graph operators, which define a family of graph indices related by a common formula. Graph descriptors and topological indices for molecules containing heteroatoms and multiple bonds are computed with weighting schemes based on atomic properties, such as the atomic number, covalent radius, or electronegativity. The correlation in QSPR and QSAR models can be improved by optimizing some parameters in the formula of topological indices, as demonstrated for structural descriptors based on atomic connectivity and graph distance.
Adaptive packet switch with an optical core (demonstrator)
NASA Astrophysics Data System (ADS)
Abdo, Ahmad; Bishtein, Vadim; Clark, Stewart A.; Dicorato, Pino; Lu, David T.; Paredes, Sofia A.; Taebi, Sareh; Hall, Trevor J.
2004-11-01
A three-stage opto-electronic packet switch architecture is described consisting of a reconfigurable optical centre stage surrounded by two electronic buffering stages partitioned into sectors to ease memory contention. A Flexible Bandwidth Provision (FBP) algorithm, implemented on a soft-core processor, is used to change the configuration of the input sectors and optical centre stage to set up internal paths that will provide variable bandwidth to serve the traffic. The switch is modeled by a bipartite graph built from a service matrix, which is a function of the arriving traffic. The bipartite graph is decomposed by solving an edge-colouring problem and the resulting permutations are used to configure the switch. Simulation results show that this architecture exhibits a dramatic reduction of complexity and increased potential for scalability, at the price of only a modest spatial speed-up k, 1
Santitissadeekorn, N; Bollt, E M
2007-06-01
In this paper, we present an approach to approximate the Frobenius-Perron transfer operator from a sequence of time-ordered images, that is, a movie dataset. Unlike time-series data, successive images do not provide a direct access to a trajectory of a point in a phase space; more precisely, a pixel in an image plane. Therefore, we reconstruct the velocity field from image sequences based on the infinitesimal generator of the Frobenius-Perron operator. Moreover, we relate this problem to the well-known optical flow problem from the computer vision community and we validate the continuity equation derived from the infinitesimal operator as a constraint equation for the optical flow problem. Once the vector field and then a discrete transfer operator are found, then, in addition, we present a graph modularity method as a tool to discover basin structure in the phase space. Together with a tool to reconstruct a velocity field, this graph-based partition method provides us with a way to study transport behavior and other ergodic properties of measurable dynamical systems captured only through image sequences.
Parallel heuristics for scalable community detection
Lu, Hao; Halappanavar, Mahantesh; Kalyanaraman, Ananth
2015-08-14
Community detection has become a fundamental operation in numerous graph-theoretic applications. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is an iterative heuristic for modularity optimization. Originally developed in 2008, the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method ismore » also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real world graphs derived from multiple application domains. Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in comparable number or fewer iterations, while providing real speedups of up to 16x using 32 threads.« less
NASA Astrophysics Data System (ADS)
Gligor, M.; Ausloos, M.
2007-05-01
The statistical distances between countries, calculated for various moving average time windows, are mapped into the ultrametric subdominant space as in classical Minimal Spanning Tree methods. The Moving Average Minimal Length Path (MAMLP) algorithm allows a decoupling of fluctuations with respect to the mass center of the system from the movement of the mass center itself. A Hamiltonian representation given by a factor graph is used and plays the role of cost function. The present analysis pertains to 11 macroeconomic (ME) indicators, namely the GDP (x1), Final Consumption Expenditure (x2), Gross Capital Formation (x3), Net Exports (x4), Consumer Price Index (y1), Rates of Interest of the Central Banks (y2), Labour Force (z1), Unemployment (z2), GDP/hour worked (z3), GDP/capita (w1) and Gini coefficient (w2). The target group of countries is composed of 15 EU countries, data taken between 1995 and 2004. By two different methods (the Bipartite Factor Graph Analysis and the Correlation Matrix Eigensystem Analysis) it is found that the strongly correlated countries with respect to the macroeconomic indicators fluctuations can be partitioned into stable clusters.
Ito, Yoichiro; Clary, Robert
2016-01-01
High-speed countercurrent chromatography with a spiral tube assembly can retain a satisfactory amount of stationary phase of polymer phase systems used for protein separation. In order to improve the partition efficiency a simple tool to modify the tubing shapes was fabricated, and the following four different tubing modifications were made: intermittently pressed at 10 mm width, flat, flat-wave, and flat-twist. Partition efficiencies of the separation column made from these modified tubing were examined in protein separation with an aqueous-aqueous polymer phase system at flow rates of 1–2 ml/min under 800 rpm. The results indicated that the column with all modified tubing improved the partition efficiency at a flow rate of 1 ml/min, but at a higher flow rate of 2 ml/min the columns made of flattened tubing showed lowered partition efficiency apparently due to the loss of the retained stationary phase. Among all the modified columns, the column with intermittently pressed tubing gave the best peak resolution. It may be concluded that the intermittently pressed and flat-twist improve the partition efficiency in a semi-preparative separation while other modified tubing of flat and flat-wave configurations may be used for analytical separations with a low flow rate. PMID:27790621
Ito, Yoichiro; Clary, Robert
2016-12-01
High-speed countercurrent chromatography with a spiral tube assembly can retain a satisfactory amount of stationary phase of polymer phase systems used for protein separation. In order to improve the partition efficiency a simple tool to modify the tubing shapes was fabricated, and the following four different tubing modifications were made: intermittently pressed at 10 mm width, flat, flat-wave, and flat-twist. Partition efficiencies of the separation column made from these modified tubing were examined in protein separation with an aqueous-aqueous polymer phase system at flow rates of 1-2 ml/min under 800 rpm. The results indicated that the column with all modified tubing improved the partition efficiency at a flow rate of 1 ml/min, but at a higher flow rate of 2 ml/min the columns made of flattened tubing showed lowered partition efficiency apparently due to the loss of the retained stationary phase. Among all the modified columns, the column with intermittently pressed tubing gave the best peak resolution. It may be concluded that the intermittently pressed and flat-twist improve the partition efficiency in a semi-preparative separation while other modified tubing of flat and flat-wave configurations may be used for analytical separations with a low flow rate.
Enhancing Community Detection By Affinity-based Edge Weighting Scheme
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoo, Andy; Sanders, Geoffrey; Henson, Van
Community detection refers to an important graph analytics problem of finding a set of densely-connected subgraphs in a graph and has gained a great deal of interest recently. The performance of current community detection algorithms is limited by an inherent constraint of unweighted graphs that offer very little information on their internal community structures. In this paper, we propose a new scheme to address this issue that weights the edges in a given graph based on recently proposed vertex affinity. The vertex affinity quantifies the proximity between two vertices in terms of their clustering strength, and therefore, it is idealmore » for graph analytics applications such as community detection. We also demonstrate that the affinity-based edge weighting scheme can improve the performance of community detection algorithms significantly.« less
A graph-based approach for the retrieval of multi-modality medical images.
Kumar, Ashnil; Kim, Jinman; Wen, Lingfeng; Fulham, Michael; Feng, Dagan
2014-02-01
In this paper, we address the retrieval of multi-modality medical volumes, which consist of two different imaging modalities, acquired sequentially, from the same scanner. One such example, positron emission tomography and computed tomography (PET-CT), provides physicians with complementary functional and anatomical features as well as spatial relationships and has led to improved cancer diagnosis, localisation, and staging. The challenge of multi-modality volume retrieval for cancer patients lies in representing the complementary geometric and topologic attributes between tumours and organs. These attributes and relationships, which are used for tumour staging and classification, can be formulated as a graph. It has been demonstrated that graph-based methods have high accuracy for retrieval by spatial similarity. However, naïvely representing all relationships on a complete graph obscures the structure of the tumour-anatomy relationships. We propose a new graph structure derived from complete graphs that structurally constrains the edges connected to tumour vertices based upon the spatial proximity of tumours and organs. This enables retrieval on the basis of tumour localisation. We also present a similarity matching algorithm that accounts for different feature sets for graph elements from different imaging modalities. Our method emphasises the relationships between a tumour and related organs, while still modelling patient-specific anatomical variations. Constraining tumours to related anatomical structures improves the discrimination potential of graphs, making it easier to retrieve similar images based on tumour location. We evaluated our retrieval methodology on a dataset of clinical PET-CT volumes. Our results showed that our method enabled the retrieval of multi-modality images using spatial features. Our graph-based retrieval algorithm achieved a higher precision than several other retrieval techniques: gray-level histograms as well as state-of-the-art methods such as visual words using the scale- invariant feature transform (SIFT) and relational matrices representing the spatial arrangements of objects. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
McGibbney, L. J.; Jiang, Y.; Burgess, A. B.
2017-12-01
Big Earth observation data have been produced, archived and made available online, but discovering the right data in a manner that precisely and efficiently satisfies user needs presents a significant challenge to the Earth Science (ES) community. An emerging trend in information retrieval community is to utilize knowledge graphs to assist users in quickly finding desired information from across knowledge sources. This is particularly prevalent within the fields of social media and complex multimodal information processing to name but a few, however building a domain-specific knowledge graph is labour-intensive and hard to keep up-to-date. In this work, we update our progress on the Earth Science Knowledge Graph (ESKG) project; an ESIP-funded testbed project which provides an automatic approach to building a dynamic knowledge graph for ES to improve interdisciplinary data discovery by leveraging implicit, latent existing knowledge present within across several U.S Federal Agencies e.g. NASA, NOAA and USGS. ESKG strengthens ties between observations and user communities by: 1) developing a knowledge graph derived from various sources e.g. Web pages, Web Services, etc. via natural language processing and knowledge extraction techniques; 2) allowing users to traverse, explore, query, reason and navigate ES data via knowledge graph interaction. ESKG has the potential to revolutionize the way in which ES communities interact with ES data in the open world through the entity, spatial and temporal linkages and characteristics that make it up. This project enables the advancement of ESIP collaboration areas including both Discovery and Semantic Technologies by putting graph information right at our fingertips in an interactive, modern manner and reducing the efforts to constructing ontology. To demonstrate the ESKG concept, we will demonstrate use of our framework across NASA JPL's PO.DAAC, NOAA's Earth Observation Requirements Evaluation System (EORES) and various USGS systems.
Network reconstruction via graph blending
NASA Astrophysics Data System (ADS)
Estrada, Rolando
2016-05-01
Graphs estimated from empirical data are often noisy and incomplete due to the difficulty of faithfully observing all the components (nodes and edges) of the true graph. This problem is particularly acute for large networks where the number of components may far exceed available surveillance capabilities. Errors in the observed graph can render subsequent analyses invalid, so it is vital to develop robust methods that can minimize these observational errors. Errors in the observed graph may include missing and spurious components, as well fused (multiple nodes are merged into one) and split (a single node is misinterpreted as many) nodes. Traditional graph reconstruction methods are only able to identify missing or spurious components (primarily edges, and to a lesser degree nodes), so we developed a novel graph blending framework that allows us to cast the full estimation problem as a simple edge addition/deletion problem. Armed with this framework, we systematically investigate the viability of various topological graph features, such as the degree distribution or the clustering coefficients, and existing graph reconstruction methods for tackling the full estimation problem. Our experimental results suggest that incorporating any topological feature as a source of information actually hinders reconstruction accuracy. We provide a theoretical analysis of this phenomenon and suggest several avenues for improving this estimation problem.
Integer Flows and Circuit Covers of Graphs and Signed Graphs
NASA Astrophysics Data System (ADS)
Cheng, Jian
The work in Chapter 2 is motivated by Tutte and Jaeger's pioneering work on converting modulo flows into integer-valued flows for ordinary graphs. For a signed graphs (G, sigma), we first prove that for each k ∈ {2, 3}, if (G, sigma) is (k - 1)-edge-connected and contains an even number of negative edges when k = 2, then every modulo k-flow of (G, sigma) can be converted into an integer-valued ( k + 1)-ow with a larger or the same support. We also prove that if (G, sigma) is odd-(2p+1)-edge-connected, then (G, sigma) admits a modulo circular (2 + 1/ p)-flows if and only if it admits an integer-valued circular (2 + 1/p)-flows, which improves all previous result by Xu and Zhang (DM2005), Schubert and Steffen (EJC2015), and Zhu (JCTB2015). Shortest circuit cover conjecture is one of the major open problems in graph theory. It states that every bridgeless graph G contains a set of circuits F such that each edge is contained in at least one member of F and the length of F is at most 7/5∥E(G)∥. This concept was recently generalized to signed graphs by Macajova et al. (JGT2015). In Chapter 3, we improve their upper bound from 11∥E( G)∥ to 14/3 ∥E(G)∥, and if G is 2-edgeconnected and has even negativeness, then it can be further reduced to 11/3 ∥E(G)∥. Tutte's 3-flow conjecture has been studied by many graph theorists in the last several decades. As a new approach to this conjecture, DeVos and Thomassen considered the vectors as ow values and found that there is a close relation between vector S1-flows and integer 3-NZFs. Motivated by their observation, in Chapter 4, we prove that if a graph G admits a vector S1-flow with rank at most two, then G admits an integer 3-NZF. The concept of even factors is highly related to the famous Four Color Theorem. We conclude this dissertation in Chapter 5 with an improvement of a recent result by Chen and Fan (JCTB2016) on the upperbound of even factors. We show that if a graph G contains an even factor, then it contains an even factor H with. ∥E(H)∥ ≥ 4/7 (∥ E(G)∥+1)+ 1/7 ∥V2 (G)∥, where V2( G) is the set of vertices of degree two.
Exotic equilibria of Harary graphs and a new minimum degree lower bound for synchronization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Canale, Eduardo A., E-mail: ecanale@pol.una.py; Monzón, Pablo, E-mail: monzon@fing.edu.uy
2015-02-15
This work is concerned with stability of equilibria in the homogeneous (equal frequencies) Kuramoto model of weakly coupled oscillators. In 2012 [R. Taylor, J. Phys. A: Math. Theor. 45, 1–15 (2012)], a sufficient condition for almost global synchronization was found in terms of the minimum degree–order ratio of the graph. In this work, a new lower bound for this ratio is given. The improvement is achieved by a concrete infinite sequence of regular graphs. Besides, non standard unstable equilibria of the graphs studied in Wiley et al. [Chaos 16, 015103 (2006)] are shown to exist as conjectured in that work.
2006-09-01
between the groups (p = .0003) and within the groups (p =. 01). The decrease in depression in the exercise group (89%) neared significance (p = .052...exercise program was effective in improving aerobic capacity, lower-body flexibility, fatigue, depression , anxiety, confusion, anger, and energy in the...Appendix F: 10-week LASA Fatigue Graph Appendix G: 10-week LASA Depression Graph Appendix H: 10-week LASA Anxiety Graph Appendix I: 10-week
Sacchet, Matthew D.; Prasad, Gautam; Foland-Ross, Lara C.; Thompson, Paul M.; Gotlib, Ian H.
2015-01-01
Recently, there has been considerable interest in understanding brain networks in major depressive disorder (MDD). Neural pathways can be tracked in the living brain using diffusion-weighted imaging (DWI); graph theory can then be used to study properties of the resulting fiber networks. To date, global abnormalities have not been reported in tractography-based graph metrics in MDD, so we used a machine learning approach based on “support vector machines” to differentiate depressed from healthy individuals based on multiple brain network properties. We also assessed how important specific graph metrics were for this differentiation. Finally, we conducted a local graph analysis to identify abnormal connectivity at specific nodes of the network. We were able to classify depression using whole-brain graph metrics. Small-worldness was the most useful graph metric for classification. The right pars orbitalis, right inferior parietal cortex, and left rostral anterior cingulate all showed abnormal network connectivity in MDD. This is the first use of structural global graph metrics to classify depressed individuals. These findings highlight the importance of future research to understand network properties in depression across imaging modalities, improve classification results, and relate network alterations to psychiatric symptoms, medication, and comorbidities. PMID:25762941
Sacchet, Matthew D; Prasad, Gautam; Foland-Ross, Lara C; Thompson, Paul M; Gotlib, Ian H
2015-01-01
Recently, there has been considerable interest in understanding brain networks in major depressive disorder (MDD). Neural pathways can be tracked in the living brain using diffusion-weighted imaging (DWI); graph theory can then be used to study properties of the resulting fiber networks. To date, global abnormalities have not been reported in tractography-based graph metrics in MDD, so we used a machine learning approach based on "support vector machines" to differentiate depressed from healthy individuals based on multiple brain network properties. We also assessed how important specific graph metrics were for this differentiation. Finally, we conducted a local graph analysis to identify abnormal connectivity at specific nodes of the network. We were able to classify depression using whole-brain graph metrics. Small-worldness was the most useful graph metric for classification. The right pars orbitalis, right inferior parietal cortex, and left rostral anterior cingulate all showed abnormal network connectivity in MDD. This is the first use of structural global graph metrics to classify depressed individuals. These findings highlight the importance of future research to understand network properties in depression across imaging modalities, improve classification results, and relate network alterations to psychiatric symptoms, medication, and comorbidities.
Metric learning with spectral graph convolutions on brain connectivity networks.
Ktena, Sofia Ira; Parisot, Sarah; Ferrante, Enzo; Rajchl, Martin; Lee, Matthew; Glocker, Ben; Rueckert, Daniel
2018-04-01
Graph representations are often used to model structured data at an individual or population level and have numerous applications in pattern recognition problems. In the field of neuroscience, where such representations are commonly used to model structural or functional connectivity between a set of brain regions, graphs have proven to be of great importance. This is mainly due to the capability of revealing patterns related to brain development and disease, which were previously unknown. Evaluating similarity between these brain connectivity networks in a manner that accounts for the graph structure and is tailored for a particular application is, however, non-trivial. Most existing methods fail to accommodate the graph structure, discarding information that could be beneficial for further classification or regression analyses based on these similarities. We propose to learn a graph similarity metric using a siamese graph convolutional neural network (s-GCN) in a supervised setting. The proposed framework takes into consideration the graph structure for the evaluation of similarity between a pair of graphs, by employing spectral graph convolutions that allow the generalisation of traditional convolutions to irregular graphs and operates in the graph spectral domain. We apply the proposed model on two datasets: the challenging ABIDE database, which comprises functional MRI data of 403 patients with autism spectrum disorder (ASD) and 468 healthy controls aggregated from multiple acquisition sites, and a set of 2500 subjects from UK Biobank. We demonstrate the performance of the method for the tasks of classification between matching and non-matching graphs, as well as individual subject classification and manifold learning, showing that it leads to significantly improved results compared to traditional methods. Copyright © 2017 Elsevier Inc. All rights reserved.
Kim, Z-Hun; Park, Hanwool; Hong, Seong-Joo; Lim, Sang-Min; Lee, Choul-Gyun
2016-05-01
Culturing microalgae in the ocean has potentials that may reduce the production cost and provide an option for an economic biofuel production from microalgae. The ocean holds great potentials for mass microalgal cultivation with its high specific heat, mixing energy from waves, and large cultivable area. Suitable photobioreactors (PBRs) that are capable of integrating marine energy into the culture systems need to be developed for the successful ocean cultivation. In this study, prototype floating PBRs were designed and constructed using transparent low-density polyethylene film for microalgal culture in the ocean. To improve the mixing efficiency, various types of internal partitions were introduced within PBRs. Three different types of internal partitions were evaluated for their effects on the mixing efficiency in terms of mass transfer (k(L)a) and mixing time in the PBRs. The partition type with the best mixing efficiency was selected, and the number of partitions was varied from one to three for investigation of its effect on mixing efficiency. When the number of partitions is increased, mass transfer increased in proportion to the number of partitions. However, mixing time was not directly related to the number of partitions. When a green microalga, Tetraselmis sp. was cultivated using PBRs with the selected partition under semi-continuous mode in the ocean, biomass and fatty acid productivities in the PBRs were increased by up to 50 % and 44% at high initial cell density, respectively, compared to non-partitioned ones. The results of internally partitioned PBRs demonstrated potentials for culturing microalgae by efficiently utilizing ocean wave energy into culture mixing in the ocean.
An Improved Heuristic Method for Subgraph Isomorphism Problem
NASA Astrophysics Data System (ADS)
Xiang, Yingzhuo; Han, Jiesi; Xu, Haijiang; Guo, Xin
2017-09-01
This paper focus on the subgraph isomorphism (SI) problem. We present an improved genetic algorithm, a heuristic method to search the optimal solution. The contribution of this paper is that we design a dedicated crossover algorithm and a new fitness function to measure the evolution process. Experiments show our improved genetic algorithm performs better than other heuristic methods. For a large graph, such as a subgraph of 40 nodes, our algorithm outperforms the traditional tree search algorithms. We find that the performance of our improved genetic algorithm does not decrease as the number of nodes in prototype graphs.
Overlapping community detection based on link graph using distance dynamics
NASA Astrophysics Data System (ADS)
Chen, Lei; Zhang, Jing; Cai, Li-Jun
2018-01-01
The distance dynamics model was recently proposed to detect the disjoint community of a complex network. To identify the overlapping structure of a network using the distance dynamics model, an overlapping community detection algorithm, called L-Attractor, is proposed in this paper. The process of L-Attractor mainly consists of three phases. In the first phase, L-Attractor transforms the original graph to a link graph (a new edge graph) to assure that one node has multiple distances. In the second phase, using the improved distance dynamics model, a dynamic interaction process is introduced to simulate the distance dynamics (shrink or stretch). Through the dynamic interaction process, all distances converge, and the disjoint community structure of the link graph naturally manifests itself. In the third phase, a recovery method is designed to convert the disjoint community structure of the link graph to the overlapping community structure of the original graph. Extensive experiments are conducted on the LFR benchmark networks as well as real-world networks. Based on the results, our algorithm demonstrates higher accuracy and quality than other state-of-the-art algorithms.
Task Parallel Incomplete Cholesky Factorization using 2D Partitioned-Block Layout
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, Kyungjoo; Rajamanickam, Sivasankaran; Stelle, George Widgery
We introduce a task-parallel algorithm for sparse incomplete Cholesky factorization that utilizes a 2D sparse partitioned-block layout of a matrix. Our factorization algorithm follows the idea of algorithms-by-blocks by using the block layout. The algorithm-byblocks approach induces a task graph for the factorization. These tasks are inter-related to each other through their data dependences in the factorization algorithm. To process the tasks on various manycore architectures in a portable manner, we also present a portable tasking API that incorporates different tasking backends and device-specific features using an open-source framework for manycore platforms i.e., Kokkos. A performance evaluation is presented onmore » both Intel Sandybridge and Xeon Phi platforms for matrices from the University of Florida sparse matrix collection to illustrate merits of the proposed task-based factorization. Experimental results demonstrate that our task-parallel implementation delivers about 26.6x speedup (geometric mean) over single-threaded incomplete Choleskyby- blocks and 19.2x speedup over serial Cholesky performance which does not carry tasking overhead using 56 threads on the Intel Xeon Phi processor for sparse matrices arising from various application problems.« less
Imran, Muhammad; Zafar, Nazir Ahmad
2012-01-01
Maintaining inter-actor connectivity is extremely crucial in mission-critical applications of Wireless Sensor and Actor Networks (WSANs), as actors have to quickly plan optimal coordinated responses to detected events. Failure of a critical actor partitions the inter-actor network into disjoint segments besides leaving a coverage hole, and thus hinders the network operation. This paper presents a Partitioning detection and Connectivity Restoration (PCR) algorithm to tolerate critical actor failure. As part of pre-failure planning, PCR determines critical/non-critical actors based on localized information and designates each critical node with an appropriate backup (preferably non-critical). The pre-designated backup detects the failure of its primary actor and initiates a post-failure recovery process that may involve coordinated multi-actor relocation. To prove the correctness, we construct a formal specification of PCR using Z notation. We model WSAN topology as a dynamic graph and transform PCR to corresponding formal specification using Z notation. Formal specification is analyzed and validated using the Z Eves tool. Moreover, we simulate the specification to quantitatively analyze the efficiency of PCR. Simulation results confirm the effectiveness of PCR and the results shown that it outperforms contemporary schemes found in the literature.
NASA Astrophysics Data System (ADS)
Marrero-Ponce, Yovani; Santiago, Oscar Martínez; López, Yoan Martínez; Barigye, Stephen J.; Torrens, Francisco
2012-11-01
In this report, we present a new mathematical approach for describing chemical structures of organic molecules at atomic-molecular level, proposing for the first time the use of the concept of the derivative ( partial ) of a molecular graph (MG) with respect to a given event ( E), to obtain a new family of molecular descriptors (MDs). With this purpose, a new matrix representation of the MG, which generalizes graph's theory's traditional incidence matrix, is introduced. This matrix, denominated the generalized incidence matrix, Q, arises from the Boolean representation of molecular sub- graphs that participate in the formation of the graph molecular skeleton MG and could be complete (representing all possible connected sub-graphs) or constitute sub-graphs of determined orders or types as well as a combination of these. The Q matrix is a non-quadratic and unsymmetrical in nature, its columns ( n) and rows ( m) are conditions (letters) and collection of conditions (words) with which the event occurs. This non-quadratic and unsymmetrical matrix is transformed, by algebraic manipulation, to a quadratic and symmetric matrix known as relations frequency matrix, F, which characterizes the participation intensity of the conditions (letters) in the events (words). With F, we calculate the derivative over a pair of atomic nuclei. The local index for the atomic nuclei i, Δ i , can therefore be obtained as a linear combination of all the pair derivatives of the atomic nuclei i with all the rest of the j's atomic nuclei. Here, we also define new strategies that generalize the present form of obtaining global or local (group or atom-type) invariants from atomic contributions (local vertex invariants, LOVIs). In respect to this, metric (norms), means and statistical invariants are introduced. These invariants are applied to a vector whose components are the values Δ i for the atomic nuclei of the molecule or its fragments. Moreover, with the purpose of differentiating among different atoms, an atomic weighting scheme (atom-type labels) is used in the formation of the matrix Q or in LOVIs state. The obtained indices were utilized to describe the partition coefficient (Log P) and the reactivity index (Log K) of the 34 derivatives of 2-furylethylenes. In all the cases, our MDs showed better statistical results than those previously obtained using some of the most used families of MDs in chemometric practice. Therefore, it has been demonstrated to that the proposed MDs are useful in molecular design and permit obtaining easier and robust mathematical models than the majority of those reported in the literature. All this range of mentioned possibilities open "the doors" to the creation of a new family of MDs, using the graph derivative, and avail a new tool for QSAR/QSPR and molecular diversity/similarity studies.
Marrero-Ponce, Yovani; Santiago, Oscar Martínez; López, Yoan Martínez; Barigye, Stephen J; Torrens, Francisco
2012-11-01
In this report, we present a new mathematical approach for describing chemical structures of organic molecules at atomic-molecular level, proposing for the first time the use of the concept of the derivative ([Formula: see text]) of a molecular graph (MG) with respect to a given event (E), to obtain a new family of molecular descriptors (MDs). With this purpose, a new matrix representation of the MG, which generalizes graph's theory's traditional incidence matrix, is introduced. This matrix, denominated the generalized incidence matrix, Q, arises from the Boolean representation of molecular sub-graphs that participate in the formation of the graph molecular skeleton MG and could be complete (representing all possible connected sub-graphs) or constitute sub-graphs of determined orders or types as well as a combination of these. The Q matrix is a non-quadratic and unsymmetrical in nature, its columns (n) and rows (m) are conditions (letters) and collection of conditions (words) with which the event occurs. This non-quadratic and unsymmetrical matrix is transformed, by algebraic manipulation, to a quadratic and symmetric matrix known as relations frequency matrix, F, which characterizes the participation intensity of the conditions (letters) in the events (words). With F, we calculate the derivative over a pair of atomic nuclei. The local index for the atomic nuclei i, Δ(i), can therefore be obtained as a linear combination of all the pair derivatives of the atomic nuclei i with all the rest of the j's atomic nuclei. Here, we also define new strategies that generalize the present form of obtaining global or local (group or atom-type) invariants from atomic contributions (local vertex invariants, LOVIs). In respect to this, metric (norms), means and statistical invariants are introduced. These invariants are applied to a vector whose components are the values Δ(i) for the atomic nuclei of the molecule or its fragments. Moreover, with the purpose of differentiating among different atoms, an atomic weighting scheme (atom-type labels) is used in the formation of the matrix Q or in LOVIs state. The obtained indices were utilized to describe the partition coefficient (Log P) and the reactivity index (Log K) of the 34 derivatives of 2-furylethylenes. In all the cases, our MDs showed better statistical results than those previously obtained using some of the most used families of MDs in chemometric practice. Therefore, it has been demonstrated to that the proposed MDs are useful in molecular design and permit obtaining easier and robust mathematical models than the majority of those reported in the literature. All this range of mentioned possibilities open "the doors" to the creation of a new family of MDs, using the graph derivative, and avail a new tool for QSAR/QSPR and molecular diversity/similarity studies.
An improved approach of register allocation via graph coloring
NASA Astrophysics Data System (ADS)
Gao, Lei; Shi, Ce
2005-03-01
Register allocation is an important part of optimizing compiler. The algorithm of register allocation via graph coloring is implemented by Chaitin and his colleagues firstly and improved by Briggs and others. By abstracting register allocation to graph coloring, the allocation process is simplified. As the physical register number is limited, coloring of the interference graph can"t succeed for every node. The uncolored nodes must be spilled. There is an assumption that almost all the allocation method obeys: when a register is allocated to a variable v, it can"t be used by others before v quit even if v is not used for a long time. This may causes a waste of register resource. The authors relax this restriction under certain conditions and make some improvement. In this method, one register can be mapped to two or more interfered "living" live ranges at the same time if they satisfy some requirements. An operation named merge is defined which can arrange two interfered nodes occupy the same register with some cost. Thus, the resource of register can be used more effectively and the cost of memory access can be reduced greatly.
Graph Based Models for Unsupervised High Dimensional Data Clustering and Network Analysis
2015-01-01
ApprovedOMB No. 0704-0188 Public reporting burden for the collection of information is estimated to average 1 hour per response, including the time for...algorithms we proposed improve the time e ciency signi cantly for large scale datasets. In the last chapter, we also propose an incremental reseeding...plume detection in hyper-spectral video data. These graph based clustering algorithms we proposed improve the time efficiency significantly for large
A Note on the Kirchhoff and Additive Degree-Kirchhoff Indices of Graphs
NASA Astrophysics Data System (ADS)
Yang, Yujun; Klein, Douglas J.
2015-06-01
Two resistance-distance-based graph invariants, namely, the Kirchhoff index and the additive degree-Kirchhoff index, are studied. A relation between them is established, with inequalities for the additive degree-Kirchhoff index arising via the Kirchhoff index along with minimum, maximum, and average degrees. Bounds for the Kirchhoff and additive degree-Kirchhoff indices are also determined, and extremal graphs are characterised. In addition, an upper bound for the additive degree-Kirchhoff index is established to improve a previously known result.
Thinking graphically: Connecting vision and cognition during graph comprehension.
Ratwani, Raj M; Trafton, J Gregory; Boehm-Davis, Deborah A
2008-03-01
Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive integration. During visual integration, pattern recognition processes are used to form visual clusters of information; these visual clusters are then used to reason about the graph during cognitive integration. In 3 experiments, the processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data. Results supported the task analytic theories for specific information extraction and the processes of visual and cognitive integration for integrative questions. Further, the integrative processes scaled up as graph complexity increased, highlighting the importance of these processes for integration in more complex graphs. Finally, based on this framework, design principles to improve both visual and cognitive integration are described. PsycINFO Database Record (c) 2008 APA, all rights reserved
Unimodular lattice triangulations as small-world and scale-free random graphs
NASA Astrophysics Data System (ADS)
Krüger, B.; Schmidt, E. M.; Mecke, K.
2015-02-01
Real-world networks, e.g., the social relations or world-wide-web graphs, exhibit both small-world and scale-free behaviour. We interpret lattice triangulations as planar graphs by identifying triangulation vertices with graph nodes and one-dimensional simplices with edges. Since these triangulations are ergodic with respect to a certain Pachner flip, applying different Monte Carlo simulations enables us to calculate average properties of random triangulations, as well as canonical ensemble averages, using an energy functional that is approximately the variance of the degree distribution. All considered triangulations have clustering coefficients comparable with real-world graphs; for the canonical ensemble there are inverse temperatures with small shortest path length independent of system size. Tuning the inverse temperature to a quasi-critical value leads to an indication of scale-free behaviour for degrees k≥slant 5. Using triangulations as a random graph model can improve the understanding of real-world networks, especially if the actual distance of the embedded nodes becomes important.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Jing; Li, Yuan-Yuan; Shanghai Center for Bioinformation Technology, Shanghai 200235
2012-03-02
Highlights: Black-Right-Pointing-Pointer Proper dataset partition can improve the prediction of deleterious nsSNPs. Black-Right-Pointing-Pointer Partition according to original residue type at nsSNP is a good criterion. Black-Right-Pointing-Pointer Similar strategy is supposed promising in other machine learning problems. -- Abstract: Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either original or substituted amino acid type at the nsSNPmore » site. Using support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9% depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, the dataset was also randomly divided into 20 subsets, but the corresponding accuracy was only 73.2%. Our results demonstrated that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, will improve the performance of the trained classifiers significantly, which should be valuable in developing better tools for predicting the disease-association of nsSNPs.« less
Inferring ontology graph structures using OWL reasoning.
Rodríguez-García, Miguel Ángel; Hoehndorf, Robert
2018-01-05
Ontologies are representations of a conceptualization of a domain. Traditionally, ontologies in biology were represented as directed acyclic graphs (DAG) which represent the backbone taxonomy and additional relations between classes. These graphs are widely exploited for data analysis in the form of ontology enrichment or computation of semantic similarity. More recently, ontologies are developed in a formal language such as the Web Ontology Language (OWL) and consist of a set of axioms through which classes are defined or constrained. While the taxonomy of an ontology can be inferred directly from the axioms of an ontology as one of the standard OWL reasoning tasks, creating general graph structures from OWL ontologies that exploit the ontologies' semantic content remains a challenge. We developed a method to transform ontologies into graphs using an automated reasoner while taking into account all relations between classes. Searching for (existential) patterns in the deductive closure of ontologies, we can identify relations between classes that are implied but not asserted and generate graph structures that encode for a large part of the ontologies' semantic content. We demonstrate the advantages of our method by applying it to inference of protein-protein interactions through semantic similarity over the Gene Ontology and demonstrate that performance is increased when graph structures are inferred using deductive inference according to our method. Our software and experiment results are available at http://github.com/bio-ontology-research-group/Onto2Graph . Onto2Graph is a method to generate graph structures from OWL ontologies using automated reasoning. The resulting graphs can be used for improved ontology visualization and ontology-based data analysis.
A novel sub-shot segmentation method for user-generated video
NASA Astrophysics Data System (ADS)
Lei, Zhuo; Zhang, Qian; Zheng, Chi; Qiu, Guoping
2018-04-01
With the proliferation of the user-generated videos, temporal segmentation is becoming a challengeable problem. Traditional video temporal segmentation methods like shot detection are not able to work on unedited user-generated videos, since they often only contain one single long shot. We propose a novel temporal segmentation framework for user-generated video. It finds similar frames with a tree partitioning min-Hash technique, constructs sparse temporal constrained affinity sub-graphs, and finally divides the video into sub-shot-level segments with a dense-neighbor-based clustering method. Experimental results show that our approach outperforms all the other related works. Furthermore, it is indicated that the proposed approach is able to segment user-generated videos at an average human level.
GOGrapher: A Python library for GO graph representation and analysis.
Muller, Brian; Richards, Adam J; Jin, Bo; Lu, Xinghua
2009-07-07
The Gene Ontology is the most commonly used controlled vocabulary for annotating proteins. The concepts in the ontology are organized as a directed acyclic graph, in which a node corresponds to a biological concept and a directed edge denotes the parent-child semantic relationship between a pair of terms. A large number of protein annotations further create links between proteins and their functional annotations, reflecting the contemporary knowledge about proteins and their functional relationships. This leads to a complex graph consisting of interleaved biological concepts and their associated proteins. What is needed is a simple, open source library that provides tools to not only create and view the Gene Ontology graph, but to analyze and manipulate it as well. Here we describe the development and use of GOGrapher, a Python library that can be used for the creation, analysis, manipulation, and visualization of Gene Ontology related graphs. An object-oriented approach was adopted to organize the hierarchy of the graphs types and associated classes. An Application Programming Interface is provided through which different types of graphs can be pragmatically created, manipulated, and visualized. GOGrapher has been successfully utilized in multiple research projects, e.g., a graph-based multi-label text classifier for protein annotation. The GOGrapher project provides a reusable programming library designed for the manipulation and analysis of Gene Ontology graphs. The library is freely available for the scientific community to use and improve.
Graphs in kinematics—a need for adherence to principles of algebraic functions
NASA Astrophysics Data System (ADS)
Sokolowski, Andrzej
2017-11-01
Graphs in physics are central to the analysis of phenomena and to learning about a system’s behavior. The ways students handle graphs are frequently researched. Students’ misconceptions are highlighted, and methods of improvement suggested. While kinematics graphs are to represent a real motion, they are also algebraic entities that must satisfy conditions for being algebraic functions. To be algebraic functions, they must pass certain tests before they can be used to infer more about motion. A preliminary survey of some physics resources has revealed that little attention is paid to verifying if the position, velocity and acceleration versus time graphs, that are to depict real motion, satisfy the most critical condition for being an algebraic function; the vertical line test. The lack of attention to this adherence shows as vertical segments in piecewise graphs. Such graphs generate unrealistic interpretations and may confuse students. A group of 25 college physics students was provided with such a graph and asked to analyse its adherence to reality. The majority of the students (N = 16, 64%) questioned the graph’s validity. It is inferred that such graphs might not only jeopardize the function principles studied in mathematics but also undermine the purpose of studying these principles. The aim of this study was to bring this idea forth and suggest a better alignment of physics and mathematics methods.
Expanding our understanding of students' use of graphs for learning physics
NASA Astrophysics Data System (ADS)
Laverty, James T.
It is generally agreed that the ability to visualize functional dependencies or physical relationships as graphs is an important step in modeling and learning. However, several studies in Physics Education Research (PER) have shown that many students in fact do not master this form of representation and even have misconceptions about the meaning of graphs that impede learning physics concepts. Working with graphs in classroom settings has been shown to improve student abilities with graphs, particularly when the students can interact with them. We introduce a novel problem type in an online homework system, which requires students to construct the graphs themselves in free form, and requires no hand-grading by instructors. A study of pre/post-test data using the Test of Understanding Graphs in Kinematics (TUG-K) over several semesters indicates that students learn significantly more from these graph construction problems than from the usual graph interpretation problems, and that graph interpretation alone may not have any significant effect. The interpretation of graphs, as well as the representation translation between textual, mathematical, and graphical representations of physics scenarios, are frequently listed among the higher order thinking skills we wish to convey in an undergraduate course. But to what degree do we succeed? Do students indeed employ higher order thinking skills when working through graphing exercises? We investigate students working through a variety of graph problems, and, using a think-aloud protocol, aim to reconstruct the cognitive processes that the students go through. We find that to a certain degree, these problems become commoditized and do not trigger the desired higher order thinking processes; simply translating ``textbook-like'' problems into the graphical realm will not achieve any additional educational goals. Whether the students have to interpret or construct a graph makes very little difference in the methods used by the students. We will also look at the results of using graph problems in an online learning environment. We will show evidence that construction problems lead to a higher degree of difficulty and degree of discrimination than other graph problems and discuss the influence the course has on these variables.
Reactome graph database: Efficient access to complex pathway data
Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter
2018-01-01
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902
DOE Office of Scientific and Technical Information (OSTI.GOV)
John Homer; Ashok Varikuti; Xinming Ou
Various tools exist to analyze enterprise network systems and to produce attack graphs detailing how attackers might penetrate into the system. These attack graphs, however, are often complex and difficult to comprehend fully, and a human user may find it problematic to reach appropriate configuration decisions. This paper presents methodologies that can 1) automatically identify portions of an attack graph that do not help a user to understand the core security problems and so can be trimmed, and 2) automatically group similar attack steps as virtual nodes in a model of the network topology, to immediately increase the understandability ofmore » the data. We believe both methods are important steps toward improving visualization of attack graphs to make them more useful in configuration management for large enterprise networks. We implemented our methods using one of the existing attack-graph toolkits. Initial experimentation shows that the proposed approaches can 1) significantly reduce the complexity of attack graphs by trimming a large portion of the graph that is not needed for a user to understand the security problem, and 2) significantly increase the accessibility and understandability of the data presented in the attack graph by clearly showing, within a generated visualization of the network topology, the number and type of potential attacks to which each host is exposed.« less
Reactome graph database: Efficient access to complex pathway data.
Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning
2018-01-01
Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.
Video Measurements: Quantity or Quality
ERIC Educational Resources Information Center
Zajkov, Oliver; Mitrevski, Boce
2012-01-01
Students have problems with understanding, using and interpreting graphs. In order to improve the students' skills for working with graphs, we propose Manual Video Measurement (MVM). In this paper, the MVM method is explained and its accuracy is tested. The comparison with the standardized video data software shows that its accuracy is comparable…
A Data Book of Child and Adolescent Injury.
ERIC Educational Resources Information Center
National Center for Education in Maternal and Child Health, Washington, DC.
This booklet contains 54 graphs and accompanying narrative which summarize available data on child and adolescent non-natural injuries and deaths and are intended to help in the multi-disciplinary and multi-agency "Healthy People 2000" campaign to improve the nation's health and prevent needless child and adolescent injuries. Graphs illustrate…
Application of Computer Graphics to Graphing in Algebra and Trigonometry. Final Report.
ERIC Educational Resources Information Center
Morris, J. Richard
This project was designed to improve the graphing competency of students in elementary algebra, intermediate algebra, and trigonometry courses at Virginia Commonwealth University. Computer graphics programs were designed using an Apple II Plus computer and implemented using Pascal. The software package is interactive and gives students control…
2013-01-01
5, pp. 75–174, 2010. [2] J. Leskovec, K. J. Lang, A. Dasgupta, and M. W. Mahoney , “Statistical properties of community structure in large social and...2011. [14] R. R. Nadakuditi and M. Newman , “Graph spectra and the detectability of community structure in networks,” Phys. Rev. Lett., vol. 108, no
Pogliani, Lionello
2010-01-30
Twelve properties of a highly heterogeneous class of organic solvents have been modeled with a graph-theoretical molecular connectivity modified (MC) method, which allows to encode the core electrons and the hydrogen atoms. The graph-theoretical method uses the concepts of simple, general, and complete graphs, where these last types of graphs are used to encode the core electrons. The hydrogen atoms have been encoded by the aid of a graph-theoretical perturbation parameter, which contributes to the definition of the valence delta, delta(v), a key parameter in molecular connectivity studies. The model of the twelve properties done with a stepwise search algorithm is always satisfactory, and it allows to check the influence of the hydrogen content of the solvent molecules on the choice of the type of descriptor. A similar argument holds for the influence of the halogen atoms on the type of core electron representation. In some cases the molar mass, and in a minor way, special "ad hoc" parameters have been used to improve the model. A very good model of the surface tension could be obtained by the aid of five experimental parameters. A mixed model method based on experimental parameters plus molecular connectivity indices achieved, instead, to consistently improve the model quality of five properties. To underline is the importance of the boiling point temperatures as descriptors in these last two model methodologies. Copyright 2009 Wiley Periodicals, Inc.
An Enriched Unified Medical Language System Semantic Network with a Multiple Subsumption Hierarchy
Zhang, Li; Perl, Yehoshua; Halper, Michael; Geller, James; Cimino, James J.
2004-01-01
Objective: The Unified Medical Language System's (UMLS's) Semantic Network's (SN's) two-tree structure is restrictive because it does not allow a semantic type to be a specialization of several other semantic types. In this article, the SN is expanded into a multiple subsumption structure with a directed acyclic graph (DAG) IS-A hierarchy, allowing a semantic type to have multiple parents. New viable IS-A links are added as warranted. Design: Two methodologies are presented to identify and add new viable IS-A links. The first methodology is based on imposing the characteristic of connectivity on a previously presented partition of the SN. Four transformations are provided to find viable IS-A links in the process of converting the partition's disconnected groups into connected ones. The second methodology identifies new IS-A links through a string matching process involving names and definitions of various semantic types in the SN. A domain expert is needed to review all the results to determine the validity of the new IS-A links. Results: Nineteen new IS-A links are added to the SN, and four new semantic types are also created to support the multiple subsumption framework. The resulting network, called the Enriched Semantic Network (ESN), exhibits a DAG-structured hierarchy. A partition of the ESN containing 19 connected groups is also derived. Conclusion: The ESN is an expanded abstraction of the UMLS compared with the original SN. Its multiple subsumption hierarchy can accommodate semantic types with multiple parents. Its representation thus provides direct access to a broader range of subsumption knowledge. PMID:14764611
Zheng, Qiang; Warner, Steven; Tasian, Gregory; Fan, Yong
2018-02-12
Automatic segmentation of kidneys in ultrasound (US) images remains a challenging task because of high speckle noise, low contrast, and large appearance variations of kidneys in US images. Because texture features may improve the US image segmentation performance, we propose a novel graph cuts method to segment kidney in US images by integrating image intensity information and texture feature maps. We develop a new graph cuts-based method to segment kidney US images by integrating original image intensity information and texture feature maps extracted using Gabor filters. To handle large appearance variation within kidney images and improve computational efficiency, we build a graph of image pixels close to kidney boundary instead of building a graph of the whole image. To make the kidney segmentation robust to weak boundaries, we adopt localized regional information to measure similarity between image pixels for computing edge weights to build the graph of image pixels. The localized graph is dynamically updated and the graph cuts-based segmentation iteratively progresses until convergence. Our method has been evaluated based on kidney US images of 85 subjects. The imaging data of 20 randomly selected subjects were used as training data to tune parameters of the image segmentation method, and the remaining data were used as testing data for validation. Experiment results demonstrated that the proposed method obtained promising segmentation results for bilateral kidneys (average Dice index = 0.9446, average mean distance = 2.2551, average specificity = 0.9971, average accuracy = 0.9919), better than other methods under comparison (P < .05, paired Wilcoxon rank sum tests). The proposed method achieved promising performance for segmenting kidneys in two-dimensional US images, better than segmentation methods built on any single channel of image information. This method will facilitate extraction of kidney characteristics that may predict important clinical outcomes such as progression of chronic kidney disease. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
A Graph Based Backtracking Algorithm for Solving General CSPs
NASA Technical Reports Server (NTRS)
Pang, Wanlin; Goodwin, Scott D.
2003-01-01
Many AI tasks can be formalized as constraint satisfaction problems (CSPs), which involve finding values for variables subject to constraints. While solving a CSP is an NP-complete task in general, tractable classes of CSPs have been identified based on the structure of the underlying constraint graphs. Much effort has been spent on exploiting structural properties of the constraint graph to improve the efficiency of finding a solution. These efforts contributed to development of a class of CSP solving algorithms called decomposition algorithms. The strength of CSP decomposition is that its worst-case complexity depends on the structural properties of the constraint graph and is usually better than the worst-case complexity of search methods. Its practical application is limited, however, since it cannot be applied if the CSP is not decomposable. In this paper, we propose a graph based backtracking algorithm called omega-CDBT, which shares merits and overcomes the weaknesses of both decomposition and search approaches.
Projected power iteration for network alignment
NASA Astrophysics Data System (ADS)
Onaran, Efe; Villar, Soledad
2017-08-01
The network alignment problem asks for the best correspondence between two given graphs, so that the largest possible number of edges are matched. This problem appears in many scientific problems (like the study of protein-protein interactions) and it is very closely related to the quadratic assignment problem which has graph isomorphism, traveling salesman and minimum bisection problems as particular cases. The graph matching problem is NP-hard in general. However, under some restrictive models for the graphs, algorithms can approximate the alignment efficiently. In that spirit the recent work by Feizi and collaborators introduce EigenAlign, a fast spectral method with convergence guarantees for Erd-s-Renyí graphs. In this work we propose the algorithm Projected Power Alignment, which is a projected power iteration version of EigenAlign. We numerically show it improves the recovery rates of EigenAlign and we describe the theory that may be used to provide performance guarantees for Projected Power Alignment.
A simple method for finding the scattering coefficients of quantum graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cottrell, Seth S.
2015-09-15
Quantum walks are roughly analogous to classical random walks, and similar to classical walks they have been used to find new (quantum) algorithms. When studying the behavior of large graphs or combinations of graphs, it is useful to find the response of a subgraph to signals of different frequencies. In doing so, we can replace an entire subgraph with a single vertex with variable scattering coefficients. In this paper, a simple technique for quickly finding the scattering coefficients of any discrete-time quantum graph will be presented. These scattering coefficients can be expressed entirely in terms of the characteristic polynomial ofmore » the graph’s time step operator. This is a marked improvement over previous techniques which have traditionally required finding eigenstates for a given eigenvalue, which is far more computationally costly. With the scattering coefficients we can easily derive the “impulse response” which is the key to predicting the response of a graph to any signal. This gives us a powerful set of tools for rapidly understanding the behavior of graphs or for reducing a large graph into its constituent subgraphs regardless of how they are connected.« less
Novo, Leonardo; Chakraborty, Shantanav; Mohseni, Masoud; Neven, Hartmut; Omar, Yasser
2015-01-01
Continuous time quantum walks provide an important framework for designing new algorithms and modelling quantum transport and state transfer problems. Often, the graph representing the structure of a problem contains certain symmetries that confine the dynamics to a smaller subspace of the full Hilbert space. In this work, we use invariant subspace methods, that can be computed systematically using the Lanczos algorithm, to obtain the reduced set of states that encompass the dynamics of the problem at hand without the specific knowledge of underlying symmetries. First, we apply this method to obtain new instances of graphs where the spatial quantum search algorithm is optimal: complete graphs with broken links and complete bipartite graphs, in particular, the star graph. These examples show that regularity and high-connectivity are not needed to achieve optimal spatial search. We also show that this method considerably simplifies the calculation of quantum transport efficiencies. Furthermore, we observe improved efficiencies by removing a few links from highly symmetric graphs. Finally, we show that this reduction method also allows us to obtain an upper bound for the fidelity of a single qubit transfer on an XY spin network. PMID:26330082
Ali, Nadia; Peebles, David
2013-02-01
We report three experiments investigating the ability of undergraduate college students to comprehend 2 x 2 "interaction" graphs from two-way factorial research designs. Factorial research designs are an invaluable research tool widely used in all branches of the natural and social sciences, and the teaching of such designs lies at the core of many college curricula. Such data can be represented in bar or line graph form. Previous studies have shown, however, that people interpret these two graphical forms differently. In Experiment 1, participants were required to interpret interaction data in either bar or line graphs while thinking aloud. Verbal protocol analysis revealed that line graph users were significantly more likely to misinterpret the data or fail to interpret the graph altogether. The patterns of errors line graph users made were interpreted as arising from the operation of Gestalt principles of perceptual organization, and this interpretation was used to develop two modified versions of the line graph, which were then tested in two further experiments. One of the modifications resulted in a significant improvement in performance. Results of the three experiments support the proposed explanation and demonstrate the effects (both positive and negative) of Gestalt principles of perceptual organization on graph comprehension. We propose that our new design provides a more balanced representation of the data than the standard line graph for nonexpert users to comprehend the full range of relationships in two-way factorial research designs and may therefore be considered a more appropriate representation for use in educational and other nonexpert contexts.
Men's interpretations of graphical information in a videotape decision aid 1
Pylar, Jan; Wills, Celia E.; Lillie, Janet; Rovner, David R.; Kelly‐Blake, Karen; Holmes‐Rovner, Margaret
2007-01-01
Abstract Objective To examine men's interpretations of graphical information types viewed in a high‐quality, previously tested videotape decision aid (DA). Setting, participants, design A community‐dwelling sample of men >50 years of age (N = 188) balanced by education (college/non‐college) and race (Black/White) were interviewed just following their viewing of a videotape DA. A descriptive study design was used to examine men's interpretations of a representative sample of the types of graphs that were shown in the benign prostatic hyperplasia videotape DA. Main variables studied Men provided their interpretation of graphs information presented in three formats that varied in complexity: pictograph, line and horizontal bar graph. Audiotape transcripts of men's responses were coded for meaning and content‐related interpretation statements. Results Men provided both meaning and content‐focused interpretations of the graphs. Accuracy of interpretation was lower than hypothesized on the basis of literature review (85.4% for pictograph, 65.7% for line graph, 47.8% for horizontal bar graph). Accuracy for pictograph and line graphs was associated with education level, = 3.94, P = 0.047, and = 7.55, P = 0.006, respectively. Accuracy was uncorrelated with men's reported liking of the graphs, = 2.00, P = 0.441. Conclusion While men generally liked the DA, accuracy of graphs interpretation was associated with format complexity and education level. Graphs are often recommended to improve comprehension of information in DAs. However, additional evaluation is needed in experimental and naturalistic observational settings to develop best practice standards for data representation. PMID:17524011
Multiple directed graph large-class multi-spectral processor
NASA Technical Reports Server (NTRS)
Casasent, David; Liu, Shiaw-Dong; Yoneyama, Hideyuki
1988-01-01
Numerical analysis techniques for the interpretation of high-resolution imaging-spectrometer data are described and demonstrated. The method proposed involves the use of (1) a hierarchical classifier with a tree structure generated automatically by a Fisher linear-discriminant-function algorithm and (2) a novel multiple-directed-graph scheme which reduces the local maxima and the number of perturbations required. Results for a 500-class test problem involving simulated imaging-spectrometer data are presented in tables and graphs; 100-percent-correct classification is achieved with an improvement factor of 5.
Weighted graph based ordering techniques for preconditioned conjugate gradient methods
NASA Technical Reports Server (NTRS)
Clift, Simon S.; Tang, Wei-Pai
1994-01-01
We describe the basis of a matrix ordering heuristic for improving the incomplete factorization used in preconditioned conjugate gradient techniques applied to anisotropic PDE's. Several new matrix ordering techniques, derived from well-known algorithms in combinatorial graph theory, which attempt to implement this heuristic, are described. These ordering techniques are tested against a number of matrices arising from linear anisotropic PDE's, and compared with other matrix ordering techniques. A variation of RCM is shown to generally improve the quality of incomplete factorization preconditioners.
Lifted worm algorithm for the Ising model
NASA Astrophysics Data System (ADS)
Elçi, Eren Metin; Grimm, Jens; Ding, Lijie; Nasrawi, Abrahim; Garoni, Timothy M.; Deng, Youjin
2018-04-01
We design an irreversible worm algorithm for the zero-field ferromagnetic Ising model by using the lifting technique. We study the dynamic critical behavior of an energylike observable on both the complete graph and toroidal grids, and compare our findings with reversible algorithms such as the Prokof'ev-Svistunov worm algorithm. Our results show that the lifted worm algorithm improves the dynamic exponent of the energylike observable on the complete graph and leads to a significant constant improvement on toroidal grids.
Multiple sclerosis lesion segmentation using an automatic multimodal graph cuts.
García-Lorenzo, Daniel; Lecoeur, Jeremy; Arnold, Douglas L; Collins, D Louis; Barillot, Christian
2009-01-01
Graph Cuts have been shown as a powerful interactive segmentation technique in several medical domains. We propose to automate the Graph Cuts in order to automatically segment Multiple Sclerosis (MS) lesions in MRI. We replace the manual interaction with a robust EM-based approach in order to discriminate between MS lesions and the Normal Appearing Brain Tissues (NABT). Evaluation is performed in synthetic and real images showing good agreement between the automatic segmentation and the target segmentation. We compare our algorithm with the state of the art techniques and with several manual segmentations. An advantage of our algorithm over previously published ones is the possibility to semi-automatically improve the segmentation due to the Graph Cuts interactive feature.
[A graph cuts-based interactive method for segmentation of magnetic resonance images of meningioma].
Li, Shuan-qiang; Feng, Qian-jin; Chen, Wu-fan; Lin, Ya-zhong
2011-06-01
For accurate segmentation of the magnetic resonance (MR) images of meningioma, we propose a novel interactive segmentation method based on graph cuts. The high dimensional image features was extracted, and for each pixel, the probabilities of its origin, either the tumor or the background regions, were estimated by exploiting the weighted K-nearest neighborhood classifier. Based on these probabilities, a new energy function was proposed. Finally, a graph cut optimal framework was used for the solution of the energy function. The proposed method was evaluated by application in the segmentation of MR images of meningioma, and the results showed that the method significantly improved the segmentation accuracy compared with the gray level information-based graph cut method.
GOGrapher: A Python library for GO graph representation and analysis
Muller, Brian; Richards, Adam J; Jin, Bo; Lu, Xinghua
2009-01-01
Background The Gene Ontology is the most commonly used controlled vocabulary for annotating proteins. The concepts in the ontology are organized as a directed acyclic graph, in which a node corresponds to a biological concept and a directed edge denotes the parent-child semantic relationship between a pair of terms. A large number of protein annotations further create links between proteins and their functional annotations, reflecting the contemporary knowledge about proteins and their functional relationships. This leads to a complex graph consisting of interleaved biological concepts and their associated proteins. What is needed is a simple, open source library that provides tools to not only create and view the Gene Ontology graph, but to analyze and manipulate it as well. Here we describe the development and use of GOGrapher, a Python library that can be used for the creation, analysis, manipulation, and visualization of Gene Ontology related graphs. Findings An object-oriented approach was adopted to organize the hierarchy of the graphs types and associated classes. An Application Programming Interface is provided through which different types of graphs can be pragmatically created, manipulated, and visualized. GOGrapher has been successfully utilized in multiple research projects, e.g., a graph-based multi-label text classifier for protein annotation. Conclusion The GOGrapher project provides a reusable programming library designed for the manipulation and analysis of Gene Ontology graphs. The library is freely available for the scientific community to use and improve. PMID:19583843
Automatic extraction of numeric strings in unconstrained handwritten document images
NASA Astrophysics Data System (ADS)
Haji, M. Mehdi; Bui, Tien D.; Suen, Ching Y.
2011-01-01
Numeric strings such as identification numbers carry vital pieces of information in documents. In this paper, we present a novel algorithm for automatic extraction of numeric strings in unconstrained handwritten document images. The algorithm has two main phases: pruning and verification. In the pruning phase, the algorithm first performs a new segment-merge procedure on each text line, and then using a new regularity measure, it prunes all sequences of characters that are unlikely to be numeric strings. The segment-merge procedure is composed of two modules: a new explicit character segmentation algorithm which is based on analysis of skeletal graphs and a merging algorithm which is based on graph partitioning. All the candidate sequences that pass the pruning phase are sent to a recognition-based verification phase for the final decision. The recognition is based on a coarse-to-fine approach using probabilistic RBF networks. We developed our algorithm for the processing of real-world documents where letters and digits may be connected or broken in a document. The effectiveness of the proposed approach is shown by extensive experiments done on a real-world database of 607 documents which contains handwritten, machine-printed and mixed documents with different types of layouts and levels of noise.
Finding Statistically Significant Communities in Networks
Lancichinetti, Andrea; Radicchi, Filippo; Ramasco, José J.; Fortunato, Santo
2011-01-01
Community structure is one of the main structural features of networks, revealing both their internal organization and the similarity of their elementary units. Despite the large variety of methods proposed to detect communities in graphs, there is a big need for multi-purpose techniques, able to handle different types of datasets and the subtleties of community structure. In this paper we present OSLOM (Order Statistics Local Optimization Method), the first method capable to detect clusters in networks accounting for edge directions, edge weights, overlapping communities, hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing the statistical significance of clusters with respect to random fluctuations, which is estimated with tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure of partitions/covers delivered by other techniques. We have also implemented sequential algorithms combining OSLOM with other fast techniques, so that the community structure of very large networks can be uncovered. Our method has a comparable performance as the best existing algorithms on artificial benchmark graphs. Several applications on real networks are shown as well. OSLOM is implemented in a freely available software (http://www.oslom.org), and we believe it will be a valuable tool in the analysis of networks. PMID:21559480
BiCluE - Exact and heuristic algorithms for weighted bi-cluster editing of biomedical data
2013-01-01
Background The explosion of biological data has dramatically reformed today's biology research. The biggest challenge to biologists and bioinformaticians is the integration and analysis of large quantity of data to provide meaningful insights. One major problem is the combined analysis of data from different types. Bi-cluster editing, as a special case of clustering, which partitions two different types of data simultaneously, might be used for several biomedical scenarios. However, the underlying algorithmic problem is NP-hard. Results Here we contribute with BiCluE, a software package designed to solve the weighted bi-cluster editing problem. It implements (1) an exact algorithm based on fixed-parameter tractability and (2) a polynomial-time greedy heuristics based on solving the hardest part, edge deletions, first. We evaluated its performance on artificial graphs. Afterwards we exemplarily applied our implementation on real world biomedical data, GWAS data in this case. BiCluE generally works on any kind of data types that can be modeled as (weighted or unweighted) bipartite graphs. Conclusions To our knowledge, this is the first software package solving the weighted bi-cluster editing problem. BiCluE as well as the supplementary results are available online at http://biclue.mpi-inf.mpg.de. PMID:24565035
Detection of protein complex from protein-protein interaction network using Markov clustering
NASA Astrophysics Data System (ADS)
Ochieng, P. J.; Kusuma, W. A.; Haryanto, T.
2017-05-01
Detection of complexes, or groups of functionally related proteins, is an important challenge while analysing biological networks. However, existing algorithms to identify protein complexes are insufficient when applied to dense networks of experimentally derived interaction data. Therefore, we introduced a graph clustering method based on Markov clustering algorithm to identify protein complex within highly interconnected protein-protein interaction networks. Protein-protein interaction network was first constructed to develop geometrical network, the network was then partitioned using Markov clustering to detect protein complexes. The interest of the proposed method was illustrated by its application to Human Proteins associated to type II diabetes mellitus. Flow simulation of MCL algorithm was initially performed and topological properties of the resultant network were analysed for detection of the protein complex. The results indicated the proposed method successfully detect an overall of 34 complexes with 11 complexes consisting of overlapping modules and 20 non-overlapping modules. The major complex consisted of 102 proteins and 521 interactions with cluster modularity and density of 0.745 and 0.101 respectively. The comparison analysis revealed MCL out perform AP, MCODE and SCPS algorithms with high clustering coefficient (0.751) network density and modularity index (0.630). This demonstrated MCL was the most reliable and efficient graph clustering algorithm for detection of protein complexes from PPI networks.
Geographic Gossip: Efficient Averaging for Sensor Networks
NASA Astrophysics Data System (ADS)
Dimakis, Alexandros D. G.; Sarwate, Anand D.; Wainwright, Martin J.
Gossip algorithms for distributed computation are attractive due to their simplicity, distributed nature, and robustness in noisy and uncertain environments. However, using standard gossip algorithms can lead to a significant waste in energy by repeatedly recirculating redundant information. For realistic sensor network model topologies like grids and random geometric graphs, the inefficiency of gossip schemes is related to the slow mixing times of random walks on the communication graph. We propose and analyze an alternative gossiping scheme that exploits geographic information. By utilizing geographic routing combined with a simple resampling method, we demonstrate substantial gains over previously proposed gossip protocols. For regular graphs such as the ring or grid, our algorithm improves standard gossip by factors of $n$ and $\\sqrt{n}$ respectively. For the more challenging case of random geometric graphs, our algorithm computes the true average to accuracy $\\epsilon$ using $O(\\frac{n^{1.5}}{\\sqrt{\\log n}} \\log \\epsilon^{-1})$ radio transmissions, which yields a $\\sqrt{\\frac{n}{\\log n}}$ factor improvement over standard gossip algorithms. We illustrate these theoretical results with experimental comparisons between our algorithm and standard methods as applied to various classes of random fields.
An Improved Multi-Sensor Fusion Navigation Algorithm Based on the Factor Graph
Zeng, Qinghua; Chen, Weina; Liu, Jianye; Wang, Huizhe
2017-01-01
An integrated navigation system coupled with additional sensors can be used in the Micro Unmanned Aerial Vehicle (MUAV) applications because the multi-sensor information is redundant and complementary, which can markedly improve the system accuracy. How to deal with the information gathered from different sensors efficiently is an important problem. The fact that different sensors provide measurements asynchronously may complicate the processing of these measurements. In addition, the output signals of some sensors appear to have a non-linear character. In order to incorporate these measurements and calculate a navigation solution in real time, the multi-sensor fusion algorithm based on factor graph is proposed. The global optimum solution is factorized according to the chain structure of the factor graph, which allows for a more general form of the conditional probability density. It can convert the fusion matter into connecting factors defined by these measurements to the graph without considering the relationship between the sensor update frequency and the fusion period. An experimental MUAV system has been built and some experiments have been performed to prove the effectiveness of the proposed method. PMID:28335570
An Improved Multi-Sensor Fusion Navigation Algorithm Based on the Factor Graph.
Zeng, Qinghua; Chen, Weina; Liu, Jianye; Wang, Huizhe
2017-03-21
An integrated navigation system coupled with additional sensors can be used in the Micro Unmanned Aerial Vehicle (MUAV) applications because the multi-sensor information is redundant and complementary, which can markedly improve the system accuracy. How to deal with the information gathered from different sensors efficiently is an important problem. The fact that different sensors provide measurements asynchronously may complicate the processing of these measurements. In addition, the output signals of some sensors appear to have a non-linear character. In order to incorporate these measurements and calculate a navigation solution in real time, the multi-sensor fusion algorithm based on factor graph is proposed. The global optimum solution is factorized according to the chain structure of the factor graph, which allows for a more general form of the conditional probability density. It can convert the fusion matter into connecting factors defined by these measurements to the graph without considering the relationship between the sensor update frequency and the fusion period. An experimental MUAV system has been built and some experiments have been performed to prove the effectiveness of the proposed method.
Kafieh, Raheleh; Rabbani, Hossein; Abramoff, Michael D.; Sonka, Milan
2013-01-01
Optical coherence tomography (OCT) is a powerful and noninvasive method for retinal imaging. In this paper, we introduce a fast segmentation method based on a new variant of spectral graph theory named diffusion maps. The research is performed on spectral domain (SD) OCT images depicting macular and optic nerve head appearance. The presented approach does not require edge-based image information in localizing most of boundaries and relies on regional image texture. Consequently, the proposed method demonstrates robustness in situations of low image contrast or poor layer-to-layer image gradients. Diffusion mapping applied to 2D and 3D OCT datasets is composed of two steps, one for partitioning the data into important and less important sections, and another one for localization of internal layers. In the first step, the pixels/voxels are grouped in rectangular/cubic sets to form a graph node. The weights of the graph are calculated based on geometric distances between pixels/voxels and differences of their mean intensity. The first diffusion map clusters the data into three parts, the second of which is the area of interest. The other two sections are eliminated from the remaining calculations. In the second step, the remaining area is subjected to another diffusion map assessment and the internal layers are localized based on their textural similarities. The proposed method was tested on 23 datasets from two patient groups (glaucoma and normals). The mean unsigned border positioning errors (mean ± SD) was 8.52 ± 3.13 and 7.56 ± 2.95 μm for the 2D and 3D methods, respectively. PMID:23837966
Partitioning medical image databases for content-based queries on a Grid.
Montagnat, J; Breton, V; E Magnin, I
2005-01-01
In this paper we study the impact of executing a medical image database query application on the grid. For lowering the total computation time, the image database is partitioned into subsets to be processed on different grid nodes. A theoretical model of the application complexity and estimates of the grid execution overhead are used to efficiently partition the database. We show results demonstrating that smart partitioning of the database can lead to significant improvements in terms of total computation time. Grids are promising for content-based image retrieval in medical databases.
Towards Scalable Graph Computation on Mobile Devices.
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2014-10-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach.
Transformations of Mathematical and Stimulus Functions
Ninness, Chris; Barnes-Holmes, Dermot; Rumph, Robin; McCuller, Glen; Ford, Angela M; Payne, Robert; Ninness, Sharon K; Smith, Ronald J; Ward, Todd A; Elliott, Marc P
2006-01-01
Following a pretest, 8 participants who were unfamiliar with algebraic and trigonometric functions received a brief presentation on the rectangular coordinate system. Next, they participated in a computer-interactive matching-to-sample procedure that trained formula-to-formula and formula-to-graph relations. Then, they were exposed to 40 novel formula-to-graph tests and 10 novel graph-to-formula tests. Seven of the 8 participants showed substantial improvement in identifying formula-to-graph relations; however, in the test of novel graph-to-formula relations, participants tended to select equations in their factored form. Next, we manipulated contextual cues in the form of rules regarding mathematical preferences. First, we informed participants that standard forms of equations were preferred over factored forms. In a subsequent test of 10 additional novel graph-to-formula relations, participants shifted their selections to favor equations in their standard form. This preference reversed during 10 more tests when financial reward was made contingent on correct identification of formulas in factored form. Formula preferences and transformation of novel mathematical and stimulus functions are discussed. PMID:17020211
Dynamic graph cuts for efficient inference in Markov Random Fields.
Kohli, Pushmeet; Torr, Philip H S
2007-12-01
Abstract-In this paper we present a fast new fully dynamic algorithm for the st-mincut/max-flow problem. We show how this algorithm can be used to efficiently compute MAP solutions for certain dynamically changing MRF models in computer vision such as image segmentation. Specifically, given the solution of the max-flow problem on a graph, the dynamic algorithm efficiently computes the maximum flow in a modified version of the graph. The time taken by it is roughly proportional to the total amount of change in the edge weights of the graph. Our experiments show that, when the number of changes in the graph is small, the dynamic algorithm is significantly faster than the best known static graph cut algorithm. We test the performance of our algorithm on one particular problem: the object-background segmentation problem for video. It should be noted that the application of our algorithm is not limited to the above problem, the algorithm is generic and can be used to yield similar improvements in many other cases that involve dynamic change.
Towards Scalable Graph Computation on Mobile Devices
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2015-01-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564
An improvement of the measurement of time series irreversibility with visibility graph approach
NASA Astrophysics Data System (ADS)
Wu, Zhenyu; Shang, Pengjian; Xiong, Hui
2018-07-01
We propose a method to improve the measure of real-valued time series irreversibility which contains two tools: the directed horizontal visibility graph and the Kullback-Leibler divergence. The degree of time irreversibility is estimated by the Kullback-Leibler divergence between the in and out degree distributions presented in the associated visibility graph. In our work, we reframe the in and out degree distributions by encoding them with different embedded dimensions used in calculating permutation entropy(PE). With this improved method, we can not only estimate time series irreversibility efficiently, but also detect time series irreversibility from multiple dimensions. We verify the validity of our method and then estimate the amount of time irreversibility of series generated by chaotic maps as well as global stock markets over the period 2005-2015. The result shows that the amount of time irreversibility reaches the peak with embedded dimension d = 3 under circumstances of experiment and financial markets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Tumeo, Antonino; Ferrandi, Fabrizio
Emerging applications such as data mining, bioinformatics, knowledge discovery, social network analysis are irregular. They use data structures based on pointers or linked lists, such as graphs, unbalanced trees or unstructures grids, which generates unpredictable memory accesses. These data structures usually are large, but difficult to partition. These applications mostly are memory bandwidth bounded and have high synchronization intensity. However, they also have large amounts of inherent dynamic parallelism, because they potentially perform a task for each one of the element they are exploring. Several efforts are looking at accelerating these applications on hybrid architectures, which integrate general purpose processorsmore » with reconfigurable devices. Some solutions, which demonstrated significant speedups, include custom-hand tuned accelerators or even full processor architectures on the reconfigurable logic. In this paper we present an approach for the automatic synthesis of accelerators from C, targeted at irregular applications. In contrast to typical High Level Synthesis paradigms, which construct a centralized Finite State Machine, our approach generates dynamically scheduled hardware components. While parallelism exploitation in typical HLS-generated accelerators is usually bound within a single execution flow, our solution allows concurrently running multiple execution flow, thus also exploiting the coarser grain task parallelism of irregular applications. Our approach supports multiple, multi-ported and distributed memories, and atomic memory operations. Its main objective is parallelizing as many memory operations as possible, independently from their execution time, to maximize the memory bandwidth utilization. This significantly differs from current HLS flows, which usually consider a single memory port and require precise scheduling of memory operations. A key innovation of our approach is the generation of a memory interface controller, which dynamically maps concurrent memory accesses to multiple ports. We present a case study on a typical irregular kernel, Graph Breadth First search (BFS), exploring different tradeoffs in terms of parallelism and number of memories.« less
Asada, Yukiko; Abel, Hannah; Skedgel, Chris; Warner, Grace
2017-12-01
Policy Points: Effective graphs can be a powerful tool in communicating health inequality. The choice of graphs is often based on preferences and familiarity rather than science. According to the literature on graph perception, effective graphs allow human brains to decode visual cues easily. Dot charts are easier to decode than bar charts, and thus they are more effective. Dot charts are a flexible and versatile way to display information about health inequality. Consistent with the health risk communication literature, the captions accompanying health inequality graphs should provide a numerical, explicitly calculated description of health inequality, expressed in absolute and relative terms, from carefully thought-out perspectives. Graphs are an essential tool for communicating health inequality, a key health policy concern. The choice of graphs is often driven by personal preferences and familiarity. Our article is aimed at health policy researchers developing health inequality graphs for policy and scientific audiences and seeks to (1) raise awareness of the effective use of graphs in communicating health inequality; (2) advocate for a particular type of graph (ie, dot charts) to depict health inequality; and (3) suggest key considerations for the captions accompanying health inequality graphs. Using composite review methods, we selected the prevailing recommendations for improving graphs in scientific reporting. To find the origins of these recommendations, we reviewed the literature on graph perception and then applied what we learned to the context of health inequality. In addition, drawing from the numeracy literature in health risk communication, we examined numeric and verbal formats to explain health inequality graphs. Many disciplines offer commonsense recommendations for visually presenting quantitative data. The literature on graph perception, which defines effective graphs as those allowing the easy decoding of visual cues in human brains, shows that with their more accurate and easier-to-decode visual cues, dot charts are more effective than bar charts. Dot charts can flexibly present a large amount of information in limited space. They also can easily accommodate typical health inequality information to describe a health variable (eg, life expectancy) by an inequality domain (eg, income) with domain groups (eg, poor and rich) in a population (eg, Canada) over time periods (eg, 2010 and 2017). The numeracy literature suggests that a health inequality graph's caption should provide a numerical, explicitly calculated description of health inequality expressed in absolute and relative terms, from carefully thought-out perspectives. Given the ubiquity of graphs, the health inequality field should learn from the vibrant multidisciplinary literature how to construct effective graphic communications, especially by considering to use dot charts. © 2017 Milbank Memorial Fund.
NASA Astrophysics Data System (ADS)
Chan, Chia-Hsin; Tu, Chun-Chuan; Tsai, Wen-Jiin
2017-01-01
High efficiency video coding (HEVC) not only improves the coding efficiency drastically compared to the well-known H.264/AVC but also introduces coding tools for parallel processing, one of which is tiles. Tile partitioning is allowed to be arbitrary in HEVC, but how to decide tile boundaries remains an open issue. An adaptive tile boundary (ATB) method is proposed to select a better tile partitioning to improve load balancing (ATB-LoadB) and coding efficiency (ATB-Gain) with a unified scheme. Experimental results show that, compared to ordinary uniform-space partitioning, the proposed ATB can save up to 17.65% of encoding times in parallel encoding scenarios and can reduce up to 0.8% of total bit rates for coding efficiency.
Dowen, Frances; Sidhu, Karishma; Broadbent, Elizabeth; Pilmore, Helen
2017-09-21
Mortality in end stage renal disease (ESRD) is higher than many malignancies. There is no data about the optimal way to present information about projected survival to patients with ESRD. In other areas, graphs have been shown to be more easily understood than narrative. We examined patient comprehension and perspectives on graphs in communicating projected survival in chronic kidney disease (CKD). One hundred seventy-seven patients with CKD were shown 4 different graphs presenting post transplantation survival data. Patients were asked to interpret a Kaplan Meier curve, pie chart, histogram and pictograph and answer a multi-choice question to determine understanding. We measured interpretation, usefulness and preference for the graphs. Most patients correctly interpreted the graphs. There was asignificant difference in the percentage of correct answers when comparing different graph types (p = 0.0439). The pictograph was correctly interpreted by 81% of participants, the histogram by 79%, pie chart by 77% and Kaplan Meier by 69%. Correct interpretation of the histogram was associated with educational level (p = 0.008) and inversely associated with age > 65 (p = 0.008). Of those who interpreted all four graphs correctly, there was an association with employment (p = 0.001) and New Zealand European ethnicity (p = 0.002). 87% of patients found the graphs useful. The pie chart was the most preferred graph (p 0.002). The readability of the graphs may have been improved with an alternative colour choice, especially in the setting of visual impairment. Visual aids, can be beneficial adjuncts to discussing survival in CKD.
ERIC Educational Resources Information Center
Maries, Alexandru; Lin, Shih-Yin; Singh, Chandralekha
2017-01-01
Prior research suggests that introductory physics students have difficulty with graphing and interpreting graphs. Here, we discuss an investigation of student difficulties in translating between mathematical and graphical representations for a problem in electrostatics and the effect of increasing levels of scaffolding on students'…
Improving Student Knowledge of the Graphing Calculator's Capabilities.
ERIC Educational Resources Information Center
Hubbard, Donna
This paper describes an intervention in two Algebra II classes in which the graphing calculator was incorporated into the curriculum as often as possible. The targeted population consisted of high school students in a growing middle to upper class community located in a suburb of a large city. The problem of a lack of understanding of the…
Efficient solution for finding Hamilton cycles in undirected graphs.
Alhalabi, Wadee; Kitanneh, Omar; Alharbi, Amira; Balfakih, Zain; Sarirete, Akila
2016-01-01
The Hamilton cycle problem is closely related to a series of famous problems and puzzles (traveling salesman problem, Icosian game) and, due to the fact that it is NP-complete, it was extensively studied with different algorithms to solve it. The most efficient algorithm is not known. In this paper, a necessary condition for an arbitrary un-directed graph to have Hamilton cycle is proposed. Based on this condition, a mathematical solution for this problem is developed and several proofs and an algorithmic approach are introduced. The algorithm is successfully implemented on many Hamiltonian and non-Hamiltonian graphs. This provides a new effective approach to solve a problem that is fundamental in graph theory and can influence the manner in which the existing applications are used and improved.
SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.
Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A
2018-01-01
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date, however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and to reflect previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
Graphing Calculator Mini Course
NASA Technical Reports Server (NTRS)
Karnawat, Sunil R.
1996-01-01
The "Graphing Calculator Mini Course" project provided a mathematically-intensive technologically-based summer enrichment workshop for teachers of American Indian students on the Turtle Mountain Indian Reservation. Eleven such teachers participated in the six-day workshop in summer of 1996 and three Sunday workshops in the academic year. The project aimed to improve science and mathematics education on the reservation by showing teachers effective ways to use high-end graphing calculators as teaching and learning tools in science and mathematics courses at all levels. In particular, the workshop concentrated on applying TI-82's user-friendly features to understand the various mathematical and scientific concepts.
One-dimensional swarm algorithm packaging
NASA Astrophysics Data System (ADS)
Lebedev, Boris K.; Lebedev, Oleg B.; Lebedeva, Ekaterina O.
2018-05-01
The paper considers an algorithm for solving the problem of onedimensional packaging based on the adaptive behavior model of an ant colony. The key role in the development of the ant algorithm is the choice of representation (interpretation) of the solution. The structure of the solution search graph, the procedure for finding solutions on the graph, the methods of deposition and evaporation of pheromone are described. Unlike the canonical paradigm of an ant algorithm, an ant on the solution search graph generates sets of elements distributed across blocks. Experimental studies were conducted on IBM PC. Compared with the existing algorithms, the results are improved.
Using Behavior Over Time Graphs to Spur Systems Thinking Among Public Health Practitioners.
Calancie, Larissa; Anderson, Seri; Branscomb, Jane; Apostolico, Alexsandra A; Lich, Kristen Hassmiller
2018-02-01
Public health practitioners can use Behavior Over Time (BOT) graphs to spur discussion and systems thinking around complex challenges. Multiple large systems, such as health care, the economy, and education, affect chronic disease rates in the United States. System thinking tools can build public health practitioners' capacity to understand these systems and collaborate within and across sectors to improve population health. BOT graphs show a variable, or variables (y axis) over time (x axis). Although analyzing trends is not new to public health, drawing BOT graphs, annotating the events and systemic forces that are likely to influence the depicted trends, and then discussing the graphs in a diverse group provides an opportunity for public health practitioners to hear each other's perspectives and creates a more holistic understanding of the key factors that contribute to a trend. We describe how BOT graphs are used in public health, how they can be used to generate group discussion, and how this process can advance systems-level thinking. Then we describe how BOT graphs were used with groups of maternal and child health (MCH) practitioners and partners (N = 101) during a training session to advance their thinking about MCH challenges. Eighty-six percent of the 84 participants who completed an evaluation agreed or strongly agreed that they would use this BOT graph process to engage stakeholders in their home states and jurisdictions. The BOT graph process we describe can be applied to a variety of public health issues and used by practitioners, stakeholders, and researchers.
On the Parameterized Complexity of Some Optimization Problems Related to Multiple-Interval Graphs
NASA Astrophysics Data System (ADS)
Jiang, Minghui
We show that for any constant t ≥ 2, K -Independent Set and K-Dominating Set in t-track interval graphs are W[1]-hard. This settles an open question recently raised by Fellows, Hermelin, Rosamond, and Vialette. We also give an FPT algorithm for K-Clique in t-interval graphs, parameterized by both k and t, with running time max { t O(k), 2 O(klogk) } ·poly(n), where n is the number of vertices in the graph. This slightly improves the previous FPT algorithm by Fellows, Hermelin, Rosamond, and Vialette. Finally, we use the W[1]-hardness of K-Independent Set in t-track interval graphs to obtain the first parameterized intractability result for a recent bioinformatics problem called Maximal Strip Recovery (MSR). We show that MSR-d is W[1]-hard for any constant d ≥ 4 when the parameter is either the total length of the strips, or the total number of adjacencies in the strips, or the number of strips in the optimal solution.
Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold
2014-12-01
In order to deal with the sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximation policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings.
Li, Bing; Yuan, Chunfeng; Xiong, Weihua; Hu, Weiming; Peng, Houwen; Ding, Xinmiao; Maybank, Steve
2017-12-01
In multi-instance learning (MIL), the relations among instances in a bag convey important contextual information in many applications. Previous studies on MIL either ignore such relations or simply model them with a fixed graph structure so that the overall performance inevitably degrades in complex environments. To address this problem, this paper proposes a novel multi-view multi-instance learning algorithm (MIL) that combines multiple context structures in a bag into a unified framework. The novel aspects are: (i) we propose a sparse -graph model that can generate different graphs with different parameters to represent various context relations in a bag, (ii) we propose a multi-view joint sparse representation that integrates these graphs into a unified framework for bag classification, and (iii) we propose a multi-view dictionary learning algorithm to obtain a multi-view graph dictionary that considers cues from all views simultaneously to improve the discrimination of the MIL. Experiments and analyses in many practical applications prove the effectiveness of the M IL.
Nearest neighbor-density-based clustering methods for large hyperspectral images
NASA Astrophysics Data System (ADS)
Cariou, Claude; Chehdi, Kacem
2017-10-01
We address the problem of hyperspectral image (HSI) pixel partitioning using nearest neighbor - density-based (NN-DB) clustering methods. NN-DB methods are able to cluster objects without specifying the number of clusters to be found. Within the NN-DB approach, we focus on deterministic methods, e.g. ModeSeek, knnClust, and GWENN (standing for Graph WatershEd using Nearest Neighbors). These methods only require the availability of a k-nearest neighbor (kNN) graph based on a given distance metric. Recently, a new DB clustering method, called Density Peak Clustering (DPC), has received much attention, and kNN versions of it have quickly followed and showed their efficiency. However, NN-DB methods still suffer from the difficulty of obtaining the kNN graph due to the quadratic complexity with respect to the number of pixels. This is why GWENN was embedded into a multiresolution (MR) scheme to bypass the computation of the full kNN graph over the image pixels. In this communication, we propose to extent the MR-GWENN scheme on three aspects. Firstly, similarly to knnClust, the original labeling rule of GWENN is modified to account for local density values, in addition to the labels of previously processed objects. Secondly, we set up a modified NN search procedure within the MR scheme, in order to stabilize of the number of clusters found from the coarsest to the finest spatial resolution. Finally, we show that these extensions can be easily adapted to the three other NN-DB methods (ModeSeek, knnClust, knnDPC) for pixel clustering in large HSIs. Experiments are conducted to compare the four NN-DB methods for pixel clustering in HSIs. We show that NN-DB methods can outperform a classical clustering method such as fuzzy c-means (FCM), in terms of classification accuracy, relevance of found clusters, and clustering speed. Finally, we demonstrate the feasibility and evaluate the performances of NN-DB methods on a very large image acquired by our AISA Eagle hyperspectral imaging sensor.
Spatial partitions systematize visual search and enhance target memory.
Solman, Grayden J F; Kingstone, Alan
2017-02-01
Humans are remarkably capable of finding desired objects in the world, despite the scale and complexity of naturalistic environments. Broadly, this ability is supported by an interplay between exploratory search and guidance from episodic memory for previously observed target locations. Here we examined how the environment itself may influence this interplay. In particular, we examined how partitions in the environment-like buildings, rooms, and furniture-can impact memory during repeated search. We report that the presence of partitions in a display, independent of item configuration, reliably improves episodic memory for item locations. Repeated search through partitioned displays was faster overall and was characterized by more rapid ballistic orienting in later repetitions. Explicit recall was also both faster and more accurate when displays were partitioned. Finally, we found that search paths were more regular and systematic when displays were partitioned. Given the ubiquity of partitions in real-world environments, these results provide important insights into the mechanisms of naturalistic search and its relation to memory.
Kujala, Rainer; Glerean, Enrico; Pan, Raj Kumar; Jääskeläinen, Iiro P; Sams, Mikko; Saramäki, Jari
2016-11-01
Networks have become a standard tool for analyzing functional magnetic resonance imaging (fMRI) data. In this approach, brain areas and their functional connections are mapped to the nodes and links of a network. Even though this mapping reduces the complexity of the underlying data, it remains challenging to understand the structure of the resulting networks due to the large number of nodes and links. One solution is to partition networks into modules and then investigate the modules' composition and relationship with brain functioning. While this approach works well for single networks, understanding differences between two networks by comparing their partitions is difficult and alternative approaches are thus necessary. To this end, we present a coarse-graining framework that uses a single set of data-driven modules as a frame of reference, enabling one to zoom out from the node- and link-level details. As a result, differences in the module-level connectivity can be understood in a transparent, statistically verifiable manner. We demonstrate the feasibility of the method by applying it to networks constructed from fMRI data recorded from 13 healthy subjects during rest and movie viewing. While independently partitioning the rest and movie networks is shown to yield little insight, the coarse-graining framework enables one to pinpoint differences in the module-level structure, such as the increased number of intra-module links within the visual cortex during movie viewing. In addition to quantifying differences due to external stimuli, the approach could also be applied in clinical settings, such as comparing patients with healthy controls. © 2016 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Duan, Wuhua; Chen, Jing; Wang, Jianchen; Wang, Shuwei; Feng, Xiaogui; Wang, Xinghai; Li, Shaowei; Xu, Chao
2014-08-15
High level liquid waste (HLLW) produced from the reprocessing of the spent nuclear fuel still contains moderate amounts of uranium, transuranium (TRU) actinides, (90)Sr, (137)Cs, etc., and thus constitutes a permanent hazard to the environment. The partitioning and transmutation (P&T) strategy has increasingly attracted interest for the safe treatment and disposal of HLLW, in which the partitioning of HLLW is one of the critical technical issues. An improved total partitioning process, including a TRPO (tri-alkylphosphine oxide) process for the removal of actinides, a CESE (crown ether strontium extraction) process for the removal of Sr, and a CECE (calixcrown ether cesium extraction) process for the removal of Cs, has been developed to treat Chinese HLLW. A 160-hour hot test of the improved total partitioning process was carried out using 72-stage 10-mm-dia annular centrifugal contactors (ACCs) and genuine HLLW. The hot test results showed that the average DFs of total α activity, Sr and Cs were 3.57 × 10(3), 2.25 × 10(4) and 1.68 × 10(4) after the hot test reached equilibrium, respectively. During the hot test, 72-stage 10-mm-dia ACCs worked stable, continuously with no stage failing or interruption of the operation. Copyright © 2014 Elsevier B.V. All rights reserved.
Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.
Song, Jingkuan; Gao, Lianli; Nie, Feiping; Shen, Heng Tao; Yan, Yan; Sebe, Nicu
2016-11-01
In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term in the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multi-cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
Botía, Juan A; Vandrovcova, Jana; Forabosco, Paola; Guelfi, Sebastian; D'Sa, Karishma; Hardy, John; Lewis, Cathryn M; Ryten, Mina; Weale, Michael E
2017-04-12
Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn ). We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (x3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs) (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful.
A dynamic re-partitioning strategy based on the distribution of key in Spark
NASA Astrophysics Data System (ADS)
Zhang, Tianyu; Lian, Xin
2018-05-01
Spark is a memory-based distributed data processing framework, has the ability of processing massive data and becomes a focus in Big Data. But the performance of Spark Shuffle depends on the distribution of data. The naive Hash partition function of Spark can not guarantee load balancing when data is skewed. The time of job is affected by the node which has more data to process. In order to handle this problem, dynamic sampling is used. In the process of task execution, histogram is used to count the key frequency distribution of each node, and then generate the global key frequency distribution. After analyzing the distribution of key, load balance of data partition is achieved. Results show that the Dynamic Re-Partitioning function is better than the default Hash partition, Fine Partition and the Balanced-Schedule strategy, it can reduce the execution time of the task and improve the efficiency of the whole cluster.
Parallel file system with metadata distributed across partitioned key-value store c
Bent, John M.; Faibish, Sorin; Grider, Gary; Torres, Aaron
2017-09-19
Improved techniques are provided for storing metadata associated with a plurality of sub-files associated with a single shared file in a parallel file system. The shared file is generated by a plurality of applications executing on a plurality of compute nodes. A compute node implements a Parallel Log Structured File System (PLFS) library to store at least one portion of the shared file generated by an application executing on the compute node and metadata for the at least one portion of the shared file on one or more object storage servers. The compute node is also configured to implement a partitioned data store for storing a partition of the metadata for the shared file, wherein the partitioned data store communicates with partitioned data stores on other compute nodes using a message passing interface. The partitioned data store can be implemented, for example, using Multidimensional Data Hashing Indexing Middleware (MDHIM).
The Edge-Disjoint Path Problem on Random Graphs by Message-Passing.
Altarelli, Fabrizio; Braunstein, Alfredo; Dall'Asta, Luca; De Bacco, Caterina; Franz, Silvio
2015-01-01
We present a message-passing algorithm to solve a series of edge-disjoint path problems on graphs based on the zero-temperature cavity equations. Edge-disjoint paths problems are important in the general context of routing, that can be defined by incorporating under a unique framework both traffic optimization and total path length minimization. The computation of the cavity equations can be performed efficiently by exploiting a mapping of a generalized edge-disjoint path problem on a star graph onto a weighted maximum matching problem. We perform extensive numerical simulations on random graphs of various types to test the performance both in terms of path length minimization and maximization of the number of accommodated paths. In addition, we test the performance on benchmark instances on various graphs by comparison with state-of-the-art algorithms and results found in the literature. Our message-passing algorithm always outperforms the others in terms of the number of accommodated paths when considering non trivial instances (otherwise it gives the same trivial results). Remarkably, the largest improvement in performance with respect to the other methods employed is found in the case of benchmarks with meshes, where the validity hypothesis behind message-passing is expected to worsen. In these cases, even though the exact message-passing equations do not converge, by introducing a reinforcement parameter to force convergence towards a sub optimal solution, we were able to always outperform the other algorithms with a peak of 27% performance improvement in terms of accommodated paths. On random graphs, we numerically observe two separated regimes: one in which all paths can be accommodated and one in which this is not possible. We also investigate the behavior of both the number of paths to be accommodated and their minimum total length.
The Edge-Disjoint Path Problem on Random Graphs by Message-Passing
2015-01-01
We present a message-passing algorithm to solve a series of edge-disjoint path problems on graphs based on the zero-temperature cavity equations. Edge-disjoint paths problems are important in the general context of routing, that can be defined by incorporating under a unique framework both traffic optimization and total path length minimization. The computation of the cavity equations can be performed efficiently by exploiting a mapping of a generalized edge-disjoint path problem on a star graph onto a weighted maximum matching problem. We perform extensive numerical simulations on random graphs of various types to test the performance both in terms of path length minimization and maximization of the number of accommodated paths. In addition, we test the performance on benchmark instances on various graphs by comparison with state-of-the-art algorithms and results found in the literature. Our message-passing algorithm always outperforms the others in terms of the number of accommodated paths when considering non trivial instances (otherwise it gives the same trivial results). Remarkably, the largest improvement in performance with respect to the other methods employed is found in the case of benchmarks with meshes, where the validity hypothesis behind message-passing is expected to worsen. In these cases, even though the exact message-passing equations do not converge, by introducing a reinforcement parameter to force convergence towards a sub optimal solution, we were able to always outperform the other algorithms with a peak of 27% performance improvement in terms of accommodated paths. On random graphs, we numerically observe two separated regimes: one in which all paths can be accommodated and one in which this is not possible. We also investigate the behavior of both the number of paths to be accommodated and their minimum total length. PMID:26710102
ERIC Educational Resources Information Center
Yoon, Susan A.
2011-01-01
This study extends previous research that explores how visualization affordances that computational tools provide and social network analyses that account for individual- and group-level dynamic processes can work in conjunction to improve learning outcomes. The study's main hypothesis is that when social network graphs are used in instruction,…
ERIC Educational Resources Information Center
Averitt, Sallie D.
This instructor guide, which was developed for use in a manufacturing firm's advanced technical preparation program, contains the materials required to present a learning module that is designed to prepare trainees for the program's statistical process control module by improving their basic math skills in working with line graphs and teaching…
Multilabel user classification using the community structure of online networks
Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user’s graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score. PMID:28278242
Multilabel user classification using the community structure of online networks.
Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user's graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score.
NASA Astrophysics Data System (ADS)
Shankar, Chandrashekar
The goal of this research was to gain a fundamental understanding of the properties of networks created by the ring opening metathesis polymerization (ROMP) of dicyclopentadiene (DCPD) used in self-healing materials. To this end we used molecular simulation methods to generate realistic structures of DCPD networks, characterize their structures, and determine their mechanical properties. Density functional theory (DFT) calculations, complemented by structural information derived from molecular dynamics simulations were used to reconstruct experimental Raman spectra and differential scanning calorimetry (DSC) data. We performed coarse-grained simulations comparing networks generated via the ROMP reaction process and compared them to those generated via a RANDOM process, which led to the fundamental realization that the polymer topology has a unique influence on the network properties. We carried out fully atomistic simulations of DCPD using a novel algorithm for recreating ROMP reactions of DCPD molecules. Mechanical properties derived from these atomistic networks are in excellent agreement with those obtained from coarse-grained simulations in which interactions between nodes are subject to angular constraints. This comparison provides self-consistent validation of our simulation results and helps to identify the level of detail necessary for the coarse-grained interaction model. Simulations suggest networks can classified into three stages: fluid-like, rubber-like or glass-like delineated by two thresholds in degree of reaction alpha: The onset of finite magnitudes for the Young's modulus, alphaY, and the departure of the Poisson ration from 0.5, alphaP. In each stage the polymer exhibits a different predominant mechanical response to deformation. At low alpha < alphaY it flows. At alpha Y < alpha < alphaP the response is entropic with no change in internal energy. At alpha > alphaP the response is enthalpic change in internal energy. We developed graph theory-based network characterizations to correlate between network topology and the simulated mechanical properties. (1) Eigenvector centrality (2) Graph fractal dimension, (3) Fiedler partitioning, and (4) Cross-link fraction (Q3+Q4). Of these quantities, the Fiedler partition is the best characteristic for the prediction of Young's Modulus. The new computational tools developed in this research are of great fundamental and practical interest.
Ivanciuc, O; Ivanciuc, T; Klein, D J; Seitz, W A; Balaban, A T
2001-02-01
Quantitative structure-retention relationships (QSRR) represent statistical models that quantify the connection between the molecular structure and the chromatographic retention indices of organic compounds, allowing the prediction of retention indices of novel, not yet synthesized compounds, solely from their structural descriptors. Using multiple linear regression, QSRR models for the gas chromatographic Kováts retention indices of 129 alkylbenzenes are generated using molecular graph descriptors. The correlational ability of structural descriptors computed from 10 molecular matrices is investigated, showing that the novel reciprocal matrices give numerical indices with improved correlational ability. A QSRR equation with 5 graph descriptors gives the best calibration and prediction results, demonstrating the usefulness of the molecular graph descriptors in modeling chromatographic retention parameters. The sequential orthogonalization of descriptors suggests simpler QSRR models by eliminating redundant structural information.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coram, Jamie L.; Morrow, James D.; Perkins, David Nikolaus
2015-09-01
This document describes the PANTHER R&D Application, a proof-of-concept user interface application developed under the PANTHER Grand Challenge LDRD. The purpose of the application is to explore interaction models for graph analytics, drive algorithmic improvements from an end-user point of view, and support demonstration of PANTHER technologies to potential customers. The R&D Application implements a graph-centric interaction model that exposes analysts to the algorithms contained within the GeoGraphy graph analytics library. Users define geospatial-temporal semantic graph queries by constructing search templates based on nodes, edges, and the constraints among them. Users then analyze the results of the queries using bothmore » geo-spatial and temporal visualizations. Development of this application has made user experience an explicit driver for project and algorithmic level decisions that will affect how analysts one day make use of PANTHER technologies.« less
Lombaert, Herve; Grady, Leo; Polimeni, Jonathan R.; Cheriet, Farida
2013-01-01
Existing methods for surface matching are limited by the trade-off between precision and computational efficiency. Here we present an improved algorithm for dense vertex-to-vertex correspondence that uses direct matching of features defined on a surface and improves it by using spectral correspondence as a regularization. This algorithm has the speed of both feature matching and spectral matching while exhibiting greatly improved precision (distance errors of 1.4%). The method, FOCUSR, incorporates implicitly such additional features to calculate the correspondence and relies on the smoothness of the lowest-frequency harmonics of a graph Laplacian to spatially regularize the features. In its simplest form, FOCUSR is an improved spectral correspondence method that nonrigidly deforms spectral embeddings. We provide here a full realization of spectral correspondence where virtually any feature can be used as additional information using weights on graph edges, but also on graph nodes and as extra embedded coordinates. As an example, the full power of FOCUSR is demonstrated in a real case scenario with the challenging task of brain surface matching across several individuals. Our results show that combining features and regularizing them in a spectral embedding greatly improves the matching precision (to a sub-millimeter level) while performing at much greater speed than existing methods. PMID:23868776
Developing and evaluating Quilts for the depiction of large layered graphs.
Bae, Juhee; Watson, Ben
2011-12-01
Traditional layered graph depictions such as flow charts are in wide use. Yet as graphs grow more complex, these depictions can become difficult to understand. Quilts are matrix-based depictions for layered graphs designed to address this problem. In this research, we first improve Quilts by developing three design alternatives, and then compare the best of these alternatives to better-known node-link and matrix depictions. A primary weakness in Quilts is their depiction of skip links, links that do not simply connect to a succeeding layer. Therefore in our first study, we compare Quilts using color-only, text-only, and mixed (color and text) skip link depictions, finding that path finding with the color-only depiction is significantly slower and less accurate, and that in certain cases, the mixed depiction offers an advantage over the text-only depiction. In our second study, we compare Quilts using the mixed depiction to node-link diagrams and centered matrices. Overall results show that users can find paths through graphs significantly faster with Quilts (46.6 secs) than with node-link (58.3 secs) or matrix (71.2 secs) diagrams. This speed advantage is still greater in large graphs (e.g. in 200 node graphs, 55.4 secs vs. 71.1 secs for node-link and 84.2 secs for matrix depictions). © 2011 IEEE
Multi-label literature classification based on the Gene Ontology graph.
Jin, Bo; Muller, Brian; Zhai, Chengxiang; Lu, Xinghua
2008-12-08
The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate protein annotation based on the literature.
Using Correlation to Compute Better Probability Estimates in Plan Graphs
NASA Technical Reports Server (NTRS)
Bryce, Daniel; Smith, David E.
2006-01-01
Plan graphs are commonly used in planning to help compute heuristic "distance" estimates between states and goals. A few authors have also attempted to use plan graphs in probabilistic planning to compute estimates of the probability that propositions can be achieved and actions can be performed. This is done by propagating probability information forward through the plan graph from the initial conditions through each possible action to the action effects, and hence to the propositions at the next layer of the plan graph. The problem with these calculations is that they make very strong independence assumptions - in particular, they usually assume that the preconditions for each action are independent of each other. This can lead to gross overestimates in probability when the plans for those preconditions interfere with each other. It can also lead to gross underestimates of probability when there is synergy between the plans for two or more preconditions. In this paper we introduce a notion of the binary correlation between two propositions and actions within a plan graph, show how to propagate this information within a plan graph, and show how this improves probability estimates for planning. This notion of correlation can be thought of as a continuous generalization of the notion of mutual exclusion (mutex) often used in plan graphs. At one extreme (correlation=0) two propositions or actions are completely mutex. With correlation = 1, two propositions or actions are independent, and with correlation > 1, two propositions or actions are synergistic. Intermediate values can and do occur indicating different degrees to which propositions and action interfere or are synergistic. We compare this approach with another recent approach by Bryce that computes probability estimates using Monte Carlo simulation of possible worlds in plan graphs.
Time-dependence of graph theory metrics in functional connectivity analysis
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J.; Haneef, Zulfi; Stern, John M.
2016-01-01
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. PMID:26518632
On Bipartite Graphs Trees and Their Partial Vertex Covers.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caskurlu, Bugra; Mkrtchyan, Vahan; Parekh, Ojas D.
2015-03-01
Graphs can be used to model risk management in various systems. Particularly, Caskurlu et al. in [7] have considered a system, which has threats, vulnerabilities and assets, and which essentially represents a tripartite graph. The goal in this model is to reduce the risk in the system below a predefined risk threshold level. One can either restricting the permissions of the users, or encapsulating the system assets. The pointed out two strategies correspond to deleting minimum number of elements corresponding to vulnerabilities and assets, such that the flow between threats and assets is reduced below the predefined threshold level. Itmore » can be shown that the main goal in this risk management system can be formulated as a Partial Vertex Cover problem on bipartite graphs. It is well-known that the Vertex Cover problem is in P on bipartite graphs, however; the computational complexity of the Partial Vertex Cover problem on bipartite graphs has remained open. In this paper, we establish that the Partial Vertex Cover problem is NP-hard on bipartite graphs, which was also recently independently demonstrated [N. Apollonio and B. Simeone, Discrete Appl. Math., 165 (2014), pp. 37–48; G. Joret and A. Vetta, preprint, arXiv:1211.4853v1 [cs.DS], 2012]. We then identify interesting special cases of bipartite graphs, for which the Partial Vertex Cover problem, the closely related Budgeted Maximum Coverage problem, and their weighted extensions can be solved in polynomial time. We also present an 8/9-approximation algorithm for the Budgeted Maximum Coverage problem in the class of bipartite graphs. We show that this matches and resolves the integrality gap of the natural LP relaxation of the problem and improves upon a recent 4/5-approximation.« less
Time-dependence of graph theory metrics in functional connectivity analysis.
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J; Haneef, Zulfi; Stern, John M
2016-01-15
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. Copyright © 2015 Elsevier Inc. All rights reserved.
Kennedy, David M; Chatha, Kamaljit; Rayner, Hugh C
2013-09-01
Some patients with chronic kidney disease are still referred late for specialist care despite the evidence that earlier detection and intervention can halt or delay progression to end-stage kidney disease (ESKD). To develop a population surveillance system using existing laboratory data to enable early detection of patients at high risk of ESKD by reviewing cumulative graphs of estimated glomerular filtration rate (eGFR). A database was developed, updated daily with data from the laboratory computer. Cumulative eGFR graphs containing up to five years of data are reviewed by clinical scientists for all primary care patients or out-patients with a low eGFR for their age. For those with a declining trend, a report containing the eGFR graph is sent to the requesting doctor. A retrospective audit was performed using historical data to assess the predictive value of the graphs. In nine months, we reported 370,000 eGFR results, reviewing 12,000 eGFR graphs. On average 60 graphs per week were flagged as 'high' or 'intermediate' risk. Patients with graphs flagged as high risk had a significantly higher mortality after 3.5 years and a significantly greater chance of requiring renal replacement therapy after 4.5 years of follow-up. Five patients (7%) with graphs flagged as high risk had a sustained >25% fall in eGFR without evidence of secondary care referral. Feedback about the service from requesting clinicians was 73% positive. We have developed a system for laboratory staff to review cumulative eGFR graphs for a large population and identify patients at highest risk of developing ESKD. Further research is needed to measure the impact of this service on patient outcomes. © 2013 European Dialysis and Transplant Nurses Association/European Renal Care Association.
NASA Astrophysics Data System (ADS)
Zhang, Yongping; Shang, Pengjian; Xiong, Hui; Xia, Jianan
Time irreversibility is an important property of nonequilibrium dynamic systems. A visibility graph approach was recently proposed, and this approach is generally effective to measure time irreversibility of time series. However, its result may be unreliable when dealing with high-dimensional systems. In this work, we consider the joint concept of time irreversibility and adopt the phase-space reconstruction technique to improve this visibility graph approach. Compared with the previous approach, the improved approach gives a more accurate estimate for the irreversibility of time series, and is more effective to distinguish irreversible and reversible stochastic processes. We also use this approach to extract the multiscale irreversibility to account for the multiple inherent dynamics of time series. Finally, we apply the approach to detect the multiscale irreversibility of financial time series, and succeed to distinguish the time of financial crisis and the plateau. In addition, Asian stock indexes away from other indexes are clearly visible in higher time scales. Simulations and real data support the effectiveness of the improved approach when detecting time irreversibility.
Engineering lipid structure for recognition of the liquid ordered membrane phase
Bordovsky, Stefan S.; Wong, Christopher S.; Bachand, George D.; ...
2016-08-26
The selective partitioning of lipid components in phase-separated membranes is essential for domain formation involved in cellular processes. Identifying and tracking the movement of lipids in cellular systems would be improved if we understood how to achieve selective affinity between fluorophore-labeled lipids and membrane assemblies. Furthermore, we investigated the structure and chemistry of membrane lipids to evaluate lipid designs that partition to the liquid ordered (L o) phase. A range of fluorophores at the headgroup position and lengths of PEG spacer between the lipid backbone and fluorophore were examined. On a lipid body with saturated palmityl or palmitoyl tails, wemore » found that although the lipid tails can direct selective partitioning to the L o phase through favorable packing interactions, headgroup hydrophobicity can override the partitioning behavior and direct the lipid to the disordered membrane phase (L d). The PEG spacer can serve as a buffer to mute headgroup–membrane interactions and thus improve L o phase partitioning, but its effect is limited with strongly hydrophobic fluorophore headgroups. We present a series of lipid designs leading to the development of novel fluorescently labeled lipids with selective affinity for the L o phase.« less
Engineering Lipid Structure for Recognition of the Liquid Ordered Membrane Phase.
Bordovsky, Stefan S; Wong, Christopher S; Bachand, George D; Stachowiak, Jeanne C; Sasaki, Darryl Y
2016-11-29
The selective partitioning of lipid components in phase-separated membranes is essential for domain formation involved in cellular processes. Identifying and tracking the movement of lipids in cellular systems would be improved if we understood how to achieve selective affinity between fluorophore-labeled lipids and membrane assemblies. Here, we investigated the structure and chemistry of membrane lipids to evaluate lipid designs that partition to the liquid ordered (L o ) phase. A range of fluorophores at the headgroup position and lengths of PEG spacer between the lipid backbone and fluorophore were examined. On a lipid body with saturated palmityl or palmitoyl tails, we found that although the lipid tails can direct selective partitioning to the L o phase through favorable packing interactions, headgroup hydrophobicity can override the partitioning behavior and direct the lipid to the disordered membrane phase (L d ). The PEG spacer can serve as a buffer to mute headgroup-membrane interactions and thus improve L o phase partitioning, but its effect is limited with strongly hydrophobic fluorophore headgroups. We present a series of lipid designs leading to the development of novel fluorescently labeled lipids with selective affinity for the L o phase.
A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics
Tang, Haixu; Li, Sujun; Ye, Yuzhen
2016-01-01
Metaproteomic studies adopt the common bottom-up proteomics approach to investigate the protein composition and the dynamics of protein expression in microbial communities. When matched metagenomic and/or metatranscriptomic data of the microbial communities are available, metaproteomic data analyses often employ a metagenome-guided approach, in which complete or fragmental protein-coding genes are first directly predicted from metagenomic (and/or metatranscriptomic) sequences or from their assemblies, and the resulting protein sequences are then used as the reference database for peptide/protein identification from MS/MS spectra. This approach is often limited because protein coding genes predicted from metagenomes are incomplete and fragmental. In this paper, we present a graph-centric approach to improving metagenome-guided peptide and protein identification in metaproteomics. Our method exploits the de Bruijn graph structure reported by metagenome assembly algorithms to generate a comprehensive database of protein sequences encoded in the community. We tested our method using several public metaproteomic datasets with matched metagenomic and metatranscriptomic sequencing data acquired from complex microbial communities in a biological wastewater treatment plant. The results showed that many more peptides and proteins can be identified when assembly graphs were utilized, improving the characterization of the proteins expressed in the microbial communities. The additional proteins we identified contribute to the characterization of important pathways such as those involved in degradation of chemical hazards. Our tools are released as open-source software on github at https://github.com/COL-IU/Graph2Pro. PMID:27918579
A Hybrid Task Graph Scheduler for High Performance Image Processing Workflows.
Blattner, Timothy; Keyrouz, Walid; Bhattacharyya, Shuvra S; Halem, Milton; Brady, Mary
2017-12-01
Designing applications for scalability is key to improving their performance in hybrid and cluster computing. Scheduling code to utilize parallelism is difficult, particularly when dealing with data dependencies, memory management, data motion, and processor occupancy. The Hybrid Task Graph Scheduler (HTGS) improves programmer productivity when implementing hybrid workflows for multi-core and multi-GPU systems. The Hybrid Task Graph Scheduler (HTGS) is an abstract execution model, framework, and API that increases programmer productivity when implementing hybrid workflows for such systems. HTGS manages dependencies between tasks, represents CPU and GPU memories independently, overlaps computations with disk I/O and memory transfers, keeps multiple GPUs occupied, and uses all available compute resources. Through these abstractions, data motion and memory are explicit; this makes data locality decisions more accessible. To demonstrate the HTGS application program interface (API), we present implementations of two example algorithms: (1) a matrix multiplication that shows how easily task graphs can be used; and (2) a hybrid implementation of microscopy image stitching that reduces code size by ≈ 43% compared to a manually coded hybrid workflow implementation and showcases the minimal overhead of task graphs in HTGS. Both of the HTGS-based implementations show good performance. In image stitching the HTGS implementation achieves similar performance to the hybrid workflow implementation. Matrix multiplication with HTGS achieves 1.3× and 1.8× speedup over the multi-threaded OpenBLAS library for 16k × 16k and 32k × 32k size matrices, respectively.
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
NASA Technical Reports Server (NTRS)
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Yadav, Umesh P.; Ayre, Brian G.; Bush, Daniel R.
2015-04-22
The principal components of plant productivity and nutritional value, from the standpoint of modern agriculture, are the acquisition and partitioning of organic carbon (C) and nitrogen (N) compounds among the various organs of the plant. The flow of essential organic nutrients among the plant organ systems is mediated by its complex vascular system, and is driven by a series of transport steps including export from sites of primary assimilation, transport into and out of the phloem and xylem, and transport into the various import-dependent organs. Manipulating C and N partitioning to enhance yield of harvested organs is evident in themore » earliest crop domestication events and continues to be a goal for modern plant biology. Research on the biochemistry, molecular and cellular biology, and physiology of C and N partitioning has now matured to an extent that strategic manipulation of these transport systems through biotechnology are being attempted to improve movement from source to sink tissues in general, but also to target partitioning to specific organs. These nascent efforts are demonstrating the potential of applied biomass targeting but are also identifying interactions between essential nutrients that require further basic research. In this review, we summarize the key transport steps involved in C and N partitioning, and discuss various transgenic approaches for directly manipulating key C and N transporters involved. In addition, we propose several experiments that could enhance biomass accumulation in targeted organs while simultaneously testing current partitioning models.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yadav, Umesh P.; Ayre, Brian G.; Bush, Daniel R.
The principal components of plant productivity and nutritional value, from the standpoint of modern agriculture, are the acquisition and partitioning of organic carbon (C) and nitrogen (N) compounds among the various organs of the plant. The flow of essential organic nutrients among the plant organ systems is mediated by its complex vascular system, and is driven by a series of transport steps including export from sites of primary assimilation, transport into and out of the phloem and xylem, and transport into the various import-dependent organs. Manipulating C and N partitioning to enhance yield of harvested organs is evident in themore » earliest crop domestication events and continues to be a goal for modern plant biology. Research on the biochemistry, molecular and cellular biology, and physiology of C and N partitioning has now matured to an extent that strategic manipulation of these transport systems through biotechnology are being attempted to improve movement from source to sink tissues in general, but also to target partitioning to specific organs. These nascent efforts are demonstrating the potential of applied biomass targeting but are also identifying interactions between essential nutrients that require further basic research. In this review, we summarize the key transport steps involved in C and N partitioning, and discuss various transgenic approaches for directly manipulating key C and N transporters involved. In addition, we propose several experiments that could enhance biomass accumulation in targeted organs while simultaneously testing current partitioning models.« less
Partitioning of polar and non-polar neutral organic chemicals into human and cow milk.
Geisler, Anett; Endo, Satoshi; Goss, Kai-Uwe
2011-10-01
The aim of this work was to develop a predictive model for milk/water partition coefficients of neutral organic compounds. Batch experiments were performed for 119 diverse organic chemicals in human milk and raw and processed cow milk at 37°C. No differences (<0.3 log units) in the partition coefficients of these types of milk were observed. The polyparameter linear free energy relationship model fit the calibration data well (SD=0.22 log units). An experimental validation data set including hormones and hormone active compounds was predicted satisfactorily by the model. An alternative modelling approach based on log K(ow) revealed a poorer performance. The model presented here provides a significant improvement in predicting enrichment of potentially hazardous chemicals in milk. In combination with physiologically based pharmacokinetic modelling this improvement in the estimation of milk/water partitioning coefficients may allow a better risk assessment for a wide range of neutral organic chemicals. Copyright © 2011 Elsevier Ltd. All rights reserved.
Mäenpää, Kimmo; Leppänen, Matti T; Figueiredo, Kaisa; Tigistu-Sahle, Feven; Käkelä, Reijo
2015-01-01
Knowledge on the internal distribution of halogenated organic chemicals (HOCs) would improve our understanding of dose-effect relationships and subsequently improve risk assessment of contaminated sites. Herein, we determine the concentrations of HOCs based on equilibrium partitioning in storage lipids, membrane lipids, and proteins in field-contaminated fish using equilibrium sampling devices. The study shows the importance of protein as a sorptive phase in lean fish. Our results provide a basis for using species-specific equilibrium partitioning coefficients between sorptive tissues and fish internal water as a substitute for K(ow) in, for example, upgrading models that simulate food-chain accumulation of the chemical.
A graph-theoretic method to quantify the airline route authority
NASA Technical Reports Server (NTRS)
Chan, Y.
1979-01-01
The paper introduces a graph-theoretic method to quantify the legal statements in route certificate which specifies the airline routing restrictions. All the authorized nonstop and multistop routes, including the shortest time routes, can be obtained, and the method suggests profitable route structure alternatives to airline analysts. This method to quantify the C.A.B. route authority was programmed in a software package, Route Improvement Synthesis and Evaluation, and demonstrated in a case study with a commercial airline. The study showed the utility of this technique in suggesting route alternatives and the possibility of improvements in the U.S. route system.
Geometry of complex networks and topological centrality
NASA Astrophysics Data System (ADS)
Ranjan, Gyan; Zhang, Zhi-Li
2013-09-01
We explore the geometry of complex networks in terms of an n-dimensional Euclidean embedding represented by the Moore-Penrose pseudo-inverse of the graph Laplacian (L). The squared distance of a node i to the origin in this n-dimensional space (lii+), yields a topological centrality index, defined as C∗(i)=1/lii+. In turn, the sum of reciprocals of individual node centralities, ∑i1/C∗(i)=∑ilii+, or the trace of L, yields the well-known Kirchhoff index (K), an overall structural descriptor for the network. To put into context this geometric definition of centrality, we provide alternative interpretations of the proposed indices that connect them to meaningful topological characteristics - first, as forced detour overheads and frequency of recurrences in random walks that has an interesting analogy to voltage distributions in the equivalent electrical network; and then as the average connectedness of i in all the bi-partitions of the graph. These interpretations respectively help establish the topological centrality (C∗(i)) of node i as a measure of its overall position as well as its overall connectedness in the network; thus reflecting the robustness of i to random multiple edge failures. Through empirical evaluations using synthetic and real world networks, we demonstrate how the topological centrality is better able to distinguish nodes in terms of their structural roles in the network and, along with Kirchhoff index, is appropriately sensitive to perturbations/re-wirings in the network.
Exploration maturity key to ranking search areas
Attanasi, E.D.; Freeman, P.A.
2008-01-01
The study area of US Geological Survey Circular 1288, the world outside the US and Canada, was partitioned into 44 countries and country groups. Map figures such as Fig. 2 and graphs similar to Figs. 3 and 4 provide a visual summary of maturity of oil and gas exploration. From 1992 through 2001, exploration data show that in the study area the delineated prospective area expanded at a rate of about 50,000 sq miles/year, while the explored area grew at a rate of 11,000 sq miles/year. The delineated prospective area established by 1970 accounts for less than 40% of total delineated prospective area but contains 75% of the oil discovered to date in the study area. From 1991 through 2000, offshore discoveries accounted for 59% of the oil and 77% of the gas discovered in the study area.
Tripartite community structure in social bookmarking data
NASA Astrophysics Data System (ADS)
Neubauer, Nicolas; Obermayer, Klaus
2011-12-01
Community detection is a branch of network analysis concerned with identifying strongly connected subnetworks. Social bookmarking sites aggregate datasets of often hundreds of millions of triples (document, user, and tag), which, when interpreted as edges of a graph, give rise to special networks called 3-partite, 3-uniform hypergraphs. We identify challenges and opportunities of generalizing community detection and in particular modularity optimization to these structures. Two methods for community detection are introduced that preserve the hypergraph's special structure to different degrees. Their performance is compared on synthetic datasets, showing the benefits of structure preservation. Furthermore, a tool for interactive exploration of the community detection results is introduced and applied to examples from real datasets. We find additional evidence for the importance of structure preservation and, more generally, demonstrate how tripartite community detection can help understand the structure of social bookmarking data.
Transport in Dynamical Astronomy and Multibody Problems
NASA Astrophysics Data System (ADS)
Dellnitz, Michael; Junge, Oliver; Koon, Wang Sang; Lekien, Francois; Lo, Martin W.; Marsden, Jerrold E.; Padberg, Kathrin; Preis, Robert; Ross, Shane D.; Thiere, Bianca
We combine the techniques of almost invariant sets (using tree structured box elimination and graph partitioning algorithms) with invariant manifold and lobe dynamics techniques. The result is a new computational technique for computing key dynamical features, including almost invariant sets, resonance regions as well as transport rates and bottlenecks between regions in dynamical systems. This methodology can be applied to a variety of multibody problems, including those in molecular modeling, chemical reaction rates and dynamical astronomy. In this paper we focus on problems in dynamical astronomy to illustrate the power of the combination of these different numerical tools and their applicability. In particular, we compute transport rates between two resonance regions for the three-body system consisting of the Sun, Jupiter and a third body (such as an asteroid). These resonance regions are appropriate for certain comets and asteroids.
Arefin, Ahmed Shamsul; Vimieiro, Renato; Riveros, Carlos; Craig, Hugh; Moscato, Pablo
2014-01-01
In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed. PMID:25347727
Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita
2016-04-01
Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k(') rounds of MST (k(')-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from k(')-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
Song, Qi; Chen, Mingqing; Bai, Junjie; Sonka, Milan; Wu, Xiaodong
2011-01-01
Multi-object segmentation with mutual interaction is a challenging task in medical image analysis. We report a novel solution to a segmentation problem, in which target objects of arbitrary shape mutually interact with terrain-like surfaces, which widely exists in the medical imaging field. The approach incorporates context information used during simultaneous segmentation of multiple objects. The object-surface interaction information is encoded by adding weighted inter-graph arcs to our graph model. A globally optimal solution is achieved by solving a single maximum flow problem in a low-order polynomial time. The performance of the method was evaluated in robust delineation of lung tumors in megavoltage cone-beam CT images in comparison with an expert-defined independent standard. The evaluation showed that our method generated highly accurate tumor segmentations. Compared with the conventional graph-cut method, our new approach provided significantly better results (p < 0.001). The Dice coefficient obtained by the conventional graph-cut approach (0.76 +/- 0.10) was improved to 0.84 +/- 0.05 when employing our new method for pulmonary tumor segmentation.
Danaci, Hasan Fehmi; Cetin-Atalay, Rengul; Atalay, Volkan
2018-03-26
Visualizing large-scale data produced by the high throughput experiments as a biological graph leads to better understanding and analysis. This study describes a customized force-directed layout algorithm, EClerize, for biological graphs that represent pathways in which the nodes are associated with Enzyme Commission (EC) attributes. The nodes with the same EC class numbers are treated as members of the same cluster. Positions of nodes are then determined based on both the biological similarity and the connection structure. EClerize minimizes the intra-cluster distance, that is the distance between the nodes of the same EC cluster and maximizes the inter-cluster distance, that is the distance between two distinct EC clusters. EClerize is tested on a number of biological pathways and the improvement brought in is presented with respect to the original algorithm. EClerize is available as a plug-in to cytoscape ( http://apps.cytoscape.org/apps/eclerize ).
A Coding Method for Efficient Subgraph Querying on Vertex- and Edge-Labeled Graphs
Zhu, Lei; Song, Qinbao; Guo, Yuchen; Du, Lei; Zhu, Xiaoyan; Wang, Guangtao
2014-01-01
Labeled graphs are widely used to model complex data in many domains, so subgraph querying has been attracting more and more attention from researchers around the world. Unfortunately, subgraph querying is very time consuming since it involves subgraph isomorphism testing that is known to be an NP-complete problem. In this paper, we propose a novel coding method for subgraph querying that is based on Laplacian spectrum and the number of walks. Our method follows the filtering-and-verification framework and works well on graph databases with frequent updates. We also propose novel two-step filtering conditions that can filter out most false positives and prove that the two-step filtering conditions satisfy the no-false-negative requirement (no dismissal in answers). Extensive experiments on both real and synthetic graphs show that, compared with six existing counterpart methods, our method can effectively improve the efficiency of subgraph querying. PMID:24853266
Plan-graph Based Heuristics for Conformant Probabilistic Planning
NASA Technical Reports Server (NTRS)
Ramakrishnan, Salesh; Pollack, Martha E.; Smith, David E.
2004-01-01
In this paper, we introduce plan-graph based heuristics to solve a variation of the conformant probabilistic planning (CPP) problem. In many real-world problems, it is the case that the sensors are unreliable or take too many resources to provide knowledge about the environment. These domains are better modeled as conformant planning problems. POMDP based techniques are currently the most successful approach for solving CPP but have the limitation of state- space explosion. Recent advances in deterministic and conformant planning have shown that plan-graphs can be used to enhance the performance significantly. We show that this enhancement can also be translated to CPP. We describe our process for developing the plan-graph heuristics and estimating the probability of a partial plan. We compare the performance of our planner PVHPOP when used with different heuristics. We also perform a comparison with a POMDP solver to show over a order of magnitude improvement in performance.
Proximity Networks and Epidemics
NASA Astrophysics Data System (ADS)
Guclu, Hasan; Toroczkai, Zoltán
2007-03-01
We presented the basis of a framework to account for the dynamics of contacts in epidemic processes, through the notion of dynamic proximity graphs. By varying the integration time-parameter T, which is the period of infectivity one can give a simple account for some of the differences in the observed contact networks for different diseases, such as smallpox, or AIDS. Our simplistic model also seems to shed some light on the shape of the degree distribution of the measured people-people contact network from the EPISIM data. We certainly do not claim that the simplistic graph integration model above is a good model for dynamic contact graphs. It only contains the essential ingredients for such processes to produce a qualitative agreement with some observations. We expect that further refinements and extensions to this picture, in particular deriving the link-probabilities in the dynamic proximity graph from more realistic contact dynamics should improve the agreement between models and data.
Han, Liang-Feng; Plummer, Niel; Aggarwal, Pradeep
2012-01-01
A graphical method is described for identifying geochemical reactions needed in the interpretation of radiocarbon age in groundwater systems. Graphs are constructed by plotting the measured 14C, δ13C, and concentration of dissolved inorganic carbon and are interpreted according to specific criteria to recognize water samples that are consistent with a wide range of processes, including geochemical reactions, carbon isotopic exchange, 14C decay, and mixing of waters. The graphs are used to provide a qualitative estimate of radiocarbon age, to deduce the hydrochemical complexity of a groundwater system, and to compare samples from different groundwater systems. Graphs of chemical and isotopic data from a series of previously-published groundwater studies are used to demonstrate the utility of the approach. Ultimately, the information derived from the graphs is used to improve geochemical models for adjustment of radiocarbon ages in groundwater systems.
Visual Exploratory Search of Relationship Graphs on Smartphones
Ouyang, Jianquan; Zheng, Hao; Kong, Fanbin; Liu, Tianming
2013-01-01
This paper presents a novel framework for Visual Exploratory Search of Relationship Graphs on Smartphones (VESRGS) that is composed of three major components: inference and representation of semantic relationship graphs on the Web via meta-search, visual exploratory search of relationship graphs through both querying and browsing strategies, and human-computer interactions via the multi-touch interface and mobile Internet on smartphones. In comparison with traditional lookup search methodologies, the proposed VESRGS system is characterized with the following perceived advantages. 1) It infers rich semantic relationships between the querying keywords and other related concepts from large-scale meta-search results from Google, Yahoo! and Bing search engines, and represents semantic relationships via graphs; 2) the exploratory search approach empowers users to naturally and effectively explore, adventure and discover knowledge in a rich information world of interlinked relationship graphs in a personalized fashion; 3) it effectively takes the advantages of smartphones’ user-friendly interfaces and ubiquitous Internet connection and portability. Our extensive experimental results have demonstrated that the VESRGS framework can significantly improve the users’ capability of seeking the most relevant relationship information to their own specific needs. We envision that the VESRGS framework can be a starting point for future exploration of novel, effective search strategies in the mobile Internet era. PMID:24223936
SAGE: String-overlap Assembly of GEnomes.
Ilie, Lucian; Haider, Bahlul; Molnar, Michael; Solis-Oba, Roberto
2014-09-15
De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of significant amount of work in this area, better solutions are still very much needed. We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers. SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy counts estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state-of-the-art in an essential area of research in genomics.
Scaling Up Graph-Based Semisupervised Learning via Prototype Vector Machines
Zhang, Kai; Lan, Liang; Kwok, James T.; Vucetic, Slobodan; Parvin, Bahram
2014-01-01
When the amount of labeled data are limited, semi-supervised learning can improve the learner's performance by also using the often easily available unlabeled data. In particular, a popular approach requires the learned function to be smooth on the underlying data manifold. By approximating this manifold as a weighted graph, such graph-based techniques can often achieve state-of-the-art performance. However, their high time and space complexities make them less attractive on large data sets. In this paper, we propose to scale up graph-based semisupervised learning using a set of sparse prototypes derived from the data. These prototypes serve as a small set of data representatives, which can be used to approximate the graph-based regularizer and to control model complexity. Consequently, both training and testing become much more efficient. Moreover, when the Gaussian kernel is used to define the graph affinity, a simple and principled method to select the prototypes can be obtained. Experiments on a number of real-world data sets demonstrate encouraging performance and scaling properties of the proposed approach. It also compares favorably with models learned via ℓ1-regularization at the same level of model sparsity. These results demonstrate the efficacy of the proposed approach in producing highly parsimonious and accurate models for semisupervised learning. PMID:25720002
NASA Astrophysics Data System (ADS)
Bornyakov, V. G.; Boyda, D. L.; Goy, V. A.; Molochkov, A. V.; Nakamura, Atsushi; Nikolaev, A. A.; Zakharov, V. I.
2017-05-01
We propose and test a new approach to computation of canonical partition functions in lattice QCD at finite density. We suggest a few steps procedure. We first compute numerically the quark number density for imaginary chemical potential i μq I . Then we restore the grand canonical partition function for imaginary chemical potential using the fitting procedure for the quark number density. Finally we compute the canonical partition functions using high precision numerical Fourier transformation. Additionally we compute the canonical partition functions using the known method of the hopping parameter expansion and compare results obtained by two methods in the deconfining as well as in the confining phases. The agreement between two methods indicates the validity of the new method. Our numerical results are obtained in two flavor lattice QCD with clover improved Wilson fermions.
Optimal Multiple Surface Segmentation With Shape and Context Priors
Bai, Junjie; Garvin, Mona K.; Sonka, Milan; Buatti, John M.; Wu, Xiaodong
2014-01-01
Segmentation of multiple surfaces in medical images is a challenging problem, further complicated by the frequent presence of weak boundary evidence, large object deformations, and mutual influence between adjacent objects. This paper reports a novel approach to multi-object segmentation that incorporates both shape and context prior knowledge in a 3-D graph-theoretic framework to help overcome the stated challenges. We employ an arc-based graph representation to incorporate a wide spectrum of prior information through pair-wise energy terms. In particular, a shape-prior term is used to penalize local shape changes and a context-prior term is used to penalize local surface-distance changes from a model of the expected shape and surface distances, respectively. The globally optimal solution for multiple surfaces is obtained by computing a maximum flow in a low-order polynomial time. The proposed method was validated on intraretinal layer segmentation of optical coherence tomography images and demonstrated statistically significant improvement of segmentation accuracy compared to our earlier graph-search method that was not utilizing shape and context priors. The mean unsigned surface positioning errors obtained by the conventional graph-search approach (6.30 ± 1.58 μm) was improved to 5.14 ± 0.99 μm when employing our new method with shape and context priors. PMID:23193309
Improving the Accuracy of Attribute Extraction using the Relatedness between Attribute Values
NASA Astrophysics Data System (ADS)
Bollegala, Danushka; Tani, Naoki; Ishizuka, Mitsuru
Extracting attribute-values related to entities from web texts is an important step in numerous web related tasks such as information retrieval, information extraction, and entity disambiguation (namesake disambiguation). For example, for a search query that contains a personal name, we can not only return documents that contain that personal name, but if we have attribute-values such as the organization for which that person works, we can also suggest documents that contain information related to that organization, thereby improving the user's search experience. Despite numerous potential applications of attribute extraction, it remains a challenging task due to the inherent noise in web data -- often a single web page contains multiple entities and attributes. We propose a graph-based approach to select the correct attribute-values from a set of candidate attribute-values extracted for a particular entity. First, we build an undirected weighted graph in which, attribute-values are represented by nodes, and the edge that connects two nodes in the graph represents the degree of relatedness between the corresponding attribute-values. Next, we find the maximum spanning tree of this graph that connects exactly one attribute-value for each attribute-type. The proposed method outperforms previously proposed attribute extraction methods on a dataset that contains 5000 web pages.
Unsupervised Metric Fusion Over Multiview Data by Graph Random Walk-Based Cross-View Diffusion.
Wang, Yang; Zhang, Wenjie; Wu, Lin; Lin, Xuemin; Zhao, Xiang
2017-01-01
Learning an ideal metric is crucial to many tasks in computer vision. Diverse feature representations may combat this problem from different aspects; as visual data objects described by multiple features can be decomposed into multiple views, thus often provide complementary information. In this paper, we propose a cross-view fusion algorithm that leads to a similarity metric for multiview data by systematically fusing multiple similarity measures. Unlike existing paradigms, we focus on learning distance measure by exploiting a graph structure of data samples, where an input similarity matrix can be improved through a propagation of graph random walk. In particular, we construct multiple graphs with each one corresponding to an individual view, and a cross-view fusion approach based on graph random walk is presented to derive an optimal distance measure by fusing multiple metrics. Our method is scalable to a large amount of data by enforcing sparsity through an anchor graph representation. To adaptively control the effects of different views, we dynamically learn view-specific coefficients, which are leveraged into graph random walk to balance multiviews. However, such a strategy may lead to an over-smooth similarity metric where affinities between dissimilar samples may be enlarged by excessively conducting cross-view fusion. Thus, we figure out a heuristic approach to controlling the iteration number in the fusion process in order to avoid over smoothness. Extensive experiments conducted on real-world data sets validate the effectiveness and efficiency of our approach.
Surveillance system and method having an operating mode partitioned fault classification model
NASA Technical Reports Server (NTRS)
Bickford, Randall L. (Inventor)
2005-01-01
A system and method which partitions a parameter estimation model, a fault detection model, and a fault classification model for a process surveillance scheme into two or more coordinated submodels together providing improved diagnostic decision making for at least one determined operating mode of an asset.
Zhang, Feng; Liao, Xiangke; Peng, Shaoliang; Cui, Yingbo; Wang, Bingqiang; Zhu, Xiaoqian; Liu, Jie
2016-06-01
' The de novo assembly of DNA sequences is increasingly important for biological researches in the genomic era. After more than one decade since the Human Genome Project, some challenges still exist and new solutions are being explored to improve de novo assembly of genomes. String graph assembler (SGA), based on the string graph theory, is a new method/tool developed to address the challenges. In this paper, based on an in-depth analysis of SGA we prove that the SGA-based sequence de novo assembly is an NP-complete problem. According to our analysis, SGA outperforms other similar methods/tools in memory consumption, but costs much more time, of which 60-70 % is spent on the index construction. Upon this analysis, we introduce a hybrid parallel optimization algorithm and implement this algorithm in the TianHe-2's parallel framework. Simulations are performed with different datasets. For data of small size the optimized solution is 3.06 times faster than before, and for data of middle size it's 1.60 times. The results demonstrate an evident performance improvement, with the linear scalability for parallel FM-index construction. This results thus contribute significantly to improving the efficiency of de novo assembly of DNA sequences.
A hierarchical graph neuron scheme for real-time pattern recognition.
Nasution, B B; Khan, A I
2008-02-01
The hierarchical graph neuron (HGN) implements a single cycle memorization and recall operation through a novel algorithmic design. The HGN is an improvement on the already published original graph neuron (GN) algorithm. In this improved approach, it recognizes incomplete/noisy patterns. It also resolves the crosstalk problem, which is identified in the previous publications, within closely matched patterns. To accomplish this, the HGN links multiple GN networks for filtering noise and crosstalk out of pattern data inputs. Intrinsically, the HGN is a lightweight in-network processing algorithm which does not require expensive floating point computations; hence, it is very suitable for real-time applications and tiny devices such as the wireless sensor networks. This paper describes that the HGN's pattern matching capability and the small response time remain insensitive to the increases in the number of stored patterns. Moreover, the HGN does not require definition of rules or setting of thresholds by the operator to achieve the desired results nor does it require heuristics entailing iterative operations for memorization and recall of patterns.
The Statistical Mechanics of Dilute, Disordered Systems
NASA Astrophysics Data System (ADS)
Blackburn, Roger Michael
Available from UMI in association with The British Library. Requires signed TDF. A graph partitioning problem with variable inter -partition costs is studied by exploiting its mapping on to the Ashkin-Teller spin glass. The cavity method is used to derive the TAP equations and free energy for both extensively connected and dilute systems. Unlike Ising and Potts spin glasses, the self-consistent equation for the distribution of effective fields does not have a solution solely made up of delta functions. Numerical integration is used to find the stable solution, from which the ground state energy is calculated. Simulated annealing is used to test the results. The retrieving activity distribution for networks of boolean functions trained as associative memories for optimal capacity is derived. For infinite networks, outputs are shown to be frozen, in contrast to dilute asymmetric networks trained with the Hebb rule. For finite networks, a steady leaking to the non-retrieving attractor is demonstrated. Simulations of quenched networks are reported which show a departure from this picture: some configurations remain frozen for all time, while others follow cycles of small periods. An estimate of the critical capacity from the simulations is found to be in broad agreement with recent analytical results. The existing theory is extended to include noise on recall, and the behaviour is found to be robust to noise up to order 1/c^2 for networks with connectivity c.
Hoggan, James L; Bae, Keonbeom; Kibbey, Tohren C G
2007-08-15
Trapped organic solvents, in both the vadose zone and below the water table, are frequent sources of environmental contamination. A common source of organic solvent contamination is spills, leaks, and improper solvent disposal associated with dry cleaning processes. Dry cleaning solvents, such as tetrachloroethylene (PCE), are typically enhanced with the addition of surfactants to improve cleaning performance. The objective of this work was to examine the partitioning behavior of surfactants from PCE in contact with water. The relative rates of surfactants partitioning and PCE dissolution are important for modeling the behavior of waste PCE in the subsurface, in that they influence the interfacial tension of the PCE, and how (or if) interfacial tension changes over time in the subsurface. The work described here uses a flow-through system to examine simultaneous partitioning and PCE dissolution in a porous medium. Results indicate that both nonylphenol ethoxylate nonionic surfactants and a sulfosuccinate anionic surfactant partition out of residual PCE much more rapidly than the PCE dissolves, suggesting that in many cases interfacial tension changes caused by partitioning may influence infiltration and distribution of PCE in the subsurface. Non-steady-state partitioning is found to be well-described by a linear driving force model incorporating measured surfactant partition coefficients.
Interference graph-based dynamic frequency reuse in optical attocell networks
NASA Astrophysics Data System (ADS)
Liu, Huanlin; Xia, Peijie; Chen, Yong; Wu, Lan
2017-11-01
Indoor optical attocell network may achieve higher capacity than radio frequency (RF) or Infrared (IR)-based wireless systems. It is proposed as a special type of visible light communication (VLC) system using Light Emitting Diodes (LEDs). However, the system spectral efficiency may be severely degraded owing to the inter-cell interference (ICI), particularly for dense deployment scenarios. To address these issues, we construct the spectral interference graph for indoor optical attocell network, and propose the Dynamic Frequency Reuse (DFR) and Weighted Dynamic Frequency Reuse (W-DFR) algorithms to decrease ICI and improve the spectral efficiency performance. The interference graph makes LEDs can transmit data without interference and select the minimum sub-bands needed for frequency reuse. Then, DFR algorithm reuses the system frequency equally across service-providing cells to mitigate spectrum interference. While W-DFR algorithm can reuse the system frequency by using the bandwidth weight (BW), which is defined based on the number of service users. Numerical results show that both of the proposed schemes can effectively improve the average spectral efficiency (ASE) of the system. Additionally, improvement of the user data rate is also obtained by analyzing its cumulative distribution function (CDF).
The Optimization of In-Memory Space Partitioning Trees for Cache Utilization
NASA Astrophysics Data System (ADS)
Yeo, Myung Ho; Min, Young Soo; Bok, Kyoung Soo; Yoo, Jae Soo
In this paper, a novel cache conscious indexing technique based on space partitioning trees is proposed. Many researchers investigated efficient cache conscious indexing techniques which improve retrieval performance of in-memory database management system recently. However, most studies considered data partitioning and targeted fast information retrieval. Existing data partitioning-based index structures significantly degrade performance due to the redundant accesses of overlapped spaces. Specially, R-tree-based index structures suffer from the propagation of MBR (Minimum Bounding Rectangle) information by updating data frequently. In this paper, we propose an in-memory space partitioning index structure for optimal cache utilization. The proposed index structure is compared with the existing index structures in terms of update performance, insertion performance and cache-utilization rate in a variety of environments. The results demonstrate that the proposed index structure offers better performance than existing index structures.
NASA Astrophysics Data System (ADS)
Bansal, Gaurav K.; Rajinikanth, V.; Ghosh, Chiradeep; Srivastava, V. C.; Kundu, S.; Ghosh Chowdhury, S.
2018-05-01
In the present investigation, an attempt has been made to stabilize austenite by carbon partitioning through quenching and nonisothermal partitioning (Q&P) technique. This will eliminate the need for additional heat-treatment facility to perform isothermal partitioning or tempering process. The presence of retained austenite in the microstructure helps in increasing the toughness, which in turn is expected to improve the abrasion resistance of steels. The carbon partitioning from different quench temperatures has been performed on two different alloys, with low-Si content (0.5 wt pct), in a salt bath furnace atmosphere, the cooling profile of which closely resembles the industrially produced hot-rolled coil cooling. The results show that the stabilization of retained austenite is possible and gives rise to increased work hardening, better impact toughness and abrasive wear loss comparable to that of a fully martensitic microstructure. In contrast, tempered martensite exhibits better wear properties at the expense of impact toughness.
Luchko, Tyler; Blinov, Nikolay; Limon, Garrett C; Joyce, Kevin P; Kovalenko, Andriy
2016-11-01
Implicit solvent methods for classical molecular modeling are frequently used to provide fast, physics-based hydration free energies of macromolecules. Less commonly considered is the transferability of these methods to other solvents. The Statistical Assessment of Modeling of Proteins and Ligands 5 (SAMPL5) distribution coefficient dataset and the accompanying explicit solvent partition coefficient reference calculations provide a direct test of solvent model transferability. Here we use the 3D reference interaction site model (3D-RISM) statistical-mechanical solvation theory, with a well tested water model and a new united atom cyclohexane model, to calculate partition coefficients for the SAMPL5 dataset. The cyclohexane model performed well in training and testing ([Formula: see text] for amino acid neutral side chain analogues) but only if a parameterized solvation free energy correction was used. In contrast, the same protocol, using single solute conformations, performed poorly on the SAMPL5 dataset, obtaining [Formula: see text] compared to the reference partition coefficients, likely due to the much larger solute sizes. Including solute conformational sampling through molecular dynamics coupled with 3D-RISM (MD/3D-RISM) improved agreement with the reference calculation to [Formula: see text]. Since our initial calculations only considered partition coefficients and not distribution coefficients, solute sampling provided little benefit comparing against experiment, where ionized and tautomer states are more important. Applying a simple [Formula: see text] correction improved agreement with experiment from [Formula: see text] to [Formula: see text], despite a small number of outliers. Better agreement is possible by accounting for tautomers and improving the ionization correction.
NASA Astrophysics Data System (ADS)
Luchko, Tyler; Blinov, Nikolay; Limon, Garrett C.; Joyce, Kevin P.; Kovalenko, Andriy
2016-11-01
Implicit solvent methods for classical molecular modeling are frequently used to provide fast, physics-based hydration free energies of macromolecules. Less commonly considered is the transferability of these methods to other solvents. The Statistical Assessment of Modeling of Proteins and Ligands 5 (SAMPL5) distribution coefficient dataset and the accompanying explicit solvent partition coefficient reference calculations provide a direct test of solvent model transferability. Here we use the 3D reference interaction site model (3D-RISM) statistical-mechanical solvation theory, with a well tested water model and a new united atom cyclohexane model, to calculate partition coefficients for the SAMPL5 dataset. The cyclohexane model performed well in training and testing (R=0.98 for amino acid neutral side chain analogues) but only if a parameterized solvation free energy correction was used. In contrast, the same protocol, using single solute conformations, performed poorly on the SAMPL5 dataset, obtaining R=0.73 compared to the reference partition coefficients, likely due to the much larger solute sizes. Including solute conformational sampling through molecular dynamics coupled with 3D-RISM (MD/3D-RISM) improved agreement with the reference calculation to R=0.93. Since our initial calculations only considered partition coefficients and not distribution coefficients, solute sampling provided little benefit comparing against experiment, where ionized and tautomer states are more important. Applying a simple pK_{ {a}} correction improved agreement with experiment from R=0.54 to R=0.66, despite a small number of outliers. Better agreement is possible by accounting for tautomers and improving the ionization correction.
Application of a Model for Quenching and Partitioning in Hot Stamping of High-Strength Steel
NASA Astrophysics Data System (ADS)
Zhu, Bin; Liu, Zhuang; Wang, Yanan; Rolfe, Bernard; Wang, Liang; Zhang, Yisheng
2018-04-01
Application of quenching and partitioning process in hot stamping has proven to be an effective method to improve the plasticity of advanced high-strength steels (AHSSs). In this study, the hot stamping and partitioning process of advanced high-strength steel 30CrMnSi2Nb is investigated with a hot stamping mold. Given the specific partitioning time and temperature, the influence of quenching temperature on the volume fraction of microstructure evolution and mechanical properties of the above steel are studied in detail. In addition, a model for quenching and partitioning process is applied to predict the carbon diffusion and interface migration during partitioning, which determines the retained austenite volume fraction and final properties of the part. The predicted trends of the retained austenite volume fraction agree with the experimental results. In both cases, the volume fraction of retained austenite increases first and then decreases with the increasing quenching temperature. The optimal quenching temperature is approximately 290 °C for 30CrMnSi2Nb with the partition conditions of 425 °C and 20 seconds. It is suggested that the model can be used to help determine the process parameters to obtain retained austenite as much as possible.
Shi, Yan; Wang, Hao Gang; Li, Long; Chan, Chi Hou
2008-10-01
A multilevel Green's function interpolation method based on two kinds of multilevel partitioning schemes--the quasi-2D and the hybrid partitioning scheme--is proposed for analyzing electromagnetic scattering from objects comprising both conducting and dielectric parts. The problem is formulated using the surface integral equation for homogeneous dielectric and conducting bodies. A quasi-2D multilevel partitioning scheme is devised to improve the efficiency of the Green's function interpolation. In contrast to previous multilevel partitioning schemes, noncubic groups are introduced to discretize the whole EM structure in this quasi-2D multilevel partitioning scheme. Based on the detailed analysis of the dimension of the group in this partitioning scheme, a hybrid quasi-2D/3D multilevel partitioning scheme is proposed to effectively handle objects with fine local structures. Selection criteria for some key parameters relating to the interpolation technique are given. The proposed algorithm is ideal for the solution of problems involving objects such as missiles, microstrip antenna arrays, photonic bandgap structures, etc. Numerical examples are presented to show that CPU time is between O(N) and O(N log N) while the computer memory requirement is O(N).
Villandre, Luc; Günthard, Huldrych F.; Kouyos, Roger; Stadler, Tanja
2016-01-01
Background Transmission patterns of sexually-transmitted infections (STIs) could relate to the structure of the underlying sexual contact network, whose features are therefore of interest to clinicians. Conventionally, we represent sexual contacts in a population with a graph, that can reveal the existence of communities. Phylogenetic methods help infer the history of an epidemic and incidentally, may help detecting communities. In particular, phylogenetic analyses of HIV-1 epidemics among men who have sex with men (MSM) have revealed the existence of large transmission clusters, possibly resulting from within-community transmissions. Past studies have explored the association between contact networks and phylogenies, including transmission clusters, producing conflicting conclusions about whether network features significantly affect observed transmission history. As far as we know however, none of them thoroughly investigated the role of communities, defined with respect to the network graph, in the observation of clusters. Methods The present study investigates, through simulations, community detection from phylogenies. We simulate a large number of epidemics over both unweighted and weighted, undirected random interconnected-islands networks, with islands corresponding to communities. We use weighting to modulate distance between islands. We translate each epidemic into a phylogeny, that lets us partition our samples of infected subjects into transmission clusters, based on several common definitions from the literature. We measure similarity between subjects’ island membership indices and transmission cluster membership indices with the adjusted Rand index. Results and Conclusion Analyses reveal modest mean correspondence between communities in graphs and phylogenetic transmission clusters. We conclude that common methods often have limited success in detecting contact network communities from phylogenies. The rarely-fulfilled requirement that network communities correspond to clades in the phylogeny is their main drawback. Understanding the link between transmission clusters and communities in sexual contact networks could help inform policymaking to curb HIV incidence in MSMs. PMID:26863322
Schaub, Michael T.; Delvenne, Jean-Charles; Yaliraki, Sophia N.; Barahona, Mauricio
2012-01-01
In recent years, there has been a surge of interest in community detection algorithms for complex networks. A variety of computational heuristics, some with a long history, have been proposed for the identification of communities or, alternatively, of good graph partitions. In most cases, the algorithms maximize a particular objective function, thereby finding the ‘right’ split into communities. Although a thorough comparison of algorithms is still lacking, there has been an effort to design benchmarks, i.e., random graph models with known community structure against which algorithms can be evaluated. However, popular community detection methods and benchmarks normally assume an implicit notion of community based on clique-like subgraphs, a form of community structure that is not always characteristic of real networks. Specifically, networks that emerge from geometric constraints can have natural non clique-like substructures with large effective diameters, which can be interpreted as long-range communities. In this work, we show that long-range communities escape detection by popular methods, which are blinded by a restricted ‘field-of-view’ limit, an intrinsic upper scale on the communities they can detect. The field-of-view limit means that long-range communities tend to be overpartitioned. We show how by adopting a dynamical perspective towards community detection [1], [2], in which the evolution of a Markov process on the graph is used as a zooming lens over the structure of the network at all scales, one can detect both clique- or non clique-like communities without imposing an upper scale to the detection. Consequently, the performance of algorithms on inherently low-diameter, clique-like benchmarks may not always be indicative of equally good results in real networks with local, sparser connectivity. We illustrate our ideas with constructive examples and through the analysis of real-world networks from imaging, protein structures and the power grid, where a multiscale structure of non clique-like communities is revealed. PMID:22384178
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chin, George; Marquez, Andres; Choudhury, Sutanay
2012-09-01
Triadic analysis encompasses a useful set of graph mining methods that is centered on the concept of a triad, which is a subgraph of three nodes and the configuration of directed edges across the nodes. Such methods are often applied in the social sciences as well as many other diverse fields. Triadic methods commonly operate on a triad census that counts the number of triads of every possible edge configuration in a graph. Like other graph algorithms, triadic census algorithms do not scale well when graphs reach tens of millions to billions of nodes. To enable the triadic analysis ofmore » large-scale graphs, we developed and optimized a triad census algorithm to efficiently execute on shared memory architectures. We will retrace the development and evolution of a parallel triad census algorithm. Over the course of several versions, we continually adapted the code’s data structures and program logic to expose more opportunities to exploit parallelism on shared memory that would translate into improved computational performance. We will recall the critical steps and modifications that occurred during code development and optimization. Furthermore, we will compare the performances of triad census algorithm versions on three specific systems: Cray XMT, HP Superdome, and AMD multi-core NUMA machine. These three systems have shared memory architectures but with markedly different hardware capabilities to manage parallelism.« less
Analyzing cross-college course enrollments via contextual graph mining
Liu, Xiaozhong; Chen, Yan
2017-01-01
The ability to predict what courses a student may enroll in the coming semester plays a pivotal role in the allocation of learning resources, which is a hot topic in the domain of educational data mining. In this study, we propose an innovative approach to characterize students’ cross-college course enrollments by leveraging a novel contextual graph. Specifically, different kinds of variables, such as students, courses, colleges and diplomas, as well as various types of variable relations, are utilized to depict the context of each variable, and then a representation learning algorithm node2vec is applied to extracting sophisticated graph-based features for the enrollment analysis. In this manner, the relations between any pair of variables can be measured quantitatively, which enables the variable type to transform from nominal to ratio. These graph-based features are examined by the random forest algorithm, and experiments on 24,663 students, 1,674 courses and 417,590 enrollment records demonstrate that the contextual graph can successfully improve analyzing the cross-college course enrollments, where three of the graph-based features have significantly stronger impacts on prediction accuracy than the others. Besides, the empirical results also indicate that the student’s course preference is the most important factor in predicting future course enrollments, which is consistent to the previous studies that acknowledge the course interest is a key point for course recommendations. PMID:29186171
Evaluating approaches to find exon chains based on long reads.
Kuosmanen, Anna; Norri, Tuukka; Mäkinen, Veli
2018-05-01
Transcript prediction can be modeled as a graph problem where exons are modeled as nodes and reads spanning two or more exons are modeled as exon chains. Pacific Biosciences third-generation sequencing technology produces significantly longer reads than earlier second-generation sequencing technologies, which gives valuable information about longer exon chains in a graph. However, with the high error rates of third-generation sequencing, aligning long reads correctly around the splice sites is a challenging task. Incorrect alignments lead to spurious nodes and arcs in the graph, which in turn lead to incorrect transcript predictions. We survey several approaches to find the exon chains corresponding to long reads in a splicing graph, and experimentally study the performance of these methods using simulated data to allow for sensitivity/precision analysis. Our experiments show that short reads from second-generation sequencing can be used to significantly improve exon chain correctness either by error-correcting the long reads before splicing graph creation, or by using them to create a splicing graph on which the long-read alignments are then projected. We also study the memory and time consumption of various modules, and show that accurate exon chains lead to significantly increased transcript prediction accuracy. The simulated data and in-house scripts used for this article are available at http://www.cs.helsinki.fi/group/gsa/exon-chains/exon-chains-bib.tar.bz2.
Graph-Based Semi-Supervised Hyperspectral Image Classification Using Spatial Information
NASA Astrophysics Data System (ADS)
Jamshidpour, N.; Homayouni, S.; Safari, A.
2017-09-01
Hyperspectral image classification has been one of the most popular research areas in the remote sensing community in the past decades. However, there are still some problems that need specific attentions. For example, the lack of enough labeled samples and the high dimensionality problem are two most important issues which degrade the performance of supervised classification dramatically. The main idea of semi-supervised learning is to overcome these issues by the contribution of unlabeled samples, which are available in an enormous amount. In this paper, we propose a graph-based semi-supervised classification method, which uses both spectral and spatial information for hyperspectral image classification. More specifically, two graphs were designed and constructed in order to exploit the relationship among pixels in spectral and spatial spaces respectively. Then, the Laplacians of both graphs were merged to form a weighted joint graph. The experiments were carried out on two different benchmark hyperspectral data sets. The proposed method performed significantly better than the well-known supervised classification methods, such as SVM. The assessments consisted of both accuracy and homogeneity analyses of the produced classification maps. The proposed spectral-spatial SSL method considerably increased the classification accuracy when the labeled training data set is too scarce.When there were only five labeled samples for each class, the performance improved 5.92% and 10.76% compared to spatial graph-based SSL, for AVIRIS Indian Pine and Pavia University data sets respectively.
Analyzing cross-college course enrollments via contextual graph mining.
Wang, Yongzhen; Liu, Xiaozhong; Chen, Yan
2017-01-01
The ability to predict what courses a student may enroll in the coming semester plays a pivotal role in the allocation of learning resources, which is a hot topic in the domain of educational data mining. In this study, we propose an innovative approach to characterize students' cross-college course enrollments by leveraging a novel contextual graph. Specifically, different kinds of variables, such as students, courses, colleges and diplomas, as well as various types of variable relations, are utilized to depict the context of each variable, and then a representation learning algorithm node2vec is applied to extracting sophisticated graph-based features for the enrollment analysis. In this manner, the relations between any pair of variables can be measured quantitatively, which enables the variable type to transform from nominal to ratio. These graph-based features are examined by the random forest algorithm, and experiments on 24,663 students, 1,674 courses and 417,590 enrollment records demonstrate that the contextual graph can successfully improve analyzing the cross-college course enrollments, where three of the graph-based features have significantly stronger impacts on prediction accuracy than the others. Besides, the empirical results also indicate that the student's course preference is the most important factor in predicting future course enrollments, which is consistent to the previous studies that acknowledge the course interest is a key point for course recommendations.
Teaching Physics with Basketball
NASA Astrophysics Data System (ADS)
Chanpichai, N.; Wattanakasiwich, P.
2010-07-01
Recently, technologies and computer takes important roles in learning and teaching, including physics. Advance in technologies can help us better relating physics taught in the classroom to the real world. In this study, we developed a module on teaching a projectile motion through shooting a basketball. Students learned about physics of projectile motion, and then they took videos of their classmates shooting a basketball by using the high speed camera. Then they analyzed videos by using Tracker, a video analysis and modeling tool. While working with Tracker, students learned about the relationships between three kinematics graphs. Moreover, they learned about a real projectile motion (with an air resistance) through modeling tools. Students' abilities to interpret kinematics graphs were investigated before and after the instruction by using the Test of Understanding Graphs in Kinematics (TUG-K). The maximum normalized gain or
Co-occurrence graphs for word sense disambiguation in the biomedical domain.
Duque, Andres; Stevenson, Mark; Martinez-Romo, Juan; Araujo, Lourdes
2018-05-01
Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining more than 10% of accuracy improvement in some cases, while only requiring minimal external resources. Copyright © 2018 Elsevier B.V. All rights reserved.
Evolutionary dynamics on graphs: Efficient method for weak selection
NASA Astrophysics Data System (ADS)
Fu, Feng; Wang, Long; Nowak, Martin A.; Hauert, Christoph
2009-04-01
Investigating the evolutionary dynamics of game theoretical interactions in populations where individuals are arranged on a graph can be challenging in terms of computation time. Here, we propose an efficient method to study any type of game on arbitrary graph structures for weak selection. In this limit, evolutionary game dynamics represents a first-order correction to neutral evolution. Spatial correlations can be empirically determined under neutral evolution and provide the basis for formulating the game dynamics as a discrete Markov process by incorporating a detailed description of the microscopic dynamics based on the neutral correlations. This framework is then applied to one of the most intriguing questions in evolutionary biology: the evolution of cooperation. We demonstrate that the degree heterogeneity of a graph impedes cooperation and that the success of tit for tat depends not only on the number of rounds but also on the degree of the graph. Moreover, considering the mutation-selection equilibrium shows that the symmetry of the stationary distribution of states under weak selection is skewed in favor of defectors for larger selection strengths. In particular, degree heterogeneity—a prominent feature of scale-free networks—generally results in a more pronounced increase in the critical benefit-to-cost ratio required for evolution to favor cooperation as compared to regular graphs. This conclusion is corroborated by an analysis of the effects of population structures on the fixation probabilities of strategies in general 2×2 games for different types of graphs. Computer simulations confirm the predictive power of our method and illustrate the improved accuracy as compared to previous studies.
An Improved Graph Model for Conflict Resolution Based on Option Prioritization and Its Application
Yin, Kedong; Li, Xuemei
2017-01-01
In order to quantitatively depict differences regarding the preferences of decision makers for different states, a score function is proposed. As a foundation, coalition motivation and real-coalition analysis are discussed when external circumstance or opportunity costs are considering. On the basis of a confidence-level function, we establish the score function using a “preference tree”. We not only measure the preference for each state, but we also build a collation improvement function to measure coalition motivation and to construct a coordinate system in which to analyze real-coalition stability. All of these developments enhance the applicability of the graph model for conflict resolution (GMCR). Finally, an improved GMCR is applied in the “Changzhou Conflict” to demonstrate how it can be conveniently utilized in practice. PMID:29077049
An Improved Graph Model for Conflict Resolution Based on Option Prioritization and Its Application.
Yin, Kedong; Yu, Li; Li, Xuemei
2017-10-27
In order to quantitatively depict differences regarding the preferences of decision makers for different states, a score function is proposed. As a foundation, coalition motivation and real-coalition analysis are discussed when external circumstance or opportunity costs are considering. On the basis of a confidence-level function, we establish the score function using a "preference tree". We not only measure the preference for each state, but we also build a collation improvement function to measure coalition motivation and to construct a coordinate system in which to analyze real-coalition stability. All of these developments enhance the applicability of the graph model for conflict resolution (GMCR). Finally, an improved GMCR is applied in the "Changzhou Conflict" to demonstrate how it can be conveniently utilized in practice.
Global and Local Partitioning of the Charge Transferred in the Parr-Pearson Model.
Orozco-Valencia, Angel Ulises; Gázquez, José L; Vela, Alberto
2017-05-25
Through a simple proposal, the charge transfer obtained from the cornerstone theory of Parr and Pearson is partitioned, for each reactant, in two channels: an electrophilic, through which the species accepts electrons, and the other, a nucleophilic, where the species donates electrons. It is shown that this global model allows us to determine unambiguously the charge-transfer mechanism prevailing in a given reaction. The partitioning is extended to include local effects through the Fukui functions of the reactants. This local model is applied to several emblematic reactions in organic and inorganic chemistry, and we show that besides improving the correlations obtained with the global model it provides valuable information concerning the atoms in the reactants playing the most important roles in the reaction and thus improving our understanding of the reaction under study.
A method for validating Rent's rule for technological and biological networks.
Alcalde Cuesta, Fernando; González Sequeiros, Pablo; Lozano Rojo, Álvaro
2017-07-14
Rent's rule is empirical power law introduced in an effort to describe and optimize the wiring complexity of computer logic graphs. It is known that brain and neuronal networks also obey Rent's rule, which is consistent with the idea that wiring costs play a fundamental role in brain evolution and development. Here we propose a method to validate this power law for a certain range of network partitions. This method is based on the bifurcation phenomenon that appears when the network is subjected to random alterations preserving its degree distribution. It has been tested on a set of VLSI circuits and real networks, including biological and technological ones. We also analyzed the effect of different types of random alterations on the Rentian scaling in order to test the influence of the degree distribution. There are network architectures quite sensitive to these randomization procedures with significant increases in the values of the Rent exponents.
Fault-Tolerant Self-Stabilizing Distributed Clock Synchronization Protocol for Arbitrary Digraphs
NASA Technical Reports Server (NTRS)
Malekpour, Mahyar R. (Inventor)
2014-01-01
A self-stabilizing network in the form of an arbitrary, non-partitioned digraph includes K nodes having a synchronizer executing a protocol. K-1 monitors of each node may receive a Sync message transmitted from a directly connected node. When the Sync message is received, the logical clock value for the receiving node is set to between 0 and a communication latency value (gamma) if the clock value is less than a minimum event-response delay (D). A new Sync message is also transmitted to any directly connected nodes if the clock value is greater than or equal to both D and a graph threshold (T(sub S)). When the Sync message is not received the synchronizer increments the clock value if the clock value is less than a resynchronization period (P), and resets the clock value and transmits a new Sync message to all directly connected nodes when the clock value equals or exceeds P.
QSPIN: A High Level Java API for Quantum Computing Experimentation
NASA Technical Reports Server (NTRS)
Barth, Tim
2017-01-01
QSPIN is a high level Java language API for experimentation in QC models used in the calculation of Ising spin glass ground states and related quadratic unconstrained binary optimization (QUBO) problems. The Java API is intended to facilitate research in advanced QC algorithms such as hybrid quantum-classical solvers, automatic selection of constraint and optimization parameters, and techniques for the correction and mitigation of model and solution errors. QSPIN includes high level solver objects tailored to the D-Wave quantum annealing architecture that implement hybrid quantum-classical algorithms [Booth et al.] for solving large problems on small quantum devices, elimination of variables via roof duality, and classical computing optimization methods such as GPU accelerated simulated annealing and tabu search for comparison. A test suite of documented NP-complete applications ranging from graph coloring, covering, and partitioning to integer programming and scheduling are provided to demonstrate current capabilities.
Domain decomposition methods in aerodynamics
NASA Technical Reports Server (NTRS)
Venkatakrishnan, V.; Saltz, Joel
1990-01-01
Compressible Euler equations are solved for two-dimensional problems by a preconditioned conjugate gradient-like technique. An approximate Riemann solver is used to compute the numerical fluxes to second order accuracy in space. Two ways to achieve parallelism are tested, one which makes use of parallelism inherent in triangular solves and the other which employs domain decomposition techniques. The vectorization/parallelism in triangular solves is realized by the use of a recording technique called wavefront ordering. This process involves the interpretation of the triangular matrix as a directed graph and the analysis of the data dependencies. It is noted that the factorization can also be done in parallel with the wave front ordering. The performances of two ways of partitioning the domain, strips and slabs, are compared. Results on Cray YMP are reported for an inviscid transonic test case. The performances of linear algebra kernels are also reported.
A new mathematical modelling based shape extraction technique for Forensic Odontology.
G, Jaffino; A, Banumathi; Gurunathan, Ulaganathan; B, Vijayakumari; J, Prabin Jose
2017-04-01
Forensic Odontology is a specific means for identifying a person in which deceased, and particularly in fatality incidents. The algorithm can be proposed to identify a person by comparing both postmortem (PM) and antemortem (AM) dental radiographs and photographs. This work aims to introduce a new mathematical algorithm for photographs in addition with radiographs. Isoperimetric graph partitioning method is used to extract the shape of dental images in forensic identification. Shape matching is done by comparing AM and PM dental images using both similarity and distance measures. Experimental results prove that the higher matching distance is observed by distance metric rather than similarity measures. The results of this algorithm show that a high hit rate is observed for distance based performance measures and it is well suited for forensic odontologist to identify a person. Copyright © 2017 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
Performance Enhancement Strategies for Multi-Block Overset Grid CFD Applications
NASA Technical Reports Server (NTRS)
Djomehri, M. Jahed; Biswas, Rupak
2003-01-01
The overset grid methodology has significantly reduced time-to-solution of highfidelity computational fluid dynamics (CFD) simulations about complex aerospace configurations. The solution process resolves the geometrical complexity of the problem domain by using separately generated but overlapping structured discretization grids that periodically exchange information through interpolation. However, high performance computations of such large-scale realistic applications must be handled efficiently on state-of-the-art parallel supercomputers. This paper analyzes the effects of various performance enhancement strategies on the parallel efficiency of an overset grid Navier-Stokes CFD application running on an SGI Origin2000 machinc. Specifically, the role of asynchronous communication, grid splitting, and grid grouping strategies are presented and discussed. Details of a sophisticated graph partitioning technique for grid grouping are also provided. Results indicate that performance depends critically on the level of latency hiding and the quality of load balancing across the processors.
A New Model for Optimal Mechanical and Thermal Performance of Cement-Based Partition Wall
Huang, Shiping; Hu, Mengyu; Cui, Nannan; Wang, Weifeng
2018-01-01
The prefabricated cement-based partition wall has been widely used in assembled buildings because of its high manufacturing efficiency, high-quality surface, and simple and convenient construction process. In this paper, a general porous partition wall that is made from cement-based materials was proposed to meet the optimal mechanical and thermal performance during transportation, construction and its service life. The porosity of the proposed partition wall is formed by elliptic-cylinder-type cavities. The finite element method was used to investigate the mechanical and thermal behaviour, which shows that the proposed model has distinct advantages over the current partition wall that is used in the building industry. It is found that, by controlling the eccentricity of the elliptic-cylinder cavities, the proposed wall stiffness can be adjusted to respond to the imposed loads and to improve the thermal performance, which can be used for the optimum design. Finally, design guidance is provided to obtain the optimal mechanical and thermal performance. The proposed model could be used as a promising candidate for partition wall in the building industry. PMID:29673176
NASA Technical Reports Server (NTRS)
Garg, Sanjay; Schmidt, Phillip H.
1993-01-01
A parameter optimization framework has earlier been developed to solve the problem of partitioning a centralized controller into a decentralized, hierarchical structure suitable for integrated flight/propulsion control implementation. This paper presents results from the application of the controller partitioning optimization procedure to IFPC design for a Short Take-Off and Vertical Landing (STOVL) aircraft in transition flight. The controller partitioning problem and the parameter optimization algorithm are briefly described. Insight is provided into choosing various 'user' selected parameters in the optimization cost function such that the resulting optimized subcontrollers will meet the characteristics of the centralized controller that are crucial to achieving the desired closed-loop performance and robustness, while maintaining the desired subcontroller structure constraints that are crucial for IFPC implementation. The optimization procedure is shown to improve upon the initial partitioned subcontrollers and lead to performance comparable to that achieved with the centralized controller. This application also provides insight into the issues that should be addressed at the centralized control design level in order to obtain implementable partitioned subcontrollers.
A New Model for Optimal Mechanical and Thermal Performance of Cement-Based Partition Wall.
Huang, Shiping; Hu, Mengyu; Huang, Yonghui; Cui, Nannan; Wang, Weifeng
2018-04-17
The prefabricated cement-based partition wall has been widely used in assembled buildings because of its high manufacturing efficiency, high-quality surface, and simple and convenient construction process. In this paper, a general porous partition wall that is made from cement-based materials was proposed to meet the optimal mechanical and thermal performance during transportation, construction and its service life. The porosity of the proposed partition wall is formed by elliptic-cylinder-type cavities. The finite element method was used to investigate the mechanical and thermal behaviour, which shows that the proposed model has distinct advantages over the current partition wall that is used in the building industry. It is found that, by controlling the eccentricity of the elliptic-cylinder cavities, the proposed wall stiffness can be adjusted to respond to the imposed loads and to improve the thermal performance, which can be used for the optimum design. Finally, design guidance is provided to obtain the optimal mechanical and thermal performance. The proposed model could be used as a promising candidate for partition wall in the building industry.
Graph Matching: Relax at Your Own Risk.
Lyzinski, Vince; Fishkind, Donniell E; Fiori, Marcelo; Vogelstein, Joshua T; Priebe, Carey E; Sapiro, Guillermo
2016-01-01
Graph matching-aligning a pair of graphs to minimize their edge disagreements-has received wide-spread attention from both theoretical and applied communities over the past several decades, including combinatorics, computer vision, and connectomics. Its attention can be partially attributed to its computational difficulty. Although many heuristics have previously been proposed in the literature to approximately solve graph matching, very few have any theoretical support for their performance. A common technique is to relax the discrete problem to a continuous problem, therefore enabling practitioners to bring gradient-descent-type algorithms to bear. We prove that an indefinite relaxation (when solved exactly) almost always discovers the optimal permutation, while a common convex relaxation almost always fails to discover the optimal permutation. These theoretical results suggest that initializing the indefinite algorithm with the convex optimum might yield improved practical performance. Indeed, experimental results illuminate and corroborate these theoretical findings, demonstrating that excellent results are achieved in both benchmark and real data problems by amalgamating the two approaches.
NASA Astrophysics Data System (ADS)
Palmeri, Anthony
This research project was developed to provide extensive practice and exposure to data collection and data representation in a high school science classroom. The student population engaged in this study included 40 high school sophomores enrolled in two microbiology classes. Laboratory investigations and activities were deliberately designed to include quantitative data collection that necessitated organization and graphical representation. These activities were embedded into the curriculum and conducted in conjunction with the normal and expected course content, rather than as a separate entity. It was expected that routine practice with graph construction and interpretation would result in improved competency when graphing data and proficiency in analyzing graphs. To objectively test the effectiveness in achieving this goal, a pre-test and post-test that included graph construction, interpretation, interpolation, extrapolation, and analysis was administered. Based on the results of a paired T-Test, graphical literacy was significantly enhanced by extensive practice and exposure to data representation.
Optimal steering for kinematic vehicles with applications to spatially distributed agents
NASA Astrophysics Data System (ADS)
Brown, Scott; Praeger, Cheryl E.; Giudici, Michael
While there is no universal method to address control problems involving networks of autonomous vehicles, there exist a few promising schemes that apply to different specific classes of problems, which have attracted the attention of many researchers from different fields. In particular, one way to extend techniques that address problems involving a single autonomous vehicle to those involving teams of autonomous vehicles is to use the concept of Voronoi diagram. The Voronoi diagram provides a spatial partition of the environment the team of vehicles operate in, where each element of this partition is associated with a unique vehicle from the team. The partition induces a graph abstraction of the operating space that is in an one-to-one correspondence with the network abstraction of the team of autonomous vehicles; a fact that can provide both conceptual and analytical advantages during mission planning and execution. In this dissertation, we propose the use of a new class of Voronoi-like partitioning schemes with respect to state-dependent proximity (pseudo-) metrics rather than the Euclidean distance or other generalized distance functions, which are typically used in the literature. An important nuance here is that, in contrast to the Euclidean distance, state-dependent metrics can succinctly capture system theoretic features of each vehicle from the team (e.g., vehicle kinematics), as well as the environment-vehicle interactions, which are induced, for example, by local winds/currents. We subsequently illustrate how the proposed concept of state-dependent Voronoi-like partition can induce local control schemes for problems involving networks of spatially distributed autonomous vehicles by examining a sequential pursuit problem of a maneuvering target by a group of pursuers distributed in the plane. The construction of generalized Voronoi diagrams with respect to state-dependent metrics poses some significant challenges. First, the generalized distance metric may be a function of the direction of motion of the vehicle (anisotropic pseudo-distance function) and/or may not be expressible in closed form. Second, such problems fall under the general class of partitioning problems for which the vehicles' dynamics must be taken into account. The topology of the vehicle's configuration space may be non-Euclidean, for example, it may be a manifold embedded in a Euclidean space. In other words, these problems may not be reducible to generalized Voronoi diagram problems for which efficient construction schemes, analytical and/or computational, exist in the literature. This research effort pursues three main objectives. First, we present the complete solution of different steering problems involving a single vehicle in the presence of motion constraints imposed by the maneuverability envelope of the vehicle and/or the presence of a drift field induced by winds/currents in its vicinity. The analysis of each steering problem involving a single vehicle provides us with a state-dependent generalized metric, such as the minimum time-to-go/come. We subsequently use these state-dependent generalized distance functions as the proximity metrics in the formulation of generalized Voronoi-like partitioning problems. The characterization of the solutions of these state-dependent Voronoi-like partitioning problems using either analytical or computational techniques constitutes the second main objective of this dissertation. The third objective of this research effort is to illustrate the use of the proposed concept of state-dependent Voronoi-like partition as a means for passing from control techniques that apply to problems involving a single vehicle to problems involving networks of spatially distributed autonomous vehicles. To this aim, we formulate the problem of sequential/relay pursuit of a maneuvering target by a group of spatially distributed pursuers and subsequently propose a distributed group pursuit strategy that directly derives from the solution of a state-dependent Voronoi-like partitioning problem. (Abstract shortened by UMI.)
Scoring and staging systems using cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
Interactive semiautomatic contour delineation using statistical conditional random fields framework.
Hu, Yu-Chi; Grossberg, Michael D; Wu, Abraham; Riaz, Nadeem; Perez, Carmen; Mageras, Gig S
2012-07-01
Contouring a normal anatomical structure during radiation treatment planning requires significant time and effort. The authors present a fast and accurate semiautomatic contour delineation method to reduce the time and effort required of expert users. Following an initial segmentation on one CT slice, the user marks the target organ and nontarget pixels with a few simple brush strokes. The algorithm calculates statistics from this information that, in turn, determines the parameters of an energy function containing both boundary and regional components. The method uses a conditional random field graphical model to define the energy function to be minimized for obtaining an estimated optimal segmentation, and a graph partition algorithm to efficiently solve the energy function minimization. Organ boundary statistics are estimated from the segmentation and propagated to subsequent images; regional statistics are estimated from the simple brush strokes that are either propagated or redrawn as needed on subsequent images. This greatly reduces the user input needed and speeds up segmentations. The proposed method can be further accelerated with graph-based interpolation of alternating slices in place of user-guided segmentation. CT images from phantom and patients were used to evaluate this method. The authors determined the sensitivity and specificity of organ segmentations using physician-drawn contours as ground truth, as well as the predicted-to-ground truth surface distances. Finally, three physicians evaluated the contours for subjective acceptability. Interobserver and intraobserver analysis was also performed and Bland-Altman plots were used to evaluate agreement. Liver and kidney segmentations in patient volumetric CT images show that boundary samples provided on a single CT slice can be reused through the entire 3D stack of images to obtain accurate segmentation. In liver, our method has better sensitivity and specificity (0.925 and 0.995) than region growing (0.897 and 0.995) and level set methods (0.912 and 0.985) as well as shorter mean predicted-to-ground truth distance (2.13 mm) compared to regional growing (4.58 mm) and level set methods (8.55 mm and 4.74 mm). Similar results are observed in kidney segmentation. Physician evaluation of ten liver cases showed that 83% of contours did not need any modification, while 6% of contours needed modifications as assessed by two or more evaluators. In interobserver and intraobserver analysis, Bland-Altman plots showed our method to have better repeatability than the manual method while the delineation time was 15% faster on average. Our method achieves high accuracy in liver and kidney segmentation and considerably reduces the time and labor required for contour delineation. Since it extracts purely statistical information from the samples interactively specified by expert users, the method avoids heuristic assumptions commonly used by other methods. In addition, the method can be expanded to 3D directly without modification because the underlying graphical framework and graph partition optimization method fit naturally with the image grid structure.
USDA-ARS?s Scientific Manuscript database
Altering nutrient partitioning in young postpartum beef cows from milk production to body weight gain has potential to improve reproductive performance. A 2-yr study conducted at the Corona Range and Livestock Research Center from February to July in 2003 (n = 33) and 2004 (n = 26) evaluated respons...
Network-based study of Lagrangian transport and mixing
NASA Astrophysics Data System (ADS)
Padberg-Gehle, Kathrin; Schneide, Christiane
2017-10-01
Transport and mixing processes in fluid flows are crucially influenced by coherent structures and the characterization of these Lagrangian objects is a topic of intense current research. While established mathematical approaches such as variational methods or transfer-operator-based schemes require full knowledge of the flow field or at least high-resolution trajectory data, this information may not be available in applications. Recently, different computational methods have been proposed to identify coherent behavior in flows directly from Lagrangian trajectory data, that is, numerical or measured time series of particle positions in a fluid flow. In this context, spatio-temporal clustering algorithms have been proven to be very effective for the extraction of coherent sets from sparse and possibly incomplete trajectory data. Inspired by these recent approaches, we consider an unweighted, undirected network, where Lagrangian particle trajectories serve as network nodes. A link is established between two nodes if the respective trajectories come close to each other at least once in the course of time. Classical graph concepts are then employed to analyze the resulting network. In particular, local network measures such as the node degree, the average degree of neighboring nodes, and the clustering coefficient serve as indicators of highly mixing regions, whereas spectral graph partitioning schemes allow us to extract coherent sets. The proposed methodology is very fast to run and we demonstrate its applicability in two geophysical flows - the Bickley jet as well as the Antarctic stratospheric polar vortex.
New polymers for low-gravity purification of cells by phase partitioning
NASA Technical Reports Server (NTRS)
Harris, J. M.
1983-01-01
A potentially powerful technique for separating different biological cell types is based on the partitioning of these cells between the immiscible aqueous phases formed by solution of certain polymers in water. This process is gravity-limited because cells sediment rather than associate with the phase most favored on the basis of cell-phase interactions. In the present contract we have been involved in the synthesis of new polymers both to aid in understanding the partitioning process and to improve the quality of separations. The prime driving force behind the design of these polymers is to produce materials which will aid in space experiments to separate important cell types and to study the partitioning process in the absence of gravity (i.e., in an equilibrium state).
Khodadadi, Mostafa; Dehghani, Hamid; Jalali Javaran, Mokhtar
2017-01-01
Enhancing water use efficiency of coriander (Coriandrum sativum L.) is a major focus for coriander breeding to cope with drought stress. The purpose of this study was; (a) to identify the predominant mechanism(s) of drought resistance in coriander and (b) to evaluate the genetic control mechanism(s) of traits associated with drought resistance and higher fruit yield. To reach this purpose, 15 half-diallel hybrids of coriander and their six parents were evaluated under well-watered and water deficit stressed (WDS) in both glasshouse lysimetric and field conditions. The parents were selected for their different response to water deficit stress following preliminary experiments. Results revealed that the genetic control mechanism of fruit yield is complex, variable and highly affected by environment. The mode of inheritance and nature of gene action for percent assimilate partitioned to fruits were similar to those for flowering time in both well-watered and WDS conditions. A significant negative genetic linkage was found between fruit yield and percent assimilate partitioned to root, percent assimilate partitioned to shoot, root number, root diameter, root dry mass, root volume, and early flowering. Thus, to improve fruit yield under water deficit stress, selection of low values of these traits could be used. In contrast, a significant positive genetic linkage between fruit yield and percent assimilate partitioned to fruits, leaf relative water content and chlorophyll content indicate selection for high values of these traits. These secondary or surrogate traits could be selected during early segregating generations. The early ripening parent (P1; TN-59-230) contained effective genes involved in preferred percent assimilate partitioning to fruit and drought stress resistance. In conclusion, genetic improvement of fruit yield and drought resistance could be simultaneously gained in coriander when breeding for drought resistance. PMID:28473836
Handling Data Skew in MapReduce Cluster by Using Partition Tuning
Gao, Yufei; Zhou, Yanjie; Zhou, Bing; Shi, Lei; Zhang, Jiacai
2017-01-01
The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy and the partition tuning method to disperse key-value pairs in virtual partitions and recombines each partition in case of data skew. The robustness and efficiency of the proposed algorithm were tested on a wide variety of simulated datasets and real healthcare datasets. The results showed that PTSH algorithm can handle data skew in MapReduce efficiently and improve the performance of MapReduce jobs in comparison with the native Hadoop, Closer, and locality-aware and fairness-aware key partitioning (LEEN). We also found that the time needed for rule extraction can be reduced significantly by adopting the PTSH algorithm, since it is more suitable for association rule mining (ARM) on healthcare data. © 2017 Yufei Gao et al.
Handling Data Skew in MapReduce Cluster by Using Partition Tuning.
Gao, Yufei; Zhou, Yanjie; Zhou, Bing; Shi, Lei; Zhang, Jiacai
2017-01-01
The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy and the partition tuning method to disperse key-value pairs in virtual partitions and recombines each partition in case of data skew. The robustness and efficiency of the proposed algorithm were tested on a wide variety of simulated datasets and real healthcare datasets. The results showed that PTSH algorithm can handle data skew in MapReduce efficiently and improve the performance of MapReduce jobs in comparison with the native Hadoop, Closer, and locality-aware and fairness-aware key partitioning (LEEN). We also found that the time needed for rule extraction can be reduced significantly by adopting the PTSH algorithm, since it is more suitable for association rule mining (ARM) on healthcare data.
Handling Data Skew in MapReduce Cluster by Using Partition Tuning
Zhou, Yanjie; Zhou, Bing; Shi, Lei
2017-01-01
The healthcare industry has generated large amounts of data, and analyzing these has emerged as an important problem in recent years. The MapReduce programming model has been successfully used for big data analytics. However, data skew invariably occurs in big data analytics and seriously affects efficiency. To overcome the data skew problem in MapReduce, we have in the past proposed a data processing algorithm called Partition Tuning-based Skew Handling (PTSH). In comparison with the one-stage partitioning strategy used in the traditional MapReduce model, PTSH uses a two-stage strategy and the partition tuning method to disperse key-value pairs in virtual partitions and recombines each partition in case of data skew. The robustness and efficiency of the proposed algorithm were tested on a wide variety of simulated datasets and real healthcare datasets. The results showed that PTSH algorithm can handle data skew in MapReduce efficiently and improve the performance of MapReduce jobs in comparison with the native Hadoop, Closer, and locality-aware and fairness-aware key partitioning (LEEN). We also found that the time needed for rule extraction can be reduced significantly by adopting the PTSH algorithm, since it is more suitable for association rule mining (ARM) on healthcare data. PMID:29065568
An iterative network partition algorithm for accurate identification of dense network modules
Sun, Siqi; Dong, Xinran; Fu, Yao; Tian, Weidong
2012-01-01
A key step in network analysis is to partition a complex network into dense modules. Currently, modularity is one of the most popular benefit functions used to partition network modules. However, recent studies suggested that it has an inherent limitation in detecting dense network modules. In this study, we observed that despite the limitation, modularity has the advantage of preserving the primary network structure of the undetected modules. Thus, we have developed a simple iterative Network Partition (iNP) algorithm to partition a network. The iNP algorithm provides a general framework in which any modularity-based algorithm can be implemented in the network partition step. Here, we tested iNP with three modularity-based algorithms: multi-step greedy (MSG), spectral clustering and Qcut. Compared with the original three methods, iNP achieved a significant improvement in the quality of network partition in a benchmark study with simulated networks, identified more modules with significantly better enrichment of functionally related genes in both yeast protein complex network and breast cancer gene co-expression network, and discovered more cancer-specific modules in the cancer gene co-expression network. As such, iNP should have a broad application as a general method to assist in the analysis of biological networks. PMID:22121225
Kéchichian, Razmig; Valette, Sébastien; Desvignes, Michel; Prost, Rémy
2013-11-01
We derive shortest-path constraints from graph models of structure adjacency relations and introduce them in a joint centroidal Voronoi image clustering and Graph Cut multiobject semiautomatic segmentation framework. The vicinity prior model thus defined is a piecewise-constant model incurring multiple levels of penalization capturing the spatial configuration of structures in multiobject segmentation. Qualitative and quantitative analyses and comparison with a Potts prior-based approach and our previous contribution on synthetic, simulated, and real medical images show that the vicinity prior allows for the correct segmentation of distinct structures having identical intensity profiles and improves the precision of segmentation boundary placement while being fairly robust to clustering resolution. The clustering approach we take to simplify images prior to segmentation strikes a good balance between boundary adaptivity and cluster compactness criteria furthermore allowing to control the trade-off. Compared with a direct application of segmentation on voxels, the clustering step improves the overall runtime and memory footprint of the segmentation process up to an order of magnitude without compromising the quality of the result.
Differentially Private Frequent Subgraph Mining
Xu, Shengzhi; Xiong, Li; Cheng, Xiang; Xiao, Ke
2016-01-01
Mining frequent subgraphs from a collection of input graphs is an important topic in data mining research. However, if the input graphs contain sensitive information, releasing frequent subgraphs may pose considerable threats to individual's privacy. In this paper, we study the problem of frequent subgraph mining (FGM) under the rigorous differential privacy model. We introduce a novel differentially private FGM algorithm, which is referred to as DFG. In this algorithm, we first privately identify frequent subgraphs from input graphs, and then compute the noisy support of each identified frequent subgraph. In particular, to privately identify frequent subgraphs, we present a frequent subgraph identification approach which can improve the utility of frequent subgraph identifications through candidates pruning. Moreover, to compute the noisy support of each identified frequent subgraph, we devise a lattice-based noisy support derivation approach, where a series of methods has been proposed to improve the accuracy of the noisy supports. Through formal privacy analysis, we prove that our DFG algorithm satisfies ε-differential privacy. Extensive experimental results on real datasets show that the DFG algorithm can privately find frequent subgraphs with high data utility. PMID:27616876
Gaskins, J T; Daniels, M J
2016-01-02
The estimation of the covariance matrix is a key concern in the analysis of longitudinal data. When data consists of multiple groups, it is often assumed the covariance matrices are either equal across groups or are completely distinct. We seek methodology to allow borrowing of strength across potentially similar groups to improve estimation. To that end, we introduce a covariance partition prior which proposes a partition of the groups at each measurement time. Groups in the same set of the partition share dependence parameters for the distribution of the current measurement given the preceding ones, and the sequence of partitions is modeled as a Markov chain to encourage similar structure at nearby measurement times. This approach additionally encourages a lower-dimensional structure of the covariance matrices by shrinking the parameters of the Cholesky decomposition toward zero. We demonstrate the performance of our model through two simulation studies and the analysis of data from a depression study. This article includes Supplementary Material available online.
NASA Astrophysics Data System (ADS)
Zhou, Ze-an; Fu, Wan-tang; Zhu, Zhe; Li, Bin; Shi, Zhong-ping; Sun, Shu-hua
2018-05-01
The retained austenite content (RAC), the mechanical properties, and the resistance to cavitation erosion (CE) of the 00Cr13Mn8MoN steel after quenching and partitioning (Q&P) processing were investigated. The results show that the Q&P process affected the RAC, which reached the maximum value after partitioning at 400°C for 10 min. The tensile strength of the steel slightly decreased with increasing partitioning temperature and time. However, the elongation and product of strength and elongation first increased and then decreased. The sample partitioned at 400°C for 10 min exhibited the optimal property: a strength-ductility of 23.8 GPa·%. The resistance to CE for the 00Cr13Mn8MoN steel treated by the Q&P process was improved due to work hardening, spalling, and cavitation-induced martensitic transformation of the retained austenite.
A Novel Space Partitioning Algorithm to Improve Current Practices in Facility Placement
Jimenez, Tamara; Mikler, Armin R; Tiwari, Chetan
2012-01-01
In the presence of naturally occurring and man-made public health threats, the feasibility of regional bio-emergency contingency plans plays a crucial role in the mitigation of such emergencies. While the analysis of in-place response scenarios provides a measure of quality for a given plan, it involves human judgment to identify improvements in plans that are otherwise likely to fail. Since resource constraints and government mandates limit the availability of service provided in case of an emergency, computational techniques can determine optimal locations for providing emergency response assuming that the uniform distribution of demand across homogeneous resources will yield and optimal service outcome. This paper presents an algorithm that recursively partitions the geographic space into sub-regions while equally distributing the population across the partitions. For this method, we have proven the existence of an upper bound on the deviation from the optimal population size for sub-regions. PMID:23853502
2010-09-01
service to support the creation of both normative and pathological neuroinformatics databases. Project 3 Deliverable: The aim of this project is to...Gainesville, FL, 2010. D. Hammond, P. Vandergheynst, R. Gribonval, “ Wavelets on Graphs via Spectral Graph Theory,” Applied Computational and...the Phantom Experiments,” International Conference on Electrical Bioimpedance, April 4-8-, Gainesville, FL, 2010. 21 D. Hammond, “ Wavelets on
Compacting de Bruijn graphs from sequencing data quickly and in low memory.
Chikhi, Rayan; Limasset, Antoine; Medvedev, Paul
2016-06-15
As the quantity of data per sequencing experiment increases, the challenges of fragment assembly are becoming increasingly computational. The de Bruijn graph is a widely used data structure in fragment assembly algorithms, used to represent the information from a set of reads. Compaction is an important data reduction step in most de Bruijn graph based algorithms where long simple paths are compacted into single vertices. Compaction has recently become the bottleneck in assembly pipelines, and improving its running time and memory usage is an important problem. We present an algorithm and a tool bcalm 2 for the compaction of de Bruijn graphs. bcalm 2 is a parallel algorithm that distributes the input based on a minimizer hashing technique, allowing for good balance of memory usage throughout its execution. For human sequencing data, bcalm 2 reduces the computational burden of compacting the de Bruijn graph to roughly an hour and 3 GB of memory. We also applied bcalm 2 to the 22 Gbp loblolly pine and 20 Gbp white spruce sequencing datasets. Compacted graphs were constructed from raw reads in less than 2 days and 40 GB of memory on a single machine. Hence, bcalm 2 is at least an order of magnitude more efficient than other available methods. Source code of bcalm 2 is freely available at: https://github.com/GATB/bcalm rayan.chikhi@univ-lille1.fr. © The Author 2016. Published by Oxford University Press.
QSPR modeling: graph connectivity indices versus line graph connectivity indices
Basak; Nikolic; Trinajstic; Amic; Beslo
2000-07-01
Five QSPR models of alkanes were reinvestigated. Properties considered were molecular surface-dependent properties (boiling points and gas chromatographic retention indices) and molecular volume-dependent properties (molar volumes and molar refractions). The vertex- and edge-connectivity indices were used as structural parameters. In each studied case we computed connectivity indices of alkane trees and alkane line graphs and searched for the optimum exponent. Models based on indices with an optimum exponent and on the standard value of the exponent were compared. Thus, for each property we generated six QSPR models (four for alkane trees and two for the corresponding line graphs). In all studied cases QSPR models based on connectivity indices with optimum exponents have better statistical characteristics than the models based on connectivity indices with the standard value of the exponent. The comparison between models based on vertex- and edge-connectivity indices gave in two cases (molar volumes and molar refractions) better models based on edge-connectivity indices and in three cases (boiling points for octanes and nonanes and gas chromatographic retention indices) better models based on vertex-connectivity indices. Thus, it appears that the edge-connectivity index is more appropriate to be used in the structure-molecular volume properties modeling and the vertex-connectivity index in the structure-molecular surface properties modeling. The use of line graphs did not improve the predictive power of the connectivity indices. Only in one case (boiling points of nonanes) a better model was obtained with the use of line graphs.
Adaptation of Chain Event Graphs for use with Case-Control Studies in Epidemiology.
Keeble, Claire; Thwaites, Peter Adam; Barber, Stuart; Law, Graham Richard; Baxter, Paul David
2017-09-26
Case-control studies are used in epidemiology to try to uncover the causes of diseases, but are a retrospective study design known to suffer from non-participation and recall bias, which may explain their decreased popularity in recent years. Traditional analyses report usually only the odds ratio for given exposures and the binary disease status. Chain event graphs are a graphical representation of a statistical model derived from event trees which have been developed in artificial intelligence and statistics, and only recently introduced to the epidemiology literature. They are a modern Bayesian technique which enable prior knowledge to be incorporated into the data analysis using the agglomerative hierarchical clustering algorithm, used to form a suitable chain event graph. Additionally, they can account for missing data and be used to explore missingness mechanisms. Here we adapt the chain event graph framework to suit scenarios often encountered in case-control studies, to strengthen this study design which is time and financially efficient. We demonstrate eight adaptations to the graphs, which consist of two suitable for full case-control study analysis, four which can be used in interim analyses to explore biases, and two which aim to improve the ease and accuracy of analyses. The adaptations are illustrated with complete, reproducible, fully-interpreted examples, including the event tree and chain event graph. Chain event graphs are used here for the first time to summarise non-participation, data collection techniques, data reliability, and disease severity in case-control studies. We demonstrate how these features of a case-control study can be incorporated into the analysis to provide further insight, which can help to identify potential biases and lead to more accurate study results.
Coined quantum walks on weighted graphs
NASA Astrophysics Data System (ADS)
Wong, Thomas G.
2017-11-01
We define a discrete-time, coined quantum walk on weighted graphs that is inspired by Szegedy’s quantum walk. Using this, we prove that many lackadaisical quantum walks, where each vertex has l integer self-loops, can be generalized to a quantum walk where each vertex has a single self-loop of real-valued weight l. We apply this real-valued lackadaisical quantum walk to two problems. First, we analyze it on the line or one-dimensional lattice, showing that it is exactly equivalent to a continuous deformation of the three-state Grover walk with faster ballistic dispersion. Second, we generalize Grover’s algorithm, or search on the complete graph, to have a weighted self-loop at each vertex, yielding an improved success probability when l < 3 + 2\\sqrt{2} ≈ 5.828 .
Got Graphs? An Assessment of Data Visualization Tools
NASA Technical Reports Server (NTRS)
Schaefer, C. M.; Foy, M.
2015-01-01
Graphs are powerful tools for simplifying complex data. They are useful for quickly assessing patterns and relationships among one or more variables from a dataset. As the amount of data increases, it becomes more difficult to visualize potential associations. Lifetime Surveillance of Astronaut Health (LSAH) was charged with assessing its current visualization tools along with others on the market to determine whether new tools would be useful for supporting NASA's occupational surveillance effort. It was concluded by members of LSAH that the current tools hindered their ability to provide quick results to researchers working with the department. Due to the high volume of data requests and the many iterations of visualizations requested by researchers, software with a better ability to replicate graphs and edit quickly could improve LSAH's efficiency and lead to faster research results.
NASA Astrophysics Data System (ADS)
Hadyan, Fadhlil; Shaufiah; Arif Bijaksana, Moch.
2017-01-01
Automatic summarization is a system that can help someone to take the core information of a long text instantly. The system can help by summarizing text automatically. there’s Already many summarization systems that have been developed at this time but there are still many problems in those system. In this final task proposed summarization method using document index graph. This method utilizes the PageRank and HITS formula used to assess the web page, adapted to make an assessment of words in the sentences in a text document. The expected outcome of this final task is a system that can do summarization of a single document, by utilizing document index graph with TextRank and HITS to improve the quality of the summary results automatically.
Automatic extraction of protein point mutations using a graph bigram association.
Lee, Lawrence C; Horn, Florence; Cohen, Fred E
2007-02-02
Protein point mutations are an essential component of the evolutionary and experimental analysis of protein structure and function. While many manually curated databases attempt to index point mutations, most experimentally generated point mutations and the biological impacts of the changes are described in the peer-reviewed published literature. We describe an application, Mutation GraB (Graph Bigram), that identifies, extracts, and verifies point mutations from biomedical literature. The principal problem of point mutation extraction is to link the point mutation with its associated protein and organism of origin. Our algorithm uses a graph-based bigram traversal to identify these relevant associations and exploits the Swiss-Prot protein database to verify this information. The graph bigram method is different from other models for point mutation extraction in that it incorporates frequency and positional data of all terms in an article to drive the point mutation-protein association. Our method was tested on 589 articles describing point mutations from the G protein-coupled receptor (GPCR), tyrosine kinase, and ion channel protein families. We evaluated our graph bigram metric against a word-proximity metric for term association on datasets of full-text literature in these three different protein families. Our testing shows that the graph bigram metric achieves a higher F-measure for the GPCRs (0.79 versus 0.76), protein tyrosine kinases (0.72 versus 0.69), and ion channel transporters (0.76 versus 0.74). Importantly, in situations where more than one protein can be assigned to a point mutation and disambiguation is required, the graph bigram metric achieves a precision of 0.84 compared with the word distance metric precision of 0.73. We believe the graph bigram search metric to be a significant improvement over previous search metrics for point mutation extraction and to be applicable to text-mining application requiring the association of words.
Novel Lutein Loaded Lipid Nanoparticles on Porcine Corneal Distribution
Liu, Chi-Hsien; Chiu, Hao-Che; Wu, Wei-Chi; Sahoo, Soubhagya Laxmi; Hsu, Ching-Yun
2014-01-01
Topical delivery has the advantages including being user friendly and cost effective. Development of topical delivery carriers for lutein is becoming an important issue for the ocular drug delivery. Quantification of the partition coefficient of drug in the ocular tissue is the first step for the evaluation of delivery efficacy. The objectives of this study were to evaluate the effects of lipid nanoparticles and cyclodextrin (CD) on the corneal lutein accumulation and to measure the partition coefficients in the porcine cornea. Lipid nanoparticles combined with 2% HPβCD could enhance lutein accumulation up to 209.2 ± 18 (μg/g) which is 4.9-fold higher than that of the nanoparticles. CD combined nanoparticles have 68% of drug loading efficiency and lower cytotoxicity in the bovine cornea cells. From the confocal images, this improvement is due to the increased partitioning of lutein to the corneal epithelium by CD in the lipid nanoparticles. The novel lipid nanoparticles could not only improve the stability and entrapment efficacy of lutein but also enhance the lutein accumulation and partition in the cornea. Additionally the corneal accumulation of lutein was further enhanced by increasing the lutein payload in the vehicles. PMID:25101172
A strand graph semantics for DNA-based computation
Petersen, Rasmus L.; Lakin, Matthew R.; Phillips, Andrew
2015-01-01
DNA nanotechnology is a promising approach for engineering computation at the nanoscale, with potential applications in biofabrication and intelligent nanomedicine. DNA strand displacement is a general strategy for implementing a broad range of nanoscale computations, including any computation that can be expressed as a chemical reaction network. Modelling and analysis of DNA strand displacement systems is an important part of the design process, prior to experimental realisation. As experimental techniques improve, it is important for modelling languages to keep pace with the complexity of structures that can be realised experimentally. In this paper we present a process calculus for modelling DNA strand displacement computations involving rich secondary structures, including DNA branches and loops. We prove that our calculus is also sufficiently expressive to model previous work on non-branching structures, and propose a mapping from our calculus to a canonical strand graph representation, in which vertices represent DNA strands, ordered sites represent domains, and edges between sites represent bonds between domains. We define interactions between strands by means of strand graph rewriting, and prove the correspondence between the process calculus and strand graph behaviours. Finally, we propose a mapping from strand graphs to an efficient implementation, which we use to perform modelling and simulation of DNA strand displacement systems with rich secondary structure. PMID:27293306
Visualizing risks in cancer communication: A systematic review of computer-supported visual aids.
Stellamanns, Jan; Ruetters, Dana; Dahal, Keshav; Schillmoeller, Zita; Huebner, Jutta
2017-08-01
Health websites are becoming important sources for cancer information. Lay users, patients and carers seek support for critical decisions, but they are prone to common biases when quantitative information is presented. Graphical representations of risk data can facilitate comprehension, and interactive visualizations are popular. This review summarizes the evidence on computer-supported graphs that present risk data and their effects on various measures. The systematic literature search was conducted in several databases, including MEDLINE, EMBASE and CINAHL. Only studies with a controlled design were included. Relevant publications were carefully selected and critically appraised by two reviewers. Thirteen studies were included. Ten studies evaluated static graphs and three dynamic formats. Most decision scenarios were hypothetical. Static graphs could improve accuracy, comprehension, and behavioural intention. But the results were heterogeneous and inconsistent among the studies. Dynamic formats were not superior or even impaired performance compared to static formats. Static graphs show promising but inconsistent results, while research on dynamic visualizations is scarce and must be interpreted cautiously due to methodical limitations. Well-designed and context-specific static graphs can support web-based cancer risk communication in particular populations. The application of dynamic formats cannot be recommended and needs further research. Copyright © 2017 Elsevier B.V. All rights reserved.
Du, Lihong; White, Robert L
2009-02-01
A previously proposed partition equilibrium model for quantitative prediction of analyte response in electrospray ionization mass spectrometry is modified to yield an improved linear relationship. Analyte mass spectrometer response is modeled by a competition mechanism between analyte and background electrolytes that is based on partition equilibrium considerations. The correlation between analyte response and solution composition is described by the linear model over a wide concentration range and the improved model is shown to be valid for a wide range of experimental conditions. The behavior of an analyte in a salt solution, which could not be explained by the original model, is correctly predicted. The ion suppression effects of 16:0 lysophosphatidylcholine (LPC) on analyte signals are attributed to a combination of competition for excess charge and reduction of total charge due to surface tension effects. In contrast to the complicated mathematical forms that comprise the original model, the simplified model described here can more easily be employed to predict analyte mass spectrometer responses for solutions containing multiple components. Copyright (c) 2008 John Wiley & Sons, Ltd.
Elsayed, Mustafa M A; Vierl, Ulrich; Cevc, Gregor
2009-06-01
Potentiometric lipid membrane-water partition coefficient studies neglect electrostatic interactions to date; this leads to incorrect results. We herein show how to account properly for such interactions in potentiometric data analysis. We conducted potentiometric titration experiments to determine lipid membrane-water partition coefficients of four illustrative drugs, bupivacaine, diclofenac, ketoprofen and terbinafine. We then analyzed the results conventionally and with an improved analytical approach that considers Coulombic electrostatic interactions. The new analytical approach delivers robust partition coefficient values. In contrast, the conventional data analysis yields apparent partition coefficients of the ionized drug forms that depend on experimental conditions (mainly the lipid-drug ratio and the bulk ionic strength). This is due to changing electrostatic effects originating either from bound drug and/or lipid charges. A membrane comprising 10 mol-% mono-charged molecules in a 150 mM (monovalent) electrolyte solution yields results that differ by a factor of 4 from uncharged membranes results. Allowance for the Coulombic electrostatic interactions is a prerequisite for accurate and reliable determination of lipid membrane-water partition coefficients of ionizable drugs from potentiometric titration data. The same conclusion applies to all analytical methods involving drug binding to a surface.
Data traffic reduction schemes for Cholesky factorization on asynchronous multiprocessor systems
NASA Technical Reports Server (NTRS)
Naik, Vijay K.; Patrick, Merrell L.
1989-01-01
Communication requirements of Cholesky factorization of dense and sparse symmetric, positive definite matrices are analyzed. The communication requirement is characterized by the data traffic generated on multiprocessor systems with local and shared memory. Lower bound proofs are given to show that when the load is uniformly distributed the data traffic associated with factoring an n x n dense matrix using n to the alpha power (alpha less than or equal 2) processors is omega(n to the 2 + alpha/2 power). For n x n sparse matrices representing a square root of n x square root of n regular grid graph the data traffic is shown to be omega(n to the 1 + alpha/2 power), alpha less than or equal 1. Partitioning schemes that are variations of block assignment scheme are described and it is shown that the data traffic generated by these schemes are asymptotically optimal. The schemes allow efficient use of up to O(n to the 2nd power) processors in the dense case and up to O(n) processors in the sparse case before the total data traffic reaches the maximum value of O(n to the 3rd power) and O(n to the 3/2 power), respectively. It is shown that the block based partitioning schemes allow a better utilization of the data accessed from shared memory and thus reduce the data traffic than those based on column-wise wrap around assignment schemes.
Kranabetter, J. Marty; McLauchlan, Kendra K.; Enders, Sara K.; Fraterrigo, Jennifer M.; Higuera, Philip E.; Morris, Jesse L.; Rastetter, Edward B.; Barnes, Rebecca; Buma, Brian; Gavin, Daniel G.; Gerhart, Laci M.; Gillson, Lindsey; Hietz, Peter; Mack, Michelle C.; McNeil, Brenden; Perakis, Steven
2016-01-01
Disturbances affect almost all terrestrial ecosystems, but it has been difficult to identify general principles regarding these influences. To improve our understanding of the long-term consequences of disturbance on terrestrial ecosystems, we present a conceptual framework that analyzes disturbances by their biogeochemical impacts. We posit that the ratio of soil and plant nutrient stocks in mature ecosystems represents a characteristic site property. Focusing on nitrogen (N), we hypothesize that this partitioning ratio (soil N: plant N) will undergo a predictable trajectory after disturbance. We investigate the nature of this partitioning ratio with three approaches: (1) nutrient stock data from forested ecosystems in North America, (2) a process-based ecosystem model, and (3) conceptual shifts in site nutrient availability with altered disturbance frequency. Partitioning ratios could be applied to a variety of ecosystems and successional states, allowing for improved temporal scaling of disturbance events. The generally short-term empirical evidence for recovery trajectories of nutrient stocks and partitioning ratios suggests two areas for future research. First, we need to recognize and quantify how disturbance effects can be accreting or depleting, depending on whether their net effect is to increase or decrease ecosystem nutrient stocks. Second, we need to test how altered disturbance frequencies from the present state may be constructive or destructive in their effects on biogeochemical cycling and nutrient availability. Long-term studies, with repeated sampling of soils and vegetation, will be essential in further developing this framework of biogeochemical response to disturbance.
NASA Technical Reports Server (NTRS)
Mog, Robert A.
1999-01-01
Unique and innovative graph theory, neural network, organizational modeling, and genetic algorithms are applied to the design and evolution of programmatic and organizational architectures. Graph theory representations of programs and organizations increase modeling capabilities and flexibility, while illuminating preferable programmatic/organizational design features. Treating programs and organizations as neural networks results in better system synthesis, and more robust data modeling. Organizational modeling using covariance structures enhances the determination of organizational risk factors. Genetic algorithms improve programmatic evolution characteristics, while shedding light on rulebase requirements for achieving specified technological readiness levels, given budget and schedule resources. This program of research improves the robustness and verifiability of systems synthesis tools, including the Complex Organizational Metric for Programmatic Risk Environments (COMPRE).
DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases
NASA Astrophysics Data System (ADS)
Bröcheler, Matthias; Pugliese, Andrea; Subrahmanian, V. S.
RDF is an increasingly important paradigm for the representation of information on the Web. As RDF databases increase in size to approach tens of millions of triples, and as sophisticated graph matching queries expressible in languages like SPARQL become increasingly important, scalability becomes an issue. To date, there is no graph-based indexing method for RDF data where the index was designed in a way that makes it disk-resident. There is therefore a growing need for indexes that can operate efficiently when the index itself resides on disk. In this paper, we first propose the DOGMA index for fast subgraph matching on disk and then develop a basic algorithm to answer queries over this index. This algorithm is then significantly sped up via an optimized algorithm that uses efficient (but correct) pruning strategies when combined with two different extensions of the index. We have implemented a preliminary system and tested it against four existing RDF database systems developed by others. Our experiments show that our algorithm performs very well compared to these systems, with orders of magnitude improvements for complex graph queries.
Contact Graph Routing Enhancements Developed in ION for DTN
NASA Technical Reports Server (NTRS)
Segui, John S.; Burleigh, Scott
2013-01-01
The Interplanetary Overlay Network (ION) software suite is an open-source, flight-ready implementation of networking protocols including the Delay/Disruption Tolerant Networking (DTN) Bundle Protocol (BP), the CCSDS (Consultative Committee for Space Data Systems) File Delivery Protocol (CFDP), and many others including the Contact Graph Routing (CGR) DTN routing system. While DTN offers the capability to tolerate disruption and long signal propagation delays in transmission, without an appropriate routing protocol, no data can be delivered. CGR was built for space exploration networks with scheduled communication opportunities (typically based on trajectories and orbits), represented as a contact graph. Since CGR uses knowledge of future connectivity, the contact graph can grow rather large, and so efficient processing is desired. These enhancements allow CGR to scale to predicted NASA space network complexities and beyond. This software improves upon CGR by adopting an earliest-arrival-time cost metric and using the Dijkstra path selection algorithm. Moving to Dijkstra path selection also enables construction of an earliest- arrival-time tree for multicast routing. The enhancements have been rolled into ION 3.0 available on sourceforge.net.
NASA Astrophysics Data System (ADS)
Garciá-Arteaga, Juan D.; Corredor, Germán.; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo
2017-11-01
Tumor-infiltrating lymphocytes occurs when various classes of white blood cells migrate from the blood stream towards the tumor, infiltrating it. The presence of TIL is predictive of the response of the patient to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H and E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs, constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and K-nn showed the highest predictive ability, yielding F1 Scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
NASA Astrophysics Data System (ADS)
Levchuk, Georgiy; Colonna-Romano, John; Eslami, Mohammed
2017-05-01
The United States increasingly relies on cyber-physical systems to conduct military and commercial operations. Attacks on these systems have increased dramatically around the globe. The attackers constantly change their methods, making state-of-the-art commercial and military intrusion detection systems ineffective. In this paper, we present a model to identify functional behavior of network devices from netflow traces. Our model includes two innovations. First, we define novel features for a host IP using detection of application graph patterns in IP's host graph constructed from 5-min aggregated packet flows. Second, we present the first application, to the best of our knowledge, of Graph Semi-Supervised Learning (GSSL) to the space of IP behavior classification. Using a cyber-attack dataset collected from NetFlow packet traces, we show that GSSL trained with only 20% of the data achieves higher attack detection rates than Support Vector Machines (SVM) and Naïve Bayes (NB) classifiers trained with 80% of data points. We also show how to improve detection quality by filtering out web browsing data, and conclude with discussion of future research directions.
Dong, Jianwu; Chen, Feng; Zhou, Dong; Liu, Tian; Yu, Zhaofei; Wang, Yi
2017-03-01
Existence of low SNR regions and rapid-phase variations pose challenges to spatial phase unwrapping algorithms. Global optimization-based phase unwrapping methods are widely used, but are significantly slower than greedy methods. In this paper, dual decomposition acceleration is introduced to speed up a three-dimensional graph cut-based phase unwrapping algorithm. The phase unwrapping problem is formulated as a global discrete energy minimization problem, whereas the technique of dual decomposition is used to increase the computational efficiency by splitting the full problem into overlapping subproblems and enforcing the congruence of overlapping variables. Using three dimensional (3D) multiecho gradient echo images from an agarose phantom and five brain hemorrhage patients, we compared this proposed method with an unaccelerated graph cut-based method. Experimental results show up to 18-fold acceleration in computation time. Dual decomposition significantly improves the computational efficiency of 3D graph cut-based phase unwrapping algorithms. Magn Reson Med 77:1353-1358, 2017. © 2016 International Society for Magnetic Resonance in Medicine. © 2016 International Society for Magnetic Resonance in Medicine.