Recursive Feature Extraction in Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-08-14
ReFeX extracts recursive topological features from graph data. The input is a graph as a csv file and the output is a csv file containing feature values for each node in the graph. The features are based on topological counts in the neighborhoods of each nodes, as well as recursive summaries of neighbors' features.
A graph-Laplacian-based feature extraction algorithm for neural spike sorting.
Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos
2009-01-01
Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.
Palmprint verification using Lagrangian decomposition and invariant interest points
NASA Astrophysics Data System (ADS)
Gupta, P.; Rattani, A.; Kisku, D. R.; Hwang, C. J.; Sing, J. K.
2011-06-01
This paper presents a palmprint based verification system using SIFT features and Lagrangian network graph technique. We employ SIFT for feature extraction from palmprint images whereas the region of interest (ROI) which has been extracted from wide palm texture at the preprocessing stage, is considered for invariant points extraction. Finally, identity is established by finding permutation matrix for a pair of reference and probe palm graphs drawn on extracted SIFT features. Permutation matrix is used to minimize the distance between two graphs. The propsed system has been tested on CASIA and IITK palmprint databases and experimental results reveal the effectiveness and robustness of the system.
Automatic Authorship Detection Using Textual Patterns Extracted from Integrated Syntactic Graphs
Gómez-Adorno, Helena; Sidorov, Grigori; Pinto, David; Vilariño, Darnes; Gelbukh, Alexander
2016-01-01
We apply the integrated syntactic graph feature extraction methodology to the task of automatic authorship detection. This graph-based representation allows integrating different levels of language description into a single structure. We extract textual patterns based on features obtained from shortest path walks over integrated syntactic graphs and apply them to determine the authors of documents. On average, our method outperforms the state of the art approaches and gives consistently high results across different corpora, unlike existing methods. Our results show that our textual patterns are useful for the task of authorship attribution. PMID:27589740
Model-based morphological segmentation and labeling of coronary angiograms.
Haris, K; Efstratiadis, S N; Maglaveras, N; Pappas, C; Gourassas, J; Louridas, G
1999-10-01
A method for extraction and labeling of the coronary arterial tree (CAT) using minimal user supervision in single-view angiograms is proposed. The CAT structural description (skeleton and borders) is produced, along with quantitative information for the artery dimensions and assignment of coded labels, based on a given coronary artery model represented by a graph. The stages of the method are: 1) CAT tracking and detection; 2) artery skeleton and border estimation; 3) feature graph creation; and iv) artery labeling by graph matching. The approximate CAT centerline and borders are extracted by recursive tracking based on circular template analysis. The accurate skeleton and borders of each CAT segment are computed, based on morphological homotopy modification and watershed transform. The approximate centerline and borders are used for constructing the artery segment enclosing area (ASEA), where the defined skeleton and border curves are considered as markers. Using the marked ASEA, an artery gradient image is constructed where all the ASEA pixels (except the skeleton ones) are assigned the gradient magnitude of the original image. The artery gradient image markers are imposed as its unique regional minima by the homotopy modification method, the watershed transform is used for extracting the artery segment borders, and the feature graph is updated. Finally, given the created feature graph and the known model graph, a graph matching algorithm assigns the appropriate labels to the extracted CAT using weighted maximal cliques on the association graph corresponding to the two given graphs. Experimental results using clinical digitized coronary angiograms are presented.
Graph theory for feature extraction and classification: a migraine pathology case study.
Jorge-Hernandez, Fernando; Garcia Chimeno, Yolanda; Garcia-Zapirain, Begonya; Cabrera Zubizarreta, Alberto; Gomez Beldarrain, Maria Angeles; Fernandez-Ruanova, Begonya
2014-01-01
Graph theory is also widely used as a representational form and characterization of brain connectivity network, as is machine learning for classifying groups depending on the features extracted from images. Many of these studies use different techniques, such as preprocessing, correlations, features or algorithms. This paper proposes an automatic tool to perform a standard process using images of the Magnetic Resonance Imaging (MRI) machine. The process includes pre-processing, building the graph per subject with different correlations, atlas, relevant feature extraction according to the literature, and finally providing a set of machine learning algorithms which can produce analyzable results for physicians or specialists. In order to verify the process, a set of images from prescription drug abusers and patients with migraine have been used. In this way, the proper functioning of the tool has been proved, providing results of 87% and 92% of success depending on the classifier used.
Sidek, Khairul; Khali, Ibrahim
2012-01-01
In this paper, a person identification mechanism implemented with Cardioid based graph using electrocardiogram (ECG) is presented. Cardioid based graph has given a reasonably good classification accuracy in terms of differentiating between individuals. However, the current feature extraction method using Euclidean distance could be further improved by using Mahalanobis distance measurement producing extracted coefficients which takes into account the correlations of the data set. Identification is then done by applying these extracted features to Radial Basis Function Network. A total of 30 ECG data from MITBIH Normal Sinus Rhythm database (NSRDB) and MITBIH Arrhythmia database (MITDB) were used for development and evaluation purposes. Our experimentation results suggest that the proposed feature extraction method has significantly increased the classification performance of subjects in both databases with accuracy from 97.50% to 99.80% in NSRDB and 96.50% to 99.40% in MITDB. High sensitivity, specificity and positive predictive value of 99.17%, 99.91% and 99.23% for NSRDB and 99.30%, 99.90% and 99.40% for MITDB also validates the proposed method. This result also indicates that the right feature extraction technique plays a vital role in determining the persistency of the classification accuracy for Cardioid based person identification mechanism.
NASA Astrophysics Data System (ADS)
Alshehhi, Rasha; Marpu, Prashanth Reddy
2017-04-01
Extraction of road networks in urban areas from remotely sensed imagery plays an important role in many urban applications (e.g. road navigation, geometric correction of urban remote sensing images, updating geographic information systems, etc.). It is normally difficult to accurately differentiate road from its background due to the complex geometry of the buildings and the acquisition geometry of the sensor. In this paper, we present a new method for extracting roads from high-resolution imagery based on hierarchical graph-based image segmentation. The proposed method consists of: 1. Extracting features (e.g., using Gabor and morphological filtering) to enhance the contrast between road and non-road pixels, 2. Graph-based segmentation consisting of (i) Constructing a graph representation of the image based on initial segmentation and (ii) Hierarchical merging and splitting of image segments based on color and shape features, and 3. Post-processing to remove irregularities in the extracted road segments. Experiments are conducted on three challenging datasets of high-resolution images to demonstrate the proposed method and compare with other similar approaches. The results demonstrate the validity and superior performance of the proposed method for road extraction in urban areas.
Man-Made Object Extraction from Remote Sensing Imagery by Graph-Based Manifold Ranking
NASA Astrophysics Data System (ADS)
He, Y.; Wang, X.; Hu, X. Y.; Liu, S. H.
2018-04-01
The automatic extraction of man-made objects from remote sensing imagery is useful in many applications. This paper proposes an algorithm for extracting man-made objects automatically by integrating a graph model with the manifold ranking algorithm. Initially, we estimate a priori value of the man-made objects with the use of symmetric and contrast features. The graph model is established to represent the spatial relationships among pre-segmented superpixels, which are used as the graph nodes. Multiple characteristics, namely colour, texture and main direction, are used to compute the weights of the adjacent nodes. Manifold ranking effectively explores the relationships among all the nodes in the feature space as well as initial query assignment; thus, it is applied to generate a ranking map, which indicates the scores of the man-made objects. The man-made objects are then segmented on the basis of the ranking map. Two typical segmentation algorithms are compared with the proposed algorithm. Experimental results show that the proposed algorithm can extract man-made objects with high recognition rate and low omission rate.
A Set of Handwriting Features for Use in Automated Writer Identification.
Miller, John J; Patterson, Robert Bradley; Gantz, Donald T; Saunders, Christopher P; Walch, Mark A; Buscaglia, JoAnn
2017-05-01
A writer's biometric identity can be characterized through the distribution of physical feature measurements ("writer's profile"); a graph-based system that facilitates the quantification of these features is described. To accomplish this quantification, handwriting is segmented into basic graphical forms ("graphemes"), which are "skeletonized" to yield the graphical topology of the handwritten segment. The graph-based matching algorithm compares the graphemes first by their graphical topology and then by their geometric features. Graphs derived from known writers can be compared against graphs extracted from unknown writings. The process is computationally intensive and relies heavily upon statistical pattern recognition algorithms. This article focuses on the quantification of these physical features and the construction of the associated pattern recognition methods for using the features to discriminate among writers. The graph-based system described in this article has been implemented in a highly accurate and approximately language-independent biometric recognition system of writers of cursive documents. © 2017 American Academy of Forensic Sciences.
Saund, Eric
2013-10-01
Effective object and scene classification and indexing depend on extraction of informative image features. This paper shows how large families of complex image features in the form of subgraphs can be built out of simpler ones through construction of a graph lattice—a hierarchy of related subgraphs linked in a lattice. Robustness is achieved by matching many overlapping and redundant subgraphs, which allows the use of inexpensive exact graph matching, instead of relying on expensive error-tolerant graph matching to a minimal set of ideal model graphs. Efficiency in exact matching is gained by exploitation of the graph lattice data structure. Additionally, the graph lattice enables methods for adaptively growing a feature space of subgraphs tailored to observed data. We develop the approach in the domain of rectilinear line art, specifically for the practical problem of document forms recognition. We are especially interested in methods that require only one or very few labeled training examples per category. We demonstrate two approaches to using the subgraph features for this purpose. Using a bag-of-words feature vector we achieve essentially single-instance learning on a benchmark forms database, following an unsupervised clustering stage. Further performance gains are achieved on a more difficult dataset using a feature voting method and feature selection procedure.
Graph reconstruction using covariance-based methods.
Sulaimanov, Nurgazy; Koeppl, Heinz
2016-12-01
Methods based on correlation and partial correlation are today employed in the reconstruction of a statistical interaction graph from high-throughput omics data. These dedicated methods work well even for the case when the number of variables exceeds the number of samples. In this study, we investigate how the graphs extracted from covariance and concentration matrix estimates are related by using Neumann series and transitive closure and through discussing concrete small examples. Considering the ideal case where the true graph is available, we also compare correlation and partial correlation methods for large realistic graphs. In particular, we perform the comparisons with optimally selected parameters based on the true underlying graph and with data-driven approaches where the parameters are directly estimated from the data.
Bakal, Gokhan; Talari, Preetham; Kakani, Elijah V; Kavuluru, Ramakanth
2018-06-01
Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since all candidate drugs cannot be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is also critical to understand biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach. To build high accuracy supervised predictive models to predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs. We used 7000 treats and 2918 causes hand-curated relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns connecting biomedical entities as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model relatively more realistic scenarios. Our models predict treats and causes relations with high F-scores of 99% and 90% respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. We are also able to predict some new plausible relations based on false positives that our models scored highly based on our collaborations with two physician co-authors. Finally, our decision tree models are able to retrieve over 50% of treatment relations from a recently created external dataset. We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence in direct prediction of biomedical relations based on graph features. Our work complements lexical pattern based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction. Copyright © 2018 Elsevier Inc. All rights reserved.
NOUS: Construction and Querying of Dynamic Knowledge Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Choudhury, Sutanay; Agarwal, Khushbu; Purohit, Sumit
The ability to construct domain specific knowledge graphs (KG) and perform question-answering or hypothesis generation is a trans- formative capability. Despite their value, automated construction of knowledge graphs remains an expensive technical challenge that is beyond the reach for most enterprises and academic institutions. We propose an end-to-end framework for developing custom knowl- edge graph driven analytics for arbitrary application domains. The uniqueness of our system lies A) in its combination of curated KGs along with knowledge extracted from unstructured text, B) support for advanced trending and explanatory questions on a dynamic KG, and C) the ability to answer queriesmore » where the answer is embedded across multiple data sources.« less
Zheng, Qiang; Warner, Steven; Tasian, Gregory; Fan, Yong
2018-02-12
Automatic segmentation of kidneys in ultrasound (US) images remains a challenging task because of high speckle noise, low contrast, and large appearance variations of kidneys in US images. Because texture features may improve the US image segmentation performance, we propose a novel graph cuts method to segment kidney in US images by integrating image intensity information and texture feature maps. We develop a new graph cuts-based method to segment kidney US images by integrating original image intensity information and texture feature maps extracted using Gabor filters. To handle large appearance variation within kidney images and improve computational efficiency, we build a graph of image pixels close to kidney boundary instead of building a graph of the whole image. To make the kidney segmentation robust to weak boundaries, we adopt localized regional information to measure similarity between image pixels for computing edge weights to build the graph of image pixels. The localized graph is dynamically updated and the graph cuts-based segmentation iteratively progresses until convergence. Our method has been evaluated based on kidney US images of 85 subjects. The imaging data of 20 randomly selected subjects were used as training data to tune parameters of the image segmentation method, and the remaining data were used as testing data for validation. Experiment results demonstrated that the proposed method obtained promising segmentation results for bilateral kidneys (average Dice index = 0.9446, average mean distance = 2.2551, average specificity = 0.9971, average accuracy = 0.9919), better than other methods under comparison (P < .05, paired Wilcoxon rank sum tests). The proposed method achieved promising performance for segmenting kidneys in two-dimensional US images, better than segmentation methods built on any single channel of image information. This method will facilitate extraction of kidney characteristics that may predict important clinical outcomes such as progression of chronic kidney disease. Copyright © 2018 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Graph representation of hepatic vessel based on centerline extraction and junction detection
NASA Astrophysics Data System (ADS)
Zhang, Xing; Tian, Jie; Deng, Kexin; Li, Xiuli; Yang, Fei
2012-02-01
In the area of computer-aided diagnosis (CAD), segmentation and analysis of hepatic vessel is a prerequisite for hepatic diseases diagnosis and surgery planning. For liver surgery planning, it is crucial to provide the surgeon with a patient-individual three-dimensional representation of the liver along with its vasculature and lesions. The representation allows an exploration of the vascular anatomy and the measurement of vessel diameters, following by intra-patient registration, as well as the analysis of the shape and volume of vascular territories. In this paper, we present an approach for generation of hepatic vessel graph based on centerline extraction and junction detection. The proposed approach involves the following concepts and methods: 1) Flux driven automatic centerline extraction; 2) Junction detection on the centerline using hollow sphere filtering; 3) Graph representation of hepatic vessel based on the centerline and junction. The approach is evaluated on contrast-enhanced liver CT datasets to demonstrate its availability and effectiveness.
Spectral-clustering approach to Lagrangian vortex detection.
Hadjighasem, Alireza; Karrasch, Daniel; Teramoto, Hiroshi; Haller, George
2016-06-01
One of the ubiquitous features of real-life turbulent flows is the existence and persistence of coherent vortices. Here we show that such coherent vortices can be extracted as clusters of Lagrangian trajectories. We carry out the clustering on a weighted graph, with the weights measuring pairwise distances of fluid trajectories in the extended phase space of positions and time. We then extract coherent vortices from the graph using tools from spectral graph theory. Our method locates all coherent vortices in the flow simultaneously, thereby showing high potential for automated vortex tracking. We illustrate the performance of this technique by identifying coherent Lagrangian vortices in several two- and three-dimensional flows.
G-Hash: Towards Fast Kernel-based Similarity Search in Large Graph Databases.
Wang, Xiaohong; Smalter, Aaron; Huan, Jun; Lushington, Gerald H
2009-01-01
Structured data including sets, sequences, trees and graphs, pose significant challenges to fundamental aspects of data management such as efficient storage, indexing, and similarity search. With the fast accumulation of graph databases, similarity search in graph databases has emerged as an important research topic. Graph similarity search has applications in a wide range of domains including cheminformatics, bioinformatics, sensor network management, social network management, and XML documents, among others.Most of the current graph indexing methods focus on subgraph query processing, i.e. determining the set of database graphs that contains the query graph and hence do not directly support similarity search. In data mining and machine learning, various graph kernel functions have been designed to capture the intrinsic similarity of graphs. Though successful in constructing accurate predictive and classification models for supervised learning, graph kernel functions have (i) high computational complexity and (ii) non-trivial difficulty to be indexed in a graph database.Our objective is to bridge graph kernel function and similarity search in graph databases by proposing (i) a novel kernel-based similarity measurement and (ii) an efficient indexing structure for graph data management. Our method of similarity measurement builds upon local features extracted from each node and their neighboring nodes in graphs. A hash table is utilized to support efficient storage and fast search of the extracted local features. Using the hash table, a graph kernel function is defined to capture the intrinsic similarity of graphs and for fast similarity query processing. We have implemented our method, which we have named G-hash, and have demonstrated its utility on large chemical graph databases. Our results show that the G-hash method achieves state-of-the-art performance for k-nearest neighbor (k-NN) classification. Most importantly, the new similarity measurement and the index structure is scalable to large database with smaller indexing size, faster indexing construction time, and faster query processing time as compared to state-of-the-art indexing methods such as C-tree, gIndex, and GraphGrep.
A wavelet-based approach for a continuous analysis of phonovibrograms.
Unger, Jakob; Meyer, Tobias; Doellinger, Michael; Hecker, Dietmar J; Schick, Bernhard; Lohscheller, Joerg
2012-01-01
Recently, endoscopic high-speed laryngoscopy has been established for commercial use and constitutes a state-of-the-art technique to examine vocal fold dynamics. Despite overcoming many limitations of commonly applied stroboscopy it has not gained widespread clinical application, yet. A major drawback is a missing methodology of extracting valuable features to support visual assessment or computer-aided diagnosis. In this paper a compact and descriptive feature set is presented. The feature extraction routines are based on two-dimensional color graphs called phonovibrograms (PVG). These graphs contain the full spatio-temporal pattern of vocal fold dynamics and are therefore suited to derive features that comprehensively describe the vibration pattern of vocal folds. Within our approach, clinically relevant features such as glottal closure type, symmetry and periodicity are quantified in a set of 10 descriptive features. The suitability for classification tasks is shown using a clinical data set comprising 50 healthy and 50 paralytic subjects. A classification accuracy of 93.2% has been achieved.
Exploiting graph kernels for high performance biomedical relation extraction.
Panyam, Nagesh C; Verspoor, Karin; Cohn, Trevor; Ramamohanarao, Kotagiri
2018-01-30
Relation extraction from biomedical publications is an important task in the area of semantic mining of text. Kernel methods for supervised relation extraction are often preferred over manual feature engineering methods, when classifying highly ordered structures such as trees and graphs obtained from syntactic parsing of a sentence. Tree kernels such as the Subset Tree Kernel and Partial Tree Kernel have been shown to be effective for classifying constituency parse trees and basic dependency parse graphs of a sentence. Graph kernels such as the All Path Graph kernel (APG) and Approximate Subgraph Matching (ASM) kernel have been shown to be suitable for classifying general graphs with cycles, such as the enhanced dependency parse graph of a sentence. In this work, we present a high performance Chemical-Induced Disease (CID) relation extraction system. We present a comparative study of kernel methods for the CID task and also extend our study to the Protein-Protein Interaction (PPI) extraction task, an important biomedical relation extraction task. We discuss novel modifications to the ASM kernel to boost its performance and a method to apply graph kernels for extracting relations expressed in multiple sentences. Our system for CID relation extraction attains an F-score of 60%, without using external knowledge sources or task specific heuristic or rules. In comparison, the state of the art Chemical-Disease Relation Extraction system achieves an F-score of 56% using an ensemble of multiple machine learning methods, which is then boosted to 61% with a rule based system employing task specific post processing rules. For the CID task, graph kernels outperform tree kernels substantially, and the best performance is obtained with APG kernel that attains an F-score of 60%, followed by the ASM kernel at 57%. The performance difference between the ASM and APG kernels for CID sentence level relation extraction is not significant. In our evaluation of ASM for the PPI task, ASM performed better than APG kernel for the BioInfer dataset, in the Area Under Curve (AUC) measure (74% vs 69%). However, for all the other PPI datasets, namely AIMed, HPRD50, IEPA and LLL, ASM is substantially outperformed by the APG kernel in F-score and AUC measures. We demonstrate a high performance Chemical Induced Disease relation extraction, without employing external knowledge sources or task specific heuristics. Our work shows that graph kernels are effective in extracting relations that are expressed in multiple sentences. We also show that the graph kernels, namely the ASM and APG kernels, substantially outperform the tree kernels. Among the graph kernels, we showed the ASM kernel as effective for biomedical relation extraction, with comparable performance to the APG kernel for datasets such as the CID-sentence level relation extraction and BioInfer in PPI. Overall, the APG kernel is shown to be significantly more accurate than the ASM kernel, achieving better performance on most datasets.
Visual texture perception via graph-based semi-supervised learning
NASA Astrophysics Data System (ADS)
Zhang, Qin; Dong, Junyu; Zhong, Guoqiang
2018-04-01
Perceptual features, for example direction, contrast and repetitiveness, are important visual factors for human to perceive a texture. However, it needs to perform psychophysical experiment to quantify these perceptual features' scale, which requires a large amount of human labor and time. This paper focuses on the task of obtaining perceptual features' scale of textures by small number of textures with perceptual scales through a rating psychophysical experiment (what we call labeled textures) and a mass of unlabeled textures. This is the scenario that the semi-supervised learning is naturally suitable for. This is meaningful for texture perception research, and really helpful for the perceptual texture database expansion. A graph-based semi-supervised learning method called random multi-graphs, RMG for short, is proposed to deal with this task. We evaluate different kinds of features including LBP, Gabor, and a kind of unsupervised deep features extracted by a PCA-based deep network. The experimental results show that our method can achieve satisfactory effects no matter what kind of texture features are used.
VISAGE: Interactive Visual Graph Querying.
Pienta, Robert; Navathe, Shamkant; Tamersoy, Acar; Tong, Hanghang; Endert, Alex; Chau, Duen Horng
2016-06-01
Extracting useful patterns from large network datasets has become a fundamental challenge in many domains. We present VISAGE, an interactive visual graph querying approach that empowers users to construct expressive queries, without writing complex code (e.g., finding money laundering rings of bankers and business owners). Our contributions are as follows: (1) we introduce graph autocomplete , an interactive approach that guides users to construct and refine queries, preventing over-specification; (2) VISAGE guides the construction of graph queries using a data-driven approach, enabling users to specify queries with varying levels of specificity, from concrete and detailed (e.g., query by example), to abstract (e.g., with "wildcard" nodes of any types), to purely structural matching; (3) a twelve-participant, within-subject user study demonstrates VISAGE's ease of use and the ability to construct graph queries significantly faster than using a conventional query language; (4) VISAGE works on real graphs with over 468K edges, achieving sub-second response times for common queries.
VISAGE: Interactive Visual Graph Querying
Pienta, Robert; Navathe, Shamkant; Tamersoy, Acar; Tong, Hanghang; Endert, Alex; Chau, Duen Horng
2017-01-01
Extracting useful patterns from large network datasets has become a fundamental challenge in many domains. We present VISAGE, an interactive visual graph querying approach that empowers users to construct expressive queries, without writing complex code (e.g., finding money laundering rings of bankers and business owners). Our contributions are as follows: (1) we introduce graph autocomplete, an interactive approach that guides users to construct and refine queries, preventing over-specification; (2) VISAGE guides the construction of graph queries using a data-driven approach, enabling users to specify queries with varying levels of specificity, from concrete and detailed (e.g., query by example), to abstract (e.g., with “wildcard” nodes of any types), to purely structural matching; (3) a twelve-participant, within-subject user study demonstrates VISAGE’s ease of use and the ability to construct graph queries significantly faster than using a conventional query language; (4) VISAGE works on real graphs with over 468K edges, achieving sub-second response times for common queries. PMID:28553670
Multilabel user classification using the community structure of online networks
Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user’s graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score. PMID:28278242
Multilabel user classification using the community structure of online networks.
Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis
2017-01-01
We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user's graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score.
Multiple kernels learning-based biological entity relationship extraction method.
Dongliang, Xu; Jingchang, Pan; Bailing, Wang
2017-09-20
Automatic extracting protein entity interaction information from biomedical literature can help to build protein relation network and design new drugs. There are more than 20 million literature abstracts included in MEDLINE, which is the most authoritative textual database in the field of biomedicine, and follow an exponential growth over time. This frantic expansion of the biomedical literature can often be difficult to absorb or manually analyze. Thus efficient and automated search engines are necessary to efficiently explore the biomedical literature using text mining techniques. The P, R, and F value of tag graph method in Aimed corpus are 50.82, 69.76, and 58.61%, respectively. The P, R, and F value of tag graph kernel method in other four evaluation corpuses are 2-5% higher than that of all-paths graph kernel. And The P, R and F value of feature kernel and tag graph kernel fuse methods is 53.43, 71.62 and 61.30%, respectively. The P, R and F value of feature kernel and tag graph kernel fuse methods is 55.47, 70.29 and 60.37%, respectively. It indicated that the performance of the two kinds of kernel fusion methods is better than that of simple kernel. In comparison with the all-paths graph kernel method, the tag graph kernel method is superior in terms of overall performance. Experiments show that the performance of the multi-kernels method is better than that of the three separate single-kernel method and the dual-mutually fused kernel method used hereof in five corpus sets.
Automatic building extraction from LiDAR data fusion of point and grid-based features
NASA Astrophysics Data System (ADS)
Du, Shouji; Zhang, Yunsheng; Zou, Zhengrong; Xu, Shenghua; He, Xue; Chen, Siyang
2017-08-01
This paper proposes a method for extracting buildings from LiDAR point cloud data by combining point-based and grid-based features. To accurately discriminate buildings from vegetation, a point feature based on the variance of normal vectors is proposed. For a robust building extraction, a graph cuts algorithm is employed to combine the used features and consider the neighbor contexture information. As grid feature computing and a graph cuts algorithm are performed on a grid structure, a feature-retained DSM interpolation method is proposed in this paper. The proposed method is validated by the benchmark ISPRS Test Project on Urban Classification and 3D Building Reconstruction and compared to the state-art-of-the methods. The evaluation shows that the proposed method can obtain a promising result both at area-level and at object-level. The method is further applied to the entire ISPRS dataset and to a real dataset of the Wuhan City. The results show a completeness of 94.9% and a correctness of 92.2% at the per-area level for the former dataset and a completeness of 94.4% and a correctness of 95.8% for the latter one. The proposed method has a good potential for large-size LiDAR data.
FacetGist: Collective Extraction of Document Facets in Large Technical Corpora.
Siddiqui, Tarique; Ren, Xiang; Parameswaran, Aditya; Han, Jiawei
2016-10-01
Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents to be able to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets ( e.g. , application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search and business intelligence. The major challenge in performing Facet Extraction arises from multiple sources: concept extraction, concept to facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem, and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes.
FacetGist: Collective Extraction of Document Facets in Large Technical Corpora
Siddiqui, Tarique; Ren, Xiang; Parameswaran, Aditya; Han, Jiawei
2017-01-01
Given the large volume of technical documents available, it is crucial to automatically organize and categorize these documents to be able to understand and extract value from them. Towards this end, we introduce a new research problem called Facet Extraction. Given a collection of technical documents, the goal of Facet Extraction is to automatically label each document with a set of concepts for the key facets (e.g., application, technique, evaluation metrics, and dataset) that people may be interested in. Facet Extraction has numerous applications, including document summarization, literature search, patent search and business intelligence. The major challenge in performing Facet Extraction arises from multiple sources: concept extraction, concept to facet matching, and facet disambiguation. To tackle these challenges, we develop FacetGist, a framework for facet extraction. Facet Extraction involves constructing a graph-based heterogeneous network to capture information available across multiple local sentence-level features, as well as global context features. We then formulate a joint optimization problem, and propose an efficient algorithm for graph-based label propagation to estimate the facet of each concept mention. Experimental results on technical corpora from two domains demonstrate that Facet Extraction can lead to an improvement of over 25% in both precision and recall over competing schemes. PMID:28210517
Usability-driven pruning of large ontologies: the case of SNOMED CT.
López-García, Pablo; Boeker, Martin; Illarramendi, Arantza; Schulz, Stefan
2012-06-01
To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. Graph-traversal heuristics provided high coverage (71-96% of terms in the test sets of discharge summaries) at the expense of subset size (17-51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24-55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available.
Identifying patients with Alzheimer's disease using resting-state fMRI and graph theory.
Khazaee, Ali; Ebrahimzadeh, Ata; Babajani-Feremi, Abbas
2015-11-01
Study of brain network on the basis of resting-state functional magnetic resonance imaging (fMRI) has provided promising results to investigate changes in connectivity among different brain regions because of diseases. Graph theory can efficiently characterize different aspects of the brain network by calculating measures of integration and segregation. In this study, we combine graph theoretical approaches with advanced machine learning methods to study functional brain network alteration in patients with Alzheimer's disease (AD). Support vector machine (SVM) was used to explore the ability of graph measures in diagnosis of AD. We applied our method on the resting-state fMRI data of twenty patients with AD and twenty age and gender matched healthy subjects. The data were preprocessed and each subject's graph was constructed by parcellation of the whole brain into 90 distinct regions using the automated anatomical labeling (AAL) atlas. The graph measures were then calculated and used as the discriminating features. Extracted network-based features were fed to different feature selection algorithms to choose most significant features. In addition to the machine learning approach, statistical analysis was performed on connectivity matrices to find altered connectivity patterns in patients with AD. Using the selected features, we were able to accurately classify patients with AD from healthy subjects with accuracy of 100%. Results of this study show that pattern recognition and graph of brain network, on the basis of the resting state fMRI data, can efficiently assist in the diagnosis of AD. Classification based on the resting-state fMRI can be used as a non-invasive and automatic tool to diagnosis of Alzheimer's disease. Copyright © 2015 International Federation of Clinical Neurophysiology. All rights reserved.
A graph algebra for scalable visual analytics.
Shaverdian, Anna A; Zhou, Hao; Michailidis, George; Jagadish, Hosagrahar V
2012-01-01
Visual analytics (VA), which combines analytical techniques with advanced visualization features, is fast becoming a standard tool for extracting information from graph data. Researchers have developed many tools for this purpose, suggesting a need for formal methods to guide these tools' creation. Increased data demands on computing requires redesigning VA tools to consider performance and reliability in the context of analysis of exascale datasets. Furthermore, visual analysts need a way to document their analyses for reuse and results justification. A VA graph framework encapsulated in a graph algebra helps address these needs. Its atomic operators include selection and aggregation. The framework employs a visual operator and supports dynamic attributes of data to enable scalable visual exploration of data.
Analyzing cross-college course enrollments via contextual graph mining
Liu, Xiaozhong; Chen, Yan
2017-01-01
The ability to predict what courses a student may enroll in the coming semester plays a pivotal role in the allocation of learning resources, which is a hot topic in the domain of educational data mining. In this study, we propose an innovative approach to characterize students’ cross-college course enrollments by leveraging a novel contextual graph. Specifically, different kinds of variables, such as students, courses, colleges and diplomas, as well as various types of variable relations, are utilized to depict the context of each variable, and then a representation learning algorithm node2vec is applied to extracting sophisticated graph-based features for the enrollment analysis. In this manner, the relations between any pair of variables can be measured quantitatively, which enables the variable type to transform from nominal to ratio. These graph-based features are examined by the random forest algorithm, and experiments on 24,663 students, 1,674 courses and 417,590 enrollment records demonstrate that the contextual graph can successfully improve analyzing the cross-college course enrollments, where three of the graph-based features have significantly stronger impacts on prediction accuracy than the others. Besides, the empirical results also indicate that the student’s course preference is the most important factor in predicting future course enrollments, which is consistent to the previous studies that acknowledge the course interest is a key point for course recommendations. PMID:29186171
Analyzing cross-college course enrollments via contextual graph mining.
Wang, Yongzhen; Liu, Xiaozhong; Chen, Yan
2017-01-01
The ability to predict what courses a student may enroll in the coming semester plays a pivotal role in the allocation of learning resources, which is a hot topic in the domain of educational data mining. In this study, we propose an innovative approach to characterize students' cross-college course enrollments by leveraging a novel contextual graph. Specifically, different kinds of variables, such as students, courses, colleges and diplomas, as well as various types of variable relations, are utilized to depict the context of each variable, and then a representation learning algorithm node2vec is applied to extracting sophisticated graph-based features for the enrollment analysis. In this manner, the relations between any pair of variables can be measured quantitatively, which enables the variable type to transform from nominal to ratio. These graph-based features are examined by the random forest algorithm, and experiments on 24,663 students, 1,674 courses and 417,590 enrollment records demonstrate that the contextual graph can successfully improve analyzing the cross-college course enrollments, where three of the graph-based features have significantly stronger impacts on prediction accuracy than the others. Besides, the empirical results also indicate that the student's course preference is the most important factor in predicting future course enrollments, which is consistent to the previous studies that acknowledge the course interest is a key point for course recommendations.
1991-12-01
9 2.6.1 Multi-Shape Detection. .. .. .. .. .. .. ...... 9 Page 2.6.2 Line Segment Extraction and Re-Combination.. 9 2.6.3 Planimetric Feature... Extraction ............... 10 2.6.4 Line Segment Extraction From Statistical Texture Analysis .............................. 11 2.6.5 Edge Following as Graph...image after image, could benefit clue to the fact that major spatial characteristics of subregions could be extracted , and minor spatial changes could be
Iterative cross section sequence graph for handwritten character segmentation.
Dawoud, Amer
2007-08-01
The iterative cross section sequence graph (ICSSG) is an algorithm for handwritten character segmentation. It expands the cross section sequence graph concept by applying it iteratively at equally spaced thresholds. The iterative thresholding reduces the effect of information loss associated with image binarization. ICSSG preserves the characters' skeletal structure by preventing the interference of pixels that causes flooding of adjacent characters' segments. Improving the structural quality of the characters' skeleton facilitates better feature extraction and classification, which improves the overall performance of optical character recognition (OCR). Experimental results showed significant improvements in OCR recognition rates compared to other well-established segmentation algorithms.
Usability-driven pruning of large ontologies: the case of SNOMED CT
Boeker, Martin; Illarramendi, Arantza; Schulz, Stefan
2012-01-01
Objectives To study ontology modularization techniques when applied to SNOMED CT in a scenario in which no previous corpus of information exists and to examine if frequency-based filtering using MEDLINE can reduce subset size without discarding relevant concepts. Materials and Methods Subsets were first extracted using four graph-traversal heuristics and one logic-based technique, and were subsequently filtered with frequency information from MEDLINE. Twenty manually coded discharge summaries from cardiology patients were used as signatures and test sets. The coverage, size, and precision of extracted subsets were measured. Results Graph-traversal heuristics provided high coverage (71–96% of terms in the test sets of discharge summaries) at the expense of subset size (17–51% of the size of SNOMED CT). Pre-computed subsets and logic-based techniques extracted small subsets (1%), but coverage was limited (24–55%). Filtering reduced the size of large subsets to 10% while still providing 80% coverage. Discussion Extracting subsets to annotate discharge summaries is challenging when no previous corpus exists. Ontology modularization provides valuable techniques, but the resulting modules grow as signatures spread across subhierarchies, yielding a very low precision. Conclusion Graph-traversal strategies and frequency data from an authoritative source can prune large biomedical ontologies and produce useful subsets that still exhibit acceptable coverage. However, a clinical corpus closer to the specific use case is preferred when available. PMID:22268217
Dimitriadis, Stavros I.; Salis, Christos; Tarnanas, Ioannis; Linden, David E.
2017-01-01
The human brain is a large-scale system of functionally connected brain regions. This system can be modeled as a network, or graph, by dividing the brain into a set of regions, or “nodes,” and quantifying the strength of the connections between nodes, or “edges,” as the temporal correlation in their patterns of activity. Network analysis, a part of graph theory, provides a set of summary statistics that can be used to describe complex brain networks in a meaningful way. The large-scale organization of the brain has features of complex networks that can be quantified using network measures from graph theory. The adaptation of both bivariate (mutual information) and multivariate (Granger causality) connectivity estimators to quantify the synchronization between multichannel recordings yields a fully connected, weighted, (a)symmetric functional connectivity graph (FCG), representing the associations among all brain areas. The aforementioned procedure leads to an extremely dense network of tens up to a few hundreds of weights. Therefore, this FCG must be filtered out so that the “true” connectivity pattern can emerge. Here, we compared a large number of well-known topological thresholding techniques with the novel proposed data-driven scheme based on orthogonal minimal spanning trees (OMSTs). OMSTs filter brain connectivity networks based on the optimization between the global efficiency of the network and the cost preserving its wiring. We demonstrated the proposed method in a large EEG database (N = 101 subjects) with eyes-open (EO) and eyes-closed (EC) tasks by adopting a time-varying approach with the main goal to extract features that can totally distinguish each subject from the rest of the set. Additionally, the reliability of the proposed scheme was estimated in a second case study of fMRI resting-state activity with multiple scans. Our results demonstrated clearly that the proposed thresholding scheme outperformed a large list of thresholding schemes based on the recognition accuracy of each subject compared to the rest of the cohort (EEG). Additionally, the reliability of the network metrics based on the fMRI static networks was improved based on the proposed topological filtering scheme. Overall, the proposed algorithm could be used across neuroimaging and multimodal studies as a common computationally efficient standardized tool for a great number of neuroscientists and physicists working on numerous of projects. PMID:28491032
[A graph cuts-based interactive method for segmentation of magnetic resonance images of meningioma].
Li, Shuan-qiang; Feng, Qian-jin; Chen, Wu-fan; Lin, Ya-zhong
2011-06-01
For accurate segmentation of the magnetic resonance (MR) images of meningioma, we propose a novel interactive segmentation method based on graph cuts. The high dimensional image features was extracted, and for each pixel, the probabilities of its origin, either the tumor or the background regions, were estimated by exploiting the weighted K-nearest neighborhood classifier. Based on these probabilities, a new energy function was proposed. Finally, a graph cut optimal framework was used for the solution of the energy function. The proposed method was evaluated by application in the segmentation of MR images of meningioma, and the results showed that the method significantly improved the segmentation accuracy compared with the gray level information-based graph cut method.
Butler, William E.; Atai, Nadia; Carter, Bob; Hochberg, Fred
2014-01-01
The Richard Floor Biorepository supports collaborative studies of extracellular vesicles (EVs) found in human fluids and tissue specimens. The current emphasis is on biomarkers for central nervous system neoplasms but its structure may serve as a template for collaborative EV translational studies in other fields. The informatic system provides specimen inventory tracking with bar codes assigned to specimens and containers and projects, is hosted on globalized cloud computing resources, and embeds a suite of shared documents, calendars, and video-conferencing features. Clinical data are recorded in relation to molecular EV attributes and may be tagged with terms drawn from a network of externally maintained ontologies thus offering expansion of the system as the field matures. We fashioned the graphical user interface (GUI) around a web-based data visualization package. This system is now in an early stage of deployment, mainly focused on specimen tracking and clinical, laboratory, and imaging data capture in support of studies to optimize detection and analysis of brain tumour–specific mutations. It currently includes 4,392 specimens drawn from 611 subjects, the majority with brain tumours. As EV science evolves, we plan biorepository changes which may reflect multi-institutional collaborations, proteomic interfaces, additional biofluids, changes in operating procedures and kits for specimen handling, novel procedures for detection of tumour-specific EVs, and for RNA extraction and changes in the taxonomy of EVs. We have used an ontology-driven data model and web-based architecture with a graph theory–driven GUI to accommodate and stimulate the semantic web of EV science. PMID:25317275
Butler, William E; Atai, Nadia; Carter, Bob; Hochberg, Fred
2014-01-01
The Richard Floor Biorepository supports collaborative studies of extracellular vesicles (EVs) found in human fluids and tissue specimens. The current emphasis is on biomarkers for central nervous system neoplasms but its structure may serve as a template for collaborative EV translational studies in other fields. The informatic system provides specimen inventory tracking with bar codes assigned to specimens and containers and projects, is hosted on globalized cloud computing resources, and embeds a suite of shared documents, calendars, and video-conferencing features. Clinical data are recorded in relation to molecular EV attributes and may be tagged with terms drawn from a network of externally maintained ontologies thus offering expansion of the system as the field matures. We fashioned the graphical user interface (GUI) around a web-based data visualization package. This system is now in an early stage of deployment, mainly focused on specimen tracking and clinical, laboratory, and imaging data capture in support of studies to optimize detection and analysis of brain tumour-specific mutations. It currently includes 4,392 specimens drawn from 611 subjects, the majority with brain tumours. As EV science evolves, we plan biorepository changes which may reflect multi-institutional collaborations, proteomic interfaces, additional biofluids, changes in operating procedures and kits for specimen handling, novel procedures for detection of tumour-specific EVs, and for RNA extraction and changes in the taxonomy of EVs. We have used an ontology-driven data model and web-based architecture with a graph theory-driven GUI to accommodate and stimulate the semantic web of EV science.
Graph-based layout analysis for PDF documents
NASA Astrophysics Data System (ADS)
Xu, Canhui; Tang, Zhi; Tao, Xin; Li, Yun; Shi, Cao
2013-03-01
To increase the flexibility and enrich the reading experience of e-book on small portable screens, a graph based method is proposed to perform layout analysis on Portable Document Format (PDF) documents. Digital born document has its inherent advantages like representing texts and fractional images in explicit form, which can be straightforwardly exploited. To integrate traditional image-based document analysis and the inherent meta-data provided by PDF parser, the page primitives including text, image and path elements are processed to produce text and non text layer for respective analysis. Graph-based method is developed in superpixel representation level, and page text elements corresponding to vertices are used to construct an undirected graph. Euclidean distance between adjacent vertices is applied in a top-down manner to cut the graph tree formed by Kruskal's algorithm. And edge orientation is then used in a bottom-up manner to extract text lines from each sub tree. On the other hand, non-textual objects are segmented by connected component analysis. For each segmented text and non-text composite, a 13-dimensional feature vector is extracted for labelling purpose. The experimental results on selected pages from PDF books are presented.
Solutions for Coding Societal Events
2016-12-01
develop a prototype system for civil unrest event extraction, and (3) engineer BBN ACCENT (ACCurate Events from Natural Text ) to support broad use by...56 iv List of Tables Table 1: Features in similarity metric. Abbreviations are as follows. TG: text graph...extraction of a stream of events (e.g. protests, attacks, etc.) from unstructured text (e.g. news, social media). This technical report presents results
Song, Youyi; Zhang, Ling; Chen, Siping; Ni, Dong; Lei, Baiying; Wang, Tianfu
2015-10-01
In this paper, a multiscale convolutional network (MSCN) and graph-partitioning-based method is proposed for accurate segmentation of cervical cytoplasm and nuclei. Specifically, deep learning via the MSCN is explored to extract scale invariant features, and then, segment regions centered at each pixel. The coarse segmentation is refined by an automated graph partitioning method based on the pretrained feature. The texture, shape, and contextual information of the target objects are learned to localize the appearance of distinctive boundary, which is also explored to generate markers to split the touching nuclei. For further refinement of the segmentation, a coarse-to-fine nucleus segmentation framework is developed. The computational complexity of the segmentation is reduced by using superpixel instead of raw pixels. Extensive experimental results demonstrate that the proposed cervical nucleus cell segmentation delivers promising results and outperforms existing methods.
An image understanding system using attributed symbolic representation and inexact graph-matching
NASA Astrophysics Data System (ADS)
Eshera, M. A.; Fu, K.-S.
1986-09-01
A powerful image understanding system using a semantic-syntactic representation scheme consisting of attributed relational graphs (ARGs) is proposed for the analysis of the global information content of images. A multilayer graph transducer scheme performs the extraction of ARG representations from images, with ARG nodes representing the global image features, and the relations between features represented by the attributed branches between corresponding nodes. An efficient dynamic programming technique is employed to derive the distance between two ARGs and the inexact matching of their respective components. Noise, distortion and ambiguity in real-world images are handled through modeling in the transducer mapping rules and through the appropriate cost of error-transformation for the inexact matching of the representation. The system is demonstrated for the case of locating objects in a scene composed of complex overlapped objects, and the case of target detection in noisy and distorted synthetic aperture radar image.
Graph run-length matrices for histopathological image segmentation.
Tosun, Akif Burak; Gunduz-Demir, Cigdem
2011-03-01
The histopathological examination of tissue specimens is essential for cancer diagnosis and grading. However, this examination is subject to a considerable amount of observer variability as it mainly relies on visual interpretation of pathologists. To alleviate this problem, it is very important to develop computational quantitative tools, for which image segmentation constitutes the core step. In this paper, we introduce an effective and robust algorithm for the segmentation of histopathological tissue images. This algorithm incorporates the background knowledge of the tissue organization into segmentation. For this purpose, it quantifies spatial relations of cytological tissue components by constructing a graph and uses this graph to define new texture features for image segmentation. This new texture definition makes use of the idea of gray-level run-length matrices. However, it considers the runs of cytological components on a graph to form a matrix, instead of considering the runs of pixel intensities. Working with colon tissue images, our experiments demonstrate that the texture features extracted from "graph run-length matrices" lead to high segmentation accuracies, also providing a reasonable number of segmented regions. Compared with four other segmentation algorithms, the results show that the proposed algorithm is more effective in histopathological image segmentation.
Extraction of object skeletons in multispectral imagery by the orthogonal regression fitting
NASA Astrophysics Data System (ADS)
Palenichka, Roman M.; Zaremba, Marek B.
2003-03-01
Accurate and automatic extraction of skeletal shape of objects of interest from satellite images provides an efficient solution to such image analysis tasks as object detection, object identification, and shape description. The problem of skeletal shape extraction can be effectively solved in three basic steps: intensity clustering (i.e. segmentation) of objects, extraction of a structural graph of the object shape, and refinement of structural graph by the orthogonal regression fitting. The objects of interest are segmented from the background by a clustering transformation of primary features (spectral components) with respect to each pixel. The structural graph is composed of connected skeleton vertices and represents the topology of the skeleton. In the general case, it is a quite rough piecewise-linear representation of object skeletons. The positions of skeleton vertices on the image plane are adjusted by means of the orthogonal regression fitting. It consists of changing positions of existing vertices according to the minimum of the mean orthogonal distances and, eventually, adding new vertices in-between if a given accuracy if not yet satisfied. Vertices of initial piecewise-linear skeletons are extracted by using a multi-scale image relevance function. The relevance function is an image local operator that has local maximums at the centers of the objects of interest.
RelFinder: Revealing Relationships in RDF Knowledge Bases
NASA Astrophysics Data System (ADS)
Heim, Philipp; Hellmann, Sebastian; Lehmann, Jens; Lohmann, Steffen; Stegemann, Timo
The Semantic Web has recently seen a rise of large knowledge bases (such as DBpedia) that are freely accessible via SPARQL endpoints. The structured representation of the contained information opens up new possibilities in the way it can be accessed and queried. In this paper, we present an approach that extracts a graph covering relationships between two objects of interest. We show an interactive visualization of this graph that supports the systematic analysis of the found relationships by providing highlighting, previewing, and filtering features.
Safeguarding End-User Military Software
2014-12-04
product lines using composi- tional symbolic execution [17] Software product lines are families of products defined by feature commonality and vari...ability, with a well-managed asset base. Recent work in testing of software product lines has exploited similarities across development phases to reuse...feature dependence graph to extract the set of possible interaction trees in a product family. It composes these to incrementally and symbolically
Benchmarking natural-language parsers for biological applications using dependency graphs.
Clegg, Andrew B; Shepherd, Adrian J
2007-01-25
Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques.
Benchmarking natural-language parsers for biological applications using dependency graphs
Clegg, Andrew B; Shepherd, Adrian J
2007-01-01
Background Interest is growing in the application of syntactic parsers to natural language processing problems in biology, but assessing their performance is difficult because differences in linguistic convention can falsely appear to be errors. We present a method for evaluating their accuracy using an intermediate representation based on dependency graphs, in which the semantic relationships important in most information extraction tasks are closer to the surface. We also demonstrate how this method can be easily tailored to various application-driven criteria. Results Using the GENIA corpus as a gold standard, we tested four open-source parsers which have been used in bioinformatics projects. We first present overall performance measures, and test the two leading tools, the Charniak-Lease and Bikel parsers, on subtasks tailored to reflect the requirements of a system for extracting gene expression relationships. These two tools clearly outperform the other parsers in the evaluation, and achieve accuracy levels comparable to or exceeding native dependency parsers on similar tasks in previous biological evaluations. Conclusion Evaluating using dependency graphs allows parsers to be tested easily on criteria chosen according to the semantics of particular biological applications, drawing attention to important mistakes and soaking up many insignificant differences that would otherwise be reported as errors. Generating high-accuracy dependency graphs from the output of phrase-structure parsers also provides access to the more detailed syntax trees that are used in several natural-language processing techniques. PMID:17254351
NEFI: Network Extraction From Images
Dirnberger, M.; Kehl, T.; Neumann, A.
2015-01-01
Networks are amongst the central building blocks of many systems. Given a graph of a network, methods from graph theory enable a precise investigation of its properties. Software for the analysis of graphs is widely available and has been applied to study various types of networks. In some applications, graph acquisition is relatively simple. However, for many networks data collection relies on images where graph extraction requires domain-specific solutions. Here we introduce NEFI, a tool that extracts graphs from images of networks originating in various domains. Regarding previous work on graph extraction, theoretical results are fully accessible only to an expert audience and ready-to-use implementations for non-experts are rarely available or insufficiently documented. NEFI provides a novel platform allowing practitioners to easily extract graphs from images by combining basic tools from image processing, computer vision and graph theory. Thus, NEFI constitutes an alternative to tedious manual graph extraction and special purpose tools. We anticipate NEFI to enable time-efficient collection of large datasets. The analysis of these novel datasets may open up the possibility to gain new insights into the structure and function of various networks. NEFI is open source and available at http://nefi.mpi-inf.mpg.de. PMID:26521675
Recognition of emotions using multimodal physiological signals and an ensemble deep learning model.
Yin, Zhong; Zhao, Mengyuan; Wang, Yongxiong; Yang, Jingdong; Zhang, Jianhua
2017-03-01
Using deep-learning methodologies to analyze multimodal physiological signals becomes increasingly attractive for recognizing human emotions. However, the conventional deep emotion classifiers may suffer from the drawback of the lack of the expertise for determining model structure and the oversimplification of combining multimodal feature abstractions. In this study, a multiple-fusion-layer based ensemble classifier of stacked autoencoder (MESAE) is proposed for recognizing emotions, in which the deep structure is identified based on a physiological-data-driven approach. Each SAE consists of three hidden layers to filter the unwanted noise in the physiological features and derives the stable feature representations. An additional deep model is used to achieve the SAE ensembles. The physiological features are split into several subsets according to different feature extraction approaches with each subset separately encoded by a SAE. The derived SAE abstractions are combined according to the physiological modality to create six sets of encodings, which are then fed to a three-layer, adjacent-graph-based network for feature fusion. The fused features are used to recognize binary arousal or valence states. DEAP multimodal database was employed to validate the performance of the MESAE. By comparing with the best existing emotion classifier, the mean of classification rate and F-score improves by 5.26%. The superiority of the MESAE against the state-of-the-art shallow and deep emotion classifiers has been demonstrated under different sizes of the available physiological instances. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Novel Spectral Representations and Sparsity-Driven Algorithms for Shape Modeling and Analysis
NASA Astrophysics Data System (ADS)
Zhong, Ming
In this dissertation, we focus on extending classical spectral shape analysis by incorporating spectral graph wavelets and sparsity-seeking algorithms. Defined with the graph Laplacian eigenbasis, the spectral graph wavelets are localized both in the vertex domain and graph spectral domain, and thus are very effective in describing local geometry. With a rich dictionary of elementary vectors and forcing certain sparsity constraints, a real life signal can often be well approximated by a very sparse coefficient representation. The many successful applications of sparse signal representation in computer vision and image processing inspire us to explore the idea of employing sparse modeling techniques with dictionary of spectral basis to solve various shape modeling problems. Conventional spectral mesh compression uses the eigenfunctions of mesh Laplacian as shape bases, which are highly inefficient in representing local geometry. To ameliorate, we advocate an innovative approach to 3D mesh compression using spectral graph wavelets as dictionary to encode mesh geometry. The spectral graph wavelets are locally defined at individual vertices and can better capture local shape information than Laplacian eigenbasis. The multi-scale SGWs form a redundant dictionary as shape basis, so we formulate the compression of 3D shape as a sparse approximation problem that can be readily handled by greedy pursuit algorithms. Surface inpainting refers to the completion or recovery of missing shape geometry based on the shape information that is currently available. We devise a new surface inpainting algorithm founded upon the theory and techniques of sparse signal recovery. Instead of estimating the missing geometry directly, our novel method is to find this low-dimensional representation which describes the entire original shape. More specifically, we find that, for many shapes, the vertex coordinate function can be well approximated by a very sparse coefficient representation with respect to the dictionary comprising its Laplacian eigenbasis, and it is then possible to recover this sparse representation from partial measurements of the original shape. Taking advantage of the sparsity cue, we advocate a novel variational approach for surface inpainting, integrating data fidelity constraints on the shape domain with coefficient sparsity constraints on the transformed domain. Because of the powerful properties of Laplacian eigenbasis, the inpainting results of our method tend to be globally coherent with the remaining shape. Informative and discriminative feature descriptors are vital in qualitative and quantitative shape analysis for a large variety of graphics applications. We advocate novel strategies to define generalized, user-specified features on shapes. Our new region descriptors are primarily built upon the coefficients of spectral graph wavelets that are both multi-scale and multi-level in nature, consisting of both local and global information. Based on our novel spectral feature descriptor, we developed a user-specified feature detection framework and a tensor-based shape matching algorithm. Through various experiments, we demonstrate the competitive performance of our proposed methods and the great potential of spectral basis and sparsity-driven methods for shape modeling.
EEG analysis of seizure patterns using visibility graphs for detection of generalized seizures.
Wang, Lei; Long, Xi; Arends, Johan B A M; Aarts, Ronald M
2017-10-01
The traditional EEG features in the time and frequency domain show limited seizure detection performance in the epileptic population with intellectual disability (ID). In addition, the influence of EEG seizure patterns on detection performance was less studied. A single-channel EEG signal can be mapped into visibility graphs (VGS), including basic visibility graph (VG), horizontal VG (HVG), and difference VG (DVG). These graphs were used to characterize different EEG seizure patterns. To demonstrate its effectiveness in identifying EEG seizure patterns and detecting generalized seizures, EEG recordings of 615h on one EEG channel from 29 epileptic patients with ID were analyzed. A novel feature set with discriminative power for seizure detection was obtained by using the VGS method. The degree distributions (DDs) of DVG can clearly distinguish EEG of each seizure pattern. The degree entropy and power-law degree power in DVG were proposed here for the first time, and they show significant difference between seizure and non-seizure EEG. The connecting structure measured by HVG can better distinguish seizure EEG from background than those by VG and DVG. A traditional EEG feature set based on frequency analysis was used here as a benchmark feature set. With a support vector machine (SVM) classifier, the seizure detection performance of the benchmark feature set (sensitivity of 24%, FD t /h of 1.8s) can be improved by combining our proposed VGS features extracted from one EEG channel (sensitivity of 38%, FD t /h of 1.4s). The proposed VGS-based features can help improve seizure detection for ID patients. Copyright © 2017 Elsevier B.V. All rights reserved.
EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity.
Diykh, Mohammed; Li, Yan; Wen, Peng
2016-11-01
The electroencephalogram (EEG) signals are commonly used in diagnosing and treating sleep disorders. Many existing methods for sleep stages classification mainly depend on the analysis of EEG signals in time or frequency domain to obtain a high classification accuracy. In this paper, the statistical features in time domain, the structural graph similarity and the K-means (SGSKM) are combined to identify six sleep stages using single channel EEG signals. Firstly, each EEG segment is partitioned into sub-segments. The size of a sub-segment is determined empirically. Secondly, statistical features are extracted, sorted into different sets of features and forwarded to the SGSKM to classify EEG sleep stages. We have also investigated the relationships between sleep stages and the time domain features of the EEG data used in this paper. The experimental results show that the proposed method yields better classification results than other four existing methods and the support vector machine (SVM) classifier. A 95.93% average classification accuracy is achieved by using the proposed method.
Figure-Ground Segmentation Using Factor Graphs
Shen, Huiying; Coughlan, James; Ivanchenko, Volodymyr
2009-01-01
Foreground-background segmentation has recently been applied [26,12] to the detection and segmentation of specific objects or structures of interest from the background as an efficient alternative to techniques such as deformable templates [27]. We introduce a graphical model (i.e. Markov random field)-based formulation of structure-specific figure-ground segmentation based on simple geometric features extracted from an image, such as local configurations of linear features, that are characteristic of the desired figure structure. Our formulation is novel in that it is based on factor graphs, which are graphical models that encode interactions among arbitrary numbers of random variables. The ability of factor graphs to express interactions higher than pairwise order (the highest order encountered in most graphical models used in computer vision) is useful for modeling a variety of pattern recognition problems. In particular, we show how this property makes factor graphs a natural framework for performing grouping and segmentation, and demonstrate that the factor graph framework emerges naturally from a simple maximum entropy model of figure-ground segmentation. We cast our approach in a learning framework, in which the contributions of multiple grouping cues are learned from training data, and apply our framework to the problem of finding printed text in natural scenes. Experimental results are described, including a performance analysis that demonstrates the feasibility of the approach. PMID:20160994
Robust Joint Graph Sparse Coding for Unsupervised Spectral Feature Selection.
Zhu, Xiaofeng; Li, Xuelong; Zhang, Shichao; Ju, Chunhua; Wu, Xindong
2017-06-01
In this paper, we propose a new unsupervised spectral feature selection model by embedding a graph regularizer into the framework of joint sparse regression for preserving the local structures of data. To do this, we first extract the bases of training data by previous dictionary learning methods and, then, map original data into the basis space to generate their new representations, by proposing a novel joint graph sparse coding (JGSC) model. In JGSC, we first formulate its objective function by simultaneously taking subspace learning and joint sparse regression into account, then, design a new optimization solution to solve the resulting objective function, and further prove the convergence of the proposed solution. Furthermore, we extend JGSC to a robust JGSC (RJGSC) via replacing the least square loss function with a robust loss function, for achieving the same goals and also avoiding the impact of outliers. Finally, experimental results on real data sets showed that both JGSC and RJGSC outperformed the state-of-the-art algorithms in terms of k -nearest neighbor classification performance.
Fang, Leyuan; Cunefare, David; Wang, Chong; Guymer, Robyn H.; Li, Shutao; Farsiu, Sina
2017-01-01
We present a novel framework combining convolutional neural networks (CNN) and graph search methods (termed as CNN-GS) for the automatic segmentation of nine layer boundaries on retinal optical coherence tomography (OCT) images. CNN-GS first utilizes a CNN to extract features of specific retinal layer boundaries and train a corresponding classifier to delineate a pilot estimate of the eight layers. Next, a graph search method uses the probability maps created from the CNN to find the final boundaries. We validated our proposed method on 60 volumes (2915 B-scans) from 20 human eyes with non-exudative age-related macular degeneration (AMD), which attested to effectiveness of our proposed technique. PMID:28663902
Fang, Leyuan; Cunefare, David; Wang, Chong; Guymer, Robyn H; Li, Shutao; Farsiu, Sina
2017-05-01
We present a novel framework combining convolutional neural networks (CNN) and graph search methods (termed as CNN-GS) for the automatic segmentation of nine layer boundaries on retinal optical coherence tomography (OCT) images. CNN-GS first utilizes a CNN to extract features of specific retinal layer boundaries and train a corresponding classifier to delineate a pilot estimate of the eight layers. Next, a graph search method uses the probability maps created from the CNN to find the final boundaries. We validated our proposed method on 60 volumes (2915 B-scans) from 20 human eyes with non-exudative age-related macular degeneration (AMD), which attested to effectiveness of our proposed technique.
Bremer, Peer-Timo; Weber, Gunther; Tierny, Julien; Pascucci, Valerio; Day, Marcus S; Bell, John B
2011-09-01
Large-scale simulations are increasingly being used to study complex scientific and engineering phenomena. As a result, advanced visualization and data analysis are also becoming an integral part of the scientific process. Often, a key step in extracting insight from these large simulations involves the definition, extraction, and evaluation of features in the space and time coordinates of the solution. However, in many applications, these features involve a range of parameters and decisions that will affect the quality and direction of the analysis. Examples include particular level sets of a specific scalar field, or local inequalities between derived quantities. A critical step in the analysis is to understand how these arbitrary parameters/decisions impact the statistical properties of the features, since such a characterization will help to evaluate the conclusions of the analysis as a whole. We present a new topological framework that in a single-pass extracts and encodes entire families of possible features definitions as well as their statistical properties. For each time step we construct a hierarchical merge tree a highly compact, yet flexible feature representation. While this data structure is more than two orders of magnitude smaller than the raw simulation data it allows us to extract a set of features for any given parameter selection in a postprocessing step. Furthermore, we augment the trees with additional attributes making it possible to gather a large number of useful global, local, as well as conditional statistic that would otherwise be extremely difficult to compile. We also use this representation to create tracking graphs that describe the temporal evolution of the features over time. Our system provides a linked-view interface to explore the time-evolution of the graph interactively alongside the segmentation, thus making it possible to perform extensive data analysis in a very efficient manner. We demonstrate our framework by extracting and analyzing burning cells from a large-scale turbulent combustion simulation. In particular, we show how the statistical analysis enabled by our techniques provides new insight into the combustion process.
An automatic graph-based approach for artery/vein classification in retinal images.
Dashtbozorg, Behdad; Mendonça, Ana Maria; Campilho, Aurélio
2014-03-01
The classification of retinal vessels into artery/vein (A/V) is an important phase for automating the detection of vascular changes, and for the calculation of characteristic signs associated with several systemic diseases such as diabetes, hypertension, and other cardiovascular conditions. This paper presents an automatic approach for A/V classification based on the analysis of a graph extracted from the retinal vasculature. The proposed method classifies the entire vascular tree deciding on the type of each intersection point (graph nodes) and assigning one of two labels to each vessel segment (graph links). Final classification of a vessel segment as A/V is performed through the combination of the graph-based labeling results with a set of intensity features. The results of this proposed method are compared with manual labeling for three public databases. Accuracy values of 88.3%, 87.4%, and 89.8% are obtained for the images of the INSPIRE-AVR, DRIVE, and VICAVR databases, respectively. These results demonstrate that our method outperforms recent approaches for A/V classification.
Sequential visibility-graph motifs
NASA Astrophysics Data System (ADS)
Iacovacci, Jacopo; Lacasa, Lucas
2016-04-01
Visibility algorithms transform time series into graphs and encode dynamical information in their topology, paving the way for graph-theoretical time series analysis as well as building a bridge between nonlinear dynamics and network science. In this work we introduce and study the concept of sequential visibility-graph motifs, smaller substructures of n consecutive nodes that appear with characteristic frequencies. We develop a theory to compute in an exact way the motif profiles associated with general classes of deterministic and stochastic dynamics. We find that this simple property is indeed a highly informative and computationally efficient feature capable of distinguishing among different dynamics and robust against noise contamination. We finally confirm that it can be used in practice to perform unsupervised learning, by extracting motif profiles from experimental heart-rate series and being able, accordingly, to disentangle meditative from other relaxation states. Applications of this general theory include the automatic classification and description of physical, biological, and financial time series.
A Collection of Features for Semantic Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Eliassi-Rad, T; Fodor, I K; Gallagher, B
2007-05-02
Semantic graphs are commonly used to represent data from one or more data sources. Such graphs extend traditional graphs by imposing types on both nodes and links. This type information defines permissible links among specified nodes and can be represented as a graph commonly referred to as an ontology or schema graph. Figure 1 depicts an ontology graph for data from National Association of Securities Dealers. Each node type and link type may also have a list of attributes. To capture the increased complexity of semantic graphs, concepts derived for standard graphs have to be extended. This document explains brieflymore » features commonly used to characterize graphs, and their extensions to semantic graphs. This document is divided into two sections. Section 2 contains the feature descriptions for static graphs. Section 3 extends the features for semantic graphs that vary over time.« less
Using a high-dimensional graph of semantic space to model relationships among words
Jackson, Alice F.; Bolger, Donald J.
2014-01-01
The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions the GOLD model that differed in terms of the co-occurence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggest that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD). PMID:24860525
Using a high-dimensional graph of semantic space to model relationships among words.
Jackson, Alice F; Bolger, Donald J
2014-01-01
The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions the GOLD model that differed in terms of the co-occurence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggest that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
Akama, Hiroyuki; Miyake, Maki; Jung, Jaeyoung; Murphy, Brian
2015-01-01
In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients through the application of linguistic graph information for a neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.
Using graph theory to analyze biological networks
2011-01-01
Understanding complex systems often requires a bottom-up analysis towards a systems biology approach. The need to investigate a system, not only as individual components but as a whole, emerges. This can be done by examining the elementary constituents individually and then how these are connected. The myriad components of a system and their interactions are best characterized as networks and they are mainly represented as graphs where thousands of nodes are connected with thousands of vertices. In this article we demonstrate approaches, models and methods from the graph theory universe and we discuss ways in which they can be used to reveal hidden properties and features of a network. This network profiling combined with knowledge extraction will help us to better understand the biological significance of the system. PMID:21527005
A linked GeoData map for enabling information access
Powell, Logan J.; Varanka, Dalia E.
2018-01-10
OverviewThe Geospatial Semantic Web (GSW) is an emerging technology that uses the Internet for more effective knowledge engineering and information extraction. Among the aims of the GSW are to structure the semantic specifications of data to reduce ambiguity and to link those data more efficiently. The data are stored as triples, the basic data unit in graph databases, which are similar to the vector data model of geographic information systems (GIS); that is, a node-edge-node model that forms a graph of semantically related information. The GSW is supported by emerging technologies such as linked geospatial data, described below, that enable it to store and manage geographical data that require new cartographic methods for visualization. This report describes a map that can interact with linked geospatial data using a simulation of a data query approach called the browsable graph to find information that is semantically related to a subject of interest, visualized using the Data Driven Documents (D3) library. Such a semantically enabled map functions as a map knowledge base (MKB) (Varanka and Usery, 2017).A MKB differs from a database in an important way. The central element of a triple, alternatively called the edge or property, is composed of a logic formalization that structures the relation between the first and third parts, the nodes or objects. Node-edge-node represents the graphic form of the triple, and the subject-property-object terms represent the data structure. Object classes connect to build a federated graph, similar to a network in visual form. Because the triple property is a logical statement (a predicate), the data graph represents logical propositions or assertions accepted to be true about the subject matter. These logical formalizations can be manipulated to calculate new triples, representing inferred logical assertions, from the existing data.To demonstrate a MKB system, a technical proof-of-concept is developed that uses geographically attributed Resource Description Framework (RDF) serializations of linked data for mapping. The proof-of-concept focuses on accessing triple data from visual elements of a geographic map as the interface to the MKB. The map interface is embedded with other essential functions such as SPARQL Protocol and RDF Query Language (SPARQL) data query endpoint services and reasoning capabilities of Apache Marmotta (Apache Software Foundation, 2017). An RDF database of the Geographic Names Information System (GNIS), which contains official names of domestic feature in the United States, was linked to a county data layer from The National Map of the U.S. Geological Survey. The county data are part of a broader Government Units theme offered to the public as Esri shapefiles. The shapefile used to draw the map itself was converted to a geographic-oriented JavaScript Object Notation (JSON) (GeoJSON) format and linked through various properties with a linked geodata version of the GNIS database called “GNIS–LD” (Butler and others, 2016; B. Regalia and others, University of California-Santa Barbara, written commun., 2017). The GNIS–LD files originated in Terse RDF Triple Language (Turtle) format but were converted to a JSON format specialized in linked data, “JSON–LD” (Beckett and Berners-Lee, 2011; Sorny and others, 2014). The GNIS–LD database is composed of roughly three predominant triple data graphs: Features, Names, and History. The graphs include a set of namespace prefixes used by each of the attributes. Predefining the prefixes made the conversion to the JSON–LD format simple to complete because Turtle and JSON–LD are variant specifications of the basic RDF concept.To convert a shapefile into GeoJSON format to capture the geospatial coordinate geometry objects, an online converter, Mapshaper, was used (Bloch, 2013). To convert the Turtle files, a custom converter written in Java reconstructs the files by parsing each grouping of attributes belonging to one subject and pasting the data into a new file that follows the syntax of JSON–LD. Additionally, the Features file contained its own set of geometries, which was exported into a separate JSON–LD file along with its elevation value to form a fourth file, named “features-geo.json.” Extracted data from external files can be represented in HyperText Markup Language (HTML) path objects. The goal was to import multiple JSON–LD files using this approach.
Topological visual mapping in robotics.
Romero, Anna; Cazorla, Miguel
2012-08-01
A key problem in robotics is the construction of a map from its environment. This map could be used in different tasks, like localization, recognition, obstacle avoidance, etc. Besides, the simultaneous location and mapping (SLAM) problem has had a lot of interest in the robotics community. This paper presents a new method for visual mapping, using topological instead of metric information. For that purpose, we propose prior image segmentation into regions in order to group the extracted invariant features in a graph so that each graph defines a single region of the image. Although others methods have been proposed for visual SLAM, our method is complete, in the sense that it makes all the process: it presents a new method for image matching; it defines a way to build the topological map; and it also defines a matching criterion for loop-closing. The matching process will take into account visual features and their structure using the graph transformation matching (GTM) algorithm, which allows us to process the matching and to remove out the outliers. Then, using this image comparison method, we propose an algorithm for constructing topological maps. During the experimentation phase, we will test the robustness of the method and its ability constructing topological maps. We have also introduced new hysteresis behavior in order to solve some problems found building the graph.
Bayesian segmentation of atrium wall using globally-optimal graph cuts on 3D meshes.
Veni, Gopalkrishna; Fu, Zhisong; Awate, Suyash P; Whitaker, Ross T
2013-01-01
Efficient segmentation of the left atrium (LA) wall from delayed enhancement MRI is challenging due to inconsistent contrast, combined with noise, and high variation in atrial shape and size. We present a surface-detection method that is capable of extracting the atrial wall by computing an optimal a-posteriori estimate. This estimation is done on a set of nested meshes, constructed from an ensemble of segmented training images, and graph cuts on an associated multi-column, proper-ordered graph. The graph/mesh is a part of a template/model that has an associated set of learned intensity features. When this mesh is overlaid onto a test image, it produces a set of costs which lead to an optimal segmentation. The 3D mesh has an associated weighted, directed multi-column graph with edges that encode smoothness and inter-surface penalties. Unlike previous graph-cut methods that impose hard constraints on the surface properties, the proposed method follows from a Bayesian formulation resulting in soft penalties on spatial variation of the cuts through the mesh. The novelty of this method also lies in the construction of proper-ordered graphs on complex shapes for choosing among distinct classes of base shapes for automatic LA segmentation. We evaluate the proposed segmentation framework on simulated and clinical cardiac MRI.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2018-06-01
Text categorization has been used extensively in recent years to classify plain-text clinical reports. This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier was responsible for predicting CoD using the proposed conceptual graph-based document representation technique. To demonstrate the significance of the proposed technique, its results were compared with those of six (6) state-of-the-art document representation techniques. Lastly, this study compared the effects of one-level classification and two-level classification on the experimental results. The experimental results indicated that the CGDR technique achieved 12% to 15% improvement in accuracy compared with fully automated document representation baseline techniques. Moreover, two-level classification obtained better results compared with one-level classification. The promising results of the proposed conceptual graph-based document representation technique suggest that pathologists can adopt the proposed system as their basis for second opinion, thereby supporting them in effectively determining CoD. Copyright © 2018 Elsevier Inc. All rights reserved.
Dogrusoz, U; Erson, E Z; Giral, E; Demir, E; Babur, O; Cetintas, A; Colak, R
2006-02-01
Patikaweb provides a Web interface for retrieving and analyzing biological pathways in the Patika database, which contains data integrated from various prominent public pathway databases. It features a user-friendly interface, dynamic visualization and automated layout, advanced graph-theoretic queries for extracting biologically important phenomena, local persistence capability and exporting facilities to various pathway exchange formats.
Metabolomic Analysis and Visualization Engine for LC–MS Data
Melamud, Eugene; Vastag, Livia; Rabinowitz, Joshua D.
2017-01-01
Metabolomic analysis by liquid chromatography–high-resolution mass spectrometry results in data sets with thousands of features arising from metabolites, fragments, isotopes, and adducts. Here we describe a software package, Metabolomic Analysis and Visualization ENgine (MAVEN), designed for efficient interactive analysis of LC–MS data, including in the presence of isotope labeling. The software contains tools for all aspects of the data analysis process, from feature extraction to pathway-based graphical data display. To facilitate data validation, a machine learning algorithm automatically assesses peak quality. Users interact with raw data primarily in the form of extracted ion chromatograms, which are displayed with overlaid circles indicating peak quality, and bar graphs of peak intensities for both unlabeled and isotope-labeled metabolite forms. Click-based navigation leads to additional information, such as raw data for specific isotopic forms or for metabolites changing significantly between conditions. Fast data processing algorithms result in nearly delay-free browsing. Drop-down menus provide tools for the overlay of data onto pathway maps. These tools enable animating series of pathway graphs, e.g., to show propagation of labeled forms through a metabolic network. MAVEN is released under an open source license at http://maven.princeton.edu. PMID:21049934
Analyzing locomotion synthesis with feature-based motion graphs.
Mahmudi, Mentar; Kallmann, Marcelo
2013-05-01
We propose feature-based motion graphs for realistic locomotion synthesis among obstacles. Among several advantages, feature-based motion graphs achieve improved results in search queries, eliminate the need of postprocessing for foot skating removal, and reduce the computational requirements in comparison to traditional motion graphs. Our contributions are threefold. First, we show that choosing transitions based on relevant features significantly reduces graph construction time and leads to improved search performances. Second, we employ a fast channel search method that confines the motion graph search to a free channel with guaranteed clearance among obstacles, achieving faster and improved results that avoid expensive collision checking. Lastly, we present a motion deformation model based on Inverse Kinematics applied over the transitions of a solution branch. Each transition is assigned a continuous deformation range that does not exceed the original transition cost threshold specified by the user for the graph construction. The obtained deformation improves the reachability of the feature-based motion graph and in turn also reduces the time spent during search. The results obtained by the proposed methods are evaluated and quantified, and they demonstrate significant improvements in comparison to traditional motion graph techniques.
Classification of user interfaces for graph-based online analytical processing
NASA Astrophysics Data System (ADS)
Michaelis, James R.
2016-05-01
In the domain of business intelligence, user-oriented software for conducting multidimensional analysis via Online- Analytical Processing (OLAP) is now commonplace. In this setting, datasets commonly have well-defined sets of dimensions and measures around which analysis tasks can be conducted. However, many forms of data used in intelligence operations - deriving from social networks, online communications, and text corpora - will consist of graphs with varying forms of potential dimensional structure. Hence, enabling OLAP over such data collections requires explicit definition and extraction of supporting dimensions and measures. Further, as Graph OLAP remains an emerging technique, limited research has been done on its user interface requirements. Namely, on effective pairing of interface designs to different types of graph-derived dimensions and measures. This paper presents a novel technique for pairing of user interface designs to Graph OLAP datasets, rooted in Analytic Hierarchy Process (AHP) driven comparisons. Attributes of the classification strategy are encoded through an AHP ontology, developed in our alternate work and extended to support pairwise comparison of interfaces. Specifically, according to their ability, as perceived by Subject Matter Experts, to support dimensions and measures corresponding to Graph OLAP dataset attributes. To frame this discussion, a survey is provided both on existing variations of Graph OLAP, as well as existing interface designs previously applied in multidimensional analysis settings. Following this, a review of our AHP ontology is provided, along with a listing of corresponding dataset and interface attributes applicable toward SME recommendation structuring. A walkthrough of AHP-based recommendation encoding via the ontology-based approach is then provided. The paper concludes with a short summary of proposed future directions seen as essential for this research area.
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Xu, Jun; Madabhushi, Anant
2015-01-01
Nuclear architecture or the spatial arrangement of individual cancer nuclei on histopathology images has been shown to be associated with different grades and differential risk for a number of solid tumors such as breast, prostate, and oropharyngeal. Graph-based representations of individual nuclei (nuclei representing the graph nodes) allows for mining of quantitative metrics to describe tumor morphology. These graph features can be broadly categorized into global and local depending on the type of graph construction method. While a number of local graph (e.g. Cell Cluster Graphs) and global graph (e.g. Voronoi, Delaunay Triangulation, Minimum Spanning Tree) features have been shown to associated with cancer grade, risk, and outcome for different cancer types, the sensitivity of the preceding segmentation algorithms in identifying individual nuclei can have a significant bearing on the discriminability of the resultant features. This therefore begs the question as to which features while being discriminative of cancer grade and aggressiveness are also the most resilient to the segmentation errors. These properties are particularly desirable in the context of digital pathology images, where the method of slide preparation, staining, and type of nuclear segmentation algorithm employed can all dramatically affect the quality of the nuclear graphs and corresponding features. In this paper we evaluated the trade off between discriminability and stability of both global and local graph-based features in conjunction with a few different segmentation algorithms and in the context of two different histopathology image datasets of breast cancer from whole-slide images (WSI) and tissue microarrays (TMA). Specifically in this paper we investigate a few different performance measures including stability, discriminability and stability vs discriminability trade off, all of which are based on p-values from the Kruskal-Wallis one-way analysis of variance for local and global graph features. Apart from identifying the set of local and global features that satisfied the trade off between stability and discriminability, our most interesting finding was that a simple segmentation method was sufficient to identify the most discriminant features for invasive tumour detection in TMAs, whereas for tumour grading in WSI, the graph based features were more sensitive to the accuracy of the segmentation algorithm employed.
Effective spin physics in two-dimensional cavity QED arrays
NASA Astrophysics Data System (ADS)
Minář, Jiří; Güneş Söyler, Şebnem; Rotondo, Pietro; Lesanovsky, Igor
2017-06-01
We investigate a strongly correlated system of light and matter in two-dimensional cavity arrays. We formulate a multimode Tavis-Cummings (TC) Hamiltonian for two-level atoms coupled to cavity modes and driven by an external laser field which reduces to an effective spin Hamiltonian in the dispersive regime. In one-dimension we provide an exact analytical solution. In two-dimensions, we perform mean-field study and large scale quantum Monte Carlo simulations of both the TC and the effective spin models. We discuss the phase diagram and the parameter regime which gives rise to frustrated interactions between the spins. We provide a quantitative description of the phase transitions and correlation properties featured by the system and we discuss graph-theoretical properties of the ground states in terms of graph colourings using Pólya’s enumeration theorem.
NASA Technical Reports Server (NTRS)
Hein, L. A.; Myers, W. N.
1980-01-01
Vertical axis wind turbine incorporates several unique features to extract more energy from wind increasing efficiency 20% over conventional propeller driven units. System also features devices that utilize solar energy or chimney effluents during periods of no wind.
Language translation, doman specific languages and ANTLR
NASA Technical Reports Server (NTRS)
Craymer, Loring; Parr, Terence
2002-01-01
We will discuss the features of ANTLR that make it an attractive tool for rapid developement of domain specific language translators and present some practical examples of its use: extraction of information from the Cassini Command Language specification, the processing of structured binary data, and IVL--an English-like language for generating VRML scene graph, which is used in configuring the jGuru.com server.
An Integrated Ransac and Graph Based Mismatch Elimination Approach for Wide-Baseline Image Matching
NASA Astrophysics Data System (ADS)
Hasheminasab, M.; Ebadi, H.; Sedaghat, A.
2015-12-01
In this paper we propose an integrated approach in order to increase the precision of feature point matching. Many different algorithms have been developed as to optimizing the short-baseline image matching while because of illumination differences and viewpoints changes, wide-baseline image matching is so difficult to handle. Fortunately, the recent developments in the automatic extraction of local invariant features make wide-baseline image matching possible. The matching algorithms which are based on local feature similarity principle, using feature descriptor as to establish correspondence between feature point sets. To date, the most remarkable descriptor is the scale-invariant feature transform (SIFT) descriptor , which is invariant to image rotation and scale, and it remains robust across a substantial range of affine distortion, presence of noise, and changes in illumination. The epipolar constraint based on RANSAC (random sample consensus) method is a conventional model for mismatch elimination, particularly in computer vision. Because only the distance from the epipolar line is considered, there are a few false matches in the selected matching results based on epipolar geometry and RANSAC. Aguilariu et al. proposed Graph Transformation Matching (GTM) algorithm to remove outliers which has some difficulties when the mismatched points surrounded by the same local neighbor structure. In this study to overcome these limitations, which mentioned above, a new three step matching scheme is presented where the SIFT algorithm is used to obtain initial corresponding point sets. In the second step, in order to reduce the outliers, RANSAC algorithm is applied. Finally, to remove the remained mismatches, based on the adjacent K-NN graph, the GTM is implemented. Four different close range image datasets with changes in viewpoint are utilized to evaluate the performance of the proposed method and the experimental results indicate its robustness and capability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Brost, Randolph C.; McLendon, William Clarence,
2013-01-01
Modeling geospatial information with semantic graphs enables search for sites of interest based on relationships between features, without requiring strong a priori models of feature shape or other intrinsic properties. Geospatial semantic graphs can be constructed from raw sensor data with suitable preprocessing to obtain a discretized representation. This report describes initial work toward extending geospatial semantic graphs to include temporal information, and initial results applying semantic graph techniques to SAR image data. We describe an efficient graph structure that includes geospatial and temporal information, which is designed to support simultaneous spatial and temporal search queries. We also report amore » preliminary implementation of feature recognition, semantic graph modeling, and graph search based on input SAR data. The report concludes with lessons learned and suggestions for future improvements.« less
Visibility graph analysis on quarterly macroeconomic series of China based on complex network theory
NASA Astrophysics Data System (ADS)
Wang, Na; Li, Dong; Wang, Qiwen
2012-12-01
The visibility graph approach and complex network theory provide a new insight into time series analysis. The inheritance of the visibility graph from the original time series was further explored in the paper. We found that degree distributions of visibility graphs extracted from Pseudo Brownian Motion series obtained by the Frequency Domain algorithm exhibit exponential behaviors, in which the exponential exponent is a binomial function of the Hurst index inherited in the time series. Our simulations presented that the quantitative relations between the Hurst indexes and the exponents of degree distribution function are different for different series and the visibility graph inherits some important features of the original time series. Further, we convert some quarterly macroeconomic series including the growth rates of value-added of three industry series and the growth rates of Gross Domestic Product series of China to graphs by the visibility algorithm and explore the topological properties of graphs associated from the four macroeconomic series, namely, the degree distribution and correlations, the clustering coefficient, the average path length, and community structure. Based on complex network analysis we find degree distributions of associated networks from the growth rates of value-added of three industry series are almost exponential and the degree distributions of associated networks from the growth rates of GDP series are scale free. We also discussed the assortativity and disassortativity of the four associated networks as they are related to the evolutionary process of the original macroeconomic series. All the constructed networks have “small-world” features. The community structures of associated networks suggest dynamic changes of the original macroeconomic series. We also detected the relationship among government policy changes, community structures of associated networks and macroeconomic dynamics. We find great influences of government policies in China on the changes of dynamics of GDP and the three industries adjustment. The work in our paper provides a new way to understand the dynamics of economic development.
A fuzzy pattern matching method based on graph kernel for lithography hotspot detection
NASA Astrophysics Data System (ADS)
Nitta, Izumi; Kanazawa, Yuzi; Ishida, Tsutomu; Banno, Koji
2017-03-01
In advanced technology nodes, lithography hotspot detection has become one of the most significant issues in design for manufacturability. Recently, machine learning based lithography hotspot detection has been widely investigated, but it has trade-off between detection accuracy and false alarm. To apply machine learning based technique to the physical verification phase, designers require minimizing undetected hotspots to avoid yield degradation. They also need a ranking of similar known patterns with a detected hotspot to prioritize layout pattern to be corrected. To achieve high detection accuracy and to prioritize detected hotspots, we propose a novel lithography hotspot detection method using Delaunay triangulation and graph kernel based machine learning. Delaunay triangulation extracts features of hotspot patterns where polygons locate irregularly and closely one another, and graph kernel expresses inner structure of graphs. Additionally, our method provides similarity between two patterns and creates a list of similar training patterns with a detected hotspot. Experiments results on ICCAD 2012 benchmarks show that our method achieves high accuracy with allowable range of false alarm. We also show the ranking of the similar known patterns with a detected hotspot.
NASA Astrophysics Data System (ADS)
Garciá-Arteaga, Juan D.; Corredor, Germán.; Wang, Xiangxue; Velcheti, Vamsidhar; Madabhushi, Anant; Romero, Eduardo
2017-11-01
Tumor-infiltrating lymphocytes occurs when various classes of white blood cells migrate from the blood stream towards the tumor, infiltrating it. The presence of TIL is predictive of the response of the patient to therapy. In this paper, we show how the automatic detection of lymphocytes in digital H and E histopathological images and the quantitative evaluation of the global lymphocyte configuration, evaluated through global features extracted from non-parametric graphs, constructed from the lymphocytes' detected positions, can be correlated to the patient's outcome in early-stage non-small cell lung cancer (NSCLC). The method was assessed on a tissue microarray cohort composed of 63 NSCLC cases. From the evaluated graphs, minimum spanning trees and K-nn showed the highest predictive ability, yielding F1 Scores of 0.75 and 0.72 and accuracies of 0.67 and 0.69, respectively. The predictive power of the proposed methodology indicates that graphs may be used to develop objective measures of the infiltration grade of tumors, which can, in turn, be used by pathologists to improve the decision making and treatment planning processes.
Standardization of databases for AMDB taxi routing functions
NASA Astrophysics Data System (ADS)
Pschierer, C.; Sindlinger, A.; Schiefele, J.
2010-04-01
Input, management, and display of taxi routes on airport moving map displays (AMM) have been covered in various studies in the past. The demonstrated applications are typically based on Aerodrome Mapping Databases (AMDB). Taxi routing functions require specific enhancements, typically in the form of a graph network with nodes and edges modeling all connectivities within an airport, which are not supported by the current AMDB standards. Therefore, the data schemas and data content have been defined specifically for the purpose and test scenarios of these studies. A standardization of the data format for taxi routing information is a prerequisite for turning taxi routing functions into production. The joint RTCA/EUROCAE special committee SC-217, responsible for updating and enhancing the AMDB standards DO-272 [1] and DO-291 [2], is currently in the process of studying different alternatives and defining reasonable formats. Requirements for taxi routing data are primarily driven by depiction concepts for assigned and cleared taxi routes, but also by database size and the economic feasibility. Studied concepts are similar to the ones described in the GDF (geographic data files) specification [3], which is used in most car navigation systems today. They include - A highly aggregated graph network of complex features - A modestly aggregated graph network of simple features - A non-explicit topology of plain AMDB taxi guidance line elements This paper introduces the different concepts and their advantages and disadvantages.
ERIC Educational Resources Information Center
Reading Teacher, 2012
2012-01-01
The "Toolbox" column features content adapted from ReadWriteThink.org lesson plans and provides practical tools for classroom teachers. This issue's column features a lesson plan adapted from "Graphing Plot and Character in a Novel" by Lisa Storm Fink and "Bio-graph: Graphing Life Events" by Susan Spangler. Students retell biographic events…
Scenario driven data modelling: a method for integrating diverse sources of data and data streams
2011-01-01
Background Biology is rapidly becoming a data intensive, data-driven science. It is essential that data is represented and connected in ways that best represent its full conceptual content and allows both automated integration and data driven decision-making. Recent advancements in distributed multi-relational directed graphs, implemented in the form of the Semantic Web make it possible to deal with complicated heterogeneous data in new and interesting ways. Results This paper presents a new approach, scenario driven data modelling (SDDM), that integrates multi-relational directed graphs with data streams. SDDM can be applied to virtually any data integration challenge with widely divergent types of data and data streams. In this work, we explored integrating genetics data with reports from traditional media. SDDM was applied to the New Delhi metallo-beta-lactamase gene (NDM-1), an emerging global health threat. The SDDM process constructed a scenario, created a RDF multi-relational directed graph that linked diverse types of data to the Semantic Web, implemented RDF conversion tools (RDFizers) to bring content into the Sematic Web, identified data streams and analytical routines to analyse those streams, and identified user requirements and graph traversals to meet end-user requirements. Conclusions We provided an example where SDDM was applied to a complex data integration challenge. The process created a model of the emerging NDM-1 health threat, identified and filled gaps in that model, and constructed reliable software that monitored data streams based on the scenario derived multi-relational directed graph. The SDDM process significantly reduced the software requirements phase by letting the scenario and resulting multi-relational directed graph define what is possible and then set the scope of the user requirements. Approaches like SDDM will be critical to the future of data intensive, data-driven science because they automate the process of converting massive data streams into usable knowledge. PMID:22165854
Fast correspondences search in anatomical trees
NASA Astrophysics Data System (ADS)
dos Santos, Thiago R.; Gergel, Ingmar; Meinzer, Hans-Peter; Maier-Hein, Lena
2010-03-01
Registration of multiple medical images commonly comprises the steps feature extraction, correspondences search and transformation computation. In this paper, we present a new method for a fast and pose independent search of correspondences using as features anatomical trees such as the bronchial system in the lungs or the vessel system in the liver. Our approach scores the similarities between the trees' nodes (bifurcations) taking into account both, topological properties extracted from their graph representations and anatomical properties extracted from the trees themselves. The node assignment maximizes the global similarity (sum of the scores of each pair of assigned nodes), assuring that the matches are distributed throughout the trees. Furthermore, the proposed method is able to deal with distortions in the data, such as noise, motion, artifacts, and problems associated with the extraction method, such as missing or false branches. According to an evaluation on swine lung data sets, the method requires less than one second on average to compute the matching and yields a high rate of correct matches compared to state of the art work.
A hybrid model based on neural networks for biomedical relation extraction.
Zhang, Yijia; Lin, Hongfei; Yang, Zhihao; Wang, Jian; Zhang, Shaowu; Sun, Yuanyuan; Yang, Liang
2018-05-01
Biomedical relation extraction can automatically extract high-quality biomedical relations from biomedical texts, which is a vital step for the mining of biomedical knowledge hidden in the literature. Recurrent neural networks (RNNs) and convolutional neural networks (CNNs) are two major neural network models for biomedical relation extraction. Neural network-based methods for biomedical relation extraction typically focus on the sentence sequence and employ RNNs or CNNs to learn the latent features from sentence sequences separately. However, RNNs and CNNs have their own advantages for biomedical relation extraction. Combining RNNs and CNNs may improve biomedical relation extraction. In this paper, we present a hybrid model for the extraction of biomedical relations that combines RNNs and CNNs. First, the shortest dependency path (SDP) is generated based on the dependency graph of the candidate sentence. To make full use of the SDP, we divide the SDP into a dependency word sequence and a relation sequence. Then, RNNs and CNNs are employed to automatically learn the features from the sentence sequence and the dependency sequences, respectively. Finally, the output features of the RNNs and CNNs are combined to detect and extract biomedical relations. We evaluate our hybrid model using five public (protein-protein interaction) PPI corpora and a (drug-drug interaction) DDI corpus. The experimental results suggest that the advantages of RNNs and CNNs in biomedical relation extraction are complementary. Combining RNNs and CNNs can effectively boost biomedical relation extraction performance. Copyright © 2018 Elsevier Inc. All rights reserved.
Hierarchical image feature extraction by an irregular pyramid of polygonal partitions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Skurikhin, Alexei N
2008-01-01
We present an algorithmic framework for hierarchical image segmentation and feature extraction. We build a successive fine-to-coarse hierarchy of irregular polygonal partitions of the original image. This multiscale hierarchy forms the basis for object-oriented image analysis. The framework incorporates the Gestalt principles of visual perception, such as proximity and closure, and exploits spectral and textural similarities of polygonal partitions, while iteratively grouping them until dissimilarity criteria are exceeded. Seed polygons are built upon a triangular mesh composed of irregular sized triangles, whose spatial arrangement is adapted to the image content. This is achieved by building the triangular mesh on themore » top of detected spectral discontinuities (such as edges), which form a network of constraints for the Delaunay triangulation. The image is then represented as a spatial network in the form of a graph with vertices corresponding to the polygonal partitions and edges reflecting their relations. The iterative agglomeration of partitions into object-oriented segments is formulated as Minimum Spanning Tree (MST) construction. An important characteristic of the approach is that the agglomeration of polygonal partitions is constrained by the detected edges; thus the shapes of agglomerated partitions are more likely to correspond to the outlines of real-world objects. The constructed partitions and their spatial relations are characterized using spectral, textural and structural features based on proximity graphs. The framework allows searching for object-oriented features of interest across multiple levels of details of the built hierarchy and can be generalized to the multi-criteria MST to account for multiple criteria important for an application.« less
Extraction of Graph Information Based on Image Contents and the Use of Ontology
ERIC Educational Resources Information Center
Kanjanawattana, Sarunya; Kimura, Masaomi
2016-01-01
A graph is an effective form of data representation used to summarize complex information. Explicit information such as the relationship between the X- and Y-axes can be easily extracted from a graph by applying human intelligence. However, implicit knowledge such as information obtained from other related concepts in an ontology also resides in…
Multi-Centrality Graph Spectral Decompositions and Their Application to Cyber Intrusion Detection
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Pin-Yu; Choudhury, Sutanay; Hero, Alfred
Many modern datasets can be represented as graphs and hence spectral decompositions such as graph principal component analysis (PCA) can be useful. Distinct from previous graph decomposition approaches based on subspace projection of a single topological feature, e.g., the centered graph adjacency matrix (graph Laplacian), we propose spectral decomposition approaches to graph PCA and graph dictionary learning that integrate multiple features, including graph walk statistics, centrality measures and graph distances to reference nodes. In this paper we propose a new PCA method for single graph analysis, called multi-centrality graph PCA (MC-GPCA), and a new dictionary learning method for ensembles ofmore » graphs, called multi-centrality graph dictionary learning (MC-GDL), both based on spectral decomposition of multi-centrality matrices. As an application to cyber intrusion detection, MC-GPCA can be an effective indicator of anomalous connectivity pattern and MC-GDL can provide discriminative basis for attack classification.« less
Recognizing human activities using appearance metric feature and kinematics feature
NASA Astrophysics Data System (ADS)
Qian, Huimin; Zhou, Jun; Lu, Xinbiao; Wu, Xinye
2017-05-01
The problem of automatically recognizing human activities from videos through the fusion of the two most important cues, appearance metric feature and kinematics feature, is considered. And a system of two-dimensional (2-D) Poisson equations is introduced to extract the more discriminative appearance metric feature. Specifically, the moving human blobs are first detected out from the video by background subtraction technique to form a binary image sequence, from which the appearance feature designated as the motion accumulation image and the kinematics feature termed as centroid instantaneous velocity are extracted. Second, 2-D discrete Poisson equations are employed to reinterpret the motion accumulation image to produce a more differentiated Poisson silhouette image, from which the appearance feature vector is created through the dimension reduction technique called bidirectional 2-D principal component analysis, considering the balance between classification accuracy and time consumption. Finally, a cascaded classifier based on the nearest neighbor classifier and two directed acyclic graph support vector machine classifiers, integrated with the fusion of the appearance feature vector and centroid instantaneous velocity vector, is applied to recognize the human activities. Experimental results on the open databases and a homemade one confirm the recognition performance of the proposed algorithm.
NASA Technical Reports Server (NTRS)
Narasimhan, Sriram; Roychoudhury, Indranil; Balaban, Edward; Saxena, Abhinav
2010-01-01
Model-based diagnosis typically uses analytical redundancy to compare predictions from a model against observations from the system being diagnosed. However this approach does not work very well when it is not feasible to create analytic relations describing all the observed data, e.g., for vibration data which is usually sampled at very high rates and requires very detailed finite element models to describe its behavior. In such cases, features (in time and frequency domains) that contain diagnostic information are extracted from the data. Since this is a computationally intensive process, it is not efficient to extract all the features all the time. In this paper we present an approach that combines the analytic model-based and feature-driven diagnosis approaches. The analytic approach is used to reduce the set of possible faults and then features are chosen to best distinguish among the remaining faults. We describe an implementation of this approach on the Flyable Electro-mechanical Actuator (FLEA) test bed.
A fast algorithm for vertex-frequency representations of signals on graphs
Jestrović, Iva; Coyle, James L.; Sejdić, Ervin
2016-01-01
The windowed Fourier transform (short time Fourier transform) and the S-transform are widely used signal processing tools for extracting frequency information from non-stationary signals. Previously, the windowed Fourier transform had been adopted for signals on graphs and has been shown to be very useful for extracting vertex-frequency information from graphs. However, high computational complexity makes these algorithms impractical. We sought to develop a fast windowed graph Fourier transform and a fast graph S-transform requiring significantly shorter computation time. The proposed schemes have been tested with synthetic test graph signals and real graph signals derived from electroencephalography recordings made during swallowing. The results showed that the proposed schemes provide significantly lower computation time in comparison with the standard windowed graph Fourier transform and the fast graph S-transform. Also, the results showed that noise has no effect on the results of the algorithm for the fast windowed graph Fourier transform or on the graph S-transform. Finally, we showed that graphs can be reconstructed from the vertex-frequency representations obtained with the proposed algorithms. PMID:28479645
A Comparative Study of Different EEG Reference Choices for Diagnosing Unipolar Depression.
Mumtaz, Wajid; Malik, Aamir Saeed
2018-06-02
The choice of an electroencephalogram (EEG) reference has fundamental importance and could be critical during clinical decision-making because an impure EEG reference could falsify the clinical measurements and subsequent inferences. In this research, the suitability of three EEG references was compared while classifying depressed and healthy brains using a machine-learning (ML)-based validation method. In this research, the EEG data of 30 unipolar depressed subjects and 30 age-matched healthy controls were recorded. The EEG data were analyzed in three different EEG references, the link-ear reference (LE), average reference (AR), and reference electrode standardization technique (REST). The EEG-based functional connectivity (FC) was computed. Also, the graph-based measures, such as the distances between nodes, minimum spanning tree, and maximum flow between the nodes for each channel pair, were calculated. An ML scheme provided a mechanism to compare the performances of the extracted features that involved a general framework such as the feature extraction (graph-based theoretic measures), feature selection, classification, and validation. For comparison purposes, the performance metrics such as the classification accuracies, sensitivities, specificities, and F scores were computed. When comparing the three references, the diagnostic accuracy showed better performances during the REST, while the LE and AR showed less discrimination between the two groups. Based on the results, it can be concluded that the choice of appropriate reference is critical during the clinical scenario. The REST reference is recommended for future applications of EEG-based diagnosis of mental illnesses.
SING: Subgraph search In Non-homogeneous Graphs
2010-01-01
Background Finding the subgraphs of a graph database that are isomorphic to a given query graph has practical applications in several fields, from cheminformatics to image understanding. Since subgraph isomorphism is a computationally hard problem, indexing techniques have been intensively exploited to speed up the process. Such systems filter out those graphs which cannot contain the query, and apply a subgraph isomorphism algorithm to each residual candidate graph. The applicability of such systems is limited to databases of small graphs, because their filtering power degrades on large graphs. Results In this paper, SING (Subgraph search In Non-homogeneous Graphs), a novel indexing system able to cope with large graphs, is presented. The method uses the notion of feature, which can be a small subgraph, subtree or path. Each graph in the database is annotated with the set of all its features. The key point is to make use of feature locality information. This idea is used to both improve the filtering performance and speed up the subgraph isomorphism task. Conclusions Extensive tests on chemical compounds, biological networks and synthetic graphs show that the proposed system outperforms the most popular systems in query time over databases of medium and large graphs. Other specific tests show that the proposed system is effective for single large graphs. PMID:20170516
A Semantic Graph Query Language
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kaplan, I L
2006-10-16
Semantic graphs can be used to organize large amounts of information from a number of sources into one unified structure. A semantic query language provides a foundation for extracting information from the semantic graph. The graph query language described here provides a simple, powerful method for querying semantic graphs.
Molecular graph convolutions: moving beyond fingerprints.
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-08-01
Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph-atoms, bonds, distances, etc.-which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
Thinking graphically: Connecting vision and cognition during graph comprehension.
Ratwani, Raj M; Trafton, J Gregory; Boehm-Davis, Deborah A
2008-03-01
Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive integration. During visual integration, pattern recognition processes are used to form visual clusters of information; these visual clusters are then used to reason about the graph during cognitive integration. In 3 experiments, the processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data. Results supported the task analytic theories for specific information extraction and the processes of visual and cognitive integration for integrative questions. Further, the integrative processes scaled up as graph complexity increased, highlighting the importance of these processes for integration in more complex graphs. Finally, based on this framework, design principles to improve both visual and cognitive integration are described. PsycINFO Database Record (c) 2008 APA, all rights reserved
High-order graph matching based feature selection for Alzheimer's disease identification.
Liu, Feng; Suk, Heung-Il; Wee, Chong-Yaw; Chen, Huafu; Shen, Dinggang
2013-01-01
One of the main limitations of l1-norm feature selection is that it focuses on estimating the target vector for each sample individually without considering relations with other samples. However, it's believed that the geometrical relation among target vectors in the training set may provide useful information, and it would be natural to expect that the predicted vectors have similar geometric relations as the target vectors. To overcome these limitations, we formulate this as a graph-matching feature selection problem between a predicted graph and a target graph. In the predicted graph a node is represented by predicted vector that may describe regional gray matter volume or cortical thickness features, and in the target graph a node is represented by target vector that include class label and clinical scores. In particular, we devise new regularization terms in sparse representation to impose high-order graph matching between the target vectors and the predicted ones. Finally, the selected regional gray matter volume and cortical thickness features are fused in kernel space for classification. Using the ADNI dataset, we evaluate the effectiveness of the proposed method and obtain the accuracies of 92.17% and 81.57% in AD and MCI classification, respectively.
Multi-Atlas Based Segmentation of Brainstem Nuclei from MR Images by Deep Hyper-Graph Learning.
Dong, Pei; Guo, Yangrong; Gao, Yue; Liang, Peipeng; Shi, Yonghong; Wang, Qian; Shen, Dinggang; Wu, Guorong
2016-10-01
Accurate segmentation of brainstem nuclei (red nucleus and substantia nigra) is very important in various neuroimaging applications such as deep brain stimulation and the investigation of imaging biomarkers for Parkinson's disease (PD). Due to iron deposition during aging, image contrast in the brainstem is very low in Magnetic Resonance (MR) images. Hence, the ambiguity of patch-wise similarity makes the recently successful multi-atlas patch-based label fusion methods have difficulty to perform as competitive as segmenting cortical and sub-cortical regions from MR images. To address this challenge, we propose a novel multi-atlas brainstem nuclei segmentation method using deep hyper-graph learning. Specifically, we achieve this goal in three-fold. First , we employ hyper-graph to combine the advantage of maintaining spatial coherence from graph-based segmentation approaches and the benefit of harnessing population priors from multi-atlas based framework. Second , besides using low-level image appearance, we also extract high-level context features to measure the complex patch-wise relationship. Since the context features are calculated on a tentatively estimated label probability map, we eventually turn our hyper-graph learning based label propagation into a deep and self-refining model. Third , since anatomical labels on some voxels (usually located in uniform regions) can be identified much more reliably than other voxels (usually located at the boundary between two regions), we allow these reliable voxels to propagate their labels to the nearby difficult-to-label voxels. Such hierarchical strategy makes our proposed label fusion method deep and dynamic. We evaluate our proposed label fusion method in segmenting substantia nigra (SN) and red nucleus (RN) from 3.0 T MR images, where our proposed method achieves significant improvement over the state-of-the-art label fusion methods.
Airola, Antti; Pyysalo, Sampo; Björne, Jari; Pahikkala, Tapio; Ginter, Filip; Salakoski, Tapio
2008-11-19
Automated extraction of protein-protein interactions (PPI) is an important and widely studied task in biomedical text mining. We propose a graph kernel based approach for this task. In contrast to earlier approaches to PPI extraction, the introduced all-paths graph kernel has the capability to make use of full, general dependency graphs representing the sentence structure. We evaluate the proposed method on five publicly available PPI corpora, providing the most comprehensive evaluation done for a machine learning based PPI-extraction system. We additionally perform a detailed evaluation of the effects of training and testing on different resources, providing insight into the challenges involved in applying a system beyond the data it was trained on. Our method is shown to achieve state-of-the-art performance with respect to comparable evaluations, with 56.4 F-score and 84.8 AUC on the AImed corpus. We show that the graph kernel approach performs on state-of-the-art level in PPI extraction, and note the possible extension to the task of extracting complex interactions. Cross-corpus results provide further insight into how the learning generalizes beyond individual corpora. Further, we identify several pitfalls that can make evaluations of PPI-extraction systems incomparable, or even invalid. These include incorrect cross-validation strategies and problems related to comparing F-score results achieved on different evaluation resources. Recommendations for avoiding these pitfalls are provided.
Guo, Hanqi; Phillips, Carolyn L; Peterka, Tom; Karpeyev, Dmitry; Glatz, Andreas
2016-01-01
We propose a method for the vortex extraction and tracking of superconducting magnetic flux vortices for both structured and unstructured mesh data. In the Ginzburg-Landau theory, magnetic flux vortices are well-defined features in a complex-valued order parameter field, and their dynamics determine electromagnetic properties in type-II superconductors. Our method represents each vortex line (a 1D curve embedded in 3D space) as a connected graph extracted from the discretized field in both space and time. For a time-varying discrete dataset, our vortex extraction and tracking method is as accurate as the data discretization. We then apply 3D visualization and 2D event diagrams to the extraction and tracking results to help scientists understand vortex dynamics and macroscale superconductor behavior in greater detail than previously possible.
Chriskos, Panteleimon; Frantzidis, Christos A; Gkivogkli, Polyxeni T; Bamidis, Panagiotis D; Kourtidou-Papadeli, Chrysoula
2018-01-01
Sleep staging, the process of assigning labels to epochs of sleep, depending on the stage of sleep they belong, is an arduous, time consuming and error prone process as the initial recordings are quite often polluted by noise from different sources. To properly analyze such data and extract clinical knowledge, noise components must be removed or alleviated. In this paper a pre-processing and subsequent sleep staging pipeline for the sleep analysis of electroencephalographic signals is described. Two novel methods of functional connectivity estimation (Synchronization Likelihood/SL and Relative Wavelet Entropy/RWE) are comparatively investigated for automatic sleep staging through manually pre-processed electroencephalographic recordings. A multi-step process that renders signals suitable for further analysis is initially described. Then, two methods that rely on extracting synchronization features from electroencephalographic recordings to achieve computerized sleep staging are proposed, based on bivariate features which provide a functional overview of the brain network, contrary to most proposed methods that rely on extracting univariate time and frequency features. Annotation of sleep epochs is achieved through the presented feature extraction methods by training classifiers, which are in turn able to accurately classify new epochs. Analysis of data from sleep experiments on a randomized, controlled bed-rest study, which was organized by the European Space Agency and was conducted in the "ENVIHAB" facility of the Institute of Aerospace Medicine at the German Aerospace Center (DLR) in Cologne, Germany attains high accuracy rates, over 90% based on ground truth that resulted from manual sleep staging by two experienced sleep experts. Therefore, it can be concluded that the above feature extraction methods are suitable for semi-automatic sleep staging.
Chriskos, Panteleimon; Frantzidis, Christos A.; Gkivogkli, Polyxeni T.; Bamidis, Panagiotis D.; Kourtidou-Papadeli, Chrysoula
2018-01-01
Sleep staging, the process of assigning labels to epochs of sleep, depending on the stage of sleep they belong, is an arduous, time consuming and error prone process as the initial recordings are quite often polluted by noise from different sources. To properly analyze such data and extract clinical knowledge, noise components must be removed or alleviated. In this paper a pre-processing and subsequent sleep staging pipeline for the sleep analysis of electroencephalographic signals is described. Two novel methods of functional connectivity estimation (Synchronization Likelihood/SL and Relative Wavelet Entropy/RWE) are comparatively investigated for automatic sleep staging through manually pre-processed electroencephalographic recordings. A multi-step process that renders signals suitable for further analysis is initially described. Then, two methods that rely on extracting synchronization features from electroencephalographic recordings to achieve computerized sleep staging are proposed, based on bivariate features which provide a functional overview of the brain network, contrary to most proposed methods that rely on extracting univariate time and frequency features. Annotation of sleep epochs is achieved through the presented feature extraction methods by training classifiers, which are in turn able to accurately classify new epochs. Analysis of data from sleep experiments on a randomized, controlled bed-rest study, which was organized by the European Space Agency and was conducted in the “ENVIHAB” facility of the Institute of Aerospace Medicine at the German Aerospace Center (DLR) in Cologne, Germany attains high accuracy rates, over 90% based on ground truth that resulted from manual sleep staging by two experienced sleep experts. Therefore, it can be concluded that the above feature extraction methods are suitable for semi-automatic sleep staging. PMID:29628883
NASA Astrophysics Data System (ADS)
Jiang, Li; Shi, Tielin; Xuan, Jianping
2012-05-01
Generally, the vibration signals of fault bearings are non-stationary and highly nonlinear under complicated operating conditions. Thus, it's a big challenge to extract optimal features for improving classification and simultaneously decreasing feature dimension. Kernel Marginal Fisher analysis (KMFA) is a novel supervised manifold learning algorithm for feature extraction and dimensionality reduction. In order to avoid the small sample size problem in KMFA, we propose regularized KMFA (RKMFA). A simple and efficient intelligent fault diagnosis method based on RKMFA is put forward and applied to fault recognition of rolling bearings. So as to directly excavate nonlinear features from the original high-dimensional vibration signals, RKMFA constructs two graphs describing the intra-class compactness and the inter-class separability, by combining traditional manifold learning algorithm with fisher criteria. Therefore, the optimal low-dimensional features are obtained for better classification and finally fed into the simplest K-nearest neighbor (KNN) classifier to recognize different fault categories of bearings. The experimental results demonstrate that the proposed approach improves the fault classification performance and outperforms the other conventional approaches.
The Model Analyst’s Toolkit: Scientific Model Development, Analysis, and Validation
2014-05-20
but there can still be many recommendations generated. Therefore, the recommender results are displayed in a sortable table where each row is a...reporting period. Since the synthesis graph can be complex and have many dependencies, the system must determine the order of evaluation of nodes, and...validation failure, if any. 3.1. Automatic Feature Extraction In many domains, causal models can often be more readily described as patterns of
Yun Chen; Hui Yang
2014-01-01
The rapid advancements of biomedical instrumentation and healthcare technology have resulted in data-rich environments in hospitals. However, the meaningful information extracted from rich datasets is limited. There is a dire need to go beyond current medical practices, and develop data-driven methods and tools that will enable and help (i) the handling of big data, (ii) the extraction of data-driven knowledge, (iii) the exploitation of acquired knowledge for optimizing clinical decisions. This present study focuses on the prediction of mortality rates in Intensive Care Units (ICU) using patient-specific healthcare recordings. It is worth mentioning that postsurgical monitoring in ICU leads to massive datasets with unique properties, e.g., variable heterogeneity, patient heterogeneity, and time asyncronization. To cope with the challenges in ICU datasets, we developed the postsurgical decision support system with a series of analytical tools, including data categorization, data pre-processing, feature extraction, feature selection, and predictive modeling. Experimental results show that the proposed data-driven methodology outperforms traditional approaches and yields better results based on the evaluation of real-world ICU data from 4000 subjects in the database. This research shows great potentials for the use of data-driven analytics to improve the quality of healthcare services.
Graph mining for next generation sequencing: leveraging the assembly graph for biological insights.
Warnke-Sommer, Julia; Ali, Hesham
2016-05-06
The assembly of Next Generation Sequencing (NGS) reads remains a challenging task. This is especially true for the assembly of metagenomics data that originate from environmental samples potentially containing hundreds to thousands of unique species. The principle objective of current assembly tools is to assemble NGS reads into contiguous stretches of sequence called contigs while maximizing for both accuracy and contig length. The end goal of this process is to produce longer contigs with the major focus being on assembly only. Sequence read assembly is an aggregative process, during which read overlap relationship information is lost as reads are merged into longer sequences or contigs. The assembly graph is information rich and capable of capturing the genomic architecture of an input read data set. We have developed a novel hybrid graph in which nodes represent sequence regions at different levels of granularity. This model, utilized in the assembly and analysis pipeline Focus, presents a concise yet feature rich view of a given input data set, allowing for the extraction of biologically relevant graph structures for graph mining purposes. Focus was used to create hybrid graphs to model metagenomics data sets obtained from the gut microbiomes of five individuals with Crohn's disease and eight healthy individuals. Repetitive and mobile genetic elements are found to be associated with hybrid graph structure. Using graph mining techniques, a comparative study of the Crohn's disease and healthy data sets was conducted with focus on antibiotics resistance genes associated with transposase genes. Results demonstrated significant differences in the phylogenetic distribution of categories of antibiotics resistance genes in the healthy and diseased patients. Focus was also evaluated as a pure assembly tool and produced excellent results when compared against the Meta-velvet, Omega, and UD-IDBA assemblers. Mining the hybrid graph can reveal biological phenomena captured by its structure. We demonstrate the advantages of considering assembly graphs as data-mining support in addition to their role as frameworks for assembly.
Molecular graph convolutions: moving beyond fingerprints
NASA Astrophysics Data System (ADS)
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-08-01
Molecular "fingerprints" encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
NASA Astrophysics Data System (ADS)
Su, Zuqiang; Xiao, Hong; Zhang, Yi; Tang, Baoping; Jiang, Yonghua
2017-04-01
Extraction of sensitive features is a challenging but key task in data-driven machinery running state identification. Aimed at solving this problem, a method for machinery running state identification that applies discriminant semi-supervised local tangent space alignment (DSS-LTSA) for feature fusion and extraction is proposed. Firstly, in order to extract more distinct features, the vibration signals are decomposed by wavelet packet decomposition WPD, and a mixed-domain feature set consisted of statistical features, autoregressive (AR) model coefficients, instantaneous amplitude Shannon entropy and WPD energy spectrum is extracted to comprehensively characterize the properties of machinery running state(s). Then, the mixed-dimension feature set is inputted into DSS-LTSA for feature fusion and extraction to eliminate redundant information and interference noise. The proposed DSS-LTSA can extract intrinsic structure information of both labeled and unlabeled state samples, and as a result the over-fitting problem of supervised manifold learning and blindness problem of unsupervised manifold learning are overcome. Simultaneously, class discrimination information is integrated within the dimension reduction process in a semi-supervised manner to improve sensitivity of the extracted fusion features. Lastly, the extracted fusion features are inputted into a pattern recognition algorithm to achieve the running state identification. The effectiveness of the proposed method is verified by a running state identification case in a gearbox, and the results confirm the improved accuracy of the running state identification.
Multiresolution analysis over graphs for a motor imagery based online BCI game.
Asensio-Cubero, Javier; Gan, John Q; Palaniappan, Ramaswamy
2016-01-01
Multiresolution analysis (MRA) over graph representation of EEG data has proved to be a promising method for offline brain-computer interfacing (BCI) data analysis. For the first time we aim to prove the feasibility of the graph lifting transform in an online BCI system. Instead of developing a pointer device or a wheel-chair controller as test bed for human-machine interaction, we have designed and developed an engaging game which can be controlled by means of imaginary limb movements. Some modifications to the existing MRA analysis over graphs for BCI have also been proposed, such as the use of common spatial patterns for feature extraction at the different levels of decomposition, and sequential floating forward search as a best basis selection technique. In the online game experiment we obtained for three classes an average classification rate of 63.0% for fourteen naive subjects. The application of a best basis selection method helps significantly decrease the computing resources needed. The present study allows us to further understand and assess the benefits of the use of tailored wavelet analysis for processing motor imagery data and contributes to the further development of BCI for gaming purposes. Copyright © 2015 Elsevier Ltd. All rights reserved.
A Graph-Based Recovery and Decomposition of Swanson’s Hypothesis using Semantic Predications
Cameron, Delroy; Bodenreider, Olivier; Yalamanchili, Hima; Danh, Tu; Vallabhaneni, Sreeram; Thirunarayan, Krishnaprasad; Sheth, Amit P.; Rindflesch, Thomas C.
2014-01-01
Objectives This paper presents a methodology for recovering and decomposing Swanson’s Raynaud Syndrome–Fish Oil Hypothesis semi-automatically. The methodology leverages the semantics of assertions extracted from biomedical literature (called semantic predications) along with structured background knowledge and graph-based algorithms to semi-automatically capture the informative associations originally discovered manually by Swanson. Demonstrating that Swanson’s manually intensive techniques can be undertaken semi-automatically, paves the way for fully automatic semantics-based hypothesis generation from scientific literature. Methods Semantic predications obtained from biomedical literature allow the construction of labeled directed graphs which contain various associations among concepts from the literature. By aggregating such associations into informative subgraphs, some of the relevant details originally articulated by Swanson has been uncovered. However, by leveraging background knowledge to bridge important knowledge gaps in the literature, a methodology for semi-automatically capturing the detailed associations originally explicated in natural language by Swanson has been developed. Results Our methodology not only recovered the 3 associations commonly recognized as Swanson’s Hypothesis, but also decomposed them into an additional 16 detailed associations, formulated as chains of semantic predications. Altogether, 14 out of the 19 associations that can be attributed to Swanson were retrieved using our approach. To the best of our knowledge, such an in-depth recovery and decomposition of Swanson’s Hypothesis has never been attempted. Conclusion In this work therefore, we presented a methodology for semi- automatically recovering and decomposing Swanson’s RS-DFO Hypothesis using semantic representations and graph algorithms. Our methodology provides new insights into potential prerequisites for semantics-driven Literature-Based Discovery (LBD). These suggest that three critical aspects of LBD include: 1) the need for more expressive representations beyond Swanson’s ABC model; 2) an ability to accurately extract semantic information from text; and 3) the semantic integration of scientific literature with structured background knowledge. PMID:23026233
Efficient enumeration of monocyclic chemical graphs with given path frequencies
2014-01-01
Background The enumeration of chemical graphs (molecular graphs) satisfying given constraints is one of the fundamental problems in chemoinformatics and bioinformatics because it leads to a variety of useful applications including structure determination and development of novel chemical compounds. Results We consider the problem of enumerating chemical graphs with monocyclic structure (a graph structure that contains exactly one cycle) from a given set of feature vectors, where a feature vector represents the frequency of the prescribed paths in a chemical compound to be constructed and the set is specified by a pair of upper and lower feature vectors. To enumerate all tree-like (acyclic) chemical graphs from a given set of feature vectors, Shimizu et al. and Suzuki et al. proposed efficient branch-and-bound algorithms based on a fast tree enumeration algorithm. In this study, we devise a novel method for extending these algorithms to enumeration of chemical graphs with monocyclic structure by designing a fast algorithm for testing uniqueness. The results of computational experiments reveal that the computational efficiency of the new algorithm is as good as those for enumeration of tree-like chemical compounds. Conclusions We succeed in expanding the class of chemical graphs that are able to be enumerated efficiently. PMID:24955135
Graph-Based Object Class Discovery
NASA Astrophysics Data System (ADS)
Xia, Shengping; Hancock, Edwin R.
We are interested in the problem of discovering the set of object classes present in a database of images using a weakly supervised graph-based framework. Rather than making use of the ”Bag-of-Features (BoF)” approach widely used in current work on object recognition, we represent each image by a graph using a group of selected local invariant features. Using local feature matching and iterative Procrustes alignment, we perform graph matching and compute a similarity measure. Borrowing the idea of query expansion , we develop a similarity propagation based graph clustering (SPGC) method. Using this method class specific clusters of the graphs can be obtained. Such a cluster can be generally represented by using a higher level graph model whose vertices are the clustered graphs, and the edge weights are determined by the pairwise similarity measure. Experiments are performed on a dataset, in which the number of images increases from 1 to 50K and the number of objects increases from 1 to over 500. Some objects have been discovered with total recall and a precision 1 in a single cluster.
An ensemble method for extracting adverse drug events from social media.
Liu, Jing; Zhao, Songzheng; Zhang, Xiaodi
2016-06-01
Because adverse drug events (ADEs) are a serious health problem and a leading cause of death, it is of vital importance to identify them correctly and in a timely manner. With the development of Web 2.0, social media has become a large data source for information on ADEs. The objective of this study is to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. We develop a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address high-dimensional features. Then, we evaluate the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles that are generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristics curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, we attain the best AUC values of 82.1%, 73.2%, and 77.0% using the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, we achieve the best AUC values of 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. Our experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which can stay away from the feature sparsity issue, are qualified to address the ADE extraction problem. Combining different individual classifiers using suitable combination methods can further enhance the ADE extraction effectiveness. Copyright © 2016 Elsevier B.V. All rights reserved.
Min-Cut Based Segmentation of Airborne LIDAR Point Clouds
NASA Astrophysics Data System (ADS)
Ural, S.; Shan, J.
2012-07-01
Introducing an organization to the unstructured point cloud before extracting information from airborne lidar data is common in many applications. Aggregating the points with similar features into segments in 3-D which comply with the nature of actual objects is affected by the neighborhood, scale, features and noise among other aspects. In this study, we present a min-cut based method for segmenting the point cloud. We first assess the neighborhood of each point in 3-D by investigating the local geometric and statistical properties of the candidates. Neighborhood selection is essential since point features are calculated within their local neighborhood. Following neighborhood determination, we calculate point features and determine the clusters in the feature space. We adapt a graph representation from image processing which is especially used in pixel labeling problems and establish it for the unstructured 3-D point clouds. The edges of the graph that are connecting the points with each other and nodes representing feature clusters hold the smoothness costs in the spatial domain and data costs in the feature domain. Smoothness costs ensure spatial coherence, while data costs control the consistency with the representative feature clusters. This graph representation formalizes the segmentation task as an energy minimization problem. It allows the implementation of an approximate solution by min-cuts for a global minimum of this NP hard minimization problem in low order polynomial time. We test our method with airborne lidar point cloud acquired with maximum planned post spacing of 1.4 m and a vertical accuracy 10.5 cm as RMSE. We present the effects of neighborhood and feature determination in the segmentation results and assess the accuracy and efficiency of the implemented min-cut algorithm as well as its sensitivity to the parameters of the smoothness and data cost functions. We find that smoothness cost that only considers simple distance parameter does not strongly conform to the natural structure of the points. Including shape information within the energy function by assigning costs based on the local properties may help to achieve a better representation for segmentation.
Automatic extraction of protein point mutations using a graph bigram association.
Lee, Lawrence C; Horn, Florence; Cohen, Fred E
2007-02-02
Protein point mutations are an essential component of the evolutionary and experimental analysis of protein structure and function. While many manually curated databases attempt to index point mutations, most experimentally generated point mutations and the biological impacts of the changes are described in the peer-reviewed published literature. We describe an application, Mutation GraB (Graph Bigram), that identifies, extracts, and verifies point mutations from biomedical literature. The principal problem of point mutation extraction is to link the point mutation with its associated protein and organism of origin. Our algorithm uses a graph-based bigram traversal to identify these relevant associations and exploits the Swiss-Prot protein database to verify this information. The graph bigram method is different from other models for point mutation extraction in that it incorporates frequency and positional data of all terms in an article to drive the point mutation-protein association. Our method was tested on 589 articles describing point mutations from the G protein-coupled receptor (GPCR), tyrosine kinase, and ion channel protein families. We evaluated our graph bigram metric against a word-proximity metric for term association on datasets of full-text literature in these three different protein families. Our testing shows that the graph bigram metric achieves a higher F-measure for the GPCRs (0.79 versus 0.76), protein tyrosine kinases (0.72 versus 0.69), and ion channel transporters (0.76 versus 0.74). Importantly, in situations where more than one protein can be assigned to a point mutation and disambiguation is required, the graph bigram metric achieves a precision of 0.84 compared with the word distance metric precision of 0.73. We believe the graph bigram search metric to be a significant improvement over previous search metrics for point mutation extraction and to be applicable to text-mining application requiring the association of words.
Sideris, Costas; Alshurafa, Nabil; Pourhomayoun, Mohammad; Shahmohammadi, Farhad; Samy, Lauren; Sarrafzadeh, Majid
2015-01-01
In this paper, we propose a novel methodology for utilizing disease diagnostic information to predict severity of condition for Congestive Heart Failure (CHF) patients. Our methodology relies on a novel, clustering-based, feature extraction framework using disease diagnostic information. To reduce the dimensionality we identify disease clusters using cooccurence frequencies. We then utilize these clusters as features to predict patient severity of condition. We build our clustering and feature extraction algorithm using the 2012 National Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP) which contains 7 million discharge records and ICD-9-CM codes. The proposed framework is tested on Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 patients. We compare our cluster-based feature set with another that incorporates the Charlson comorbidity score as a feature and demonstrate an accuracy improvement of up to 14% in the predictability of the severity of condition.
Molecular graph convolutions: moving beyond fingerprints
Kearnes, Steven; McCloskey, Kevin; Berndl, Marc; Pande, Vijay; Riley, Patrick
2016-01-01
Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement. PMID:27558503
Taking Advantage of Automated Assessment of Student-Constructed Graphs in Science
ERIC Educational Resources Information Center
Vitale, Jonathan M.; Lai, Kevin; Linn, Marcia C.
2015-01-01
We present a new system for automated scoring of graph construction items that address complex science concepts, feature qualitative prompts, and support a range of possible solutions. This system utilizes analysis of spatial features (e.g., slope of a line) to evaluate potential student ideas represented within graphs. Student ideas are then…
Automated Recognition of 3D Features in GPIR Images
NASA Technical Reports Server (NTRS)
Park, Han; Stough, Timothy; Fijany, Amir
2007-01-01
A method of automated recognition of three-dimensional (3D) features in images generated by ground-penetrating imaging radar (GPIR) is undergoing development. GPIR 3D images can be analyzed to detect and identify such subsurface features as pipes and other utility conduits. Until now, much of the analysis of GPIR images has been performed manually by expert operators who must visually identify and track each feature. The present method is intended to satisfy a need for more efficient and accurate analysis by means of algorithms that can automatically identify and track subsurface features, with minimal supervision by human operators. In this method, data from multiple sources (for example, data on different features extracted by different algorithms) are fused together for identifying subsurface objects. The algorithms of this method can be classified in several different ways. In one classification, the algorithms fall into three classes: (1) image-processing algorithms, (2) feature- extraction algorithms, and (3) a multiaxis data-fusion/pattern-recognition algorithm that includes a combination of machine-learning, pattern-recognition, and object-linking algorithms. The image-processing class includes preprocessing algorithms for reducing noise and enhancing target features for pattern recognition. The feature-extraction algorithms operate on preprocessed data to extract such specific features in images as two-dimensional (2D) slices of a pipe. Then the multiaxis data-fusion/ pattern-recognition algorithm identifies, classifies, and reconstructs 3D objects from the extracted features. In this process, multiple 2D features extracted by use of different algorithms and representing views along different directions are used to identify and reconstruct 3D objects. In object linking, which is an essential part of this process, features identified in successive 2D slices and located within a threshold radius of identical features in adjacent slices are linked in a directed-graph data structure. Relative to past approaches, this multiaxis approach offers the advantages of more reliable detections, better discrimination of objects, and provision of redundant information, which can be helpful in filling gaps in feature recognition by one of the component algorithms. The image-processing class also includes postprocessing algorithms that enhance identified features to prepare them for further scrutiny by human analysts (see figure). Enhancement of images as a postprocessing step is a significant departure from traditional practice, in which enhancement of images is a preprocessing step.
Weakly supervised image semantic segmentation based on clustering superpixels
NASA Astrophysics Data System (ADS)
Yan, Xiong; Liu, Xiaohua
2018-04-01
In this paper, we propose an image semantic segmentation model which is trained from image-level labeled images. The proposed model starts with superpixel segmenting, and features of the superpixels are extracted by trained CNN. We introduce a superpixel-based graph followed by applying the graph partition method to group correlated superpixels into clusters. For the acquisition of inter-label correlations between the image-level labels in dataset, we not only utilize label co-occurrence statistics but also exploit visual contextual cues simultaneously. At last, we formulate the task of mapping appropriate image-level labels to the detected clusters as a problem of convex minimization. Experimental results on MSRC-21 dataset and LableMe dataset show that the proposed method has a better performance than most of the weakly supervised methods and is even comparable to fully supervised methods.
Graph wavelet alignment kernels for drug virtual screening.
Smalter, Aaron; Huan, Jun; Lushington, Gerald
2009-06-01
In this paper, we introduce a novel statistical modeling technique for target property prediction, with applications to virtual screening and drug design. In our method, we use graphs to model chemical structures and apply a wavelet analysis of graphs to summarize features capturing graph local topology. We design a novel graph kernel function to utilize the topology features to build predictive models for chemicals via Support Vector Machine classifier. We call the new graph kernel a graph wavelet-alignment kernel. We have evaluated the efficacy of the wavelet-alignment kernel using a set of chemical structure-activity prediction benchmarks. Our results indicate that the use of the kernel function yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art chemical classification approaches. In addition, our results also show that the use of wavelet functions significantly decreases the computational costs for graph kernel computation with more than ten fold speedup.
A global/local affinity graph for image segmentation.
Xiaofang Wang; Yuxing Tang; Masnou, Simon; Liming Chen
2015-04-01
Construction of a reliable graph capturing perceptual grouping cues of an image is fundamental for graph-cut based image segmentation methods. In this paper, we propose a novel sparse global/local affinity graph over superpixels of an input image to capture both short- and long-range grouping cues, and thereby enabling perceptual grouping laws, including proximity, similarity, continuity, and to enter in action through a suitable graph-cut algorithm. Moreover, we also evaluate three major visual features, namely, color, texture, and shape, for their effectiveness in perceptual segmentation and propose a simple graph fusion scheme to implement some recent findings from psychophysics, which suggest combining these visual features with different emphases for perceptual grouping. In particular, an input image is first oversegmented into superpixels at different scales. We postulate a gravitation law based on empirical observations and divide superpixels adaptively into small-, medium-, and large-sized sets. Global grouping is achieved using medium-sized superpixels through a sparse representation of superpixels' features by solving a ℓ0-minimization problem, and thereby enabling continuity or propagation of local smoothness over long-range connections. Small- and large-sized superpixels are then used to achieve local smoothness through an adjacent graph in a given feature space, and thus implementing perceptual laws, for example, similarity and proximity. Finally, a bipartite graph is also introduced to enable propagation of grouping cues between superpixels of different scales. Extensive experiments are carried out on the Berkeley segmentation database in comparison with several state-of-the-art graph constructions. The results show the effectiveness of the proposed approach, which outperforms state-of-the-art graphs using four different objective criteria, namely, the probabilistic rand index, the variation of information, the global consistency error, and the boundary displacement error.
Exact and approximate graph matching using random walks.
Gori, Marco; Maggini, Marco; Sarti, Lorenzo
2005-07-01
In this paper, we propose a general framework for graph matching which is suitable for different problems of pattern recognition. The pattern representation we assume is at the same time highly structured, like for classic syntactic and structural approaches, and of subsymbolic nature with real-valued features, like for connectionist and statistic approaches. We show that random walk based models, inspired by Google's PageRank, give rise to a spectral theory that nicely enhances the graph topological features at node level. As a straightforward consequence, we derive a polynomial algorithm for the classic graph isomorphism problem, under the restriction of dealing with Markovian spectrally distinguishable graphs (MSD), a class of graphs that does not seem to be easily reducible to others proposed in the literature. The experimental results that we found on different test-beds of the TC-15 graph database show that the defined MSD class "almost always" covers the database, and that the proposed algorithm is significantly more efficient than top scoring VF algorithm on the same data. Most interestingly, the proposed approach is very well-suited for dealing with partial and approximate graph matching problems, derived for instance from image retrieval tasks. We consider the objects of the COIL-100 visual collection and provide a graph-based representation, whose node's labels contain appropriate visual features. We show that the adoption of classic bipartite graph matching algorithms offers a straightforward generalization of the algorithm given for graph isomorphism and, finally, we report very promising experimental results on the COIL-100 visual collection.
Solving Partial Differential Equations in a data-driven multiprocessor environment
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gaudiot, J.L.; Lin, C.M.; Hosseiniyar, M.
1988-12-31
Partial differential equations can be found in a host of engineering and scientific problems. The emergence of new parallel architectures has spurred research in the definition of parallel PDE solvers. Concurrently, highly programmable systems such as data-how architectures have been proposed for the exploitation of large scale parallelism. The implementation of some Partial Differential Equation solvers (such as the Jacobi method) on a tagged token data-flow graph is demonstrated here. Asynchronous methods (chaotic relaxation) are studied and new scheduling approaches (the Token No-Labeling scheme) are introduced in order to support the implementation of the asychronous methods in a data-driven environment.more » New high-level data-flow language program constructs are introduced in order to handle chaotic operations. Finally, the performance of the program graphs is demonstrated by a deterministic simulation of a message passing data-flow multiprocessor. An analysis of the overhead in the data-flow graphs is undertaken to demonstrate the limits of parallel operations in dataflow PDE program graphs.« less
RGB-D SLAM Combining Visual Odometry and Extended Information Filter
Zhang, Heng; Liu, Yanli; Tan, Jindong; Xiong, Naixue
2015-01-01
In this paper, we present a novel RGB-D SLAM system based on visual odometry and an extended information filter, which does not require any other sensors or odometry. In contrast to the graph optimization approaches, this is more suitable for online applications. A visual dead reckoning algorithm based on visual residuals is devised, which is used to estimate motion control input. In addition, we use a novel descriptor called binary robust appearance and normals descriptor (BRAND) to extract features from the RGB-D frame and use them as landmarks. Furthermore, considering both the 3D positions and the BRAND descriptors of the landmarks, our observation model avoids explicit data association between the observations and the map by marginalizing the observation likelihood over all possible associations. Experimental validation is provided, which compares the proposed RGB-D SLAM algorithm with just RGB-D visual odometry and a graph-based RGB-D SLAM algorithm using the publicly-available RGB-D dataset. The results of the experiments demonstrate that our system is quicker than the graph-based RGB-D SLAM algorithm. PMID:26263990
1990-01-09
data structures can easily be presented to the user interface. An emphasis of the Graph Browser was the realization of graph views and graph animation ... animation of the graph. Anima- tion of the graph includes changing node shapes, changing node and arc colors, changing node and arc text, and making...many graphs tend to be tree-like. Animtion of a graph is a useful feature. One of the primary goals of GMB was to support animated graphs. For animation
Perkins, David Nikolaus; Brost, Randolph; Ray, Lawrence P.
2017-08-08
Various technologies for facilitating analysis of large remote sensing and geolocation datasets to identify features of interest are described herein. A search query can be submitted to a computing system that executes searches over a geospatial temporal semantic (GTS) graph to identify features of interest. The GTS graph comprises nodes corresponding to objects described in the remote sensing and geolocation datasets, and edges that indicate geospatial or temporal relationships between pairs of nodes in the nodes. Trajectory information is encoded in the GTS graph by the inclusion of movable nodes to facilitate searches for features of interest in the datasets relative to moving objects such as vehicles.
Improving the Accuracy of Attribute Extraction using the Relatedness between Attribute Values
NASA Astrophysics Data System (ADS)
Bollegala, Danushka; Tani, Naoki; Ishizuka, Mitsuru
Extracting attribute-values related to entities from web texts is an important step in numerous web related tasks such as information retrieval, information extraction, and entity disambiguation (namesake disambiguation). For example, for a search query that contains a personal name, we can not only return documents that contain that personal name, but if we have attribute-values such as the organization for which that person works, we can also suggest documents that contain information related to that organization, thereby improving the user's search experience. Despite numerous potential applications of attribute extraction, it remains a challenging task due to the inherent noise in web data -- often a single web page contains multiple entities and attributes. We propose a graph-based approach to select the correct attribute-values from a set of candidate attribute-values extracted for a particular entity. First, we build an undirected weighted graph in which, attribute-values are represented by nodes, and the edge that connects two nodes in the graph represents the degree of relatedness between the corresponding attribute-values. Next, we find the maximum spanning tree of this graph that connects exactly one attribute-value for each attribute-type. The proposed method outperforms previously proposed attribute extraction methods on a dataset that contains 5000 web pages.
Social Network Extraction and Analysis Based on Multimodal Dyadic Interaction
Escalera, Sergio; Baró, Xavier; Vitrià, Jordi; Radeva, Petia; Raducanu, Bogdan
2012-01-01
Social interactions are a very important component in people’s lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to New York Times’ Blogging Heads opinion blog. The Social Network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links’ weights are a measure of the “influence” a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network. PMID:22438733
Jelicic Kadic, Antonia; Vucic, Katarina; Dosenovic, Svjetlana; Sapunar, Damir; Puljak, Livia
2016-06-01
To compare speed and accuracy of graphical data extraction using manual estimation and open source software. Data points from eligible graphs/figures published in randomized controlled trials (RCTs) from 2009 to 2014 were extracted by two authors independently, both by manual estimation and with the Plot Digitizer, open source software. Corresponding authors of each RCT were contacted up to four times via e-mail to obtain exact numbers that were used to create graphs. Accuracy of each method was compared against the source data from which the original graphs were produced. Software data extraction was significantly faster, reducing time for extraction for 47%. Percent agreement between the two raters was 51% for manual and 53.5% for software data extraction. Percent agreement between the raters and original data was 66% vs. 75% for the first rater and 69% vs. 73% for the second rater, for manual and software extraction, respectively. Data extraction from figures should be conducted using software, whereas manual estimation should be avoided. Using software for data extraction of data presented only in figures is faster and enables higher interrater reliability. Copyright © 2016 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Prieto, L. P.; Sharma, K.; Kidzinski, L.; Rodríguez-Triana, M. J.; Dillenbourg, P.
2018-01-01
The pedagogical modelling of everyday classroom practice is an interesting kind of evidence, both for educational research and teachers' own professional development. This paper explores the usage of wearable sensors and machine learning techniques to automatically extract orchestration graphs (teaching activities and their social plane over time)…
Extracting Undimensional Chains from Multidimensional Datasets: A Graph Theory Approach.
ERIC Educational Resources Information Center
Yamomoto, Yoneo; Wise, Steven L.
An order-analysis procedure, which uses graph theory to extract efficiently nonredundant, unidimensional chains of items from multidimensional data sets and chain consistency as a criterion for chain membership is outlined in this paper. The procedure is intended as an alternative to the Reynolds (1976) procedure which is described as being…
NASA Astrophysics Data System (ADS)
Dekavalla, Maria; Argialas, Demetre
2017-07-01
The analysis of undersea topography and geomorphological features provides necessary information to related disciplines and many applications. The development of an automated knowledge-based classification approach of undersea topography and geomorphological features is challenging due to their multi-scale nature. The aim of the study is to develop and evaluate an automated knowledge-based OBIA approach to: i) decompose the global undersea topography to multi-scale regions of distinct morphometric properties, and ii) assign the derived regions to characteristic geomorphological features. First, the global undersea topography was decomposed through the SRTM30_PLUS bathymetry data to the so-called morphometric objects of discrete morphometric properties and spatial scales defined by data-driven methods (local variance graphs and nested means) and multi-scale analysis. The derived morphometric objects were combined with additional relative topographic position information computed with a self-adaptive pattern recognition method (geomorphons), and auxiliary data and were assigned to characteristic undersea geomorphological feature classes through a knowledge base, developed from standard definitions. The decomposition of the SRTM30_PLUS data to morphometric objects was considered successful for the requirements of maximizing intra-object and inter-object heterogeneity, based on the near zero values of the Moran's I and the low values of the weighted variance index. The knowledge-based classification approach was tested for its transferability in six case studies of various tectonic settings and achieved the efficient extraction of 11 undersea geomorphological feature classes. The classification results for the six case studies were compared with the digital global seafloor geomorphic features map (GSFM). The 11 undersea feature classes and their producer's accuracies in respect to the GSFM relevant areas were Basin (95%), Continental Shelf (94.9%), Trough (88.4%), Plateau (78.9%), Continental Slope (76.4%), Trench (71.2%), Abyssal Hill (62.9%), Abyssal Plain (62.4%), Ridge (49.8%), Seamount (48.8%) and Continental Rise (25.4%). The knowledge-based OBIA classification approach was considered transferable since the percentages of spatial and thematic agreement between the most of the classified undersea feature classes and the GSFM exhibited low deviations across the six case studies.
Feature Grouping and Selection Over an Undirected Graph.
Yang, Sen; Yuan, Lei; Lai, Ying-Cheng; Shen, Xiaotong; Wonka, Peter; Ye, Jieping
2012-01-01
High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features has been considered to be promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation, when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which could introduce additional estimation bias. In this paper, we propose three new feature grouping and selection methods to resolve this issue. The first method employs a convex function to penalize the pairwise l ∞ norm of connected regression/classification coefficients, achieving simultaneous feature grouping and selection. The second method improves the first one by utilizing a non-convex function to reduce the estimation bias. The third one is the extension of the second method using a truncated l 1 regularization to further reduce the estimation bias. The proposed methods combine feature grouping and feature selection to enhance estimation accuracy. We employ the alternating direction method of multipliers (ADMM) and difference of convex functions (DC) programming to solve the proposed formulations. Our experimental results on synthetic data and two real datasets demonstrate the effectiveness of the proposed methods.
FGRAAL: FORTRAN extended graph algorithmic language
NASA Technical Reports Server (NTRS)
Basili, V. R.; Mesztenyi, C. K.; Rheinboldt, W. C.
1972-01-01
The FORTRAN version FGRAAL of the graph algorithmic language GRAAL as it has been implemented for the Univac 1108 is described. FBRAAL is an extension of FORTRAN 5 and is intended for describing and implementing graph algorithms of the type primarily arising in applications. The formal description contained in this report represents a supplement to the FORTRAN 5 manual for the Univac 1108 (UP-4060), that is, only the new features of the language are described. Several typical graph algorithms, written in FGRAAL, are included to illustrate various features of the language and to show its applicability.
Process and representation in graphical displays
NASA Technical Reports Server (NTRS)
Gillan, Douglas J.; Lewis, Robert; Rudisill, Marianne
1990-01-01
How people comprehend graphics is examined. Graphical comprehension involves the cognitive representation of information from a graphic display and the processing strategies that people apply to answer questions about graphics. Research on representation has examined both the features present in a graphic display and the cognitive representation of the graphic. The key features include the physical components of a graph, the relation between the figure and its axes, and the information in the graph. Tests of people's memory for graphs indicate that both the physical and informational aspect of a graph are important in the cognitive representation of a graph. However, the physical (or perceptual) features overshadow the information to a large degree. Processing strategies also involve a perception-information distinction. In order to answer simple questions (e.g., determining the value of a variable, comparing several variables, and determining the mean of a set of variables), people switch between two information processing strategies: (1) an arithmetic, look-up strategy in which they use a graph much like a table, looking up values and performing arithmetic calculations; and (2) a perceptual strategy in which they use the spatial characteristics of the graph to make comparisons and estimations. The user's choice of strategies depends on the task and the characteristics of the graph. A theory of graphic comprehension is presented.
Extracting Knowledge from Graph Data in Adversarial Settings
NASA Astrophysics Data System (ADS)
Skillicorn, David
Graph data captures connections and relationships among individuals, and between individuals and objects, places, and times. Because many of the properties f graphs are emergent, they are resistant to manipulation by adversaries. This robustness comes at the expense of more-complex analysis algorithms. We describe several approaches to analysing graph data, illustrating with examples from the relationships within al Qaeda.
Thinking Graphically: Connecting Vision and Cognition during Graph Comprehension
ERIC Educational Resources Information Center
Ratwani, Raj M.; Trafton, J. Gregory; Boehm-Davis, Deborah A.
2008-01-01
Task analytic theories of graph comprehension account for the perceptual and conceptual processes required to extract specific information from graphs. Comparatively, the processes underlying information integration have received less attention. We propose a new framework for information integration that highlights visual integration and cognitive…
The Design and Product of National 1:1000000 Cartographic Data of Topographic Map
NASA Astrophysics Data System (ADS)
Wang, Guizhi
2016-06-01
National administration of surveying, mapping and geoinformation started to launch the project of national fundamental geographic information database dynamic update in 2012. Among them, the 1:50000 database was updated once a year, furthermore the 1:250000 database was downsized and linkage-updated on the basis. In 2014, using the latest achievements of 1:250000 database, comprehensively update the 1:1000000 digital line graph database. At the same time, generate cartographic data of topographic map and digital elevation model data. This article mainly introduce national 1:1000000 cartographic data of topographic map, include feature content, database structure, Database-driven Mapping technology, workflow and so on.
PANTHER. Pattern ANalytics To support High-performance Exploitation and Reasoning.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Czuchlewski, Kristina Rodriguez; Hart, William E.
Sandia has approached the analysis of big datasets with an integrated methodology that uses computer science, image processing, and human factors to exploit critical patterns and relationships in large datasets despite the variety and rapidity of information. The work is part of a three-year LDRD Grand Challenge called PANTHER (Pattern ANalytics To support High-performance Exploitation and Reasoning). To maximize data analysis capability, Sandia pursued scientific advances across three key technical domains: (1) geospatial-temporal feature extraction via image segmentation and classification; (2) geospatial-temporal analysis capabilities tailored to identify and process new signatures more efficiently; and (3) domain- relevant models of humanmore » perception and cognition informing the design of analytic systems. Our integrated results include advances in geographical information systems (GIS) in which we discover activity patterns in noisy, spatial-temporal datasets using geospatial-temporal semantic graphs. We employed computational geometry and machine learning to allow us to extract and predict spatial-temporal patterns and outliers from large aircraft and maritime trajectory datasets. We automatically extracted static and ephemeral features from real, noisy synthetic aperture radar imagery for ingestion into a geospatial-temporal semantic graph. We worked with analysts and investigated analytic workflows to (1) determine how experiential knowledge evolves and is deployed in high-demand, high-throughput visual search workflows, and (2) better understand visual search performance and attention. Through PANTHER, Sandia's fundamental rethinking of key aspects of geospatial data analysis permits the extraction of much richer information from large amounts of data. The project results enable analysts to examine mountains of historical and current data that would otherwise go untouched, while also gaining meaningful, measurable, and defensible insights into overlooked relationships and patterns. The capability is directly relevant to the nation's nonproliferation remote-sensing activities and has broad national security applications for military and intelligence- gathering organizations.« less
ERIC Educational Resources Information Center
Xi, Xiaoming
2010-01-01
Motivated by cognitive theories of graph comprehension, this study systematically manipulated characteristics of a line graph description task in a speaking test in ways to mitigate the influence of graph familiarity, a potential source of construct-irrelevant variance. It extends Xi (2005), which found that the differences in holistic scores on…
Flame analysis using image processing techniques
NASA Astrophysics Data System (ADS)
Her Jie, Albert Chang; Zamli, Ahmad Faizal Ahmad; Zulazlan Shah Zulkifli, Ahmad; Yee, Joanne Lim Mun; Lim, Mooktzeng
2018-04-01
This paper presents image processing techniques with the use of fuzzy logic and neural network approach to perform flame analysis. Flame diagnostic is important in the industry to extract relevant information from flame images. Experiment test is carried out in a model industrial burner with different flow rates. Flame features such as luminous and spectral parameters are extracted using image processing and Fast Fourier Transform (FFT). Flame images are acquired using FLIR infrared camera. Non-linearities such as thermal acoustic oscillations and background noise affect the stability of flame. Flame velocity is one of the important characteristics that determines stability of flame. In this paper, an image processing method is proposed to determine flame velocity. Power spectral density (PSD) graph is a good tool for vibration analysis where flame stability can be approximated. However, a more intelligent diagnostic system is needed to automatically determine flame stability. In this paper, flame features of different flow rates are compared and analyzed. The selected flame features are used as inputs to the proposed fuzzy inference system to determine flame stability. Neural network is used to test the performance of the fuzzy inference system.
Improving activity recognition using temporal coherence.
Ataya, Abbas; Jallon, Pierre; Bianchi, Pascal; Doron, Maeva
2013-01-01
Assessment of daily physical activity using data from wearable sensors has recently become a prominent research area in the biomedical engineering field and a substantial application for pattern recognition. In this paper, we present an accelerometer-based activity recognition scheme on the basis of a hierarchical structured classifier. A first step consists of distinguishing static activities from dynamic ones in order to extract relevant features for each activity type. Next, a separate classifier is applied to detect more specific activities of the same type. On top of our activity recognition system, we introduce a novel approach to take into account the temporal coherence of activities. Inter-activity transition information is modeled by a directed graph Markov chain. Confidence measures in activity classes are then evaluated from conventional classifier's outputs and coupled with the graph to reinforce activity estimation. Accurate results and significant improvement of activity detection are obtained when applying our system for the recognition of 9 activities for 48 subjects.
NASA Astrophysics Data System (ADS)
Xie, Tian; Grossman, Jeffrey C.
2018-04-01
The use of machine learning methods for accelerating the design of crystalline materials usually requires manually constructed feature vectors or complex transformation of atom coordinates to input the crystal structure, which either constrains the model to certain crystal types or makes it difficult to provide chemical insights. Here, we develop a crystal graph convolutional neural networks framework to directly learn material properties from the connection of atoms in the crystal, providing a universal and interpretable representation of crystalline materials. Our method provides a highly accurate prediction of density functional theory calculated properties for eight different properties of crystals with various structure types and compositions after being trained with 1 04 data points. Further, our framework is interpretable because one can extract the contributions from local chemical environments to global properties. Using an example of perovskites, we show how this information can be utilized to discover empirical rules for materials design.
Random-Forest Classification of High-Resolution Remote Sensing Images and Ndsm Over Urban Areas
NASA Astrophysics Data System (ADS)
Sun, X. F.; Lin, X. G.
2017-09-01
As an intermediate step between raw remote sensing data and digital urban maps, remote sensing data classification has been a challenging and long-standing research problem in the community of remote sensing. In this work, an effective classification method is proposed for classifying high-resolution remote sensing data over urban areas. Starting from high resolution multi-spectral images and 3D geometry data, our method proceeds in three main stages: feature extraction, classification, and classified result refinement. First, we extract color, vegetation index and texture features from the multi-spectral image and compute the height, elevation texture and differential morphological profile (DMP) features from the 3D geometry data. Then in the classification stage, multiple random forest (RF) classifiers are trained separately, then combined to form a RF ensemble to estimate each sample's category probabilities. Finally the probabilities along with the feature importance indicator outputted by RF ensemble are used to construct a fully connected conditional random field (FCCRF) graph model, by which the classification results are refined through mean-field based statistical inference. Experiments on the ISPRS Semantic Labeling Contest dataset show that our proposed 3-stage method achieves 86.9% overall accuracy on the test data.
Predicting Future Morphological Changes of Lesions from Radiotracer Uptake in 18F-FDG-PET Images
Bagci, Ulas; Yao, Jianhua; Miller-Jaster, Kirsten; Chen, Xinjian; Mollura, Daniel J.
2013-01-01
We introduce a novel computational framework to enable automated identification of texture and shape features of lesions on 18F-FDG-PET images through a graph-based image segmentation method. The proposed framework predicts future morphological changes of lesions with high accuracy. The presented methodology has several benefits over conventional qualitative and semi-quantitative methods, due to its fully quantitative nature and high accuracy in each step of (i) detection, (ii) segmentation, and (iii) feature extraction. To evaluate our proposed computational framework, thirty patients received 2 18F-FDG-PET scans (60 scans total), at two different time points. Metastatic papillary renal cell carcinoma, cerebellar hemongioblastoma, non-small cell lung cancer, neurofibroma, lymphomatoid granulomatosis, lung neoplasm, neuroendocrine tumor, soft tissue thoracic mass, nonnecrotizing granulomatous inflammation, renal cell carcinoma with papillary and cystic features, diffuse large B-cell lymphoma, metastatic alveolar soft part sarcoma, and small cell lung cancer were included in this analysis. The radiotracer accumulation in patients' scans was automatically detected and segmented by the proposed segmentation algorithm. Delineated regions were used to extract shape and textural features, with the proposed adaptive feature extraction framework, as well as standardized uptake values (SUV) of uptake regions, to conduct a broad quantitative analysis. Evaluation of segmentation results indicates that our proposed segmentation algorithm has a mean dice similarity coefficient of 85.75±1.75%. We found that 28 of 68 extracted imaging features were correlated well with SUVmax (p<0.05), and some of the textural features (such as entropy and maximum probability) were superior in predicting morphological changes of radiotracer uptake regions longitudinally, compared to single intensity feature such as SUVmax. We also found that integrating textural features with SUV measurements significantly improves the prediction accuracy of morphological changes (Spearman correlation coefficient = 0.8715, p<2e-16). PMID:23431398
Probabilistic graphlet transfer for photo cropping.
Zhang, Luming; Song, Mingli; Zhao, Qi; Liu, Xiao; Bu, Jiajun; Chen, Chun
2013-02-01
As one of the most basic photo manipulation processes, photo cropping is widely used in the printing, graphic design, and photography industries. In this paper, we introduce graphlets (i.e., small connected subgraphs) to represent a photo's aesthetic features, and propose a probabilistic model to transfer aesthetic features from the training photo onto the cropped photo. In particular, by segmenting each photo into a set of regions, we construct a region adjacency graph (RAG) to represent the global aesthetic feature of each photo. Graphlets are then extracted from the RAGs, and these graphlets capture the local aesthetic features of the photos. Finally, we cast photo cropping as a candidate-searching procedure on the basis of a probabilistic model, and infer the parameters of the cropped photos using Gibbs sampling. The proposed method is fully automatic. Subjective evaluations have shown that it is preferred over a number of existing approaches.
Lung lobe segmentation based on statistical atlas and graph cuts
NASA Astrophysics Data System (ADS)
Nimura, Yukitaka; Kitasaka, Takayuki; Honma, Hirotoshi; Takabatake, Hirotsugu; Mori, Masaki; Natori, Hiroshi; Mori, Kensaku
2012-03-01
This paper presents a novel method that can extract lung lobes by utilizing probability atlas and multilabel graph cuts. Information about pulmonary structures plays very important role for decision of the treatment strategy and surgical planning. The human lungs are divided into five anatomical regions, the lung lobes. Precise segmentation and recognition of lung lobes are indispensable tasks in computer aided diagnosis systems and computer aided surgery systems. A lot of methods for lung lobe segmentation are proposed. However, these methods only target the normal cases. Therefore, these methods cannot extract the lung lobes in abnormal cases, such as COPD cases. To extract lung lobes in abnormal cases, this paper propose a lung lobe segmentation method based on probability atlas of lobe location and multilabel graph cuts. The process consists of three components; normalization based on the patient's physique, probability atlas generation, and segmentation based on graph cuts. We apply this method to six cases of chest CT images including COPD cases. Jaccard index was 79.1%.
Scenario driven data modelling: a method for integrating diverse sources of data and data streams
Brettin, Thomas S.; Cottingham, Robert W.; Griffith, Shelton D.; Quest, Daniel J.
2015-09-08
A system and method of integrating diverse sources of data and data streams is presented. The method can include selecting a scenario based on a topic, creating a multi-relational directed graph based on the scenario, identifying and converting resources in accordance with the scenario and updating the multi-directed graph based on the resources, identifying data feeds in accordance with the scenario and updating the multi-directed graph based on the data feeds, identifying analytical routines in accordance with the scenario and updating the multi-directed graph using the analytical routines and identifying data outputs in accordance with the scenario and defining queries to produce the data outputs from the multi-directed graph.
Human body segmentation via data-driven graph cut.
Li, Shifeng; Lu, Huchuan; Shao, Xingqing
2014-11-01
Human body segmentation is a challenging and important problem in computer vision. Existing methods usually entail a time-consuming training phase for prior knowledge learning with complex shape matching for body segmentation. In this paper, we propose a data-driven method that integrates top-down body pose information and bottom-up low-level visual cues for segmenting humans in static images within the graph cut framework. The key idea of our approach is first to exploit human kinematics to search for body part candidates via dynamic programming for high-level evidence. Then, by using the body parts classifiers, obtaining bottom-up cues of human body distribution for low-level evidence. All the evidence collected from top-down and bottom-up procedures are integrated in a graph cut framework for human body segmentation. Qualitative and quantitative experiment results demonstrate the merits of the proposed method in segmenting human bodies with arbitrary poses from cluttered backgrounds.
Learning a Health Knowledge Graph from Electronic Medical Records.
Rotmensch, Maya; Halpern, Yoni; Tlimat, Abdulhakim; Horng, Steven; Sontag, David
2017-07-20
Demand for clinical decision support systems in medicine and self-diagnostic symptom checkers has substantially increased in recent years. Existing platforms rely on knowledge bases manually compiled through a labor-intensive process or automatically derived using simple pairwise statistics. This study explored an automated process to learn high quality knowledge bases linking diseases and symptoms directly from electronic medical records. Medical concepts were extracted from 273,174 de-identified patient records and maximum likelihood estimation of three probabilistic models was used to automatically construct knowledge graphs: logistic regression, naive Bayes classifier and a Bayesian network using noisy OR gates. A graph of disease-symptom relationships was elicited from the learned parameters and the constructed knowledge graphs were evaluated and validated, with permission, against Google's manually-constructed knowledge graph and against expert physician opinions. Our study shows that direct and automated construction of high quality health knowledge graphs from medical records using rudimentary concept extraction is feasible. The noisy OR model produces a high quality knowledge graph reaching precision of 0.85 for a recall of 0.6 in the clinical evaluation. Noisy OR significantly outperforms all tested models across evaluation frameworks (p < 0.01).
Bordier, Cecile; Puja, Francesco; Macaluso, Emiliano
2013-01-01
The investigation of brain activity using naturalistic, ecologically-valid stimuli is becoming an important challenge for neuroscience research. Several approaches have been proposed, primarily relying on data-driven methods (e.g. independent component analysis, ICA). However, data-driven methods often require some post-hoc interpretation of the imaging results to draw inferences about the underlying sensory, motor or cognitive functions. Here, we propose using a biologically-plausible computational model to extract (multi-)sensory stimulus statistics that can be used for standard hypothesis-driven analyses (general linear model, GLM). We ran two separate fMRI experiments, which both involved subjects watching an episode of a TV-series. In Exp 1, we manipulated the presentation by switching on-and-off color, motion and/or sound at variable intervals, whereas in Exp 2, the video was played in the original version, with all the consequent continuous changes of the different sensory features intact. Both for vision and audition, we extracted stimulus statistics corresponding to spatial and temporal discontinuities of low-level features, as well as a combined measure related to the overall stimulus saliency. Results showed that activity in occipital visual cortex and the superior temporal auditory cortex co-varied with changes of low-level features. Visual saliency was found to further boost activity in extra-striate visual cortex plus posterior parietal cortex, while auditory saliency was found to enhance activity in the superior temporal cortex. Data-driven ICA analyses of the same datasets also identified “sensory” networks comprising visual and auditory areas, but without providing specific information about the possible underlying processes, e.g., these processes could relate to modality, stimulus features and/or saliency. We conclude that the combination of computational modeling and GLM enables the tracking of the impact of bottom–up signals on brain activity during viewing of complex and dynamic multisensory stimuli, beyond the capability of purely data-driven approaches. PMID:23202431
Scoring nuclear pleomorphism using a visual BoF modulated by a graph structure
NASA Astrophysics Data System (ADS)
Moncayo-Martínez, Ricardo; Romo-Bucheli, David; Arias, Viviana; Romero, Eduardo
2017-11-01
Nuclear pleomorphism has been recognized as a key histological criterium in breast cancer grading systems (such as Bloom Richardson and Nothingham grading systems). However, the nuclear pleomorphism assessment is subjective and presents high inter-reader variability. Automatic algorithms might facilitate quantitative estimation of nuclear variations in shape and size. Nevertheless, the automatic segmentation of the nuclei is difficult and still and open research problem. This paper presents a method using a bag of multi-scale visual features, modulated by a graph structure, to grade nuclei in breast cancer microscopical fields. This strategy constructs hematoxylin-eosin image patches, each containing a nucleus that is represented by a set of visual words in the BoF. The contribution of each visual word is computed by examining the visual words in an associated graph built when projecting the multi-dimensional BoF to a bi-dimensional plane where local relationships are conserved. The methodology was evaluated using 14 breast cancer cases of the Cancer Genome Atlas database. From these cases, a set of 134 microscopical fields was extracted, and under a leave-one-out validation scheme, an average F-score of 0.68 was obtained.
Graph-based sensor fusion for classification of transient acoustic signals.
Srinivas, Umamahesh; Nasrabadi, Nasser M; Monga, Vishal
2015-03-01
Advances in acoustic sensing have enabled the simultaneous acquisition of multiple measurements of the same physical event via co-located acoustic sensors. We exploit the inherent correlation among such multiple measurements for acoustic signal classification, to identify the launch/impact of munition (i.e., rockets, mortars). Specifically, we propose a probabilistic graphical model framework that can explicitly learn the class conditional correlations between the cepstral features extracted from these different measurements. Additionally, we employ symbolic dynamic filtering-based features, which offer improvements over the traditional cepstral features in terms of robustness to signal distortions. Experiments on real acoustic data sets show that our proposed algorithm outperforms conventional classifiers as well as the recently proposed joint sparsity models for multisensor acoustic classification. Additionally our proposed algorithm is less sensitive to insufficiency in training samples compared to competing approaches.
Yang, Zhihao; Lin, Yuan; Wu, Jiajin; Tang, Nan; Lin, Hongfei; Li, Yanpeng
2011-10-01
Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. However, the volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database curators to detect and curate protein interaction information manually. We present a multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, and graph and combines their output with Ranking support vector machine (SVM). Experimental evaluations show that the features in individual kernels are complementary and the kernel combined with Ranking SVM achieves better performance than those of the individual kernels, equal weight combination and optimal weight combination. Our approach can achieve state-of-the-art performance with respect to the comparable evaluations, with 64.88% F-score and 88.02% AUC on the AImed corpus. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Key-Node-Separated Graph Clustering and Layouts for Human Relationship Graph Visualization.
Itoh, Takayuki; Klein, Karsten
2015-01-01
Many graph-drawing methods apply node-clustering techniques based on the density of edges to find tightly connected subgraphs and then hierarchically visualize the clustered graphs. However, users may want to focus on important nodes and their connections to groups of other nodes for some applications. For this purpose, it is effective to separately visualize the key nodes detected based on adjacency and attributes of the nodes. This article presents a graph visualization technique for attribute-embedded graphs that applies a graph-clustering algorithm that accounts for the combination of connections and attributes. The graph clustering step divides the nodes according to the commonality of connected nodes and similarity of feature value vectors. It then calculates the distances between arbitrary pairs of clusters according to the number of connecting edges and the similarity of feature value vectors and finally places the clusters based on the distances. Consequently, the technique separates important nodes that have connections to multiple large clusters and improves the visibility of such nodes' connections. To test this technique, this article presents examples with human relationship graph datasets, including a coauthorship and Twitter communication network dataset.
Flight Simulator: Use of SpaceGraph Display in an Instructor/Operator Station. Final Report.
ERIC Educational Resources Information Center
Sher, Lawrence D.
This report describes SpaceGraph, a new computer-driven display technology capable of showing space-filling images, i.e., true three dimensional displays, and discusses the advantages of this technology over flat displays for use with the instructor/operator station (IOS) of a flight simulator. Ideas resulting from 17 brainstorming sessions with…
Optimized Graph Learning Using Partial Tags and Multiple Features for Image and Video Annotation.
Song, Jingkuan; Gao, Lianli; Nie, Feiping; Shen, Heng Tao; Yan, Yan; Sebe, Nicu
2016-11-01
In multimedia annotation, due to the time constraints and the tediousness of manual tagging, it is quite common to utilize both tagged and untagged data to improve the performance of supervised learning when only limited tagged training data are available. This is often done by adding a geometry-based regularization term in the objective function of a supervised learning model. In this case, a similarity graph is indispensable to exploit the geometrical relationships among the training data points, and the graph construction scheme essentially determines the performance of these graph-based learning algorithms. However, most of the existing works construct the graph empirically and are usually based on a single feature without using the label information. In this paper, we propose a semi-supervised annotation approach by learning an optimized graph (OGL) from multi-cues (i.e., partial tags and multiple features), which can more accurately embed the relationships among the data points. Since OGL is a transductive method and cannot deal with novel data points, we further extend our model to address the out-of-sample issue. Extensive experiments on image and video annotation show the consistent superiority of OGL over the state-of-the-art methods.
A graph-based approach for the retrieval of multi-modality medical images.
Kumar, Ashnil; Kim, Jinman; Wen, Lingfeng; Fulham, Michael; Feng, Dagan
2014-02-01
In this paper, we address the retrieval of multi-modality medical volumes, which consist of two different imaging modalities, acquired sequentially, from the same scanner. One such example, positron emission tomography and computed tomography (PET-CT), provides physicians with complementary functional and anatomical features as well as spatial relationships and has led to improved cancer diagnosis, localisation, and staging. The challenge of multi-modality volume retrieval for cancer patients lies in representing the complementary geometric and topologic attributes between tumours and organs. These attributes and relationships, which are used for tumour staging and classification, can be formulated as a graph. It has been demonstrated that graph-based methods have high accuracy for retrieval by spatial similarity. However, naïvely representing all relationships on a complete graph obscures the structure of the tumour-anatomy relationships. We propose a new graph structure derived from complete graphs that structurally constrains the edges connected to tumour vertices based upon the spatial proximity of tumours and organs. This enables retrieval on the basis of tumour localisation. We also present a similarity matching algorithm that accounts for different feature sets for graph elements from different imaging modalities. Our method emphasises the relationships between a tumour and related organs, while still modelling patient-specific anatomical variations. Constraining tumours to related anatomical structures improves the discrimination potential of graphs, making it easier to retrieve similar images based on tumour location. We evaluated our retrieval methodology on a dataset of clinical PET-CT volumes. Our results showed that our method enabled the retrieval of multi-modality images using spatial features. Our graph-based retrieval algorithm achieved a higher precision than several other retrieval techniques: gray-level histograms as well as state-of-the-art methods such as visual words using the scale- invariant feature transform (SIFT) and relational matrices representing the spatial arrangements of objects. Copyright © 2013 Elsevier B.V. All rights reserved.
Approximate Computing Techniques for Iterative Graph Algorithms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Panyala, Ajay R.; Subasi, Omer; Halappanavar, Mahantesh
Approximate computing enables processing of large-scale graphs by trading off quality for performance. Approximate computing techniques have become critical not only due to the emergence of parallel architectures but also the availability of large scale datasets enabling data-driven discovery. Using two prototypical graph algorithms, PageRank and community detection, we present several approximate computing heuristics to scale the performance with minimal loss of accuracy. We present several heuristics including loop perforation, data caching, incomplete graph coloring and synchronization, and evaluate their efficiency. We demonstrate performance improvements of up to 83% for PageRank and up to 450x for community detection, with lowmore » impact of accuracy for both the algorithms. We expect the proposed approximate techniques will enable scalable graph analytics on data of importance to several applications in science and their subsequent adoption to scale similar graph algorithms.« less
Graph-based Data Modeling and Analysis for Data Fusion in Remote Sensing
NASA Astrophysics Data System (ADS)
Fan, Lei
Hyperspectral imaging provides the capability of increased sensitivity and discrimination over traditional imaging methods by combining standard digital imaging with spectroscopic methods. For each individual pixel in a hyperspectral image (HSI), a continuous spectrum is sampled as the spectral reflectance/radiance signature to facilitate identification of ground cover and surface material. The abundant spectrum knowledge allows all available information from the data to be mined. The superior qualities within hyperspectral imaging allow wide applications such as mineral exploration, agriculture monitoring, and ecological surveillance, etc. The processing of massive high-dimensional HSI datasets is a challenge since many data processing techniques have a computational complexity that grows exponentially with the dimension. Besides, a HSI dataset may contain a limited number of degrees of freedom due to the high correlations between data points and among the spectra. On the other hand, merely taking advantage of the sampled spectrum of individual HSI data point may produce inaccurate results due to the mixed nature of raw HSI data, such as mixed pixels, optical interferences and etc. Fusion strategies are widely adopted in data processing to achieve better performance, especially in the field of classification and clustering. There are mainly three types of fusion strategies, namely low-level data fusion, intermediate-level feature fusion, and high-level decision fusion. Low-level data fusion combines multi-source data that is expected to be complementary or cooperative. Intermediate-level feature fusion aims at selection and combination of features to remove redundant information. Decision level fusion exploits a set of classifiers to provide more accurate results. The fusion strategies have wide applications including HSI data processing. With the fast development of multiple remote sensing modalities, e.g. Very High Resolution (VHR) optical sensors, LiDAR, etc., fusion of multi-source data can in principal produce more detailed information than each single source. On the other hand, besides the abundant spectral information contained in HSI data, features such as texture and shape may be employed to represent data points from a spatial perspective. Furthermore, feature fusion also includes the strategy of removing redundant and noisy features in the dataset. One of the major problems in machine learning and pattern recognition is to develop appropriate representations for complex nonlinear data. In HSI processing, a particular data point is usually described as a vector with coordinates corresponding to the intensities measured in the spectral bands. This vector representation permits the application of linear and nonlinear transformations with linear algebra to find an alternative representation of the data. More generally, HSI is multi-dimensional in nature and the vector representation may lose the contextual correlations. Tensor representation provides a more sophisticated modeling technique and a higher-order generalization to linear subspace analysis. In graph theory, data points can be generalized as nodes with connectivities measured from the proximity of a local neighborhood. The graph-based framework efficiently characterizes the relationships among the data and allows for convenient mathematical manipulation in many applications, such as data clustering, feature extraction, feature selection and data alignment. In this thesis, graph-based approaches applied in the field of multi-source feature and data fusion in remote sensing area are explored. We will mainly investigate the fusion of spatial, spectral and LiDAR information with linear and multilinear algebra under graph-based framework for data clustering and classification problems.
NASA Astrophysics Data System (ADS)
Panahi, Nima S.
We studied the problem of understanding and computing the essential features and dynamics of molecular motions through the development of two theories for two different systems. First, we studied the process of the Berry Pseudorotation of PF5 and the rotations it induces in the molecule through its natural and intrinsic geometric nature by setting it in the language of fiber bundles and graph theory. With these tools, we successfully extracted the essentials of the process' loops and induced rotations. The infinite number of pseudorotation loops were broken down into a small set of essential loops called "super loops", with their intrinsic properties and link to the physical movements of the molecule extensively studied. In addition, only the three "self-edge loops" generated any induced rotations, and then only a finite number of classes of them. Second, we studied applying the statistical methods of Principal Components Analysis (PCA) and Principal Coordinate Analysis (PCO) to capture only the most important changes in Argon clusters so as to reduce computational costs and graph the potential energy surface (PES) in three dimensions respectively. Both methods proved successful, but PCA was only partially successful since one will only see advantages for PES database systems much larger than those both currently being studied and those that can be computationally studied in the next few decades to come. In addition, PCA is only needed for the very rare case of a PES database that does not already include Hessian eigenvalues.
Neuro-symbolic representation learning on biological knowledge graphs.
Alshahrani, Mona; Khan, Mohammad Asif; Maddouri, Omar; Kinjo, Akira R; Queralt-Rosinach, Núria; Hoehndorf, Robert
2017-09-01
Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph-structured data are becoming available, but have not yet widely been applied and evaluated on structured biological knowledge. Results: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode for related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing amount of Semantic Web based knowledge bases in biology to use in machine learning and data analytics. https://github.com/bio-ontology-research-group/walking-rdf-and-owl. robert.hoehndorf@kaust.edu.sa. Supplementary data are available at Bioinformatics online. © The Author(s) 2017. Published by Oxford University Press.
Content-based image retrieval by matching hierarchical attributed region adjacency graphs
NASA Astrophysics Data System (ADS)
Fischer, Benedikt; Thies, Christian J.; Guld, Mark O.; Lehmann, Thomas M.
2004-05-01
Content-based image retrieval requires a formal description of visual information. In medical applications, all relevant biological objects have to be represented by this description. Although color as the primary feature has proven successful in publicly available retrieval systems of general purpose, this description is not applicable to most medical images. Additionally, it has been shown that global features characterizing the whole image do not lead to acceptable results in the medical context or that they are only suitable for specific applications. For a general purpose content-based comparison of medical images, local, i.e. regional features that are collected on multiple scales must be used. A hierarchical attributed region adjacency graph (HARAG) provides such a representation and transfers image comparison to graph matching. However, building a HARAG from an image requires a restriction in size to be computationally feasible while at the same time all visually plausible information must be preserved. For this purpose, mechanisms for the reduction of the graph size are presented. Even with a reduced graph, the problem of graph matching remains NP-complete. In this paper, the Similarity Flooding approach and Hopfield-style neural networks are adapted from the graph matching community to the needs of HARAG comparison. Based on synthetic image material build from simple geometric objects, all visually similar regions were matched accordingly showing the framework's general applicability to content-based image retrieval of medical images.
Building a SuAVE browse interface to R2R's Linked Data
NASA Astrophysics Data System (ADS)
Clark, D.; Stocks, K. I.; Arko, R. A.; Zaslavsky, I.; Whitenack, T.
2017-12-01
The Rolling Deck to Repository program (R2R) is creating and evaluating a new browse portal based on the SuAVE platform and the R2R linked data graph. R2R manages the underway sensor data collected by the fleet of US academic research vessels, and provides a discovery and access point to those data at its website, www.rvdata.us. R2R has a database-driven search interface, but seeks a more capable and extensible browse interface that could be built off of the substantial R2R linked data resources. R2R's Linked Data graph organizes its data holdings around key concepts (e.g. cruise, vessel, device type, operator, award, organization, publication), anchored by persistent identifiers where feasible. The "Survey Analysis via Visual Exploration" or SuAVE platform (suave.sdsc.edu) is a system for online publication, sharing, and analysis of images and metadata. It has been implemented as an interface to diverse data collections, but has not been driven off of linked data in the past. SuAVE supports several features of interest to R2R, including faceted searching, collaborative annotations, efficient subsetting, Google maps-like navigation over an image gallery, and several types of data analysis. Our initial SuAVE-based implementation was through a CSV export from the R2R PostGIS-enabled PostgreSQL database. This served to demonstrate the utility of SuAVE but was static and required reloading as R2R data holdings grew. We are now working to implement a SPARQL-based ("RDF Query Language") service that directly leverages the R2R Linked Data graph and offers the ability to subset and/or customize output.We will show examples of SuAVE faceted searches on R2R linked data concepts, and discuss our experience to date with this work in progress.
Principal curve detection in complicated graph images
NASA Astrophysics Data System (ADS)
Liu, Yuncai; Huang, Thomas S.
2001-09-01
Finding principal curves in an image is an important low level processing in computer vision and pattern recognition. Principal curves are those curves in an image that represent boundaries or contours of objects of interest. In general, a principal curve should be smooth with certain length constraint and allow either smooth or sharp turning. In this paper, we present a method that can efficiently detect principal curves in complicated map images. For a given feature image, obtained from edge detection of an intensity image or thinning operation of a pictorial map image, the feature image is first converted to a graph representation. In graph image domain, the operation of principal curve detection is performed to identify useful image features. The shortest path and directional deviation schemes are used in our algorithm os principal verve detection, which is proven to be very efficient working with real graph images.
From Specific Information Extraction to Inferences: A Hierarchical Framework of Graph Comprehension
2004-09-01
The skill to interpret the information displayed in graphs is so important to have, the National Council of Teachers of Mathematics has created...guidelines to ensure that students learn these skills ( NCTM : Standards for Mathematics , 2003). These guidelines are based primarily on the extraction of...graphical perception. Human Computer Interaction, 8, 353-388. NCTM : Standards for Mathematics . (2003, 2003). Peebles, D., & Cheng, P. C.-H. (2002
Natural image classification driven by human brain activity
NASA Astrophysics Data System (ADS)
Zhang, Dai; Peng, Hanyang; Wang, Jinqiao; Tang, Ming; Xue, Rong; Zuo, Zhentao
2016-03-01
Natural image classification has been a hot topic in computer vision and pattern recognition research field. Since the performance of an image classification system can be improved by feature selection, many image feature selection methods have been developed. However, the existing supervised feature selection methods are typically driven by the class label information that are identical for different samples from the same class, ignoring with-in class image variability and therefore degrading the feature selection performance. In this study, we propose a novel feature selection method, driven by human brain activity signals collected using fMRI technique when human subjects were viewing natural images of different categories. The fMRI signals associated with subjects viewing different images encode the human perception of natural images, and therefore may capture image variability within- and cross- categories. We then select image features with the guidance of fMRI signals from brain regions with active response to image viewing. Particularly, bag of words features based on GIST descriptor are extracted from natural images for classification, and a sparse regression base feature selection method is adapted to select image features that can best predict fMRI signals. Finally, a classification model is built on the select image features to classify images without fMRI signals. The validation experiments for classifying images from 4 categories of two subjects have demonstrated that our method could achieve much better classification performance than the classifiers built on image feature selected by traditional feature selection methods.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2017-01-01
Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Accident-related autopsy reports were obtained from one of the largest hospital in Kuala Lumpur. These reports belong to nine different accident-related causes of death. Master feature vector was prepared by extracting features from the collected autopsy reports by using unigram with lexical categorization. This master feature vector was used to detect cause of death [according to internal classification of disease version 10 (ICD-10) classification system] through five automated feature selection schemes, proposed expert-driven approach, five subset sizes of features, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under ROC curve. Four baselines were used to compare the results with the proposed system. Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measure approaching (85% to 90%) for most metrics by using a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in the overall accuracy compared with the existing techniques and four baselines. The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports. The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2017-01-01
Objectives Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Methods Accident-related autopsy reports were obtained from one of the largest hospital in Kuala Lumpur. These reports belong to nine different accident-related causes of death. Master feature vector was prepared by extracting features from the collected autopsy reports by using unigram with lexical categorization. This master feature vector was used to detect cause of death [according to internal classification of disease version 10 (ICD-10) classification system] through five automated feature selection schemes, proposed expert-driven approach, five subset sizes of features, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under ROC curve. Four baselines were used to compare the results with the proposed system. Results Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measure approaching (85% to 90%) for most metrics by using a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in the overall accuracy compared with the existing techniques and four baselines. Conclusion The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports. The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports. PMID:28166263
Network reconstruction via graph blending
NASA Astrophysics Data System (ADS)
Estrada, Rolando
2016-05-01
Graphs estimated from empirical data are often noisy and incomplete due to the difficulty of faithfully observing all the components (nodes and edges) of the true graph. This problem is particularly acute for large networks where the number of components may far exceed available surveillance capabilities. Errors in the observed graph can render subsequent analyses invalid, so it is vital to develop robust methods that can minimize these observational errors. Errors in the observed graph may include missing and spurious components, as well fused (multiple nodes are merged into one) and split (a single node is misinterpreted as many) nodes. Traditional graph reconstruction methods are only able to identify missing or spurious components (primarily edges, and to a lesser degree nodes), so we developed a novel graph blending framework that allows us to cast the full estimation problem as a simple edge addition/deletion problem. Armed with this framework, we systematically investigate the viability of various topological graph features, such as the degree distribution or the clustering coefficients, and existing graph reconstruction methods for tackling the full estimation problem. Our experimental results suggest that incorporating any topological feature as a source of information actually hinders reconstruction accuracy. We provide a theoretical analysis of this phenomenon and suggest several avenues for improving this estimation problem.
Data-Driven Packet Loss Estimation for Node Healthy Sensing in Decentralized Cluster.
Fan, Hangyu; Wang, Huandong; Li, Yong
2018-01-23
Decentralized clustering of modern information technology is widely adopted in various fields these years. One of the main reason is the features of high availability and the failure-tolerance which can prevent the entire system form broking down by a failure of a single point. Recently, toolkits such as Akka are used by the public commonly to easily build such kind of cluster. However, clusters of such kind that use Gossip as their membership managing protocol and use link failure detecting mechanism to detect link failures cannot deal with the scenario that a node stochastically drops packets and corrupts the member status of the cluster. In this paper, we formulate the problem to be evaluating the link quality and finding a max clique (NP-Complete) in the connectivity graph. We then proposed an algorithm that consists of two models driven by data from application layer to respectively solving these two problems. Through simulations with statistical data and a real-world product, we demonstrate that our algorithm has a good performance.
Simultaneous grouping pursuit and feature selection over an undirected graph*
Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei
2013-01-01
Summary In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients can be grouped, whose absolute values are the same or close. This is motivated from gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into analysis through an undirected graph. PMID:24098061
The Effect of Emergent Features on Judgments of Quantity in Configural and Separable Displays
ERIC Educational Resources Information Center
Peebles, David
2008-01-01
Two experiments investigated effects of emergent features on perceptual judgments of comparative magnitude in three diagrammatic representations: kiviat charts, bar graphs, and line graphs. Experiment 1 required participants to compare individual values; whereas in Experiment 2 participants had to integrate several values to produce a global…
New methods for analyzing semantic graph based assessments in science education
NASA Astrophysics Data System (ADS)
Vikaros, Lance Steven
This research investigated how the scoring of semantic graphs (known by many as concept maps) could be improved and automated in order to address issues of inter-rater reliability and scalability. As part of the NSF funded SENSE-IT project to introduce secondary school science students to sensor networks (NSF Grant No. 0833440), semantic graphs illustrating how temperature change affects water ecology were collected from 221 students across 16 schools. The graphing task did not constrain students' use of terms, as is often done with semantic graph based assessment due to coding and scoring concerns. The graphing software used provided real-time feedback to help students learn how to construct graphs, stay on topic and effectively communicate ideas. The collected graphs were scored by human raters using assessment methods expected to boost reliability, which included adaptations of traditional holistic and propositional scoring methods, use of expert raters, topical rubrics, and criterion graphs. High levels of inter-rater reliability were achieved, demonstrating that vocabulary constraints may not be necessary after all. To investigate a new approach to automating the scoring of graphs, thirty-two different graph features characterizing graphs' structure, semantics, configuration and process of construction were then used to predict human raters' scoring of graphs in order to identify feature patterns correlated to raters' evaluations of graphs' topical accuracy and complexity. Results led to the development of a regression model able to predict raters' scoring with 77% accuracy, with 46% accuracy expected when used to score new sets of graphs, as estimated via cross-validation tests. Although such performance is comparable to other graph and essay based scoring systems, cross-context testing of the model and methods used to develop it would be needed before it could be recommended for widespread use. Still, the findings suggest techniques for improving the reliability and scalability of semantic graph based assessments without requiring constraint of how ideas are expressed.
On the structure of Bayesian network for Indonesian text document paraphrase identification
NASA Astrophysics Data System (ADS)
Prayogo, Ario Harry; Syahrul Mubarok, Mohamad; Adiwijaya
2018-03-01
Paraphrase identification is an important process within natural language processing. The idea is to automatically recognize phrases that have different forms but contain same meanings. For examples if we input query “causing fire hazard”, then the computer has to recognize this query that this query has same meaning as “the cause of fire hazard. Paraphrasing is an activity that reveals the meaning of an expression, writing, or speech using different words or forms, especially to achieve greater clarity. In this research we will focus on classifying two Indonesian sentences whether it is a paraphrase to each other or not. There are four steps in this research, first is preprocessing, second is feature extraction, third is classifier building, and the last is performance evaluation. Preprocessing consists of tokenization, non-alphanumerical removal, and stemming. After preprocessing we will conduct feature extraction in order to build new features from given dataset. There are two kinds of features in the research, syntactic features and semantic features. Syntactic features consist of normalized levenshtein distance feature, term-frequency based cosine similarity feature, and LCS (Longest Common Subsequence) feature. Semantic features consist of Wu and Palmer feature and Shortest Path Feature. We use Bayesian Networks as the method of training the classifier. Parameter estimation that we use is called MAP (Maximum A Posteriori). For structure learning of Bayesian Networks DAG (Directed Acyclic Graph), we use BDeu (Bayesian Dirichlet equivalent uniform) scoring function and for finding DAG with the best BDeu score, we use K2 algorithm. In evaluation step we perform cross-validation. The average result that we get from testing the classifier as follows: Precision 75.2%, Recall 76.5%, F1-Measure 75.8% and Accuracy 75.6%.
Shojaedini, Seyed Vahab; Heydari, Masoud
2014-10-01
Shape and movement features of sperms are important parameters for infertility study and treatment. In this article, a new method is introduced for characterization sperms in microscopic videos. In this method, first a hypothesis framework is defined to distinguish sperms from other particles in captured video. Then decision about each hypothesis is done in following steps: Selecting some primary regions as candidates for sperms by watershed-based segmentation, pruning of some false candidates during successive frames using graph theory concept and finally confirming correct sperms by using their movement trajectories. Performance of the proposed method is evaluated on real captured images belongs to semen with high density of sperms. The obtained results show the proposed method may detect 97% of sperms in presence of 5% false detections and track 91% of moving sperms. Furthermore, it can be shown that better characterization of sperms in proposed algorithm doesn't lead to extracting more false sperms compared to some present approaches.
Two classes of bipartite networks: nested biological and social systems.
Burgos, Enrique; Ceva, Horacio; Hernández, Laura; Perazzo, R P J; Devoto, Mariano; Medan, Diego
2008-10-01
Bipartite graphs have received some attention in the study of social networks and of biological mutualistic systems. A generalization of a previous model is presented, that evolves the topology of the graph in order to optimally account for a given contact preference rule between the two guilds of the network. As a result, social and biological graphs are classified as belonging to two clearly different classes. Projected graphs, linking the agents of only one guild, are obtained from the original bipartite graph. The corresponding evolution of its statistical properties is also studied. An example of a biological mutualistic network is analyzed in detail, and it is found that the model provides a very good fitting of all the main statistical features. The model also provides a proper qualitative description of the same features observed in social webs, suggesting the possible reasons underlying the difference in the organization of these two kinds of bipartite networks.
Yu, Qingbao; Du, Yuhui; Chen, Jiayu; He, Hao; Sui, Jing; Pearlson, Godfrey; Calhoun, Vince D
2017-11-01
A key challenge in building a brain graph using fMRI data is how to define the nodes. Spatial brain components estimated by independent components analysis (ICA) and regions of interest (ROIs) determined by brain atlas are two popular methods to define nodes in brain graphs. It is difficult to evaluate which method is better in real fMRI data. Here we perform a simulation study and evaluate the accuracies of a few graph metrics in graphs with nodes of ICA components, ROIs, or modified ROIs in four simulation scenarios. Graph measures with ICA nodes are more accurate than graphs with ROI nodes in all cases. Graph measures with modified ROI nodes are modulated by artifacts. The correlations of graph metrics across subjects between graphs with ICA nodes and ground truth are higher than the correlations between graphs with ROI nodes and ground truth in scenarios with large overlapped spatial sources. Moreover, moving the location of ROIs would largely decrease the correlations in all scenarios. Evaluating graphs with different nodes is promising in simulated data rather than real data because different scenarios can be simulated and measures of different graphs can be compared with a known ground truth. Since ROIs defined using brain atlas may not correspond well to real functional boundaries, overall findings of this work suggest that it is more appropriate to define nodes using data-driven ICA than ROI approaches in real fMRI data. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Klutz, Glenn
1989-01-01
A facility was established that uses collected data and feeds it into mathematical models that generate improved data arrays by correcting for various losses, base line drift, and conversion to unity scaling. These developed data arrays have headers and other identifying information affixed and are subsequently stored in a Laser Materials and Characteristics data base which is accessible to various users. The two part data base: absorption - emission spectra and tabulated data, is developed around twelve laser models. The tabulated section of the data base is divided into several parts: crystalline, optical, mechanical, and thermal properties; aborption and emission spectra information; chemical name and formulas; and miscellaneous. A menu-driven, language-free graphing program will reduce and/or remove the requirement that users become competent FORTRAN programmers and the concomitant requirement that they also spend several days to a few weeks becoming conversant with the GEOGRAF library and sequence of calls and the continual refreshers of both. The work included becoming thoroughly conversant with or at least very familiar with GEOGRAF by GEOCOMP Corp. The development of the graphing program involved trial runs of the various callable library routines on dummy data in order to become familiar with actual implementation and sequencing. This was followed by trial runs with actual data base files and some additional data from current research that was not in the data base but currently needed graphs. After successful runs, with dummy and real data, using actual FORTRAN instructions steps were undertaken to develop the menu-driven language-free implementation of a program which would require the user only know how to use microcomputers. The user would simply be responding to items displayed on the video screen. To assist the user in arriving at the optimum values needed for a specific graph, a paper, and pencil check list was made available to use on the trial runs.
Mohammed, Ameer; Zamani, Majid; Bayford, Richard; Demosthenous, Andreas
2017-12-01
In Parkinson's disease (PD), on-demand deep brain stimulation is required so that stimulation is regulated to reduce side effects resulting from continuous stimulation and PD exacerbation due to untimely stimulation. Also, the progressive nature of PD necessitates the use of dynamic detection schemes that can track the nonlinearities in PD. This paper proposes the use of dynamic feature extraction and dynamic pattern classification to achieve dynamic PD detection taking into account the demand for high accuracy, low computation, and real-time detection. The dynamic feature extraction and dynamic pattern classification are selected by evaluating a subset of feature extraction, dimensionality reduction, and classification algorithms that have been used in brain-machine interfaces. A novel dimensionality reduction technique, the maximum ratio method (MRM) is proposed, which provides the most efficient performance. In terms of accuracy and complexity for hardware implementation, a combination having discrete wavelet transform for feature extraction, MRM for dimensionality reduction, and dynamic k-nearest neighbor for classification was chosen as the most efficient. It achieves a classification accuracy of 99.29%, an F1-score of 97.90%, and a choice probability of 99.86%.
Mathematical modeling of the malignancy of cancer using graph evolution.
Gunduz-Demir, Cigdem
2007-10-01
We report a novel computational method based on graph evolution process to model the malignancy of brain cancer called glioma. In this work, we analyze the phases that a graph passes through during its evolution and demonstrate strong relation between the malignancy of cancer and the phase of its graph. From the photomicrographs of tissues, which are diagnosed as normal, low-grade cancerous and high-grade cancerous, we construct cell-graphs based on the locations of cells; we probabilistically generate an edge between every pair of cells depending on the Euclidean distance between them. For a cell-graph, we extract connectivity information including the properties of its connected components in order to analyze the phase of the cell-graph. Working with brain tissue samples surgically removed from 12 patients, we demonstrate that cell-graphs generated for different tissue types evolve differently and that they exhibit different phase properties, which distinguish a tissue type from another.
Improving the performance of univariate control charts for abnormal detection and classification
NASA Astrophysics Data System (ADS)
Yiakopoulos, Christos; Koutsoudaki, Maria; Gryllias, Konstantinos; Antoniadis, Ioannis
2017-03-01
Bearing failures in rotating machinery can cause machine breakdown and economical loss, if no effective actions are taken on time. Therefore, it is of prime importance to detect accurately the presence of faults, especially at their early stage, to prevent sequent damage and reduce costly downtime. The machinery fault diagnosis follows a roadmap of data acquisition, feature extraction and diagnostic decision making, in which mechanical vibration fault feature extraction is the foundation and the key to obtain an accurate diagnostic result. A challenge in this area is the selection of the most sensitive features for various types of fault, especially when the characteristics of failures are difficult to be extracted. Thus, a plethora of complex data-driven fault diagnosis methods are fed by prominent features, which are extracted and reduced through traditional or modern algorithms. Since most of the available datasets are captured during normal operating conditions, the last decade a number of novelty detection methods, able to work when only normal data are available, have been developed. In this study, a hybrid method combining univariate control charts and a feature extraction scheme is introduced focusing towards an abnormal change detection and classification, under the assumption that measurements under normal operating conditions of the machinery are available. The feature extraction method integrates the morphological operators and the Morlet wavelets. The effectiveness of the proposed methodology is validated on two different experimental cases with bearing faults, demonstrating that the proposed approach can improve the fault detection and classification performance of conventional control charts.
Multiscale Feature Analysis of Salivary Gland Branching Morphogenesis
Baydil, Banu; Daley, William P.; Larsen, Melinda; Yener, Bülent
2012-01-01
Pattern formation in developing tissues involves dynamic spatio-temporal changes in cellular organization and subsequent evolution of functional adult structures. Branching morphogenesis is a developmental mechanism by which patterns are generated in many developing organs, which is controlled by underlying molecular pathways. Understanding the relationship between molecular signaling, cellular behavior and resulting morphological change requires quantification and categorization of the cellular behavior. In this study, tissue-level and cellular changes in developing salivary gland in response to disruption of ROCK-mediated signaling by are modeled by building cell-graphs to compute mathematical features capturing structural properties at multiple scales. These features were used to generate multiscale cell-graph signatures of untreated and ROCK signaling disrupted salivary gland organ explants. From confocal images of mouse submandibular salivary gland organ explants in which epithelial and mesenchymal nuclei were marked, a multiscale feature set capturing global structural properties, local structural properties, spectral, and morphological properties of the tissues was derived. Six feature selection algorithms and multiway modeling of the data was performed to identify distinct subsets of cell graph features that can uniquely classify and differentiate between different cell populations. Multiscale cell-graph analysis was most effective in classification of the tissue state. Cellular and tissue organization, as defined by a multiscale subset of cell-graph features, are both quantitatively distinct in epithelial and mesenchymal cell types both in the presence and absence of ROCK inhibitors. Whereas tensor analysis demonstrate that epithelial tissue was affected the most by inhibition of ROCK signaling, significant multiscale changes in mesenchymal tissue organization were identified with this analysis that were not identified in previous biological studies. We here show how to define and calculate a multiscale feature set as an effective computational approach to identify and quantify changes at multiple biological scales and to distinguish between different states in developing tissues. PMID:22403724
Correlation of Neuromarketing to Neurology
NASA Astrophysics Data System (ADS)
Gupta, Ashutosh; Shreyam, Richa; Garg, Ridhi; Sayed, Tabassum
2017-08-01
The aim of this research work is to identify the most preferred brand of soap in New Delhi through wireless EEG signal through Neuromarketing. A group of four major soap brand advertisements i.e. Pears, Lux, Cinthol and Dove are considered for this research. The advertisement (video) of above these brands are used to stimulate the subjects (9 male and 9 female with age range of 22-30 years) The brain signal responses for the stimuli were collected using a 14 channel wireless headset with a sampling frequency of 128 Hz. The acquired signals are preprocessed using fourth order Butterworth band pass filter. Then feature extraction is done to extract desired features from the EEG signal. The mean value and then power of mean value of each soap brand is calculated. The frequency spectrum of above soap brands is obtained through time-frequency analysis using Short Time Fourier Transform (STFT). The results so obtained are plotted in graphs for final analysis. The present experimental results are analyzed and it is indicated that the subjects are mostly inspired on Dove brand of soap compared to other brands.
Delineation and geometric modeling of road networks
NASA Astrophysics Data System (ADS)
Poullis, Charalambos; You, Suya
In this work we present a novel vision-based system for automatic detection and extraction of complex road networks from various sensor resources such as aerial photographs, satellite images, and LiDAR. Uniquely, the proposed system is an integrated solution that merges the power of perceptual grouping theory (Gabor filtering, tensor voting) and optimized segmentation techniques (global optimization using graph-cuts) into a unified framework to address the challenging problems of geospatial feature detection and classification. Firstly, the local precision of the Gabor filters is combined with the global context of the tensor voting to produce accurate classification of the geospatial features. In addition, the tensorial representation used for the encoding of the data eliminates the need for any thresholds, therefore removing any data dependencies. Secondly, a novel orientation-based segmentation is presented which incorporates the classification of the perceptual grouping, and results in segmentations with better defined boundaries and continuous linear segments. Finally, a set of gaussian-based filters are applied to automatically extract centerline information (magnitude, width and orientation). This information is then used for creating road segments and transforming them to their polygonal representations.
An application of Chan-Vese method used to determine the ROI area in CT lung screening
NASA Astrophysics Data System (ADS)
Prokop, Paweł; Surtel, Wojciech
2016-09-01
The article presents two approaches of determining the ROI area in CT lung screening. First approach is based on a classic method of framing the image in order to determine the ROI by using a MaZda tool. Second approach is based on segmentation of CT images of the lungs and reducing the redundant information from the image. Of the two approaches of an Active Contour, it was decided to choose the Chan-Vese method. In order to determine the effectiveness of the approach, it was performed an analysis of received ROI texture and extraction of textural features. In order to determine the effectiveness of the method, it was performed an analysis of the received ROI textures and extraction of the texture features, by using a Mazda tool. The results were compared and presented in the form of the radar graphs. The second approach proved to be effective and appropriate and consequently it is used for further analysis of CT images, in the computer-aided diagnosis of sarcoidosis.
Superpixel-Augmented Endmember Detection for Hyperspectral Images
NASA Technical Reports Server (NTRS)
Thompson, David R.; Castano, Rebecca; Gilmore, Martha
2011-01-01
Superpixels are homogeneous image regions comprised of several contiguous pixels. They are produced by shattering the image into contiguous, homogeneous regions that each cover between 20 and 100 image pixels. The segmentation aims for a many-to-one mapping from superpixels to image features; each image feature could contain several superpixels, but each superpixel occupies no more than one image feature. This conservative segmentation is relatively easy to automate in a robust fashion. Superpixel processing is related to the more general idea of improving hyperspectral analysis through spatial constraints, which can recognize subtle features at or below the level of noise by exploiting the fact that their spectral signatures are found in neighboring pixels. Recent work has explored spatial constraints for endmember extraction, showing significant advantages over techniques that ignore pixels relative positions. Methods such as AMEE (automated morphological endmember extraction) express spatial influence using fixed isometric relationships a local square window or Euclidean distance in pixel coordinates. In other words, two pixels covariances are based on their spatial proximity, but are independent of their absolute location in the scene. These isometric spatial constraints are most appropriate when spectral variation is smooth and constant over the image. Superpixels are simple to implement, efficient to compute, and are empirically effective. They can be used as a preprocessing step with any desired endmember extraction technique. Superpixels also have a solid theoretical basis in the hyperspectral linear mixing model, making them a principled approach for improving endmember extraction. Unlike existing approaches, superpixels can accommodate non-isometric covariance between image pixels (characteristic of discrete image features separated by step discontinuities). These kinds of image features are common in natural scenes. Analysts can substitute superpixels for image pixels during endmember analysis that leverages the spatial contiguity of scene features to enhance subtle spectral features. Superpixels define populations of image pixels that are independent samples from each image feature, permitting robust estimation of spectral properties, and reducing measurement noise in proportion to the area of the superpixel. This permits improved endmember extraction, and enables automated search for novel and constituent minerals in very noisy, hyperspatial images. This innovation begins with a graph-based segmentation based on the work of Felzenszwalb et al., but then expands their approach to the hyperspectral image domain with a Euclidean distance metric. Then, the mean spectrum of each segment is computed, and the resulting data cloud is used as input into sequential maximum angle convex cone (SMACC) endmember extraction.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wylie, Brian Neil; Moreland, Kenneth D.
Graphs are a vital way of organizing data with complex correlations. A good visualization of a graph can fundamentally change human understanding of the data. Consequently, there is a rich body of work on graph visualization. Although there are many techniques that are effective on small to medium sized graphs (tens of thousands of nodes), there is a void in the research for visualizing massive graphs containing millions of nodes. Sandia is one of the few entities in the world that has the means and motivation to handle data on such a massive scale. For example, homeland security generates graphsmore » from prolific media sources such as television, telephone, and the Internet. The purpose of this project is to provide the groundwork for visualizing such massive graphs. The research provides for two major feature gaps: a parallel, interactive visualization framework and scalable algorithms to make the framework usable to a practical application. Both the frameworks and algorithms are designed to run on distributed parallel computers, which are already available at Sandia. Some features are integrated into the ThreatView{trademark} application and future work will integrate further parallel algorithms.« less
GrouseFlocks: steerable exploration of graph hierarchy space.
Archambault, Daniel; Munzner, Tamara; Auber, David
2008-01-01
Several previous systems allow users to interactively explore a large input graph through cuts of a superimposed hierarchy. This hierarchy is often created using clustering algorithms or topological features present in the graph. However, many graphs have domain-specific attributes associated with the nodes and edges, which could be used to create many possible hierarchies providing unique views of the input graph. GrouseFlocks is a system for the exploration of this graph hierarchy space. By allowing users to see several different possible hierarchies on the same graph, the system helps users investigate graph hierarchy space instead of a single fixed hierarchy. GrouseFlocks provides a simple set of operations so that users can create and modify their graph hierarchies based on selections. These selections can be made manually or based on patterns in the attribute data provided with the graph. It provides feedback to the user within seconds, allowing interactive exploration of this space.
Design features of graphs in health risk communication: a systematic review.
Ancker, Jessica S; Senathirajah, Yalini; Kukafka, Rita; Starren, Justin B
2006-01-01
This review describes recent experimental and focus group research on graphics as a method of communication about quantitative health risks. Some of the studies discussed in this review assessed effect of graphs on quantitative reasoning, others assessed effects on behavior or behavioral intentions, and still others assessed viewers' likes and dislikes. Graphical features that improve the accuracy of quantitative reasoning appear to differ from the features most likely to alter behavior or intentions. For example, graphs that make part-to-whole relationships available visually may help people attend to the relationship between the numerator (the number of people affected by a hazard) and the denominator (the entire population at risk), whereas graphs that show only the numerator appear to inflate the perceived risk and may induce risk-averse behavior. Viewers often preferred design features such as visual simplicity and familiarity that were not associated with accurate quantitative judgments. Communicators should not assume that all graphics are more intuitive than text; many of the studies found that patients' interpretations of the graphics were dependent upon expertise or instruction. Potentially useful directions for continuing research include interactions with educational level and numeracy and successful ways to communicate uncertainty about risk.
HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.
Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B
2017-07-01
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
Quantum walk on a chimera graph
NASA Astrophysics Data System (ADS)
Xu, Shu; Sun, Xiangxiang; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum; Sanders, Barry C.
2018-05-01
We analyse a continuous-time quantum walk on a chimera graph, which is a graph of choice for designing quantum annealers, and we discover beautiful quantum walk features such as localization that starkly distinguishes classical from quantum behaviour. Motivated by technological thrusts, we study continuous-time quantum walk on enhanced variants of the chimera graph and on diminished chimera graph with a random removal of vertices. We explain the quantum walk by constructing a generating set for a suitable subgroup of graph isomorphisms and corresponding symmetry operators that commute with the quantum walk Hamiltonian; the Hamiltonian and these symmetry operators provide a complete set of labels for the spectrum and the stationary states. Our quantum walk characterization of the chimera graph and its variants yields valuable insights into graphs used for designing quantum-annealers.
Zhang, Pin; Liang, Yanmei; Chang, Shengjiang; Fan, Hailun
2013-08-01
Accurate segmentation of renal tissues in abdominal computed tomography (CT) image sequences is an indispensable step for computer-aided diagnosis and pathology detection in clinical applications. In this study, the goal is to develop a radiology tool to extract renal tissues in CT sequences for the management of renal diagnosis and treatments. In this paper, the authors propose a new graph-cuts-based active contours model with an adaptive width of narrow band for kidney extraction in CT image sequences. Based on graph cuts and contextual continuity, the segmentation is carried out slice-by-slice. In the first stage, the middle two adjacent slices in a CT sequence are segmented interactively based on the graph cuts approach. Subsequently, the deformable contour evolves toward the renal boundaries by the proposed model for the kidney extraction of the remaining slices. In this model, the energy function combining boundary with regional information is optimized in the constructed graph and the adaptive search range is determined by contextual continuity and the object size. In addition, in order to reduce the complexity of the min-cut computation, the nodes in the graph only have n-links for fewer edges. The total 30 CT images sequences with normal and pathological renal tissues are used to evaluate the accuracy and effectiveness of our method. The experimental results reveal that the average dice similarity coefficient of these image sequences is from 92.37% to 95.71% and the corresponding standard deviation for each dataset is from 2.18% to 3.87%. In addition, the average automatic segmentation time for one kidney in each slice is about 0.36 s. Integrating the graph-cuts-based active contours model with contextual continuity, the algorithm takes advantages of energy minimization and the characteristics of image sequences. The proposed method achieves effective results for kidney segmentation in CT sequences.
Mathematical foundations of the GraphBLAS
Kepner, Jeremy; Aaltonen, Peter; Bader, David; ...
2016-12-01
The GraphBLAS standard (GraphBlas.org) is being developed to bring the potential of matrix-based graph algorithms to the broadest possible audience. Mathematically, the GraphBLAS defines a core set of matrix-based graph operations that can be used to implement a wide class of graph algorithms in a wide range of programming environments. This study provides an introduction to the mathematics of the GraphBLAS. Graphs represent connections between vertices with edges. Matrices can represent a wide range of graphs using adjacency matrices or incidence matrices. Adjacency matrices are often easier to analyze while incidence matrices are often better for representing data. Fortunately, themore » two are easily connected by matrix multiplication. A key feature of matrix mathematics is that a very small number of matrix operations can be used to manipulate a very wide range of graphs. This composability of a small number of operations is the foundation of the GraphBLAS. A standard such as the GraphBLAS can only be effective if it has low performance overhead. Finally, performance measurements of prototype GraphBLAS implementations indicate that the overhead is low.« less
Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar
2017-01-01
Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems. PMID:29099838
Murugesan, Gurusamy; Abdulkadhar, Sabenabanu; Natarajan, Jeyakumar
2017-01-01
Automatic extraction of protein-protein interaction (PPI) pairs from biomedical literature is a widely examined task in biological information extraction. Currently, many kernel based approaches such as linear kernel, tree kernel, graph kernel and combination of multiple kernels has achieved promising results in PPI task. However, most of these kernel methods fail to capture the semantic relation information between two entities. In this paper, we present a special type of tree kernel for PPI extraction which exploits both syntactic (structural) and semantic vectors information known as Distributed Smoothed Tree kernel (DSTK). DSTK comprises of distributed trees with syntactic information along with distributional semantic vectors representing semantic information of the sentences or phrases. To generate robust machine learning model composition of feature based kernel and DSTK were combined using ensemble support vector machine (SVM). Five different corpora (AIMed, BioInfer, HPRD50, IEPA, and LLL) were used for evaluating the performance of our system. Experimental results show that our system achieves better f-score with five different corpora compared to other state-of-the-art systems.
Jooya, Hossein Z.; Reihani, Kamran; Chu, Shih-I
2016-11-21
We propose a graph-theoretical formalism to study generic circuit quantum electrodynamics systems consisting of a two level qubit coupled with a single-mode resonator in arbitrary coupling strength regimes beyond rotating-wave approximation. We define colored-weighted graphs, and introduce different products between them to investigate the dynamics of superconducting qubits in transverse, longitudinal, and bidirectional coupling schemes. In conclusion, the intuitive and predictive picture provided by this method, and the simplicity of the mathematical construction, are demonstrated with some numerical studies of the multiphoton resonance processes and quantum interference phenomena for the superconducting qubit systems driven by intense ac fields.
MorphoGraphX: A platform for quantifying morphogenesis in 4D.
Barbier de Reuille, Pierre; Routier-Kierzkowska, Anne-Lise; Kierzkowski, Daniel; Bassel, George W; Schüpbach, Thierry; Tauriello, Gerardo; Bajpai, Namrata; Strauss, Sören; Weber, Alain; Kiss, Annamaria; Burian, Agata; Hofhuis, Hugo; Sapala, Aleksandra; Lipowczan, Marcin; Heimlicher, Maria B; Robinson, Sarah; Bayer, Emmanuelle M; Basler, Konrad; Koumoutsakos, Petros; Roeder, Adrienne H K; Aegerter-Wilmsen, Tinri; Nakayama, Naomi; Tsiantis, Miltos; Hay, Angela; Kwiatkowska, Dorota; Xenarios, Ioannis; Kuhlemeier, Cris; Smith, Richard S
2015-05-06
Morphogenesis emerges from complex multiscale interactions between genetic and mechanical processes. To understand these processes, the evolution of cell shape, proliferation and gene expression must be quantified. This quantification is usually performed either in full 3D, which is computationally expensive and technically challenging, or on 2D planar projections, which introduces geometrical artifacts on highly curved organs. Here we present MorphoGraphX ( www.MorphoGraphX.org), a software that bridges this gap by working directly with curved surface images extracted from 3D data. In addition to traditional 3D image analysis, we have developed algorithms to operate on curved surfaces, such as cell segmentation, lineage tracking and fluorescence signal quantification. The software's modular design makes it easy to include existing libraries, or to implement new algorithms. Cell geometries extracted with MorphoGraphX can be exported and used as templates for simulation models, providing a powerful platform to investigate the interactions between shape, genes and growth.
Tensor-driven extraction of developmental features from varying paediatric EEG datasets.
Kinney-Lang, Eli; Spyrou, Loukianos; Ebied, Ahmed; Chin, Richard Fm; Escudero, Javier
2018-05-21
Constant changes in developing children's brains can pose a challenge in EEG dependant technologies. Advancing signal processing methods to identify developmental differences in paediatric populations could help improve function and usability of such technologies. Taking advantage of the multi-dimensional structure of EEG data through tensor analysis may offer a framework for extracting relevant developmental features of paediatric datasets. A proof of concept is demonstrated through identifying latent developmental features in resting-state EEG. Approach. Three paediatric datasets (n = 50, 17, 44) were analyzed using a two-step constrained parallel factor (PARAFAC) tensor decomposition. Subject age was used as a proxy measure of development. Classification used support vector machines (SVM) to test if PARAFAC identified features could predict subject age. The results were cross-validated within each dataset. Classification analysis was complemented by visualization of the high-dimensional feature structures using t-distributed Stochastic Neighbour Embedding (t-SNE) maps. Main Results. Development-related features were successfully identified for the developmental conditions of each dataset. SVM classification showed the identified features could accurately predict subject at a significant level above chance for both healthy and impaired populations. t-SNE maps revealed suitable tensor factorization was key in extracting the developmental features. Significance. The described methods are a promising tool for identifying latent developmental features occurring throughout childhood EEG. © 2018 IOP Publishing Ltd.
Graph theory applied to the analysis of motor activity in patients with schizophrenia and depression
Fasmer, Erlend Eindride; Berle, Jan Øystein; Oedegaard, Ketil J.; Hauge, Erik R.
2018-01-01
Depression and schizophrenia are defined only by their clinical features, and diagnostic separation between them can be difficult. Disturbances in motor activity pattern are central features of both types of disorders. We introduce a new method to analyze time series, called the similarity graph algorithm. Time series of motor activity, obtained from actigraph registrations over 12 days in depressed and schizophrenic patients, were mapped into a graph and we then applied techniques from graph theory to characterize these time series, primarily looking for changes in complexity. The most marked finding was that depressed patients were found to be significantly different from both controls and schizophrenic patients, with evidence of less regularity of the time series, when analyzing the recordings with one hour intervals. These findings support the contention that there are important differences in control systems regulating motor behavior in patients with depression and schizophrenia. The similarity graph algorithm we have described can easily be applied to the study of other types of time series. PMID:29668743
Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold
2014-12-01
In order to deal with the sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximation policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings.
Fasmer, Erlend Eindride; Fasmer, Ole Bernt; Berle, Jan Øystein; Oedegaard, Ketil J; Hauge, Erik R
2018-01-01
Depression and schizophrenia are defined only by their clinical features, and diagnostic separation between them can be difficult. Disturbances in motor activity pattern are central features of both types of disorders. We introduce a new method to analyze time series, called the similarity graph algorithm. Time series of motor activity, obtained from actigraph registrations over 12 days in depressed and schizophrenic patients, were mapped into a graph and we then applied techniques from graph theory to characterize these time series, primarily looking for changes in complexity. The most marked finding was that depressed patients were found to be significantly different from both controls and schizophrenic patients, with evidence of less regularity of the time series, when analyzing the recordings with one hour intervals. These findings support the contention that there are important differences in control systems regulating motor behavior in patients with depression and schizophrenia. The similarity graph algorithm we have described can easily be applied to the study of other types of time series.
NASA Astrophysics Data System (ADS)
Wen, Hongwei; Liu, Yue; Wang, Jieqiong; Zhang, Jishui; Peng, Yun; He, Huiguang
2016-03-01
Tourette syndrome (TS) is a childhood-onset neurobehavioral disorder characterized by the presence of multiple motor and vocal tics. Tic generation has been linked to disturbed networks of brain areas involved in planning, controlling and execution of action. The aim of our work is to select topological characteristics of structural network which were most efficient for estimating the classification models to identify early TS children. Here we employed the diffusion tensor imaging (DTI) and deterministic tractography to construct the structural networks of 44 TS children and 48 age and gender matched healthy children. We calculated four different connection matrices (fiber number, mean FA, averaged fiber length weighted and binary matrices) and then applied graph theoretical methods to extract the regional nodal characteristics of structural network. For each weighted or binary network, nodal degree, nodal efficiency and nodal betweenness were selected as features. Support Vector Machine Recursive Feature Extraction (SVM-RFE) algorithm was used to estimate the best feature subset for classification. The accuracy of 88.26% evaluated by a nested cross validation was achieved on combing best feature subset of each network characteristic. The identified discriminative brain nodes mostly located in the basal ganglia and frontal cortico-cortical networks involved in TS children which was associated with tic severity. Our study holds promise for early identification and predicting prognosis of TS children.
Yu, Miao; Rhuma, Adel; Naqvi, Syed Mohsen; Wang, Liang; Chambers, Jonathon
2012-11-01
We propose a novel computer vision based fall detection system for monitoring an elderly person in a home care application. Background subtraction is applied to extract the foreground human body and the result is improved by using certain post-processing. Information from ellipse fitting and a projection histogram along the axes of the ellipse are used as the features for distinguishing different postures of the human. These features are then fed into a directed acyclic graph support vector machine (DAGSVM) for posture classification, the result of which is then combined with derived floor information to detect a fall. From a dataset of 15 people, we show that our fall detection system can achieve a high fall detection rate (97.08%) and a very low false detection rate (0.8%) in a simulated home environment.
Ontology- and graph-based similarity assessment in biological networks.
Wang, Haiying; Zheng, Huiru; Azuaje, Francisco
2010-10-15
A standard systems-based approach to biomarker and drug target discovery consists of placing putative biomarkers in the context of a network of biological interactions, followed by different 'guilt-by-association' analyses. The latter is typically done based on network structural features. Here, an alternative analysis approach in which the networks are analyzed on a 'semantic similarity' space is reported. Such information is extracted from ontology-based functional annotations. We present SimTrek, a Cytoscape plugin for ontology-based similarity assessment in biological networks. http://rosalind.infj.ulst.ac.uk/SimTrek.html francisco.azuaje@crp-sante.lu Supplementary data are available at Bioinformatics online.
Olayan, Rawan S; Ashoor, Haitham; Bajic, Vladimir B
2018-04-01
Finding computationally drug-target interactions (DTIs) is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, the current DTI prediction methods suffer the high false positive prediction rate. We developed DDR, a novel method that improves the DTI prediction accuracy. DDR is based on the use of a heterogeneous graph that contains known DTIs with multiple similarities between drugs and multiple similarities between target proteins. DDR applies non-linear similarity fusion method to combine different similarities. Before fusion, DDR performs a pre-processing step where a subset of similarities is selected in a heuristic process to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using 5-repeats of 10-fold cross-validation, three testing setups, and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best start-of-the-art method for predicting DTIs by 34% when the drugs are new, by 23% when targets are new and by 34% when the drugs and the targets are known but not all DTIs between them are not known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs. The data and code are provided at https://bitbucket.org/RSO24/ddr/. vladimir.bajic@kaust.edu.sa. Supplementary data are available at Bioinformatics online.
Visualization of Documents and Concepts in Neuroinformatics with the 3D-SE Viewer
Naud, Antoine; Usui, Shiro; Ueda, Naonori; Taniguchi, Tatsuki
2007-01-01
A new interactive visualization tool is proposed for mining text data from various fields of neuroscience. Applications to several text datasets are presented to demonstrate the capability of the proposed interactive tool to visualize complex relationships between pairs of lexical entities (with some semantic contents) such as terms, keywords, posters, or papers' abstracts. Implemented as a Java applet, this tool is based on the spherical embedding (SE) algorithm, which was designed for the visualization of bipartite graphs. Items such as words and documents are linked on the basis of occurrence relationships, which can be represented in a bipartite graph. These items are visualized by embedding the vertices of the bipartite graph on spheres in a three-dimensional (3-D) space. The main advantage of the proposed visualization tool is that 3-D layouts can convey more information than planar or linear displays of items or graphs. Different kinds of information extracted from texts, such as keywords, indexing terms, or topics are visualized, allowing interactive browsing of various fields of research featured by keywords, topics, or research teams. A typical use of the 3D-SE viewer is quick browsing of topics displayed on a sphere, then selecting one or several item(s) displays links to related terms on another sphere representing, e.g., documents or abstracts, and provides direct online access to the document source in a database, such as the Visiome Platform or the SfN Annual Meeting. Developed as a Java applet, it operates as a tool on top of existing resources. PMID:18974802
Visualization of Documents and Concepts in Neuroinformatics with the 3D-SE Viewer.
Naud, Antoine; Usui, Shiro; Ueda, Naonori; Taniguchi, Tatsuki
2007-01-01
A new interactive visualization tool is proposed for mining text data from various fields of neuroscience. Applications to several text datasets are presented to demonstrate the capability of the proposed interactive tool to visualize complex relationships between pairs of lexical entities (with some semantic contents) such as terms, keywords, posters, or papers' abstracts. Implemented as a Java applet, this tool is based on the spherical embedding (SE) algorithm, which was designed for the visualization of bipartite graphs. Items such as words and documents are linked on the basis of occurrence relationships, which can be represented in a bipartite graph. These items are visualized by embedding the vertices of the bipartite graph on spheres in a three-dimensional (3-D) space. The main advantage of the proposed visualization tool is that 3-D layouts can convey more information than planar or linear displays of items or graphs. Different kinds of information extracted from texts, such as keywords, indexing terms, or topics are visualized, allowing interactive browsing of various fields of research featured by keywords, topics, or research teams. A typical use of the 3D-SE viewer is quick browsing of topics displayed on a sphere, then selecting one or several item(s) displays links to related terms on another sphere representing, e.g., documents or abstracts, and provides direct online access to the document source in a database, such as the Visiome Platform or the SfN Annual Meeting. Developed as a Java applet, it operates as a tool on top of existing resources.
Lee, Hansang; Hong, Helen; Kim, Junmo
2014-12-01
We propose a graph-cut-based segmentation method for the anterior cruciate ligament (ACL) in knee MRI with a novel shape prior and label refinement. As the initial seeds for graph cuts, candidates for the ACL and the background are extracted from knee MRI roughly by means of adaptive thresholding with Gaussian mixture model fitting. The extracted ACL candidate is segmented iteratively by graph cuts with patient-specific shape constraints. Two shape constraints termed fence and neighbor costs are suggested such that the graph cuts prevent any leakage into adjacent regions with similar intensity. The segmented ACL label is refined by means of superpixel classification. Superpixel classification makes the segmented label propagate into missing inhomogeneous regions inside the ACL. In the experiments, the proposed method segmented the ACL with Dice similarity coefficient of 66.47±7.97%, average surface distance of 2.247±0.869, and root mean squared error of 3.538±1.633, which increased the accuracy by 14.8%, 40.3%, and 37.6% from the Boykov model, respectively. Copyright © 2014 Elsevier Ltd. All rights reserved.
Information extraction and knowledge graph construction from geoscience literature
NASA Astrophysics Data System (ADS)
Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen
2018-03-01
Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
Graph pyramids for protein function prediction
2015-01-01
Background Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Methods Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Results Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data. PMID:26044522
Graph pyramids for protein function prediction.
Sandhan, Tushar; Yoo, Youngjun; Choi, Jin; Kim, Sun
2015-01-01
Uncovering the hidden organizational characteristics and regularities among biological sequences is the key issue for detailed understanding of an underlying biological phenomenon. Thus pattern recognition from nucleic acid sequences is an important affair for protein function prediction. As proteins from the same family exhibit similar characteristics, homology based approaches predict protein functions via protein classification. But conventional classification approaches mostly rely on the global features by considering only strong protein similarity matches. This leads to significant loss of prediction accuracy. Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers the local as well as the global features, by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by operating the proposed graph based features at various pyramid levels. Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using graph pyramid helps to improve computational efficiency as well the protein classification accuracy. Quantitatively, among 14,086 test sequences, on an average the proposed method misclassified only 21.1 sequences whereas baseline BLAST score based global feature matching method misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. Thus it has achieved more than 96% protein classification accuracy using only 20% per class training data.
A flower image retrieval method based on ROI feature.
Hong, An-Xiang; Chen, Gang; Li, Jun-Li; Chi, Zhe-Ru; Zhang, Dan
2004-07-01
Flower image retrieval is a very important step for computer-aided plant species recognition. In this paper, we propose an efficient segmentation method based on color clustering and domain knowledge to extract flower regions from flower images. For flower retrieval, we use the color histogram of a flower region to characterize the color features of flower and two shape-based features sets, Centroid-Contour Distance (CCD) and Angle Code Histogram (ACH), to characterize the shape features of a flower contour. Experimental results showed that our flower region extraction method based on color clustering and domain knowledge can produce accurate flower regions. Flower retrieval results on a database of 885 flower images collected from 14 plant species showed that our Region-of-Interest (ROI) based retrieval approach using both color and shape features can perform better than a method based on the global color histogram proposed by Swain and Ballard (1991) and a method based on domain knowledge-driven segmentation and color names proposed by Das et al.(1999).
Phillips, Carolyn L.; Guo, Hanqi; Peterka, Tom; ...
2016-02-19
In type-II superconductors, the dynamics of magnetic flux vortices determine their transport properties. In the Ginzburg-Landau theory, vortices correspond to topological defects in the complex order parameter field. Earlier, we introduced a method for extracting vortices from the discretized complex order parameter field generated by a large-scale simulation of vortex matter. With this method, at a fixed time step, each vortex [simplistically, a one-dimensional (1D) curve in 3D space] can be represented as a connected graph extracted from the discretized field. Here we extend this method as a function of time as well. A vortex now corresponds to a 2Dmore » space-time sheet embedded in 4D space time that can be represented as a connected graph extracted from the discretized field over both space and time. Vortices that interact by merging or splitting correspond to disappearance and appearance of holes in the connected graph in the time direction. This method of tracking vortices, which makes no assumptions about the scale or behavior of the vortices, can track the vortices with a resolution as good as the discretization of the temporally evolving complex scalar field. In addition, even details of the trajectory between time steps can be reconstructed from the connected graph. With this form of vortex tracking, the details of vortex dynamics in a model of a superconducting materials can be understood in greater detail than previously possible.« less
NASA Astrophysics Data System (ADS)
Rao, Prahalad Krishna
This research proposes approaches for monitoring and inspection of surface morphology with respect to two ultraprecision/nanomanufacturing processes, namely, ultraprecision machining (UPM) and chemical mechanical planarization (CMP). The methods illustrated in this dissertation are motivated from the compelling need for in situ process monitoring in nanomanufacturing and invoke concepts from diverse scientific backgrounds, such as artificial neural networks, Bayesian learning, and algebraic graph theory. From an engineering perspective, this work has the following contributions: 1. A combined neural network and Bayesian learning approach for early detection of UPM process anomalies by integrating data from multiple heterogeneous in situ sensors (force, vibration, and acoustic emission) is developed. The approach captures process drifts in UPM of aluminum 6061 discs within 15 milliseconds of their inception and is therefore valuable for minimizing yield losses. 2. CMP process dynamics are mathematically represented using a deterministic multi-scale hierarchical nonlinear differential equation model. This process-machine inter-action (PMI) model is evocative of the various physio-mechanical aspects in CMP and closely emulates experimentally acquired vibration signal patterns, including complex nonlinear dynamics manifest in the process. By combining the PMI model predictions with features gathered from wirelessly acquired CMP vibration signal patterns, CMP process anomalies, such as pad wear, and drifts in polishing were identified in their nascent stage with high fidelity (R2 ~ 75%). 3. An algebraic graph theoretic approach for quantifying nano-surface morphology from optical micrograph images is developed. The approach enables a parsimonious representation of the topological relationships between heterogeneous nano-surface fea-tures, which are enshrined in graph theoretic entities, namely, the similarity, degree, and Laplacian matrices. Topological invariant measures (e.g., Fiedler number, Kirchoff index) extracted from these matrices are shown to be sensitive to evolving nano-surface morphology. For instance, we observed that prominent nanoscale morphological changes on CMP processed Cu wafers, although discernible visually, could not be tractably quantified using statistical metrology parameters, such as arithmetic average roughness (Sa), root mean square roughness (Sq), etc. In contrast, CMP induced nanoscale surface variations were captured on invoking graph theoretic topological invariants. Consequently, the graph theoretic approach can enable timely, non-contact, and in situ metrology of semiconductor wafers by obviating the need for reticent profile mapping techniques (e.g., AFM, SEM, etc.), and thereby prevent the propagation of yield losses over long production runs.
Composing Data Parallel Code for a SPARQL Graph Engine
DOE Office of Scientific and Technical Information (OSTI.GOV)
Castellana, Vito G.; Tumeo, Antonino; Villa, Oreste
Big data analytics process large amount of data to extract knowledge from them. Semantic databases are big data applications that adopt the Resource Description Framework (RDF) to structure metadata through a graph-based representation. The graph based representation provides several benefits, such as the possibility to perform in memory processing with large amounts of parallelism. SPARQL is a language used to perform queries on RDF-structured data through graph matching. In this paper we present a tool that automatically translates SPARQL queries to parallel graph crawling and graph matching operations. The tool also supports complex SPARQL constructs, which requires more than basicmore » graph matching for their implementation. The tool generates parallel code annotated with OpenMP pragmas for x86 Shared-memory Multiprocessors (SMPs). With respect to commercial database systems such as Virtuoso, our approach reduces memory occupation due to join operations and provides higher performance. We show the scaling of the automatically generated graph-matching code on a 48-core SMP.« less
Art Expertise Reduces Influence of Visual Salience on Fixation in Viewing Abstract-Paintings
Koide, Naoko; Kubo, Takatomi; Nishida, Satoshi; Shibata, Tomohiro; Ikeda, Kazushi
2015-01-01
When viewing a painting, artists perceive more information from the painting on the basis of their experience and knowledge than art novices do. This difference can be reflected in eye scan paths during viewing of paintings. Distributions of scan paths of artists are different from those of novices even when the paintings contain no figurative object (i.e. abstract paintings). There are two possible explanations for this difference of scan paths. One is that artists have high sensitivity to high-level features such as textures and composition of colors and therefore their fixations are more driven by such features compared with novices. The other is that fixations of artists are more attracted by salient features than those of novices and the fixations are driven by low-level features. To test these, we measured eye fixations of artists and novices during the free viewing of various abstract paintings and compared the distribution of their fixations for each painting with a topological attentional map that quantifies the conspicuity of low-level features in the painting (i.e. saliency map). We found that the fixation distribution of artists was more distinguishable from the saliency map than that of novices. This difference indicates that fixations of artists are less driven by low-level features than those of novices. Our result suggests that artists may extract visual information from paintings based on high-level features. This ability of artists may be associated with artists’ deep aesthetic appreciation of paintings. PMID:25658327
Text categorization of biomedical data sets using graph kernels and a controlled vocabulary.
Bleik, Said; Mishra, Meenakshi; Huan, Jun; Song, Min
2013-01-01
Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.
Matching of renewable source of energy generation graphs and electrical load in local energy system
NASA Astrophysics Data System (ADS)
Lezhniuk, Petro; Komar, Vyacheslav; Sobchuk, Dmytro; Kravchuk, Sergiy; Kacejko, Piotr; Zavidsky, Vladislav
2017-08-01
The paper contains the method of matching generation graph of photovoltaic electric stations and consumers. Characteristic feature of this method is the application of morphometric analysis for assessment of non-uniformity of the integrated graph of energy supply, optimal coefficients of current distribution, that enables by mean of refining the powers, transferring in accordance with the graph , to provide the decrease of electric energy losses in the grid and transport task, as the optimization tool.
2015-09-21
this framework, MIT LL carried out a one-year proof- of-concept study to determine the capabilities and challenges in the detection of anomalies in...extremely large graphs [5]. Under this effort, two real datasets were considered, and algorithms for data modeling and anomaly detection were developed...is required in a well-defined experimental framework for the detection of anomalies in very large graphs. This study is intended to inform future
A graph-based approach to detect spatiotemporal dynamics in satellite image time series
NASA Astrophysics Data System (ADS)
Guttler, Fabio; Ienco, Dino; Nin, Jordi; Teisseire, Maguelonne; Poncelet, Pascal
2017-08-01
Enhancing the frequency of satellite acquisitions represents a key issue for Earth Observation community nowadays. Repeated observations are crucial for monitoring purposes, particularly when intra-annual process should be taken into account. Time series of images constitute a valuable source of information in these cases. The goal of this paper is to propose a new methodological framework to automatically detect and extract spatiotemporal information from satellite image time series (SITS). Existing methods dealing with such kind of data are usually classification-oriented and cannot provide information about evolutions and temporal behaviors. In this paper we propose a graph-based strategy that combines object-based image analysis (OBIA) with data mining techniques. Image objects computed at each individual timestamp are connected across the time series and generates a set of evolution graphs. Each evolution graph is associated to a particular area within the study site and stores information about its temporal evolution. Such information can be deeply explored at the evolution graph scale or used to compare the graphs and supply a general picture at the study site scale. We validated our framework on two study sites located in the South of France and involving different types of natural, semi-natural and agricultural areas. The results obtained from a Landsat SITS support the quality of the methodological approach and illustrate how the framework can be employed to extract and characterize spatiotemporal dynamics.
Monocular precrash vehicle detection: features and classifiers.
Sun, Zehang; Bebis, George; Miller, Ronald
2006-07-01
Robust and reliable vehicle detection from images acquired by a moving vehicle (i.e., on-road vehicle detection) is an important problem with applications to driver assistance systems and autonomous, self-guided vehicles. The focus of this work is on the issues of feature extraction and classification for rear-view vehicle detection. Specifically, by treating the problem of vehicle detection as a two-class classification problem, we have investigated several different feature extraction methods such as principal component analysis, wavelets, and Gabor filters. To evaluate the extracted features, we have experimented with two popular classifiers, neural networks and support vector machines (SVMs). Based on our evaluation results, we have developed an on-board real-time monocular vehicle detection system that is capable of acquiring grey-scale images, using Ford's proprietary low-light camera, achieving an average detection rate of 10 Hz. Our vehicle detection algorithm consists of two main steps: a multiscale driven hypothesis generation step and an appearance-based hypothesis verification step. During the hypothesis generation step, image locations where vehicles might be present are extracted. This step uses multiscale techniques not only to speed up detection, but also to improve system robustness. The appearance-based hypothesis verification step verifies the hypotheses using Gabor features and SVMs. The system has been tested in Ford's concept vehicle under different traffic conditions (e.g., structured highway, complex urban streets, and varying weather conditions), illustrating good performance.
NASA Astrophysics Data System (ADS)
Hu, Y.; Quinn, C.; Cai, X.
2015-12-01
One major challenge of agent-based modeling is to derive agents' behavioral rules due to behavioral uncertainty and data scarcity. This study proposes a new approach to combine a data-driven modeling based on the directed information (i.e., machine intelligence) with expert domain knowledge (i.e., human intelligence) to derive the behavioral rules of agents considering behavioral uncertainty. A directed information graph algorithm is applied to identifying the causal relationships between agents' decisions (i.e., groundwater irrigation depth) and time-series of environmental, socio-economical and institutional factors. A case study is conducted for the High Plains aquifer hydrological observatory (HO) area, U.S. Preliminary results show that four factors, corn price (CP), underlying groundwater level (GWL), monthly mean temperature (T) and precipitation (P) have causal influences on agents' decisions on groundwater irrigation depth (GWID) to various extents. Based on the similarity of the directed information graph for each agent, five clusters of graphs are further identified to represent all the agents' behaviors in the study area as shown in Figure 1. Using these five representative graphs, agents' monthly optimal groundwater pumping rates are derived through the probabilistic inference. Such data-driven relationships and probabilistic quantifications are then coupled with a physically-based groundwater model to investigate the interactions between agents' pumping behaviors and the underlying groundwater system in the context of coupled human and natural systems.
Multiple Illuminant Colour Estimation via Statistical Inference on Factor Graphs.
Mutimbu, Lawrence; Robles-Kelly, Antonio
2016-08-31
This paper presents a method to recover a spatially varying illuminant colour estimate from scenes lit by multiple light sources. Starting with the image formation process, we formulate the illuminant recovery problem in a statistically datadriven setting. To do this, we use a factor graph defined across the scale space of the input image. In the graph, we utilise a set of illuminant prototypes computed using a data driven approach. As a result, our method delivers a pixelwise illuminant colour estimate being devoid of libraries or user input. The use of a factor graph also allows for the illuminant estimates to be recovered making use of a maximum a posteriori (MAP) inference process. Moreover, we compute the probability marginals by performing a Delaunay triangulation on our factor graph. We illustrate the utility of our method for pixelwise illuminant colour recovery on widely available datasets and compare against a number of alternatives. We also show sample colour correction results on real-world images.
Topology for efficient information dissemination in ad-hoc networking
NASA Technical Reports Server (NTRS)
Jennings, E.; Okino, C. M.
2002-01-01
In this paper, we explore the information dissemination problem in ad-hoc wirless networks. First, we analyze the probability of successful broadcast, assuming: the nodes are uniformly distributed, the available area has a lower bould relative to the total number of nodes, and there is zero knowledge of the overall topology of the network. By showing that the probability of such events is small, we are motivated to extract good graph topologies to minimize the overall transmissions. Three algorithms are used to generate topologies of the network with guaranteed connectivity. These are the minimum radius graph, the relative neighborhood graph and the minimum spanning tree. Our simulation shows that the relative neighborhood graph has certain good graph properties, which makes it suitable for efficient information dissemination.
Neuromuscular disease classification system
NASA Astrophysics Data System (ADS)
Sáez, Aurora; Acha, Begoña; Montero-Sánchez, Adoración; Rivas, Eloy; Escudero, Luis M.; Serrano, Carmen
2013-06-01
Diagnosis of neuromuscular diseases is based on subjective visual assessment of biopsies from patients by the pathologist specialist. A system for objective analysis and classification of muscular dystrophies and neurogenic atrophies through muscle biopsy images of fluorescence microscopy is presented. The procedure starts with an accurate segmentation of the muscle fibers using mathematical morphology and a watershed transform. A feature extraction step is carried out in two parts: 24 features that pathologists take into account to diagnose the diseases and 58 structural features that the human eye cannot see, based on the assumption that the biopsy is considered as a graph, where the nodes are represented by each fiber, and two nodes are connected if two fibers are adjacent. A feature selection using sequential forward selection and sequential backward selection methods, a classification using a Fuzzy ARTMAP neural network, and a study of grading the severity are performed on these two sets of features. A database consisting of 91 images was used: 71 images for the training step and 20 as the test. A classification error of 0% was obtained. It is concluded that the addition of features undetectable by the human visual inspection improves the categorization of atrophic patterns.
Klijn, Marieke E; Hubbuch, Jürgen
2018-04-27
Protein phase diagrams are a tool to investigate cause and consequence of solution conditions on protein phase behavior. The effects are scored according to aggregation morphologies such as crystals or amorphous precipitates. Solution conditions affect morphological features, such as crystal size, as well as kinetic features, such as crystal growth time. Common used data visualization techniques include individual line graphs or symbols-based phase diagrams. These techniques have limitations in terms of handling large datasets, comprehensiveness or completeness. To eliminate these limitations, morphological and kinetic features obtained from crystallization images generated with high throughput microbatch experiments have been visualized with radar charts in combination with the empirical phase diagram (EPD) method. Morphological features (crystal size, shape, and number, as well as precipitate size) and kinetic features (crystal and precipitate onset and growth time) are extracted for 768 solutions with varying chicken egg white lysozyme concentration, salt type, ionic strength and pH. Image-based aggregation morphology and kinetic features were compiled into a single and easily interpretable figure, thereby showing that the EPD method can support high throughput crystallization experiments in its data amount as well as its data complexity. Copyright © 2018. Published by Elsevier Inc.
Data-Driven Packet Loss Estimation for Node Healthy Sensing in Decentralized Cluster
Fan, Hangyu; Wang, Huandong; Li, Yong
2018-01-01
Decentralized clustering of modern information technology is widely adopted in various fields these years. One of the main reason is the features of high availability and the failure-tolerance which can prevent the entire system form broking down by a failure of a single point. Recently, toolkits such as Akka are used by the public commonly to easily build such kind of cluster. However, clusters of such kind that use Gossip as their membership managing protocol and use link failure detecting mechanism to detect link failures cannot deal with the scenario that a node stochastically drops packets and corrupts the member status of the cluster. In this paper, we formulate the problem to be evaluating the link quality and finding a max clique (NP-Complete) in the connectivity graph. We then proposed an algorithm that consists of two models driven by data from application layer to respectively solving these two problems. Through simulations with statistical data and a real-world product, we demonstrate that our algorithm has a good performance. PMID:29360792
Poor textural image tie point matching via graph theory
NASA Astrophysics Data System (ADS)
Yuan, Xiuxiao; Chen, Shiyu; Yuan, Wei; Cai, Yang
2017-07-01
Feature matching aims to find corresponding points to serve as tie points between images. Robust matching is still a challenging task when input images are characterized by low contrast or contain repetitive patterns, occlusions, or homogeneous textures. In this paper, a novel feature matching algorithm based on graph theory is proposed. This algorithm integrates both geometric and radiometric constraints into an edge-weighted (EW) affinity tensor. Tie points are then obtained by high-order graph matching. Four pairs of poor textural images covering forests, deserts, bare lands, and urban areas are tested. For comparison, three state-of-the-art matching techniques, namely, scale-invariant feature transform (SIFT), speeded up robust features (SURF), and features from accelerated segment test (FAST), are also used. The experimental results show that the matching recall obtained by SIFT, SURF, and FAST varies from 0 to 35% in different types of poor textures. However, through the integration of both geometry and radiometry and the EW strategy, the recall obtained by the proposed algorithm is better than 50% in all four image pairs. The better matching recall improves the number of correct matches, dispersion, and positional accuracy.
TEES 2.2: Biomedical Event Extraction for Diverse Corpora
2015-01-01
Background The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. Results The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. Conclusions The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented. PMID:26551925
TEES 2.2: Biomedical Event Extraction for Diverse Corpora.
Björne, Jari; Salakoski, Tapio
2015-01-01
The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, bringing a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction are presented.
Arc_Mat: a Matlab-based spatial data analysis toolbox
NASA Astrophysics Data System (ADS)
Liu, Xingjian; Lesage, James
2010-03-01
This article presents an overview of Arc_Mat, a Matlab-based spatial data analysis software package whose source code has been placed in the public domain. An earlier version of the Arc_Mat toolbox was developed to extract map polygon and database information from ESRI shapefiles and provide high quality mapping in the Matlab software environment. We discuss revisions to the toolbox that: utilize enhanced computing and graphing capabilities of more recent versions of Matlab, restructure the toolbox with object-oriented programming features, and provide more comprehensive functions for spatial data analysis. The Arc_Mat toolbox functionality includes basic choropleth mapping; exploratory spatial data analysis that provides exploratory views of spatial data through various graphs, for example, histogram, Moran scatterplot, three-dimensional scatterplot, density distribution plot, and parallel coordinate plots; and more formal spatial data modeling that draws on the extensive Spatial Econometrics Toolbox functions. A brief review of the design aspects of the revised Arc_Mat is described, and we provide some illustrative examples that highlight representative uses of the toolbox. Finally, we discuss programming with and customizing the Arc_Mat toolbox functionalities.
3D object retrieval using salient views
Shapiro, Linda G.
2013-01-01
This paper presents a method for selecting salient 2D views to describe 3D objects for the purpose of retrieval. The views are obtained by first identifying salient points via a learning approach that uses shape characteristics of the 3D points (Atmosukarto and Shapiro in International workshop on structural, syntactic, and statistical pattern recognition, 2008; Atmosukarto and Shapiro in ACM multimedia information retrieval, 2008). The salient views are selected by choosing views with multiple salient points on the silhouette of the object. Silhouette-based similarity measures from Chen et al. (Comput Graph Forum 22(3):223–232, 2003) are then used to calculate the similarity between two 3D objects. Retrieval experiments were performed on three datasets: the Heads dataset, the SHREC2008 dataset, and the Princeton dataset. Experimental results show that the retrieval results using the salient views are comparable to the existing light field descriptor method (Chen et al. in Comput Graph Forum 22(3):223–232, 2003), and our method achieves a 15-fold speedup in the feature extraction computation time. PMID:23833704
Batalle, Dafnis; Eixarch, Elisenda; Figueras, Francesc; Muñoz-Moreno, Emma; Bargallo, Nuria; Illa, Miriam; Acosta-Rojas, Ruthy; Amat-Roldan, Ivan; Gratacos, Eduard
2012-04-02
Intrauterine growth restriction (IUGR) due to placental insufficiency affects 5-10% of all pregnancies and it is associated with a wide range of short- and long-term neurodevelopmental disorders. Prediction of neurodevelopmental outcomes in IUGR is among the clinical challenges of modern fetal medicine and pediatrics. In recent years several studies have used magnetic resonance imaging (MRI) to demonstrate differences in brain structure in IUGR subjects, but the ability to use MRI for individual predictive purposes in IUGR is limited. Recent research suggests that MRI in vivo access to brain connectivity might have the potential to help understanding cognitive and neurodevelopment processes. Specifically, MRI based connectomics is an emerging approach to extract information from MRI data that exhaustively maps inter-regional connectivity within the brain to build a graph model of its neural circuitry known as brain network. In the present study we used diffusion MRI based connectomics to obtain structural brain networks of a prospective cohort of one year old infants (32 controls and 24 IUGR) and analyze the existence of quantifiable brain reorganization of white matter circuitry in IUGR group by means of global and regional graph theory features of brain networks. Based on global and regional analyses of the brain network topology we demonstrated brain reorganization in IUGR infants at one year of age. Specifically, IUGR infants presented decreased global and local weighted efficiency, and a pattern of altered regional graph theory features. By means of binomial logistic regression, we also demonstrated that connectivity measures were associated with abnormal performance in later neurodevelopmental outcome as measured by Bayley Scale for Infant and Toddler Development, Third edition (BSID-III) at two years of age. These findings show the potential of diffusion MRI based connectomics and graph theory based network characteristics for estimating differences in the architecture of neural circuitry and developing imaging biomarkers of poor neurodevelopment outcome in infants with prenatal diseases. Copyright © 2012 Elsevier Inc. All rights reserved.
Qualitative and Quantitative Pedigree Analysis: Graph Theory, Computer Software, and Case Studies.
ERIC Educational Resources Information Center
Jungck, John R.; Soderberg, Patti
1995-01-01
Presents a series of elementary mathematical tools for re-representing pedigrees, pedigree generators, pedigree-driven database management systems, and case studies for exploring genetic relationships. (MKR)
Ding, Xing; He, Yu; Duan, Z-C; Gregersen, Niels; Chen, M-C; Unsleber, S; Maier, S; Schneider, Christian; Kamp, Martin; Höfling, Sven; Lu, Chao-Yang; Pan, Jian-Wei
2016-01-15
Scalable photonic quantum technologies require on-demand single-photon sources with simultaneously high levels of purity, indistinguishability, and efficiency. These key features, however, have only been demonstrated separately in previous experiments. Here, by s-shell pulsed resonant excitation of a Purcell-enhanced quantum dot-micropillar system, we deterministically generate resonance fluorescence single photons which, at π pulse excitation, have an extraction efficiency of 66%, single-photon purity of 99.1%, and photon indistinguishability of 98.5%. Such a single-photon source for the first time combines the features of high efficiency and near-perfect levels of purity and indistinguishabilty, and thus opens the way to multiphoton experiments with semiconductor quantum dots.
Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track
2015-11-20
Mining Tasks from the Web Anchor Text Graph: MSR Notebook Paper for the TREC 2015 Tasks Track Paul N. Bennett Microsoft Research Redmond, USA pauben...anchor text graph has proven useful in the general realm of query reformulation [2], we sought to quantify the value of extracting key phrases from...anchor text in the broader setting of the task understanding track. Given a query, our approach considers a simple method for identifying a relevant
A data-driven dynamics simulation framework for railway vehicles
NASA Astrophysics Data System (ADS)
Nie, Yinyu; Tang, Zhao; Liu, Fengjia; Chang, Jian; Zhang, Jianjun
2018-03-01
The finite element (FE) method is essential for simulating vehicle dynamics with fine details, especially for train crash simulations. However, factors such as the complexity of meshes and the distortion involved in a large deformation would undermine its calculation efficiency. An alternative method, the multi-body (MB) dynamics simulation provides satisfying time efficiency but limited accuracy when highly nonlinear dynamic process is involved. To maintain the advantages of both methods, this paper proposes a data-driven simulation framework for dynamics simulation of railway vehicles. This framework uses machine learning techniques to extract nonlinear features from training data generated by FE simulations so that specific mesh structures can be formulated by a surrogate element (or surrogate elements) to replace the original mechanical elements, and the dynamics simulation can be implemented by co-simulation with the surrogate element(s) embedded into a MB model. This framework consists of a series of techniques including data collection, feature extraction, training data sampling, surrogate element building, and model evaluation and selection. To verify the feasibility of this framework, we present two case studies, a vertical dynamics simulation and a longitudinal dynamics simulation, based on co-simulation with MATLAB/Simulink and Simpack, and a further comparison with a popular data-driven model (the Kriging model) is provided. The simulation result shows that using the legendre polynomial regression model in building surrogate elements can largely cut down the simulation time without sacrifice in accuracy.
Assembly planning based on subassembly extraction
NASA Technical Reports Server (NTRS)
Lee, Sukhan; Shin, Yeong Gil
1990-01-01
A method is presented for the automatic determination of assembly partial orders from a liaison graph representation of an assembly through the extraction of preferred subassemblies. In particular, the authors show how to select a set of tentative subassemblies by decomposing a liaison graph into a set of subgraphs based on feasibility and difficulty of disassembly, how to evaluate each of the tentative subassemblies in terms of assembly cost using the subassembly selection indices, and how to construct a hierarchical partial order graph (HPOG) as an assembly plan. The method provides an approach to assembly planning by identifying spatial parallelism in assembly as a means of constructing temporal relationships among assembly operations and solves the problem of finding a cost-effective assembly plan in a flexible environment. A case study of the assembly planning of a mechanical assembly is presented.
Structural health monitoring feature design by genetic programming
NASA Astrophysics Data System (ADS)
Harvey, Dustin Y.; Todd, Michael D.
2014-09-01
Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and other high-capital or life-safety critical structures. Conventional data processing involves pre-processing and extraction of low-dimensional features from in situ time series measurements. The features are then input to a statistical pattern recognition algorithm to perform the relevant classification or regression task necessary to facilitate decisions by the SHM system. Traditional design of signal processing and feature extraction algorithms can be an expensive and time-consuming process requiring extensive system knowledge and domain expertise. Genetic programming, a heuristic program search method from evolutionary computation, was recently adapted by the authors to perform automated, data-driven design of signal processing and feature extraction algorithms for statistical pattern recognition applications. The proposed method, called Autofead, is particularly suitable to handle the challenges inherent in algorithm design for SHM problems where the manifestation of damage in structural response measurements is often unclear or unknown. Autofead mines a training database of response measurements to discover information-rich features specific to the problem at hand. This study provides experimental validation on three SHM applications including ultrasonic damage detection, bearing damage classification for rotating machinery, and vibration-based structural health monitoring. Performance comparisons with common feature choices for each problem area are provided demonstrating the versatility of Autofead to produce significant algorithm improvements on a wide range of problems.
Exploring Text and Icon Graph Interpretation in Students with Dyslexia: An Eye-tracking Study.
Kim, Sunjung; Wiseheart, Rebecca
2017-02-01
A growing body of research suggests that individuals with dyslexia struggle to use graphs efficiently. Given the persistence of orthographic processing deficits in dyslexia, this study tested whether graph interpretation deficits in dyslexia are directly related to difficulties processing the orthographic components of graphs (i.e. axes and legend labels). Participants were 80 college students with and without dyslexia. Response times and eye movements were recorded as students answered comprehension questions about simple data displayed in bar graphs. Axes and legends were labelled either with words (mixed-modality graphs) or icons (orthography-free graphs). Students also answered informationally equivalent questions presented in sentences (orthography-only condition). Response times were slower in the dyslexic group only for processing sentences. However, eye tracking data revealed group differences for processing mixed-modality graphs, whereas no group differences were found for the orthography-free graphs. When processing bar graphs, students with dyslexia differ from their able reading peers only when graphs contain orthographic features. Implications for processing informational text are discussed. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.
Process and representation in graphical displays
NASA Technical Reports Server (NTRS)
Gillan, Douglas J.; Lewis, Robert; Rudisill, Marianne
1993-01-01
Our initial model of graphic comprehension has focused on statistical graphs. Like other models of human-computer interaction, models of graphical comprehension can be used by human-computer interface designers and developers to create interfaces that present information in an efficient and usable manner. Our investigation of graph comprehension addresses two primary questions: how do people represent the information contained in a data graph?; and how do they process information from the graph? The topics of focus for graphic representation concern the features into which people decompose a graph and the representations of the graph in memory. The issue of processing can be further analyzed as two questions: what overall processing strategies do people use?; and what are the specific processing skills required?
NASA Astrophysics Data System (ADS)
Sharma, Harshita; Zerbe, Norman; Heim, Daniel; Wienert, Stephan; Lohmann, Sebastian; Hellwich, Olaf; Hufnagl, Peter
2016-03-01
This paper describes a novel graph-based method for efficient representation and subsequent classification in histological whole slide images of gastric cancer. Her2/neu immunohistochemically stained and haematoxylin and eosin stained histological sections of gastric carcinoma are digitized. Immunohistochemical staining is used in practice by pathologists to determine extent of malignancy, however, it is laborious to visually discriminate the corresponding malignancy levels in the more commonly used haematoxylin and eosin stain, and this study attempts to solve this problem using a computer-based method. Cell nuclei are first isolated at high magnification using an automatic cell nuclei segmentation strategy, followed by construction of cell nuclei attributed relational graphs of the tissue regions. These graphs represent tissue architecture comprehensively, as they contain information about cell nuclei morphology as vertex attributes, along with knowledge of neighborhood in the form of edge linking and edge attributes. Global graph characteristics are derived and ensemble learning is used to discriminate between three types of malignancy levels, namely, non-tumor, Her2/neu positive tumor and Her2/neu negative tumor. Performance is compared with state of the art methods including four texture feature groups (Haralick, Gabor, Local Binary Patterns and Varma Zisserman features), color and intensity features, and Voronoi diagram and Delaunay triangulation. Texture, color and intensity information is also combined with graph-based knowledge, followed by correlation analysis. Quantitative assessment is performed using two cross validation strategies. On investigating the experimental results, it can be concluded that the proposed method provides a promising way for computer-based analysis of histopathological images of gastric cancer.
An Automated Method to Identify Mesoscale Convective Complexes (MCCs) Implementing Graph Theory
NASA Astrophysics Data System (ADS)
Whitehall, K. D.; Mattmann, C. A.; Jenkins, G. S.; Waliser, D. E.; Rwebangira, R.; Demoz, B.; Kim, J.; Goodale, C. E.; Hart, A. F.; Ramirez, P.; Joyce, M. J.; Loikith, P.; Lee, H.; Khudikyan, S.; Boustani, M.; Goodman, A.; Zimdars, P. A.; Whittell, J.
2013-12-01
Mesoscale convective complexes (MCCs) are convectively-driven weather systems with a duration of ~10 - 12 hours and contributions of large amounts to the rainfall daily and monthly totals. More than 400 MCCs occur annually over various locations on the globe. In West Africa, ~170 MCCs occur annually during the 180 days representing the summer months (June - November), and contribute ~75% of the annual wet season rainfall. The main objective of this study is to improve automatic identification of MCC over West Africa. The spatial expanse of MCCs and the spatio-temporal variability in their convective characteristics make them difficult to characterize even in dense networks of radars and/or surface gauges. As such there exist criteria for identifying MCCs with satellite images - mostly using infrared (IR) data. Automated MCC identification methods are based on forward and/or backward in time spatial-temporal analysis of the IR satellite data and characteristically incorporate a manual component as these algorithms routinely falter with merging and splitting cloud systems between satellite images. However, these algorithms are not readily transferable to voluminous data or other satellite-derived datasets (e.g. TRMM), thus hindering comprehensive studies of these features both at weather and climate timescales. Recognizing the existing limitations of automated methods, this study explores the applicability of graph theory to creating a fully automated method for deriving a West African MCC dataset from hourly infrared satellite images between 2001- 2012. Graph theory, though not heavily implemented in the atmospheric sciences, has been used for the predicting (nowcasting) of thunderstorms from radar and satellite data by considering the relationship between atmospheric variables at a given time, or for the spatial-temporal analysis of cloud volumes. From these few studies, graph theory appears to be innately applicable to the complexity, non-linearity and inherent chaos of the atmospheric system. Our preliminary results show that the use of graph theory improves data management thus allowing for longer periods to be studied, and creates a transferable method that allows for other data to be utilized.
NASA Astrophysics Data System (ADS)
Buscema, Massimo; Asadi-Zeydabadi, Masoud; Lodwick, Weldon; Breda, Marco
2016-04-01
Significant applications such as the analysis of Alzheimer's disease differentiated from dementia, or in data mining of social media, or in extracting information of drug cartel structural composition, are often modeled as graphs. The structural or topological complexity or lack of it in a graph is quite often useful in understanding and more importantly, resolving the problem. We are proposing a new index we call the H0function to measure the structural/topological complexity of a graph. To do this, we introduce the concept of graph pruning and its associated algorithm that is used in the development of our measure. We illustrate the behavior of our measure, the H0 function, through different examples found in the appendix. These examples indicate that the H0 function contains information that is useful and important characteristics of a graph. Here, we restrict ourselves to undirected.
Mining and Modeling Real-World Networks: Patterns, Anomalies, and Tools
ERIC Educational Resources Information Center
Akoglu, Leman
2012-01-01
Large real-world graph (a.k.a network, relational) data are omnipresent, in online media, businesses, science, and the government. Analysis of these massive graphs is crucial, in order to extract descriptive and predictive knowledge with many commercial, medical, and environmental applications. In addition to its general structure, knowing what…
Unwinding the hairball graph: Pruning algorithms for weighted complex networks
NASA Astrophysics Data System (ADS)
Dianati, Navid
2016-01-01
Empirical networks of weighted dyadic relations often contain "noisy" edges that alter the global characteristics of the network and obfuscate the most important structures therein. Graph pruning is the process of identifying the most significant edges according to a generative null model and extracting the subgraph consisting of those edges. Here, we focus on integer-weighted graphs commonly arising when weights count the occurrences of an "event" relating the nodes. We introduce a simple and intuitive null model related to the configuration model of network generation and derive two significance filters from it: the marginal likelihood filter (MLF) and the global likelihood filter (GLF). The former is a fast algorithm assigning a significance score to each edge based on the marginal distribution of edge weights, whereas the latter is an ensemble approach which takes into account the correlations among edges. We apply these filters to the network of air traffic volume between US airports and recover a geographically faithful representation of the graph. Furthermore, compared with thresholding based on edge weight, we show that our filters extract a larger and significantly sparser giant component.
MorphoGraphX: A platform for quantifying morphogenesis in 4D
Barbier de Reuille, Pierre; Routier-Kierzkowska, Anne-Lise; Kierzkowski, Daniel; Bassel, George W; Schüpbach, Thierry; Tauriello, Gerardo; Bajpai, Namrata; Strauss, Sören; Weber, Alain; Kiss, Annamaria; Burian, Agata; Hofhuis, Hugo; Sapala, Aleksandra; Lipowczan, Marcin; Heimlicher, Maria B; Robinson, Sarah; Bayer, Emmanuelle M; Basler, Konrad; Koumoutsakos, Petros; Roeder, Adrienne HK; Aegerter-Wilmsen, Tinri; Nakayama, Naomi; Tsiantis, Miltos; Hay, Angela; Kwiatkowska, Dorota; Xenarios, Ioannis; Kuhlemeier, Cris; Smith, Richard S
2015-01-01
Morphogenesis emerges from complex multiscale interactions between genetic and mechanical processes. To understand these processes, the evolution of cell shape, proliferation and gene expression must be quantified. This quantification is usually performed either in full 3D, which is computationally expensive and technically challenging, or on 2D planar projections, which introduces geometrical artifacts on highly curved organs. Here we present MorphoGraphX (www.MorphoGraphX.org), a software that bridges this gap by working directly with curved surface images extracted from 3D data. In addition to traditional 3D image analysis, we have developed algorithms to operate on curved surfaces, such as cell segmentation, lineage tracking and fluorescence signal quantification. The software's modular design makes it easy to include existing libraries, or to implement new algorithms. Cell geometries extracted with MorphoGraphX can be exported and used as templates for simulation models, providing a powerful platform to investigate the interactions between shape, genes and growth. DOI: http://dx.doi.org/10.7554/eLife.05864.001 PMID:25946108
A neural network ActiveX based integrated image processing environment.
Ciuca, I; Jitaru, E; Alaicescu, M; Moisil, I
2000-01-01
The paper outlines an integrated image processing environment that uses neural networks ActiveX technology for object recognition and classification. The image processing environment which is Windows based, encapsulates a Multiple-Document Interface (MDI) and is menu driven. Object (shape) parameter extraction is focused on features that are invariant in terms of translation, rotation and scale transformations. The neural network models that can be incorporated as ActiveX components into the environment allow both clustering and classification of objects from the analysed image. Mapping neural networks perform an input sensitivity analysis on the extracted feature measurements and thus facilitate the removal of irrelevant features and improvements in the degree of generalisation. The program has been used to evaluate the dimensions of the hydrocephalus in a study for calculating the Evans index and the angle of the frontal horns of the ventricular system modifications.
Efficient content-based low-altitude images correlated network and strips reconstruction
NASA Astrophysics Data System (ADS)
He, Haiqing; You, Qi; Chen, Xiaoyong
2017-01-01
The manual intervention method is widely used to reconstruct strips for further aerial triangulation in low-altitude photogrammetry. Clearly the method for fully automatic photogrammetric data processing is not an expected way. In this paper, we explore a content-based approach without manual intervention or external information for strips reconstruction. Feature descriptors in the local spatial patterns are extracted by SIFT to construct vocabulary tree, in which these features are encoded in terms of TF-IDF numerical statistical algorithm to generate new representation for each low-altitude image. Then images correlated network is reconstructed by similarity measure, image matching and geometric graph theory. Finally, strips are reconstructed automatically by tracing straight lines and growing adjacent images gradually. Experimental results show that the proposed approach is highly effective in automatically rearranging strips of lowaltitude images and can provide rough relative orientation for further aerial triangulation.
Lombaert, Herve; Grady, Leo; Polimeni, Jonathan R.; Cheriet, Farida
2013-01-01
Existing methods for surface matching are limited by the trade-off between precision and computational efficiency. Here we present an improved algorithm for dense vertex-to-vertex correspondence that uses direct matching of features defined on a surface and improves it by using spectral correspondence as a regularization. This algorithm has the speed of both feature matching and spectral matching while exhibiting greatly improved precision (distance errors of 1.4%). The method, FOCUSR, incorporates implicitly such additional features to calculate the correspondence and relies on the smoothness of the lowest-frequency harmonics of a graph Laplacian to spatially regularize the features. In its simplest form, FOCUSR is an improved spectral correspondence method that nonrigidly deforms spectral embeddings. We provide here a full realization of spectral correspondence where virtually any feature can be used as additional information using weights on graph edges, but also on graph nodes and as extra embedded coordinates. As an example, the full power of FOCUSR is demonstrated in a real case scenario with the challenging task of brain surface matching across several individuals. Our results show that combining features and regularizing them in a spectral embedding greatly improves the matching precision (to a sub-millimeter level) while performing at much greater speed than existing methods. PMID:23868776
NASA Astrophysics Data System (ADS)
Seppke, Benjamin; Dreschler-Fischer, Leonie; Wilms, Christian
2016-08-01
The extraction of road signatures from remote sensing images as a promising indicator for urbanization is a classical segmentation problem. However, some segmentation algorithms often lead to non-sufficient results. One way to overcome this problem is the usage of superpixels, that represent a locally coherent cluster of connected pixels. Superpixels allow flexible, highly adaptive segmentation approaches due to the possibility of merging as well as splitting and form new basic image entities. On the other hand, superpixels require an appropriate representation containing all relevant information about topology and geometry to maximize their advantages.In this work, we present a combined geometric and topological representation based on a special graph representation, the so-called RS-graph. Moreover, we present the use of the RS-graph by means of a case study: the extraction of partially occluded road networks in rural areas from open source (spectral) remote sensing images by tracking. In addition, multiprocessing and GPU-based parallelization is used to speed up the construction of the representation and the application.
Ravikumar, Ke; Liu, Haibin; Cohn, Judith D; Wall, Michael E; Verspoor, Karin
2012-10-05
We propose a method for automatic extraction of protein-specific residue mentions from the biomedical literature. The method searches text for mentions of amino acids at specific sequence positions and attempts to correctly associate each mention with a protein also named in the text. The methods presented in this work will enable improved protein functional site extraction from articles, ultimately supporting protein function prediction. Our method made use of linguistic patterns for identifying the amino acid residue mentions in text. Further, we applied an automated graph-based method to learn syntactic patterns corresponding to protein-residue pairs mentioned in the text. We finally present an approach to automated construction of relevant training and test data using the distant supervision model. The performance of the method was assessed by extracting protein-residue relations from a new automatically generated test set of sentences containing high confidence examples found using distant supervision. It achieved a F-measure of 0.84 on automatically created silver corpus and 0.79 on a manually annotated gold data set for this task, outperforming previous methods. The primary contributions of this work are to (1) demonstrate the effectiveness of distant supervision for automatic creation of training data for protein-residue relation extraction, substantially reducing the effort and time involved in manual annotation of a data set and (2) show that the graph-based relation extraction approach we used generalizes well to the problem of protein-residue association extraction. This work paves the way towards effective extraction of protein functional residues from the literature.
A flexible data-driven comorbidity feature extraction framework.
Sideris, Costas; Pourhomayoun, Mohammad; Kalantarian, Haik; Sarrafzadeh, Majid
2016-06-01
Disease and symptom diagnostic codes are a valuable resource for classifying and predicting patient outcomes. In this paper, we propose a novel methodology for utilizing disease diagnostic information in a predictive machine learning framework. Our methodology relies on a novel, clustering-based feature extraction framework using disease diagnostic information. To reduce the data dimensionality, we identify disease clusters using co-occurrence statistics. We optimize the number of generated clusters in the training set and then utilize these clusters as features to predict patient severity of condition and patient readmission risk. We build our clustering and feature extraction algorithm using the 2012 National Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP) which contains 7 million hospital discharge records and ICD-9-CM codes. The proposed framework is tested on Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 Congestive Heart Failure (CHF) patients and the UCI 130-US diabetes dataset that includes admissions from 69,980 diabetic patients. We compare our cluster-based feature set with the commonly used comorbidity frameworks including Charlson's index, Elixhauser's comorbidities and their variations. The proposed approach was shown to have significant gains between 10.7-22.1% in predictive accuracy for CHF severity of condition prediction and 4.65-5.75% in diabetes readmission prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.
Object-based Encoding in Visual Working Memory: Evidence from Memory-driven Attentional Capture.
Gao, Zaifeng; Yu, Shixian; Zhu, Chengfeng; Shui, Rende; Weng, Xuchu; Li, Peng; Shen, Mowei
2016-03-09
Visual working memory (VWM) adopts a specific manner of object-based encoding (OBE) to extract perceptual information: Whenever one feature-dimension is selected for entry into VWM, the others are also extracted. Currently most studies revealing OBE probed an 'irrelevant-change distracting effect', where changes of irrelevant-features dramatically affected the performance of the target feature. However, the existence of irrelevant-feature change may affect participants' processing manner, leading to a false-positive result. The current study conducted a strict examination of OBE in VWM, by probing whether irrelevant-features guided the deployment of attention in visual search. The participants memorized an object's colour yet ignored shape and concurrently performed a visual-search task. They searched for a target line among distractor lines, each embedded within a different object. One object in the search display could match the shape, colour, or both dimensions of the memory item, but this object never contained the target line. Relative to a neutral baseline, where there was no match between the memory and search displays, search time was significantly prolonged in all match conditions, regardless of whether the memory item was displayed for 100 or 1000 ms. These results suggest that task-irrelevant shape was extracted into VWM, supporting OBE in VWM.
Chiu, Stephanie J; Toth, Cynthia A; Bowes Rickman, Catherine; Izatt, Joseph A; Farsiu, Sina
2012-05-01
This paper presents a generalized framework for segmenting closed-contour anatomical and pathological features using graph theory and dynamic programming (GTDP). More specifically, the GTDP method previously developed for quantifying retinal and corneal layer thicknesses is extended to segment objects such as cells and cysts. The presented technique relies on a transform that maps closed-contour features in the Cartesian domain into lines in the quasi-polar domain. The features of interest are then segmented as layers via GTDP. Application of this method to segment closed-contour features in several ophthalmic image types is shown. Quantitative validation experiments for retinal pigmented epithelium cell segmentation in confocal fluorescence microscopy images attests to the accuracy of the presented technique.
Chiu, Stephanie J.; Toth, Cynthia A.; Bowes Rickman, Catherine; Izatt, Joseph A.; Farsiu, Sina
2012-01-01
This paper presents a generalized framework for segmenting closed-contour anatomical and pathological features using graph theory and dynamic programming (GTDP). More specifically, the GTDP method previously developed for quantifying retinal and corneal layer thicknesses is extended to segment objects such as cells and cysts. The presented technique relies on a transform that maps closed-contour features in the Cartesian domain into lines in the quasi-polar domain. The features of interest are then segmented as layers via GTDP. Application of this method to segment closed-contour features in several ophthalmic image types is shown. Quantitative validation experiments for retinal pigmented epithelium cell segmentation in confocal fluorescence microscopy images attests to the accuracy of the presented technique. PMID:22567602
Khazaee, Ali; Ebrahimzadeh, Ata; Babajani-Feremi, Abbas
2016-09-01
The study of brain networks by resting-state functional magnetic resonance imaging (rs-fMRI) is a promising method for identifying patients with dementia from healthy controls (HC). Using graph theory, different aspects of the brain network can be efficiently characterized by calculating measures of integration and segregation. In this study, we combined a graph theoretical approach with advanced machine learning methods to study the brain network in 89 patients with mild cognitive impairment (MCI), 34 patients with Alzheimer's disease (AD), and 45 age-matched HC. The rs-fMRI connectivity matrix was constructed using a brain parcellation based on a 264 putative functional areas. Using the optimal features extracted from the graph measures, we were able to accurately classify three groups (i.e., HC, MCI, and AD) with accuracy of 88.4 %. We also investigated performance of our proposed method for a binary classification of a group (e.g., MCI) from two other groups (e.g., HC and AD). The classification accuracies for identifying HC from AD and MCI, AD from HC and MCI, and MCI from HC and AD, were 87.3, 97.5, and 72.0 %, respectively. In addition, results based on the parcellation of 264 regions were compared to that of the automated anatomical labeling atlas (AAL), consisted of 90 regions. The accuracy of classification of three groups using AAL was degraded to 83.2 %. Our results show that combining the graph measures with the machine learning approach, on the basis of the rs-fMRI connectivity analysis, may assist in diagnosis of AD and MCI.
NASA Astrophysics Data System (ADS)
Alzahrani, Abdullah; Hu, Sijung; Azorin-Peris, Vicente; Barrett, Laura; Esliger, Dale; Hayes, Matthew; Akbare, Shafique; Achart, Jérôme; Kuoch, Sylvain
2015-03-01
This study presents an effective engineering approach for human vital signs monitoring as increasingly demanded by personal healthcare. The aim of this work is to study how to capture critical physiological parameters efficiently through a well-constructed electronic system and a robust multi-channel opto-electronic patch sensor (OEPS), together with a wireless communication. A unique design comprising multi-wavelength illumination sources and a rapid response photo sensor with a 3-axis accelerometer enables to recover pulsatile features, compensate motion and increase signal-to-noise ratio. An approved protocol with designated tests was implemented at Loughborough University a UK leader in sport and exercise assessment. The results of sport physiological effects were extracted from the datasets of physical movements, i.e. sitting, standing, waking, running and cycling. t-test, Bland-Altman and correlation analysis were applied to evaluate the performance of the OEPS system against Acti-Graph and Mio-Alpha.There was no difference in heart rate measured using OEPS and both Acti-Graph and Mio-Alpha (both p<0.05). Strong correlations were observed between HR measured from the OEPS and both the Acti-graph and Mio-Alpha (r = 0.96, p<0.001). Bland-Altman analysis for the Acti-Graph and OEPS found the bias 0.85 bpm, the standard deviation 9.20 bpm, and the limits of agreement (LOA) -17.18 bpm to +18.88 bpm for lower and upper limits of agreement respectively, for the Mio-Alpha and OEPS the bias is 1.63 bpm, standard deviation SD8.62 bpm, lower and upper limits of agreement, - 15.27 bpm and +18.58 bpm respectively. The OEPS demonstrates a real time, robust and remote monitoring of cardiovascular function.
ERIC Educational Resources Information Center
Katz, Irvin R.; Xi, Xiaoming; Kim, Hyun-Joo; Cheng, Peter C. H.
2004-01-01
This research applied a cognitive model to identify item features that lead to irrelevant variance on the Test of Spoken English[TM] (TSE[R]). The TSE is an assessment of English oral proficiency and includes an item that elicits a description of a statistical graph. This item type sometimes appears to tap graph-reading skills--an irrelevant…
Seven Wonders of the Ancient and Modern Quadratic World.
ERIC Educational Resources Information Center
Taylor, Sharon E.; Mittag, Kathleen Cage
2001-01-01
Presents four methods for solving a quadratic equation using graphing calculator technology: (1) graphing with the CALC feature; (2) quadratic formula program; (3) table; and (4) solver. Includes a worksheet for a lab activity on factoring quadratic equations. (KHR)
Pan, Yongke; Niu, Wenjia
2017-01-01
Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm, which can easily resolve the out-of-sample problem. Relative works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. Different from these relative works, the regularized graph construction is researched here, which is important in the graph-based semisupervised learning methods. In this paper, we propose a novel graph for Semisupervised Discriminant Analysis, which is called combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space and then the kNN is adopted to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines. PMID:28316616
Efficient Extraction of High Centrality Vertices in Distributed Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kumbhare, Alok; Frincu, Marc; Raghavendra, Cauligi S.
2014-09-09
Betweenness centrality (BC) is an important measure for identifying high value or critical vertices in graphs, in variety of domains such as communication networks, road networks, and social graphs. However, calculating betweenness values is prohibitively expensive and, more often, domain experts are interested only in the vertices with the highest centrality values. In this paper, we first propose a partition-centric algorithm (MS-BC) to calculate BC for a large distributed graph that optimizes resource utilization and improves overall performance. Further, we extend the notion of approximate BC by pruning the graph and removing a subset of edges and vertices that contributemore » the least to the betweenness values of other vertices (MSL-BC), which further improves the runtime performance. We evaluate the proposed algorithms using a mix of real-world and synthetic graphs on an HPC cluster and analyze its strengths and weaknesses. The experimental results show an improvement in performance of upto 12x for large sparse graphs as compared to the state-of-the-art, and at the same time highlights the need for better partitioning methods to enable a balanced workload across partitions for unbalanced graphs such as small-world or power-law graphs.« less
NASA Astrophysics Data System (ADS)
Yamaguchi, Atsuko; Ohashi, Takeyoshi; Kawasaki, Takahiro; Inoue, Osamu; Kawada, Hiroki
2013-04-01
A new method for calculating critical dimension (CDs) at the top and bottom of three-dimensional (3D) pattern profiles from a critical-dimension scanning electron microscope (CD-SEM) image, called as "T-sigma method", is proposed and evaluated. Without preparing a library of database in advance, T-sigma can estimate a feature of a pattern sidewall. Furthermore, it supplies the optimum edge-definition (i.e., threshold level for determining edge position from a CDSEM signal) to detect the top and bottom of the pattern. This method consists of three steps. First, two components of line-edge roughness (LER); noise-induced bias (i.e., LER bias) and unbiased component (i.e., bias-free LER) are calculated with set threshold level. Second, these components are calculated with various threshold values, and the threshold-dependence of these two components, "T-sigma graph", is obtained. Finally, the optimum threshold value for the top and the bottom edge detection are given by the analysis of T-sigma graph. T-sigma was applied to CD-SEM images of three kinds of resist-pattern samples. In addition, reference metrology was performed with atomic force microscope (AFM) and scanning transmission electron microscope (STEM). Sensitivity of CD measured by T-sigma to the reference CD was higher than or equal to that measured by the conventional edge definition. Regarding the absolute measurement accuracy, T-sigma showed better results than the conventional definition. Furthermore, T-sigma graphs were calculated from CD-SEM images of two kinds of resist samples and compared with corresponding STEM observation results. Both bias-free LER and LER bias increased as the detected edge point moved from the bottom to the top of the pattern in the case that the pattern had a straight sidewall and a round top. On the other hand, they were almost constant in the case that the pattern had a re-entrant profile. T-sigma will be able to reveal a re-entrant feature. From these results, it is found that T-sigma method can provide rough cross-sectional pattern features and achieve quick, easy and accurate measurements of top and bottom CD.
BioJS DAGViewer: A reusable JavaScript component for displaying directed graphs
Micklem, Gos
2014-01-01
Summary: The DAGViewer BioJS component is a reusable JavaScript component made available as part of the BioJS project and intended to be used to display graphs of structured data, with a particular emphasis on Directed Acyclic Graphs (DAGs). It enables users to embed representations of graphs of data, such as ontologies or phylogenetic trees, in hyper-text documents (HTML). This component is generic, since it is capable (given the appropriate configuration) of displaying any kind of data that is organised as a graph. The features of this component which are useful for examining and filtering large and complex graphs are described. Availability: http://github.com/alexkalderimis/dag-viewer-biojs; http://github.com/biojs/biojs; http://dx.doi.org/10.5281/zenodo.8303. PMID:24627804
Unsalan, Cem; Boyer, Kim L
2005-04-01
Today's commercial satellite images enable experts to classify region types in great detail. In previous work, we considered discriminating rural and urban regions [23]. However, a more detailed classification is required for many purposes. These fine classifications assist government agencies in many ways including urban planning, transportation management, and rescue operations. In a step toward the automation of the fine classification process, this paper explores graph theoretical measures over grayscale images. The graphs are constructed by assigning photometric straight line segments to vertices, while graph edges encode their spatial relationships. We then introduce a set of measures based on various properties of the graph. These measures are nearly monotonic (positively correlated) with increasing structure (organization) in the image. Thus, increased cultural activity and land development are indicated by increases in these measures-without explicit extraction of road networks, buildings, residences, etc. These latter, time consuming (and still only partially automated) tasks can be restricted only to "promising" image regions, according to our measures. In some applications our measures may suffice. We present a theoretical basis for the measures followed by extensive experimental results in which the measures are first compared to manual evaluations of land development. We then present and test a method to focus on, and (pre)extract, suburban-style residential areas. These are of particular importance in many applications, and are especially difficult to extract. In this work, we consider commercial IKONOS data. These images are orthorectified to provide a fixed resolution of 1 meter per pixel on the ground. They are, therefore, metric in the sense that ground distance is fixed in scale to pixel distance. Our data set is large and diverse, including sea and coastline, rural, forest, residential, industrial, and urban areas.
Heuristic-driven graph wavelet modeling of complex terrain
NASA Astrophysics Data System (ADS)
Cioacǎ, Teodor; Dumitrescu, Bogdan; Stupariu, Mihai-Sorin; Pǎtru-Stupariu, Ileana; Nǎpǎrus, Magdalena; Stoicescu, Ioana; Peringer, Alexander; Buttler, Alexandre; Golay, François
2015-03-01
We present a novel method for building a multi-resolution representation of large digital surface models. The surface points coincide with the nodes of a planar graph which can be processed using a critically sampled, invertible lifting scheme. To drive the lazy wavelet node partitioning, we employ an attribute aware cost function based on the generalized quadric error metric. The resulting algorithm can be applied to multivariate data by storing additional attributes at the graph's nodes. We discuss how the cost computation mechanism can be coupled with the lifting scheme and examine the results by evaluating the root mean square error. The algorithm is experimentally tested using two multivariate LiDAR sets representing terrain surface and vegetation structure with different sampling densities.
Lukasczyk, Jonas; Weber, Gunther; Maciejewski, Ross; ...
2017-06-01
Tracking graphs are a well established tool in topological analysis to visualize the evolution of components and their properties over time, i.e., when components appear, disappear, merge, and split. However, tracking graphs are limited to a single level threshold and the graphs may vary substantially even under small changes to the threshold. To examine the evolution of features for varying levels, users have to compare multiple tracking graphs without a direct visual link between them. We propose a novel, interactive, nested graph visualization based on the fact that the tracked superlevel set components for different levels are related to eachmore » other through their nesting hierarchy. This approach allows us to set multiple tracking graphs in context to each other and enables users to effectively follow the evolution of components for different levels simultaneously. We show the effectiveness of our approach on datasets from finite pointset methods, computational fluid dynamics, and cosmology simulations.« less
Visual Routines Are Associated with Specific Graph Interpretations
ERIC Educational Resources Information Center
Michal, Audrey L.; Franconeri, Steven L.
2017-01-01
We argue that people compare values in graphs with a "visual routine"--attending to data values in an ordered pattern over time. Do these visual routines exist to manage capacity limitations in how many values can be encoded at once, or do they actually affect the relations that are extracted? We measured eye movements while people…
Reduced graphs and their applications in chemoinformatics.
Birchall, Kristian; Gillet, Valerie J
2011-01-01
Reduced graphs provide summary representations of chemical structures by collapsing groups of connected atoms into single nodes while preserving the topology of the original structures. This chapter reviews the extensive work that has been carried out on reduced graphs at The University of Sheffield and includes discussion of their application to the representation and search of Markush structures in patents, the varied approaches that have been implemented for similarity searching, their use in cluster representation, the different ways in which they have been applied to extract structure-activity relationships and their use in encoding bioisosteres.
Inference of Spatio-Temporal Functions Over Graphs via Multikernel Kriged Kalman Filtering
NASA Astrophysics Data System (ADS)
Ioannidis, Vassilis N.; Romero, Daniel; Giannakis, Georgios B.
2018-06-01
Inference of space-time varying signals on graphs emerges naturally in a plethora of network science related applications. A frequently encountered challenge pertains to reconstructing such dynamic processes, given their values over a subset of vertices and time instants. The present paper develops a graph-aware kernel-based kriged Kalman filter that accounts for the spatio-temporal variations, and offers efficient online reconstruction, even for dynamically evolving network topologies. The kernel-based learning framework bypasses the need for statistical information by capitalizing on the smoothness that graph signals exhibit with respect to the underlying graph. To address the challenge of selecting the appropriate kernel, the proposed filter is combined with a multi-kernel selection module. Such a data-driven method selects a kernel attuned to the signal dynamics on-the-fly within the linear span of a pre-selected dictionary. The novel multi-kernel learning algorithm exploits the eigenstructure of Laplacian kernel matrices to reduce computational complexity. Numerical tests with synthetic and real data demonstrate the superior reconstruction performance of the novel approach relative to state-of-the-art alternatives.
On a programming language for graph algorithms
NASA Technical Reports Server (NTRS)
Rheinboldt, W. C.; Basili, V. R.; Mesztenyi, C. K.
1971-01-01
An algorithmic language, GRAAL, is presented for describing and implementing graph algorithms of the type primarily arising in applications. The language is based on a set algebraic model of graph theory which defines the graph structure in terms of morphisms between certain set algebraic structures over the node set and arc set. GRAAL is modular in the sense that the user specifies which of these mappings are available with any graph. This allows flexibility in the selection of the storage representation for different graph structures. In line with its set theoretic foundation, the language introduces sets as a basic data type and provides for the efficient execution of all set and graph operators. At present, GRAAL is defined as an extension of ALGOL 60 (revised) and its formal description is given as a supplement to the syntactic and semantic definition of ALGOL. Several typical graph algorithms are written in GRAAL to illustrate various features of the language and to show its applicability.
Unsupervised Metric Fusion Over Multiview Data by Graph Random Walk-Based Cross-View Diffusion.
Wang, Yang; Zhang, Wenjie; Wu, Lin; Lin, Xuemin; Zhao, Xiang
2017-01-01
Learning an ideal metric is crucial to many tasks in computer vision. Diverse feature representations may combat this problem from different aspects; as visual data objects described by multiple features can be decomposed into multiple views, thus often provide complementary information. In this paper, we propose a cross-view fusion algorithm that leads to a similarity metric for multiview data by systematically fusing multiple similarity measures. Unlike existing paradigms, we focus on learning distance measure by exploiting a graph structure of data samples, where an input similarity matrix can be improved through a propagation of graph random walk. In particular, we construct multiple graphs with each one corresponding to an individual view, and a cross-view fusion approach based on graph random walk is presented to derive an optimal distance measure by fusing multiple metrics. Our method is scalable to a large amount of data by enforcing sparsity through an anchor graph representation. To adaptively control the effects of different views, we dynamically learn view-specific coefficients, which are leveraged into graph random walk to balance multiviews. However, such a strategy may lead to an over-smooth similarity metric where affinities between dissimilar samples may be enlarged by excessively conducting cross-view fusion. Thus, we figure out a heuristic approach to controlling the iteration number in the fusion process in order to avoid over smoothness. Extensive experiments conducted on real-world data sets validate the effectiveness and efficiency of our approach.
Wang, Zhengxia; Zhu, Xiaofeng; Adeli, Ehsan; Zhu, Yingying; Nie, Feiping; Munsell, Brent
2018-01-01
Graph-based transductive learning (GTL) is a powerful machine learning technique that is used when sufficient training data is not available. In particular, conventional GTL approaches first construct a fixed inter-subject relation graph that is based on similarities in voxel intensity values in the feature domain, which can then be used to propagate the known phenotype data (i.e., clinical scores and labels) from the training data to the testing data in the label domain. However, this type of graph is exclusively learned in the feature domain, and primarily due to outliers in the observed features, may not be optimal for label propagation in the label domain. To address this limitation, a progressive GTL (pGTL) method is proposed that gradually finds an intrinsic data representation that more accurately aligns imaging features with the phenotype data. In general, optimal feature-to-phenotype alignment is achieved using an iterative approach that: (1) refines inter-subject relationships observed in the feature domain by using the learned intrinsic data representation in the label domain, (2) updates the intrinsic data representation from the refined inter-subject relationships, and (3) verifies the intrinsic data representation on the training data to guarantee an optimal classification when applied to testing data. Additionally, the iterative approach is extended to multi-modal imaging data to further improve pGTL classification accuracy. Using Alzheimer’s disease and Parkinson’s disease study data, the classification accuracy of the proposed pGTL method is compared to several state-of-the-art classification methods, and the results show pGTL can more accurately identify subjects, even at different progression stages, in these two study data sets. PMID:28551556
MMKG: An approach to generate metallic materials knowledge graph based on DBpedia and Wikipedia
NASA Astrophysics Data System (ADS)
Zhang, Xiaoming; Liu, Xin; Li, Xin; Pan, Dongyu
2017-02-01
The research and development of metallic materials are playing an important role in today's society, and in the meanwhile lots of metallic materials knowledge is generated and available on the Web (e.g., Wikipedia) for materials experts. However, due to the diversity and complexity of metallic materials knowledge, the knowledge utilization may encounter much inconvenience. The idea of knowledge graph (e.g., DBpedia) provides a good way to organize the knowledge into a comprehensive entity network. Therefore, the motivation of our work is to generate a metallic materials knowledge graph (MMKG) using available knowledge on the Web. In this paper, an approach is proposed to build MMKG based on DBpedia and Wikipedia. First, we use an algorithm based on directly linked sub-graph semantic distance (DLSSD) to preliminarily extract metallic materials entities from DBpedia according to some predefined seed entities; then based on the results of the preliminary extraction, we use an algorithm, which considers both semantic distance and string similarity (SDSS), to achieve the further extraction. Second, due to the absence of materials properties in DBpedia, we use an ontology-based method to extract properties knowledge from the HTML tables of corresponding Wikipedia Web pages for enriching MMKG. Materials ontology is used to locate materials properties tables as well as to identify the structure of the tables. The proposed approach is evaluated by precision, recall, F1 and time performance, and meanwhile the appropriate thresholds for the algorithms in our approach are determined through experiments. The experimental results show that our approach returns expected performance. A tool prototype is also designed to facilitate the process of building the MMKG as well as to demonstrate the effectiveness of our approach.
Preserving Differential Privacy in Degree-Correlation based Graph Generation
Wang, Yue; Wu, Xintao
2014-01-01
Enabling accurate analysis of social network data while preserving differential privacy has been challenging since graph features such as cluster coefficient often have high sensitivity, which is different from traditional aggregate functions (e.g., count and sum) on tabular data. In this paper, we study the problem of enforcing edge differential privacy in graph generation. The idea is to enforce differential privacy on graph model parameters learned from the original network and then generate the graphs for releasing using the graph model with the private parameters. In particular, we develop a differential privacy preserving graph generator based on the dK-graph generation model. We first derive from the original graph various parameters (i.e., degree correlations) used in the dK-graph model, then enforce edge differential privacy on the learned parameters, and finally use the dK-graph model with the perturbed parameters to generate graphs. For the 2K-graph model, we enforce the edge differential privacy by calibrating noise based on the smooth sensitivity, rather than the global sensitivity. By doing this, we achieve the strict differential privacy guarantee with smaller magnitude noise. We conduct experiments on four real networks and compare the performance of our private dK-graph models with the stochastic Kronecker graph generation model in terms of utility and privacy tradeoff. Empirical evaluations show the developed private dK-graph generation models significantly outperform the approach based on the stochastic Kronecker generation model. PMID:24723987
Listing All Maximal Cliques in Sparse Graphs in Near-optimal Time
2011-01-01
523 10 Arabisopsis thaliana 1745 3098 71 12 Drosophila melanogaster 7282 24894 176 12 Homo Sapiens 9527 31182 308 12 Schizosaccharomyces pombe 2031...clusters of actors [6,14,28,40] and may be used as features in exponential random graph models for statistical analysis of social networks [17,19,20,44,49...29. R. Horaud and T. Skordas. Stereo correspondence through feature grouping and maximal cliques. IEEE Trans. Patt. An. Mach. Int. 11(11):1168–1180
Measuring Graph Comprehension, Critique, and Construction in Science
NASA Astrophysics Data System (ADS)
Lai, Kevin; Cabrera, Julio; Vitale, Jonathan M.; Madhok, Jacquie; Tinker, Robert; Linn, Marcia C.
2016-08-01
Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed items to measure graph comprehension, critique, and construction and developed scoring rubrics based on the knowledge integration (KI) framework. We administered the items to over 460 middle school students. We found that the items formed a coherent scale and had good reliability using both item response theory and classical test theory. The KI scoring rubric showed that most students had difficulty linking graphs features to science concepts, especially when asked to critique or construct graphs. In addition, students with limited access to computers as well as those who speak a language other than English at home have less integrated understanding than others. These findings point to the need to increase the integration of graphing into science instruction. The results suggest directions for further research leading to comprehensive assessments of graph understanding.
Chasin, Rachel; Rumshisky, Anna; Uzuner, Ozlem; Szolovits, Peter
2014-01-01
Objective To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources. Materials and methods The graph-based methods use variations of PageRank and distance-based similarity metrics, operating over the Unified Medical Language System (UMLS). Topic-modeling methods use unlabeled data from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database to derive models for each ambiguous word. We investigate the impact of using different linguistic features for topic models, including UMLS-based and syntactic features. We use a sense-tagged clinical dataset from the Mayo Clinic for evaluation. Results The topic-modeling methods achieve 66.9% accuracy on a subset of the Mayo Clinic's data, while the graph-based methods only reach the 40–50% range, with a most-frequent-sense baseline of 56.5%. Features derived from the UMLS semantic type and concept hierarchies do not produce a gain over bag-of-words features in the topic models, but identifying phrases from UMLS and using syntax does help. Discussion Although topic models outperform graph-based methods, semantic features derived from the UMLS prove too noisy to improve performance beyond bag-of-words. Conclusions Topic modeling for WSD provides superior results in the clinical domain; however, integration of knowledge remains to be effectively exploited. PMID:24441986
Precalculus teachers' perspectives on using graphing calculators: an example from one curriculum
NASA Astrophysics Data System (ADS)
Karadeniz, Ilyas; Thompson, Denisse R.
2018-01-01
Graphing calculators are hand-held technological tools currently used in mathematics classrooms. Teachers' perspectives on using graphing calculators are important in terms of exploring what teachers think about using such technology in advanced mathematics courses, particularly precalculus courses. A descriptive intrinsic case study was conducted to analyse the perspectives of 11 teachers using graphing calculators with potential Computer Algebra System (CAS) capability while teaching Functions, Statistics, and Trigonometry, a precalculus course for 11th-grade students developed by the University of Chicago School Mathematics Project. Data were collected from multiple sources as part of a curriculum evaluation study conducted during the 2007-2008 school year. Although all teachers were using the same curriculum that integrated CAS into the instructional materials, teachers had mixed views about the technology. Graphing calculator features were used much more than CAS features, with many teachers concerned about the use of CAS because of pressures from external assessments. In addition, several teachers found it overwhelming to learn a new technology at the same time they were learning a new curriculum. The results have implications for curriculum developers and others working with teachers to update curriculum and the use of advanced technologies simultaneously.
Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Changsen; Liu, Feixiang
2017-02-15
Common spatial pattern (CSP) is most widely used in motor imagery based brain-computer interface (BCI) systems. In conventional CSP algorithm, pairs of the eigenvectors corresponding to both extreme eigenvalues are selected to construct the optimal spatial filter. In addition, an appropriate selection of subject-specific time segments and frequency bands plays an important role in its successful application. This study proposes to optimize spatial-frequency-temporal patterns for discriminative feature extraction. Spatial optimization is implemented by channel selection and finding discriminative spatial filters adaptively on each time-frequency segment. A novel Discernibility of Feature Sets (DFS) criteria is designed for spatial filter optimization. Besides, discriminative features located in multiple time-frequency segments are selected automatically by the proposed sparse time-frequency segment common spatial pattern (STFSCSP) method which exploits sparse regression for significant features selection. Finally, a weight determined by the sparse coefficient is assigned for each selected CSP feature and we propose a Weighted Naïve Bayesian Classifier (WNBC) for classification. Experimental results on two public EEG datasets demonstrate that optimizing spatial-frequency-temporal patterns in a data-driven manner for discriminative feature extraction greatly improves the classification performance. The proposed method gives significantly better classification accuracies in comparison with several competing methods in the literature. The proposed approach is a promising candidate for future BCI systems. Copyright © 2016 Elsevier B.V. All rights reserved.
Searching social networks for subgraph patterns
NASA Astrophysics Data System (ADS)
Ogaard, Kirk; Kase, Sue; Roy, Heather; Nagi, Rakesh; Sambhoos, Kedar; Sudit, Moises
2013-06-01
Software tools for Social Network Analysis (SNA) are being developed which support various types of analysis of social networks extracted from social media websites (e.g., Twitter). Once extracted and stored in a database such social networks are amenable to analysis by SNA software. This data analysis often involves searching for occurrences of various subgraph patterns (i.e., graphical representations of entities and relationships). The authors have developed the Graph Matching Toolkit (GMT) which provides an intuitive Graphical User Interface (GUI) for a heuristic graph matching algorithm called the Truncated Search Tree (TruST) algorithm. GMT is a visual interface for graph matching algorithms processing large social networks. GMT enables an analyst to draw a subgraph pattern by using a mouse to select categories and labels for nodes and links from drop-down menus. GMT then executes the TruST algorithm to find the top five occurrences of the subgraph pattern within the social network stored in the database. GMT was tested using a simulated counter-insurgency dataset consisting of cellular phone communications within a populated area of operations in Iraq. The results indicated GMT (when executing the TruST graph matching algorithm) is a time-efficient approach to searching large social networks. GMT's visual interface to a graph matching algorithm enables intelligence analysts to quickly analyze and summarize the large amounts of data necessary to produce actionable intelligence.
A model of language inflection graphs
NASA Astrophysics Data System (ADS)
Fukś, Henryk; Farzad, Babak; Cao, Yi
2014-01-01
Inflection graphs are highly complex networks representing relationships between inflectional forms of words in human languages. For so-called synthetic languages, such as Latin or Polish, they have particularly interesting structure due to the abundance of inflectional forms. We construct the simplest form of inflection graphs, namely a bipartite graph in which one group of vertices corresponds to dictionary headwords and the other group to inflected forms encountered in a given text. We, then, study projection of this graph on the set of headwords. The projection decomposes into a large number of connected components, to be called word groups. Distribution of sizes of word group exhibits some remarkable properties, resembling cluster distribution in a lattice percolation near the critical point. We propose a simple model which produces graphs of this type, reproducing the desired component distribution and other topological features.
Perceptron ensemble of graph-based positive-unlabeled learning for disease gene identification.
Jowkar, Gholam-Hossein; Mansoori, Eghbal G
2016-10-01
Identification of disease genes, using computational methods, is an important issue in biomedical and bioinformatics research. According to observations that diseases with the same or similar phenotype have the same biological characteristics, researchers have tried to identify genes by using machine learning tools. In recent attempts, some semi-supervised learning methods, called positive-unlabeled learning, is used for disease gene identification. In this paper, we present a Perceptron ensemble of graph-based positive-unlabeled learning (PEGPUL) on three types of biological attributes: gene ontologies, protein domains and protein-protein interaction networks. In our method, a reliable set of positive and negative genes are extracted using co-training schema. Then, the similarity graph of genes is built using metric learning by concentrating on multi-rank-walk method to perform inference from labeled genes. At last, a Perceptron ensemble is learned from three weighted classifiers: multilevel support vector machine, k-nearest neighbor and decision tree. The main contributions of this paper are: (i) incorporating the statistical properties of gene data through choosing proper metrics, (ii) statistical evaluation of biological features, and (iii) noise robustness characteristic of PEGPUL via using multilevel schema. In order to assess PEGPUL, we have applied it on 12950 disease genes with 949 positive genes from six class of diseases and 12001 unlabeled genes. Compared with some popular disease gene identification methods, the experimental results show that PEGPUL has reasonable performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dunn-Walters, Deborah K.; Belelovsky, Alex; Edelman, Hanna; Banerjee, Monica; Mehr, Ramit
2002-01-01
We have developed a rigorous graph-theoretical algorithm for quantifying the shape properties of mutational lineage trees. We show that information about the dynamics of hypermutation and antigen-driven clonal selection during the humoral immune response is contained in the shape of mutational lineage trees deduced from the responding clones. Age and tissue related differences in the selection process can be studied using this method. Thus, tree shape analysis can be used as a means of elucidating humoral immune response dynamics in various situations. PMID:15144020
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillips, Carolyn L.; Guo, Hanqi; Peterka, Tom
In type-II superconductors, the dynamics of magnetic flux vortices determine their transport properties. In the Ginzburg-Landau theory, vortices correspond to topological defects in the complex order parameter field. Earlier, in Phillips et al. [Phys. Rev. E 91, 023311 (2015)], we introduced a method for extracting vortices from the discretized complex order parameter field generated by a large-scale simulation of vortex matter. With this method, at a fixed time step, each vortex [simplistically, a one-dimensional (1D) curve in 3D space] can be represented as a connected graph extracted from the discretized field. Here we extend this method as a function ofmore » time as well. A vortex now corresponds to a 2D space-time sheet embedded in 4D space time that can be represented as a connected graph extracted from the discretized field over both space and time. Vortices that interact by merging or splitting correspond to disappearance and appearance of holes in the connected graph in the time direction. This method of tracking vortices, which makes no assumptions about the scale or behavior of the vortices, can track the vortices with a resolution as good as the discretization of the temporally evolving complex scalar field. Additionally, even details of the trajectory between time steps can be reconstructed from the connected graph. With this form of vortex tracking, the details of vortex dynamics in a model of a superconducting materials can be understood in greater detail than previously possible.« less
Event-driven processing for hardware-efficient neural spike sorting
NASA Astrophysics Data System (ADS)
Liu, Yan; Pereira, João L.; Constandinou, Timothy G.
2018-02-01
Objective. The prospect of real-time and on-node spike sorting provides a genuine opportunity to push the envelope of large-scale integrated neural recording systems. In such systems the hardware resources, power requirements and data bandwidth increase linearly with channel count. Event-based (or data-driven) processing can provide here a new efficient means for hardware implementation that is completely activity dependant. In this work, we investigate using continuous-time level-crossing sampling for efficient data representation and subsequent spike processing. Approach. (1) We first compare signals (synthetic neural datasets) encoded with this technique against conventional sampling. (2) We then show how such a representation can be directly exploited by extracting simple time domain features from the bitstream to perform neural spike sorting. (3) The proposed method is implemented in a low power FPGA platform to demonstrate its hardware viability. Main results. It is observed that considerably lower data rates are achievable when using 7 bits or less to represent the signals, whilst maintaining the signal fidelity. Results obtained using both MATLAB and reconfigurable logic hardware (FPGA) indicate that feature extraction and spike sorting accuracies can be achieved with comparable or better accuracy than reference methods whilst also requiring relatively low hardware resources. Significance. By effectively exploiting continuous-time data representation, neural signal processing can be achieved in a completely event-driven manner, reducing both the required resources (memory, complexity) and computations (operations). This will see future large-scale neural systems integrating on-node processing in real-time hardware.
Joshuva, A; Sugumaran, V
2017-03-01
Wind energy is one of the important renewable energy resources available in nature. It is one of the major resources for production of energy because of its dependability due to the development of the technology and relatively low cost. Wind energy is converted into electrical energy using rotating blades. Due to environmental conditions and large structure, the blades are subjected to various vibration forces that may cause damage to the blades. This leads to a liability in energy production and turbine shutdown. The downtime can be reduced when the blades are diagnosed continuously using structural health condition monitoring. These are considered as a pattern recognition problem which consists of three phases namely, feature extraction, feature selection, and feature classification. In this study, statistical features were extracted from vibration signals, feature selection was carried out using a J48 decision tree algorithm and feature classification was performed using best-first tree algorithm and functional trees algorithm. The better algorithm is suggested for fault diagnosis of wind turbine blade. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Analysis of the Relevance of Posts in Asynchronous Discussions
ERIC Educational Resources Information Center
Azevedo, Breno T.; Reategui, Eliseo; Behar, Patrícia A.
2014-01-01
This paper presents ForumMiner, a tool for the automatic analysis of students' posts in asynchronous discussions. ForumMiner uses a text mining system to extract graphs from texts that are given to students as a basis for their discussion. These graphs contain the most relevant terms found in the texts, as well as the relationships between them.…
Extraneous Information and Graph Comprehension: Implications for Effective Design Choices
ERIC Educational Resources Information Center
Stewart, Brandie M.; Cipolla, Jessica M.; Best, Lisa A.
2009-01-01
Purpose: The purpose of this paper is to examine if university students could accurately extract information from graphs presented in 2D or 3D formats with different colour hue variations or solid black and white. Design/methodology/approach: Participants are presented with 2D and 3D bar and pie charts in a PowerPoint presentation and are asked to…
Making Graphical Inferences: A Hierarchical Framework
2004-08-01
from graphs is considered one of the more complex skills graph readers should possess. According to the National Council of Teachers of Mathematics ...understanding graphical perception. Human Computer Interaction, 8, 353-388. NCTM : Standards for Mathematics . (2003, 2003). Pinker, S. (1990). A theory... NCTM ) the simplest type of question involves the extraction or comparison of a few explicitly represented data points (read-offs) ( NCTM : Standards
Labeled Graph Kernel for Behavior Analysis.
Zhao, Ruiqi; Martinez, Aleix M
2016-08-01
Automatic behavior analysis from video is a major topic in many areas of research, including computer vision, multimedia, robotics, biology, cognitive science, social psychology, psychiatry, and linguistics. Two major problems are of interest when analyzing behavior. First, we wish to automatically categorize observed behaviors into a discrete set of classes (i.e., classification). For example, to determine word production from video sequences in sign language. Second, we wish to understand the relevance of each behavioral feature in achieving this classification (i.e., decoding). For instance, to know which behavior variables are used to discriminate between the words apple and onion in American Sign Language (ASL). The present paper proposes to model behavior using a labeled graph, where the nodes define behavioral features and the edges are labels specifying their order (e.g., before, overlaps, start). In this approach, classification reduces to a simple labeled graph matching. Unfortunately, the complexity of labeled graph matching grows exponentially with the number of categories we wish to represent. Here, we derive a graph kernel to quickly and accurately compute this graph similarity. This approach is very general and can be plugged into any kernel-based classifier. Specifically, we derive a Labeled Graph Support Vector Machine (LGSVM) and a Labeled Graph Logistic Regressor (LGLR) that can be readily employed to discriminate between many actions (e.g., sign language concepts). The derived approach can be readily used for decoding too, yielding invaluable information for the understanding of a problem (e.g., to know how to teach a sign language). The derived algorithms allow us to achieve higher accuracy results than those of state-of-the-art algorithms in a fraction of the time. We show experimental results on a variety of problems and datasets, including multimodal data.
Interactive Web Graphs for Economic Principles.
ERIC Educational Resources Information Center
Kaufman, Dennis A.; Kaufman, Rebecca S.
2002-01-01
Describes a Web site with animation and interactive activities containing graphs and basic economics concepts. Features changes in supply and market equilibrium, the construction of the long-run average cost curve, short-run profit maximization, long-run market equilibrium, and changes in aggregate demand and aggregate supply. States the…
NASA Astrophysics Data System (ADS)
Ziemann, Amanda K.; Messinger, David W.; Albano, James A.; Basener, William F.
2012-06-01
Anomaly detection algorithms have historically been applied to hyperspectral imagery in order to identify pixels whose material content is incongruous with the background material in the scene. Typically, the application involves extracting man-made objects from natural and agricultural surroundings. A large challenge in designing these algorithms is determining which pixels initially constitute the background material within an image. The topological anomaly detection (TAD) algorithm constructs a graph theory-based, fully non-parametric topological model of the background in the image scene, and uses codensity to measure deviation from this background. In TAD, the initial graph theory structure of the image data is created by connecting an edge between any two pixel vertices x and y if the Euclidean distance between them is less than some resolution r. While this type of proximity graph is among the most well-known approaches to building a geometric graph based on a given set of data, there is a wide variety of dierent geometrically-based techniques. In this paper, we present a comparative test of the performance of TAD across four dierent constructs of the initial graph: mutual k-nearest neighbor graph, sigma-local graph for two different values of σ > 1, and the proximity graph originally implemented in TAD.
Renal cortex segmentation using optimal surface search with novel graph construction.
Li, Xiuli; Chen, Xinjian; Yao, Jianhua; Zhang, Xing; Tian, Jie
2011-01-01
In this paper, we propose a novel approach to solve the renal cortex segmentation problem, which has rarely been studied. In this study, the renal cortex segmentation problem is handled as a multiple-surfaces extraction problem, which is solved using the optimal surface search method. We propose a novel graph construction scheme in the optimal surface search to better accommodate multiple surfaces. Different surface sub-graphs are constructed according to their properties, and inter-surface relationships are also modeled in the graph. The proposed method was tested on 17 clinical CT datasets. The true positive volume fraction (TPVF) and false positive volume fraction (FPVF) are 74.10% and 0.08%, respectively. The experimental results demonstrate the effectiveness of the proposed method.
A tool for filtering information in complex systems
Tumminello, M.; Aste, T.; Di Matteo, T.; Mantegna, R. N.
2005-01-01
We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties. PMID:16027373
A tool for filtering information in complex systems.
Tumminello, M; Aste, T; Di Matteo, T; Mantegna, R N
2005-07-26
We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties.
Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor
Núñez, Pedro; Vázquez-Martín, Ricardo; Bandera, Antonio
2011-01-01
This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing the so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first one between sets of features associated to the images of the stereo pairs and the second one between sets of features associated to consecutive frames. With respect to previously proposed approaches, the main novelty of this proposal is that both matching algorithms are conducted by means of a fast matching algorithm which combines absolute and relative feature constraints. Finding the largest-valued set of mutually consistent matches is equivalent to finding the maximum-weighted clique on a graph. The stereo matching allows to represent the scene view as a graph which emerge from the features of the accepted clique. On the other hand, the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors to estimate the final displacement of the stereo camera between consecutive acquired frames. The proposed approach has been tested for mobile robotics navigation purposes in real environments and using different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields. PMID:22164016
Information jet: Handling noisy big data from weakly disconnected network
NASA Astrophysics Data System (ADS)
Aurongzeb, Deeder
Sudden aggregation (information jet) of large amount of data is ubiquitous around connected social networks, driven by sudden interacting and non-interacting events, network security threat attacks, online sales channel etc. Clustering of information jet based on time series analysis and graph theory is not new but little work is done to connect them with particle jet statistics. We show pre-clustering based on context can element soft network or network of information which is critical to minimize time to calculate results from noisy big data. We show difference between, stochastic gradient boosting and time series-graph clustering. For disconnected higher dimensional information jet, we use Kallenberg representation theorem (Kallenberg, 2005, arXiv:1401.1137) to identify and eliminate jet similarities from dense or sparse graph.
Detecting community structure via the maximal sub-graphs and belonging degrees in complex networks
NASA Astrophysics Data System (ADS)
Cui, Yaozu; Wang, Xingyuan; Eustace, Justine
2014-12-01
Community structure is a common phenomenon in complex networks, and it has been shown that some communities in complex networks often overlap each other. So in this paper we propose a new algorithm to detect overlapping community structure in complex networks. To identify the overlapping community structure, our algorithm firstly extracts fully connected sub-graphs which are maximal sub-graphs from original networks. Then two maximal sub-graphs having the key pair-vertices can be merged into a new larger sub-graph using some belonging degree functions. Furthermore we extend the modularity function to evaluate the proposed algorithm. In addition, overlapping nodes between communities are founded successfully. Finally we report the comparison between the modularity and the computational complexity of the proposed algorithm with some other existing algorithms. The experimental results show that the proposed algorithm gives satisfactory results.
A tool for filtering information in complex systems
NASA Astrophysics Data System (ADS)
Tumminello, M.; Aste, T.; Di Matteo, T.; Mantegna, R. N.
2005-07-01
We introduce a technique to filter out complex data sets by extracting a subgraph of representative links. Such a filtering can be tuned up to any desired level by controlling the genus of the resulting graph. We show that this technique is especially suitable for correlation-based graphs, giving filtered graphs that preserve the hierarchical organization of the minimum spanning tree but containing a larger amount of information in their internal structure. In particular in the case of planar filtered graphs (genus equal to 0), triangular loops and four-element cliques are formed. The application of this filtering procedure to 100 stocks in the U.S. equity markets shows that such loops and cliques have important and significant relationships with the market structure and properties. This paper was submitted directly (Track II) to the PNAS office.Abbreviations: MST, minimum spanning tree; PMFG, Planar Maximally Filtered Graph; r-clique, clique of r elements.
A Ranking Approach on Large-Scale Graph With Multidimensional Heterogeneous Information.
Wei, Wei; Gao, Bin; Liu, Tie-Yan; Wang, Taifeng; Li, Guohui; Li, Hang
2016-04-01
Graph-based ranking has been extensively studied and frequently applied in many applications, such as webpage ranking. It aims at mining potentially valuable information from the raw graph-structured data. Recently, with the proliferation of rich heterogeneous information (e.g., node/edge features and prior knowledge) available in many real-world graphs, how to effectively and efficiently leverage all information to improve the ranking performance becomes a new challenging problem. Previous methods only utilize part of such information and attempt to rank graph nodes according to link-based methods, of which the ranking performances are severely affected by several well-known issues, e.g., over-fitting or high computational complexity, especially when the scale of graph is very large. In this paper, we address the large-scale graph-based ranking problem and focus on how to effectively exploit rich heterogeneous information of the graph to improve the ranking performance. Specifically, we propose an innovative and effective semi-supervised PageRank (SSP) approach to parameterize the derived information within a unified semi-supervised learning framework (SSLF-GR), then simultaneously optimize the parameters and the ranking scores of graph nodes. Experiments on the real-world large-scale graphs demonstrate that our method significantly outperforms the algorithms that consider such graph information only partially.
A regularized approach for geodesic-based semisupervised multimanifold learning.
Fan, Mingyu; Zhang, Xiaoqin; Lin, Zhouchen; Zhang, Zhongfei; Bao, Hujun
2014-05-01
Geodesic distance, as an essential measurement for data dissimilarity, has been successfully used in manifold learning. However, most geodesic distance-based manifold learning algorithms have two limitations when applied to classification: 1) class information is rarely used in computing the geodesic distances between data points on manifolds and 2) little attention has been paid to building an explicit dimension reduction mapping for extracting the discriminative information hidden in the geodesic distances. In this paper, we regard geodesic distance as a kind of kernel, which maps data from linearly inseparable space to linear separable distance space. In doing this, a new semisupervised manifold learning algorithm, namely regularized geodesic feature learning algorithm, is proposed. The method consists of three techniques: a semisupervised graph construction method, replacement of original data points with feature vectors which are built by geodesic distances, and a new semisupervised dimension reduction method for feature vectors. Experiments on the MNIST, USPS handwritten digit data sets, MIT CBCL face versus nonface data set, and an intelligent traffic data set show the effectiveness of the proposed algorithm.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanfilippo, Antonio P.
2005-12-27
Graph theory is a branch of discrete combinatorial mathematics that studies the properties of graphs. The theory was pioneered by the Swiss mathematician Leonhard Euler in the 18th century, commenced its formal development during the second half of the 19th century, and has witnessed substantial growth during the last seventy years, with applications in areas as diverse as engineering, computer science, physics, sociology, chemistry and biology. Graph theory has also had a strong impact in computational linguistics by providing the foundations for the theory of features structures that has emerged as one of the most widely used frameworks for themore » representation of grammar formalisms.« less
Dimitriadis, Stavros I.; Zouridakis, George; Rezaie, Roozbeh; Babajani-Feremi, Abbas; Papanicolaou, Andrew C.
2015-01-01
Mild traumatic brain injury (mTBI) may affect normal cognition and behavior by disrupting the functional connectivity networks that mediate efficient communication among brain regions. In this study, we analyzed brain connectivity profiles from resting state Magnetoencephalographic (MEG) recordings obtained from 31 mTBI patients and 55 normal controls. We used phase-locking value estimates to compute functional connectivity graphs to quantify frequency-specific couplings between sensors at various frequency bands. Overall, normal controls showed a dense network of strong local connections and a limited number of long-range connections that accounted for approximately 20% of all connections, whereas mTBI patients showed networks characterized by weak local connections and strong long-range connections that accounted for more than 60% of all connections. Comparison of the two distinct general patterns at different frequencies using a tensor representation for the connectivity graphs and tensor subspace analysis for optimal feature extraction showed that mTBI patients could be separated from normal controls with 100% classification accuracy in the alpha band. These encouraging findings support the hypothesis that MEG-based functional connectivity patterns may be used as biomarkers that can provide more accurate diagnoses, help guide treatment, and monitor effectiveness of intervention in mTBI. PMID:26640764
Quantum walks on the chimera graph and its variants
NASA Astrophysics Data System (ADS)
Sanders, Barry; Sun, Xiangxiang; Xu, Shu; Wu, Jizhou; Zhang, Wei-Wei; Arshed, Nigum
We study quantum walks on the chimera graph, which is an important graph for performing quantum annealing, and we explore the nature of quantum walks on variants of the chimera graph. Features of these quantum walks provide profound insights into the nature of the chimera graph, including effects of greater and lesser connectivity, strong differences between quantum and classical random walks, isotropic spreading and localization only in the quantum case, and random graphs. We analyze finite-size effects due to limited width and length of the graph, and we explore the effect of different boundary conditions such as periodic and reflecting. Effects are explained via spectral analysis and the properties of stationary states, and spectral analysis enables us to characterize asymptotic behavior of the quantum walker in the long-time limit. Supported by China 1000 Talent Plan, National Science Foundation of China, Hefei National Laboratory for Physical Sciences at Microscale Fellowship, and the Chinese Academy of Sciences President's International Fellowship Initiative.
A Wave Chaotic Study of Quantum Graphs with Microwave Networks
NASA Astrophysics Data System (ADS)
Fu, Ziyuan
Quantum graphs provide a setting to test the hypothesis that all ray-chaotic systems show universal wave chaotic properties. I study the quantum graphs with a wave chaotic approach. Here, an experimental setup consisting of a microwave coaxial cable network is used to simulate quantum graphs. Some basic features and the distributions of impedance statistics are analyzed from experimental data on an ensemble of tetrahedral networks. The random coupling model (RCM) is applied in an attempt to uncover the universal statistical properties of the system. Deviations from RCM predictions have been observed in that the statistics of diagonal and off-diagonal impedance elements are different. Waves trapped due to multiple reflections on bonds between nodes in the graph most likely cause the deviations from universal behavior in the finite-size realization of a quantum graph. In addition, I have done some investigations on the Random Coupling Model, which are useful for further research.
Knowledge discovery by accuracy maximization
Cacciatore, Stefano; Luchinat, Claudio; Tenori, Leonardo
2014-01-01
Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold’s topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approach differs from previous methods in that it has an integrated procedure of validation of the results. In this way, the method ensures the highest robustness of the obtained solution. This robustness is demonstrated on experimental datasets of gene expression and metabolomics, where KODAMA compares favorably with other existing feature extraction methods. KODAMA is then applied to an astronomical dataset, revealing unexpected features. Interesting and not easily predictable features are also found in the analysis of the State of the Union speeches by American presidents: KODAMA reveals an abrupt linguistic transition sharply separating all post-Reagan from all pre-Reagan speeches. The transition occurs during Reagan’s presidency and not from its beginning. PMID:24706821
Low-Rank Discriminant Embedding for Multiview Learning.
Li, Jingjing; Wu, Yue; Zhao, Jidong; Lu, Ke
2017-11-01
This paper focuses on the specific problem of multiview learning where samples have the same feature set but different probability distributions, e.g., different viewpoints or different modalities. Since samples lying in different distributions cannot be compared directly, this paper aims to learn a latent subspace shared by multiple views assuming that the input views are generated from this latent subspace. Previous approaches usually learn the common subspace by either maximizing the empirical likelihood, or preserving the geometric structure. However, considering the complementarity between the two objectives, this paper proposes a novel approach, named low-rank discriminant embedding (LRDE), for multiview learning by taking full advantage of both sides. By further considering the duality between data points and features of multiview scene, i.e., data points can be grouped based on their distribution on features, while features can be grouped based on their distribution on the data points, LRDE not only deploys low-rank constraints on both sample level and feature level to dig out the shared factors across different views, but also preserves geometric information in both the ambient sample space and the embedding feature space by designing a novel graph structure under the framework of graph embedding. Finally, LRDE jointly optimizes low-rank representation and graph embedding in a unified framework. Comprehensive experiments in both multiview manner and pairwise manner demonstrate that LRDE performs much better than previous approaches proposed in recent literatures.
Variation in Water Content in Martian Subsurface Along Curiosity Traverse
2013-03-18
This set of graphs shows variation in the amount and the depth of water detected beneath NASA Mars rover Curiosity by use of the rover Dynamic Albedo of Neutrons DAN instrument at different points the rover has driven.
Batalle, Dafnis; Muñoz-Moreno, Emma; Figueras, Francesc; Bargallo, Nuria; Eixarch, Elisenda; Gratacos, Eduard
2013-12-01
Obtaining individual biomarkers for the prediction of altered neurological outcome is a challenge of modern medicine and neuroscience. Connectomics based on magnetic resonance imaging (MRI) stands as a good candidate to exhaustively extract information from MRI by integrating the information obtained in a few network features that can be used as individual biomarkers of neurological outcome. However, this approach typically requires the use of diffusion and/or functional MRI to extract individual brain networks, which require high acquisition times and present an extreme sensitivity to motion artifacts, critical problems when scanning fetuses and infants. Extraction of individual networks based on morphological similarity from gray matter is a new approach that benefits from the power of graph theory analysis to describe gray matter morphology as a large-scale morphological network from a typical clinical anatomic acquisition such as T1-weighted MRI. In the present paper we propose a methodology to normalize these large-scale morphological networks to a brain network with standardized size based on a parcellation scheme. The proposed methodology was applied to reconstruct individual brain networks of 63 one-year-old infants, 41 infants with intrauterine growth restriction (IUGR) and 22 controls, showing altered network features in the IUGR group, and their association with neurodevelopmental outcome at two years of age by means of ordinal regression analysis of the network features obtained with Bayley Scale for Infant and Toddler Development, third edition. Although it must be more widely assessed, this methodology stands as a good candidate for the development of biomarkers for altered neurodevelopment in the pediatric population. © 2013 Elsevier Inc. All rights reserved.
Scaling Semantic Graph Databases in Size and Performance
DOE Office of Scientific and Technical Information (OSTI.GOV)
Morari, Alessandro; Castellana, Vito G.; Villa, Oreste
In this paper we present SGEM, a full software system for accelerating large-scale semantic graph databases on commodity clusters. Unlike current approaches, SGEM addresses semantic graph databases by only employing graph methods at all the levels of the stack. On one hand, this allows exploiting the space efficiency of graph data structures and the inherent parallelism of graph algorithms. These features adapt well to the increasing system memory and core counts of modern commodity clusters. On the other hand, however, these systems are optimized for regular computation and batched data transfers, while graph methods usually are irregular and generate fine-grainedmore » data accesses with poor spatial and temporal locality. Our framework comprises a SPARQL to data parallel C compiler, a library of parallel graph methods and a custom, multithreaded runtime system. We introduce our stack, motivate its advantages with respect to other solutions and show how we solved the challenges posed by irregular behaviors. We present the result of our software stack on the Berlin SPARQL benchmarks with datasets up to 10 billion triples (a triple corresponds to a graph edge), demonstrating scaling in dataset size and in performance as more nodes are added to the cluster.« less
Abou Zeid, Elias; Rezazadeh Sereshkeh, Alborz; Schultz, Benjamin; Chau, Tom
2017-01-01
In recent years, the readiness potential (RP), a type of pre-movement neural activity, has been investigated for asynchronous electroencephalogram (EEG)-based brain-computer interfaces (BCIs). Since the RP is attenuated for involuntary movements, a BCI driven by RP alone could facilitate intentional control amid a plethora of unintentional movements. Previous studies have mainly attempted binary single-trial classification of RP. An RP-based BCI with three or more states would expand the options for functional control. Here, we propose a ternary BCI based on single-trial RPs. This BCI classifies amongst an idle state, a left hand and a right hand self-initiated fine movement. A pipeline of spatio-temporal filtering with per participant parameter optimization was used for feature extraction. The ternary classification was decomposed into binary classifications using a decision-directed acyclic graph (DDAG). For each class pair in the DDAG structure, an ordered diversified classifier system (ODCS-DDAG) was used to select the best among various classification algorithms or to combine the results of different classification algorithms. Using EEG data from 14 participants performing self-initiated left or right key presses, punctuated with rest periods, we compared the performance of ODCS-DDAG to a ternary classifier and four popular multiclass decomposition methods using only a single classification algorithm. ODCS-DDAG had the highest performance (0.769 Cohen's Kappa score) and was significantly better than the ternary classifier and two of the four multiclass decomposition methods. Our work supports further study of RP-based BCI for intuitive asynchronous environmental control or augmentative communication. PMID:28596725
Terrain-driven unstructured mesh development through semi-automatic vertical feature extraction
NASA Astrophysics Data System (ADS)
Bilskie, Matthew V.; Coggin, David; Hagen, Scott C.; Medeiros, Stephen C.
2015-12-01
A semi-automated vertical feature terrain extraction algorithm is described and applied to a two-dimensional, depth-integrated, shallow water equation inundation model. The extracted features describe what are commonly sub-mesh scale elevation details (ridge and valleys), which may be ignored in standard practice because adequate mesh resolution cannot be afforded. The extraction algorithm is semi-automated, requires minimal human intervention, and is reproducible. A lidar-derived digital elevation model (DEM) of coastal Mississippi and Alabama serves as the source data for the vertical feature extraction. Unstructured mesh nodes and element edges are aligned to the vertical features and an interpolation algorithm aimed at minimizing topographic elevation error assigns elevations to mesh nodes via the DEM. The end result is a mesh that accurately represents the bare earth surface as derived from lidar with element resolution in the floodplain ranging from 15 m to 200 m. To examine the influence of the inclusion of vertical features on overland flooding, two additional meshes were developed, one without crest elevations of the features and another with vertical features withheld. All three meshes were incorporated into a SWAN+ADCIRC model simulation of Hurricane Katrina. Each of the three models resulted in similar validation statistics when compared to observed time-series water levels at gages and post-storm collected high water marks. Simulated water level peaks yielded an R2 of 0.97 and upper and lower 95% confidence interval of ∼ ± 0.60 m. From the validation at the gages and HWM locations, it was not clear which of the three model experiments performed best in terms of accuracy. Examination of inundation extent among the three model results were compared to debris lines derived from NOAA post-event aerial imagery, and the mesh including vertical features showed higher accuracy. The comparison of model results to debris lines demonstrates that additional validation techniques are necessary for state-of-the-art flood inundation models. In addition, the semi-automated, unstructured mesh generation process presented herein increases the overall accuracy of simulated storm surge across the floodplain without reliance on hand digitization or sacrificing computational cost.
Individual classification of Alzheimer's disease with diffusion magnetic resonance imaging.
Schouten, Tijn M; Koini, Marisa; Vos, Frank de; Seiler, Stephan; Rooij, Mark de; Lechner, Anita; Schmidt, Reinhold; Heuvel, Martijn van den; Grond, Jeroen van der; Rombouts, Serge A R B
2017-05-15
Diffusion magnetic resonance imaging (MRI) is a powerful non-invasive method to study white matter integrity, and is sensitive to detect differences in Alzheimer's disease (AD) patients. Diffusion MRI may be able to contribute towards reliable diagnosis of AD. We used diffusion MRI to classify AD patients (N=77), and controls (N=173). We use different methods to extract information from the diffusion MRI data. First, we use the voxel-wise diffusion tensor measures that have been skeletonised using tract based spatial statistics. Second, we clustered the voxel-wise diffusion measures with independent component analysis (ICA), and extracted the mixing weights. Third, we determined structural connectivity between Harvard Oxford atlas regions with probabilistic tractography, as well as graph measures based on these structural connectivity graphs. Classification performance for voxel-wise measures ranged between an AUC of 0.888, and 0.902. The ICA-clustered measures ranged between an AUC of 0.893, and 0.920. The AUC for the structural connectivity graph was 0.900, while graph measures based upon this graph ranged between an AUC of 0.531, and 0.840. All measures combined with a sparse group lasso resulted in an AUC of 0.896. Overall, fractional anisotropy clustered into ICA components was the best performing measure. These findings may be useful for future incorporation of diffusion MRI into protocols for AD classification, or as a starting point for early detection of AD using diffusion MRI. Copyright © 2017 Elsevier Inc. All rights reserved.
System analysis through bond graph modeling
NASA Astrophysics Data System (ADS)
McBride, Robert Thomas
2005-07-01
Modeling and simulation form an integral role in the engineering design process. An accurate mathematical description of a system provides the design engineer the flexibility to perform trade studies quickly and accurately to expedite the design process. Most often, the mathematical model of the system contains components of different engineering disciplines. A modeling methodology that can handle these types of systems might be used in an indirect fashion to extract added information from the model. This research examines the ability of a modeling methodology to provide added insight into system analysis and design. The modeling methodology used is bond graph modeling. An investigation into the creation of a bond graph model using the Lagrangian of the system is provided. Upon creation of the bond graph, system analysis is performed. To aid in the system analysis, an object-oriented approach to bond graph modeling is introduced. A framework is provided to simulate the bond graph directly. Through object-oriented simulation of a bond graph, the information contained within the bond graph can be exploited to create a measurement of system efficiency. A definition of system efficiency is given. This measurement of efficiency is used in the design of different controllers of varying architectures. Optimal control of a missile autopilot is discussed within the framework of the calculated system efficiency.
An enhanced digital line graph design
Guptill, Stephen C.
1990-01-01
In response to increasing information demands on its digital cartographic data, the U.S. Geological Survey has designed an enhanced version of the Digital Line Graph, termed Digital Line Graph - Enhanced (DLG-E). In the DLG-E model, the phenomena represented by geographic and cartographic data are termed entities. Entities represent individual phenomena in the real world. A feature is an abstraction of a set of entities, with the feature description encompassing only selected properties of the entities (typically the properties that have been portrayed cartographically on a map). Buildings, bridges, roads, streams, grasslands, and counties are examples of features. A feature instance, that is, one occurrence of a feature, is described in the digital environment by feature objects and spatial objects. A feature object identifies a feature instance and its nonlocational attributes. Nontopological relationships are associated with feature objects. The locational aspects of the feature instance are represented by spatial objects. Four spatial objects (points, nodes, chains, and polygons) and their topological relationships are defined. To link the locational and nonlocational aspects of the feature instance, a given feature object is associated with (or is composed of) a set of spatial objects. These objects, attributes, and relationships are the components of the DLG-E data model. To establish a domain of features for DLG-E, an approach using a set of classes, or views, of spatial entities was adopted. The five views that were developed are cover, division, ecosystem, geoposition, and morphology. The views are exclusive; each view is a self-contained analytical approach to the entire range of world features. Because each view is independent of the others, a single point on the surface of the Earth can be represented under multiple views. Under the five views, over 200 features were identified and defined. This set constitutes an initial domain of DLG-E features.
Chaotic Traversal (CHAT): Very Large Graphs Traversal Using Chaotic Dynamics
NASA Astrophysics Data System (ADS)
Changaival, Boonyarit; Rosalie, Martin; Danoy, Grégoire; Lavangnananda, Kittichai; Bouvry, Pascal
2017-12-01
Graph Traversal algorithms can find their applications in various fields such as routing problems, natural language processing or even database querying. The exploration can be considered as a first stepping stone into knowledge extraction from the graph which is now a popular topic. Classical solutions such as Breadth First Search (BFS) and Depth First Search (DFS) require huge amounts of memory for exploring very large graphs. In this research, we present a novel memoryless graph traversal algorithm, Chaotic Traversal (CHAT) which integrates chaotic dynamics to traverse large unknown graphs via the Lozi map and the Rössler system. To compare various dynamics effects on our algorithm, we present an original way to perform the exploration of a parameter space using a bifurcation diagram with respect to the topological structure of attractors. The resulting algorithm is an efficient and nonresource demanding algorithm, and is therefore very suitable for partial traversal of very large and/or unknown environment graphs. CHAT performance using Lozi map is proven superior than the, commonly known, Random Walk, in terms of number of nodes visited (coverage percentage) and computation time where the environment is unknown and memory usage is restricted.
Historical contingency in fluviokarst landscape evolution
NASA Astrophysics Data System (ADS)
Phillips, Jonathan D.
2018-02-01
Lateral and vertical erosion at meander bends in the Kentucky River gorge area has created a series of strath terraces on the interior of incised meander bends. These represent a chronosequence of fluviokarst landscape evolution from the youngest valley side transition zone near the valley bottom to the oldest upland surface. This five-part chronosequence (not including the active river channel and floodplain) was analyzed in terms of the landforms that occur at each stage or surface. These include dolines, uvalas, karst valleys, pocket valleys, unincised channels, incised channels, and cliffs (smaller features such as swallets and shafts also occur). Landform coincidence analysis shows higher coincidence indices (CI) than would be expected based on an idealized chronosequence. CI values indicate genetic relationships (common causality) among some landforms and unexpected persistence of some features on older surfaces. The idealized and two observed chronosequences were also represented as graphs and analyzed using algebraic graph theory. The two field sites yielded graphs more complex and with less historical contingency than the idealized sequence. Indeed, some of the spectral graph measures for the field sites more closely approximate a purely hypothetical no-historical-contingency benchmark graph. The deviations of observations from the idealized expectations, and the high levels of graph complexity both point to potential transitions among landform types as being the dominant phenomenon, rather than canalization along a particular evolutionary pathway. As the base level of both the fluvial and karst landforms is lowered as the meanders expand, both fluvial and karst denudation are rejuvenated, and landform transitions remain active.
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-08-14
RolX takes the features from Re-FeX or any other feature matrix as input and outputs role assignments (clusters). The output of RolX is a csv file containing the node-role memberships and a csv file containing the role-feature definitions.
Valavanis, Ioannis; Pilalis, Eleftherios; Georgiadis, Panagiotis; Kyrtopoulos, Soterios; Chatziioannou, Aristotelis
2015-01-01
DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied, encompassing epidemiological genome-scale DNA methylation data produced from the Illumina’s Infinium Human Methylation 450K Bead Chip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and assess their classification power for discriminating healthy versus cancer related classes. Feature selection exploits evolutionary algorithms or a graph-theoretic methodology which makes use of the semantics information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when imported to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly with evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies. PMID:27600245
Prediction of Oncogenic Interactions and Cancer-Related Signaling Networks Based on Network Topology
Acencio, Marcio Luis; Bovolenta, Luiz Augusto; Camilo, Esther; Lemke, Ney
2013-01-01
Cancer has been increasingly recognized as a systems biology disease since many investigators have demonstrated that this malignant phenotype emerges from abnormal protein-protein, regulatory and metabolic interactions induced by simultaneous structural and regulatory changes in multiple genes and pathways. Therefore, the identification of oncogenic interactions and cancer-related signaling networks is crucial for better understanding cancer. As experimental techniques for determining such interactions and signaling networks are labor-intensive and time-consuming, the development of a computational approach capable to accomplish this task would be of great value. For this purpose, we present here a novel computational approach based on network topology and machine learning capable to predict oncogenic interactions and extract relevant cancer-related signaling subnetworks from an integrated network of human genes interactions (INHGI). This approach, called graph2sig, is twofold: first, it assigns oncogenic scores to all interactions in the INHGI and then these oncogenic scores are used as edge weights to extract oncogenic signaling subnetworks from INHGI. Regarding the prediction of oncogenic interactions, we showed that graph2sig is able to recover 89% of known oncogenic interactions with a precision of 77%. Moreover, the interactions that received high oncogenic scores are enriched in genes for which mutations have been causally implicated in cancer. We also demonstrated that graph2sig is potentially useful in extracting oncogenic signaling subnetworks: more than 80% of constructed subnetworks contain more than 50% of original interactions in their corresponding oncogenic linear pathways present in the KEGG PATHWAY database. In addition, the potential oncogenic signaling subnetworks discovered by graph2sig are supported by experimental evidence. Taken together, these results suggest that graph2sig can be a useful tool for investigators involved in cancer research interested in detecting signaling networks most prone to contribute with the emergence of malignant phenotype. PMID:24204854
Katwal, Santosh B; Gore, John C; Marois, Rene; Rogers, Baxter P
2013-09-01
We present novel graph-based visualizations of self-organizing maps for unsupervised functional magnetic resonance imaging (fMRI) analysis. A self-organizing map is an artificial neural network model that transforms high-dimensional data into a low-dimensional (often a 2-D) map using unsupervised learning. However, a postprocessing scheme is necessary to correctly interpret similarity between neighboring node prototypes (feature vectors) on the output map and delineate clusters and features of interest in the data. In this paper, we used graph-based visualizations to capture fMRI data features based upon 1) the distribution of data across the receptive fields of the prototypes (density-based connectivity); and 2) temporal similarities (correlations) between the prototypes (correlation-based connectivity). We applied this approach to identify task-related brain areas in an fMRI reaction time experiment involving a visuo-manual response task, and we correlated the time-to-peak of the fMRI responses in these areas with reaction time. Visualization of self-organizing maps outperformed independent component analysis and voxelwise univariate linear regression analysis in identifying and classifying relevant brain regions. We conclude that the graph-based visualizations of self-organizing maps help in advanced visualization of cluster boundaries in fMRI data enabling the separation of regions with small differences in the timings of their brain responses.
Image segmentation by hierarchial agglomeration of polygons using ecological statistics
Prasad, Lakshman; Swaminarayan, Sriram
2013-04-23
A method for rapid hierarchical image segmentation based on perceptually driven contour completion and scene statistics is disclosed. The method begins with an initial fine-scale segmentation of an image, such as obtained by perceptual completion of partial contours into polygonal regions using region-contour correspondences established by Delaunay triangulation of edge pixels as implemented in VISTA. The resulting polygons are analyzed with respect to their size and color/intensity distributions and the structural properties of their boundaries. Statistical estimates of granularity of size, similarity of color, texture, and saliency of intervening boundaries are computed and formulated into logical (Boolean) predicates. The combined satisfiability of these Boolean predicates by a pair of adjacent polygons at a given segmentation level qualifies them for merging into a larger polygon representing a coarser, larger-scale feature of the pixel image and collectively obtains the next level of polygonal segments in a hierarchy of fine-to-coarse segmentations. The iterative application of this process precipitates textured regions as polygons with highly convolved boundaries and helps distinguish them from objects which typically have more regular boundaries. The method yields a multiscale decomposition of an image into constituent features that enjoy a hierarchical relationship with features at finer and coarser scales. This provides a traversable graph structure from which feature content and context in terms of other features can be derived, aiding in automated image understanding tasks. The method disclosed is highly efficient and can be used to decompose and analyze large images.
Asada, Yukiko; Abel, Hannah; Skedgel, Chris; Warner, Grace
2017-12-01
Policy Points: Effective graphs can be a powerful tool in communicating health inequality. The choice of graphs is often based on preferences and familiarity rather than science. According to the literature on graph perception, effective graphs allow human brains to decode visual cues easily. Dot charts are easier to decode than bar charts, and thus they are more effective. Dot charts are a flexible and versatile way to display information about health inequality. Consistent with the health risk communication literature, the captions accompanying health inequality graphs should provide a numerical, explicitly calculated description of health inequality, expressed in absolute and relative terms, from carefully thought-out perspectives. Graphs are an essential tool for communicating health inequality, a key health policy concern. The choice of graphs is often driven by personal preferences and familiarity. Our article is aimed at health policy researchers developing health inequality graphs for policy and scientific audiences and seeks to (1) raise awareness of the effective use of graphs in communicating health inequality; (2) advocate for a particular type of graph (ie, dot charts) to depict health inequality; and (3) suggest key considerations for the captions accompanying health inequality graphs. Using composite review methods, we selected the prevailing recommendations for improving graphs in scientific reporting. To find the origins of these recommendations, we reviewed the literature on graph perception and then applied what we learned to the context of health inequality. In addition, drawing from the numeracy literature in health risk communication, we examined numeric and verbal formats to explain health inequality graphs. Many disciplines offer commonsense recommendations for visually presenting quantitative data. The literature on graph perception, which defines effective graphs as those allowing the easy decoding of visual cues in human brains, shows that with their more accurate and easier-to-decode visual cues, dot charts are more effective than bar charts. Dot charts can flexibly present a large amount of information in limited space. They also can easily accommodate typical health inequality information to describe a health variable (eg, life expectancy) by an inequality domain (eg, income) with domain groups (eg, poor and rich) in a population (eg, Canada) over time periods (eg, 2010 and 2017). The numeracy literature suggests that a health inequality graph's caption should provide a numerical, explicitly calculated description of health inequality expressed in absolute and relative terms, from carefully thought-out perspectives. Given the ubiquity of graphs, the health inequality field should learn from the vibrant multidisciplinary literature how to construct effective graphic communications, especially by considering to use dot charts. © 2017 Milbank Memorial Fund.
Visual Routines for Extracting Magnitude Relations
ERIC Educational Resources Information Center
Michal, Audrey L.; Uttal, David; Shah, Priti; Franconeri, Steven L.
2016-01-01
Linking relations described in text with relations in visualizations is often difficult. We used eye tracking to measure the optimal way to extract such relations in graphs, college students, and young children (6- and 8-year-olds). Participants compared relational statements ("Are there more blueberries than oranges?") with simple…
No-reference image quality assessment based on statistics of convolution feature maps
NASA Astrophysics Data System (ADS)
Lv, Xiaoxin; Qin, Min; Chen, Xiaohui; Wei, Guo
2018-04-01
We propose a Convolutional Feature Maps (CFM) driven approach to accurately predict image quality. Our motivation bases on the finding that the Nature Scene Statistic (NSS) features on convolution feature maps are significantly sensitive to distortion degree of an image. In our method, a Convolutional Neural Network (CNN) is trained to obtain kernels for generating CFM. We design a forward NSS layer which performs on CFM to better extract NSS features. The quality aware features derived from the output of NSS layer is effective to describe the distortion type and degree an image suffered. Finally, a Support Vector Regression (SVR) is employed in our No-Reference Image Quality Assessment (NR-IQA) model to predict a subjective quality score of a distorted image. Experiments conducted on two public databases demonstrate the promising performance of the proposed method is competitive to state of the art NR-IQA methods.
Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M.; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong
2016-01-01
Highlights We develop computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging.Wavelet entropy is introduced to extract image global features from brain images. Directed acyclic graph is employed to endow support vector machine an ability to handle multi-class problems.The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated to many neurodegenerative disease. Now more and more computer vision based methods are using to detect it in an automatic way. Materials: We have in total 49 subjects, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects contain 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was selected from the magnetic resonance images of each subjects, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: The 10 repetition results of 10-fold cross validation shows 3-level decomposition will yield an overall accuracy of 95.10% for this three-class classification problem, higher than feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision method for detecting hearing loss. PMID:27807415
Wang, Shuihua; Yang, Ming; Du, Sidan; Yang, Jiquan; Liu, Bin; Gorriz, Juan M; Ramírez, Javier; Yuan, Ti-Fei; Zhang, Yudong
2016-01-01
Highlights We develop computer-aided diagnosis system for unilateral hearing loss detection in structural magnetic resonance imaging.Wavelet entropy is introduced to extract image global features from brain images. Directed acyclic graph is employed to endow support vector machine an ability to handle multi-class problems.The developed computer-aided diagnosis system achieves an overall accuracy of 95.1% for this three-class problem of differentiating left-sided and right-sided hearing loss from healthy controls. Aim: Sensorineural hearing loss (SNHL) is correlated to many neurodegenerative disease. Now more and more computer vision based methods are using to detect it in an automatic way. Materials: We have in total 49 subjects, scanned by 3.0T MRI (Siemens Medical Solutions, Erlangen, Germany). The subjects contain 14 patients with right-sided hearing loss (RHL), 15 patients with left-sided hearing loss (LHL), and 20 healthy controls (HC). Method: We treat this as a three-class classification problem: RHL, LHL, and HC. Wavelet entropy (WE) was selected from the magnetic resonance images of each subjects, and then submitted to a directed acyclic graph support vector machine (DAG-SVM). Results: The 10 repetition results of 10-fold cross validation shows 3-level decomposition will yield an overall accuracy of 95.10% for this three-class classification problem, higher than feedforward neural network, decision tree, and naive Bayesian classifier. Conclusions: This computer-aided diagnosis system is promising. We hope this study can attract more computer vision method for detecting hearing loss.
Discriminative graph embedding for label propagation.
Nguyen, Canh Hao; Mamitsuka, Hiroshi
2011-09-01
In many applications, the available information is encoded in graph structures. This is a common problem in biological networks, social networks, web communities and document citations. We investigate the problem of classifying nodes' labels on a similarity graph given only a graph structure on the nodes. Conventional machine learning methods usually require data to reside in some Euclidean spaces or to have a kernel representation. Applying these methods to nodes on graphs would require embedding the graphs into these spaces. By embedding and then learning the nodes on graphs, most methods are either flexible with different learning objectives or efficient enough for large scale applications. We propose a method to embed a graph into a feature space for a discriminative purpose. Our idea is to include label information into the embedding process, making the space representation tailored to the task. We design embedding objective functions that the following learning formulations become spectral transforms. We then reformulate these spectral transforms into multiple kernel learning problems. Our method, while being tailored to the discriminative tasks, is efficient and can scale to massive data sets. We show the need of discriminative embedding on some simulations. Applying to biological network problems, our method is shown to outperform baselines.
Collaborative mining and transfer learning for relational data
NASA Astrophysics Data System (ADS)
Levchuk, Georgiy; Eslami, Mohammed
2015-06-01
Many of the real-world problems, - including human knowledge, communication, biological, and cyber network analysis, - deal with data entities for which the essential information is contained in the relations among those entities. Such data must be modeled and analyzed as graphs, with attributes on both objects and relations encode and differentiate their semantics. Traditional data mining algorithms were originally designed for analyzing discrete objects for which a set of features can be defined, and thus cannot be easily adapted to deal with graph data. This gave rise to the relational data mining field of research, of which graph pattern learning is a key sub-domain [11]. In this paper, we describe a model for learning graph patterns in collaborative distributed manner. Distributed pattern learning is challenging due to dependencies between the nodes and relations in the graph, and variability across graph instances. We present three algorithms that trade-off benefits of parallelization and data aggregation, compare their performance to centralized graph learning, and discuss individual benefits and weaknesses of each model. Presented algorithms are designed for linear speedup in distributed computing environments, and learn graph patterns that are both closer to ground truth and provide higher detection rates than centralized mining algorithm.
Use of Time-Aware Language Model in Entity Driven Filtering System
2014-11-01
knowledge graph explo- ration. In the case where no Wikipedia page were found for an entity, the set remains empty; - a set of surface forms found using...Interna- tional Conference on Weblogs and Social Media, Barcelona, Catalonia , Spain, July 17-21, 2011.
The relationship between 2D static features and 2D dynamic features used in gait recognition
NASA Astrophysics Data System (ADS)
Alawar, Hamad M.; Ugail, Hassan; Kamala, Mumtaz; Connah, David
2013-05-01
In most gait recognition techniques, both static and dynamic features are used to define a subject's gait signature. In this study, the existence of a relationship between static and dynamic features was investigated. The correlation coefficient was used to analyse the relationship between the features extracted from the "University of Bradford Multi-Modal Gait Database". This study includes two dimensional dynamic and static features from 19 subjects. The dynamic features were compromised of Phase-Weighted Magnitudes driven by a Fourier Transform of the temporal rotational data of a subject's joints (knee, thigh, shoulder, and elbow). The results concluded that there are eleven pairs of features that are considered significantly correlated with (p<0.05). This result indicates the existence of a statistical relationship between static and dynamics features, which challenges the results of several similar studies. These results bare great potential for further research into the area, and would potentially contribute to the creation of a gait signature using latent data.
Graph-based analysis of kinetics on multidimensional potential-energy surfaces.
Okushima, T; Niiyama, T; Ikeda, K S; Shimizu, Y
2009-09-01
The aim of this paper is twofold: one is to give a detailed description of an alternative graph-based analysis method, which we call saddle connectivity graph, for analyzing the global topography and the dynamical properties of many-dimensional potential-energy landscapes and the other is to give examples of applications of this method in the analysis of the kinetics of realistic systems. A Dijkstra-type shortest path algorithm is proposed to extract dynamically dominant transition pathways by kinetically defining transition costs. The applicability of this approach is first confirmed by an illustrative example of a low-dimensional random potential. We then show that a coarse-graining procedure tailored for saddle connectivity graphs can be used to obtain the kinetic properties of 13- and 38-atom Lennard-Jones clusters. The coarse-graining method not only reduces the complexity of the graphs, but also, with iterative use, reveals a self-similar hierarchical structure in these clusters. We also propose that the self-similarity is common to many-atom Lennard-Jones clusters.
Mathematical formula recognition using graph grammar
NASA Astrophysics Data System (ADS)
Lavirotte, Stephane; Pottier, Loic
1998-04-01
This paper describes current results of Ofr, a system for extracting and understanding mathematical expressions in documents. Such a tool could be really useful to be able to re-use knowledge in scientific books which are not available in electronic form. We currently also study use of this system for direct input of formulas with a graphical tablet for computer algebra system softwares. Existing solutions for mathematical recognition have problems to analyze 2D expressions like vectors and matrices. This is because they often try to use extended classical grammar to analyze formulas, relatively to baseline. But a lot of mathematical notations do not respect rules for such a parsing and that is the reason why they fail to extend text parsing technic. We investigate graph grammar and graph rewriting as a solution to recognize 2D mathematical notations. Graph grammar provide a powerful formalism to describe structural manipulations of multi-dimensional data. The main two problems to solve are ambiguities between rules of grammar and construction of graph.
Price, Melanie; Cameron, Rachel; Butow, Phyllis
2007-12-01
Statistical health risk information has proved notoriously confusing and difficult to understand. While past research indicates that presenting risk information in a frequency format is superior to relative risk and probability formats, the optimal characteristics of frequency formats are still unclear. The aim of this study is to determine the features of 1000 person frequency diagrams (pictographs) which result in the greatest speed and accuracy of graphical perception. Participants estimated the difference in chance of survival when taking or not taking Drug A, on a pictograph format, varying by mode (one-graph/two-graph), direction (vertical/horizontal), and shading (shaded/unshaded), and their preferences for the different formats. Their understanding of different components of the 1000 person diagram was assessed. Responses were timed and scored for accuracy. Horizontal pictographs were perceived faster and more accurately than vertical formats. Two-graph pictographs were perceived faster than one-graph formats. Shading reduced response time in two-graph formats, but increased response times in one-graph formats. Shaded and one-graph pictographs were preferred. As shading and one-graph formats were preferred, further clarification as to why shading negatively impacts on response times in the one-graph format is warranted. Horizontal pictographs are optimal.
FPFH-based graph matching for 3D point cloud registration
NASA Astrophysics Data System (ADS)
Zhao, Jiapeng; Li, Chen; Tian, Lihua; Zhu, Jihua
2018-04-01
Correspondence detection is a vital step in point cloud registration and it can help getting a reliable initial alignment. In this paper, we put forward an advanced point feature-based graph matching algorithm to solve the initial alignment problem of rigid 3D point cloud registration with partial overlap. Specifically, Fast Point Feature Histograms are used to determine the initial possible correspondences firstly. Next, a new objective function is provided to make the graph matching more suitable for partially overlapping point cloud. The objective function is optimized by the simulated annealing algorithm for final group of correct correspondences. Finally, we present a novel set partitioning method which can transform the NP-hard optimization problem into a O(n3)-solvable one. Experiments on the Stanford and UWA public data sets indicates that our method can obtain better result in terms of both accuracy and time cost compared with other point cloud registration methods.
Mathematics make microbes beautiful, beneficial, and bountiful.
Jungck, John R
2012-01-01
Microbiology is a rich area for visualizing the importance of mathematics in terms of designing experiments, data mining, testing hypotheses, and visualizing relationships. Historically, Nobel Prizes have acknowledged the close interplay between mathematics and microbiology in such examples as the fluctuation test and mutation rates using Poisson statistics by Luria and Delbrück and the use of graph theory of polyhedra by Caspar and Klug. More and more contemporary microbiology journals feature mathematical models, computational algorithms and heuristics, and multidimensional visualizations. While revolutions in research have driven these initiatives, a commensurate effort needs to be made to incorporate much more mathematics into the professional preparation of microbiologists. In order not to be daunting to many educators, a Bloom-like "Taxonomy of Quantitative Reasoning" is shared with explicit examples of microbiological activities for engaging students in (a) counting, measuring, calculating using image analysis of bacterial colonies and viral infections on variegated leaves, measurement of fractal dimensions of beautiful colony morphologies, and counting vertices, edges, and faces on viral capsids and using graph theory to understand self assembly; (b) graphing, mapping, ordering by applying linear, exponential, and logistic growth models of public health and sanitation problems, revisiting Snow's epidemiological map of cholera with computational geometry, and using interval graphs to do complementation mapping, deletion mapping, food webs, and microarray heatmaps; (c) problem solving by doing gene mapping and experimental design, and applying Boolean algebra to gene regulation of operons; (d) analysis of the "Bacterial Bonanza" of microbial sequence and genomic data using bioinformatics and phylogenetics; (e) hypothesis testing-again with phylogenetic trees and use of Poisson statistics and the Luria-Delbrück fluctuation test; and (f) modeling of biodiversity by using game theory, of epidemics with algebraic models, bacterial motion by using motion picture analysis and fluid mechanics of motility in multiple dimensions through the physics of "Life at Low Reynolds Numbers," and pattern formation of quorum sensing bacterial populations. Through a developmental model for preprofessional education that emphasizes the beauty, utility, and diversity of microbiological systems, we hope to foster creativity as well as mathematically rigorous reasoning. Copyright © 2012 Elsevier Inc. All rights reserved.
Categorical Structure among Shared Features in Networks of Early-Learned Nouns
ERIC Educational Resources Information Center
Hills, Thomas T.; Maouene, Mounir; Maouene, Josita; Sheya, Adam; Smith, Linda
2009-01-01
The shared features that characterize the noun categories that young children learn first are a formative basis of the human category system. To investigate the potential categorical information contained in the features of early-learned nouns, we examine the graph-theoretic properties of noun-feature networks. The networks are built from the…
Constant size descriptors for accurate machine learning models of molecular properties
NASA Astrophysics Data System (ADS)
Collins, Christopher R.; Gordon, Geoffrey J.; von Lilienfeld, O. Anatole; Yaron, David J.
2018-06-01
Two different classes of molecular representations for use in machine learning of thermodynamic and electronic properties are studied. The representations are evaluated by monitoring the performance of linear and kernel ridge regression models on well-studied data sets of small organic molecules. One class of representations studied here counts the occurrence of bonding patterns in the molecule. These require only the connectivity of atoms in the molecule as may be obtained from a line diagram or a SMILES string. The second class utilizes the three-dimensional structure of the molecule. These include the Coulomb matrix and Bag of Bonds, which list the inter-atomic distances present in the molecule, and Encoded Bonds, which encode such lists into a feature vector whose length is independent of molecular size. Encoded Bonds' features introduced here have the advantage of leading to models that may be trained on smaller molecules and then used successfully on larger molecules. A wide range of feature sets are constructed by selecting, at each rank, either a graph or geometry-based feature. Here, rank refers to the number of atoms involved in the feature, e.g., atom counts are rank 1, while Encoded Bonds are rank 2. For atomization energies in the QM7 data set, the best graph-based feature set gives a mean absolute error of 3.4 kcal/mol. Inclusion of 3D geometry substantially enhances the performance, with Encoded Bonds giving 2.4 kcal/mol, when used alone, and 1.19 kcal/mol, when combined with graph features.
Rorres, Chris; Romano, Maria; Miller, Jennifer A; Mossey, Jana M; Grubesic, Tony H; Zellner, David E; Smith, Gary
2018-06-01
Contact tracing is a crucial component of the control of many infectious diseases, but is an arduous and time consuming process. Procedures that increase the efficiency of contact tracing increase the chance that effective controls can be implemented sooner and thus reduce the magnitude of the epidemic. We illustrate a procedure using Graph Theory in the context of infectious disease epidemics of farmed animals in which the epidemics are driven mainly by the shipment of animals between farms. Specifically, we created a directed graph of the recorded shipments of deer between deer farms in Pennsylvania over a timeframe and asked how the properties of the graph could be exploited to make contact tracing more efficient should Chronic Wasting Disease (a prion disease of deer) be discovered in one of the farms. We show that the presence of a large strongly connected component in the graph has a significant impact on the number of contacts that can arise. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.
Yu, Qingbao; Erhardt, Erik B.; Sui, Jing; Du, Yuhui; He, Hao; Hjelm, Devon; Cetin, Mustafa S.; Rachakonda, Srinivas; Miller, Robyn L.; Pearlson, Godfrey; Calhoun, Vince D.
2014-01-01
Graph theory-based analysis has been widely employed in brain imaging studies, and altered topological properties of brain connectivity have emerged as important features of mental diseases such as schizophrenia. However, most previous studies have focused on graph metrics of stationary brain graphs, ignoring that brain connectivity exhibits fluctuations over time. Here we develop a new framework for accessing dynamic graph properties of time-varying functional brain connectivity in resting state fMRI data and apply it to healthy controls (HCs) and patients with schizophrenia (SZs). Specifically, nodes of brain graphs are defined by intrinsic connectivity networks (ICNs) identified by group independent component analysis (ICA). Dynamic graph metrics of the time-varying brain connectivity estimated by the correlation of sliding time-windowed ICA time courses of ICNs are calculated. First- and second-level connectivity states are detected based on the correlation of nodal connectivity strength between time-varying brain graphs. Our results indicate that SZs show decreased variance in the dynamic graph metrics. Consistent with prior stationary functional brain connectivity works, graph measures of identified first-level connectivity states show lower values in SZs. In addition, more first-level connectivity states are disassociated with the second-level connectivity state which resembles the stationary connectivity pattern computed by the entire scan. Collectively, the findings provide new evidence about altered dynamic brain graphs in schizophrenia which may underscore the abnormal brain performance in this mental illness. PMID:25514514
Document page structure learning for fixed-layout e-books using conditional random fields
NASA Astrophysics Data System (ADS)
Tao, Xin; Tang, Zhi; Xu, Canhui
2013-12-01
In this paper, a model is proposed to learn logical structure of fixed-layout document pages by combining support vector machine (SVM) and conditional random fields (CRF). Features related to each logical label and their dependencies are extracted from various original Portable Document Format (PDF) attributes. Both local evidence and contextual dependencies are integrated in the proposed model so as to achieve better logical labeling performance. With the merits of SVM as local discriminative classifier and CRF modeling contextual correlations of adjacent fragments, it is capable of resolving the ambiguities of semantic labels. The experimental results show that CRF based models with both tree and chain graph structures outperform the SVM model with an increase of macro-averaged F1 by about 10%.
NASA Tech Briefs, December 2013
NASA Technical Reports Server (NTRS)
2013-01-01
Topics include: Microwave Kinetic Inductance Detector With; Selective Polarization Coupling; Flexible Microstrip Circuits for; Superconducting Electronics; CFD Extraction Tool for TecPlot From DPLR Solutions; RECOVIR Software for Identifying Viruses; Enhanced Contact Graph Routing (ECGR) MACHETE Simulation Model; Orbital Debris Engineering Model (ORDEM) v.3; Scatter-Reducing Sounding Filtration Using a Genetic Algorithm and Mean Monthly Standard Deviation; Thermo-Mechanical Methodology for Stabilizing Shape Memory Alloy Response; Hermetic Seal Designs for Sample Return Sample Tubes; Silicon Alignment Pins: An Easy Way To Realize a Wafer-to-Wafer Alignment; Positive-Buoyancy Rover for Under Ice Mobility; Electric Machine With Boosted Inductance to Stabilize Current Control; International Space Station-Based Electromagnetic Launcher for Space Science Payloads; Advanced Hybrid Spacesuit Concept Featuring Integrated Open Loop and Closed Loop Ventilation Systems; Data Quality Screening Service.
Multiple sclerosis lesion segmentation using an automatic multimodal graph cuts.
García-Lorenzo, Daniel; Lecoeur, Jeremy; Arnold, Douglas L; Collins, D Louis; Barillot, Christian
2009-01-01
Graph Cuts have been shown as a powerful interactive segmentation technique in several medical domains. We propose to automate the Graph Cuts in order to automatically segment Multiple Sclerosis (MS) lesions in MRI. We replace the manual interaction with a robust EM-based approach in order to discriminate between MS lesions and the Normal Appearing Brain Tissues (NABT). Evaluation is performed in synthetic and real images showing good agreement between the automatic segmentation and the target segmentation. We compare our algorithm with the state of the art techniques and with several manual segmentations. An advantage of our algorithm over previously published ones is the possibility to semi-automatically improve the segmentation due to the Graph Cuts interactive feature.
Structural scene analysis and content-based image retrieval applied to bone age assessment
NASA Astrophysics Data System (ADS)
Fischer, Benedikt; Brosig, André; Deserno, Thomas M.; Ott, Bastian; Günther, Rolf W.
2009-02-01
Radiological bone age assessment is based on global or local image regions of interest (ROI), such as epiphyseal regions or the area of carpal bones. Usually, these regions are compared to a standardized reference and a score determining the skeletal maturity is calculated. For computer-assisted diagnosis, automatic ROI extraction is done so far by heuristic approaches. In this work, we apply a high-level approach of scene analysis for knowledge-based ROI segmentation. Based on a set of 100 reference images from the IRMA database, a so called structural prototype (SP) is trained. In this graph-based structure, the 14 phalanges and 5 metacarpal bones are represented by nodes, with associated location, shape, as well as texture parameters modeled by Gaussians. Accordingly, the Gaussians describing the relative positions, relative orientation, and other relative parameters between two nodes are associated to the edges. Thereafter, segmentation of a hand radiograph is done in several steps: (i) a multi-scale region merging scheme is applied to extract visually prominent regions; (ii) a graph/sub-graph matching to the SP robustly identifies a subset of the 19 bones; (iii) the SP is registered to the current image for complete scene-reconstruction (iv) the epiphyseal regions are extracted from the reconstructed scene. The evaluation is based on 137 images of Caucasian males from the USC hand atlas. Overall, an error rate of 32% is achieved, for the 6 middle distal and medial/distal epiphyses, 23% of all extractions need adjustments. On average 9.58 of the 14 epiphyseal regions were extracted successfully per image. This is promising for further use in content-based image retrieval (CBIR) and CBIR-based automatic bone age assessment.
Quantification of network structural dissimilarities.
Schieber, Tiago A; Carpi, Laura; Díaz-Guilera, Albert; Pardalos, Panos M; Masoller, Cristina; Ravetti, Martín G
2017-01-09
Identifying and quantifying dissimilarities among graphs is a fundamental and challenging problem of practical importance in many fields of science. Current methods of network comparison are limited to extract only partial information or are computationally very demanding. Here we propose an efficient and precise measure for network comparison, which is based on quantifying differences among distance probability distributions extracted from the networks. Extensive experiments on synthetic and real-world networks show that this measure returns non-zero values only when the graphs are non-isomorphic. Most importantly, the measure proposed here can identify and quantify structural topological differences that have a practical impact on the information flow through the network, such as the presence or absence of critical links that connect or disconnect connected components.
Bridge damage detection using spatiotemporal patterns extracted from dense sensor network
NASA Astrophysics Data System (ADS)
Liu, Chao; Gong, Yongqiang; Laflamme, Simon; Phares, Brent; Sarkar, Soumik
2017-01-01
The alarmingly degrading state of transportation infrastructures combined with their key societal and economic importance calls for automatic condition assessment methods to facilitate smart management of maintenance and repairs. With the advent of ubiquitous sensing and communication capabilities, scalable data-driven approaches is of great interest, as it can utilize large volume of streaming data without requiring detailed physical models that can be inaccurate and computationally expensive to run. Properly designed, a data-driven methodology could enable fast and automatic evaluation of infrastructures, discovery of causal dependencies among various sub-system dynamic responses, and decision making with uncertainties and lack of labeled data. In this work, a spatiotemporal pattern network (STPN) strategy built on symbolic dynamic filtering (SDF) is proposed to explore spatiotemporal behaviors in a bridge network. Data from strain gauges installed on two bridges are generated using finite element simulation for three types of sensor networks from a density perspective (dense, nominal, sparse). Causal relationships among spatially distributed strain data streams are extracted and analyzed for vehicle identification and detection, and for localization of structural degradation in bridges. Multiple case studies show significant capabilities of the proposed approach in: (i) capturing spatiotemporal features to discover causality between bridges (geographically close), (ii) robustness to noise in data for feature extraction, (iii) detecting and localizing damage via comparison of bridge responses to similar vehicle loads, and (iv) implementing real-time health monitoring and decision making work flow for bridge networks. Also, the results demonstrate increased sensitivity in detecting damages and higher reliability in quantifying the damage level with increase in sensor network density.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Park, Yubin; Shankar, Mallikarjun; Park, Byung H.
Designing a database system for both efficient data management and data services has been one of the enduring challenges in the healthcare domain. In many healthcare systems, data services and data management are often viewed as two orthogonal tasks; data services refer to retrieval and analytic queries such as search, joins, statistical data extraction, and simple data mining algorithms, while data management refers to building error-tolerant and non-redundant database systems. The gap between service and management has resulted in rigid database systems and schemas that do not support effective analytics. We compose a rich graph structure from an abstracted healthcaremore » RDBMS to illustrate how we can fill this gap in practice. We show how a healthcare graph can be automatically constructed from a normalized relational database using the proposed 3NF Equivalent Graph (3EG) transformation.We discuss a set of real world graph queries such as finding self-referrals, shared providers, and collaborative filtering, and evaluate their performance over a relational database and its 3EG-transformed graph. Experimental results show that the graph representation serves as multiple de-normalized tables, thus reducing complexity in a database and enhancing data accessibility of users. Based on this finding, we propose an ensemble framework of databases for healthcare applications.« less
Alvarez-Meza, Andres M.; Orozco-Gutierrez, Alvaro; Castellanos-Dominguez, German
2017-01-01
We introduce Enhanced Kernel-based Relevance Analysis (EKRA) that aims to support the automatic identification of brain activity patterns using electroencephalographic recordings. EKRA is a data-driven strategy that incorporates two kernel functions to take advantage of the available joint information, associating neural responses to a given stimulus condition. Regarding this, a Centered Kernel Alignment functional is adjusted to learning the linear projection that best discriminates the input feature set, optimizing the required free parameters automatically. Our approach is carried out in two scenarios: (i) feature selection by computing a relevance vector from extracted neural features to facilitating the physiological interpretation of a given brain activity task, and (ii) enhanced feature selection to perform an additional transformation of relevant features aiming to improve the overall identification accuracy. Accordingly, we provide an alternative feature relevance analysis strategy that allows improving the system performance while favoring the data interpretability. For the validation purpose, EKRA is tested in two well-known tasks of brain activity: motor imagery discrimination and epileptic seizure detection. The obtained results show that the EKRA approach estimates a relevant representation space extracted from the provided supervised information, emphasizing the salient input features. As a result, our proposal outperforms the state-of-the-art methods regarding brain activity discrimination accuracy with the benefit of enhanced physiological interpretation about the task at hand. PMID:29056897
Combining joint models for biomedical event extraction
2012-01-01
Background We explore techniques for performing model combination between the UMass and Stanford biomedical event extraction systems. Both sub-components address event extraction as a structured prediction problem, and use dual decomposition (UMass) and parsing algorithms (Stanford) to find the best scoring event structure. Our primary focus is on stacking where the predictions from the Stanford system are used as features in the UMass system. For comparison, we look at simpler model combination techniques such as intersection and union which require only the outputs from each system and combine them directly. Results First, we find that stacking substantially improves performance while intersection and union provide no significant benefits. Second, we investigate the graph properties of event structures and their impact on the combination of our systems. Finally, we trace the origins of events proposed by the stacked model to determine the role each system plays in different components of the output. We learn that, while stacking can propose novel event structures not seen in either base model, these events have extremely low precision. Removing these novel events improves our already state-of-the-art F1 to 56.6% on the test set of Genia (Task 1). Overall, the combined system formed via stacking ("FAUST") performed well in the BioNLP 2011 shared task. The FAUST system obtained 1st place in three out of four tasks: 1st place in Genia Task 1 (56.0% F1) and Task 2 (53.9%), 2nd place in the Epigenetics and Post-translational Modifications track (35.0%), and 1st place in the Infectious Diseases track (55.6%). Conclusion We present a state-of-the-art event extraction system that relies on the strengths of structured prediction and model combination through stacking. Akin to results on other tasks, stacking outperforms intersection and union and leads to very strong results. The utility of model combination hinges on complementary views of the data, and we show that our sub-systems capture different graph properties of event structures. Finally, by removing low precision novel events, we show that performance from stacking can be further improved. PMID:22759463
Predicting conversion from MCI to AD using resting-state fMRI, graph theoretical approach and SVM.
Hojjati, Seyed Hani; Ebrahimzadeh, Ata; Khazaee, Ali; Babajani-Feremi, Abbas
2017-04-15
We investigated identifying patients with mild cognitive impairment (MCI) who progress to Alzheimer's disease (AD), MCI converter (MCI-C), from those with MCI who do not progress to AD, MCI non-converter (MCI-NC), based on resting-state fMRI (rs-fMRI). Graph theory and machine learning approach were utilized to predict progress of patients with MCI to AD using rs-fMRI. Eighteen MCI converts (average age 73.6 years; 11 male) and 62 age-matched MCI non-converters (average age 73.0 years, 28 male) were included in this study. We trained and tested a support vector machine (SVM) to classify MCI-C from MCI-NC using features constructed based on the local and global graph measures. A novel feature selection algorithm was developed and utilized to select an optimal subset of features. Using subset of optimal features in SVM, we classified MCI-C from MCI-NC with an accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve of 91.4%, 83.24%, 90.1%, and 0.95, respectively. Furthermore, results of our statistical analyses were used to identify the affected brain regions in AD. To the best of our knowledge, this is the first study that combines the graph measures (constructed based on rs-fMRI) with machine learning approach and accurately classify MCI-C from MCI-NC. Results of this study demonstrate potential of the proposed approach for early AD diagnosis and demonstrate capability of rs-fMRI to predict conversion from MCI to AD by identifying affected brain regions underlying this conversion. Copyright © 2017 Elsevier B.V. All rights reserved.
Deep Learning in Label-free Cell Classification
Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; Blaby, Ian K.; Huang, Allen; Niazi, Kayvan Reza; Jalali, Bahram
2016-01-01
Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. This system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells. PMID:26975219
Deep Learning in Label-free Cell Classification
NASA Astrophysics Data System (ADS)
Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; Blaby, Ian K.; Huang, Allen; Niazi, Kayvan Reza; Jalali, Bahram
2016-03-01
Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individual cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. This system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.
From Spin to Swindle: Identifying Falsification in Financial Text.
Minhas, Saliha; Hussain, Amir
Despite legislative attempts to curtail financial statement fraud, it continues unabated. This study makes a renewed attempt to aid in detecting this misconduct using linguistic analysis with data mining on narrative sections of annual reports/10-K form. Different from the features used in similar research, this paper extracts three distinct sets of features from a newly constructed corpus of narratives (408 annual reports/10-K, 6.5 million words) from fraud and non-fraud firms. Separately each of these three sets of features is put through a suite of classification algorithms, to determine classifier performance in this binary fraud/non-fraud discrimination task. From the results produced, there is a clear indication that the language deployed by management engaged in wilful falsification of firm performance is discernibly different from truth-tellers. For the first time, this new interdisciplinary research extracts features for readability at a much deeper level, attempts to draw out collocations using n -grams and measures tone using appropriate financial dictionaries. This linguistic analysis with machine learning-driven data mining approach to fraud detection could be used by auditors in assessing financial reporting of firms and early detection of possible misdemeanours.
SBEToolbox: A Matlab Toolbox for Biological Network Analysis
Konganti, Kranti; Wang, Gang; Yang, Ence; Cai, James J.
2013-01-01
We present SBEToolbox (Systems Biology and Evolution Toolbox), an open-source Matlab toolbox for biological network analysis. It takes a network file as input, calculates a variety of centralities and topological metrics, clusters nodes into modules, and displays the network using different graph layout algorithms. Straightforward implementation and the inclusion of high-level functions allow the functionality to be easily extended or tailored through developing custom plugins. SBEGUI, a menu-driven graphical user interface (GUI) of SBEToolbox, enables easy access to various network and graph algorithms for programmers and non-programmers alike. All source code and sample data are freely available at https://github.com/biocoder/SBEToolbox/releases. PMID:24027418
SBEToolbox: A Matlab Toolbox for Biological Network Analysis.
Konganti, Kranti; Wang, Gang; Yang, Ence; Cai, James J
2013-01-01
We present SBEToolbox (Systems Biology and Evolution Toolbox), an open-source Matlab toolbox for biological network analysis. It takes a network file as input, calculates a variety of centralities and topological metrics, clusters nodes into modules, and displays the network using different graph layout algorithms. Straightforward implementation and the inclusion of high-level functions allow the functionality to be easily extended or tailored through developing custom plugins. SBEGUI, a menu-driven graphical user interface (GUI) of SBEToolbox, enables easy access to various network and graph algorithms for programmers and non-programmers alike. All source code and sample data are freely available at https://github.com/biocoder/SBEToolbox/releases.
Melonic Phase Transition in Group Field Theory
NASA Astrophysics Data System (ADS)
Baratin, Aristide; Carrozza, Sylvain; Oriti, Daniele; Ryan, James; Smerlak, Matteo
2014-08-01
Group field theories have recently been shown to admit a 1/N expansion dominated by so-called `melonic graphs', dual to triangulated spheres. In this note, we deepen the analysis of this melonic sector. We obtain a combinatorial formula for the melonic amplitudes in terms of a graph polynomial related to a higher-dimensional generalization of the Kirchhoff tree-matrix theorem. Simple bounds on these amplitudes show the existence of a phase transition driven by melonic interaction processes. We restrict our study to the Boulatov-Ooguri models, which describe topological BF theories and are the basis for the construction of 4-dimensional models of quantum gravity.
A novel line segment detection algorithm based on graph search
NASA Astrophysics Data System (ADS)
Zhao, Hong-dan; Liu, Guo-ying; Song, Xu
2018-02-01
To overcome the problem of extracting line segment from an image, a method of line segment detection was proposed based on the graph search algorithm. After obtaining the edge detection result of the image, the candidate straight line segments are obtained in four directions. For the candidate straight line segments, their adjacency relationships are depicted by a graph model, based on which the depth-first search algorithm is employed to determine how many adjacent line segments need to be merged. Finally we use the least squares method to fit the detected straight lines. The comparative experimental results verify that the proposed algorithm has achieved better results than the line segment detector (LSD).
A graph-based watershed merging using fuzzy C-means and simulated annealing for image segmentation
NASA Astrophysics Data System (ADS)
Vadiveloo, Mogana; Abdullah, Rosni; Rajeswari, Mandava
2015-12-01
In this paper, we have addressed the issue of over-segmented regions produced in watershed by merging the regions using global feature. The global feature information is obtained from clustering the image in its feature space using Fuzzy C-Means (FCM) clustering. The over-segmented regions produced by performing watershed on the gradient of the image are then mapped to this global information in the feature space. Further to this, the global feature information is optimized using Simulated Annealing (SA). The optimal global feature information is used to derive the similarity criterion to merge the over-segmented watershed regions which are represented by the region adjacency graph (RAG). The proposed method has been tested on digital brain phantom simulated dataset to segment white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) soft tissues regions. The experiments showed that the proposed method performs statistically better, with average of 95.242% regions are merged, than the immersion watershed and average accuracy improvement of 8.850% in comparison with RAG-based immersion watershed merging using global and local features.
Elucidating high-dimensional cancer hallmark annotation via enriched ontology.
Yan, Shankai; Wong, Ka-Chun
2017-09-01
Cancer hallmark annotation is a promising technique that could discover novel knowledge about cancer from the biomedical literature. The automated annotation of cancer hallmarks could reveal relevant cancer transformation processes in the literature or extract the articles that correspond to the cancer hallmark of interest. It acts as a complementary approach that can retrieve knowledge from massive text information, advancing numerous focused studies in cancer research. Nonetheless, the high-dimensional nature of cancer hallmark annotation imposes a unique challenge. To address the curse of dimensionality, we compared multiple cancer hallmark annotation methods on 1580 PubMed abstracts. Based on the insights, a novel approach, UDT-RF, which makes use of ontological features is proposed. It expands the feature space via the Medical Subject Headings (MeSH) ontology graph and utilizes novel feature selections for elucidating the high-dimensional cancer hallmark annotation space. To demonstrate its effectiveness, state-of-the-art methods are compared and evaluated by a multitude of performance metrics, revealing the full performance spectrum on the full set of cancer hallmarks. Several case studies are conducted, demonstrating how the proposed approach could reveal novel insights into cancers. https://github.com/cskyan/chmannot. Copyright © 2017 Elsevier Inc. All rights reserved.
Graphing Calculator Mini Course
NASA Technical Reports Server (NTRS)
Karnawat, Sunil R.
1996-01-01
The "Graphing Calculator Mini Course" project provided a mathematically-intensive technologically-based summer enrichment workshop for teachers of American Indian students on the Turtle Mountain Indian Reservation. Eleven such teachers participated in the six-day workshop in summer of 1996 and three Sunday workshops in the academic year. The project aimed to improve science and mathematics education on the reservation by showing teachers effective ways to use high-end graphing calculators as teaching and learning tools in science and mathematics courses at all levels. In particular, the workshop concentrated on applying TI-82's user-friendly features to understand the various mathematical and scientific concepts.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oppel, III, Fred; Hart, Brian; Hart, Derek
Umbra is a software package that has been in development at Sandia National Laboratories since 1995, under the name Umbra since 1997. Umbra is a software framework written in C++ and Tcl/Tk that has been applied to many operations, primarily dealing with robotics and simulation. Umbra executables are C++ libraries orchestrated with Tcl/Tk scripts. Two major feature upgrades occurred from 4.7 to 4.8 1. System Umbra Module with its own Update Graph within the C++ framework. 2. New terrain graph for fast line-of-sight calculations All else were minor updates such as later versions of Visual Studio, OpenSceneGraph and Boost.
Revealing Long-Range Interconnected Hubs in Human Chromatin Interaction Data Using Graph Theory
NASA Astrophysics Data System (ADS)
Boulos, R. E.; Arneodo, A.; Jensen, P.; Audit, B.
2013-09-01
We use graph theory to analyze chromatin interaction (Hi-C) data in the human genome. We show that a key functional feature of the genome—“master” replication origins—corresponds to DNA loci of maximal network centrality. These loci form a set of interconnected hubs both within chromosomes and between different chromosomes. Our results open the way to a fruitful use of graph theory concepts to decipher DNA structural organization in relation to genome functions such as replication and transcription. This quantitative information should prove useful to discriminate between possible polymer models of nuclear organization.
Exact Solution of the Markov Propagator for the Voter Model on the Complete Graph
2014-07-01
distribution of the random walk. This process can also be applied to other models, incomplete graphs, or to multiple dimensions. An advantage of this...since any multiple of an eigenvector remains an eigenvector. Without any loss, let bk = 1. Now we can ascertain the explicit solution for bj when k < j...this bound is valid for all initial probability distributions. However, without detailed information about the eigenvectors, we cannot extract more
Dynamic Visualization of Co-expression in Systems Genetics Data
DOE Office of Scientific and Technical Information (OSTI.GOV)
New, Joshua Ryan; Huang, Jian; Chesler, Elissa J
2008-01-01
Biologists hope to address grand scientific challenges by exploring the abundance of data made available through modern microarray technology and other high-throughput techniques. The impact of this data, however, is limited unless researchers can effectively assimilate such complex information and integrate it into their daily research; interactive visualization tools are called for to support the effort. Specifically, typical studies of gene co-expression require novel visualization tools that enable the dynamic formulation and fine-tuning of hypotheses to aid the process of evaluating sensitivity of key parameters. These tools should allow biologists to develop an intuitive understanding of the structure of biologicalmore » networks and discover genes which reside in critical positions in networks and pathways. By using a graph as a universal data representation of correlation in gene expression data, our novel visualization tool employs several techniques that when used in an integrated manner provide innovative analytical capabilities. Our tool for interacting with gene co-expression data integrates techniques such as: graph layout, qualitative subgraph extraction through a novel 2D user interface, quantitative subgraph extraction using graph-theoretic algorithms or by querying an optimized b-tree, dynamic level-of-detail graph abstraction, and template-based fuzzy classification using neural networks. We demonstrate our system using a real-world workflow from a large-scale, systems genetics study of mammalian gene co-expression.« less
Fingerprint recognition system by use of graph matching
NASA Astrophysics Data System (ADS)
Shen, Wei; Shen, Jun; Zheng, Huicheng
2001-09-01
Fingerprint recognition is an important subject in biometrics to identify or verify persons by physiological characteristics, and has found wide applications in different domains. In the present paper, we present a finger recognition system that combines singular points and structures. The principal steps of processing in our system are: preprocessing and ridge segmentation, singular point extraction and selection, graph representation, and finger recognition by graphs matching. Our fingerprint recognition system is implemented and tested for many fingerprint images and the experimental result are satisfactory. Different techniques are used in our system, such as fast calculation of orientation field, local fuzzy dynamical thresholding, algebraic analysis of connections and fingerprints representation and matching by graphs. Wed find that for fingerprint database that is not very large, the recognition rate is very high even without using a prior coarse category classification. This system works well for both one-to-few and one-to-many problems.
The Use of Weighted Graphs for Large-Scale Genome Analysis
Zhou, Fang; Toivonen, Hannu; King, Ross D.
2014-01-01
There is an acute need for better tools to extract knowledge from the growing flood of sequence data. For example, thousands of complete genomes have been sequenced, and their metabolic networks inferred. Such data should enable a better understanding of evolution. However, most existing network analysis methods are based on pair-wise comparisons, and these do not scale to thousands of genomes. Here we propose the use of weighted graphs as a data structure to enable large-scale phylogenetic analysis of networks. We have developed three types of weighted graph for enzymes: taxonomic (these summarize phylogenetic importance), isoenzymatic (these summarize enzymatic variety/redundancy), and sequence-similarity (these summarize sequence conservation); and we applied these types of weighted graph to survey prokaryotic metabolism. To demonstrate the utility of this approach we have compared and contrasted the large-scale evolution of metabolism in Archaea and Eubacteria. Our results provide evidence for limits to the contingency of evolution. PMID:24619061
Uzuner, Özlem; Szolovits, Peter
2017-01-01
Research on extracting biomedical relations has received growing attention recently, with numerous biological and clinical applications including those in pharmacogenomics, clinical trial screening and adverse drug reaction detection. The ability to accurately capture both semantic and syntactic structures in text expressing these relations becomes increasingly critical to enable deep understanding of scientific papers and clinical narratives. Shared task challenges have been organized by both bioinformatics and clinical informatics communities to assess and advance the state-of-the-art research. Significant progress has been made in algorithm development and resource construction. In particular, graph-based approaches bridge semantics and syntax, often achieving the best performance in shared tasks. However, a number of problems at the frontiers of biomedical relation extraction continue to pose interesting challenges and present opportunities for great improvement and fruitful research. In this article, we place biomedical relation extraction against the backdrop of its versatile applications, present a gentle introduction to its general pipeline and shared resources, review the current state-of-the-art in methodology advancement, discuss limitations and point out several promising future directions. PMID:26851224
Benchmarking Measures of Network Controllability on Canonical Graph Models
NASA Astrophysics Data System (ADS)
Wu-Yan, Elena; Betzel, Richard F.; Tang, Evelyn; Gu, Shi; Pasqualetti, Fabio; Bassett, Danielle S.
2018-03-01
The control of networked dynamical systems opens the possibility for new discoveries and therapies in systems biology and neuroscience. Recent theoretical advances provide candidate mechanisms by which a system can be driven from one pre-specified state to another, and computational approaches provide tools to test those mechanisms in real-world systems. Despite already having been applied to study network systems in biology and neuroscience, the practical performance of these tools and associated measures on simple networks with pre-specified structure has yet to be assessed. Here, we study the behavior of four control metrics (global, average, modal, and boundary controllability) on eight canonical graphs (including Erdős-Rényi, regular, small-world, random geometric, Barábasi-Albert preferential attachment, and several modular networks) with different edge weighting schemes (Gaussian, power-law, and two nonparametric distributions from brain networks, as examples of real-world systems). We observe that differences in global controllability across graph models are more salient when edge weight distributions are heavy-tailed as opposed to normal. In contrast, differences in average, modal, and boundary controllability across graph models (as well as across nodes in the graph) are more salient when edge weight distributions are less heavy-tailed. Across graph models and edge weighting schemes, average and modal controllability are negatively correlated with one another across nodes; yet, across graph instances, the relation between average and modal controllability can be positive, negative, or nonsignificant. Collectively, these findings demonstrate that controllability statistics (and their relations) differ across graphs with different topologies and that these differences can be muted or accentuated by differences in the edge weight distributions. More generally, our numerical studies motivate future analytical efforts to better understand the mathematical underpinnings of the relationship between graph topology and control, as well as efforts to design networks with specific control profiles.
NASA Astrophysics Data System (ADS)
Tahmassebi, Amirhessam; Pinker-Domenig, Katja; Wengert, Georg; Lobbes, Marc; Stadlbauer, Andreas; Romero, Francisco J.; Morales, Diego P.; Castillo, Encarnacion; Garcia, Antonio; Botella, Guillermo; Meyer-Bäse, Anke
2017-05-01
Graph network models in dementia have become an important computational technique in neuroscience to study fundamental organizational principles of brain structure and function of neurodegenerative diseases such as dementia. The graph connectivity is reflected in the connectome, the complete set of structural and functional connections of the graph network, which is mostly based on simple Pearson correlation links. In contrast to simple Pearson correlation networks, the partial correlations (PC) only identify direct correlations while indirect associations are eliminated. In addition to this, the state-of-the-art techniques in brain research are based on static graph theory, which is unable to capture the dynamic behavior of the brain connectivity, as it alters with disease evolution. We propose a new research avenue in neuroimaging connectomics based on combining dynamic graph network theory and modeling strategies at different time scales. We present the theoretical framework for area aggregation and time-scale modeling in brain networks as they pertain to disease evolution in dementia. This novel paradigm is extremely powerful, since we can derive both static parameters pertaining to node and area parameters, as well as dynamic parameters, such as system's eigenvalues. By implementing and analyzing dynamically both disease driven PC-networks and regular concentration networks, we reveal differences in the structure of these network that play an important role in the temporal evolution of this disease. The described research is key to advance biomedical research on novel disease prediction trajectories and dementia therapies.
A Graph Approach to Mining Biological Patterns in the Binding Interfaces.
Cheng, Wen; Yan, Changhui
2017-01-01
Protein-RNA interactions play important roles in the biological systems. Searching for regular patterns in the Protein-RNA binding interfaces is important for understanding how protein and RNA recognize each other and bind to form a complex. Herein, we present a graph-mining method for discovering biological patterns in the protein-RNA interfaces. We represented known protein-RNA interfaces using graphs and then discovered graph patterns enriched in the interfaces. Comparison of the discovered graph patterns with UniProt annotations showed that the graph patterns had a significant overlap with residue sites that had been proven crucial for the RNA binding by experimental methods. Using 200 patterns as input features, a support vector machine method was able to classify protein surface patches into RNA-binding sites and non-RNA-binding sites with 84.0% accuracy and 88.9% precision. We built a simple scoring function that calculated the total number of the graph patterns that occurred in a protein-RNA interface. That scoring function was able to discriminate near-native protein-RNA complexes from docking decoys with a performance comparable with that of a state-of-the-art complex scoring function. Our work also revealed possible patterns that might be important for binding affinity.
Li, Qu; Yao, Min; Yang, Jianhua; Xu, Ning
2014-01-01
Online friend recommendation is a fast developing topic in web mining. In this paper, we used SVD matrix factorization to model user and item feature vector and used stochastic gradient descent to amend parameter and improve accuracy. To tackle cold start problem and data sparsity, we used KNN model to influence user feature vector. At the same time, we used graph theory to partition communities with fairly low time and space complexity. What is more, matrix factorization can combine online and offline recommendation. Experiments showed that the hybrid recommendation algorithm is able to recommend online friends with good accuracy.
Optimizing graph-based patterns to extract biomedical events from the literature
2015-01-01
In BioNLP-ST 2013 We participated in the BioNLP 2013 shared tasks on event extraction. Our extraction method is based on the search for an approximate subgraph isomorphism between key context dependencies of events and graphs of input sentences. Our system was able to address both the GENIA (GE) task focusing on 13 molecular biology related event types and the Cancer Genetics (CG) task targeting a challenging group of 40 cancer biology related event types with varying arguments concerning 18 kinds of biological entities. In addition to adapting our system to the two tasks, we also attempted to integrate semantics into the graph matching scheme using a distributional similarity model for more events, and evaluated the event extraction impact of using paths of all possible lengths as key context dependencies beyond using only the shortest paths in our system. We achieved a 46.38% F-score in the CG task (ranking 3rd) and a 48.93% F-score in the GE task (ranking 4th). After BioNLP-ST 2013 We explored three ways to further extend our event extraction system in our previously published work: (1) We allow non-essential nodes to be skipped, and incorporated a node skipping penalty into the subgraph distance function of our approximate subgraph matching algorithm. (2) Instead of assigning a unified subgraph distance threshold to all patterns of an event type, we learned a customized threshold for each pattern. (3) We implemented the well-known Empirical Risk Minimization (ERM) principle to optimize the event pattern set by balancing prediction errors on training data against regularization. When evaluated on the official GE task test data, these extensions help to improve the extraction precision from 62% to 65%. However, the overall F-score stays equivalent to the previous performance due to a 1% drop in recall. PMID:26551594
Sussman, Elyse; Winkler, István; Kreuzer, Judith; Saher, Marieke; Näätänen, Risto; Ritter, Walter
2002-12-01
Our previous study showed that the auditory context could influence whether two successive acoustic changes occurring within the temporal integration window (approximately 200ms) were pre-attentively encoded as a single auditory event or as two discrete events (Cogn Brain Res 12 (2001) 431). The aim of the current study was to assess whether top-down processes could influence the stimulus-driven processes in determining what constitutes an auditory event. Electroencepholagram (EEG) was recorded from 11 scalp electrodes to frequently occurring standard and infrequently occurring deviant sounds. Within the stimulus blocks, deviants either occurred only in pairs (successive feature changes) or both singly and in pairs. Event-related potential indices of change and target detection, the mismatch negativity (MMN) and the N2b component, respectively, were compared with the simultaneously measured performance in discriminating the deviants. Even though subjects could voluntarily distinguish the two successive auditory feature changes from each other, which was also indicated by the elicitation of the N2b target-detection response, top-down processes did not modify the event organization reflected by the MMN response. Top-down processes can extract elemental auditory information from a single integrated acoustic event, but the extraction occurs at a later processing stage than the one whose outcome is indexed by MMN. Initial processes of auditory event-formation are fully governed by the context within which the sounds occur. Perception of the deviants as two separate sound events (the top-down effects) did not change the initial neural representation of the same deviants as one event (indexed by the MMN), without a corresponding change in the stimulus-driven sound organization.
Inferior vena cava segmentation with parameter propagation and graph cut.
Yan, Zixu; Chen, Feng; Wu, Fa; Kong, Dexing
2017-09-01
The inferior vena cava (IVC) is one of the vital veins inside the human body. Accurate segmentation of the IVC from contrast-enhanced CT images is of great importance. This extraction not only helps the physician understand its quantitative features such as blood flow and volume, but also it is helpful during the hepatic preoperative planning. However, manual delineation of the IVC is time-consuming and poorly reproducible. In this paper, we propose a novel method to segment the IVC with minimal user interaction. The proposed method performs the segmentation block by block between user-specified beginning and end masks. At each stage, the proposed method builds the segmentation model based on information from image regional appearances, image boundaries, and a prior shape. The intensity range and the prior shape for this segmentation model are estimated based on the segmentation result from the last block, or from user- specified beginning mask if at first stage. Then, the proposed method minimizes the energy function and generates the segmentation result for current block using graph cut. Finally, a backward tracking step from the end of the IVC is performed if necessary. We have tested our method on 20 clinical datasets and compared our method to three other vessel extraction approaches. The evaluation was performed using three quantitative metrics: the Dice coefficient (Dice), the mean symmetric distance (MSD), and the Hausdorff distance (MaxD). The proposed method has achieved a Dice of [Formula: see text], an MSD of [Formula: see text] mm, and a MaxD of [Formula: see text] mm, respectively, in our experiments. The proposed approach can achieve a sound performance with a relatively low computational cost and a minimal user interaction. The proposed algorithm has high potential to be applied for the clinical applications in the future.
Signalling Network Construction for Modelling Plant Defence Response
Miljkovic, Dragana; Stare, Tjaša; Mozetič, Igor; Podpečan, Vid; Petek, Marko; Witek, Kamil; Dermastia, Marina; Lavrač, Nada; Gruden, Kristina
2012-01-01
Plant defence signalling response against various pathogens, including viruses, is a complex phenomenon. In resistant interaction a plant cell perceives the pathogen signal, transduces it within the cell and performs a reprogramming of the cell metabolism leading to the pathogen replication arrest. This work focuses on signalling pathways crucial for the plant defence response, i.e., the salicylic acid, jasmonic acid and ethylene signal transduction pathways, in the Arabidopsis thaliana model plant. The initial signalling network topology was constructed manually by defining the representation formalism, encoding the information from public databases and literature, and composing a pathway diagram. The manually constructed network structure consists of 175 components and 387 reactions. In order to complement the network topology with possibly missing relations, a new approach to automated information extraction from biological literature was developed. This approach, named Bio3graph, allows for automated extraction of biological relations from the literature, resulting in a set of (component1, reaction, component2) triplets and composing a graph structure which can be visualised, compared to the manually constructed topology and examined by the experts. Using a plant defence response vocabulary of components and reaction types, Bio3graph was applied to a set of 9,586 relevant full text articles, resulting in 137 newly detected reactions between the components. Finally, the manually constructed topology and the new reactions were merged to form a network structure consisting of 175 components and 524 reactions. The resulting pathway diagram of plant defence signalling represents a valuable source for further computational modelling and interpretation of omics data. The developed Bio3graph approach, implemented as an executable language processing and graph visualisation workflow, is publically available at http://ropot.ijs.si/bio3graph/and can be utilised for modelling other biological systems, given that an adequate vocabulary is provided. PMID:23272172
A distance constrained synaptic plasticity model of C. elegans neuronal network
NASA Astrophysics Data System (ADS)
Badhwar, Rahul; Bagler, Ganesh
2017-03-01
Brain research has been driven by enquiry for principles of brain structure organization and its control mechanisms. The neuronal wiring map of C. elegans, the only complete connectome available till date, presents an incredible opportunity to learn basic governing principles that drive structure and function of its neuronal architecture. Despite its apparently simple nervous system, C. elegans is known to possess complex functions. The nervous system forms an important underlying framework which specifies phenotypic features associated to sensation, movement, conditioning and memory. In this study, with the help of graph theoretical models, we investigated the C. elegans neuronal network to identify network features that are critical for its control. The 'driver neurons' are associated with important biological functions such as reproduction, signalling processes and anatomical structural development. We created 1D and 2D network models of C. elegans neuronal system to probe the role of features that confer controllability and small world nature. The simple 1D ring model is critically poised for the number of feed forward motifs, neuronal clustering and characteristic path-length in response to synaptic rewiring, indicating optimal rewiring. Using empirically observed distance constraint in the neuronal network as a guiding principle, we created a distance constrained synaptic plasticity model that simultaneously explains small world nature, saturation of feed forward motifs as well as observed number of driver neurons. The distance constrained model suggests optimum long distance synaptic connections as a key feature specifying control of the network.
Wang, Gang; Wang, Yalin
2017-02-15
In this paper, we propose a heat kernel based regional shape descriptor that may be capable of better exploiting volumetric morphological information than other available methods, thereby improving statistical power on brain magnetic resonance imaging (MRI) analysis. The mechanism of our analysis is driven by the graph spectrum and the heat kernel theory, to capture the volumetric geometry information in the constructed tetrahedral meshes. In order to capture profound brain grey matter shape changes, we first use the volumetric Laplace-Beltrami operator to determine the point pair correspondence between white-grey matter and CSF-grey matter boundary surfaces by computing the streamlines in a tetrahedral mesh. Secondly, we propose multi-scale grey matter morphology signatures to describe the transition probability by random walk between the point pairs, which reflects the inherent geometric characteristics. Thirdly, a point distribution model is applied to reduce the dimensionality of the grey matter morphology signatures and generate the internal structure features. With the sparse linear discriminant analysis, we select a concise morphology feature set with improved classification accuracies. In our experiments, the proposed work outperformed the cortical thickness features computed by FreeSurfer software in the classification of Alzheimer's disease and its prodromal stage, i.e., mild cognitive impairment, on publicly available data from the Alzheimer's Disease Neuroimaging Initiative. The multi-scale and physics based volumetric structure feature may bring stronger statistical power than some traditional methods for MRI-based grey matter morphology analysis. Copyright © 2016 Elsevier Inc. All rights reserved.
User-Driven Geolocation of Untagged Desert Imagery Using Digital Elevation Models (Open Access)
2013-09-12
IEEE International Conference on, pages 3677–3680. IEEE, 2011. [13] W. Zhang and J. Kosecka. Image based localization in urban environments. In 3D ...non- urban environments such as deserts. Our system generates synthetic skyline views from a DEM and extracts stable concavity-based features from these...fine as 100m2. 1. Introduction Automatic geolocation of imagery has many exciting use cases. For example, such a tool could semantically orga- nize
User-Driven Geolocation of Untagged Desert Imagery Using Digital Elevation Models
2013-01-01
Conference on, pages 3677–3680. IEEE, 2011. [13] W. Zhang and J. Kosecka. Image based localization in urban environments. In 3D Data Processing...non- urban environments such as deserts. Our system generates synthetic skyline views from a DEM and extracts stable concavity-based features from these...fine as 100m2. 1. Introduction Automatic geolocation of imagery has many exciting use cases. For example, such a tool could semantically orga- nize
Trust from the past: Bayesian Personalized Ranking based Link Prediction in Knowledge Graphs
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, Baichuan; Choudhury, Sutanay; Al-Hasan, Mohammad
2016-02-01
Estimating the confidence for a link is a critical task for Knowledge Graph construction. Link prediction, or predicting the likelihood of a link in a knowledge graph based on prior state is a key research direction within this area. We propose a Latent Feature Embedding based link recommendation model for prediction task and utilize Bayesian Personalized Ranking based optimization technique for learning models for each predicate. Experimental results on large-scale knowledge bases such as YAGO2 show that our approach achieves substantially higher performance than several state-of-art approaches. Furthermore, we also study the performance of the link prediction algorithm in termsmore » of topological properties of the Knowledge Graph and present a linear regression model to reason about its expected level of accuracy.« less
A data fusion approach for track monitoring from multiple in-service trains
NASA Astrophysics Data System (ADS)
Lederman, George; Chen, Siheng; Garrett, James H.; Kovačević, Jelena; Noh, Hae Young; Bielak, Jacobo
2017-10-01
We present a data fusion approach for enabling data-driven rail-infrastructure monitoring from multiple in-service trains. A number of researchers have proposed using vibration data collected from in-service trains as a low-cost method to monitor track geometry. The majority of this work has focused on developing novel features to extract information about the tracks from data produced by individual sensors on individual trains. We extend this work by presenting a technique to combine extracted features from multiple passes over the tracks from multiple sensors aboard multiple vehicles. There are a number of challenges in combining multiple data sources, like different relative position coordinates depending on the location of the sensor within the train. Furthermore, as the number of sensors increases, the likelihood that some will malfunction also increases. We use a two-step approach that first minimizes position offset errors through data alignment, then fuses the data with a novel adaptive Kalman filter that weights data according to its estimated reliability. We show the efficacy of this approach both through simulations and on a data-set collected from two instrumented trains operating over a one-year period. Combining data from numerous in-service trains allows for more continuous and more reliable data-driven monitoring than analyzing data from any one train alone; as the number of instrumented trains increases, the proposed fusion approach could facilitate track monitoring of entire rail-networks.
Four types of ensemble coding in data visualizations.
Szafir, Danielle Albers; Haroz, Steve; Gleicher, Michael; Franconeri, Steven
2016-01-01
Ensemble coding supports rapid extraction of visual statistics about distributed visual information. Researchers typically study this ability with the goal of drawing conclusions about how such coding extracts information from natural scenes. Here we argue that a second domain can serve as another strong inspiration for understanding ensemble coding: graphs, maps, and other visual presentations of data. Data visualizations allow observers to leverage their ability to perform visual ensemble statistics on distributions of spatial or featural visual information to estimate actual statistics on data. We survey the types of visual statistical tasks that occur within data visualizations across everyday examples, such as scatterplots, and more specialized images, such as weather maps or depictions of patterns in text. We divide these tasks into four categories: identification of sets of values, summarization across those values, segmentation of collections, and estimation of structure. We point to unanswered questions for each category and give examples of such cross-pollination in the current literature. Increased collaboration between the data visualization and perceptual psychology research communities can inspire new solutions to challenges in visualization while simultaneously exposing unsolved problems in perception research.
The Communicability of Graphical Alternatives to Tabular Displays of Statistical Simulation Studies
Cook, Alex R.; Teo, Shanice W. L.
2011-01-01
Simulation studies are often used to assess the frequency properties and optimality of statistical methods. They are typically reported in tables, which may contain hundreds of figures to be contrasted over multiple dimensions. To assess the degree to which these tables are fit for purpose, we performed a randomised cross-over experiment in which statisticians were asked to extract information from (i) such a table sourced from the literature and (ii) a graphical adaptation designed by the authors, and were timed and assessed for accuracy. We developed hierarchical models accounting for differences between individuals of different experience levels (under- and post-graduate), within experience levels, and between different table-graph pairs. In our experiment, information could be extracted quicker and, for less experienced participants, more accurately from graphical presentations than tabular displays. We also performed a literature review to assess the prevalence of hard-to-interpret design features in tables of simulation studies in three popular statistics journals, finding that many are presented innumerately. We recommend simulation studies be presented in graphical form. PMID:22132184
The communicability of graphical alternatives to tabular displays of statistical simulation studies.
Cook, Alex R; Teo, Shanice W L
2011-01-01
Simulation studies are often used to assess the frequency properties and optimality of statistical methods. They are typically reported in tables, which may contain hundreds of figures to be contrasted over multiple dimensions. To assess the degree to which these tables are fit for purpose, we performed a randomised cross-over experiment in which statisticians were asked to extract information from (i) such a table sourced from the literature and (ii) a graphical adaptation designed by the authors, and were timed and assessed for accuracy. We developed hierarchical models accounting for differences between individuals of different experience levels (under- and post-graduate), within experience levels, and between different table-graph pairs. In our experiment, information could be extracted quicker and, for less experienced participants, more accurately from graphical presentations than tabular displays. We also performed a literature review to assess the prevalence of hard-to-interpret design features in tables of simulation studies in three popular statistics journals, finding that many are presented innumerately. We recommend simulation studies be presented in graphical form.
ERIC Educational Resources Information Center
National Aeronautics and Space Administration, Hampton, VA. Langley Research Center.
The NASA CONNECT series features 30-minute, instructional videos for students in grades 5-8 and teacher's guides that use aeronautics and space technology as the organizing theme. In this guide and videotape, National Aeronautics and Space Administration (NASA) researchers and scientists use measurement, ratios, and graphing to demonstrate the…
Kavuluru, Ramakanth; Han, Sifei; Harris, Daniel
2017-01-01
Diagnosis codes are extracted from medical records for billing and reimbursement and for secondary uses such as quality control and cohort identification. In the US, these codes come from the standard terminology ICD-9-CM derived from the international classification of diseases (ICD). ICD-9 codes are generally extracted by trained human coders by reading all artifacts available in a patient’s medical record following specific coding guidelines. To assist coders in this manual process, this paper proposes an unsupervised ensemble approach to automatically extract ICD-9 diagnosis codes from textual narratives included in electronic medical records (EMRs). Earlier attempts on automatic extraction focused on individual documents such as radiology reports and discharge summaries. Here we use a more realistic dataset and extract ICD-9 codes from EMRs of 1000 inpatient visits at the University of Kentucky Medical Center. Using named entity recognition (NER), graph-based concept-mapping of medical concepts, and extractive text summarization techniques, we achieve an example based average recall of 0.42 with average precision 0.47; compared with a baseline of using only NER, we notice a 12% improvement in recall with the graph-based approach and a 7% improvement in precision using the extractive text summarization approach. Although diagnosis codes are complex concepts often expressed in text with significant long range non-local dependencies, our present work shows the potential of unsupervised methods in extracting a portion of codes. As such, our findings are especially relevant for code extraction tasks where obtaining large amounts of training data is difficult. PMID:28748227
Towards Scalable Graph Computation on Mobile Devices.
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2014-10-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach.
Towards Scalable Graph Computation on Mobile Devices
Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng
2015-01-01
Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564
The new protein topology graph library web server.
Schäfer, Tim; Scheck, Andreas; Bruneß, Daniel; May, Patrick; Koch, Ina
2016-02-01
We present a new, extended version of the Protein Topology Graph Library web server. The Protein Topology Graph Library describes the protein topology on the super-secondary structure level. It allows to compute and visualize protein ligand graphs and search for protein structural motifs. The new server features additional information on ligand binding to secondary structure elements, increased usability and an application programming interface (API) to retrieve data, allowing for an automated analysis of protein topology. The Protein Topology Graph Library server is freely available on the web at http://ptgl.uni-frankfurt.de. The website is implemented in PHP, JavaScript, PostgreSQL and Apache. It is supported by all major browsers. The VPLG software that was used to compute the protein ligand graphs and all other data in the database is available under the GNU public license 2.0 from http://vplg.sourceforge.net. tim.schaefer@bioinformatik.uni-frankfurt.de; ina.koch@bioinformatik.uni-frankfurt.de Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Winlaw, Manda; De Sterck, Hans; Sanders, Geoffrey
In very simple terms a network can be de ned as a collection of points joined together by lines. Thus, networks can be used to represent connections between entities in a wide variety of elds including engi- neering, science, medicine, and sociology. Many large real-world networks share a surprising number of properties, leading to a strong interest in model development research and techniques for building synthetic networks have been developed, that capture these similarities and replicate real-world graphs. Modeling these real-world networks serves two purposes. First, building models that mimic the patterns and prop- erties of real networks helps tomore » understand the implications of these patterns and helps determine which patterns are important. If we develop a generative process to synthesize real networks we can also examine which growth processes are plausible and which are not. Secondly, high-quality, large-scale network data is often not available, because of economic, legal, technological, or other obstacles [7]. Thus, there are many instances where the systems of interest cannot be represented by a single exemplar network. As one example, consider the eld of cybersecurity, where systems require testing across diverse threat scenarios and validation across diverse network structures. In these cases, where there is no single exemplar network, the systems must instead be modeled as a collection of networks in which the variation among them may be just as important as their common features. By developing processes to build synthetic models, so-called graph generators, we can build synthetic networks that capture both the essential features of a system and realistic variability. Then we can use such synthetic graphs to perform tasks such as simulations, analysis, and decision making. We can also use synthetic graphs to performance test graph analysis algorithms, including clustering algorithms and anomaly detection algorithms.« less
Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text
Xin, Yu; Hochberg, Ephraim; Joshi, Rohit; Uzuner, Ozlem; Szolovits, Peter
2015-01-01
Objective Extracting medical knowledge from electronic medical records requires automated approaches to combat scalability limitations and selection biases. However, existing machine learning approaches are often regarded by clinicians as black boxes. Moreover, training data for these automated approaches at often sparsely annotated at best. The authors target unsupervised learning for modeling clinical narrative text, aiming at improving both accuracy and interpretability. Methods The authors introduce a novel framework named subgraph augmented non-negative tensor factorization (SANTF). In addition to relying on atomic features (e.g., words in clinical narrative text), SANTF automatically mines higher-order features (e.g., relations of lymphoid cells expressing antigens) from clinical narrative text by converting sentences into a graph representation and identifying important subgraphs. The authors compose a tensor using patients, higher-order features, and atomic features as its respective modes. We then apply non-negative tensor factorization to cluster patients, and simultaneously identify latent groups of higher-order features that link to patient clusters, as in clinical guidelines where a panel of immunophenotypic features and laboratory results are used to specify diagnostic criteria. Results and Conclusion SANTF demonstrated over 10% improvement in averaged F-measure on patient clustering compared to widely used non-negative matrix factorization (NMF) and k-means clustering methods. Multiple baselines were established by modeling patient data using patient-by-features matrices with different feature configurations and then performing NMF or k-means to cluster patients. Feature analysis identified latent groups of higher-order features that lead to medical insights. We also found that the latent groups of atomic features help to better correlate the latent groups of higher-order features. PMID:25862765
Zhao, Bo; Ding, Ruoxi; Chen, Shoushun; Linares-Barranco, Bernabe; Tang, Huajin
2015-09-01
This paper introduces an event-driven feedforward categorization system, which takes data from a temporal contrast address event representation (AER) sensor. The proposed system extracts bio-inspired cortex-like features and discriminates different patterns using an AER based tempotron classifier (a network of leaky integrate-and-fire spiking neurons). One of the system's most appealing characteristics is its event-driven processing, with both input and features taking the form of address events (spikes). The system was evaluated on an AER posture dataset and compared with two recently developed bio-inspired models. Experimental results have shown that it consumes much less simulation time while still maintaining comparable performance. In addition, experiments on the Mixed National Institute of Standards and Technology (MNIST) image dataset have demonstrated that the proposed system can work not only on raw AER data but also on images (with a preprocessing step to convert images into AER events) and that it can maintain competitive accuracy even when noise is added. The system was further evaluated on the MNIST dynamic vision sensor dataset (in which data is recorded using an AER dynamic vision sensor), with testing accuracy of 88.14%.
View subspaces for indexing and retrieval of 3D models
NASA Astrophysics Data System (ADS)
Dutagaci, Helin; Godil, Afzal; Sankur, Bülent; Yemez, Yücel
2010-02-01
View-based indexing schemes for 3D object retrieval are gaining popularity since they provide good retrieval results. These schemes are coherent with the theory that humans recognize objects based on their 2D appearances. The viewbased techniques also allow users to search with various queries such as binary images, range images and even 2D sketches. The previous view-based techniques use classical 2D shape descriptors such as Fourier invariants, Zernike moments, Scale Invariant Feature Transform-based local features and 2D Digital Fourier Transform coefficients. These methods describe each object independent of others. In this work, we explore data driven subspace models, such as Principal Component Analysis, Independent Component Analysis and Nonnegative Matrix Factorization to describe the shape information of the views. We treat the depth images obtained from various points of the view sphere as 2D intensity images and train a subspace to extract the inherent structure of the views within a database. We also show the benefit of categorizing shapes according to their eigenvalue spread. Both the shape categorization and data-driven feature set conjectures are tested on the PSB database and compared with the competitor view-based 3D shape retrieval algorithms.
VIGOR: Interactive Visual Exploration of Graph Query Results.
Pienta, Robert; Hohman, Fred; Endert, Alex; Tamersoy, Acar; Roundy, Kevin; Gates, Chris; Navathe, Shamkant; Chau, Duen Horng
2018-01-01
Finding patterns in graphs has become a vital challenge in many domains from biological systems, network security, to finance (e.g., finding money laundering rings of bankers and business owners). While there is significant interest in graph databases and querying techniques, less research has focused on helping analysts make sense of underlying patterns within a group of subgraph results. Visualizing graph query results is challenging, requiring effective summarization of a large number of subgraphs, each having potentially shared node-values, rich node features, and flexible structure across queries. We present VIGOR, a novel interactive visual analytics system, for exploring and making sense of query results. VIGOR uses multiple coordinated views, leveraging different data representations and organizations to streamline analysts sensemaking process. VIGOR contributes: (1) an exemplar-based interaction technique, where an analyst starts with a specific result and relaxes constraints to find other similar results or starts with only the structure (i.e., without node value constraints), and adds constraints to narrow in on specific results; and (2) a novel feature-aware subgraph result summarization. Through a collaboration with Symantec, we demonstrate how VIGOR helps tackle real-world problems through the discovery of security blindspots in a cybersecurity dataset with over 11,000 incidents. We also evaluate VIGOR with a within-subjects study, demonstrating VIGOR's ease of use over a leading graph database management system, and its ability to help analysts understand their results at higher speed and make fewer errors.
Accurate airway centerline extraction based on topological thinning using graph-theoretic analysis.
Bian, Zijian; Tan, Wenjun; Yang, Jinzhu; Liu, Jiren; Zhao, Dazhe
2014-01-01
The quantitative analysis of the airway tree is of critical importance in the CT-based diagnosis and treatment of popular pulmonary diseases. The extraction of airway centerline is a precursor to identify airway hierarchical structure, measure geometrical parameters, and guide visualized detection. Traditional methods suffer from extra branches and circles due to incomplete segmentation results, which induce false analysis in applications. This paper proposed an automatic and robust centerline extraction method for airway tree. First, the centerline is located based on the topological thinning method; border voxels are deleted symmetrically to preserve topological and geometrical properties iteratively. Second, the structural information is generated using graph-theoretic analysis. Then inaccurate circles are removed with a distance weighting strategy, and extra branches are pruned according to clinical anatomic knowledge. The centerline region without false appendices is eventually determined after the described phases. Experimental results show that the proposed method identifies more than 96% branches and keep consistency across different cases and achieves superior circle-free structure and centrality.
Integrating Semantic Information in Metadata Descriptions for a Geoscience-wide Resource Inventory.
NASA Astrophysics Data System (ADS)
Zaslavsky, I.; Richard, S. M.; Gupta, A.; Valentine, D.; Whitenack, T.; Ozyurt, I. B.; Grethe, J. S.; Schachne, A.
2016-12-01
Integrating semantic information into legacy metadata catalogs is a challenging issue and so far has been mostly done on a limited scale. We present experience of CINERGI (Community Inventory of Earthcube Resources for Geoscience Interoperability), an NSF Earthcube Building Block project, in creating a large cross-disciplinary catalog of geoscience information resources to enable cross-domain discovery. The project developed a pipeline for automatically augmenting resource metadata, in particular generating keywords that describe metadata documents harvested from multiple geoscience information repositories or contributed by geoscientists through various channels including surveys and domain resource inventories. The pipeline examines available metadata descriptions using text parsing, vocabulary management and semantic annotation and graph navigation services of GeoSciGraph. GeoSciGraph, in turn, relies on a large cross-domain ontology of geoscience terms, which bridges several independently developed ontologies or taxonomies including SWEET, ENVO, YAGO, GeoSciML, GCMD, SWO, and CHEBI. The ontology content enables automatic extraction of keywords reflecting science domains, equipment used, geospatial features, measured properties, methods, processes, etc. We specifically focus on issues of cross-domain geoscience ontology creation, resolving several types of semantic conflicts among component ontologies or vocabularies, and constructing and managing facets for improved data discovery and navigation. The ontology and keyword generation rules are iteratively improved as pipeline results are presented to data managers for selective manual curation via a CINERGI Annotator user interface. We present lessons learned from applying CINERGI metadata augmentation pipeline to a number of federal agency and academic data registries, in the context of several use cases that require data discovery and integration across multiple earth science data catalogs of varying quality and completeness. The inventory is accessible at http://cinergi.sdsc.edu, and the CINERGI project web page is http://earthcube.org/group/cinergi
Computational neuroscience approach to biomarkers and treatments for mental disorders.
Yahata, Noriaki; Kasai, Kiyoto; Kawato, Mitsuo
2017-04-01
Psychiatry research has long experienced a stagnation stemming from a lack of understanding of the neurobiological underpinnings of phenomenologically defined mental disorders. Recently, the application of computational neuroscience to psychiatry research has shown great promise in establishing a link between phenomenological and pathophysiological aspects of mental disorders, thereby recasting current nosology in more biologically meaningful dimensions. In this review, we highlight recent investigations into computational neuroscience that have undertaken either theory- or data-driven approaches to quantitatively delineate the mechanisms of mental disorders. The theory-driven approach, including reinforcement learning models, plays an integrative role in this process by enabling correspondence between behavior and disorder-specific alterations at multiple levels of brain organization, ranging from molecules to cells to circuits. Previous studies have explicated a plethora of defining symptoms of mental disorders, including anhedonia, inattention, and poor executive function. The data-driven approach, on the other hand, is an emerging field in computational neuroscience seeking to identify disorder-specific features among high-dimensional big data. Remarkably, various machine-learning techniques have been applied to neuroimaging data, and the extracted disorder-specific features have been used for automatic case-control classification. For many disorders, the reported accuracies have reached 90% or more. However, we note that rigorous tests on independent cohorts are critically required to translate this research into clinical applications. Finally, we discuss the utility of the disorder-specific features found by the data-driven approach to psychiatric therapies, including neurofeedback. Such developments will allow simultaneous diagnosis and treatment of mental disorders using neuroimaging, thereby establishing 'theranostics' for the first time in clinical psychiatry. © 2016 The Authors. Psychiatry and Clinical Neurosciences © 2016 Japanese Society of Psychiatry and Neurology.
Structural Features of Algebraic Quantum Notations
ERIC Educational Resources Information Center
Gire, Elizabeth; Price, Edward
2015-01-01
The formalism of quantum mechanics includes a rich collection of representations for describing quantum systems, including functions, graphs, matrices, histograms of probabilities, and Dirac notation. The varied features of these representations affect how computations are performed. For example, identifying probabilities of measurement outcomes…
Harris, Eric S J; Erickson, Sean D; Tolopko, Andrew N; Cao, Shugeng; Craycroft, Jane A; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E; Eisenberg, David M
2011-05-17
Ethnobotanically driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically driven natural product collection and drug-discovery programs. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
Harris, Eric S. J.; Erickson, Sean D.; Tolopko, Andrew N.; Cao, Shugeng; Craycroft, Jane A.; Scholten, Robert; Fu, Yanling; Wang, Wenquan; Liu, Yong; Zhao, Zhongzhen; Clardy, Jon; Shamu, Caroline E.; Eisenberg, David M.
2011-01-01
Aim of the study. Ethnobotanically-driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine-Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants. Materials and Methods. The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat. Results. The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities. Conclusions. This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically-driven natural product collection and drug-discovery programs. PMID:21420479
Quantitation & Case-Study-Driven Inquiry to Enhance Yeast Fermentation Studies
ERIC Educational Resources Information Center
Grammer, Robert T.
2012-01-01
We propose a procedure for the assay of fermentation in yeast in microcentrifuge tubes that is simple and rapid, permitting assay replicates, descriptive statistics, and the preparation of line graphs that indicate reproducibility. Using regression and simple derivatives to determine initial velocities, we suggest methods to compare the effects of…
Clustering in complex directed networks
NASA Astrophysics Data System (ADS)
Fagiolo, Giorgio
2007-08-01
Many empirical networks display an inherent tendency to cluster, i.e., to form circles of connected nodes. This feature is typically measured by the clustering coefficient (CC). The CC, originally introduced for binary, undirected graphs, has been recently generalized to weighted, undirected networks. Here we extend the CC to the case of (binary and weighted) directed networks and we compute its expected value for random graphs. We distinguish between CCs that count all directed triangles in the graph (independently of the direction of their edges) and CCs that only consider particular types of directed triangles (e.g., cycles). The main concepts are illustrated by employing empirical data on world-trade flows.
Anifah, Lilik; Purnama, I Ketut Eddy; Hariadi, Mochamad; Purnomo, Mauridhi Hery
2013-01-01
Localization is the first step in osteoarthritis (OA) classification. Manual classification, however, is time-consuming, tedious, and expensive. The proposed system is designed as decision support system for medical doctors to classify the severity of knee OA. A method has been proposed here to localize a joint space area for OA and then classify it in 4 steps to classify OA into KL-Grade 0, KL-Grade 1, KL-Grade 2, KL-Grade 3 and KL-Grade 4, which are preprocessing, segmentation, feature extraction, and classification. In this proposed system, right and left knee detection was performed by employing the Contrast-Limited Adaptive Histogram Equalization (CLAHE) and the template matching. The Gabor kernel, row sum graph and moment methods were used to localize the junction space area of knee. CLAHE is used for preprocessing step, i.e.to normalize the varied intensities. The segmentation process was conducted using the Gabor kernel, template matching, row sum graph and gray level center of mass method. Here GLCM (contrast, correlation, energy, and homogeinity) features were employed as training data. Overall, 50 data were evaluated for training and 258 data for testing. Experimental results showed the best performance by using gabor kernel with parameters α=8, θ=0, Ψ=[0 π/2], γ=0,8, N=4 and with number of iterations being 5000, momentum value 0.5 and α0=0.6 for the classification process. The run gave classification accuracy rate of 93.8% for KL-Grade 0, 70% for KL-Grade 1, 4% for KL-Grade 2, 10% for KL-Grade 3 and 88.9% for KL-Grade 4.
Anifah, Lilik; Purnama, I Ketut Eddy; Hariadi, Mochamad; Purnomo, Mauridhi Hery
2013-01-01
Localization is the first step in osteoarthritis (OA) classification. Manual classification, however, is time-consuming, tedious, and expensive. The proposed system is designed as decision support system for medical doctors to classify the severity of knee OA. A method has been proposed here to localize a joint space area for OA and then classify it in 4 steps to classify OA into KL-Grade 0, KL-Grade 1, KL-Grade 2, KL-Grade 3 and KL-Grade 4, which are preprocessing, segmentation, feature extraction, and classification. In this proposed system, right and left knee detection was performed by employing the Contrast-Limited Adaptive Histogram Equalization (CLAHE) and the template matching. The Gabor kernel, row sum graph and moment methods were used to localize the junction space area of knee. CLAHE is used for preprocessing step, i.e.to normalize the varied intensities. The segmentation process was conducted using the Gabor kernel, template matching, row sum graph and gray level center of mass method. Here GLCM (contrast, correlation, energy, and homogeinity) features were employed as training data. Overall, 50 data were evaluated for training and 258 data for testing. Experimental results showed the best performance by using gabor kernel with parameters α=8, θ=0, Ψ=[0 π/2], γ=0,8, N=4 and with number of iterations being 5000, momentum value 0.5 and α0=0.6 for the classification process. The run gave classification accuracy rate of 93.8% for KL-Grade 0, 70% for KL-Grade 1, 4% for KL-Grade 2, 10% for KL-Grade 3 and 88.9% for KL-Grade 4. PMID:23525188
NASA Astrophysics Data System (ADS)
Zhou, Hang
Quantum walks are the quantum mechanical analogue of classical random walks. Discrete-time quantum walks have been introduced and studied mostly on the line Z or higher dimensional space Zd but rarely defined on graphs with fractal dimensions because the coin operator depends on the position and the Fourier transform on the fractals is not defined. Inspired by its nature of classical walks, different quantum walks will be defined by choosing different shift and coin operators. When the coin operator is uniform, the results of classical walks will be obtained upon measurement at each step. Moreover, with measurement at each step, our results reveal more information about the classical random walks. In this dissertation, two graphs with fractal dimensions will be considered. The first one is Sierpinski gasket, a degree-4 regular graph with Hausdorff dimension of df = ln 3/ ln 2. The second is the Cantor graph derived like Cantor set, with Hausdorff dimension of df = ln 2/ ln 3. The definitions and amplitude functions of the quantum walks will be introduced. The main part of this dissertation is to derive a recursive formula to compute the amplitude Green function. The exiting probability will be computed and compared with the classical results. When the generation of graphs goes to infinity, the recursion of the walks will be investigated and the convergence rates will be obtained and compared with the classical counterparts.
The coevolution of innovation and technical intelligence in primates
Street, Sally E.; Whalen, Andrew; Laland, Kevin N.
2016-01-01
In birds and primates, the frequency of behavioural innovation has been shown to covary with absolute and relative brain size, leading to the suggestion that large brains allow animals to innovate, and/or that selection for innovativeness, together with social learning, may have driven brain enlargement. We examined the relationship between primate brain size and both technical (i.e. tool using) and non-technical innovation, deploying a combination of phylogenetically informed regression and exploratory causal graph analyses. Regression analyses revealed that absolute and relative brain size correlated positively with technical innovation, and exhibited consistently weaker, but still positive, relationships with non-technical innovation. These findings mirror similar results in birds. Our exploratory causal graph analyses suggested that technical innovation shares strong direct relationships with brain size, body size, social learning rate and social group size, whereas non-technical innovation did not exhibit a direct relationship with brain size. Nonetheless, non-technical innovation was linked to brain size indirectly via diet and life-history variables. Our findings support ‘technical intelligence’ hypotheses in linking technical innovation to encephalization in the restricted set of primate lineages where technical innovation has been reported. Our findings also provide support for a broad co-evolving complex of brain, behaviour, life-history, social and dietary variables, providing secondary support for social and ecological intelligence hypotheses. The ability to gain access to difficult-to-extract, but potentially nutrient-rich, resources through tool use may have conferred on some primates adaptive advantages, leading to selection for brain circuitry that underlies technical proficiency. PMID:26926276
The coevolution of innovation and technical intelligence in primates.
Navarrete, Ana F; Reader, Simon M; Street, Sally E; Whalen, Andrew; Laland, Kevin N
2016-03-19
In birds and primates, the frequency of behavioural innovation has been shown to covary with absolute and relative brain size, leading to the suggestion that large brains allow animals to innovate, and/or that selection for innovativeness, together with social learning, may have driven brain enlargement. We examined the relationship between primate brain size and both technical (i.e. tool using) and non-technical innovation, deploying a combination of phylogenetically informed regression and exploratory causal graph analyses. Regression analyses revealed that absolute and relative brain size correlated positively with technical innovation, and exhibited consistently weaker, but still positive, relationships with non-technical innovation. These findings mirror similar results in birds. Our exploratory causal graph analyses suggested that technical innovation shares strong direct relationships with brain size, body size, social learning rate and social group size, whereas non-technical innovation did not exhibit a direct relationship with brain size. Nonetheless, non-technical innovation was linked to brain size indirectly via diet and life-history variables. Our findings support 'technical intelligence' hypotheses in linking technical innovation to encephalization in the restricted set of primate lineages where technical innovation has been reported. Our findings also provide support for a broad co-evolving complex of brain, behaviour, life-history, social and dietary variables, providing secondary support for social and ecological intelligence hypotheses. The ability to gain access to difficult-to-extract, but potentially nutrient-rich, resources through tool use may have conferred on some primates adaptive advantages, leading to selection for brain circuitry that underlies technical proficiency. © 2016 The Author(s).
Parsing Flowcharts and Series-Parallel Graphs
1978-11-01
descriptions of the graph. This possible multiplicity is undesirable in most practical applications, a fact that makes parti%:ularly useful reduction...to parse TT networks, some of the features that make this parsing method useful in other cases are more natually introduced in the context of this...as Figure 4.5 shows. This multiplicity is due to the associativity of consecutive Two Terminal Series and Two Terminal Parallel compositions. In spite
Rapid tuning shifts in human auditory cortex enhance speech intelligibility
Holdgraf, Christopher R.; de Heer, Wendy; Pasley, Brian; Rieger, Jochem; Crone, Nathan; Lin, Jack J.; Knight, Robert T.; Theunissen, Frédéric E.
2016-01-01
Experience shapes our perception of the world on a moment-to-moment basis. This robust perceptual effect of experience parallels a change in the neural representation of stimulus features, though the nature of this representation and its plasticity are not well-understood. Spectrotemporal receptive field (STRF) mapping describes the neural response to acoustic features, and has been used to study contextual effects on auditory receptive fields in animal models. We performed a STRF plasticity analysis on electrophysiological data from recordings obtained directly from the human auditory cortex. Here, we report rapid, automatic plasticity of the spectrotemporal response of recorded neural ensembles, driven by previous experience with acoustic and linguistic information, and with a neurophysiological effect in the sub-second range. This plasticity reflects increased sensitivity to spectrotemporal features, enhancing the extraction of more speech-like features from a degraded stimulus and providing the physiological basis for the observed ‘perceptual enhancement' in understanding speech. PMID:27996965
Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin
2015-01-01
The aim of this study was to evaluate computer aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identification of thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interests (ROIs) in three normalization schemes (default, 3s and 1%-99%). Then features by the lowest probability of classification error and average correlation coefficients (POE+ACC), and Fisher coefficient (Fisher) eliminated to 10 best and most effective features. These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. First Nearest-Neighbour (1-NN) classifier was performed for the features resulting from PCA and LDA. NDA features were classified by artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used for examining the performance of TA methods. The best results were driven in 1-99% normalization with features extracted by POE+ACC algorithm and analyzed by NDA with the area under the ROC curve ( Az) of 0.9722 which correspond to sensitivity of 94.45%, specificity of 100%, and accuracy of 97.14%. Our results indicate that TA is a reliable method, can provide useful information help radiologist in detection and classification of benign and malignant thyroid nodules.
Leveraging graph topology and semantic context for pharmacovigilance through twitter-streams.
Eshleman, Ryan; Singh, Rahul
2016-10-06
Adverse drug events (ADEs) constitute one of the leading causes of post-therapeutic death and their identification constitutes an important challenge of modern precision medicine. Unfortunately, the onset and effects of ADEs are often underreported complicating timely intervention. At over 500 million posts per day, Twitter is a commonly used social media platform. The ubiquity of day-to-day personal information exchange on Twitter makes it a promising target for data mining for ADE identification and intervention. Three technical challenges are central to this problem: (1) identification of salient medical keywords in (noisy) tweets, (2) mapping drug-effect relationships, and (3) classification of such relationships as adverse or non-adverse. We use a bipartite graph-theoretic representation called a drug-effect graph (DEG) for modeling drug and side effect relationships by representing the drugs and side effects as vertices. We construct individual DEGs on two data sources. The first DEG is constructed from the drug-effect relationships found in FDA package inserts as recorded in the SIDER database. The second DEG is constructed by mining the history of Twitter users. We use dictionary-based information extraction to identify medically-relevant concepts in tweets. Drugs, along with co-occurring symptoms are connected with edges weighted by temporal distance and frequency. Finally, information from the SIDER DEG is integrate with the Twitter DEG and edges are classified as either adverse or non-adverse using supervised machine learning. We examine both graph-theoretic and semantic features for the classification task. The proposed approach can identify adverse drug effects with high accuracy with precision exceeding 85 % and F1 exceeding 81 %. When compared with leading methods at the state-of-the-art, which employ un-enriched graph-theoretic analysis alone, our method leads to improvements ranging between 5 and 8 % in terms of the aforementioned measures. Additionally, we employ our method to discover several ADEs which, though present in medical literature and Twitter-streams, are not represented in the SIDER databases. We present a DEG integration model as a powerful formalism for the analysis of drug-effect relationships that is general enough to accommodate diverse data sources, yet rigorous enough to provide a strong mechanism for ADE identification.
Model-based multiple patterning layout decomposition
NASA Astrophysics Data System (ADS)
Guo, Daifeng; Tian, Haitong; Du, Yuelin; Wong, Martin D. F.
2015-10-01
As one of the most promising next generation lithography technologies, multiple patterning lithography (MPL) plays an important role in the attempts to keep in pace with 10 nm technology node and beyond. With feature size keeps shrinking, it has become impossible to print dense layouts within one single exposure. As a result, MPL such as double patterning lithography (DPL) and triple patterning lithography (TPL) has been widely adopted. There is a large volume of literature on DPL/TPL layout decomposition, and the current approach is to formulate the problem as a classical graph-coloring problem: Layout features (polygons) are represented by vertices in a graph G and there is an edge between two vertices if and only if the distance between the two corresponding features are less than a minimum distance threshold value dmin. The problem is to color the vertices of G using k colors (k = 2 for DPL, k = 3 for TPL) such that no two vertices connected by an edge are given the same color. This is a rule-based approach, which impose a geometric distance as a minimum constraint to simply decompose polygons within the distance into different masks. It is not desired in practice because this criteria cannot completely capture the behavior of the optics. For example, it lacks of sufficient information such as the optical source characteristics and the effects between the polygons outside the minimum distance. To remedy the deficiency, a model-based layout decomposition approach to make the decomposition criteria base on simulation results was first introduced at SPIE 2013.1 However, the algorithm1 is based on simplified assumption on the optical simulation model and therefore its usage on real layouts is limited. Recently AMSL2 also proposed a model-based approach to layout decomposition by iteratively simulating the layout, which requires excessive computational resource and may lead to sub-optimal solutions. The approach2 also potentially generates too many stiches. In this paper, we propose a model-based MPL layout decomposition method using a pre-simulated library of frequent layout patterns. Instead of using the graph G in the standard graph-coloring formulation, we build an expanded graph H where each vertex represents a group of adjacent features together with a coloring solution. By utilizing the library and running sophisticated graph algorithms on H, our approach can obtain optimal decomposition results efficiently. Our model-based solution can achieve a practical mask design which significantly improves the lithography quality on the wafer compared to the rule based decomposition.
SAGE: String-overlap Assembly of GEnomes.
Ilie, Lucian; Haider, Bahlul; Molnar, Michael; Solis-Oba, Roberto
2014-09-15
De novo genome assembly of next-generation sequencing data is one of the most important current problems in bioinformatics, essential in many biological applications. In spite of significant amount of work in this area, better solutions are still very much needed. We present a new program, SAGE, for de novo genome assembly. As opposed to most assemblers, which are de Bruijn graph based, SAGE uses the string-overlap graph. SAGE builds upon great existing work on string-overlap graph and maximum likelihood assembly, bringing an important number of new ideas, such as the efficient computation of the transitive reduction of the string overlap graph, the use of (generalized) edge multiplicity statistics for more accurate estimation of read copy counts, and the improved use of mate pairs and min-cost flow for supporting edge merging. The assemblies produced by SAGE for several short and medium-size genomes compared favourably with those of existing leading assemblers. SAGE benefits from innovations in almost every aspect of the assembly process: error correction of input reads, string-overlap graph construction, read copy counts estimation, overlap graph analysis and reduction, contig extraction, and scaffolding. We hope that these new ideas will help advance the current state-of-the-art in an essential area of research in genomics.
Edge length dynamics on graphs with applications to p-adic AdS/CFT
Gubser, Steven S.; Heydeman, Matthew; Jepsen, Christian; ...
2017-06-30
We formulate a Euclidean theory of edge length dynamics based on a notion of Ricci curvature on graphs with variable edge lengths. In order to write an explicit form for the discrete analog of the Einstein-Hilbert action, we require that the graph should either be a tree or that all its cycles should be sufficiently long. The infinite regular tree with all edge lengths equal is an example of a graph with constant negative curvature, providing a connection with p-adic AdS/CFT, where such a tree takes the place of anti-de Sitter space. Here, we compute simple correlators of the operatormore » holographically dual to edge length fluctuations. This operator has dimension equal to the dimension of the boundary, and it has some features in common with the stress tensor.« less
Edge length dynamics on graphs with applications to p-adic AdS/CFT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gubser, Steven S.; Heydeman, Matthew; Jepsen, Christian
We formulate a Euclidean theory of edge length dynamics based on a notion of Ricci curvature on graphs with variable edge lengths. In order to write an explicit form for the discrete analog of the Einstein-Hilbert action, we require that the graph should either be a tree or that all its cycles should be sufficiently long. The infinite regular tree with all edge lengths equal is an example of a graph with constant negative curvature, providing a connection with p-adic AdS/CFT, where such a tree takes the place of anti-de Sitter space. Here, we compute simple correlators of the operatormore » holographically dual to edge length fluctuations. This operator has dimension equal to the dimension of the boundary, and it has some features in common with the stress tensor.« less
NASA Astrophysics Data System (ADS)
Hong, Pengyu; Sun, Hui; Sha, Long; Pu, Yi; Khatri, Kshitij; Yu, Xiang; Tang, Yang; Lin, Cheng
2017-08-01
A major challenge in glycomics is the characterization of complex glycan structures that are essential for understanding their diverse roles in many biological processes. We present a novel efficient computational approach, named GlycoDeNovo, for accurate elucidation of the glycan topologies from their tandem mass spectra. Given a spectrum, GlycoDeNovo first builds an interpretation-graph specifying how to interpret each peak using preceding interpreted peaks. It then reconstructs the topologies of peaks that contribute to interpreting the precursor ion. We theoretically prove that GlycoDeNovo is highly efficient. A major innovative feature added to GlycoDeNovo is a data-driven IonClassifier which can be used to effectively rank candidate topologies. IonClassifier is automatically learned from experimental spectra of known glycans to distinguish B- and C-type ions from all other ion types. Our results showed that GlycoDeNovo is robust and accurate for topology reconstruction of glycans from their tandem mass spectra. [Figure not available: see fulltext.
Branching dynamics of viral information spreading.
Iribarren, José Luis; Moro, Esteban
2011-10-01
Despite its importance for rumors or innovations propagation, peer-to-peer collaboration, social networking, or marketing, the dynamics of information spreading is not well understood. Since the diffusion depends on the heterogeneous patterns of human behavior and is driven by the participants' decisions, its propagation dynamics shows surprising properties not explained by traditional epidemic or contagion models. Here we present a detailed analysis of our study of real viral marketing campaigns where tracking the propagation of a controlled message allowed us to analyze the structure and dynamics of a diffusion graph involving over 31,000 individuals. We found that information spreading displays a non-Markovian branching dynamics that can be modeled by a two-step Bellman-Harris branching process that generalizes the static models known in the literature and incorporates the high variability of human behavior. It explains accurately all the features of information propagation under the "tipping point" and can be used for prediction and management of viral information spreading processes.
Branching dynamics of viral information spreading
NASA Astrophysics Data System (ADS)
Iribarren, José Luis; Moro, Esteban
2011-10-01
Despite its importance for rumors or innovations propagation, peer-to-peer collaboration, social networking, or marketing, the dynamics of information spreading is not well understood. Since the diffusion depends on the heterogeneous patterns of human behavior and is driven by the participants’ decisions, its propagation dynamics shows surprising properties not explained by traditional epidemic or contagion models. Here we present a detailed analysis of our study of real viral marketing campaigns where tracking the propagation of a controlled message allowed us to analyze the structure and dynamics of a diffusion graph involving over 31 000 individuals. We found that information spreading displays a non-Markovian branching dynamics that can be modeled by a two-step Bellman-Harris branching process that generalizes the static models known in the literature and incorporates the high variability of human behavior. It explains accurately all the features of information propagation under the “tipping point” and can be used for prediction and management of viral information spreading processes.
Tsatsishvili, Valeri; Burunat, Iballa; Cong, Fengyu; Toiviainen, Petri; Alluri, Vinoo; Ristaniemi, Tapani
2018-06-01
There has been growing interest towards naturalistic neuroimaging experiments, which deepen our understanding of how human brain processes and integrates incoming streams of multifaceted sensory information, as commonly occurs in real world. Music is a good example of such complex continuous phenomenon. In a few recent fMRI studies examining neural correlates of music in continuous listening settings, multiple perceptual attributes of music stimulus were represented by a set of high-level features, produced as the linear combination of the acoustic descriptors computationally extracted from the stimulus audio. NEW METHOD: fMRI data from naturalistic music listening experiment were employed here. Kernel principal component analysis (KPCA) was applied to acoustic descriptors extracted from the stimulus audio to generate a set of nonlinear stimulus features. Subsequently, perceptual and neural correlates of the generated high-level features were examined. The generated features captured musical percepts that were hidden from the linear PCA features, namely Rhythmic Complexity and Event Synchronicity. Neural correlates of the new features revealed activations associated to processing of complex rhythms, including auditory, motor, and frontal areas. Results were compared with the findings in the previously published study, which analyzed the same fMRI data but applied linear PCA for generating stimulus features. To enable comparison of the results, methodology for finding stimulus-driven functional maps was adopted from the previous study. Exploiting nonlinear relationships among acoustic descriptors can lead to the novel high-level stimulus features, which can in turn reveal new brain structures involved in music processing. Copyright © 2018 Elsevier B.V. All rights reserved.
Model-Driven Development for scientific computing. An upgrade of the RHEEDGr program
NASA Astrophysics Data System (ADS)
Daniluk, Andrzej
2009-11-01
Model-Driven Engineering (MDE) is the software engineering discipline, which considers models as the most important element for software development, and for the maintenance and evolution of software, through model transformation. Model-Driven Architecture (MDA) is the approach for software development under the Model-Driven Engineering framework. This paper surveys the core MDA technology that was used to upgrade of the RHEEDGR program to C++0x language standards. New version program summaryProgram title: RHEEDGR-09 Catalogue identifier: ADUY_v3_0 Program summary URL:
Multi-label literature classification based on the Gene Ontology graph.
Jin, Bo; Muller, Brian; Zhai, Chengxiang; Lu, Xinghua
2008-12-08
The Gene Ontology is a controlled vocabulary for representing knowledge related to genes and proteins in a computable form. The current effort of manually annotating proteins with the Gene Ontology is outpaced by the rate of accumulation of biomedical knowledge in literature, which urges the development of text mining approaches to facilitate the process by automatically extracting the Gene Ontology annotation from literature. The task is usually cast as a text classification problem, and contemporary methods are confronted with unbalanced training data and the difficulties associated with multi-label classification. In this research, we investigated the methods of enhancing automatic multi-label classification of biomedical literature by utilizing the structure of the Gene Ontology graph. We have studied three graph-based multi-label classification algorithms, including a novel stochastic algorithm and two top-down hierarchical classification methods for multi-label literature classification. We systematically evaluated and compared these graph-based classification algorithms to a conventional flat multi-label algorithm. The results indicate that, through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods can significantly improve predictions of the Gene Ontology terms implied by the analyzed text. Furthermore, the graph-based multi-label classifiers are capable of suggesting Gene Ontology annotations (to curators) that are closely related to the true annotations even if they fail to predict the true ones directly. A software package implementing the studied algorithms is available for the research community. Through utilizing the information from the structure of the Gene Ontology graph, the graph-based multi-label classification methods have better potential than the conventional flat multi-label classification approach to facilitate protein annotation based on the literature.
Time-dependence of graph theory metrics in functional connectivity analysis
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J.; Haneef, Zulfi; Stern, John M.
2016-01-01
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. PMID:26518632
Lung vessel segmentation in CT images using graph-cuts
NASA Astrophysics Data System (ADS)
Zhai, Zhiwei; Staring, Marius; Stoel, Berend C.
2016-03-01
Accurate lung vessel segmentation is an important operation for lung CT analysis. Filters that are based on analyzing the eigenvalues of the Hessian matrix are popular for pulmonary vessel enhancement. However, due to their low response at vessel bifurcations and vessel boundaries, extracting lung vessels by thresholding the vesselness is not sufficiently accurate. Some methods turn to graph-cuts for more accurate segmentation, as it incorporates neighbourhood information. In this work, we propose a new graph-cuts cost function combining appearance and shape, where CT intensity represents appearance and vesselness from a Hessian-based filter represents shape. Due to the amount of voxels in high resolution CT scans, the memory requirement and time consumption for building a graph structure is very high. In order to make the graph representation computationally tractable, those voxels that are considered clearly background are removed from the graph nodes, using a threshold on the vesselness map. The graph structure is then established based on the remaining voxel nodes, source/sink nodes and the neighbourhood relationship of the remaining voxels. Vessels are segmented by minimizing the energy cost function with the graph-cuts optimization framework. We optimized the parameters used in the graph-cuts cost function and evaluated the proposed method with two manually labeled sub-volumes. For independent evaluation, we used 20 CT scans of the VESSEL12 challenge. The evaluation results of the sub-volume data show that the proposed method produced a more accurate vessel segmentation compared to the previous methods, with F1 score 0.76 and 0.69. In the VESSEL12 data-set, our method obtained a competitive performance with an area under the ROC curve of 0.975, especially among the binary submissions.
Time-dependence of graph theory metrics in functional connectivity analysis.
Chiang, Sharon; Cassese, Alberto; Guindani, Michele; Vannucci, Marina; Yeh, Hsiang J; Haneef, Zulfi; Stern, John M
2016-01-15
Brain graphs provide a useful way to computationally model the network structure of the connectome, and this has led to increasing interest in the use of graph theory to quantitate and investigate the topological characteristics of the healthy brain and brain disorders on the network level. The majority of graph theory investigations of functional connectivity have relied on the assumption of temporal stationarity. However, recent evidence increasingly suggests that functional connectivity fluctuates over the length of the scan. In this study, we investigate the stationarity of brain network topology using a Bayesian hidden Markov model (HMM) approach that estimates the dynamic structure of graph theoretical measures of whole-brain functional connectivity. In addition to extracting the stationary distribution and transition probabilities of commonly employed graph theory measures, we propose two estimators of temporal stationarity: the S-index and N-index. These indexes can be used to quantify different aspects of the temporal stationarity of graph theory measures. We apply the method and proposed estimators to resting-state functional MRI data from healthy controls and patients with temporal lobe epilepsy. Our analysis shows that several graph theory measures, including small-world index, global integration measures, and betweenness centrality, may exhibit greater stationarity over time and therefore be more robust. Additionally, we demonstrate that accounting for subject-level differences in the level of temporal stationarity of network topology may increase discriminatory power in discriminating between disease states. Our results confirm and extend findings from other studies regarding the dynamic nature of functional connectivity, and suggest that using statistical models which explicitly account for the dynamic nature of functional connectivity in graph theory analyses may improve the sensitivity of investigations and consistency across investigations. Copyright © 2015 Elsevier Inc. All rights reserved.
Retina verification system based on biometric graph matching.
Lajevardi, Seyed Mehdi; Arakala, Arathi; Davis, Stephen A; Horadam, Kathy J
2013-09-01
This paper presents an automatic retina verification framework based on the biometric graph matching (BGM) algorithm. The retinal vasculature is extracted using a family of matched filters in the frequency domain and morphological operators. Then, retinal templates are defined as formal spatial graphs derived from the retinal vasculature. The BGM algorithm, a noisy graph matching algorithm, robust to translation, non-linear distortion, and small rotations, is used to compare retinal templates. The BGM algorithm uses graph topology to define three distance measures between a pair of graphs, two of which are new. A support vector machine (SVM) classifier is used to distinguish between genuine and imposter comparisons. Using single as well as multiple graph measures, the classifier achieves complete separation on a training set of images from the VARIA database (60% of the data), equaling the state-of-the-art for retina verification. Because the available data set is small, kernel density estimation (KDE) of the genuine and imposter score distributions of the training set are used to measure performance of the BGM algorithm. In the one dimensional case, the KDE model is validated with the testing set. A 0 EER on testing shows that the KDE model is a good fit for the empirical distribution. For the multiple graph measures, a novel combination of the SVM boundary and the KDE model is used to obtain a fair comparison with the KDE model for the single measure. A clear benefit in using multiple graph measures over a single measure to distinguish genuine and imposter comparisons is demonstrated by a drop in theoretical error of between 60% and more than two orders of magnitude.
Lin, Jimmy
2008-01-01
Background Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually-created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed® search interface, a MEDLINE® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web. Results We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics. Conclusion The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain. PMID:18538027
VLSI Design of SVM-Based Seizure Detection System With On-Chip Learning Capability.
Feng, Lichen; Li, Zunchao; Wang, Yuanfa
2018-02-01
Portable automatic seizure detection system is very convenient for epilepsy patients to carry. In order to make the system on-chip trainable with high efficiency and attain high detection accuracy, this paper presents a very large scale integration (VLSI) design based on the nonlinear support vector machine (SVM). The proposed design mainly consists of a feature extraction (FE) module and an SVM module. The FE module performs the three-level Daubechies discrete wavelet transform to fit the physiological bands of the electroencephalogram (EEG) signal and extracts the time-frequency domain features reflecting the nonstationary signal properties. The SVM module integrates the modified sequential minimal optimization algorithm with the table-driven-based Gaussian kernel to enable efficient on-chip learning. The presented design is verified on an Altera Cyclone II field-programmable gate array and tested using the two publicly available EEG datasets. Experiment results show that the designed VLSI system improves the detection accuracy and training efficiency.
Design Challenges of a Rapid Cycling Synchrotron for Carbon/Proton Therapy
NASA Astrophysics Data System (ADS)
Cook, Nathan
2012-03-01
The growing interest in radiation therapy with protons and light ions has driven demand for new methods of ion acceleration and the delivery of ion beams. One exciting new platform for ion beam acceleration and delivery is the rapid cycling synchrotron. Operating at 15Hz, rapid cycling achieves faster treatment times by making beam extraction possible at any energy during the cycle. Moreover, risk to the patient is reduced by requiring fewer particles in the beam line at a given time, thus eliminating the need for passive filtering and reducing the consequences of a malfunction. Lastly, the ability to switch between carbon ion and proton beam therapy provides the machine with an unmatched flexibility. However, these features do stipulate challenges in accelerator design. Maintaining a compact lattice requires careful tuning of lattice functions, tight focusing combined function magnets, and fast injection and extraction systems. Providing the necessary acceleration over a short cycle time also necessitates a five-fold frequency swing for carbon ions, further burdening the design requirements of ferrite-driven radiofrequency cavities. We will consider these challenges as well as some solutions selected for our current design.
Building Specialized Multilingual Lexical Graphs Using Community Resources
NASA Astrophysics Data System (ADS)
Daoud, Mohammad; Boitet, Christian; Kageura, Kyo; Kitamoto, Asanobu; Mangeot, Mathieu; Daoud, Daoud
We are describing methods for compiling domain-dedicated multilingual terminological data from various resources. We focus on collecting data from online community users as a main source, therefore, our approach depends on acquiring contributions from volunteers (explicit approach), and it depends on analyzing users' behaviors to extract interesting patterns and facts (implicit approach). As a generic repository that can handle the collected multilingual terminological data, we are describing the concept of dedicated Multilingual Preterminological Graphs MPGs, and some automatic approaches for constructing them by analyzing the behavior of online community users. A Multilingual Preterminological Graph is a special lexical resource that contains massive amount of terms related to a special domain. We call it preterminological, because it is a raw material that can be used to build a standardized terminological repository. Building such a graph is difficult using traditional approaches, as it needs huge efforts by domain specialists and terminologists. In our approach, we build such a graph by analyzing the access log files of the website of the community, and by finding the important terms that have been used to search in that website, and their association with each other. We aim at making this graph as a seed repository so multilingual volunteers can contribute. We are experimenting this approach with the Digital Silk Road Project. We have used its access log files since its beginning in 2003, and obtained an initial graph of around 116000 terms. As an application, we used this graph to obtain a preterminological multilingual database that is serving a CLIR system for the DSR project.
NASA Astrophysics Data System (ADS)
Khazaeli, S.; Ravandi, A. G.; Banerji, S.; Bagchi, A.
2016-04-01
Recently, data-driven models for Structural Health Monitoring (SHM) have been of great interest among many researchers. In data-driven models, the sensed data are processed to determine the structural performance and evaluate the damages of an instrumented structure without necessitating the mathematical modeling of the structure. A framework of data-driven models for online assessment of the condition of a structure has been developed here. The developed framework is intended for automated evaluation of the monitoring data and structural performance by the Internet technology and resources. The main challenges in developing such framework include: (a) utilizing the sensor measurements to estimate and localize the induced damage in a structure by means of signal processing and data mining techniques, and (b) optimizing the computing and storage resources with the aid of cloud services. The main focus in this paper is to demonstrate the efficiency of the proposed framework for real-time damage detection of a multi-story shear-building structure in two damage scenarios (change in mass and stiffness) in various locations. Several features are extracted from the sensed data by signal processing techniques and statistical methods. Machine learning algorithms are deployed to select damage-sensitive features as well as classifying the data to trace the anomaly in the response of the structure. Here, the cloud computing resources from Amazon Web Services (AWS) have been used to implement the proposed framework.
A method for the extraction and quantitation of phycoerythrin from algae
NASA Technical Reports Server (NTRS)
Stewart, D. E.
1982-01-01
A summary of a new technique for the extraction and quantitation of phycoerythrin (PHE) from algal samples is described. Results of analysis of four extracts representing three PHE types from algae including cryptomonad and cyanophyte types are presented. The method of extraction and an equation for quantitation are given. A graph showing the relationship of concentration and fluorescence units that may be used with samples fluorescing around 575-580 nm (probably dominated by cryptophytes in estuarine waters) and 560 nm (dominated by cyanophytes characteristics of the open ocean) is provided.
Deep Learning in Label-free Cell Classification
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia
Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individualmore » cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. In conclusion, this system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.« less
Deep Learning in Label-free Cell Classification
Chen, Claire Lifan; Mahjoubfar, Ata; Tai, Li-Chia; ...
2016-03-15
Label-free cell analysis is essential to personalized genomics, cancer diagnostics, and drug development as it avoids adverse effects of staining reagents on cellular viability and cell signaling. However, currently available label-free cell assays mostly rely only on a single feature and lack sufficient differentiation. Also, the sample size analyzed by these assays is limited due to their low throughput. Here, we integrate feature extraction and deep learning with high-throughput quantitative imaging enabled by photonic time stretch, achieving record high accuracy in label-free cell classification. Our system captures quantitative optical phase and intensity images and extracts multiple biophysical features of individualmore » cells. These biophysical measurements form a hyperdimensional feature space in which supervised learning is performed for cell classification. We compare various learning algorithms including artificial neural network, support vector machine, logistic regression, and a novel deep learning pipeline, which adopts global optimization of receiver operating characteristics. As a validation of the enhanced sensitivity and specificity of our system, we show classification of white blood T-cells against colon cancer cells, as well as lipid accumulating algal strains for biofuel production. In conclusion, this system opens up a new path to data-driven phenotypic diagnosis and better understanding of the heterogeneous gene expressions in cells.« less
Classification of iRBD and Parkinson's disease patients based on eye movements during sleep.
Christensen, Julie A E; Koch, Henriette; Frandsen, Rune; Kempfner, Jacob; Arvastson, Lars; Christensen, Soren R; Sorensen, Helge B D; Jennum, Poul
2013-01-01
Patients suffering from the sleep disorder idiopathic rapid-eye-movement sleep behavior disorder (iRBD) have been observed to be in high risk of developing Parkinson's disease (PD). This makes it essential to analyze them in the search for PD biomarkers. This study aims at classifying patients suffering from iRBD or PD based on features reflecting eye movements (EMs) during sleep. A Latent Dirichlet Allocation (LDA) topic model was developed based on features extracted from two electrooculographic (EOG) signals measured as parts in full night polysomnographic (PSG) recordings from ten control subjects. The trained model was tested on ten other control subjects, ten iRBD patients and ten PD patients, obtaining a EM topic mixture diagram for each subject in the test dataset. Three features were extracted from the topic mixture diagrams, reflecting "certainty", "fragmentation" and "stability" in the timely distribution of the EM topics. Using a Naive Bayes (NB) classifier and the features "certainty" and "stability" yielded the best classification result and the subjects were classified with a sensitivity of 95 %, a specificity of 80% and an accuracy of 90 %. This study demonstrates in a data-driven approach, that iRBD and PD patients may exhibit abnorm form and/or timely distribution of EMs during sleep.
Integrated segmentation and recognition of connected Ottoman script
NASA Astrophysics Data System (ADS)
Yalniz, Ismet Zeki; Altingovde, Ismail Sengor; Güdükbay, Uğur; Ulusoy, Özgür
2009-11-01
We propose a novel context-sensitive segmentation and recognition method for connected letters in Ottoman script. This method first extracts a set of segments from a connected script and determines the candidate letters to which extracted segments are most similar. Next, a function is defined for scoring each different syntactically correct sequence of these candidate letters. To find the candidate letter sequence that maximizes the score function, a directed acyclic graph is constructed. The letters are finally recognized by computing the longest path in this graph. Experiments using a collection of printed Ottoman documents reveal that the proposed method provides >90% precision and recall figures in terms of character recognition. In a further set of experiments, we also demonstrate that the framework can be used as a building block for an information retrieval system for digital Ottoman archives.
Loops in hierarchical channel networks
NASA Astrophysics Data System (ADS)
Katifori, Eleni; Magnasco, Marcelo
2012-02-01
Nature provides us with many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture. Although a number of methods have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated and natural graphs extracted from digitized images of dicotyledonous leaves and animal vasculature. We calculate various metrics on the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs.
Constructing a Graph Database for Semantic Literature-Based Discovery.
Hristovski, Dimitar; Kastrin, Andrej; Dinevski, Dejan; Rindflesch, Thomas C
2015-01-01
Literature-based discovery (LBD) generates discoveries, or hypotheses, by combining what is already known in the literature. Potential discoveries have the form of relations between biomedical concepts; for example, a drug may be determined to treat a disease other than the one for which it was intended. LBD views the knowledge in a domain as a network; a set of concepts along with the relations between them. As a starting point, we used SemMedDB, a database of semantic relations between biomedical concepts extracted with SemRep from Medline. SemMedDB is distributed as a MySQL relational database, which has some problems when dealing with network data. We transformed and uploaded SemMedDB into the Neo4j graph database, and implemented the basic LBD discovery algorithms with the Cypher query language. We conclude that storing the data needed for semantic LBD is more natural in a graph database. Also, implementing LBD discovery algorithms is conceptually simpler with a graph query language when compared with standard SQL.
Exploring the evolution of London's street network in the information space: A dual approach
NASA Astrophysics Data System (ADS)
Masucci, A. Paolo; Stanilov, Kiril; Batty, Michael
2014-01-01
We study the growth of London's street network in its dual representation, as the city has evolved over the past 224 years. The dual representation of a planar graph is a content-based network, where each node is a set of edges of the planar graph and represents a transportation unit in the so-called information space, i.e., the space where information is handled in order to navigate through the city. First, we discuss a novel hybrid technique to extract dual graphs from planar graphs, called the hierarchical intersection continuity negotiation principle. Then we show that the growth of the network can be analytically described by logistic laws and that the topological properties of the network are governed by robust log-normal distributions characterizing the network's connectivity and small-world properties that are consistent over time. Moreover, we find that the double-Pareto-like distributions for the connectivity emerge for major roads and can be modeled via a stochastic content-based network model using simple space-filling principles.
Traxl, Dominik; Boers, Niklas; Kurths, Jürgen
2016-06-01
Network theory has proven to be a powerful tool in describing and analyzing systems by modelling the relations between their constituent objects. Particularly in recent years, a great progress has been made by augmenting "traditional" network theory in order to account for the multiplex nature of many networks, multiple types of connections between objects, the time-evolution of networks, networks of networks and other intricacies. However, existing network representations still lack crucial features in order to serve as a general data analysis tool. These include, most importantly, an explicit association of information with possibly heterogeneous types of objects and relations, and a conclusive representation of the properties of groups of nodes as well as the interactions between such groups on different scales. In this paper, we introduce a collection of definitions resulting in a framework that, on the one hand, entails and unifies existing network representations (e.g., network of networks and multilayer networks), and on the other hand, generalizes and extends them by incorporating the above features. To implement these features, we first specify the nodes and edges of a finite graph as sets of properties (which are permitted to be arbitrary mathematical objects). Second, the mathematical concept of partition lattices is transferred to the network theory in order to demonstrate how partitioning the node and edge set of a graph into supernodes and superedges allows us to aggregate, compute, and allocate information on and between arbitrary groups of nodes. The derived partition lattice of a graph, which we denote by deep graph, constitutes a concise, yet comprehensive representation that enables the expression and analysis of heterogeneous properties, relations, and interactions on all scales of a complex system in a self-contained manner. Furthermore, to be able to utilize existing network-based methods and models, we derive different representations of multilayer networks from our framework and demonstrate the advantages of our representation. On the basis of the formal framework described here, we provide a rich, fully scalable (and self-explanatory) software package that integrates into the PyData ecosystem and offers interfaces to popular network packages, making it a powerful, general-purpose data analysis toolkit. We exemplify an application of deep graphs using a real world dataset, comprising 16 years of satellite-derived global precipitation measurements. We deduce a deep graph representation of these measurements in order to track and investigate local formations of spatio-temporal clusters of extreme precipitation events.
Feedback topology and XOR-dynamics in Boolean networks with varying input structure
NASA Astrophysics Data System (ADS)
Ciandrini, L.; Maffi, C.; Motta, A.; Bassetti, B.; Cosentino Lagomarsino, M.
2009-08-01
We analyze a model of fixed in-degree random Boolean networks in which the fraction of input-receiving nodes is controlled by the parameter γ . We investigate analytically and numerically the dynamics of graphs under a parallel XOR updating scheme. This scheme is interesting because it is accessible analytically and its phenomenology is at the same time under control and as rich as the one of general Boolean networks. We give analytical formulas for the dynamics on general graphs, showing that with a XOR-type evolution rule, dynamic features are direct consequences of the topological feedback structure, in analogy with the role of relevant components in Kauffman networks. Considering graphs with fixed in-degree, we characterize analytically and numerically the feedback regions using graph decimation algorithms (Leaf Removal). With varying γ , this graph ensemble shows a phase transition that separates a treelike graph region from one in which feedback components emerge. Networks near the transition point have feedback components made of disjoint loops, in which each node has exactly one incoming and one outgoing link. Using this fact, we provide analytical estimates of the maximum period starting from topological considerations.
Feedback topology and XOR-dynamics in Boolean networks with varying input structure.
Ciandrini, L; Maffi, C; Motta, A; Bassetti, B; Cosentino Lagomarsino, M
2009-08-01
We analyze a model of fixed in-degree random Boolean networks in which the fraction of input-receiving nodes is controlled by the parameter gamma. We investigate analytically and numerically the dynamics of graphs under a parallel XOR updating scheme. This scheme is interesting because it is accessible analytically and its phenomenology is at the same time under control and as rich as the one of general Boolean networks. We give analytical formulas for the dynamics on general graphs, showing that with a XOR-type evolution rule, dynamic features are direct consequences of the topological feedback structure, in analogy with the role of relevant components in Kauffman networks. Considering graphs with fixed in-degree, we characterize analytically and numerically the feedback regions using graph decimation algorithms (Leaf Removal). With varying gamma , this graph ensemble shows a phase transition that separates a treelike graph region from one in which feedback components emerge. Networks near the transition point have feedback components made of disjoint loops, in which each node has exactly one incoming and one outgoing link. Using this fact, we provide analytical estimates of the maximum period starting from topological considerations.
Hunter, Lawrence; Lu, Zhiyong; Firby, James; Baumgartner, William A; Johnson, Helen L; Ogren, Philip V; Cohen, K Bretonnel
2008-01-01
Background Information extraction (IE) efforts are widely acknowledged to be important in harnessing the rapid advance of biomedical knowledge, particularly in areas where important factual information is published in a diverse literature. Here we report on the design, implementation and several evaluations of OpenDMAP, an ontology-driven, integrated concept analysis system. It significantly advances the state of the art in information extraction by leveraging knowledge in ontological resources, integrating diverse text processing applications, and using an expanded pattern language that allows the mixing of syntactic and semantic elements and variable ordering. Results OpenDMAP information extraction systems were produced for extracting protein transport assertions (transport), protein-protein interaction assertions (interaction) and assertions that a gene is expressed in a cell type (expression). Evaluations were performed on each system, resulting in F-scores ranging from .26 – .72 (precision .39 – .85, recall .16 – .85). Additionally, each of these systems was run over all abstracts in MEDLINE, producing a total of 72,460 transport instances, 265,795 interaction instances and 176,153 expression instances. Conclusion OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this level of performance appears to generalize to other information extraction tasks, including extracting information about predicates of more than two arguments. The output of the information extraction system is always constructed from elements of an ontology, ensuring that the knowledge representation is grounded with respect to a carefully constructed model of reality. The results of these efforts can be used to increase the efficiency of manual curation efforts and to provide additional features in systems that integrate multiple sources for information extraction. The open source OpenDMAP code library is freely available at PMID:18237434
Focus-based filtering + clustering technique for power-law networks with small world phenomenon
NASA Astrophysics Data System (ADS)
Boutin, François; Thièvre, Jérôme; Hascoët, Mountaz
2006-01-01
Realistic interaction networks usually present two main properties: a power-law degree distribution and a small world behavior. Few nodes are linked to many nodes and adjacent nodes are likely to share common neighbors. Moreover, graph structure usually presents a dense core that is difficult to explore with classical filtering and clustering techniques. In this paper, we propose a new filtering technique accounting for a user-focus. This technique extracts a tree-like graph with also power-law degree distribution and small world behavior. Resulting structure is easily drawn with classical force-directed drawing algorithms. It is also quickly clustered and displayed into a multi-level silhouette tree (MuSi-Tree) from any user-focus. We built a new graph filtering + clustering + drawing API and report a case study.
Extended Graph-Based Models for Enhanced Similarity Search in Cavbase.
Krotzky, Timo; Fober, Thomas; Hüllermeier, Eyke; Klebe, Gerhard
2014-01-01
To calculate similarities between molecular structures, measures based on the maximum common subgraph are frequently applied. For the comparison of protein binding sites, these measures are not fully appropriate since graphs representing binding sites on a detailed atomic level tend to get very large. In combination with an NP-hard problem, a large graph leads to a computationally demanding task. Therefore, for the comparison of binding sites, a less detailed coarse graph model is used building upon so-called pseudocenters. Consistently, a loss of structural data is caused since many atoms are discarded and no information about the shape of the binding site is considered. This is usually resolved by performing subsequent calculations based on additional information. These steps are usually quite expensive, making the whole approach very slow. The main drawback of a graph-based model solely based on pseudocenters, however, is the loss of information about the shape of the protein surface. In this study, we propose a novel and efficient modeling formalism that does not increase the size of the graph model compared to the original approach, but leads to graphs containing considerably more information assigned to the nodes. More specifically, additional descriptors considering surface characteristics are extracted from the local surface and attributed to the pseudocenters stored in Cavbase. These properties are evaluated as additional node labels, which lead to a gain of information and allow for much faster but still very accurate comparisons between different structures.
Xu, Jun; Luo, Xiaofei; Wang, Guanhao; Gilmore, Hannah; Madabhushi, Anant
2016-01-01
Epithelial (EP) and stromal (ST) are two types of tissues in histological images. Automated segmentation or classification of EP and ST tissues is important when developing computerized system for analyzing the tumor microenvironment. In this paper, a Deep Convolutional Neural Networks (DCNN) based feature learning is presented to automatically segment or classify EP and ST regions from digitized tumor tissue microarrays (TMAs). Current approaches are based on handcraft feature representation, such as color, texture, and Local Binary Patterns (LBP) in classifying two regions. Compared to handcrafted feature based approaches, which involve task dependent representation, DCNN is an end-to-end feature extractor that may be directly learned from the raw pixel intensity value of EP and ST tissues in a data driven fashion. These high-level features contribute to the construction of a supervised classifier for discriminating the two types of tissues. In this work we compare DCNN based models with three handcraft feature extraction based approaches on two different datasets which consist of 157 Hematoxylin and Eosin (H&E) stained images of breast cancer and 1376 immunohistological (IHC) stained images of colorectal cancer, respectively. The DCNN based feature learning approach was shown to have a F1 classification score of 85%, 89%, and 100%, accuracy (ACC) of 84%, 88%, and 100%, and Matthews Correlation Coefficient (MCC) of 86%, 77%, and 100% on two H&E stained (NKI and VGH) and IHC stained data, respectively. Our DNN based approach was shown to outperform three handcraft feature extraction based approaches in terms of the classification of EP and ST regions. PMID:28154470
Xu, Jun; Luo, Xiaofei; Wang, Guanhao; Gilmore, Hannah; Madabhushi, Anant
2016-05-26
Epithelial (EP) and stromal (ST) are two types of tissues in histological images. Automated segmentation or classification of EP and ST tissues is important when developing computerized system for analyzing the tumor microenvironment. In this paper, a Deep Convolutional Neural Networks (DCNN) based feature learning is presented to automatically segment or classify EP and ST regions from digitized tumor tissue microarrays (TMAs). Current approaches are based on handcraft feature representation, such as color, texture, and Local Binary Patterns (LBP) in classifying two regions. Compared to handcrafted feature based approaches, which involve task dependent representation, DCNN is an end-to-end feature extractor that may be directly learned from the raw pixel intensity value of EP and ST tissues in a data driven fashion. These high-level features contribute to the construction of a supervised classifier for discriminating the two types of tissues. In this work we compare DCNN based models with three handcraft feature extraction based approaches on two different datasets which consist of 157 Hematoxylin and Eosin (H&E) stained images of breast cancer and 1376 immunohistological (IHC) stained images of colorectal cancer, respectively. The DCNN based feature learning approach was shown to have a F1 classification score of 85%, 89%, and 100%, accuracy (ACC) of 84%, 88%, and 100%, and Matthews Correlation Coefficient (MCC) of 86%, 77%, and 100% on two H&E stained (NKI and VGH) and IHC stained data, respectively. Our DNN based approach was shown to outperform three handcraft feature extraction based approaches in terms of the classification of EP and ST regions.
Ansari, A H; Cherian, P J; Dereymaeker, A; Matic, V; Jansen, K; De Wispelaere, L; Dielman, C; Vervisch, J; Swarte, R M; Govaert, P; Naulaers, G; De Vos, M; Van Huffel, S
2016-09-01
After identifying the most seizure-relevant characteristics by a previously developed heuristic classifier, a data-driven post-processor using a novel set of features is applied to improve the performance. The main characteristics of the outputs of the heuristic algorithm are extracted by five sets of features including synchronization, evolution, retention, segment, and signal features. Then, a support vector machine and a decision making layer remove the falsely detected segments. Four datasets including 71 neonates (1023h, 3493 seizures) recorded in two different university hospitals, are used to train and test the algorithm without removing the dubious seizures. The heuristic method resulted in a false alarm rate of 3.81 per hour and good detection rate of 88% on the entire test databases. The post-processor, effectively reduces the false alarm rate by 34% while the good detection rate decreases by 2%. This post-processing technique improves the performance of the heuristic algorithm. The structure of this post-processor is generic, improves our understanding of the core visually determined EEG features of neonatal seizures and is applicable for other neonatal seizure detectors. The post-processor significantly decreases the false alarm rate at the expense of a small reduction of the good detection rate. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
A data-driven multiplicative fault diagnosis approach for automation processes.
Hao, Haiyang; Zhang, Kai; Ding, Steven X; Chen, Zhiwen; Lei, Yaguo
2014-09-01
This paper presents a new data-driven method for diagnosing multiplicative key performance degradation in automation processes. Different from the well-established additive fault diagnosis approaches, the proposed method aims at identifying those low-level components which increase the variability of process variables and cause performance degradation. Based on process data, features of multiplicative fault are extracted. To identify the root cause, the impact of fault on each process variable is evaluated in the sense of contribution to performance degradation. Then, a numerical example is used to illustrate the functionalities of the method and Monte-Carlo simulation is performed to demonstrate the effectiveness from the statistical viewpoint. Finally, to show the practical applicability, a case study on the Tennessee Eastman process is presented. Copyright © 2013. Published by Elsevier Ltd.
Multiple kernel learning in protein-protein interaction extraction from biomedical literature.
Yang, Zhihao; Tang, Nan; Zhang, Xiao; Lin, Hongfei; Li, Yanpeng; Yang, Zhiwei
2011-03-01
Knowledge about protein-protein interactions (PPIs) unveils the molecular mechanisms of biological processes. The volume and content of published biomedical literature on protein interactions is expanding rapidly, making it increasingly difficult for interaction database administrators, responsible for content input and maintenance to detect and manually update protein interaction information. The objective of this work is to develop an effective approach to automatic extraction of PPI information from biomedical literature. We present a weighted multiple kernel learning-based approach for automatic PPI extraction from biomedical literature. The approach combines the following kernels: feature-based, tree, graph and part-of-speech (POS) path. In particular, we extend the shortest path-enclosed tree (SPT) and dependency path tree to capture richer contextual information. Our experimental results show that the combination of SPT and dependency path tree extensions contributes to the improvement of performance by almost 0.7 percentage units in F-score and 2 percentage units in area under the receiver operating characteristics curve (AUC). Combining two or more appropriately weighed individual will further improve the performance. Both on the individual corpus and cross-corpus evaluation our combined kernel can achieve state-of-the-art performance with respect to comparable evaluations, with 64.41% F-score and 88.46% AUC on the AImed corpus. As different kernels calculate the similarity between two sentences from different aspects. Our combined kernel can reduce the risk of missing important features. More specifically, we use a weighted linear combination of individual kernels instead of assigning the same weight to each individual kernel, thus allowing the introduction of each kernel to incrementally contribute to the performance improvement. In addition, SPT and dependency path tree extensions can improve the performance by including richer context information. Copyright © 2010 Elsevier B.V. All rights reserved.
Understanding Resonance Graphs Using Easy Java Simulations (EJS) and Why We Use EJS
ERIC Educational Resources Information Center
Wee, Loo Kang; Lee, Tat Leong; Chew, Charles; Wong, Darren; Tan, Samuel
2015-01-01
This paper reports a computer model simulation created using Easy Java Simulation (EJS) for learners to visualize how the steady-state amplitude of a driven oscillating system varies with the frequency of the periodic driving force. The simulation shows (N = 100) identical spring-mass systems being subjected to (1) a periodic driving force of…
Chen, Xi; Kopsaftopoulos, Fotis; Wu, Qi; Ren, He; Chang, Fu-Kuo
2018-04-29
In this work, a data-driven approach for identifying the flight state of a self-sensing wing structure with an embedded multi-functional sensing network is proposed. The flight state is characterized by the structural vibration signals recorded from a series of wind tunnel experiments under varying angles of attack and airspeeds. A large feature pool is created by extracting potential features from the signals covering the time domain, the frequency domain as well as the information domain. Special emphasis is given to feature selection in which a novel filter method is developed based on the combination of a modified distance evaluation algorithm and a variance inflation factor. Machine learning algorithms are then employed to establish the mapping relationship from the feature space to the practical state space. Results from two case studies demonstrate the high identification accuracy and the effectiveness of the model complexity reduction via the proposed method, thus providing new perspectives of self-awareness towards the next generation of intelligent air vehicles.
Co-occurrence graphs for word sense disambiguation in the biomedical domain.
Duque, Andres; Stevenson, Mark; Martinez-Romo, Juan; Araujo, Lourdes
2018-05-01
Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining more than 10% of accuracy improvement in some cases, while only requiring minimal external resources. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Kohler, Sophie; Far, Aïcha Beya; Hirsch, Ernest
2007-01-01
This paper presents an original approach for the optimal 3D reconstruction of manufactured workpieces based on a priori planification of the task, enhanced on-line through dynamic adjustment of the lighting conditions, and built around a cognitive intelligent sensory system using so-called Situation Graph Trees. The system takes explicitely structural knowledge related to image acquisition conditions, type of illumination sources, contents of the scene (e. g., CAD models and tolerance information), etc. into account. The principle of the approach relies on two steps. First, a socalled initialization phase, leading to the a priori task plan, collects this structural knowledge. This knowledge is conveniently encoded, as a sub-part, in the Situation Graph Tree building the backbone of the planning system specifying exhaustively the behavior of the application. Second, the image is iteratively evaluated under the control of this Situation Graph Tree. The information describing the quality of the piece to analyze is thus extracted and further exploited for, e. g., inspection tasks. Lastly, the approach enables dynamic adjustment of the Situation Graph Tree, enabling the system to adjust itself to the actual application run-time conditions, thus providing the system with a self-learning capability.
Protein-protein interaction inference based on semantic similarity of Gene Ontology terms.
Zhang, Shu-Bo; Tang, Qiang-Rong
2016-07-21
Identifying protein-protein interactions is important in molecular biology. Experimental methods to this issue have their limitations, and computational approaches have attracted more and more attentions from the biological community. The semantic similarity derived from the Gene Ontology (GO) annotation has been regarded as one of the most powerful indicators for protein interaction. However, conventional methods based on GO similarity fail to take advantage of the specificity of GO terms in the ontology graph. We proposed a GO-based method to predict protein-protein interaction by integrating different kinds of similarity measures derived from the intrinsic structure of GO graph. We extended five existing methods to derive the semantic similarity measures from the descending part of two GO terms in the GO graph, then adopted a feature integration strategy to combines both the ascending and the descending similarity scores derived from the three sub-ontologies to construct various kinds of features to characterize each protein pair. Support vector machines (SVM) were employed as discriminate classifiers, and five-fold cross validation experiments were conducted on both human and yeast protein-protein interaction datasets to evaluate the performance of different kinds of integrated features, the experimental results suggest the best performance of the feature that combines information from both the ascending and the descending parts of the three ontologies. Our method is appealing for effective prediction of protein-protein interaction. Copyright © 2016 Elsevier Ltd. All rights reserved.
Graph-based geometric-iconic guide-wire tracking.
Honnorat, Nicolas; Vaillant, Régis; Paragios, Nikos
2011-01-01
In this paper we introduce a novel hybrid graph-based approach for Guide-wire tracking. The image support is captured by steerable filters and improved through tensor voting. Then, a graphical model is considered that represents guide-wire extraction/tracking through a B-spline control-point model. Points with strong geometric interest (landmarks) are automatically determined and anchored to such a representation. Tracking is then performed through discrete MRFs that optimize the spatio-temporal positions of the control points while establishing landmark temporal correspondences. Promising results demonstrate the potentials of our method.
Speech graphs provide a quantitative measure of thought disorder in psychosis.
Mota, Natalia B; Vasconcelos, Nivaldo A P; Lemos, Nathalia; Pieretti, Ana C; Kinouchi, Osame; Cecchi, Guillermo A; Copelli, Mauro; Ribeiro, Sidarta
2012-01-01
Psychosis has various causes, including mania and schizophrenia. Since the differential diagnosis of psychosis is exclusively based on subjective assessments of oral interviews with patients, an objective quantification of the speech disturbances that characterize mania and schizophrenia is in order. In principle, such quantification could be achieved by the analysis of speech graphs. A graph represents a network with nodes connected by edges; in speech graphs, nodes correspond to words and edges correspond to semantic and grammatical relationships. To quantify speech differences related to psychosis, interviews with schizophrenics, manics and normal subjects were recorded and represented as graphs. Manics scored significantly higher than schizophrenics in ten graph measures. Psychopathological symptoms such as logorrhea, poor speech, and flight of thoughts were grasped by the analysis even when verbosity differences were discounted. Binary classifiers based on speech graph measures sorted schizophrenics from manics with up to 93.8% of sensitivity and 93.7% of specificity. In contrast, sorting based on the scores of two standard psychiatric scales (BPRS and PANSS) reached only 62.5% of sensitivity and specificity. The results demonstrate that alterations of the thought process manifested in the speech of psychotic patients can be objectively measured using graph-theoretical tools, developed to capture specific features of the normal and dysfunctional flow of thought, such as divergence and recurrence. The quantitative analysis of speech graphs is not redundant with standard psychometric scales but rather complementary, as it yields a very accurate sorting of schizophrenics and manics. Overall, the results point to automated psychiatric diagnosis based not on what is said, but on how it is said.
A simplifying feature of the heterotic one loop four graviton amplitude
NASA Astrophysics Data System (ADS)
Basu, Anirban
2018-01-01
We show that the weight four modular graph functions that contribute to the integrand of the t8t8D4R4 term at one loop in heterotic string theory do not require regularization, and hence the integrand is simple. This is unlike the graphs that contribute to the integrands of the other gravitational terms at this order in the low momentum expansion, and these integrands require regularization. This property persists for an infinite number of terms in the effective action, and their integrands do not require regularization. We find non-trivial relations between weight four graphs of distinct topologies that do not require regularization by performing trivial manipulations using auxiliary diagrams.
Time series analysis of the developed financial markets' integration using visibility graphs
NASA Astrophysics Data System (ADS)
Zhuang, Enyu; Small, Michael; Feng, Gang
2014-09-01
A time series representing the developed financial markets' segmentation from 1973 to 2012 is studied. The time series reveals an obvious market integration trend. To further uncover the features of this time series, we divide it into seven windows and generate seven visibility graphs. The measuring capabilities of the visibility graphs provide means to quantitatively analyze the original time series. It is found that the important historical incidents that influenced market integration coincide with variations in the measured graphical node degree. Through the measure of neighborhood span, the frequencies of the historical incidents are disclosed. Moreover, it is also found that large "cycles" and significant noise in the time series are linked to large and small communities in the generated visibility graphs. For large cycles, how historical incidents significantly affected market integration is distinguished by density and compactness of the corresponding communities.
Large-scale quantum networks based on graphs
NASA Astrophysics Data System (ADS)
Epping, Michael; Kampermann, Hermann; Bruß, Dagmar
2016-05-01
Society relies and depends increasingly on information exchange and communication. In the quantum world, security and privacy is a built-in feature for information processing. The essential ingredient for exploiting these quantum advantages is the resource of entanglement, which can be shared between two or more parties. The distribution of entanglement over large distances constitutes a key challenge for current research and development. Due to losses of the transmitted quantum particles, which typically scale exponentially with the distance, intermediate quantum repeater stations are needed. Here we show how to generalise the quantum repeater concept to the multipartite case, by describing large-scale quantum networks, i.e. network nodes and their long-distance links, consistently in the language of graphs and graph states. This unifying approach comprises both the distribution of multipartite entanglement across the network, and the protection against errors via encoding. The correspondence to graph states also provides a tool for optimising the architecture of quantum networks.
Bootstrapping Security Policies for Wearable Apps Using Attributed Structural Graphs.
González-Tablas, Ana I; Tapiador, Juan E
2016-05-11
We address the problem of bootstrapping security and privacy policies for newly-deployed apps in wireless body area networks (WBAN) composed of smartphones, sensors and other wearable devices. We introduce a framework to model such a WBAN as an undirected graph whose vertices correspond to devices, apps and app resources, while edges model structural relationships among them. This graph is then augmented with attributes capturing the features of each entity together with user-defined tags. We then adapt available graph-based similarity metrics to find the closest app to a new one to be deployed, with the aim of reusing, and possibly adapting, its security policy. We illustrate our approach through a detailed smartphone ecosystem case study. Our results suggest that the scheme can provide users with a reasonably good policy that is consistent with the user's security preferences implicitly captured by policies already in place.
Bootstrapping Security Policies for Wearable Apps Using Attributed Structural Graphs
González-Tablas, Ana I.; Tapiador, Juan E.
2016-01-01
We address the problem of bootstrapping security and privacy policies for newly-deployed apps in wireless body area networks (WBAN) composed of smartphones, sensors and other wearable devices. We introduce a framework to model such a WBAN as an undirected graph whose vertices correspond to devices, apps and app resources, while edges model structural relationships among them. This graph is then augmented with attributes capturing the features of each entity together with user-defined tags. We then adapt available graph-based similarity metrics to find the closest app to a new one to be deployed, with the aim of reusing, and possibly adapting, its security policy. We illustrate our approach through a detailed smartphone ecosystem case study. Our results suggest that the scheme can provide users with a reasonably good policy that is consistent with the user’s security preferences implicitly captured by policies already in place. PMID:27187385
GfaPy: a flexible and extensible software library for handling sequence graphs in Python.
Gonnella, Giorgio; Kurtz, Stefan
2017-10-01
GFA 1 and GFA 2 are recently defined formats for representing sequence graphs, such as assembly, variation or splicing graphs. The formats are adopted by several software tools. Here, we present GfaPy, a software package for creating, parsing and editing GFA graphs using the programming language Python. GfaPy supports GFA 1 and GFA 2, using the same interface and allows for interconversion between both formats. The software package provides a simple interface for custom record types, which is an important new feature of GFA 2 (compared to GFA 1). This enables new applications of the format. GfaPy is available open source at https://github.com/ggonnella/gfapy and installable via pip. gonnella@zbh.uni-hamburg.de. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Single neuron computation: from dynamical system to feature detector.
Hong, Sungho; Agüera y Arcas, Blaise; Fairhall, Adrienne L
2007-12-01
White noise methods are a powerful tool for characterizing the computation performed by neural systems. These methods allow one to identify the feature or features that a neural system extracts from a complex input and to determine how these features are combined to drive the system's spiking response. These methods have also been applied to characterize the input-output relations of single neurons driven by synaptic inputs, simulated by direct current injection. To interpret the results of white noise analysis of single neurons, we would like to understand how the obtained feature space of a single neuron maps onto the biophysical properties of the membrane, in particular, the dynamics of ion channels. Here, through analysis of a simple dynamical model neuron, we draw explicit connections between the output of a white noise analysis and the underlying dynamical system. We find that under certain assumptions, the form of the relevant features is well defined by the parameters of the dynamical system. Further, we show that under some conditions, the feature space is spanned by the spike-triggered average and its successive order time derivatives.
Lohse, Christian; Bassett, Danielle S; Lim, Kelvin O; Carlson, Jean M
2014-10-01
Human brain anatomy and function display a combination of modular and hierarchical organization, suggesting the importance of both cohesive structures and variable resolutions in the facilitation of healthy cognitive processes. However, tools to simultaneously probe these features of brain architecture require further development. We propose and apply a set of methods to extract cohesive structures in network representations of brain connectivity using multi-resolution techniques. We employ a combination of soft thresholding, windowed thresholding, and resolution in community detection, that enable us to identify and isolate structures associated with different weights. One such mesoscale structure is bipartivity, which quantifies the extent to which the brain is divided into two partitions with high connectivity between partitions and low connectivity within partitions. A second, complementary mesoscale structure is modularity, which quantifies the extent to which the brain is divided into multiple communities with strong connectivity within each community and weak connectivity between communities. Our methods lead to multi-resolution curves of these network diagnostics over a range of spatial, geometric, and structural scales. For statistical comparison, we contrast our results with those obtained for several benchmark null models. Our work demonstrates that multi-resolution diagnostic curves capture complex organizational profiles in weighted graphs. We apply these methods to the identification of resolution-specific characteristics of healthy weighted graph architecture and altered connectivity profiles in psychiatric disease.
NASA Astrophysics Data System (ADS)
Pan, Jun; Chen, Jinglong; Zi, Yanyang; Yuan, Jing; Chen, Binqiang; He, Zhengjia
2016-12-01
It is significant to perform condition monitoring and fault diagnosis on rolling mills in steel-making plant to ensure economic benefit. However, timely fault identification of key parts in a complicated industrial system under operating condition is still a challenging task since acquired condition signals are usually multi-modulated and inevitably mixed with strong noise. Therefore, a new data-driven mono-component identification method is proposed in this paper for diagnostic purpose. First, the modified nonlocal means algorithm (NLmeans) is proposed to reduce noise in vibration signals without destroying its original Fourier spectrum structure. During the modified NLmeans, two modifications are investigated and performed to improve denoising effect. Then, the modified empirical wavelet transform (MEWT) is applied on the de-noised signal to adaptively extract empirical mono-component modes. Finally, the modes are analyzed for mechanical fault identification based on Hilbert transform. The results show that the proposed data-driven method owns superior performance during system operation compared with the MEWT method.
Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard
2013-01-01
Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology "out of the lab" to real-world, diverse data. In this contribution, we address the problem of finding "disturbing" scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis.
Eyben, Florian; Weninger, Felix; Lehment, Nicolas; Schuller, Björn; Rigoll, Gerhard
2013-01-01
Without doubt general video and sound, as found in large multimedia archives, carry emotional information. Thus, audio and video retrieval by certain emotional categories or dimensions could play a central role for tomorrow's intelligent systems, enabling search for movies with a particular mood, computer aided scene and sound design in order to elicit certain emotions in the audience, etc. Yet, the lion's share of research in affective computing is exclusively focusing on signals conveyed by humans, such as affective speech. Uniting the fields of multimedia retrieval and affective computing is believed to lend to a multiplicity of interesting retrieval applications, and at the same time to benefit affective computing research, by moving its methodology “out of the lab” to real-world, diverse data. In this contribution, we address the problem of finding “disturbing” scenes in movies, a scenario that is highly relevant for computer-aided parental guidance. We apply large-scale segmental feature extraction combined with audio-visual classification to the particular task of detecting violence. Our system performs fully data-driven analysis including automatic segmentation. We evaluate the system in terms of mean average precision (MAP) on the official data set of the MediaEval 2012 evaluation campaign's Affect Task, which consists of 18 original Hollywood movies, achieving up to .398 MAP on unseen test data in full realism. An in-depth analysis of the worth of individual features with respect to the target class and the system errors is carried out and reveals the importance of peak-related audio feature extraction and low-level histogram-based video analysis. PMID:24391704
NASA Astrophysics Data System (ADS)
Revunova, Svetlana; Vlasenko, Vyacheslav; Bukreev, Anatoly
2017-10-01
The article proposes the models of innovative activity development, which is driven by the formation of “points of innovation-driven growth”. The models are based on the analysis of the current state and dynamics of innovative development of construction enterprises in the transport sector and take into account a number of essential organizational and economic changes in management. The authors substantiate implementing such development models as an organizational innovation that has a communication genesis. The use of the communication approach to the formation of “points of innovation-driven growth” allowed the authors to apply the mathematical tools of the graph theory in order to activate the innovative activity of the transport industry in the region. As a result, the authors have proposed models that allow constructing an optimal mechanism for the formation of “points of innovation-driven growth”.
This data set is a geographic information system (GIS) coverage of pipelines, transmission lines, and miscellaneous transportation features for the United States Environmental Protection Agency (USEPA) Mid-Atlantic Integrated Assessment (MAIA) Project region. The coverage was p...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Symons, Christopher T; Arel, Itamar
2011-01-01
Budgeted learning under constraints on both the amount of labeled information and the availability of features at test time pertains to a large number of real world problems. Ideas from multi-view learning, semi-supervised learning, and even active learning have applicability, but a common framework whose assumptions fit these problem spaces is non-trivial to construct. We leverage ideas from these fields based on graph regularizers to construct a robust framework for learning from labeled and unlabeled samples in multiple views that are non-independent and include features that are inaccessible at the time the model would need to be applied. We describemore » examples of applications that fit this scenario, and we provide experimental results to demonstrate the effectiveness of knowledge carryover from training-only views. As learning algorithms are applied to more complex applications, relevant information can be found in a wider variety of forms, and the relationships between these information sources are often quite complex. The assumptions that underlie most learning algorithms do not readily or realistically permit the incorporation of many of the data sources that are available, despite an implicit understanding that useful information exists in these sources. When multiple information sources are available, they are often partially redundant, highly interdependent, and contain noise as well as other information that is irrelevant to the problem under study. In this paper, we are focused on a framework whose assumptions match this reality, as well as the reality that labeled information is usually sparse. Most significantly, we are interested in a framework that can also leverage information in scenarios where many features that would be useful for learning a model are not available when the resulting model will be applied. As with constraints on labels, there are many practical limitations on the acquisition of potentially useful features. A key difference in the case of feature acquisition is that the same constraints often don't pertain to the training samples. This difference provides an opportunity to allow features that are impractical in an applied setting to nevertheless add value during the model-building process. Unfortunately, there are few machine learning frameworks built on assumptions that allow effective utilization of features that are only available at training time. In this paper we formulate a knowledge carryover framework for the budgeted learning scenario with constraints on features and labels. The approach is based on multi-view and semi-supervised learning methods that use graph-encoded regularization. Our main contributions are the following: (1) we propose and provide justification for a methodology for ensuring that changes in the graph regularizer using alternate views are performed in a manner that is target-concept specific, allowing value to be obtained from noisy views; and (2) we demonstrate how this general set-up can be used to effectively improve models by leveraging features unavailable at test time. The rest of the paper is structured as follows. In Section 2, we outline real-world problems to motivate the approach and describe relevant prior work. Section 3 describes the graph construction process and the learning methodologies that are employed. Section 4 provides preliminary discussion regarding theoretical motivation for the method. In Section 5, effectiveness of the approach is demonstrated in a series of experiments employing modified versions of two well-known semi-supervised learning algorithms. Section 6 concludes the paper.« less
Extraction of Features from High-resolution 3D LiDaR Point-cloud Data
NASA Astrophysics Data System (ADS)
Keller, P.; Kreylos, O.; Hamann, B.; Kellogg, L. H.; Cowgill, E. S.; Yikilmaz, M. B.; Hering-Bertram, M.; Hagen, H.
2008-12-01
Airborne and tripod-based LiDaR scans are capable of producing new insight into geologic features by providing high-quality 3D measurements of the landscape. High-resolution LiDaR is a promising method for studying slip on faults, erosion, and other landscape-altering processes. LiDaR scans can produce up to several billion individual point returns associated with the reflection of a laser from natural and engineered surfaces; these point clouds are typically used to derive a high-resolution digital elevation model (DEM). Currently, there exist only few methods that can support the analysis of the data at full resolution and in the natural 3D perspective in which it was collected by working directly with the points. We are developing new algorithms for extracting features from LiDaR scans, and present method for determining the local curvature of a LiDaR data set, working directly with the individual point returns of a scan. Computing the curvature enables us to rapidly and automatically identify key features such as ridge-lines, stream beds, and edges of terraces. We fit polynomial surface patches via a moving least squares (MLS) approach to local point neighborhoods, determining curvature values for each point. The size of the local point neighborhood is defined by a user. Since both terrestrial and airborne LiDaR scans suffer from high noise, we apply additional pre- and post-processing smoothing steps to eliminate unwanted features. LiDaR data also captures objects like buildings and trees complicating greatly the task of extracting reliable curvature values. Hence, we use a stochastic approach to determine whether a point can be reliably used to estimate curvature or not. Additionally, we have developed a graph-based approach to establish connectivities among points that correspond to regions of high curvature. The result is an explicit description of ridge-lines, for example. We have applied our method to the raw point cloud data collected as part of the GeoEarthScope B-4 project on a section of the San Andreas Fault (Segment SA09). This section provides an excellent test site for our method as it exposes the fault clearly, contains few extraneous structures, and exhibits multiple dry stream-beds that have been off-set by motion on the fault.
Analysis Tools for Interconnected Boolean Networks With Biological Applications.
Chaves, Madalena; Tournier, Laurent
2018-01-01
Boolean networks with asynchronous updates are a class of logical models particularly well adapted to describe the dynamics of biological networks with uncertain measures. The state space of these models can be described by an asynchronous state transition graph, which represents all the possible exits from every single state, and gives a global image of all the possible trajectories of the system. In addition, the asynchronous state transition graph can be associated with an absorbing Markov chain, further providing a semi-quantitative framework where it becomes possible to compute probabilities for the different trajectories. For large networks, however, such direct analyses become computationally untractable, given the exponential dimension of the graph. Exploiting the general modularity of biological systems, we have introduced the novel concept of asymptotic graph , computed as an interconnection of several asynchronous transition graphs and recovering all asymptotic behaviors of a large interconnected system from the behavior of its smaller modules. From a modeling point of view, the interconnection of networks is very useful to address for instance the interplay between known biological modules and to test different hypotheses on the nature of their mutual regulatory links. This paper develops two new features of this general methodology: a quantitative dimension is added to the asymptotic graph, through the computation of relative probabilities for each final attractor and a companion cross-graph is introduced to complement the method on a theoretical point of view.
Exploring biological interaction networks with tailored weighted quasi-bicliques
2012-01-01
Background Biological networks provide fundamental insights into the functional characterization of genes and their products, the characterization of DNA-protein interactions, the identification of regulatory mechanisms, and other biological tasks. Due to the experimental and biological complexity, their computational exploitation faces many algorithmic challenges. Results We introduce novel weighted quasi-biclique problems to identify functional modules in biological networks when represented by bipartite graphs. In difference to previous quasi-biclique problems, we include biological interaction levels by using edge-weighted quasi-bicliques. While we prove that our problems are NP-hard, we also describe IP formulations to compute exact solutions for moderately sized networks. Conclusions We verify the effectiveness of our IP solutions using both simulation and empirical data. The simulation shows high quasi-biclique recall rates, and the empirical data corroborate the abilities of our weighted quasi-bicliques in extracting features and recovering missing interactions from biological networks. PMID:22759421
Brain Tumor Segmentation Using Deep Belief Networks and Pathological Knowledge.
Zhan, Tianming; Chen, Yi; Hong, Xunning; Lu, Zhenyu; Chen, Yunjie
2017-01-01
In this paper, we propose an automatic brain tumor segmentation method based on Deep Belief Networks (DBNs) and pathological knowledge. The proposed method is targeted against gliomas (both low and high grade) obtained in multi-sequence magnetic resonance images (MRIs). Firstly, a novel deep architecture is proposed to combine the multi-sequences intensities feature extraction with classification to get the classification probabilities of each voxel. Then, graph cut based optimization is executed on the classification probabilities to strengthen the spatial relationships of voxels. At last, pathological knowledge of gliomas is applied to remove some false positives. Our method was validated in the Brain Tumor Segmentation Challenge 2012 and 2013 databases (BRATS 2012, 2013). The performance of segmentation results demonstrates our proposal providing a competitive solution with stateof- the-art methods. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
A combinatorial framework to quantify peak/pit asymmetries in complex dynamics.
Hasson, Uri; Iacovacci, Jacopo; Davis, Ben; Flanagan, Ryan; Tagliazucchi, Enzo; Laufs, Helmut; Lacasa, Lucas
2018-02-23
We explore a combinatorial framework which efficiently quantifies the asymmetries between minima and maxima in local fluctuations of time series. We first showcase its performance by applying it to a battery of synthetic cases. We find rigorous results on some canonical dynamical models (stochastic processes with and without correlations, chaotic processes) complemented by extensive numerical simulations for a range of processes which indicate that the methodology correctly distinguishes different complex dynamics and outperforms state of the art metrics in several cases. Subsequently, we apply this methodology to real-world problems emerging across several disciplines including cases in neurobiology, finance and climate science. We conclude that differences between the statistics of local maxima and local minima in time series are highly informative of the complex underlying dynamics and a graph-theoretic extraction procedure allows to use these features for statistical learning purposes.
Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data.
Sun, Wenqing; Tseng, Tzu-Liang Bill; Zhang, Jianying; Qian, Wei
2017-04-01
In this study we developed a graph based semi-supervised learning (SSL) scheme using deep convolutional neural network (CNN) for breast cancer diagnosis. CNN usually needs a large amount of labeled data for training and fine tuning the parameters, and our proposed scheme only requires a small portion of labeled data in training set. Four modules were included in the diagnosis system: data weighing, feature selection, dividing co-training data labeling, and CNN. 3158 region of interests (ROIs) with each containing a mass extracted from 1874 pairs of mammogram images were used for this study. Among them 100 ROIs were treated as labeled data while the rest were treated as unlabeled. The area under the curve (AUC) observed in our study was 0.8818, and the accuracy of CNN is 0.8243 using the mixed labeled and unlabeled data. Copyright © 2016. Published by Elsevier Ltd.
Markerless video analysis for movement quantification in pediatric epilepsy monitoring.
Lu, Haiping; Eng, How-Lung; Mandal, Bappaditya; Chan, Derrick W S; Ng, Yen-Ling
2011-01-01
This paper proposes a markerless video analytic system for quantifying body part movements in pediatric epilepsy monitoring. The system utilizes colored pajamas worn by a patient in bed to extract body part movement trajectories, from which various features can be obtained for seizure detection and analysis. Hence, it is non-intrusive and it requires no sensor/marker to be attached to the patient's body. It takes raw video sequences as input and a simple user-initialization indicates the body parts to be examined. In background/foreground modeling, Gaussian mixture models are employed in conjunction with HSV-based modeling. Body part detection follows a coarse-to-fine paradigm with graph-cut-based segmentation. Finally, body part parameters are estimated with domain knowledge guidance. Experimental studies are reported on sequences captured in an Epilepsy Monitoring Unit at a local hospital. The results demonstrate the feasibility of the proposed system in pediatric epilepsy monitoring and seizure detection.
Bae, Youngoh; Yoo, Byeong Wook; Lee, Jung Chan; Kim, Hee Chan
2017-05-01
Detection and diagnosis based on extracting features and classification using electroencephalography (EEG) signals are being studied vigorously. A network analysis of time series EEG signal data is one of many techniques that could help study brain functions. In this study, we analyze EEG to diagnose alcoholism. We propose a novel methodology to estimate the differences in the status of the brain based on EEG data of normal subjects and data from alcoholics by computing many parameters stemming from effective network using Granger causality. Among many parameters, only ten parameters were chosen as final candidates. By the combination of ten graph-based parameters, our results demonstrate predictable differences between alcoholics and normal subjects. A support vector machine classifier with best performance had 90% accuracy with sensitivity of 95.3%, and specificity of 82.4% for differentiating between the two groups.
Decomposition Algorithm for Global Reachability on a Time-Varying Graph
NASA Technical Reports Server (NTRS)
Kuwata, Yoshiaki
2010-01-01
A decomposition algorithm has been developed for global reachability analysis on a space-time grid. By exploiting the upper block-triangular structure, the planning problem is decomposed into smaller subproblems, which is much more scalable than the original approach. Recent studies have proposed the use of a hot-air (Montgolfier) balloon for possible exploration of Titan and Venus because these bodies have thick haze or cloud layers that limit the science return from an orbiter, and the atmospheres would provide enough buoyancy for balloons. One of the important questions that needs to be addressed is what surface locations the balloon can reach from an initial location, and how long it would take. This is referred to as the global reachability problem, where the paths from starting locations to all possible target locations must be computed. The balloon could be driven with its own actuation, but its actuation capability is fairly limited. It would be more efficient to take advantage of the wind field and ride the wind that is much stronger than what the actuator could produce. It is possible to pose the path planning problem as a graph search problem on a directed graph by discretizing the spacetime world and the vehicle actuation. The decomposition algorithm provides reachability analysis of a time-varying graph. Because the balloon only moves in the positive direction in time, the adjacency matrix of the graph can be represented with an upper block-triangular matrix, and this upper block-triangular structure can be exploited to decompose a large graph search problem. The new approach consumes a much smaller amount of memory, which also helps speed up the overall computation when the computing resource has a limited physical memory compared to the problem size.
Mammographic phenotypes of breast cancer risk driven by breast anatomy
NASA Astrophysics Data System (ADS)
Gastounioti, Aimilia; Oustimov, Andrew; Hsieh, Meng-Kang; Pantalone, Lauren; Conant, Emily F.; Kontos, Despina
2017-03-01
Image-derived features of breast parenchymal texture patterns have emerged as promising risk factors for breast cancer, paving the way towards personalized recommendations regarding women's cancer risk evaluation and screening. The main steps to extract texture features of the breast parenchyma are the selection of regions of interest (ROIs) where texture analysis is performed, the texture feature calculation and the texture feature summarization in case of multiple ROIs. In this study, we incorporate breast anatomy in these three key steps by (a) introducing breast anatomical sampling for the definition of ROIs, (b) texture feature calculation aligned with the structure of the breast and (c) weighted texture feature summarization considering the spatial position and the underlying tissue composition of each ROI. We systematically optimize this novel framework for parenchymal tissue characterization in a case-control study with digital mammograms from 424 women. We also compare the proposed approach with a conventional methodology, not considering breast anatomy, recently shown to enhance the case-control discriminatory capacity of parenchymal texture analysis. The case-control classification performance is assessed using elastic-net regression with 5-fold cross validation, where the evaluation measure is the area under the curve (AUC) of the receiver operating characteristic. Upon optimization, the proposed breast-anatomy-driven approach demonstrated a promising case-control classification performance (AUC=0.87). In the same dataset, the performance of conventional texture characterization was found to be significantly lower (AUC=0.80, DeLong's test p-value<0.05). Our results suggest that breast anatomy may further leverage the associations of parenchymal texture features with breast cancer, and may therefore be a valuable addition in pipelines aiming to elucidate quantitative mammographic phenotypes of breast cancer risk.
Extracting Loop Bounds for WCET Analysis Using the Instrumentation Point Graph
NASA Astrophysics Data System (ADS)
Betts, A.; Bernat, G.
2009-05-01
Every calculation engine proposed in the literature of Worst-Case Execution Time (WCET) analysis requires upper bounds on loop iterations. Existing mechanisms to procure this information are either error prone, because they are gathered from the end-user, or limited in scope, because automatic analyses target very specific loop structures. In this paper, we present a technique that obtains bounds completely automatically for arbitrary loop structures. In particular, we show how to employ the Instrumentation Point Graph (IPG) to parse traces of execution (generated by an instrumented program) in order to extract bounds relative to any loop-nesting level. With this technique, therefore, non-rectangular dependencies between loops can be captured, allowing more accurate WCET estimates to be calculated. We demonstrate the improvement in accuracy by comparing WCET estimates computed through our HMB framework against those computed with state-of-the-art techniques.
Yun, Ruijuan; Lin, Chung-Chih; Wu, Shuicai; Huang, Chu-Chung; Lin, Ching-Po; Chao, Yi-Ping
2013-01-01
In this study, we employed diffusion tensor imaging (DTI) to construct brain structural network and then derive the connection matrices from 96 healthy elderly subjects. The correlation analysis between these topological properties of network based on graph theory and the Cognitive Abilities Screening Instrument (CASI) index were processed to extract the significant network characteristics. These characteristics were then integrated to estimate the models by various machine-learning algorithms to predict user's cognitive performance. From the results, linear regression model and Gaussian processes model showed presented better abilities with lower mean absolute errors of 5.8120 and 6.25 to predict the cognitive performance respectively. Moreover, these extracted topological properties of brain structural network derived from DTI also could be regarded as the bio-signatures for further evaluation of brain degeneration in healthy aged and early diagnosis of mild cognitive impairment (MCI).
Prieto, Luis P; Sharma, Kshitij; Kidzinski, Łukasz; Rodríguez-Triana, María Jesús; Dillenbourg, Pierre
2018-04-01
The pedagogical modelling of everyday classroom practice is an interesting kind of evidence, both for educational research and teachers' own professional development. This paper explores the usage of wearable sensors and machine learning techniques to automatically extract orchestration graphs (teaching activities and their social plane over time), on a dataset of 12 classroom sessions enacted by two different teachers in different classroom settings. The dataset included mobile eye-tracking as well as audiovisual and accelerometry data from sensors worn by the teacher. We evaluated both time-independent and time-aware models, achieving median F1 scores of about 0.7-0.8 on leave-one-session-out k-fold cross-validation. Although these results show the feasibility of this approach, they also highlight the need for larger datasets, recorded in a wider variety of classroom settings, to provide automated tagging of classroom practice that can be used in everyday practice across multiple teachers.
Fractal active contour model for segmenting the boundary of man-made target in nature scenes
NASA Astrophysics Data System (ADS)
Li, Min; Tang, Yandong; Wang, Lidi; Shi, Zelin
2006-02-01
In this paper, a novel geometric active contour model based on the fractal dimension feature to extract the boundary of man-made target in nature scenes is presented. In order to suppress the nature clutters, an adaptive weighting function is defined using the fractal dimension feature. Then the weighting function is introduced into the geodesic active contour model to detect the boundary of man-made target. Curve driven by our proposed model can evolve gradually from the initial position to the boundary of man-made target without being disturbed by nature clutters, even if the initial curve is far away from the true boundary. Experimental results validate the effectiveness and feasibility of our model.
Multiple Semantic Matching on Augmented N-partite Graph for Object Co-segmentation.
Wang, Chuan; Zhang, Hua; Yang, Liang; Cao, Xiaochun; Xiong, Hongkai
2017-09-08
Recent methods for object co-segmentation focus on discovering single co-occurring relation of candidate regions representing the foreground of multiple images. However, region extraction based only on low and middle level information often occupies a large area of background without the help of semantic context. In addition, seeking single matching solution very likely leads to discover local parts of common objects. To cope with these deficiencies, we present a new object cosegmentation framework, which takes advantages of semantic information and globally explores multiple co-occurring matching cliques based on an N-partite graph structure. To this end, we first propose to incorporate candidate generation with semantic context. Based on the regions extracted from semantic segmentation of each image, we design a merging mechanism to hierarchically generate candidates with high semantic responses. Secondly, all candidates are taken into consideration to globally formulate multiple maximum weighted matching cliques, which complements the discovery of part of the common objects induced by a single clique. To facilitate the discovery of multiple matching cliques, an N-partite graph, which inherently excludes intralinks between candidates from the same image, is constructed to separate multiple cliques without additional constraints. Further, we augment the graph with an additional virtual node in each part to handle irrelevant matches when the similarity between two candidates is too small. Finally, with the explored multiple cliques, we statistically compute pixel-wise co-occurrence map for each image. Experimental results on two benchmark datasets, i.e., iCoseg and MSRC datasets, achieve desirable performance and demonstrate the effectiveness of our proposed framework.
Understanding resonance graphs using Easy Java Simulations (EJS) and why we use EJS
NASA Astrophysics Data System (ADS)
Wee, Loo Kang; Lee, Tat Leong; Chew, Charles; Wong, Darren; Tan, Samuel
2015-03-01
This paper reports a computer model simulation created using Easy Java Simulation (EJS) for learners to visualize how the steady-state amplitude of a driven oscillating system varies with the frequency of the periodic driving force. The simulation shows (N = 100) identical spring-mass systems being subjected to (1) a periodic driving force of equal amplitude but different driving frequencies, and (2) different amounts of damping. The simulation aims to create a visually intuitive way of understanding how the series of amplitude versus driving frequency graphs are obtained by showing how the displacement of the system changes over time as it transits from the transient to the steady state. A suggested ‘how to use’ the model is added to help educators and students in their teaching and learning, where we explain the theoretical steady-state equation time conditions when the model begins to allow data recording of maximum amplitudes to closely match the theoretical equation, and the steps to collect different runs of the degree of damping. We also discuss two of the design features in our computer model: displaying the instantaneous oscillation together with the achieved steady-state amplitudes, and the explicit world view overlay with scientific representation with different degrees of damping runs. Three advantages of using EJS include: (1) open source codes and creative commons attribution licenses for scaling up of interactively engaging educational practices; (2) the models made can run on almost any device, including Android and iOS; and (3) it allows the redefinition of physics educational practices through computer modeling.
Ecological and evolutionary dynamics of interconnectedness and modularity
Nordbotten, Jan M.; Levin, Simon A.; Szathmáry, Eörs; Stenseth, Nils C.
2018-01-01
In this contribution, we develop a theoretical framework for linking microprocesses (i.e., population dynamics and evolution through natural selection) with macrophenomena (such as interconnectedness and modularity within an ecological system). This is achieved by developing a measure of interconnectedness for population distributions defined on a trait space (generalizing the notion of modularity on graphs), in combination with an evolution equation for the population distribution. With this contribution, we provide a platform for understanding under what environmental, ecological, and evolutionary conditions ecosystems evolve toward being more or less modular. A major contribution of this work is that we are able to decompose the overall driver of changes at the macro level (such as interconnectedness) into three components: (i) ecologically driven change, (ii) evolutionarily driven change, and (iii) environmentally driven change. PMID:29311333
Visual traffic jam analysis based on trajectory data.
Wang, Zuchao; Lu, Min; Yuan, Xiaoru; Zhang, Junping; van de Wetering, Huub
2013-12-01
In this work, we present an interactive system for visual analysis of urban traffic congestion based on GPS trajectories. For these trajectories we develop strategies to extract and derive traffic jam information. After cleaning the trajectories, they are matched to a road network. Subsequently, traffic speed on each road segment is computed and traffic jam events are automatically detected. Spatially and temporally related events are concatenated in, so-called, traffic jam propagation graphs. These graphs form a high-level description of a traffic jam and its propagation in time and space. Our system provides multiple views for visually exploring and analyzing the traffic condition of a large city as a whole, on the level of propagation graphs, and on road segment level. Case studies with 24 days of taxi GPS trajectories collected in Beijing demonstrate the effectiveness of our system.
Fast determination of structurally cohesive subgroups in large networks
Sinkovits, Robert S.; Moody, James; Oztan, B. Tolga; White, Douglas R.
2016-01-01
Structurally cohesive subgroups are a powerful and mathematically rigorous way to characterize network robustness. Their strength lies in the ability to detect strong connections among vertices that not only have no neighbors in common, but that may be distantly separated in the graph. Unfortunately, identifying cohesive subgroups is a computationally intensive problem, which has limited empirical assessments of cohesion to relatively small graphs of at most a few thousand vertices. We describe here an approach that exploits the properties of cliques, k-cores and vertex separators to iteratively reduce the complexity of the graph to the point where standard algorithms can be used to complete the analysis. As a proof of principle, we apply our method to the cohesion analysis of a 29,462-vertex biconnected component extracted from a 128,151-vertex co-authorship data set. PMID:28503215
A Cheap, Semiquantitative Hand-Held Conductivity Tester.
ERIC Educational Resources Information Center
Zawacky, Susan K. S.
1995-01-01
Describes a design for a hand-held conductivity tester powered by a 9V battery that gives semi-quantitative results for aqueous electrolyte solutions of concentrations ranging from 0.001 M to 0.1 M. The tester uses a bar-graph LED driven by an LM3914 integrated circuit to indicate the level of conductivity. A list of parts, procedures, and results…
2015-06-18
platform assembly 2, with micro-mirror platform deflection, measured on actuation side ( PFa ) and side opposite actuation (PFo...beam micro-mirror platform assembly 1; micro-mirror platform deflection, measured on actuation side ( PFa ) and side opposite actuation (PFo...side ( PFa ) and side opposite actuation (PFo) ........................................................ 106 xiv Figure 73: Graph of measured 10-beam
Distortions in memory for visual displays
NASA Technical Reports Server (NTRS)
Tversky, Barbara
1989-01-01
Systematic errors in perception and memory present a challenge to theories of perception and memory and to applied psychologists interested in overcoming them as well. A number of systematic errors in memory for maps and graphs are reviewed, and they are accounted for by an analysis of the perceptual processing presumed to occur in comprehension of maps and graphs. Visual stimuli, like verbal stimuli, are organized in comprehension and memory. For visual stimuli, the organization is a consequence of perceptual processing, which is bottom-up or data-driven in its earlier stages, but top-down and affected by conceptual knowledge later on. Segregation of figure from ground is an early process, and figure recognition later; for both, symmetry is a rapidly detected and ecologically valid cue. Once isolated, figures are organized relative to one another and relative to a frame of reference. Both perceptual (e.g., salience) and conceptual factors (e.g., significance) seem likely to affect selection of a reference frame. Consistent with the analysis, subjects perceived and remembered curves in graphs and rivers in maps as more symmetric than they actually were. Symmetry, useful for detecting and recognizing figures, distorts map and graph figures alike. Top-down processes also seem to operate in that calling attention to the symmetry vs. asymmetry of a slightly asymmetric curve yielded memory errors in the direction of the description. Conceptual frame of reference effects were demonstrated in memory for lines embedded in graphs. In earlier work, the orientation of map figures was distorted in memory toward horizontal or vertical. In recent work, graph lines, but not map lines, were remembered as closer to an imaginary 45 deg line than they had been. Reference frames are determined by both perceptual and conceptual factors, leading to selection of the canonical axes as a reference frame in maps, but selection of the imaginary 45 deg as a reference frame in graphs.