similar analysis based: Topics by Science.gov

Sample records for similar analysis based

A new metaphor for projection-based visual analysis and data exploration

NASA Astrophysics Data System (ADS)

Schreck, Tobias; Panse, Christian

2007-01-01

In many important application domains such as Business and Finance, Process Monitoring, and Security, huge and quickly increasing volumes of complex data are collected. Strong efforts are underway to develop automatic and interactive analysis tools for mining useful information from these data repositories. Many data analysis algorithms require an appropriate definition of similarity (or distance) between data instances to allow meaningful clustering, classification, and retrieval, among other analysis tasks. Projection-based data visualization is highly interesting (a) for visual discrimination analysis of a data set within a given similarity definition, and (b) for comparative analysis of similarity characteristics of a given data set represented by different similarity definitions. We introduce an intuitive and effective novel approach for projection-based similarity visualization for interactive discrimination analysis, data exploration, and visual evaluation of metric space effectiveness. The approach is based on the convex hull metaphor for visually aggregating sets of points in projected space, and it can be used with a variety of different projection techniques. The effectiveness of the approach is demonstrated by application on two well-known data sets. Statistical evidence supporting the validity of the hull metaphor is presented. We advocate the hull-based approach over the standard symbol-based approach to projection visualization, as it allows a more effective perception of similarity relationships and class distribution characteristics.
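The convex hull metaphor described in this abstract can be illustrated with a minimal sketch. It assumes the data have already been projected to 2D by some technique (not shown) and that each point carries a class label; the function names `convex_hull` and `hulls_by_class` are hypothetical and are not taken from the paper. The hull routine is Andrew's monotone chain, a standard algorithm swapped in here for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors OA and OB."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        # Drop the last vertex while it would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other chain.
    return lower[:-1] + upper[:-1]


def hulls_by_class(projected: List[Point], labels: List[str]) -> Dict[str, List[Point]]:
    """Aggregate projected points per class label into one hull polygon each."""
    groups: Dict[str, List[Point]] = {}
    for p, lab in zip(projected, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: convex_hull(pts) for lab, pts in groups.items()}
```

Drawing one polygon per class instead of one symbol per point is what lets overlap between classes be read directly from overlapping hulls; how the hulls are rendered or scored is left open here.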
Binary similarity measures for fingerprint analysis of qualitative metabolomic profiles.

PubMed

Rácz, Anita; Andrić, Filip; Bajusz, Dávid; Héberger, Károly

2018-01-01

Contemporary metabolomic fingerprinting is based on multiple spectrometric and chromatographic signals, used either alone or combined with structural and chemical information of metabolic markers at the qualitative and semiquantitative level. However, signal shifting, convolution, and matrix effects may compromise metabolomic patterns. The recent increase in the use of qualitative metabolomic data, described by the presence (1) or absence (0) of particular metabolites, demonstrates great potential in the field of metabolomic profiling and fingerprint analysis. The aim of this study is a comprehensive evaluation of binary similarity measures for the elucidation of patterns among samples of different botanical origin and various metabolomic profiles. Nine qualitative metabolomic data sets covering a wide range of natural products and metabolomic profiles were applied to assess 44 binary similarity measures for the fingerprinting of plant extracts and natural products. The measures were analyzed by the novel sum of ranking differences method (SRD), searching for the most promising candidates. Baroni-Urbani-Buser (BUB) and Hawkins-Dotson (HD) similarity coefficients were selected as the best measures by SRD and analysis of variance (ANOVA), while Dice (Di1), Yule, Russell-Rao, and Consonni-Todeschini 3 ranked the worst. ANOVA revealed that concordantly and intermediately symmetric similarity coefficients are better candidates for metabolomic fingerprinting than the asymmetric and correlation-based ones. The fingerprint analysis based on the BUB and HD coefficients and qualitative metabolomic data performed as well as the quantitative metabolomic profile analysis. Fingerprint analysis based on the qualitative metabolomic profiles and binary similarity measures proved to be a reliable way of finding the same or similar patterns in metabolomic data as those extracted from quantitative data.
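To make the idea of a binary similarity measure concrete, the sketch below computes three coefficients from the usual 2x2 contingency counts of two presence/absence fingerprints: a (both present), b and c (present in only one), and d (both absent). The formulas are the commonly cited textbook forms of the Jaccard, Russell-Rao and Baroni-Urbani-Buser coefficients; the exact variants and naming used in the study are an assumption and may differ, and all function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def contingency(x: Sequence[int], y: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count a (1,1), b (1,0), c (0,1) and d (0,0) over two binary fingerprints."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same length")
    a = b = c = d = 0
    for xi, yi in zip(x, y):
        if xi and yi:
            a += 1
        elif xi:
            b += 1
        elif yi:
            c += 1
        else:
            d += 1
    return a, b, c, d


def _ratio(num: float, den: float) -> float:
    """Guarded division: return 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def jaccard(x: Sequence[int], y: Sequence[int]) -> float:
    """Asymmetric: joint absences (d) are ignored."""
    a, b, c, _ = contingency(x, y)
    return _ratio(a, a + b + c)


def russell_rao(x: Sequence[int], y: Sequence[int]) -> float:
    """Joint absences count only in the denominator."""
    a, b, c, d = contingency(x, y)
    return _ratio(a, a + b + c + d)


def baroni_urbani_buser(x: Sequence[int], y: Sequence[int]) -> float:
    """Joint absences enter through sqrt(a*d) in numerator and denominator."""
    a, b, c, d = contingency(x, y)
    root = math.sqrt(a * d)
    return _ratio(root + a, root + a + b + c)
```

The three functions differ only in how they treat d, the count of metabolites absent from both samples, which is the kind of distinction behind the symmetric versus asymmetric grouping mentioned in the abstract.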
Fast Depiction Invariant Visual Similarity for Content Based Image Retrieval Based on Data-driven Visual Similarity using Linear Discriminant Analysis

NASA Astrophysics Data System (ADS)

Wihardi, Y.; Setiawan, W.; Nugraha, E.

2018-01-01

In this research we try to build a content-based image retrieval system (CBIRS) based on a learned distance/similarity function using Linear Discriminant Analysis (LDA) and Histogram of Oriented Gradients (HoG) features. Our method is invariant to the depiction of an image, covering image-to-image, sketch-to-image, and painting-to-image similarity. LDA can decrease execution time compared to the state-of-the-art method, but it still needs improvement in terms of accuracy. The inaccuracy in our experiment arises because we did not perform a sliding-window search and because of the low number of negative samples used as natural-world images.
Comparative Analysis of Mass Spectral Similarity Measures on Peak Alignment for Comprehensive Two-Dimensional Gas Chromatography Mass Spectrometry

PubMed Central

2013-01-01

Peak alignment is a critical procedure in mass spectrometry-based biomarker discovery in metabolomics. One of the peak alignment approaches for comprehensive two-dimensional gas chromatography mass spectrometry (GC×GC-MS) data is peak matching-based alignment. A key to peak matching-based alignment is the calculation of mass spectral similarity scores. Various mass spectral similarity measures have been developed mainly for compound identification, but the effect of these spectral similarity measures on the performance of peak matching-based alignment remains unknown. Therefore, we selected five mass spectral similarity measures, cosine correlation, Pearson's correlation, Spearman's correlation, partial correlation, and part correlation, and examined their effects on peak alignment using two sets of experimental GC×GC-MS data. The results show that the spectral similarity measure does not affect the alignment accuracy significantly in analysis of data from less complex samples, while the partial correlation performs much better than other spectral similarity measures when analyzing experimental data acquired from complex biological samples. PMID:24151524
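Three of the five measures named above have simple closed forms, sketched here for two spectra that are assumed to be already binned onto a common m/z axis (so each spectrum is an equal-length intensity vector). Partial and part correlation need additional conditioning variables and are not covered. The function names are hypothetical and this is not the authors' implementation.

```python
import math
from typing import List, Sequence


def cosine(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two intensity vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's r: the cosine of the mean-centered vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return cosine([a - mx for a in x], [b - my for b in y])


def average_ranks(v: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    ranks = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In a peak matching-based alignment, a score like one of these would be computed for each candidate pair of peaks and the best-scoring pairs matched; that matching step is outside this sketch.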
A Systems Biology Approach for Identifying Hepatotoxicant Groups Based on Similarity in Mechanisms of Action and Chemical Structure.

PubMed

Hebels, Dennie G A J; Rasche, Axel; Herwig, Ralf; van Westen, Gerard J P; Jennen, Danyel G J; Kleinjans, Jos C S

2016-01-01

When evaluating compound similarity, addressing multiple sources of information to reach conclusions about common pharmaceutical and/or toxicological mechanisms of action is a crucial strategy. In this chapter, we describe a systems biology approach that incorporates analyses of hepatotoxicant data for 33 compounds from three different sources: a chemical structure similarity analysis based on the 3D Tanimoto coefficient, a chemical structure-based protein target prediction analysis, and a cross-study/cross-platform meta-analysis of in vitro and in vivo human and rat transcriptomics data derived from public resources (i.e., the diXa data warehouse). Hierarchical clustering of the outcome scores of the separate analyses did not result in a satisfactory grouping of compounds considering their known toxic mechanism as described in literature. However, a combined analysis of multiple data types may hypothetically compensate for missing or unreliable information in any of the single data types. We therefore performed an integrated clustering analysis of all three data sets using the R-based tool iClusterPlus. This indeed improved the grouping results. The compound clusters that were formed by means of iClusterPlus represent groups that show similar gene expression while simultaneously integrating a similarity in structure and protein targets, which corresponds much better with the known mechanism of action of these toxicants. Using an integrative systems biology approach may thus overcome the limitations of the separate analyses when grouping liver toxicants sharing a similar mechanism of toxicity.
An Analysis of Context-Based Similarity Tasks in Textbooks from Brazil and the United States

ERIC Educational Resources Information Center

Barcelos Amaral, Rúbia; Hollebrands, Karen

2017-01-01

Three textbooks from Brazil and three textbooks from the United States were analysed with a focus on similarity and context-based tasks. Students' opportunities to learn similarity were examined by considering whether students were provided context-based tasks of high cognitive demand and whether those tasks included missing or superfluous…
A new similarity index for nonlinear signal analysis based on local extrema patterns

NASA Astrophysics Data System (ADS)

Niknazar, Hamid; Motie Nasrabadi, Ali; Shamsollahi, Mohammad Bagher

2018-02-01

Common similarity measures of time domain signals such as cross-correlation and Symbolic Aggregate approximation (SAX) are not appropriate for nonlinear signal analysis. This is because of the high sensitivity of nonlinear systems to initial points. Therefore, a similarity measure for nonlinear signal analysis must be invariant to initial points and quantify the similarity by considering the main dynamics of signals. The statistical behavior of local extrema (SBLE) method was previously proposed to address this problem. The SBLE similarity index uses quantized amplitudes of local extrema to quantify the dynamical similarity of signals by considering patterns of sequential local extrema. By adding time information of local extrema as well as fuzzifying quantized values, this work proposes a new similarity index for nonlinear and long-term signal analysis, which extends the SBLE method. These new features provide more information about signals and reduce noise sensitivity by fuzzifying them. A number of practical tests were performed to demonstrate the ability of the method in nonlinear signal clustering and classification on synthetic data. In addition, epileptic seizure detection based on electroencephalography (EEG) signal processing was performed with the proposed similarity index to demonstrate the potential of the method as a real-world application tool.
A Profile-Based Framework for Factorial Similarity and the Congruence Coefficient.

PubMed

Hartley, Anselma G; Furr, R Michael

2017-01-01

We present a novel profile-based framework for understanding factorial similarity in the context of exploratory factor analysis in general, and for understanding the congruence coefficient (a commonly used index of factor similarity) specifically. First, we introduce the profile-based framework articulating factorial similarity in terms of 3 intuitive components: general saturation similarity, differential saturation similarity, and configural similarity. We then articulate the congruence coefficient in terms of these components, along with 2 additional profile-based components, and we explain how these components resolve ambiguities that can be (and are) found when using the congruence coefficient. Finally, we present secondary analyses revealing that profile-based components of factorial similarity are indeed linked to experts' actual evaluations of factorial similarity. Overall, the profile-based approach we present offers new insights into the ways in which researchers can examine factor similarity and holds the potential to enhance researchers' ability to understand the congruence coefficient.
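For orientation, the congruence coefficient is usually written in the form below (Tucker's congruence coefficient). The notation is an assumption made for this listing: x_i and y_i denote the loadings of variable i on the two factors being compared, and p is the number of variables. The abstract does not give the decomposition into profile-based components, so it is not reproduced here.

```latex
\phi(x, y) = \frac{\sum_{i=1}^{p} x_i \, y_i}{\sqrt{\left(\sum_{i=1}^{p} x_i^{2}\right)\left(\sum_{i=1}^{p} y_i^{2}\right)}}
```

Unlike a Pearson correlation, the loadings are not mean-centered, so the coefficient responds to the overall level of the loadings as well as to their pattern across variables.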
A study of concept-based similarity approaches for recommending program examples

NASA Astrophysics Data System (ADS)

Hosseini, Roya; Brusilovsky, Peter

2017-07-01

This paper investigates a range of concept-based example recommendation approaches that we developed to provide example-based problem-solving support in the domain of programming. The goal of these approaches is to offer students a set of the most relevant remedial examples when they have trouble solving a code comprehension problem where students examine a program code to determine its output or the final value of a variable. In this paper, we use the ideas of semantic-level similarity-based linking developed in the area of intelligent hypertext to generate examples for the given problem. To determine the best-performing approach, we explored two groups of similarity approaches for selecting examples: non-structural approaches focusing on examples that are similar to the problem in terms of concept coverage and structural approaches focusing on examples that are similar to the problem by the structure of the content. We also explored the value of personalized example recommendation based on students' knowledge levels and the learning goal of the exercise. The paper presents the concept-based similarity approaches that we developed, explains the data collection studies and reports the results of a comparative analysis. The results of our analysis showed better ranking performance of the personalized structural variant of the cosine similarity approach.
An analysis of context-based similarity tasks in textbooks from Brazil and the United States

NASA Astrophysics Data System (ADS)

Barcelos Amaral, Rúbia; Hollebrands, Karen

2017-11-01

Three textbooks from Brazil and three textbooks from the United States were analysed with a focus on similarity and context-based tasks. Students' opportunities to learn similarity were examined by considering whether students were provided context-based tasks of high cognitive demand and whether those tasks included missing or superfluous information. Although books in the United States included more tasks, the proportion of tasks focused on similarity was about the same. Context-based similarity tasks accounted for 9%-29% of the similarity tasks, and many of these contextual tasks were of low cognitive demand. In addition, the types of contexts that were included in the textbooks were critiqued and examples provided.
Functional Evolution of PLP-dependent Enzymes based on Active-Site Structural Similarities

PubMed Central

Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert

2014-01-01

Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5’-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the Comparison of Protein Active Site Structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. PMID:24920327
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.

PubMed

Catazaro, Jonathan; Caprez, Adam; Guru, Ashu; Swanson, David; Powers, Robert

2014-10-01

Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on active-site structural similarities has not yet been undertaken. Pyridoxal-5'-phosphate (PLP) dependent enzymes are primordial enzymes that diversified in the last universal ancestor. Using the comparison of protein active site structures (CPASS) software and database, we show that the active site structures of PLP-dependent enzymes can be used to infer evolutionary relationships based on functional similarity. The enzymes successfully clustered together based on substrate specificity, function, and three-dimensional-fold. This study demonstrates the value of using active site structures for functional evolutionary analysis and the effectiveness of CPASS. © 2014 Wiley Periodicals, Inc.
The human disease network in terms of dysfunctional regulatory mechanisms.

PubMed

Yang, Jing; Wu, Su-Juan; Dai, Wen-Tao; Li, Yi-Xue; Li, Yuan-Yuan

2015-10-08

Elucidation of human disease similarities has emerged as an active research area, which is highly relevant to etiology, disease classification, and drug repositioning. In pioneer studies, disease similarity was commonly estimated according to clinical manifestation. Subsequently, scientists started to investigate disease similarity based on gene-phenotype knowledge, which was inevitably biased toward well-studied diseases. In recent years, estimating disease similarity according to transcriptomic behavior has significantly enhanced the probability of finding novel disease relationships, but the currently available studies usually mine expression data through differential expression analysis, which has been considered to have little chance of unraveling dysfunctional regulatory relationships, the causal pathogenesis of diseases. We developed a computational approach to measure human disease similarity based on expression data. Differential coexpression analysis, instead of differential expression analysis, was employed to calculate the differential coexpression level of every gene for each disease, which was then summarized to the pathway level. Disease similarity was eventually calculated as the partial correlation coefficients of pathways' differential coexpression values between any two diseases. The significance of disease relationships was evaluated by permutation test. Based on mRNA expression data and a differential coexpression analysis based method, we built a human disease network involving 1326 significant Disease-Disease links among 108 diseases. Compared with disease relationships captured by a differential expression analysis based method, our disease links shared known disease genes and drugs more significantly. Some novel disease relationships were discovered, for example, Obesity and cancer, Obesity and Psoriasis, lung adenocarcinoma and S. pneumonia, which had been commonly regarded as unrelated to each other, but were recently found to share similar molecular mechanisms. Additionally, it was found that both the type of disease and the type of affected tissue influenced the degree of disease similarity. A sub-network including Allergic asthma, Type 2 diabetes and Chronic kidney disease was extracted to demonstrate the exploration of their common pathogenesis. The present study produces a global view of human diseasome for the first time from the viewpoint of regulation mechanisms, which therefore could provide insightful clues to etiology and pathogenesis, and help to perform drug repositioning and design novel therapeutic interventions.
SSAW: A new sequence similarity analysis method based on the stationary discrete wavelet transform.

PubMed

Lin, Jie; Wei, Jing; Adjeroh, Donald; Jiang, Bing-Hua; Jiang, Yue

2018-05-02

Alignment-free sequence similarity analysis methods often lead to significant savings in computational time over alignment-based counterparts. A new alignment-free sequence similarity analysis method, called SSAW, is proposed. SSAW stands for Sequence Similarity Analysis using the Stationary Discrete Wavelet Transform (SDWT). It extracts k-mers from a sequence, then maps each k-mer to a complex number field. Then, the series of complex numbers formed are transformed into feature vectors using the stationary discrete wavelet transform. After these steps, the original sequence is turned into a feature vector with numeric values, which can then be used for clustering and/or classification. Using two different types of applications, namely clustering and classification, we compared SSAW against state-of-the-art alignment-free sequence analysis methods. SSAW demonstrates competitive or superior performance in terms of standard indicators, such as accuracy, F-score, precision, and recall. The running time was significantly better in most cases. These results make SSAW a suitable method for sequence analysis, especially given the rapidly increasing volumes of sequence data required by most modern applications.
An alternative approach to measure similarity between two deterministic transient signals

NASA Astrophysics Data System (ADS)

Shin, Kihong

2016-06-01

In many practical engineering applications, it is often required to measure the similarity of two signals to gain insight into the conditions of a system. For example, an application that monitors machinery can regularly measure the vibration signal and compare it to a healthy reference signal in order to monitor whether or not any fault symptom is developing. Also in modal analysis, a frequency response function (FRF) from a finite element model (FEM) is often compared with an FRF from experimental modal analysis. Many different similarity measures are applicable in such cases, and correlation-based similarity measures may be the most frequently used among these, for example the correlation coefficient in the time domain and the frequency response assurance criterion (FRAC) in the frequency domain. Although correlation-based similarity measures may be particularly useful for random signals because they are based on probability and statistics, we frequently deal with signals that are largely deterministic and transient. Thus, it may be useful to develop another similarity measure that takes the characteristics of the deterministic transient signal properly into account. In this paper, an alternative approach to measure the similarity between two deterministic transient signals is proposed. This newly proposed similarity measure is based on the fictitious system frequency response function, and it consists of the magnitude similarity and the shape similarity. Finally, a few examples are presented to demonstrate the use of the proposed similarity measure.
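As a point of reference for the correlation-based baseline mentioned above, the sketch below computes the frequency response assurance criterion (FRAC) in its commonly used form: the squared magnitude of the inner product of two FRFs divided by the product of their energies. It assumes both FRFs are sampled at the same frequency lines; the function name `frac` is hypothetical. This is the conventional measure the abstract contrasts against, not the newly proposed one based on the fictitious system FRF.

```python
from typing import Sequence


def frac(h1: Sequence[complex], h2: Sequence[complex]) -> float:
    """FRAC between two FRFs sampled at the same frequency lines (0 to 1)."""
    if len(h1) != len(h2):
        raise ValueError("FRFs must be sampled at the same frequencies")
    # Inner product of h1 with the complex conjugate of h2.
    inner = sum(a * b.conjugate() for a, b in zip(h1, h2))
    energy = sum(abs(a) ** 2 for a in h1) * sum(abs(b) ** 2 for b in h2)
    return abs(inner) ** 2 / energy if energy else 0.0
```

A FRAC of 1 means the two FRFs are identical up to a complex scale factor, so a pure gain or phase offset between model and experiment is not penalized.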
Automatized Assessment of Protective Group Reactivity: A Step Toward Big Reaction Data Analysis.

PubMed

Lin, Arkadii I; Madzhidov, Timur I; Klimchuk, Olga; Nugmanov, Ramil I; Antipin, Igor S; Varnek, Alexandre

2016-11-28

We report a new method to assess protective group (PG) reactivity as a function of reaction conditions (catalyst, solvent) using raw reaction data. It is based on an intuitive similarity principle for chemical reactions: similar reactions proceed under similar conditions. Technically, reaction similarity can be assessed using the Condensed Graph of Reaction (CGR) approach representing an ensemble of reactants and products as a single molecular graph, i.e., as a pseudomolecule for which molecular descriptors or fingerprints can be calculated. CGR-based in-house tools were used to process data for 142,111 catalytic hydrogenation reactions extracted from the Reaxys database. Our results reveal some contradictions with the famous Greene's Reactivity Charts, which are based on manual expert analysis. Models developed in this study show high accuracy (ca. 90%) for predicting optimal experimental conditions of protective group deprotection.
Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

PubMed

Wan, B; Yarbrough, J W; Schultz, T W

2008-01-01

This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 microM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. One-methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction analysis of microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, significance analysis of microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways being significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Information filtering based on corrected redundancy-eliminating mass diffusion.

PubMed

Zhu, Xuzhen; Yang, Yujie; Chen, Guilin; Medo, Matus; Tian, Hui; Cai, Shi-Min

2017-01-01

Methods used in information filtering and recommendation often rely on quantifying the similarity between objects or users. The similarity metrics used often suffer from similarity redundancies arising from correlations between objects' attributes. Based on an unweighted undirected object-user bipartite network, we propose a Corrected Redundancy-Eliminating similarity index (CRE) which is based on a spreading process on the network. Extensive experiments on three benchmark data sets (MovieLens, Netflix and Amazon) show that when used in recommendation, the CRE yields significant improvements in terms of recommendation accuracy and diversity. A detailed analysis is presented to unveil the origins of the observed differences between the CRE and mainstream similarity indices.
Using text analysis to quantify the similarity and evolution of scientific disciplines

PubMed Central

Dias, Laércio; Scharloth, Joachim

2018-01-01

We use an information-theoretic measure of linguistic similarity to investigate the organization and evolution of scientific fields. An analysis of almost 20 M papers from the past three decades reveals that the linguistic similarity is related to, but different from, expert- and citation-based classifications, leading to an improved view on the organization of science. A temporal analysis of the similarity of fields shows that some fields (e.g. computer science) are becoming increasingly central, but that on average the similarity between pairs of disciplines has not changed in the last decades. This suggests that tendencies of convergence (e.g. multi-disciplinarity) and divergence (e.g. specialization) of disciplines are in balance. PMID:29410857
Using text analysis to quantify the similarity and evolution of scientific disciplines.

PubMed

Dias, Laércio; Gerlach, Martin; Scharloth, Joachim; Altmann, Eduardo G

2018-01-01

We use an information-theoretic measure of linguistic similarity to investigate the organization and evolution of scientific fields. An analysis of almost 20 M papers from the past three decades reveals that the linguistic similarity is related to, but different from, expert- and citation-based classifications, leading to an improved view on the organization of science. A temporal analysis of the similarity of fields shows that some fields (e.g. computer science) are becoming increasingly central, but that on average the similarity between pairs of disciplines has not changed in the last decades. This suggests that tendencies of convergence (e.g. multi-disciplinarity) and divergence (e.g. specialization) of disciplines are in balance.

Brain activity across the development of automatic categorization: A comparison of categorization tasks using multi-voxel pattern analysis

PubMed Central

Soto, Fabian A.; Waldschmidt, Jennifer G.; Helie, Sebastien; Ashby, F. Gregory

2013-01-01

Previous evidence suggests that relatively separate neural networks underlie initial learning of rule-based and information-integration categorization tasks. With the development of automaticity, categorization behavior in both tasks becomes increasingly similar and exclusively related to activity in cortical regions. The present study uses multi-voxel pattern analysis to directly compare the development of automaticity in different categorization tasks. Each of three groups of participants received extensive training in a different categorization task: either an information-integration task, or one of two rule-based tasks. Four training sessions were performed inside an MRI scanner. Three different analyses were performed on the imaging data from a number of regions of interest (ROIs). The common patterns analysis had the goal of revealing ROIs with similar patterns of activation across tasks. The unique patterns analysis had the goal of revealing ROIs with dissimilar patterns of activation across tasks. The representational similarity analysis aimed at exploring (1) the similarity of category representations across ROIs and (2) how those patterns of similarities compared across tasks. The results showed that common patterns of activation were present in motor areas and basal ganglia early in training, but only in the former later on. Unique patterns were found in a variety of cortical and subcortical areas early in training, but they were dramatically reduced with training. Finally, patterns of representational similarity between brain regions became increasingly similar across tasks with the development of automaticity. PMID:23333700
A Query Expansion Framework in Image Retrieval Domain Based on Local and Global Analysis

PubMed Central

Rahman, M. M.; Antani, S. K.; Thoma, G. R.

2011-01-01

We present an image retrieval framework based on automatic query expansion in a concept feature space by generalizing the vector space model of information retrieval. In this framework, images are represented by vectors of weighted concepts similar to the keyword-based representation used in text retrieval. To generate the concept vocabularies, a statistical model is built by utilizing Support Vector Machine (SVM)-based classification techniques. The images are represented as “bag of concepts” that comprise perceptually and/or semantically distinguishable color and texture patches from local image regions in a multi-dimensional feature space. To explore the correlation between the concepts and overcome the assumption of feature independence in this model, we propose query expansion techniques in the image domain from a new perspective based on both local and global analysis. For the local analysis, the correlations between the concepts based on the co-occurrence pattern, and the metrical constraints based on the neighborhood proximity between the concepts in encoded images, are analyzed by considering local feedback information. We also analyze the concept similarities in the collection as a whole in the form of a similarity thesaurus and propose an efficient query expansion based on the global analysis. The experimental results on a photographic collection of natural scenes and a biomedical database of different imaging modalities demonstrate the effectiveness of the proposed framework in terms of precision and recall. PMID:21822350
Sentence Similarity Analysis with Applications in Automatic Short Answer Grading

ERIC Educational Resources Information Center

Mohler, Michael A. G.

2012-01-01

In this dissertation, I explore unsupervised techniques for the task of automatic short answer grading. I compare a number of knowledge-based and corpus-based measures of text similarity, evaluate the effect of domain and size on the corpus-based measures, and also introduce a novel technique to improve the performance of the system by integrating…
Geographical classification of Epimedium based on HPLC fingerprint analysis combined with multi-ingredients quantitative analysis.

PubMed

Xu, Ning; Zhou, Guofu; Li, Xiaojuan; Lu, Heng; Meng, Fanyun; Zhai, Huaqiang

2017-05-01

A reliable and comprehensive method for identifying the origin and assessing the quality of Epimedium has been developed. The method is based on analysis of HPLC fingerprints, combined with similarity analysis, hierarchical cluster analysis (HCA), principal component analysis (PCA) and multi-ingredient quantitative analysis. Nineteen batches of Epimedium, collected from different areas in the western regions of China, were used to establish the fingerprints and 18 peaks were selected for the analysis. Similarity analysis, HCA and PCA all classified the 19 areas into three groups. Simultaneous quantification of the five major bioactive ingredients in the Epimedium samples was also carried out to confirm the consistency of the quality tests. These methods were successfully used to identify the geographical origin of the Epimedium samples and to evaluate their quality. Copyright © 2016 John Wiley & Sons, Ltd.
Analysis of HIV-1 intersubtype recombination breakpoints suggests region with high pairing probability may be a more fundamental factor than sequence similarity affecting HIV-1 recombination.

PubMed

Jia, Lei; Li, Lin; Gui, Tao; Liu, Siyang; Li, Hanping; Han, Jingwan; Guo, Wei; Liu, Yongjian; Li, Jingyun

2016-09-21

With increasing data on HIV-1, a more relevant molecular model describing mechanism details of HIV-1 genetic recombination usually requires upgrades. Currently an incomplete structural understanding of the copy choice mechanism along with several other issues in the field that lack elucidation led us to perform an analysis of the correlation between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarity to further explore structural mechanisms. Near full length sequences of URFs from Asia, Europe, and Africa (one sequence/patient), and representative sequences of worldwide CRFs were retrieved from the Los Alamos HIV database. Their recombination patterns were analyzed by jpHMM in detail. Then the relationships between breakpoint distributions and (1) the probability of base pairing, and (2) intersubtype genetic similarities were investigated. Pearson correlation test showed that all URF groups and the CRF group exhibit the same breakpoint distribution pattern. Additionally, the Wilcoxon two-sample test indicated a significant and inexplicable limitation of recombination in regions with high pairing probability. These regions have been found to be strongly conserved across distinct biological states (i.e., strong intersubtype similarity), and genetic similarity has been determined to be a very important factor promoting recombination. Thus, the results revealed an unexpected disagreement between intersubtype similarity and breakpoint distribution, which were further confirmed by genetic similarity analysis. Our analysis reveals a critical conflict between results from natural HIV-1 isolates and those from HIV-1-based assay vectors in which genetic similarity has been shown to be a very critical factor promoting recombination. These results indicate the region with high-pairing probabilities may be a more fundamental factor affecting HIV-1 recombination than sequence similarity in natural HIV-1 infections. 
Our findings will be relevant in furthering the understanding of HIV-1 recombination mechanisms.
Information filtering based on corrected redundancy-eliminating mass diffusion

PubMed Central

Zhu, Xuzhen; Yang, Yujie; Chen, Guilin; Medo, Matus; Tian, Hui

2017-01-01

Methods used in information filtering and recommendation often rely on quantifying the similarity between objects or users. The similarity metrics in use often suffer from redundancies arising from correlations between objects' attributes. Based on an unweighted undirected object-user bipartite network, we propose a Corrected Redundancy-Eliminating similarity index (CRE) which is based on a spreading process on the network. Extensive experiments on three benchmark data sets (MovieLens, Netflix and Amazon) show that when used in recommendation, the CRE yields significant improvements in terms of recommendation accuracy and diversity. A detailed analysis is presented to unveil the origins of the observed differences between the CRE and mainstream similarity indices. PMID:28749976
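The spreading process mentioned above can be illustrated with the standard mass diffusion similarity on an object-user bipartite network. This is a minimal sketch of plain mass diffusion only; it does not implement the redundancy-eliminating correction of the CRE index, whose formula is not given in the abstract. The function name and the toy adjacency matrix are assumptions made for illustration.

```python
import numpy as np

def mass_diffusion_matrix(adjacency):
    """Return W where W[i, j] is the share of object j's resource that reaches object i.

    adjacency has shape (n_objects, n_users); adjacency[i, u] = 1 if user u collected object i.
    """
    a = np.asarray(adjacency, dtype=float)
    object_degree = a.sum(axis=1)  # number of users per object
    user_degree = a.sum(axis=0)    # number of objects per user
    if np.any(object_degree == 0) or np.any(user_degree == 0):
        raise ValueError("every object and every user needs at least one link")
    # Step 1: each object splits its resource equally among its users (divide by object degree).
    # Step 2: each user splits what it received equally among its objects (divide by user degree).
    return (a / user_degree) @ a.T / object_degree

# Hypothetical toy network: 4 objects (rows), 3 users (columns).
toy = np.array([[1, 1, 0],
                [1, 0, 1],
                [0, 1, 1],
                [0, 0, 1]])
W = mass_diffusion_matrix(toy)
scores_user0 = W @ toy[:, 0]  # resource reaching each object from user 0's collection
```

For user 0, who has collected objects 0 and 1, the uncollected object with the highest score is the one that would be recommended first.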
Polynomial Conjoint Analysis of Similarities: A Model for Constructing Polynomial Conjoint Measurement Algorithms.

ERIC Educational Resources Information Center

Young, Forrest W.

A model permitting construction of algorithms for the polynomial conjoint analysis of similarities is presented. This model, which is based on concepts used in nonmetric scaling, permits one to obtain the best approximate solution. The concepts used to construct nonmetric scaling algorithms are reviewed. Finally, examples of algorithmic models for…
Explosion Source Similarity Analysis via SVD

NASA Astrophysics Data System (ADS)

Yedlin, Matthew; Ben Horin, Yochai; Margrave, Gary

2016-04-01

An important seismological ingredient for establishing a regional seismic nuclear discriminant is the similarity analysis of a sequence of explosion sources. To investigate source similarity, we are fortunate to have access to a sequence of 1805 three-component recordings of quarry blasts, shot from March 2002 to January 2015. The centroid of these blasts has an estimated location 36.3E and 29.9N. All blasts were detonated by JPMC (Jordan Phosphate Mines Co.). All data were recorded at the Israeli NDC, HFRI, located at 30.03N and 35.03E. Data were first winnowed based on the distribution of maximum amplitudes in the neighborhood of the P-wave arrival. The winnowed data were then detrended using the algorithm of Cleveland et al. (1990). The detrended data were bandpass filtered between 0.1 and 12 Hz using an eighth-order Butterworth filter. Finally, data were sorted based on maximum trace amplitude. Two similarity analysis approaches were used. First, for each component, the entire suite of traces was decomposed into its eigenvector representation by employing singular value decomposition (SVD). The data were then reconstructed using 10 percent of the singular values, with the resulting enhancement of the S-wave and surface wave arrivals. The results of this first method are then compared to the second analysis method, based on the eigenface decomposition analysis of Turk and Pentland (1991). While both methods yield similar results in enhancement of data arrivals and reduction of data redundancy, more analysis is required to calibrate the recorded data to charge size, a quantity that was not available for the current study. References: Cleveland, R. B., Cleveland, W. S., McRae, J. E., and Terpenning, I., STL: A seasonal-trend decomposition procedure based on loess, Journal of Official Statistics, 6, No. 1, 3-73, 1990. Turk, M. and Pentland, A., Eigenfaces for recognition, Journal of Cognitive Neuroscience, 3(1), 71-86, 1991.
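The SVD step described above (decompose the suite of traces, then rebuild it from 10 percent of the singular values) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' processing chain: the function name, the trace matrix layout (one trace per row) and the synthetic wavelet are all assumptions.

```python
import numpy as np

def low_rank_reconstruct(traces, keep_fraction=0.10):
    """Rebuild a (n_traces, n_samples) matrix from its largest singular values."""
    u, s, vt = np.linalg.svd(traces, full_matrices=False)
    k = max(1, int(np.ceil(keep_fraction * s.size)))  # number of singular values kept
    return (u[:, :k] * s[:k]) @ vt[:k, :]

# Synthetic stand-in for sorted recordings: one shared wavelet with varying amplitude plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 200)
wavelet = np.sin(2 * np.pi * 8 * t) * np.exp(-5 * t)
clean = np.outer(rng.uniform(0.5, 1.5, size=40), wavelet)
noisy = clean + 0.3 * rng.standard_normal(clean.shape)

denoised = low_rank_reconstruct(noisy, keep_fraction=0.10)
```

Because the shared waveform concentrates in the leading singular vectors, discarding the small singular values suppresses trace-to-trace noise while keeping the coherent arrivals.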
A-DaGO-Fun: an adaptable Gene Ontology semantic similarity-based functional analysis tool.

PubMed

Mazandu, Gaston K; Chimusa, Emile R; Mbiyavanga, Mamana; Mulder, Nicola J

2016-02-01

Gene Ontology (GO) semantic similarity measures are being used for biological knowledge discovery based on GO annotations by integrating biological information contained in the GO structure into data analyses. To empower users to quickly compute, manipulate and explore these measures, we introduce A-DaGO-Fun (ADaptable Gene Ontology semantic similarity-based Functional analysis). It is a portable software package integrating all known GO information content-based semantic similarity measures and relevant biological applications associated with these measures. A-DaGO-Fun has the advantage not only of handling datasets from the current high-throughput genome-wide applications, but also allowing users to choose the most relevant semantic similarity approach for their biological applications and to adapt a given module to their needs. A-DaGO-Fun is freely available to the research community at http://web.cbio.uct.ac.za/ITGOM/adagofun. It is implemented in Linux using Python under free software (GNU General Public Licence). Contact: gmazandu@cbio.uct.ac.za or Nicola.Mulder@uct.ac.za. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Tailoring community-based wellness initiatives with latent class analysis--Massachusetts Community Transformation Grant projects.

PubMed

Arcaya, Mariana; Reardon, Timothy; Vogel, Joshua; Andrews, Bonnie K; Li, Wenjun; Land, Thomas

2014-02-13

Community-based approaches to preventing chronic diseases are attractive because of their broad reach and low costs, and as such, are integral components of health care reform efforts. Implementing community-based initiatives across Massachusetts' municipalities presents both programmatic and evaluation challenges. For effective delivery and evaluation of the interventions, establishing a community typology that groups similar municipalities provides a balanced and cost-effective approach. Through a series of key informant interviews and exploratory data analysis, we identified 55 municipal-level indicators of 6 domains for the typology analysis. The domains were health behaviors and health outcomes, housing and land use, transportation, retail environment, socioeconomics, and demographic composition. A latent class analysis was used to identify 10 groups of municipalities based on similar patterns of municipal-level indicators across the domains. Our model with 10 latent classes yielded excellent classification certainty (relative entropy = .995, minimum class probability for any class = .871), and differentiated distinct groups of municipalities based on health-relevant needs and resources. The classes differentiated healthy and racially and ethnically diverse urban areas from cities with similar population densities and diversity but worse health outcomes, affluent communities from lower-income rural communities, and mature suburban areas from rapidly suburbanizing communities with different healthy-living challenges. Latent class analysis is a tool that may aid in the planning, communication, and evaluation of community-based wellness initiatives such as Community Transformation Grants projects administrated by the Centers for Disease Control and Prevention.
Clustering More than Two Million Biomedical Publications: Comparing the Accuracies of Nine Text-Based Similarity Approaches

PubMed Central

Boyack, Kevin W.; Newman, David; Duhon, Russell J.; Klavans, Richard; Patek, Michael; Biberstine, Joseph R.; Schijvenaars, Bob; Skupin, André; Ma, Nianli; Börner, Katy

2011-01-01

Background: We investigate the accuracy of different similarity approaches for clustering over two million biomedical documents. Clustering large sets of text documents is important for a variety of information needs and applications such as collection management and navigation, summary and analysis. The few comparisons of clustering results from different similarity approaches have focused on small literature sets and have given conflicting results. Our study was designed to seek a robust answer to the question of which similarity approach would generate the most coherent clusters of a biomedical literature set of over two million documents. Methodology: We used a corpus of 2.15 million recent (2004-2008) records from MEDLINE, and generated nine different document-document similarity matrices from information extracted from their bibliographic records, including titles, abstracts and subject headings. The nine approaches comprised five different analytical techniques with two data sources. The five analytical techniques are cosine similarity using term frequency-inverse document frequency vectors (tf-idf cosine), latent semantic analysis (LSA), topic modeling, and two Poisson-based language models: BM25 and PMRA (PubMed Related Articles). The two data sources were a) MeSH subject headings, and b) words from titles and abstracts. Each similarity matrix was filtered to keep the top-n highest similarities per document and then clustered using a combination of graph layout and average-link clustering. Cluster results from the nine similarity approaches were compared using (1) within-cluster textual coherence based on the Jensen-Shannon divergence, and (2) two concentration measures based on grant-to-article linkages indexed in MEDLINE. Conclusions: PubMed's own related article approach (PMRA) generated the most coherent and most concentrated cluster solution of the nine text-based similarity approaches tested, followed closely by the BM25 approach using titles and abstracts. Approaches using only MeSH subject headings were not competitive with those based on titles and abstracts. PMID:21437291
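Of the five techniques listed, tf-idf cosine followed by top-n filtering is the simplest to illustrate. The sketch below uses whitespace tokenization, raw term counts and an unsmoothed logarithmic idf; these choices, the function names and the four toy documents are assumptions, since the abstract does not state the exact weighting used in the study.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn raw strings into sparse {term: weight} tf-idf vectors."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({term: tf * math.log(n_docs / doc_freq[term])
                        for term, tf in counts.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_n_neighbors(vectors, n):
    """Keep only the n most similar other documents for each document."""
    result = {}
    for i, vi in enumerate(vectors):
        scored = [(cosine(vi, vj), j) for j, vj in enumerate(vectors) if j != i]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first, ties by index
        result[i] = [j for _, j in scored[:n]]
    return result

# Hypothetical mini-corpus standing in for titles and abstracts.
docs = [
    "gene expression clustering",
    "gene expression profiling",
    "traffic signal entropy",
    "traffic signal complexity",
]
neighbors = top_n_neighbors(tfidf_vectors(docs), n=1)
```

The filtered neighbor lists form a sparse similarity graph, which is the kind of input a graph layout or average-link clustering step would then consume.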
Impact of Spatial Scales on the Intercomparison of Climate Scenarios

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luo, Wei; Steptoe, Michael; Chang, Zheng

2017-01-01

Scenario analysis has been widely applied in climate science to understand the impact of climate change on the future human environment, but intercomparison and similarity analysis of different climate scenarios based on multiple simulation runs remain challenging. Although spatial heterogeneity plays a key role in modeling climate and human systems, little research has been performed to understand the impact of spatial variations and scales on similarity analysis of climate scenarios. To address this issue, the authors developed a geovisual analytics framework that lets users perform similarity analysis of climate scenarios from the Global Change Assessment Model (GCAM) using a hierarchical clustering approach.
Ontology- and graph-based similarity assessment in biological networks.

PubMed

Wang, Haiying; Zheng, Huiru; Azuaje, Francisco

2010-10-15

A standard systems-based approach to biomarker and drug target discovery consists of placing putative biomarkers in the context of a network of biological interactions, followed by different 'guilt-by-association' analyses. The latter is typically done based on network structural features. Here, an alternative analysis approach in which the networks are analyzed on a 'semantic similarity' space is reported. Such information is extracted from ontology-based functional annotations. We present SimTrek, a Cytoscape plugin for ontology-based similarity assessment in biological networks. Availability: http://rosalind.infj.ulst.ac.uk/SimTrek.html. Contact: francisco.azuaje@crp-sante.lu. Supplementary data are available at Bioinformatics online.
Structural similitude and scaling laws for laminated beam-plates

NASA Technical Reports Server (NTRS)

Simitses, George J.; Rezaeepazhand, Jalil

1992-01-01

The establishment of similarity conditions between two structural systems is discussed. Similarity conditions provide the relationship between a scale model and its prototype and can be used to predict the behavior of the prototype by extrapolating the experimental data of the corresponding small-scale model. Since satisfying all the similarity conditions simultaneously is difficult or even impossible, distorted models with partial similarity (with at least one similarity condition relaxed) are more practical. Establishing similarity conditions based on both dimensional analysis and direct use of governing equations is discussed, and the possibility of designing distorted models is investigated. The method is demonstrated through analysis of the cylindrical bending of orthotropic laminated beam-plates subjected to transverse line loads.
Similarity networks as a knowledge representation for space applications

NASA Technical Reports Server (NTRS)

Bailey, David; Thompson, Donna; Feinstein, Jerald

1987-01-01

Similarity networks are a powerful form of knowledge representation that are useful for many artificial intelligence applications. Similarity networks are used in applications ranging from information analysis and case based reasoning to machine learning and linking symbolic to neural processing. Strengths of similarity networks include simple construction, intuitive object storage, and flexible retrieval techniques that facilitate inferencing. Therefore, similarity networks provide great potential for space applications.
Case Selection via Matching

ERIC Educational Resources Information Center

Nielsen, Richard A.

2016-01-01

This article shows how statistical matching methods can be used to select "most similar" cases for qualitative analysis. I first offer a methodological justification for research designs based on selecting most similar cases. I then discuss the applicability of existing matching methods to the task of selecting most similar cases and…
Prediction and Analysis of the Nonsteady Transition and Separation Processes on an Oscillating Wind Turbine Airfoil using the γ-Re_θ Transition Model.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Nandi, Taraj; Brasseur, James; Vijayakumar, Ganesh

2016-01-04

This study is aimed at gaining insight into the nonsteady transitional boundary layer dynamics of wind turbine blades and the predictive capabilities of URANS based transition and turbulence models for similar physics through the analysis of a controlled flow with similar nonsteady parameters.
Phylogenetic Analysis of Shewanella Strains by DNA Relatedness Derived from Whole Genome Microarray DNA-DNA Hybridization and Comparison with Other Methods

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wu, Liyou; Yi, T. Y.; Van Nostrand, Joy

Phylogenetic analyses were done for the Shewanella strains isolated from the Baltic Sea (38 strains), the US DOE Hanford Uranium bioremediation site [Hanford Reach of the Columbia River (HRCR), 11 strains], Pacific Ocean and Hawaiian sediments (8 strains), and strains from other resources (16 strains) with three outgroup strains, Rhodopseudomonas palustris, Clostridium cellulolyticum, and Thermoanaerobacter ethanolicus X514, using DNA relatedness derived from WCGA-based DNA-DNA hybridizations, sequence similarities of the 16S rRNA gene and gyrB gene, and sequence similarities of 6 loci of the Shewanella genome selected from a shared gene list of the Shewanella strains with whole genomes sequenced, based on their average nucleotide identity (ANI). The phylogenetic trees based on 16S rRNA and gyrB gene sequences, and DNA relatedness derived from WCGA hybridizations of the tested Shewanella strains share exactly the same sub-clusters with very few exceptions, in which the strains were basically grouped by species. However, the phylogenetic analysis based on DNA relatedness derived from WCGA hybridizations dramatically increased the differentiation resolution at the species and strain level within the Shewanella genus. When the tree based on DNA relatedness derived from WCGA hybridizations was compared to the tree based on the combined sequences of the selected functional genes (6 loci), we found that the resolutions of both methods are similar, but the clustering of the tree based on DNA relatedness derived from WCGA hybridizations was clearer. These results indicate that WCGA-based DNA-DNA hybridization is an ideal alternative to conventional DNA-DNA hybridization methods and that it is superior to phylogenetic methods based on sequence similarities of single genes. Detailed analysis is being performed for the re-classification of the strains examined.
Visual reconciliation of alternative similarity spaces in climate modeling

Treesearch

J Poco; A Dasgupta; Y Wei; William Hargrove; C.R. Schwalm; D.N. Huntzinger; R Cook; E Bertini; C.T. Silva

2015-01-01

Visual data analysis often requires grouping of data objects based on their similarity. In many application domains researchers use algorithms and techniques like clustering and multidimensional scaling to extract groupings from data. While extracting these groups using a single similarity criterion is relatively straightforward, comparing alternative criteria poses...
CHAPTER 10: CURRENT TECHNICAL PROBLEMS IN EMERGY ANALYSIS

EPA Science Inventory

Technical problems related to the determination of the emergy base for self-organization in environmental systems are considered in this paper. The comparability of emergy analysis results depends on emergy analysts making similar choices in determining the emergy base for a part...

An improved method for functional similarity analysis of genes based on Gene Ontology.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-12-23

Measures of gene functional similarity are essential tools for gene clustering, gene function prediction, evaluation of protein-protein interaction, disease gene prioritization and other applications. In recent years, many gene functional similarity methods have been proposed based on the semantic similarity of GO terms. However, these leading approaches may make error-prone judgments, especially when they measure the specificity of GO terms as well as the information content (IC) of a term set. Therefore, how to estimate the gene functional similarity reliably is still a challenging problem. We propose WIS, an effective method to measure the gene functional similarity. First, WIS computes the IC of a term by employing its depth, the number of its ancestors and the topology of its descendants in the GO graph. Second, WIS calculates the IC of a term set by considering the weighted inherited semantics of terms. Finally, WIS estimates the gene functional similarity based on the IC overlap ratio of term sets. WIS is superior to some other representative measures in experiments on functional classification of genes in a biological pathway, collaborative evaluation of GO-based semantic similarity measures, protein-protein interaction prediction and correlation with gene expression. Further analysis suggests that WIS fully takes into account the specificity of terms and the weighted inherited semantics between GO terms. The proposed WIS method is an effective and reliable way to compare gene function. The web service of WIS is freely available at http://nclab.hit.edu.cn/WIS/.
Discriminative structural approaches for enzyme active-site prediction.

PubMed

Kato, Tsuyoshi; Nagano, Nozomi

2011-02-15

Predicting enzyme active-sites in proteins is an important issue not only for protein sciences but also for a variety of practical applications such as drug design. Because enzyme reaction mechanisms are based on the local structures of enzyme active-sites, various template-based methods that compare local structures in proteins have been developed to date. In comparing such local sites, a simple measurement, RMSD, has been used so far. This paper introduces new machine learning algorithms that refine the similarity/deviation for comparison of local structures. The similarity/deviation is applied to two types of applications, single template analysis and multiple template analysis. In the single template analysis, a single template is used as a query to search proteins for active sites, whereas a protein structure is examined as a query to discover the possible active-sites using a set of templates in the multiple template analysis. This paper experimentally illustrates that the machine learning algorithms effectively improve the similarity/deviation measurements for both the analyses.
Generalized sample entropy analysis for traffic signals based on similarity measure

NASA Astrophysics Data System (ADS)

Shang, Du; Xu, Mengjia; Shang, Pengjian

2017-05-01

Sample entropy is a prevailing method used to quantify the complexity of a time series. In this paper a modified method of generalized sample entropy and surrogate data analysis is proposed as a new measure to assess the complexity of a complex dynamical system such as traffic signals. The method, based on similarity distance, offers a different way of matching signal patterns and reveals distinct complexity behaviors. Simulations are conducted over synthetic data and traffic signals to provide a comparative study that demonstrates the power of the new method. Compared with previous sample entropy and surrogate data analysis, the new method has two main advantages. The first is that it overcomes the limitation on the relationship between the dimension parameter and the length of the series. The second is that the modified sample entropy functions can be used to quantitatively distinguish time series from different complex systems by the similarity measure.
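For orientation, the classical sample entropy that the paper modifies can be sketched in a few lines. This is the conventional definition with Chebyshev distance between templates, not the generalized, similarity-distance-based variant proposed by the authors; the function name, the absolute tolerance `r` and the test series are assumptions.

```python
import math
import random

def sample_entropy(series, m=2, r=0.2):
    """Classical sample entropy: -ln(A / B), with r an absolute tolerance.

    B counts pairs of length-m templates within tolerance r (Chebyshev distance);
    A counts the same for length m + 1. Self-matches are excluded.
    """
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return math.inf  # undefined for this series length and tolerance
    return -math.log(a / b)

regular = [0.0, 1.0] * 20                        # perfectly periodic signal
rng = random.Random(0)
irregular = [rng.random() for _ in range(200)]   # uniform noise

regular_entropy = sample_entropy(regular)
irregular_entropy = sample_entropy(irregular)
```

A periodic signal scores zero because every length-m match extends to a length-(m + 1) match, while noise scores higher. In practice `r` is often set relative to the standard deviation of the series rather than as a fixed value.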
Musical structure analysis using similarity matrix and dynamic programming

NASA Astrophysics Data System (ADS)

Shiu, Yu; Jeong, Hong; Kuo, C.-C. Jay

2005-10-01

Automatic music segmentation and structure analysis from audio waveforms based on a three-level hierarchy is examined in this research, where the three-level hierarchy includes notes, measures and parts. The pitch class profile (PCP) feature is first extracted at the note level. Then, a similarity matrix is constructed at the measure level, where a dynamic time warping (DTW) technique is used to enhance the similarity computation by taking the temporal distortion of similar audio segments into account. By processing the similarity matrix, we can obtain a coarse-grain music segmentation result. Finally, dynamic programming is applied to the coarse-grain segments so that a song can be decomposed into several major parts such as intro, verse, chorus, bridge and outro. The performance of the proposed music structure analysis system is demonstrated for pop and rock music.
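The measure-level comparison described above relies on dynamic time warping to tolerate tempo differences between similar segments. Below is a minimal sketch of textbook DTW over sequences of feature vectors, followed by a pairwise distance matrix. The two-dimensional toy frames stand in for 12-dimensional PCP vectors, and the function name, the Euclidean local cost and the absence of path constraints are assumptions rather than details taken from the paper.

```python
import math

def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the best monotonic alignment between two vector sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(seq_a[i - 1], seq_b[j - 1])  # Euclidean frame distance
            cost[i][j] = local + min(cost[i - 1][j],       # repeat a frame of seq_b
                                     cost[i][j - 1],       # repeat a frame of seq_a
                                     cost[i - 1][j - 1])   # advance both
    return cost[n][m]

# Hypothetical "measures", each a short sequence of feature frames.
measures = [
    [(1.0, 0.0), (0.0, 1.0)],
    [(1.0, 0.0), (1.0, 0.0), (0.0, 1.0)],  # same content as the first, first frame held longer
    [(0.0, 1.0), (1.0, 0.0)],              # frames in reverse order
]
distance_matrix = [[dtw_distance(a, b) for b in measures] for a in measures]
```

Small entries in the matrix mark measures that repeat the same material, which is the pattern a segmentation step would look for along its diagonals.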
Discrimination Enhancement with Transient Feature Analysis of a Graphene Chemical Sensor.

PubMed

Nallon, Eric C; Schnee, Vincent P; Bright, Collin J; Polcha, Michael P; Li, Qiliang

2016-01-19

A graphene chemical sensor is subjected to a set of structurally and chemically similar hydrocarbon compounds consisting of toluene, o-xylene, p-xylene, and mesitylene. The fractional change in resistance of the sensor upon exposure to these compounds exhibits a similar response magnitude among compounds, whereas large variation is observed within repetitions for each compound, causing a response overlap. Therefore, traditional features depending on maximum response change will cause confusion during further discrimination and classification analysis. More robust features that are less sensitive to concentration, sampling, and drift variability would provide higher quality information. In this work, we have explored the advantage of using transient-based exponential fitting coefficients to enhance the discrimination of similar compounds. The advantage of such feature analysis for discriminating each compound is evaluated using principal component analysis (PCA). In addition, machine learning-based classification algorithms were used to compare the prediction accuracies when using fitting coefficients as features. The additional features greatly enhanced the discrimination between compounds while performing PCA and also improved the prediction accuracy by 34% when using linear discriminant analysis.
Weighted similarity-based clustering of chemical structures and bioactivity data in early drug discovery.

PubMed

Perualila-Tan, Nolen Joy; Shkedy, Ziv; Talloen, Willem; Göhlmann, Hinrich W H; Moerbeke, Marijke Van; Kasim, Adetayo

2016-08-01

The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical strategy in compound selection involves compound clustering based on chemical similarity to obtain representative chemically diverse compounds (not incorporating potency information). In this paper, we propose an integrative clustering approach that makes use of both biological (compound efficacy) and chemical (structural features) data sources for the purpose of discovering a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, serving as a generic input in any clustering algorithm. This new analysis workflow is a semi-supervised method since, after the determination of clusters, a secondary analysis is performed to find differentially expressed genes associated with the derived integrated cluster(s) and so further explain the compound-induced biological effects inside the cell. In this paper, datasets from two drug development oncology projects are used to illustrate the usefulness of the weighted similarity-based clustering approach to integrate multi-source high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using this proposed integrative approach.
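Integration at the similarity level with complementary weights can be read as a convex combination of the two similarity matrices. The sketch below assumes exactly that form, w * S_chem + (1 - w) * S_bio; the function name, the weight value and the three-compound matrices are hypothetical and are not data from the study.

```python
import numpy as np

def weighted_similarity(sim_chem, sim_bio, weight):
    """Combine two compound-by-compound similarity matrices with complementary weights."""
    sim_chem = np.asarray(sim_chem, dtype=float)
    sim_bio = np.asarray(sim_bio, dtype=float)
    if sim_chem.shape != sim_bio.shape:
        raise ValueError("both matrices must cover the same compounds")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * sim_chem + (1.0 - weight) * sim_bio

# Hypothetical similarities for three compounds.
chemical = np.array([[1.0, 0.9, 0.1],
                     [0.9, 1.0, 0.2],
                     [0.1, 0.2, 1.0]])
biological = np.array([[1.0, 0.3, 0.8],
                       [0.3, 1.0, 0.4],
                       [0.8, 0.4, 1.0]])
combined = weighted_similarity(chemical, biological, weight=0.5)
```

Setting the weight to 1 or 0 recovers a purely chemical or purely biological clustering, so intermediate values let the analyst trade the two sources off before passing `1 - combined` as a distance matrix to a clustering routine.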
Virtual reality exposure therapy in anxiety disorders: a quantitative meta-analysis.

PubMed

Opriş, David; Pintea, Sebastian; García-Palacios, Azucena; Botella, Cristina; Szamosközi, Ştefan; David, Daniel

2012-02-01

Virtual reality exposure therapy (VRET) is a promising intervention for the treatment of the anxiety disorders. The main objective of this meta-analysis is to compare the efficacy of VRET, used in a behavioral or cognitive-behavioral framework, with that of the classical evidence-based treatments, in anxiety disorders. A comprehensive search of the literature identified 23 studies (n = 608) that were included in the final analysis. The results show that in the case of anxiety disorders, (1) VRET does far better than the waitlist control; (2) the post-treatment results show similar efficacy between the behavioral and the cognitive behavioral interventions incorporating a virtual reality exposure component and the classical evidence-based interventions, with no virtual reality exposure component; (3) VRET has a powerful real-life impact, similar to that of the classical evidence-based treatments; (4) VRET has a good stability of results over time, similar to that of the classical evidence-based treatments; (5) there is a dose-response relationship for VRET; and (6) there is no difference in the dropout rate between the virtual reality exposure and the in vivo exposure. Implications are discussed. © 2011 Wiley Periodicals, Inc.
Tox21 Enricher: Web-based Chemical/Biological Functional Annotation Analysis Tool Based on Tox21 Toxicity Screening Platform.

PubMed

Hur, Junguk; Danes, Larson; Hsieh, Jui-Hua; McGregor, Brett; Krout, Dakota; Auerbach, Scott

2018-05-01

The US Toxicology Testing in the 21st Century (Tox21) program was established to develop more efficient and human-relevant toxicity assessment methods. The Tox21 program screens >10,000 chemicals using quantitative high-throughput screening (qHTS) of assays that measure effects on toxicity pathways. To date, more than 70 assays have yielded >12 million concentration-response curves. The patterns of activity across assays can be used to define similarity between chemicals. Assuming chemicals with similar activity profiles have similar toxicological properties, we may infer the toxicological properties of a chemical from its neighbourhood. One approach to inference is chemical/biological annotation enrichment analysis. Here, we present Tox21 Enricher, a web-based chemical annotation enrichment tool for the Tox21 toxicity screening platform. Tox21 Enricher identifies over-represented chemical/biological annotations among lists of chemicals (neighbourhoods), facilitating the identification of the toxicological properties and mechanisms in the chemical set. © 2018 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Concurrence of rule- and similarity-based mechanisms in artificial grammar learning.

PubMed

Opitz, Bertram; Hofmann, Juliane

2015-03-01

A current theoretical debate regards whether rule-based or similarity-based learning prevails during artificial grammar learning (AGL). Although the majority of findings are consistent with a similarity-based account of AGL it has been argued that these results were obtained only after limited exposure to study exemplars, and performance on subsequent grammaticality judgment tests has often been barely above chance level. In three experiments the conditions were investigated under which rule- and similarity-based learning could be applied. Participants were exposed to exemplars of an artificial grammar under different (implicit and explicit) learning instructions. The analysis of receiver operating characteristics (ROC) during a final grammaticality judgment test revealed that explicit but not implicit learning led to rule knowledge. It also demonstrated that this knowledge base is built up gradually while similarity knowledge governed the initial state of learning. Together these results indicate that rule- and similarity-based mechanisms concur during AGL. Moreover, it could be speculated that two different rule processes might operate in parallel; bottom-up learning via gradual rule extraction and top-down learning via rule testing. Crucially, the latter is facilitated by performance feedback that encourages explicit hypothesis testing. Copyright © 2015 Elsevier Inc. All rights reserved.
Linear Quantitative Profiling Method Fast Monitors Alkaloids of Sophora Flavescens That Was Verified by Tri-Marker Analyses.

PubMed

Hou, Zhifei; Sun, Guoxiang; Guo, Yong

2016-01-01

The present study demonstrated the use of the Linear Quantitative Profiling Method (LQPM) to evaluate the quality of Alkaloids of Sophora flavescens (ASF) based on chromatographic fingerprints in an accurate, economical and fast way. Both linear qualitative and quantitative similarities were calculated in order to monitor the consistency of the samples. The results indicate that the linear qualitative similarity (LQLS) is not sufficiently discriminating due to the predominant presence of three alkaloid compounds (matrine, sophoridine and oxymatrine) in the test samples; however, the linear quantitative similarity (LQTS) clearly distinguished the samples based on differences in the quantitative content of all the chemical components. In addition, the fingerprint analysis was also supported by the quantitative analysis of three marker compounds. The LQTS was found to be highly correlated to the contents of the marker compounds, indicating that quantitative analysis of the marker compounds may be substituted with the LQPM based on the chromatographic fingerprints for the purpose of quantifying all chemicals of a complex sample system. Furthermore, once a reference fingerprint (RFP) has been developed from a standard preparation by immediate detection and the composition similarities have been calculated, LQPM could employ the classical mathematical model to effectively quantify the multiple components of ASF samples without any chemical standard.
Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods

PubMed Central

Liang, Xianrui; Ma, Meiling; Su, Weike

2013-01-01

Background: A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Materials and Methods: 10 batches of Hibiscus mutabilis L. leaf samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. Results: The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in the precision, repeatability and stability tests were less than 3%, and the method of fingerprint analysis was validated to be suitable for Hibiscus mutabilis L. leaves. Conclusions: Similarity analysis, based on the correlation coefficients between each pair of fingerprints, showed that the chromatographic fingerprints of the 10 batches of Hibiscus mutabilis L. leaf samples from different locations were qualitatively highly diverse in their chemical constituents. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent with the SA result to some extent. PMID:23930008
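The similarity analysis step described above compares each pair of fingerprints by a correlation coefficient. A minimal Python sketch follows, assuming each fingerprint is a list of relative peak areas for the same characteristic peaks and that the Pearson coefficient is meant (the abstract does not say which coefficient was used). The function names are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length peak-area vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def similarity_matrix(fingerprints):
    """Pairwise correlation between every two fingerprints (symmetric matrix)."""
    return [[pearson(a, b) for b in fingerprints] for a in fingerprints]
```

A matrix like this could then be passed to any hierarchical clustering routine, for example by using `1 - similarity` as the distance.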
Western classical music development: a statistical analysis of composers similarity, differentiation and evolution.

PubMed

Georges, Patrick

2017-01-01

This paper proposes a statistical analysis that captures similarities and differences between classical music composers with the eventual aim to understand why particular composers 'sound' different even if their 'lineages' (influences network) are similar or why they 'sound' alike if their 'lineages' are different. In order to do this we use statistical methods and measures of association or similarity (based on presence/absence of traits such as specific 'ecological' characteristics and personal musical influences) that have been developed in biosystematics, scientometrics, and bibliographic coupling. This paper also represents a first step towards a more ambitious goal of developing an evolutionary model of Western classical music.
GOMA: functional enrichment analysis tool based on GO modules

PubMed Central

Huang, Qiang; Wu, Ling-Yun; Wang, Yong; Zhang, Xiang-Sun

2013-01-01

Analyzing the function of gene sets is a critical step in interpreting the results of high-throughput experiments in systems biology. A variety of enrichment analysis tools have been developed in recent years, but most output a long list of significantly enriched terms that are often redundant, making it difficult to extract the most meaningful functions. In this paper, we present GOMA, a novel enrichment analysis method based on the new concept of enriched functional Gene Ontology (GO) modules. With this method, we systematically revealed functional GO modules, i.e., groups of functionally similar GO terms, via an optimization model and then ranked them by enrichment scores. Our new method simplifies enrichment analysis results by reducing redundancy, thereby preventing inconsistent enrichment results among functionally similar terms and providing more biologically meaningful results. PMID:23237213
Effectiveness of Spectral Similarity Measures to Develop Precise Crop Spectra for Hyperspectral Data Analysis

NASA Astrophysics Data System (ADS)

Chauhan, H.; Krishna Mohan, B.

2014-11-01

The present study was undertaken to check the effectiveness of spectral similarity measures for developing precise crop spectra from collected hyperspectral field spectra. In multispectral and hyperspectral remote sensing, pixels are classified by statistical comparison (by means of spectral similarity) of known field or library spectra to unknown image spectra. Though these algorithms are readily used, little emphasis has been placed on the use of various spectral similarity measures to select precise crop spectra from the set of field spectra. Conventionally, crop spectra are developed after rejecting outliers based only on broad-spectrum analysis. Here a successful attempt has been made to develop precise crop spectra based on spectral similarity. As the use of unevaluated data leads to uncertainty in image classification, it is crucial to evaluate the data. Hence, departing from the conventional method, the precision of the data was evaluated in the present work. The effectiveness of the developed precise field spectra was evaluated by spectral discrimination measures, which gave higher discrimination values than for spectra developed conventionally. Overall classification accuracy is 51.89% for the image classified by field spectra selected conventionally and 75.47% for the image classified by field spectra selected precisely based on spectral similarity. KHAT values are 0.37 and 0.62, and Z values are 2.77 and 9.59, for images classified using conventional and precise field spectra, respectively. The reasonably higher classification accuracy, KHAT and Z values show the possibility of a new approach for field spectra selection based on spectral similarity measures.
Functional grouping of similar genes using eigenanalysis on minimum spanning tree based neighborhood graph.

PubMed

Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita

2016-04-01

Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As the MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from the k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
On Prolonging Network Lifetime through Load-Similar Node Deployment in Wireless Sensor Networks

PubMed Central

Li, Qiao-Qin; Gong, Haigang; Liu, Ming; Yang, Mei; Zheng, Jun

2011-01-01

This paper is focused on the study of the energy hole problem in the Progressive Multi-hop Rotational Clustered (PMRC)-structure, a highly scalable wireless sensor network (WSN) architecture. Based on an analysis on the traffic load distribution in PMRC-based WSNs, we propose a novel load-similar node distribution strategy combined with the Minimum Overlapping Layers (MOL) scheme to address the energy hole problem in PMRC-based WSNs. In this strategy, sensor nodes are deployed in the network area according to the load distribution. That is, more nodes shall be deployed in the range where the average load is higher, and then the loads among different areas in the sensor network tend to be balanced. Simulation results demonstrate that the load-similar node distribution strategy prolongs network lifetime and reduces the average packet latency in comparison with existing nonuniform node distribution and uniform node distribution strategies. Note that, besides the PMRC structure, the analysis model and the proposed load-similar node distribution strategy are also applicable to other multi-hop WSN structures. PMID:22163809
Comparison of bioactive chemical space networks generated using substructure- and fingerprint-based measures of molecular similarity

NASA Astrophysics Data System (ADS)

Zhang, Bijun; Vogt, Martin; Maggiora, Gerald M.; Bajorath, Jürgen

2015-07-01

Chemical space networks (CSNs) have recently been introduced as a conceptual alternative to coordinate-based representations of chemical space. CSNs were initially designed as threshold networks using the Tanimoto coefficient as a continuous similarity measure. The analysis of CSNs generated from sets of bioactive compounds revealed that many statistical properties were strongly dependent on their edge density. While it was difficult to compare CSNs at pre-defined similarity threshold values, CSNs with constant edge density were directly comparable. In the current study, alternative CSN representations were constructed by applying the matched molecular pair (MMP) formalism as a substructure-based similarity criterion. For more than 150 compound activity classes, MMP-based CSNs (MMP-CSNs) were compared to corresponding threshold CSNs (THR-CSNs) at a constant edge density by applying different parameters from network science, measures of community structure distributions, and indicators of structure-activity relationship (SAR) information content. MMP-CSNs were found to be an attractive alternative to THR-CSNs, yielding low edge densities and well-resolved topologies. MMP-CSNs and corresponding THR-CSNs often had similar topology and closely corresponding community structures, although there was only limited overlap in similarity relationships. The homophily principle from network science was shown to affect MMP-CSNs and THR-CSNs in different ways, despite the presence of conserved topological features. Moreover, activity cliff distributions in alternative CSN designs markedly differed, which has important implications for SAR analysis.
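The threshold CSN idea described above can be sketched in a few lines of Python: compounds are nodes, and an edge is drawn when the Tanimoto coefficient of two fingerprints reaches a threshold. This is a minimal illustration, assuming fingerprints are given as sets of "on" bit indices; the function names, compound labels and threshold are hypothetical and are not taken from the study.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto coefficient: shared on-bits divided by all on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def threshold_network(fingerprints: dict, threshold: float):
    """Return (edges, edge_density) of a threshold similarity network.

    fingerprints maps a compound label to its set of on-bit indices.
    """
    edges = [
        (u, v)
        for u, v in combinations(sorted(fingerprints), 2)
        if tanimoto(frozenset(fingerprints[u]), frozenset(fingerprints[v])) >= threshold
    ]
    n = len(fingerprints)
    # Edge density: realised edges over all possible undirected edges.
    density = len(edges) / (n * (n - 1) / 2) if n > 1 else 0.0
    return edges, density
```

Because edge density depends on the threshold, comparing networks at a constant density (as the abstract reports) amounts to choosing a separate threshold for each compound set.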
Sirius PSB: a generic system for analysis of biological sequences.

PubMed

Koh, Chuan Hock; Lin, Sharene; Jedd, Gregory; Wong, Limsoon

2009-12-01

Computational tools are essential components of modern biological research. For example, BLAST searches can be used to identify related proteins based on sequence homology, or when a new genome is sequenced, prediction models can be used to annotate functional sites such as transcription start sites, translation initiation sites and polyadenylation sites and to predict protein localization. Here we present Sirius Prediction Systems Builder (PSB), a new computational tool for sequence analysis, classification and searching. Sirius PSB has four main operations: (1) Building a classifier, (2) Deploying a classifier, (3) Search for proteins similar to query proteins, (4) Preliminary and post-prediction analysis. Sirius PSB supports all these operations via a simple and interactive graphical user interface. Besides being a convenient tool, Sirius PSB has also introduced two novelties in sequence analysis. Firstly, genetic algorithm is used to identify interesting features in the feature space. Secondly, instead of the conventional method of searching for similar proteins via sequence similarity, we introduced searching via features' similarity. To demonstrate the capabilities of Sirius PSB, we have built two prediction models - one for the recognition of Arabidopsis polyadenylation sites and another for the subcellular localization of proteins. Both systems are competitive against current state-of-the-art models based on evaluation of public datasets. More notably, the time and effort required to build each model is greatly reduced with the assistance of Sirius PSB. Furthermore, we show that under certain conditions when BLAST is unable to find related proteins, Sirius PSB can identify functionally related proteins based on their biophysical similarities. Sirius PSB and its related supplements are available at: http://compbio.ddns.comp.nus.edu.sg/~sirius.
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures.

PubMed

Mazandu, Gaston K; Mulder, Nicola J

2013-09-25

The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis.
A Study about Placement Support Using Semantic Similarity

ERIC Educational Resources Information Center

Katz, Marco; van Bruggen, Jan; Giesbers, Bas; Waterink, Wim; Eshuis, Jannes; Koper, Rob

2014-01-01

This paper discusses Latent Semantic Analysis (LSA) as a method for the assessment of prior learning. The Accreditation of Prior Learning (APL) is a procedure to offer learners an individualized curriculum based on their prior experiences and knowledge. The placement decisions in this process are based on the analysis of student material by domain…

Learning deep similarity in fundus photography

NASA Astrophysics Data System (ADS)

Chudzik, Piotr; Al-Diri, Bashir; Caliva, Francesco; Ometto, Giovanni; Hunter, Andrew

2017-02-01

Similarity learning is one of the most fundamental tasks in image analysis. The ability to extract similar images in the medical domain as part of content-based image retrieval (CBIR) systems has been researched for many years. The vast majority of methods used in CBIR systems are based on hand-crafted feature descriptors. The approximation of a similarity mapping for medical images is difficult due to the wide variety of pixel-level structures of interest. In fundus photography (FP) analysis, a subtle difference in, e.g., the shape and size of lesions and vessels can result in a different diagnosis. In this work, we demonstrated how to learn a similarity function for image patches derived directly from FP image data without the need for manually designed feature descriptors. We used a convolutional neural network (CNN) with a novel architecture adapted for similarity learning to accomplish this task. Furthermore, we explored and studied multiple CNN architectures. We show that our method can approximate the similarity between FP patches more efficiently and accurately than the state-of-the-art feature descriptors, including SIFT and SURF, on a publicly available dataset. Finally, we observe that our approach, which is purely data-driven, learns that features such as vessel calibre and orientation are important discriminative factors, which resembles the way humans reason about similarity. To the best of the authors' knowledge, this is the first attempt to approximate a visual similarity mapping in FP.
Clustering document fragments using background color and texture information

NASA Astrophysics Data System (ADS)

Chanda, Sukalpa; Franke, Katrin; Pal, Umapada

2012-01-01

Forensic analysis of questioned documents can sometimes be highly data intensive. A forensic expert might need to analyze a heap of document fragments, and in such cases, to ensure reliability, he/she should focus only on the relevant evidence hidden in those fragments. Retrieving relevant documents requires finding similar document fragments. One way of obtaining such similar documents is to use the physical characteristics of the document fragments, such as color and texture. In this article we propose an automatic scheme to retrieve similar document fragments based on the visual appearance of the document paper and texture. Multispectral color characteristics using biologically inspired color differentiation techniques are implemented here. This is done by projecting document color characteristics to Lab color space. Gabor filter-based texture analysis is used to identify document texture. Document fragments from the same source are expected to have similar color and texture. For clustering similar document fragments of our test dataset we use a Self Organizing Map (SOM) of dimension 5×5, where the document color and texture information are used as features. We obtained an encouraging accuracy of 97.17% from 1063 test images.
RY-Coding and Non-Homogeneous Models Can Ameliorate the Maximum-Likelihood Inferences From Nucleotide Sequence Data with Parallel Compositional Heterogeneity.

PubMed

Ishikawa, Sohta A; Inagaki, Yuji; Hashimoto, Tetsuo

2012-01-01

In phylogenetic analyses of nucleotide sequences, 'homogeneous' substitution models, which assume the stationarity of base composition across a tree, are widely used, albeit individual sequences may bear distinctive base frequencies. In the worst-case scenario, a homogeneous model-based analysis can yield an artifactual union of two distantly related sequences that achieved similar base frequencies in parallel. Such potential difficulty can be countered by two approaches, 'RY-coding' and 'non-homogeneous' models. The former approach converts four bases into purine and pyrimidine to normalize base frequencies across a tree, while the heterogeneity in base frequency is explicitly incorporated in the latter approach. The two approaches have been applied to real-world sequence data; however, their basic properties have not been fully examined by pioneering simulation studies. Here, we assessed the performances of the maximum-likelihood analyses incorporating RY-coding and a non-homogeneous model (RY-coding and non-homogeneous analyses) on simulated data with parallel convergence to similar base composition. Both RY-coding and non-homogeneous analyses showed superior performances compared with homogeneous model-based analyses. Curiously, the performance of RY-coding analysis appeared to be significantly affected by a setting of the substitution process for sequence simulation relative to that of non-homogeneous analysis. The performance of a non-homogeneous analysis was also validated by analyzing a real-world sequence data set with significant base heterogeneity.
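RY-coding as described above collapses the four bases into two classes: purines (A, G) become R and pyrimidines (C, T) become Y. A minimal Python sketch follows; the function name is illustrative, and leaving gaps and ambiguity codes unchanged is an assumption rather than something the abstract specifies.

```python
# Purines (A, G) map to R; pyrimidines (C, T) map to Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def ry_code(sequence: str) -> str:
    """Recode a nucleotide sequence into the two-state R/Y alphabet.

    Characters other than A, C, G, T (gaps, ambiguity codes) are left as-is.
    """
    return sequence.upper().translate(RY_TABLE)
```

After recoding, a GC-rich and an AT-rich sequence can have the same R/Y composition, which is why the recoding can reduce compositional bias between sequences.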
MiRGOFS: A GO-based functional similarity measure for miRNAs, with applications to the prediction of miRNA subcellular localization and miRNA-disease association.

PubMed

Yang, Yang; Fu, Xiaofeng; Qu, Wenhao; Xiao, Yiqun; Shen, Hong-Bin

2018-04-27

Benefiting from high-throughput experimental technologies, whole-genome analysis of microRNAs (miRNAs) has become increasingly common as a way to uncover important regulatory roles of miRNAs and identify miRNA biomarkers for disease diagnosis. As complementary information to the high-throughput experimental data, domain knowledge like the Gene Ontology and KEGG pathway is usually used to guide gene function analysis. However, functional annotation for miRNAs is scarce in the public databases. Till now, only a few methods have been proposed for measuring the functional similarity between miRNAs based on public annotation data, and these methods cover a very limited number of miRNAs and so are not applicable to large-scale miRNA analysis. In this paper, we propose a new method to measure the functional similarity for miRNAs, called miRGOFS, which has two notable features: I) it adopts a new GO semantic similarity metric which considers both common ancestors and descendants of GO terms; II) it computes similarity between GO sets in an asymmetric manner, and weights each GO term by its statistical significance. The miRGOFS-based predictor achieves an F1 of 61.2% on a benchmark data set of miRNA localization, and AUC values of 87.7% and 81.1% on two benchmark sets of miRNA-disease association, respectively. Compared with the existing functional similarity measurements of miRNAs, miRGOFS has the advantages of higher accuracy and larger coverage of human miRNAs (over 1000 miRNAs). Availability: http://www.csbio.sjtu.edu.cn/bioinf/MiRGOFS/. Contact: yangyang@cs.sjtu.edu.cn or hbshen@sjtu.edu.cn. Supplementary data are available at Bioinformatics online.
Effects of Categorical Labels on Similarity Judgments: A Critical Analysis of Similarity-Based Approaches

ERIC Educational Resources Information Center

Noles, Nicholaus S.; Gelman, Susan A.

2012-01-01

Our goal in the present study was to evaluate the claim that category labels affect children's judgments of visual similarity. We presented preschool children with discriminable and identical sets of animal pictures and asked them to make perceptual judgments in the presence or absence of labels. Our findings indicate that children who are asked…
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2012 CFR

2012-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2014 CFR

2014-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
14 CFR 417.203 - Compliance.

Code of Federal Regulations, 2013 CFR

2013-01-01

... analysis method is based on accurate data and scientific principles and is statistically valid. The FAA... safety analysis must also meet the requirements for methods of analysis contained in appendices A and B... from an identical or similar launch if the analysis still applies to the later launch. (b) Method of...
Waveform Similarity Analysis: A Simple Template Comparing Approach for Detecting and Quantifying Noisy Evoked Compound Action Potentials.

PubMed

Potas, Jason Robert; de Castro, Newton Gonçalves; Maddess, Ted; de Souza, Marcio Nogueira

2015-01-01

Experimental electrophysiological assessment of evoked responses from regenerating nerves is challenging due to the typical complex response of events dispersed over various latencies and poor signal-to-noise ratio. Our objective was to automate the detection of compound action potential events and derive their latencies and magnitudes using a simple cross-correlation template comparison approach. For this, we developed an algorithm called Waveform Similarity Analysis. To test the algorithm, challenging signals were generated in vivo by stimulating sural and sciatic nerves, whilst recording evoked potentials at the sciatic nerve and tibialis anterior muscle, respectively, in animals recovering from sciatic nerve transection. Our template for the algorithm was generated based on responses evoked from the intact side. We also simulated noisy signals and examined the output of the Waveform Similarity Analysis algorithm with imperfect templates. Signals were detected and quantified using Waveform Similarity Analysis, which was compared to event detection, latency and magnitude measurements of the same signals performed by a trained observer, a process we called Trained Eye Analysis. The Waveform Similarity Analysis algorithm could successfully detect and quantify simple or complex responses from nerve and muscle compound action potentials of intact or regenerated nerves. Even with an incorrectly specified template, Waveform Similarity Analysis outperformed Trained Eye Analysis for predicting signal amplitude, but it produced consistent latency errors for the simulated signals examined. Compared to the trained eye, Waveform Similarity Analysis is automatic, objective, does not rely on the observer to identify and/or measure peaks, and can detect small clustered events even when signal-to-noise ratio is poor. Waveform Similarity Analysis provides a simple, reliable and convenient approach to quantify latencies and magnitudes of complex waveforms and therefore serves as a useful tool for studying evoked compound action potentials in neural regeneration studies.
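The abstract describes the method only as a simple cross-correlation template comparison. The Python sketch below shows one generic form of that idea, a normalized cross-correlation of a template slid along a recorded signal. It is an illustration under that assumption, not the published Waveform Similarity Analysis algorithm, and the function names are hypothetical.

```python
from math import sqrt


def normalized_xcorr(signal, template):
    """Correlation (-1..1) between the template and each signal window."""
    m = len(template)
    t_mean = sum(template) / m
    t_dev = [t - t_mean for t in template]
    t_norm = sqrt(sum(d * d for d in t_dev))
    scores = []
    for lag in range(len(signal) - m + 1):
        window = signal[lag:lag + m]
        w_mean = sum(window) / m
        w_dev = [w - w_mean for w in window]
        w_norm = sqrt(sum(d * d for d in w_dev))
        denom = t_norm * w_norm
        # A flat window (or flat template) has no shape to compare: score 0.
        scores.append(
            sum(a * b for a, b in zip(w_dev, t_dev)) / denom if denom else 0.0
        )
    return scores


def best_match(signal, template):
    """Return (lag, score) of the window most similar to the template."""
    scores = normalized_xcorr(signal, template)
    lag = max(range(len(scores)), key=scores.__getitem__)
    return lag, scores[lag]
```

The lag of the best-scoring window gives a latency estimate in samples. Because the score is normalized, it reflects waveform shape rather than amplitude, so the magnitude would have to be measured separately.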
Waveform Similarity Analysis: A Simple Template Comparing Approach for Detecting and Quantifying Noisy Evoked Compound Action Potentials

PubMed Central

Potas, Jason Robert; de Castro, Newton Gonçalves; Maddess, Ted; de Souza, Marcio Nogueira

2015-01-01

Experimental electrophysiological assessment of evoked responses from regenerating nerves is challenging due to the typical complex response of events dispersed over various latencies and poor signal-to-noise ratio. Our objective was to automate the detection of compound action potential events and derive their latencies and magnitudes using a simple cross-correlation template comparison approach. For this, we developed an algorithm called Waveform Similarity Analysis. To test the algorithm, challenging signals were generated in vivo by stimulating sural and sciatic nerves, whilst recording evoked potentials at the sciatic nerve and tibialis anterior muscle, respectively, in animals recovering from sciatic nerve transection. Our template for the algorithm was generated based on responses evoked from the intact side. We also simulated noisy signals and examined the output of the Waveform Similarity Analysis algorithm with imperfect templates. Signals were detected and quantified using Waveform Similarity Analysis, which was compared to event detection, latency and magnitude measurements of the same signals performed by a trained observer, a process we called Trained Eye Analysis. The Waveform Similarity Analysis algorithm could successfully detect and quantify simple or complex responses from nerve and muscle compound action potentials of intact or regenerated nerves. Even with an incorrectly specified template, Waveform Similarity Analysis outperformed Trained Eye Analysis for predicting signal amplitude, but it produced consistent latency errors for the simulated signals examined. Compared to the trained eye, Waveform Similarity Analysis is automatic, objective, does not rely on the observer to identify and/or measure peaks, and can detect small clustered events even when signal-to-noise ratio is poor. Waveform Similarity Analysis provides a simple, reliable and convenient approach to quantify latencies and magnitudes of complex waveforms and therefore serves as a useful tool for studying evoked compound action potentials in neural regeneration studies. PMID:26325291
Linear Quantitative Profiling Method Fast Monitors Alkaloids of Sophora Flavescens That Was Verified by Tri-Marker Analyses

PubMed Central

Hou, Zhifei; Sun, Guoxiang; Guo, Yong

2016-01-01

The present study demonstrated the use of the Linear Quantitative Profiling Method (LQPM) to evaluate the quality of Alkaloids of Sophora flavescens (ASF) based on chromatographic fingerprints in an accurate, economical and fast way. Both linear qualitative and quantitative similarities were calculated in order to monitor the consistency of the samples. The results indicate that the linear qualitative similarity (LQLS) is not sufficiently discriminating due to the predominant presence of three alkaloid compounds (matrine, sophoridine and oxymatrine) in the test samples; however, the linear quantitative similarity (LQTS) clearly distinguished the samples based on differences in the quantitative content of all the chemical components. In addition, the fingerprint analysis was also supported by the quantitative analysis of three marker compounds. The LQTS was found to be highly correlated to the contents of the marker compounds, indicating that quantitative analysis of the marker compounds may be substituted with the LQPM based on the chromatographic fingerprints for the purpose of quantifying all chemicals of a complex sample system. Furthermore, once a reference fingerprint (RFP) has been developed from a standard preparation by immediate detection and the composition similarities have been calculated, LQPM could employ the classical mathematical model to effectively quantify the multiple components of ASF samples without any chemical standard. PMID:27529425
Groundwater similarity across a watershed derived from time-warped and flow-corrected time series

NASA Astrophysics Data System (ADS)

Rinderer, M.; McGlynn, B. L.; van Meerveld, H. J.

2017-05-01

Information about catchment-scale groundwater dynamics is necessary to understand how catchments store and release water and why water quantity and quality varies in streams. However, groundwater level monitoring is often restricted to a limited number of sites. Knowledge of the factors that determine similarity between monitoring sites can be used to predict catchment-scale groundwater storage and connectivity of different runoff source areas. We used distance-based and correlation-based similarity measures to quantify the spatial and temporal differences in shallow groundwater similarity for 51 monitoring sites in a Swiss prealpine catchment. The 41 months long time series were preprocessed using Dynamic Time-Warping and a Flow-corrected Time Transformation to account for small timing differences and bias toward low-flow periods. The mean distance-based groundwater similarity was correlated to topographic indices, such as upslope contributing area, topographic wetness index, and local slope. Correlation-based similarity was less related to landscape position but instead revealed differences between seasons. Analysis of variance and partial Mantel tests showed that landscape position, represented by the topographic wetness index, explained 52% of the variability in mean distance-based groundwater similarity, while spatial distance, represented by the Euclidean distance, explained only 5%. The variability in distance-based similarity and correlation-based similarity between groundwater and streamflow time series was significantly larger for midslope locations than for other landscape positions. This suggests that groundwater dynamics at these midslope sites, which are important to understand runoff source areas and hydrological connectivity at the catchment scale, are most difficult to predict.
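Dynamic Time-Warping, used above to absorb small timing differences between groundwater series, can be sketched as follows. This is the textbook dynamic-programming form with an absolute-difference cost, given as an assumption for illustration; the study's exact warping constraints and its Flow-corrected Time Transformation are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with |x - y| as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

# Hypothetical water-level responses: same shape, one delayed by a time step.
site_a = [0, 1, 2, 1, 0]
site_b = [0, 0, 1, 2, 1, 0]
delayed_distance = dtw_distance(site_a, site_b)
offset_distance = dtw_distance([0, 0], [1, 1])
```

A pure delay costs nothing after warping, whereas a constant level offset still contributes to the distance, so the warped distance reflects differences in the response itself rather than in its timing.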
Inference-Based Similarity Search in Randomized Montgomery Domains for Privacy-Preserving Biometric Identification.

PubMed

Wang, Yi; Wan, Jianwu; Guo, Jun; Cheung, Yiu-Ming; Yuen, Pong C

2018-07-01

Similarity search is essential to many important applications and often involves searching at scale on high-dimensional data based on their similarity to a query. In biometric applications, recent vulnerability studies have shown that adversarial machine learning can compromise biometric recognition systems by exploiting the biometric similarity information. Existing methods for biometric privacy protection are in general based on pairwise matching of secured biometric templates and have inherent limitations in search efficiency and scalability. In this paper, we propose an inference-based framework for privacy-preserving similarity search in Hamming space. Our approach builds on an obfuscated distance measure that can conceal Hamming distance in a dynamic interval. Such a mechanism enables us to systematically design statistically reliable methods for retrieving most likely candidates without knowing the exact distance values. We further propose to apply Montgomery multiplication for generating search indexes that can withstand adversarial similarity analysis, and show that information leakage in randomized Montgomery domains can be made negligibly small. Our experiments on public biometric datasets demonstrate that the inference-based approach can achieve a search accuracy close to the best performance possible with secure computation methods, but the associated cost is reduced by orders of magnitude compared to cryptographic primitives.
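For readers unfamiliar with the Montgomery domain mentioned above, the following is a minimal sketch of standard Montgomery multiplication. It shows only the textbook arithmetic; the randomization and index construction proposed in the paper are not reproduced, and the function names and the small modulus are assumptions for illustration. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(modulus, bits):
    """Return N' = -N^{-1} mod R for R = 2**bits and an odd modulus N < R."""
    r = 1 << bits
    if modulus % 2 == 0 or modulus >= r:
        raise ValueError("modulus must be odd and smaller than 2**bits")
    return (-pow(modulus, -1, r)) % r

def redc(t, modulus, bits, n_prime):
    """Montgomery reduction: t * R^{-1} mod N, valid for 0 <= t < R * N."""
    mask = (1 << bits) - 1
    m = ((t & mask) * n_prime) & mask
    u = (t + m * modulus) >> bits  # exact division: t + m*N is a multiple of R
    return u - modulus if u >= modulus else u

def to_montgomery(a, modulus, bits):
    """Map a to its Montgomery representation a * R mod N."""
    return (a << bits) % modulus

def montgomery_multiply(a_bar, b_bar, modulus, bits, n_prime):
    """Multiply two Montgomery-form values; the result stays in Montgomery form."""
    return redc(a_bar * b_bar, modulus, bits, n_prime)

# Hypothetical small parameters for illustration only.
N, BITS = 97, 7
N_PRIME = montgomery_setup(N, BITS)
a_bar = to_montgomery(45, N, BITS)
b_bar = to_montgomery(76, N, BITS)
product = redc(montgomery_multiply(a_bar, b_bar, N, BITS, N_PRIME), N, BITS, N_PRIME)
```

Values inside the domain are the originals multiplied by R modulo N, so they do not directly expose the plain values unless the domain parameters are known.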
DMT-TAFM: a data mining tool for technical analysis of futures market

NASA Astrophysics Data System (ADS)

Stepanov, Vladimir; Sathaye, Archana

2002-03-01

Technical analysis of financial markets describes many patterns of market behavior. For practical use, all these descriptions need to be adjusted for each particular trading session. In this paper, we develop a data mining tool for technical analysis of the futures markets (DMT-TAFM), which dynamically generates rules based on the notion of the price pattern similarity. The tool consists of three main components. The first component provides visualization of data series on a chart with different ranges, scales, and chart sizes and types. The second component constructs pattern descriptions using sets of polynomials. The third component specifies the training set for mining, defines the similarity notion, and searches for a set of similar patterns. DMT-TAFM is useful to prepare the data, and then reveal and systemize statistical information about similar patterns found in any type of historical price series. We performed experiments with our tool on three decades of trading data for a hundred types of futures. Our results for this data set show that we can prove or disprove many well-known patterns based on real data, as well as reveal new ones, and use the set of relatively consistent patterns found during data mining for developing better futures trading strategies.
Feature-level sentiment analysis by using comparative domain corpora

NASA Astrophysics Data System (ADS)

Quan, Changqin; Ren, Fuji

2016-06-01

Feature-level sentiment analysis (SA) is able to provide more fine-grained SA on certain opinion targets and has a wider range of applications on E-business. This study proposes an approach based on comparative domain corpora for feature-level SA. The proposed approach makes use of word associations for domain-specific feature extraction. First, we assign a similarity score for each candidate feature to denote its similarity extent to a domain. Then we identify domain features based on their similarity scores on different comparative domain corpora. After that, dependency grammar and a general sentiment lexicon are applied to extract and expand feature-oriented opinion words. Lastly, the semantic orientation of a domain-specific feature is determined based on the feature-oriented opinion lexicons. In evaluation, we compare the proposed method with several state-of-the-art methods (including unsupervised and semi-supervised) using a standard product review test collection. The experimental results demonstrate the effectiveness of using comparative domain corpora.
Average is Boring: How Similarity Kills a Meme's Success

NASA Astrophysics Data System (ADS)

Coscia, Michele

2014-09-01

Every day we are exposed to different ideas, or memes, competing with each other for our attention. Previous research explained popularity and persistence heterogeneity of memes by assuming them in competition for limited attention resources, distributed in a heterogeneous social network. Little has been said about what characteristics make a specific meme more likely to be successful. We propose a similarity-based explanation: memes with higher similarity to other memes have a significant disadvantage in their potential popularity. We employ a meme similarity measure based on semantic text analysis and computer vision to prove that a meme is more likely to be successful and to thrive if its characteristics make it unique. Our results show that indeed successful memes are located in the periphery of the meme similarity space and that our similarity measure is a promising predictor of a meme success.
Average is boring: how similarity kills a meme's success.

PubMed

Coscia, Michele

2014-09-26

Every day we are exposed to different ideas, or memes, competing with each other for our attention. Previous research explained popularity and persistence heterogeneity of memes by assuming them in competition for limited attention resources, distributed in a heterogeneous social network. Little has been said about what characteristics make a specific meme more likely to be successful. We propose a similarity-based explanation: memes with higher similarity to other memes have a significant disadvantage in their potential popularity. We employ a meme similarity measure based on semantic text analysis and computer vision to prove that a meme is more likely to be successful and to thrive if its characteristics make it unique. Our results show that indeed successful memes are located in the periphery of the meme similarity space and that our similarity measure is a promising predictor of a meme success.
A new similarity measure for link prediction based on local structures in social networks

NASA Astrophysics Data System (ADS)

Aghabozorgi, Farshad; Khayyambashi, Mohammad Reza

2018-07-01

Link prediction is a fundamental problem in social network analysis. There exists a variety of techniques for link prediction that apply similarity measures to estimate the proximity of vertices in the network. Complex networks like social networks contain structural units named network motifs. In this study, a newly developed similarity measure is proposed where these structural units are applied as the source of similarity estimation. This similarity measure is tested through a supervised learning experiment framework, where other similarity measures are compared with this similarity measure. The classification model trained with this similarity measure outperforms others of its kind.
Adaptive Spatial Filter Based on Similarity Indices to Preserve the Neural Information on EEG Signals during On-Line Processing

PubMed Central

Villa-Parra, Ana Cecilia; Bastos-Filho, Teodiano; López-Delis, Alberto; Frizera-Neto, Anselmo; Krishnan, Sridhar

2017-01-01

This work presents a new on-line adaptive filter, which is based on a similarity analysis between standard electrode locations, in order to reduce artifacts and common interferences throughout electroencephalography (EEG) signals while preserving the useful information. Standard deviation and Concordance Correlation Coefficient (CCC) between target electrodes and their corresponding neighbor electrodes are analyzed on sliding windows to select those neighbors that are highly correlated. Afterwards, a model based on CCC is applied to provide higher values of weight to those correlated electrodes with lower similarity to the target electrode. The approach was applied to brain-computer interfaces (BCIs) based on Canonical Correlation Analysis (CCA) to recognize 40 targets of steady-state visual evoked potential (SSVEP), providing an accuracy (ACC) of 86.44 ± 2.81%. In addition, also using this approach, features of low frequency were selected in the pre-processing stage of another BCI to recognize gait planning. In this case, the recognition was significantly (p<0.01) improved for most of the subjects (ACC≥74.79%), when compared with other BCIs based on Common Spatial Pattern, Filter Bank-Common Spatial Pattern, and Riemannian Geometry. PMID:29186848
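The Concordance Correlation Coefficient used above can be sketched with Lin's standard definition, rho_c = 2*s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2). The sliding-window helper, the window length and the sample values are assumptions for illustration, and the filter's weighting model is not reproduced.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient with population (1/n) moments."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        return 1.0  # convention chosen here: identical constant signals agree
    return 2 * cov / denom

def sliding_ccc(target, neighbor, window):
    """CCC between two channels on consecutive non-overlapping windows."""
    return [
        concordance_correlation(target[i:i + window], neighbor[i:i + window])
        for i in range(0, len(target) - window + 1, window)
    ]

# Hypothetical samples: the neighbor equals the target plus a constant offset.
target = [1.0, 2.0, 3.0, 4.0]
neighbor = [2.0, 3.0, 4.0, 5.0]
ccc_same = concordance_correlation(target, target)
ccc_offset = concordance_correlation(target, neighbor)
```

Unlike Pearson correlation, which would be 1 for the offset channel, CCC penalizes the shift in mean (here 5/7), so it measures agreement rather than only linear association.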
Genetic and chemical diversity of high mucilaginous plants of Sida complex by ISSR markers and chemical fingerprinting.

PubMed

Thul, Sanjog T; Srivastava, Ankit K; Singh, Subhash C; Shanker, Karuna

2011-09-01

A method was developed based on multiple approaches wherein DNA and chemical analysis was carried out toward differentiation of important species of Sida complex that is being used for commercial preparation. Isolated DNA samples were successfully amplified by PCR using ISSR markers, and the degree of genetic diversity among the different species of Sida was compared with that of chemical diversity. For genetic fingerprint investigation, 10 selected ISSR primers generating reproducible banding patterns were used. Among the total of 63 amplicons, 62 were recorded as polymorphic. The genetic similarity index deduced from ISSR profiles ranged from 12 to 51%. Based on the similarity index, S. acuta and S. rhombifolia were found to be most similar (51%). A high number of species-specific bands played a pivotal role in delineating species at the genetic level. Investigation based on HPTLC fingerprint analysis revealed 23 bands representing characteristic chemicals, and the similarity index ranged from 73 to 91%. Prominent distinguishable bands were observed only in S. acuta, while S. cordifolia and S. rhombifolia shared most bands, making them difficult to identify on a chemical fingerprint basis. This report summarizes the genotypic and chemotypic diversity and the use of profiles for authentication of species of Sida complex.

Random whole metagenomic sequencing for forensic discrimination of soils.

PubMed

Khodakova, Anastasia S; Smith, Renee J; Burgoyne, Leigh; Abarno, Damien; Linacre, Adrian

2014-01-01

Here we assess the ability of random whole metagenomic sequencing approaches to discriminate between similar soils from two geographically distinct urban sites for application in forensic science. Repeat samples from two parklands in residential areas separated by approximately 3 km were collected and the DNA was extracted. Shotgun, whole genome amplification (WGA) and single arbitrarily primed DNA amplification (AP-PCR) based sequencing techniques were then used to generate soil metagenomic profiles. Full and subsampled metagenomic datasets were then annotated against M5NR/M5RNA (taxonomic classification) and SEED Subsystems (metabolic classification) databases. Further comparative analyses were performed using a number of statistical tools including: hierarchical agglomerative clustering (CLUSTER); similarity profile analysis (SIMPROF); non-metric multidimensional scaling (NMDS); and canonical analysis of principal coordinates (CAP) at all major levels of taxonomic and metabolic classification. Our data showed that shotgun and WGA-based approaches generated highly similar metagenomic profiles for the soil samples such that the soil samples could not be distinguished accurately. An AP-PCR based approach was shown to be successful at obtaining reproducible site-specific metagenomic DNA profiles, which in turn were employed for successful discrimination of visually similar soil samples collected from two different locations.
DaGO-Fun: tool for Gene Ontology-based functional analysis using term information content measures

PubMed Central

2013-01-01

Background: The use of Gene Ontology (GO) data in protein analyses has largely contributed to the improved outcomes of these analyses. Several GO semantic similarity measures have been proposed in recent years and provide tools that allow the integration of biological knowledge embedded in the GO structure into different biological analyses. There is a need for a unified tool that provides the scientific community with the opportunity to explore these different GO similarity measure approaches and their biological applications. Results: We have developed DaGO-Fun, an online tool available at http://web.cbio.uct.ac.za/ITGOM, which incorporates many different GO similarity measures for exploring, analyzing and comparing GO terms and proteins within the context of GO. It uses GO data and UniProt proteins with their GO annotations as provided by the Gene Ontology Annotation (GOA) project to precompute GO term information content (IC), enabling rapid response to user queries. Conclusions: The DaGO-Fun online tool presents the advantage of integrating all the relevant IC-based GO similarity measures, including topology- and annotation-based approaches to facilitate effective exploration of these measures, thus enabling users to choose the most relevant approach for their application. Furthermore, this tool includes several biological applications related to GO semantic similarity scores, including the retrieval of genes based on their GO annotations, the clustering of functionally related genes within a set, and term enrichment analysis. PMID:24067102
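An annotation-based information content and one IC-based term similarity can be sketched as below. The toy ontology, the term names and the counts are hypothetical and are not real GO identifiers, and Resnik's measure is used only as a well-known example of an IC-based measure; this is not DaGO-Fun's code or API.

```python
import math

# Hypothetical toy ontology: term -> set of direct parents (not real GO terms).
PARENTS = {
    "root": set(),
    "binding": {"root"},
    "catalysis": {"root"},
    "dna_binding": {"binding"},
    "rna_binding": {"binding"},
}
# Hypothetical number of proteins annotated directly to each term.
DIRECT_COUNTS = {"root": 0, "binding": 2, "catalysis": 4, "dna_binding": 1, "rna_binding": 1}

def ancestors(term):
    """Return the term itself together with all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def information_content():
    """IC(t) = log(total / count(t)), where count includes descendant annotations."""
    total = sum(DIRECT_COUNTS.values())
    cumulative = {term: 0 for term in PARENTS}
    for term, count in DIRECT_COUNTS.items():
        for anc in ancestors(term):
            cumulative[anc] += count
    return {t: math.log(total / c) for t, c in cumulative.items() if c > 0}

def resnik_similarity(term_a, term_b, ic):
    """IC of the most informative ancestor shared by both terms."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic[t] for t in common if t in ic)

ic = information_content()
sibling_score = resnik_similarity("dna_binding", "rna_binding", ic)
distant_score = resnik_similarity("dna_binding", "catalysis", ic)
```

Precomputing the IC table once, as the abstract describes, turns each pairwise query into a set intersection and a lookup.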
Correlation between social proximity and mobility similarity.

PubMed

Fan, Chao; Liu, Yiding; Huang, Junming; Rong, Zhihai; Zhou, Tao

2017-09-20

Human behaviors exhibit ubiquitous correlations in many aspects, such as individual and collective levels, temporal and spatial dimensions, content, social and geographical layers. With rich Internet data of online behaviors becoming available, exploring human mobility similarity from the perspective of social network proximity has attracted academic interest. Existing analysis shows a strong correlation between online social proximity and offline mobility similarity, namely, mobile records between friends are significantly more similar than between strangers, and those between friends with common neighbors are even more similar. We argue the importance of the number and diversity of common friends, with a counterintuitive finding that the number of common friends has no positive impact on mobility similarity while the diversity plays a key role, disagreeing with previous studies. Our analysis provides a novel view for better understanding the coupling between human online and offline behaviors, and will help model and predict human behaviors based on social proximity.
AFLP-based genetic diversity assessment of commercially important tea germplasm in India.

PubMed

Sharma, R K; Negi, M S; Sharma, S; Bhardwaj, P; Kumar, R; Bhattachrya, E; Tripathi, S B; Vijayan, D; Baruah, A R; Das, S C; Bera, B; Rajkumar, R; Thomas, J; Sud, R K; Muraleedharan, N; Hazarika, M; Lakshmikumaran, M; Raina, S N; Ahuja, P S

2010-08-01

India has a large repository of important tea accessions and, therefore, plays a major role in improving production and quality of tea across the world. Using seven AFLP primer combinations, we analyzed 123 commercially important tea accessions representing major populations in India. The overall genetic similarity recorded was 51%. No significant differences were recorded in average genetic similarity among tea populations cultivated in various geographic regions (northwest 0.60, northeast and south both 0.59). UPGMA cluster analysis grouped the tea accessions according to geographic locations, with a bias toward China or Assam/Cambod types. Cluster analysis results were congruent with principal component analysis. Further, analysis of molecular variance detected a high level of genetic variation (85%) within and limited genetic variation (15%) among the populations, suggesting their origin from a similar genetic pool.
An Investigation of Document Partitions.

ERIC Educational Resources Information Center

Shaw, W. M., Jr.

1986-01-01

Empirical significance of document partitions is investigated as a function of index term-weight and similarity thresholds. Results show the same empirically preferred partitions can be detected by two independent strategies: an analysis of cluster-based retrieval analysis and an analysis of regularities in the underlying structure of the document…
An index-based algorithm for fast on-line query processing of latent semantic analysis

PubMed Central

Li, Pohan; Wang, Wei

2017-01-01

Latent Semantic Analysis (LSA) is widely used for finding the documents whose semantics are similar to a keyword query. Although LSA yields promising similarity results, the existing LSA algorithms involve many unnecessary operations in similarity computation and candidate checking during on-line query processing, which is expensive in terms of time cost and cannot respond efficiently to query requests, especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA towards efficiently searching the similar documents to a given query. We rewrite the similarity equation of LSA combined with an intermediate value called partial similarity that is stored in a designed index called partial index. For reducing the searching space, we give an approximate form of similarity equation, and then develop an efficient algorithm for building partial index, which skips the partial similarities lower than a given threshold θ. Based on partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo document vector, and the similarities between query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponding to non-zero entries in the pseudo document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning the candidate documents that are not promising and skipping the operations that make little contribution to similarity scores. Extensive experiments through comparison with LSA have been done, which demonstrate the efficiency and effectiveness of our proposed algorithm. PMID:28520747
An index-based algorithm for fast on-line query processing of latent semantic analysis.

PubMed

Zhang, Mingxi; Li, Pohan; Wang, Wei

2017-01-01

Latent Semantic Analysis (LSA) is widely used for finding the documents whose semantics are similar to a keyword query. Although LSA yields promising similarity results, the existing LSA algorithms involve many unnecessary operations in similarity computation and candidate checking during on-line query processing, which is expensive in terms of time cost and cannot respond efficiently to query requests, especially when the dataset becomes large. In this paper, we study the efficiency problem of on-line query processing for LSA towards efficiently searching the similar documents to a given query. We rewrite the similarity equation of LSA combined with an intermediate value called partial similarity that is stored in a designed index called partial index. For reducing the searching space, we give an approximate form of similarity equation, and then develop an efficient algorithm for building partial index, which skips the partial similarities lower than a given threshold θ. Based on partial index, we develop an efficient algorithm called ILSA for supporting fast on-line query processing. The given query is transformed into a pseudo document vector, and the similarities between query and candidate documents are computed by accumulating the partial similarities obtained from the index nodes corresponding to non-zero entries in the pseudo document vector. Compared to the LSA algorithm, ILSA reduces the time cost of on-line query processing by pruning the candidate documents that are not promising and skipping the operations that make little contribution to similarity scores. Extensive experiments through comparison with LSA have been done, which demonstrate the efficiency and effectiveness of our proposed algorithm.
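The partial-index idea in this abstract can be sketched under explicit assumptions, since the abstract does not give the rewritten similarity equation. Here the similarity is taken to be an unnormalized dot product in the reduced LSA space, a partial similarity is the dot product of one term's latent vector with one document's latent vector, and entries below θ are simply not stored. All names and vectors are hypothetical; this is one plausible reading, not the ILSA algorithm itself.

```python
def build_partial_index(term_vectors, doc_vectors, theta):
    """Precompute term-to-document partial similarities, skipping those below theta."""
    index = {}
    for term, term_vec in term_vectors.items():
        postings = {}
        for doc, doc_vec in doc_vectors.items():
            partial = sum(a * b for a, b in zip(term_vec, doc_vec))
            if partial >= theta:  # pruning step: small contributions are dropped
                postings[doc] = partial
        index[term] = postings
    return index

def answer_query(index, query_weights):
    """Accumulate stored partial similarities for the query's non-zero terms."""
    scores = {}
    for term, weight in query_weights.items():
        for doc, partial in index.get(term, {}).items():
            scores[doc] = scores.get(doc, 0.0) + weight * partial
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical two-dimensional latent vectors for three terms and three documents.
term_vectors = {"cat": [1.0, 0.0], "dog": [0.5, 0.5], "car": [0.0, 1.0]}
doc_vectors = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [0.5, 0.5]}
index = build_partial_index(term_vectors, doc_vectors, theta=0.3)
ranking = answer_query(index, {"cat": 1.0, "dog": 1.0})
```

At query time no vector arithmetic in the latent space is needed: only the postings of the query's terms are visited, and documents with no stored partial similarity are never touched. The price is an approximation, because the pruned contributions are missing from the accumulated scores.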
Model-based document categorization employing semantic pattern analysis and local structure clustering

NASA Astrophysics Data System (ADS)

Fume, Kosei; Ishitani, Yasuto

2008-01-01

We propose a document categorization method based on a document model that can be defined externally for each task and that categorizes Web content or business documents into a target category in accordance with the similarity of the model. The main feature of the proposed method consists of two aspects of semantics extraction from an input document. The semantics of terms are extracted by the semantic pattern analysis and implicit meanings of document substructure are specified by a bottom-up text clustering technique focusing on the similarity of text line attributes. We have constructed a system based on the proposed method for trial purposes. The experimental results show that the system achieves more than 80% classification accuracy in categorizing Web content and business documents into 15 or 70 categories.
ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis.

PubMed

Mallik, Saurav; Zhao, Zhongming

2017-12-28

For transcriptomic analysis, there are numerous microarray-based genomic data, especially those generated for cancer research. The typical analysis measures the difference between a cancer sample-group and a matched control group for each transcript or gene. Association rule mining is used to discover interesting item sets through rule-based methodology. Thus, it has advantages to find causal effect relationships between the transcripts. In this work, we introduce two new rule-based similarity measures (weighted rank-based Jaccard and Cosine measures) and then propose a novel computational framework to detect condensed gene co-expression modules (ConGEMs) through the association rule-based learning system and the weighted similarity scores. In practice, the list of evolved condensed markers that consists of both singular and complex markers in nature depends on the corresponding condensed gene sets in either antecedent or consequent of the rules of the resultant modules. In our evaluation, these markers could be supported by literature evidence, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and Gene Ontology annotations. Specifically, we preliminarily identified differentially expressed genes using an empirical Bayes test. A recently developed algorithm, RANWAR, was then utilized to determine the association rules from these genes. Based on that, we computed the integrated similarity scores of these rule-based similarity measures between each rule-pair, and the resultant scores were used for clustering to identify the co-expressed rule-modules. We applied our method to a gene expression dataset for lung squamous cell carcinoma and a genome methylation dataset for uterine cervical carcinogenesis. Our proposed module discovery method produced better results than the traditional gene-module discovery measures. In summary, our proposed rule-based method is useful for exploring biomarker modules from transcriptomic data.
Similarities among receptor pockets and among compounds: analysis and application to in silico ligand screening.

PubMed

Fukunishi, Yoshifumi; Mikami, Yoshiaki; Nakamura, Haruki

2005-09-01

We developed a new method to evaluate the distances and similarities between receptor pockets or chemical compounds based on a multi-receptor versus multi-ligand docking affinity matrix. The receptors were classified by a cluster analysis based on calculations of the distance between receptor pockets. A set of low homologous receptors that bind a similar compound could be classified into one cluster. Based on this line of reasoning, we proposed a new in silico screening method. According to this method, compounds in a database were docked to multiple targets. The new docking score was a slightly modified version of the multiple active site correction (MASC) score. Receptors that were at a set distance from the target receptor were not included in the analysis, and the modified MASC scores were calculated for the selected receptors. The choice of the receptors is important to achieve a good screening result, and our clustering of receptors is useful to this purpose. This method was applied to the analysis of a set of 132 receptors and 132 compounds, and the results demonstrated that this method achieves a high hit ratio, as compared to that of a uniform sampling, using a receptor-ligand docking program, Sievgene, which was newly developed with a good docking performance yielding 50.8% of the reconstructed complexes at a distance of less than 2 Å RMSD.
Chemical and protein structural basis for biological crosstalk between PPAR α and COX enzymes

NASA Astrophysics Data System (ADS)

Cleves, Ann E.; Jain, Ajay N.

2015-02-01

We have previously validated a probabilistic framework that combined computational approaches for predicting the biological activities of small molecule drugs. Molecule comparison methods included molecular structural similarity metrics and similarity computed from lexical analysis of text in drug package inserts. Here we present an analysis of novel drug/target predictions, focusing on those that were not obvious based on known pharmacological crosstalk. Considering those cases where the predicted target was an enzyme with known 3D structure allowed incorporation of information from molecular docking and protein binding pocket similarity in addition to ligand-based comparisons. Taken together, the combination of orthogonal information sources led to investigation of a surprising predicted relationship between a transcription factor and an enzyme, specifically, PPAR α and the cyclooxygenase enzymes. These predictions were confirmed by direct biochemical experiments which validate the approach and show for the first time that PPAR α agonists are cyclooxygenase inhibitors.
Functional clustering of time series gene expression data by Granger causality

PubMed Central

2012-01-01

Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
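A pairwise Granger causality test of the kind referred to above can be sketched as a comparison of two least-squares models: one predicting a series from its own past, the other also using the past of the candidate cause. The lag of 1, the function names and the synthetic series are assumptions for illustration; the study's set-based formulation and its clustering step are not reproduced.

```python
import random

def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def residual_sum_of_squares(rows, targets):
    """Fit ordinary least squares via the normal equations and return the RSS."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, targets))

def granger_f_statistic(cause, effect, lag=1):
    """F statistic for whether past values of `cause` improve prediction of `effect`."""
    times = range(lag, len(effect))
    targets = [effect[t] for t in times]
    restricted = [[1.0] + [effect[t - l] for l in range(1, lag + 1)] for t in times]
    unrestricted = [row + [cause[t - l] for l in range(1, lag + 1)]
                    for row, t in zip(restricted, times)]
    rss_r = residual_sum_of_squares(restricted, targets)
    rss_u = residual_sum_of_squares(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)

# Synthetic, hypothetical expression profiles: gene y follows gene x one step later.
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(200)]
y = [0.0] + [0.8 * x[t - 1] + 0.1 * rng.gauss(0, 1) for t in range(1, 200)]
f_x_to_y = granger_f_statistic(x, y)
f_y_to_x = granger_f_statistic(y, x)
```

The asymmetry between the two statistics gives the direction of the inferred influence, which is what allows a directed network to be built before genes are clustered by topological proximity. Real expression time series are usually much shorter than this synthetic one, so the statistic would be far less decisive in practice.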
Evaluating conducting network based transparent electrodes from geometrical considerations

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kumar, Ankush; Kulkarni, G. U., E-mail: guk@cens.res.in

2016-01-07

Conducting nanowire networks have been developed as viable alternative to existing indium tin oxide based transparent electrode (TE). The nature of electrical conduction and process optimization for electrodes have gained much from the theoretical models based on percolation transport using Monte Carlo approach and applying Kirchhoff's law on individual junctions and loops. While most of the literature work pertaining to theoretical analysis is focussed on networks obtained from conducting rods (mostly considering only junction resistance), hardly any attention has been paid to those made using template based methods, wherein the structure of network is neither similar to network obtained from conducting rods nor similar to well periodic geometry. Here, we have attempted an analytical treatment based on geometrical arguments and applied image analysis on practical networks to gain deeper insight into conducting networked structure particularly in relation to sheet resistance and transmittance. Many literature examples reporting networks with straight or curvilinear wires with distributions in wire width and length have been analysed by treating the networks as two dimensional graphs and evaluating the sheet resistance based on wire density and wire width. The sheet resistance values from our analysis compare well with the experimental values. Our analysis on various examples has revealed that low sheet resistance is achieved with high wire density and compactness with straight rather than curvilinear wires and with narrower wire width distribution. Similarly, higher transmittance for given sheet resistance is possible with narrower wire width but of higher thickness, minimal curvilinearity, and maximum connectivity. For the purpose of evaluating active fraction of the network, the algorithm was made to distinguish and quantify current carrying backbone regions as against regions containing only dangling or isolated wires.
The treatment can be helpful in predicting the properties of a network simply from image analysis and will be helpful in improvisation and comparison of various TEs and better understanding of electrical percolation.
Evaluating conducting network based transparent electrodes from geometrical considerations

NASA Astrophysics Data System (ADS)

Kumar, Ankush; Kulkarni, G. U.

2016-01-01

Conducting nanowire networks have been developed as viable alternative to existing indium tin oxide based transparent electrode (TE). The nature of electrical conduction and process optimization for electrodes have gained much from the theoretical models based on percolation transport using Monte Carlo approach and applying Kirchhoff's law on individual junctions and loops. While most of the literature work pertaining to theoretical analysis is focussed on networks obtained from conducting rods (mostly considering only junction resistance), hardly any attention has been paid to those made using template based methods, wherein the structure of network is neither similar to network obtained from conducting rods nor similar to well periodic geometry. Here, we have attempted an analytical treatment based on geometrical arguments and applied image analysis on practical networks to gain deeper insight into conducting networked structure particularly in relation to sheet resistance and transmittance. Many literature examples reporting networks with straight or curvilinear wires with distributions in wire width and length have been analysed by treating the networks as two dimensional graphs and evaluating the sheet resistance based on wire density and wire width. The sheet resistance values from our analysis compare well with the experimental values. Our analysis on various examples has revealed that low sheet resistance is achieved with high wire density and compactness with straight rather than curvilinear wires and with narrower wire width distribution. Similarly, higher transmittance for given sheet resistance is possible with narrower wire width but of higher thickness, minimal curvilinearity, and maximum connectivity. For the purpose of evaluating active fraction of the network, the algorithm was made to distinguish and quantify current carrying backbone regions as against regions containing only dangling or isolated wires. 
The treatment can be helpful in predicting the properties of a network simply from image analysis and will be helpful in improvisation and comparison of various TEs and better understanding of electrical percolation.
Cluster Analysis of Minnesota School Districts. A Research Report.

ERIC Educational Resources Information Center

Cleary, James

The term "cluster analysis" refers to a set of statistical methods that classify entities with similar profiles of scores on a number of measured dimensions, in order to create empirically based typologies. A 1980 Minnesota House Research Report employed cluster analysis to categorize school districts according to their relative mixtures…
Unsupervised spatiotemporal analysis of fMRI data using graph-based visualizations of self-organizing maps.

PubMed

Katwal, Santosh B; Gore, John C; Marois, Rene; Rogers, Baxter P

2013-09-01

We present novel graph-based visualizations of self-organizing maps for unsupervised functional magnetic resonance imaging (fMRI) analysis. A self-organizing map is an artificial neural network model that transforms high-dimensional data into a low-dimensional (often a 2-D) map using unsupervised learning. However, a postprocessing scheme is necessary to correctly interpret similarity between neighboring node prototypes (feature vectors) on the output map and delineate clusters and features of interest in the data. In this paper, we used graph-based visualizations to capture fMRI data features based upon 1) the distribution of data across the receptive fields of the prototypes (density-based connectivity); and 2) temporal similarities (correlations) between the prototypes (correlation-based connectivity). We applied this approach to identify task-related brain areas in an fMRI reaction time experiment involving a visuo-manual response task, and we correlated the time-to-peak of the fMRI responses in these areas with reaction time. Visualization of self-organizing maps outperformed independent component analysis and voxelwise univariate linear regression analysis in identifying and classifying relevant brain regions. We conclude that the graph-based visualizations of self-organizing maps help in advanced visualization of cluster boundaries in fMRI data enabling the separation of regions with small differences in the timings of their brain responses.
Visualization-based analysis of multiple response survey data

NASA Astrophysics Data System (ADS)

Timofeeva, Anastasiia

2017-11-01

During a survey, respondents are often allowed to tick more than one answer option for a question. Analysis and visualization of such data are difficult because multiple response variables must be processed. With standard representations such as pie and bar charts, information about the association between different answer options is lost. The author proposes a visualization approach for multiple response variables based on Venn diagrams. For a more informative representation with a large number of overlapping groups, it is suggested to use similarity and association matrices. Some aggregate indicators of dissimilarity (similarity) are proposed based on the determinant of the similarity matrix and the maximum eigenvalue of the association matrix. The application of the proposed approaches is illustrated by the example of the analysis of advertising sources. Intersection of sets indicates that the same consumer audience is covered by several advertising sources. This information is very important for the allocation of the advertising budget. The differences between target groups in advertising sources are of interest. To identify such differences, the hypotheses of homogeneity and independence are tested. Recent approaches to the problem are briefly reviewed and compared. An alternative procedure is suggested. It is based on partition of a consumer audience into pairwise disjoint subsets and includes hypothesis testing of the difference between the population proportions. It turned out to be more suitable for the real problem being solved.
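The similarity-matrix idea in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity coefficient is used, so the Jaccard coefficient below is an assumption, and the survey data, option names, and function names are hypothetical.

```python
from itertools import combinations

def jaccard_matrix(responses, options):
    """Pairwise Jaccard similarity between answer options.

    responses: list of sets, one per respondent, holding the ticked options.
    """
    # Indices of the respondents who ticked each option.
    ticked = {o: {i for i, r in enumerate(responses) if o in r} for o in options}
    n = len(options)
    sim = [[1.0] * n for _ in range(n)]
    for a, b in combinations(range(n), 2):
        sa, sb = ticked[options[a]], ticked[options[b]]
        union = sa | sb
        s = len(sa & sb) / len(union) if union else 0.0
        sim[a][b] = sim[b][a] = s
    return sim

def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0  # singular matrix
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det  # a row swap flips the sign
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det

# Hypothetical survey: advertising sources each respondent noticed.
responses = [{"tv", "web"}, {"tv"}, {"web", "radio"}, {"tv", "web"}]
options = ["tv", "web", "radio"]
sim = jaccard_matrix(responses, options)
print(sim)
print(determinant(sim))
```

Under this assumed coefficient, the determinant equals 1 when no two options share respondents (identity matrix) and 0 when all options are ticked by exactly the same respondents, which is one way an aggregate dissimilarity indicator of this kind could behave.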
Average is Boring: How Similarity Kills a Meme's Success

PubMed Central

Coscia, Michele

2014-01-01

Every day we are exposed to different ideas, or memes, competing with each other for our attention. Previous research explained popularity and persistence heterogeneity of memes by assuming them in competition for limited attention resources, distributed in a heterogeneous social network. Little has been said about what characteristics make a specific meme more likely to be successful. We propose a similarity-based explanation: memes with higher similarity to other memes have a significant disadvantage in their potential popularity. We employ a meme similarity measure based on semantic text analysis and computer vision to prove that a meme is more likely to be successful and to thrive if its characteristics make it unique. Our results show that indeed successful memes are located in the periphery of the meme similarity space and that our similarity measure is a promising predictor of a meme success. PMID:25257730
Gene context analysis in the Integrated Microbial Genomes (IMG) data management system.

PubMed

Mavromatis, Konstantinos; Chu, Ken; Ivanova, Natalia; Hooper, Sean D; Markowitz, Victor M; Kyrpides, Nikos C

2009-11-24

Computational methods for determining the function of genes in newly sequenced genomes have been traditionally based on sequence similarity to genes whose function has been identified experimentally. Function prediction methods can be extended using gene context analysis approaches such as examining the conservation of chromosomal gene clusters, gene fusion events and co-occurrence profiles across genomes. Context analysis is based on the observation that functionally related genes often have similar gene context and relies on the identification of such events across a phylogenetically diverse collection of genomes. We have used the data management system of the Integrated Microbial Genomes (IMG) as the framework to implement and explore the power of gene context analysis methods because it provides one of the largest available genome integrations. Visualization and search tools to facilitate gene context analysis have been developed and applied across all publicly available archaeal and bacterial genomes in IMG. These computations are now maintained as part of IMG's regular genome content update cycle. IMG is available at: http://img.jgi.doe.gov.
Template based rotation: A method for functional connectivity analysis with a priori templates

PubMed Central

Schultz, Aaron P.; Chhatwal, Jasmeer P.; Huijbers, Willem; Hedden, Trey; van Dijk, Koene R.A.; McLaren, Donald G.; Ward, Andrew M.; Wigman, Sarah; Sperling, Reisa A.

2014-01-01

Functional connectivity magnetic resonance imaging (fcMRI) is a powerful tool for understanding the network level organization of the brain in research settings and is increasingly being used to study large-scale neuronal network degeneration in clinical trial settings. Presently, a variety of techniques, including seed-based correlation analysis and group independent components analysis (with either dual regression or back projection) are commonly employed to compute functional connectivity metrics. In the present report, we introduce template based rotation, a novel analytic approach optimized for use with a priori network parcellations, which may be particularly useful in clinical trial settings. Template based rotation was designed to leverage the stable spatial patterns of intrinsic connectivity derived from out-of-sample datasets by mapping data from novel sessions onto the previously defined a priori templates. We first demonstrate the feasibility of using previously defined a priori templates in connectivity analyses, and then compare the performance of template based rotation to seed based and dual regression methods by applying these analytic approaches to an fMRI dataset of normal young and elderly subjects. We observed that template based rotation and dual regression are approximately equivalent in detecting fcMRI differences between young and old subjects, demonstrating similar effect sizes for group differences and similar reliability metrics across 12 cortical networks. Both template based rotation and dual-regression demonstrated larger effect sizes and comparable reliabilities as compared to seed based correlation analysis, though all three methods yielded similar patterns of network differences.
When performing inter-network and sub-network connectivity analyses, we observed that template based rotation offered greater flexibility, larger group differences, and more stable connectivity estimates as compared to dual regression and seed based analyses. This flexibility owes to the reduced spatial and temporal orthogonality constraints of template based rotation as compared to dual regression. These results suggest that template based rotation can provide a useful alternative to existing fcMRI analytic methods, particularly in clinical trial settings where predefined outcome measures and conserved network descriptions across groups are at a premium. PMID:25150630

Functional Module Search in Protein Networks based on Semantic Similarity Improves the Analysis of Proteomics Data

PubMed Central

Boyanova, Desislava; Nilla, Santosh; Klau, Gunnar W.; Dandekar, Thomas; Müller, Tobias; Dittrich, Marcus

2014-01-01

The continuously evolving field of proteomics produces increasing amounts of data while improving the quality of protein identifications. Albeit quantitative measurements are becoming more popular, many proteomic studies are still based on non-quantitative methods for protein identification. These studies result in potentially large sets of identified proteins, where the biological interpretation of proteins can be challenging. Systems biology develops innovative network-based methods, which allow an integrated analysis of these data. Here we present a novel approach, which combines prior knowledge of protein-protein interactions (PPI) with proteomics data using functional similarity measurements of interacting proteins. This integrated network analysis exactly identifies network modules with a maximal consistent functional similarity reflecting biological processes of the investigated cells. We validated our approach on small (H9N2 virus-infected gastric cells) and large (blood constituents) proteomic data sets. Using this novel algorithm, we identified characteristic functional modules in virus-infected cells, comprising key signaling proteins (e.g. the stress-related kinase RAF1) and demonstrate that this method allows a module-based functional characterization of cell types. Analysis of a large proteome data set of blood constituents resulted in clear separation of blood cells according to their developmental origin. A detailed investigation of the T-cell proteome further illustrates how the algorithm partitions large networks into functional subnetworks each representing specific cellular functions. These results demonstrate that the integrated network approach not only allows a detailed analysis of proteome networks but also yields a functional decomposition of complex proteomic data sets and thereby provides deeper insights into the underlying cellular processes of the investigated system. PMID:24807868
A Meta-analysis of Studies Comparing Outcomes of Diverse Acellular Dermal Matrices for Implant-Based Breast Reconstruction.

PubMed

Lee, Kyeong-Tae; Mun, Goo-Hyun

2017-07-01

The current diversity of the available acellular dermal matrix (ADM) materials for implant-based breast reconstruction raises the issue of whether there are any differences in postoperative outcomes according to the kind of ADM used. The present meta-analysis aimed to investigate whether choice of ADM products can affect outcomes. Studies that used multiple kinds of ADM products for implant-based breast reconstruction and compared outcomes between them were searched. Outcomes of interest were rates of postoperative complications: infection, seroma, mastectomy flap necrosis, reconstruction failure, and overall complications. A total of 17 studies met the selection criteria. There was only 1 randomized controlled trial, and the other 16 studies had retrospective designs. Comparison of FlexHD, DermaMatrix, and ready-to-use AlloDerm with freeze-dried AlloDerm was conducted in multiple studies and could be meta-analyzed, in which 12 studies participated. In the meta-analysis comparing FlexHD and freeze-dried AlloDerm, using the results of 6 studies, both products showed similar pooled risks for all kinds of complications. When comparing DermaMatrix and freeze-dried AlloDerm with the results from 4 studies, there were also no differences between the pooled risks of complications of the two. Similarly, the meta-analysis of 4 studies comparing ready-to-use and freeze-dried AlloDerm demonstrated that the pooled risks for the complications did not differ. This meta-analysis demonstrates that the 3 recently invented, human cadaveric skin-based products of FlexHD, DermaMatrix, and ready-to-use AlloDerm have similar risks of complications compared with those of freeze-dried AlloDerm, which has been used for longer. However, as most studies had low levels of evidence, further investigations are needed.
A novel model for DNA sequence similarity analysis based on graph theory.

PubMed

Qi, Xingqin; Wu, Qin; Zhang, Yusen; Fuller, Eddie; Zhang, Cun-Quan

2011-01-01

Determination of sequence similarity is one of the major steps in computational phylogenetic studies. As we know, during evolutionary history, not only DNA mutations for individual nucleotide but also subsequent rearrangements occurred. It has been one of major tasks of computational biologists to develop novel mathematical descriptors for similarity analysis such that various mutation phenomena information would be involved simultaneously. In this paper, different from traditional methods (eg, nucleotide frequency, geometric representations) as bases for construction of mathematical descriptors, we construct novel mathematical descriptors based on graph theory. In particular, for each DNA sequence, we will set up a weighted directed graph. The adjacency matrix of the directed graph will be used to induce a representative vector for DNA sequence. This new approach measures similarity based on both ordering and frequency of nucleotides so that much more information is involved. As an application, the method is tested on a set of 0.9-kb mtDNA sequences of twelve different primate species. All output phylogenetic trees with various distance estimations have the same topology, and are generally consistent with the reported results from early studies, which proves the new method's efficiency; we also test the new method on a simulated data set, which shows our new method performs better than traditional global alignment method when subsequent rearrangements happen frequently during evolutionary history.
Large-Scale Chemical Similarity Networks for Target Profiling of Compounds Identified in Cell-Based Chemical Screens

PubMed Central

Lo, Yu-Chen; Senese, Silvia; Li, Chien-Ming; Hu, Qiyang; Huang, Yong; Damoiseaux, Robert; Torres, Jorge Z.

2015-01-01

Target identification is one of the most critical steps following cell-based phenotypic chemical screens aimed at identifying compounds with potential uses in cell biology and for developing novel disease therapies. Current in silico target identification methods, including chemical similarity database searches, are limited to single or sequential ligand analysis that have limited capabilities for accurate deconvolution of a large number of compounds with diverse chemical structures. Here, we present CSNAP (Chemical Similarity Network Analysis Pulldown), a new computational target identification method that utilizes chemical similarity networks for large-scale chemotype (consensus chemical pattern) recognition and drug target profiling. Our benchmark study showed that CSNAP can achieve an overall higher accuracy (>80%) of target prediction with respect to representative chemotypes in large (>200) compound sets, in comparison to the SEA approach (60–70%). Additionally, CSNAP is capable of integrating with biological knowledge-based databases (Uniprot, GO) and high-throughput biology platforms (proteomic, genetic, etc) for system-wise drug target validation. To demonstrate the utility of the CSNAP approach, we combined CSNAP's target prediction with experimental ligand evaluation to identify the major mitotic targets of hit compounds from a cell-based chemical screen and we highlight novel compounds targeting microtubules, an important cancer therapeutic target. The CSNAP method is freely available and can be accessed from the CSNAP web server (http://services.mbi.ucla.edu/CSNAP/). PMID:25826798
Classification and identification of Rhodobryum roseum Limpr. and its adulterants based on fourier-transform infrared spectroscopy (FTIR) and chemometrics.

PubMed

Cao, Zhen; Wang, Zhenjie; Shang, Zhonglin; Zhao, Jiancheng

2017-01-01

Fourier-transform infrared spectroscopy (FTIR) with the attenuated total reflectance technique was used to identify Rhodobryum roseum from its four adulterants. The FTIR spectra of six samples in the range from 4000 cm-1 to 600 cm-1 were obtained. The second-derivative transformation test was used to identify the small and nearby absorption peaks. A cluster analysis was performed to classify the spectra in a dendrogram based on the spectral similarity. Principal component analysis (PCA) was used to classify the species of six moss samples. A cluster analysis with PCA was used to identify different genera. However, some species of the same genus exhibited highly similar chemical components and FTIR spectra. Fourier self-deconvolution and discrete wavelet transform (DWT) were used to enhance the differences among the species with similar chemical components and FTIR spectra. Three scales were selected as the feature-extracting space in the DWT domain. The results show that FTIR spectroscopy with chemometrics is suitable for identifying Rhodobryum roseum and its adulterants.
A Population-Based Comparative Effectiveness Study of Radiation Therapy Techniques in Stage III Non-Small Cell Lung Cancer

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harris, Jeremy P.; Murphy, James D.; Hanlon, Alexandra L.

2014-03-15

Purpose: Concerns have been raised about the potential for worse treatment outcomes because of dosimetric inaccuracies related to tumor motion and increased toxicity caused by the spread of low-dose radiation to normal tissues in patients with locally advanced non-small cell lung cancer (NSCLC) treated with intensity modulated radiation therapy (IMRT). We therefore performed a population-based comparative effectiveness analysis of IMRT, conventional 3-dimensional conformal radiation therapy (3D-CRT), and 2-dimensional radiation therapy (2D-RT) in stage III NSCLC. Methods and Materials: We used the Surveillance, Epidemiology, and End Results (SEER)-Medicare database to identify a cohort of patients diagnosed with stage III NSCLC from 2002 to 2009 treated with IMRT, 3D-CRT, or 2D-RT. Using Cox regression and propensity score matching, we compared survival and toxicities of these treatments. Results: The proportion of patients treated with IMRT increased from 2% in 2002 to 25% in 2009, and the use of 2D-RT decreased from 32% to 3%. In univariate analysis, IMRT was associated with improved overall survival (OS) (hazard ratio [HR] 0.90, P=.02) and cancer-specific survival (CSS) (HR 0.89, P=.02). After controlling for confounders, IMRT was associated with similar OS (HR 0.94, P=.23) and CSS (HR 0.94, P=.28) compared with 3D-CRT. Both techniques had superior OS compared with 2D-RT. IMRT was associated with similar toxicity risks on multivariate analysis compared with 3D-CRT. Propensity score matched model results were similar to those from adjusted models. Conclusions: In this population-based analysis, IMRT for stage III NSCLC was associated with similar OS and CSS and maintained similar toxicity risks compared with 3D-CRT.
Demonstration of innovative techniques for work zone safety data analysis

DOT National Transportation Integrated Search

2009-07-15

Based upon the results of the simulator data analysis, additional future research can be : identified to validate the driving simulator in terms of similarities with Ohio work zones. For : instance, the speeds observed in the simulator were greater f...
Different perspectives on economic base.

Treesearch

Lisa K. Crone; Richard W. Haynes; Nicholas E. Reyna

1999-01-01

Two general approaches for measuring the economic base are discussed. Each method is used to define the economic base for each of the counties included in the Interior Columbia Basin Ecosystem Management Project area. A more detailed look at four selected counties results in similar findings from different approaches. Limitations of economic base analysis also are...
Unsupervised Approaches for Post-Processing in Computationally Efficient Waveform-Similarity-Based Earthquake Detection

NASA Astrophysics Data System (ADS)

Bergen, K.; Yoon, C. E.; OReilly, O. J.; Beroza, G. C.

2015-12-01

Recent improvements in computational efficiency for waveform correlation-based detections achieved by new methods such as Fingerprint and Similarity Thresholding (FAST) promise to allow large-scale blind search for similar waveforms in long-duration continuous seismic data. Waveform similarity search applied to datasets of months to years of continuous seismic data will identify significantly more events than traditional detection methods. With the anticipated increase in number of detections and associated increase in false positives, manual inspection of the detection results will become infeasible. This motivates the need for new approaches to process the output of similarity-based detection. We explore data mining techniques for improved detection post-processing. We approach this by considering similarity-detector output as a sparse similarity graph with candidate events as vertices and similarities as weighted edges. Image processing techniques are leveraged to define candidate events and combine results individually processed at multiple stations. Clustering and graph analysis methods are used to identify groups of similar waveforms and assign a confidence score to candidate detections. Anomaly detection and classification are applied to waveform data for additional false detection removal. A comparison of methods will be presented and their performance will be demonstrated on a suspected induced and non-induced earthquake sequence.
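The graph view described in this abstract (candidate events as vertices, similarities as weighted edges) can be sketched in a few lines. This is only an illustration under stated assumptions: grouping by connected components of a thresholded graph is one simple clustering choice and is not claimed to be the authors' method, and the event labels, weights, threshold, and function name are hypothetical.

```python
from collections import defaultdict

def cluster_detections(edges, min_similarity):
    """Group candidate events into connected components of the
    similarity graph, keeping only edges at or above min_similarity.

    edges: iterable of (event_a, event_b, similarity) tuples.
    Returns a list of sorted clusters, largest first.
    """
    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, w in edges:
        ru, rv = find(u), find(v)  # registers both vertices
        if w >= min_similarity and ru != rv:
            parent[ru] = rv

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))

# Hypothetical detector output: pairs of candidate events with a similarity.
edges = [("e1", "e2", 0.9), ("e2", "e3", 0.8),
         ("e4", "e5", 0.2), ("e6", "e7", 0.95)]
clusters = cluster_detections(edges, min_similarity=0.5)
print(clusters)
```

In a sketch like this, events that end up in singleton clusters have no strong match anywhere in the data and could be given a lower confidence score, although how confidence is actually assigned is not stated in the abstract.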
Symbol recognition with kernel density matching.

PubMed

Zhang, Wan; Wenyin, Liu; Zhang, Kun

2006-12-01

We propose a novel approach to similarity assessment for graphic symbols. Symbols are represented as 2D kernel densities and their similarity is measured by the Kullback-Leibler divergence. Symbol orientation is found by gradient-based angle searching or independent component analysis. Experimental results show the outstanding performance of this approach in various situations.
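As a rough illustration of comparing symbols through kernel densities, the sketch below evaluates a 2D Gaussian kernel density on a fixed grid and computes a discrete Kullback-Leibler divergence between two such densities. The grid, bandwidth, sample strokes, and function names are all assumptions made for the example; the abstract does not give these details, and orientation search is not covered here.

```python
import math

def kde_on_grid(points, grid, bandwidth=0.15):
    """Evaluate an isotropic 2D Gaussian kernel density at each grid
    position and normalise the values so they sum to 1."""
    dens = []
    for gx, gy in grid:
        v = sum(
            math.exp(-((gx - x) ** 2 + (gy - y) ** 2) / (2 * bandwidth ** 2))
            for x, y in points
        )
        dens.append(v)
    total = sum(dens)
    return [d / total for d in dens]

def kl_divergence(p, q, eps=1e-12):
    """Discrete KL(p || q); eps guards against division by zero and log(0)."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical symbols sampled as points in the unit square.
grid = [(i / 9, j / 9) for i in range(10) for j in range(10)]
stroke_a = [(t / 10, t / 10) for t in range(11)]       # diagonal stroke
stroke_b = [(t / 10, 1 - t / 10) for t in range(11)]   # anti-diagonal stroke

p = kde_on_grid(stroke_a, grid)
q = kde_on_grid(stroke_b, grid)
print(kl_divergence(p, p))  # identical symbols
print(kl_divergence(p, q))  # different symbols
```

Note that the Kullback-Leibler divergence is not symmetric in general, so a practical similarity score may need to fix a direction or symmetrise the two values; which option the authors chose is not stated in the abstract.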
Velocity analysis of simultaneous-source data using high-resolution semblance—coping with the strong noise

NASA Astrophysics Data System (ADS)

Gan, Shuwei; Wang, Shoudong; Chen, Yangkang; Qu, Shan; Zu, Shaohuan

2016-02-01

Direct imaging of simultaneous-source (or blended) data, without the need of deblending, requires a precise subsurface velocity model. In this paper, we focus on the velocity analysis of simultaneous-source data using the normal moveout-based velocity picking approach. We demonstrate that it is possible to obtain a precise velocity model directly from the blended data in the common-midpoint domain. The similarity-weighted semblance can help us obtain much better velocity spectrum with higher resolution and higher reliability compared with the traditional semblance. The similarity-weighted semblance enforces an inherent noise attenuation solely in the semblance calculation stage, thus it is not sensitive to the intense interference. We use both simulated synthetic and field data examples to demonstrate the performance of the similarity-weighted semblance in obtaining reliable subsurface velocity model for direct migration of simultaneous-source data. The migrated image of blended field data using prestack Kirchhoff time migration approach based on the picked velocity from the similarity-weighted semblance is very close to the migrated image of unblended data.
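For readers unfamiliar with the baseline, the sketch below computes the conventional (unweighted) semblance of a small window of moveout-corrected traces: the energy of the stacked trace divided by the number of traces times the total energy. It does not implement the similarity-weighted variant, whose weighting scheme is not given in the abstract; the function name and sample values are hypothetical.

```python
def semblance(gather):
    """Conventional semblance of a window of NMO-corrected traces.

    gather: list of traces, each a list of samples of equal length.
    Returns a value between 0 and 1; 1 means perfectly coherent traces.
    """
    n_traces = len(gather)
    n_samples = len(gather[0])
    # Energy of the stacked trace, summed over the time window.
    stack_energy = sum(
        sum(trace[t] for trace in gather) ** 2 for t in range(n_samples)
    )
    # Total energy of all individual traces, scaled by the trace count.
    total_energy = n_traces * sum(s * s for trace in gather for s in trace)
    return stack_energy / total_energy if total_energy else 0.0

# Hypothetical windows: aligned events stack coherently, opposed ones cancel.
coherent = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
cancelling = [[1.0, 2.0], [-1.0, -2.0]]
print(semblance(coherent))
print(semblance(cancelling))
```

In velocity analysis this quantity is evaluated for each trial velocity and time window after moveout correction, and peaks in the resulting spectrum are picked. Interference from blended sources lowers the coherence of the stack, which is the problem the weighted variant in the abstract is meant to address.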
Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity.

PubMed

Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

2015-09-01

The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. 
© 2015 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Comparison of topological clustering within protein networks using edge metrics that evaluate full sequence, full structure, and active site microenvironment similarity

PubMed Central

Leuthaeuser, Janelle B; Knutson, Stacy T; Kumar, Kiran; Babbitt, Patricia C; Fetrow, Jacquelyn S

2015-01-01

The development of accurate protein function annotation methods has emerged as a major unsolved biological problem. Protein similarity networks, one approach to function annotation via annotation transfer, group proteins into similarity-based clusters. An underlying assumption is that the edge metric used to identify such clusters correlates with functional information. In this contribution, this assumption is evaluated by observing topologies in similarity networks using three different edge metrics: sequence (BLAST), structure (TM-Align), and active site similarity (active site profiling, implemented in DASP). Network topologies for four well-studied protein superfamilies (enolase, peroxiredoxin (Prx), glutathione transferase (GST), and crotonase) were compared with curated functional hierarchies and structure. As expected, network topology differs, depending on edge metric; comparison of topologies provides valuable information on structure/function relationships. Subnetworks based on active site similarity correlate with known functional hierarchies at a single edge threshold more often than sequence- or structure-based networks. Sequence- and structure-based networks are useful for identifying sequence and domain similarities and differences; therefore, it is important to consider the clustering goal before deciding appropriate edge metric. Further, conserved active site residues identified in enolase and GST active site subnetworks correspond with published functionally important residues. Extension of this analysis yields predictions of functionally determinant residues for GST subgroups. These results support the hypothesis that active site similarity-based networks reveal clusters that share functional details and lay the foundation for capturing functionally relevant hierarchies using an approach that is both automatable and can deliver greater precision in function annotation than current similarity-based methods. PMID:26073648
A Biosequence-based Approach to Software Characterization

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oehmen, Christopher S.; Peterson, Elena S.; Phillips, Aaron R.

For many applications, it is desirable to have some process for recognizing when software binaries are closely related without relying on them to be identical or have identical segments. Some examples include monitoring utilization of high performance computing centers or service clouds, detecting freeware in licensed code, and enforcing application whitelists. But doing so in a dynamic environment is a nontrivial task because most approaches to software similarity require extensive and time-consuming analysis of a binary, or they fail to recognize executables that are similar but nonidentical. Presented herein is a novel biosequence-based method for quantifying similarity of executable binaries. Using this method, it is shown in an example application on large-scale multi-author codes that 1) the biosequence-based method has a statistical performance in recognizing and distinguishing between a collection of real-world high performance computing applications better than 90% of ideal; and 2) an example of using family tree analysis to tune identification for a code subfamily can achieve better than 99% of ideal performance.
DNetDB: The human disease network database based on dysfunctional regulation mechanism.

PubMed

Yang, Jing; Wu, Su-Juan; Yang, Shao-You; Peng, Jia-Wei; Wang, Shi-Nuo; Wang, Fu-Yan; Song, Yu-Xing; Qi, Ting; Li, Yi-Xue; Li, Yuan-Yuan

2016-05-21

Disease similarity studies provide new insights into disease taxonomy and pathogenesis, which play a guiding role in diagnosis and treatment. Early studies were limited to estimating disease similarities based on clinical manifestations, disease-related genes, medical vocabulary concepts or registry data, which were inevitably biased toward well-studied diseases and offered little chance of discovering novel disease relationships. Genome-scale expression data give us another angle on this problem, since simultaneous measurement of the expression of thousands of genes allows for the exploration of gene transcriptional regulation, which is believed to be crucial to biological functions. Although methods based on differential expression analysis have the potential to explore new disease relationships, it is difficult for them to unravel the upstream dysregulation mechanisms of diseases. We therefore estimated disease similarities based on gene expression data by using differential coexpression analysis, a recently emerging method that has been shown to have more potential to capture dysfunctional regulation mechanisms than differential expression analysis. A total of 1,326 disease relationships among 108 diseases were identified, and the relevant information constituted the human disease network database (DNetDB). Benefiting from the use of differential coexpression analysis, the potential common dysfunctional regulation mechanisms shared by disease pairs (i.e. disease relationships) were extracted and presented. Statistical indicators, common disease-related genes and drugs shared by disease pairs were also included in DNetDB. In total, 1,326 disease relationships among 108 diseases, 5,598 pathways, 7,357 disease-related genes and 342 disease drugs are recorded in DNetDB, among which 3,762 genes and 148 drugs are shared by at least two diseases.
DNetDB is the first database focusing on disease similarity from the viewpoint of gene regulation mechanism. It provides an easy-to-use web interface to search and browse the disease relationships and thus helps to systematically investigate etiology and pathogenesis, perform drug repositioning, and design novel therapeutic interventions. Database URL: http://app.scbit.org/DNetDB/
Asymmetric transmission and reflection spectra of FBG in single-multi-single mode fiber structure.

PubMed

Chai, Quan; Liu, Yanlei; Zhang, Jianzhong; Yang, Jun; Chen, Yujin; Yuan, Libo; Peng, Gang-Ding

2015-05-04

We give a comprehensive theoretical analysis and simulation of an FBG in a single-multi-single mode fiber structure (FBG-in-SMS), based on coupled mode analysis and mode interference analysis. This enables us to explain the experimental observations: asymmetric transmission and reflection spectra with similar temperature responses near the spectral range of the Bragg wavelengths. The shift of the transmission spectrum during the FBG writing process is observed and discussed. The analysis results are useful in the design of sensors and filters based on the SMS structure.
Study on force mechanism for therapeutic effect of pushing manipulation with one-finger meditation base on similarity analysis of force and waveform.

PubMed

Fang, Lei; Fang, Min; Guo, Min-Min

2016-12-27

This study aimed to reveal the force mechanism behind the therapeutic effect of pushing manipulation with one-finger meditation. A total of 15 participants were recruited and assigned to an expert group, a skilled group and a novice group, with 5 participants in each group. Mechanical signals were collected from a biomechanical testing platform, and these data were further examined via similarity analysis and cluster analysis. Comparing the force waveforms of manipulation revealed that the manipulation forces were similar between the expert group and the skilled group (P>0.05). The mean vertical force was 9.8 N (95% CI 6.37 to 14.70 N), but both groups differed significantly from the novice group (P<0.05). The overall similarity coefficient cluster analysis showed that two kinds of manipulation force curves existed between the expert group and the skilled group. Pushing manipulation with one-finger meditation is a kind of light stimulation manipulation on the acupoint, and its force is characterized by double waveforms that alternate continuously during manual operation.
Triton's surface properties - A preliminary analysis from ground-based, Voyager photopolarimeter subsystem, and laboratory measurements

NASA Technical Reports Server (NTRS)

Buratti, B. J.; Lane, A. L.; Gibson, J.; Burrows, H.; Nelson, R. M.; Bliss, D.; Smythe, W.; Garkanian, V.; Wallis, B.

1991-01-01

The surface properties of Triton were investigated using data from the ground-based and Voyager photopolarimeter subsystem (PPS) observations of Triton's phase curve. The results indicate that Triton has a high single-scattering albedo (0.96 +/-0.01 at 0.75 micron) and an unusually compacted surface, possibly similar to that of Europa. Results also suggest that Triton's single-particle phase function and the macroscopically rough character of its surface are similar to those of most other icy satellites.
Countries population determination to test rice crisis indicator at national level using k-means cluster analysis

NASA Astrophysics Data System (ADS)

Hidayat, Y.; Purwandari, T.; Sukono; Ariska, Y. D.

2017-01-01

This study aimed to identify the population of countries that are similar to Indonesia on three characteristics: democratic atmosphere, rice consumption, and purchasing power for rice. The result is useful as reference material for research that tests the strength and predictability of the rice crisis indicator Unprecedented Restlessness (UR). Countries similar to Indonesia were identified using a multivariate method, non-hierarchical k-means cluster analysis, with 38 countries as the data population. The analysis was repeated until it yielded a number of clusters that shows the differentiating power of the three characteristics and high similarity within clusters. The results show that 6 clusters can describe the differentiating power of the characteristics of the formed clusters. However, to answer the purpose of the study, only one cluster is taken, according to the following criteria for a population of countries similar to Indonesia: the cluster contains Indonesia, it includes countries that did and did not experience a rice crisis in 2008, and it has the most members among such clusters. These criteria are met by cluster 2, which consists of 22 countries, namely Indonesia, Brazil, Costa Rica, Djibouti, Dominican Republic, Ecuador, Fiji, Guinea-Bissau, Haiti, India, Jamaica, Japan, South Korea, Madagascar, Malaysia, Mali, Nicaragua, Panama, Peru, Senegal, Sierra Leone and Suriname.
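The k-means procedure named in the abstract can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' analysis: the six three-feature points are invented stand-ins for (democratic atmosphere, rice consumption, purchasing power), the `kmeans` helper is hypothetical, and the first k points are used as starting centroids only to keep the run deterministic.

```python
import math


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points seed the clusters
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Invented, already-scaled feature vectors for six placeholder countries.
points = [
    (0.0, 0.0, 0.0), (5.0, 5.0, 5.0), (0.1, 0.2, 0.0),
    (5.1, 4.9, 5.2), (0.2, 0.1, 0.1), (4.8, 5.1, 5.0),
]
labels, centroids = kmeans(points, k=2)
print(labels)
```

Repeating the call for several values of k and inspecting within-cluster similarity would mirror the abstract's description of rerunning the analysis until a suitable number of clusters is found.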
Advanced Models and Algorithms for Self-Similar IP Network Traffic Simulation and Performance Analysis

NASA Astrophysics Data System (ADS)

Radev, Dimitar; Lokshina, Izabella

2010-11-01

The paper examines self-similar (or fractal) properties of real communication network traffic data over a wide range of time scales. These self-similar properties are very different from the properties of traditional models based on Poisson and Markov-modulated Poisson processes. Advanced fractal models of sequential generators and fixed-length sequence generators, and efficient algorithms that are used to simulate self-similar behavior of IP network traffic data, are developed and applied. Numerical examples are provided, and simulation results are obtained and analyzed.

Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.

PubMed

Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju

2015-01-01

Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However, the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand the functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study, with 39 protein domains from the lipocalin superfamily, suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis of each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster according to their mode of ligand binding, even though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single-member families. The structure-based phylogenetic approach has also provided pointers for assigning putative functions to domains of unknown function in the lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.
Assessing semantic similarity of texts - Methods and algorithms

NASA Astrophysics Data System (ADS)

Rozeva, Anna; Zerkova, Silvia

2017-12-01

Assessing the semantic similarity of texts is an important part of different text-related applications like educational systems, information retrieval, text summarization, etc. This task is performed by sophisticated analysis, which implements text-mining techniques. Text mining involves several pre-processing steps, which yield a structured representative model of the documents in a corpus by extracting and selecting the features that characterize their content. Generally the model is vector-based and enables further analysis with knowledge discovery approaches. Algorithms and measures are used for assessing texts at the syntactic and semantic levels. An important text-mining method and similarity measure is latent semantic analysis (LSA). It reduces the dimensionality of the document vector space and better captures the text semantics. The mathematical background of LSA for deriving the meaning of the words in a given text by exploring their co-occurrence is examined. The algorithm for obtaining the vector representation of words and their corresponding latent concepts in a reduced multidimensional space, as well as the similarity calculation, are presented.
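The vector-based document model and the similarity calculation mentioned above can be illustrated with term-count vectors compared by cosine similarity. This sketch is deliberately simpler than LSA: it skips the dimensionality reduction (the singular value decomposition step), and the whitespace tokenizer and the `cosine_similarity` helper are assumptions made for illustration.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a = Counter(text_a.lower().split())  # naive whitespace tokenization
    b = Counter(text_b.lower().split())
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("the cat sat", "the cat ran"))
print(cosine_similarity("the cat sat", "dogs bark loudly"))
```

Texts that share no terms score 0 here even when they are related in meaning, which is the limitation that a reduced latent space such as the one built by LSA is meant to address.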
A Cyber-Attack Detection Model Based on Multivariate Analyses

NASA Astrophysics Data System (ADS)

Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and cluster analysis. We quantify the observed qualitative audit event sequence via quantification method IV, and collect similar audit event sequences into the same groups based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.
Noncontiguous atom matching structural similarity function.

PubMed

Teixeira, Ana L; Falcao, Andre O

2013-10-28

Measuring similarity between molecules is a fundamental problem in cheminformatics. Given that similar molecules tend to have similar physical, chemical, and biological properties, the notion of molecular similarity plays an important role in the exploration of molecular data sets, query-retrieval in molecular databases, and in structure-property/activity modeling. Various methods to define structural similarity between molecules are available in the literature, but so far none has been used with consistent and reliable results for all situations. We propose a new similarity method based on atom alignment for the analysis of structural similarity between molecules. This method is based on the comparison of the bonding profiles of atoms on comparable molecules, including features that are seldom found in other structural or graph matching approaches like chirality or double bond stereoisomerism. The similarity measure is then defined on the annotated molecular graph, based on an iterative directed graph similarity procedure and optimal atom alignment between atoms using a pairwise matching algorithm. With the proposed approach the similarities detected are more intuitively understood because similar atoms in the molecules are explicitly shown. This noncontiguous atom matching structural similarity method (NAMS) was tested and compared with one of the most widely used similarity methods (fingerprint-based similarity) using three difficult data sets with different characteristics. Despite having a higher computational cost, the method performed well being able to distinguish either different or very similar hydrocarbons that were indistinguishable using a fingerprint-based approach. 
NAMS also verified the similarity principle using a data set of structurally similar steroids with differences in the binding affinity to the corticosteroid binding globulin receptor by showing that pairs of steroids with a high degree of similarity (>80%) tend to have smaller differences in the absolute value of binding activity. Using a highly diverse set of compounds with information about the monoamine oxidase inhibition level, the method was also able to recover a significantly higher average fraction of active compounds when the seed is active for different cutoff threshold values of similarity. Particularly, for the cutoff threshold values of 86%, 93%, and 96.5%, NAMS was able to recover a fraction of actives of 0.57, 0.63, and 0.83, respectively, while the fingerprint-based approach was able to recover a fraction of actives of 0.41, 0.40, and 0.39, respectively. NAMS is made available freely for the whole community in a simple Web based tool as well as the Python source code at http://nams.lasige.di.fc.ul.pt/.
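The retrieval measure reported above, the fraction of actives recovered at a similarity cutoff, can be sketched for a single active seed compound. This is an illustrative reading of the abstract, not the authors' evaluation code: the compound identifiers, similarity scores, activity labels and the `fraction_of_actives` helper are all invented, and averaging over many seeds is left out.

```python
def fraction_of_actives(similarities, is_active, cutoff):
    """Among compounds whose similarity to the seed is >= cutoff,
    return the fraction that are active (None if nothing is retrieved)."""
    hits = [cid for cid, score in similarities.items() if score >= cutoff]
    if not hits:
        return None
    return sum(1 for cid in hits if is_active[cid]) / len(hits)


# Hypothetical similarities (0 to 1) of four compounds to one active seed.
similarities = {"c1": 0.97, "c2": 0.90, "c3": 0.88, "c4": 0.70}
is_active = {"c1": True, "c2": False, "c3": True, "c4": False}

for cutoff in (0.86, 0.93):
    print(cutoff, fraction_of_actives(similarities, is_active, cutoff))
```

A similarity measure that respects the similarity principle should show this fraction rising as the cutoff increases, which is the pattern the abstract reports for NAMS.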
Binary Classification using Decision Tree based Genetic Programming and Its Application to Analysis of Bio-mass Data

NASA Astrophysics Data System (ADS)

To, Cuong; Pham, Tuan D.

2010-01-01

In machine learning, pattern recognition may be the most popular task. Identification of "similar" patterns is also very important in biology: first, it is useful for prediction of patterns associated with disease, for example cancer tissue (normal or tumor); second, similarity or dissimilarity of the kinetic patterns is used to identify coordinately controlled genes or proteins involved in the same regulatory process; third, similar genes (proteins) share similar functions. In this paper, we present an algorithm that uses genetic programming to create a decision tree for binary classification problems. The algorithm was applied to five real biological databases. Based on the results of comparisons with well-known methods, the algorithm performs very well in most cases.
Applying CBR to machine tool product configuration design oriented to customer requirements

NASA Astrophysics Data System (ADS)

Wang, Pengjia; Gong, Yadong; Xie, Hualong; Liu, Yongxian; Nee, Andrew Yehching

2017-01-01

Product customization is a trend in the current market-oriented manufacturing environment. However, deduction from customer requirements to design results and evaluation of design alternatives are still heavily reliant on the designer's experience and knowledge. To solve the problem of fuzziness and uncertainty of customer requirements in product configuration, an analysis method based on the grey rough model is presented. The customer requirements can be converted into technical characteristics effectively. In addition, an optimization decision model for product planning is established to help the enterprises select the key technical characteristics under the constraints of cost and time to serve the customer to maximal satisfaction. A new case retrieval approach that combines the self-organizing map and fuzzy similarity priority ratio method is proposed in case-based design. The self-organizing map can reduce the retrieval range and increase the retrieval efficiency, and the fuzzy similarity priority ratio method can evaluate the similarity of cases comprehensively. To ensure that the final case has the best overall performance, an evaluation method of similar cases based on grey correlation analysis is proposed to evaluate similar cases to select the most suitable case. Furthermore, a computer-aided system is developed using MATLAB GUI to assist the product configuration design. The actual example and result on an ETC series machine tool product show that the proposed method is effective, rapid and accurate in the process of product configuration. The proposed methodology provides a detailed instruction for the product configuration design oriented to customer requirements.
Similarity Theory of Withdrawn Water Temperature Experiment

PubMed Central

2015-01-01

Selective withdrawal from a thermal stratified reservoir has been widely utilized in managing reservoir water withdrawal. Besides theoretical analysis and numerical simulation, model test was also necessary in studying the temperature of withdrawn water. However, information on the similarity theory of the withdrawn water temperature model remains lacking. Considering flow features of selective withdrawal, the similarity theory of the withdrawn water temperature model was analyzed theoretically based on the modification of governing equations, the Boussinesq approximation, and some simplifications. The similarity conditions between the model and the prototype were suggested. The conversion of withdrawn water temperature between the model and the prototype was proposed. Meanwhile, the fundamental theory of temperature distribution conversion was firstly proposed, which could significantly improve the experiment efficiency when the basic temperature of the model was different from the prototype. Based on the similarity theory, an experiment was performed on the withdrawn water temperature which was verified by numerical method. PMID:26065020
Lactobacillus heilongjiangensis sp. nov., isolated from Chinese pickle.

PubMed

Gu, Chun Tao; Li, Chun Yan; Yang, Li Jie; Huo, Gui Cheng

2013-11-01

A Gram-stain-positive bacterial strain, S4-3(T), was isolated from traditional pickle in Heilongjiang Province, China. The bacterium was characterized by a polyphasic approach, including 16S rRNA gene sequence analysis, pheS gene sequence analysis, rpoA gene sequence analysis, dnaK gene sequence analysis, fatty acid methyl ester (FAME) analysis, determination of DNA G+C content, DNA-DNA hybridization and an analysis of phenotypic features. Strain S4-3(T) showed 97.9-98.7 % 16S rRNA gene sequence similarities, 84.4-94.1 % pheS gene sequence similarities and 94.4-96.9 % rpoA gene sequence similarities to the type strains of Lactobacillus nantensis, Lactobacillus mindensis, Lactobacillus crustorum, Lactobacillus futsaii, Lactobacillus farciminis and Lactobacillus kimchiensis. dnaK gene sequence similarities between S4-3(T) and Lactobacillus nantensis LMG 23510(T), Lactobacillus mindensis LMG 21932(T), Lactobacillus crustorum LMG 23699(T), Lactobacillus futsaii JCM 17355(T) and Lactobacillus farciminis LMG 9200(T) were 95.4, 91.5, 90.4, 91.7 and 93.1 %, respectively. Based upon the data obtained in the present study, a novel species, Lactobacillus heilongjiangensis sp. nov., is proposed and the type strain is S4-3(T) ( = LMG 26166(T) = NCIMB 14701(T)).
A novel water quality data analysis framework based on time-series data mining.

PubMed

Deng, Weihui; Wang, Guoyin

2017-07-01

The rapid development of time-series data mining provides an emerging method for water resource management research. In this paper, based on the time-series data mining methodology, we propose a novel and general analysis framework for water quality time-series data. It consists of two parts: implementation components and common tasks of time-series data mining in water quality data. In the first part, we propose to granulate the time series into several two-dimensional normal clouds and calculate the similarities at the granulated level. On the basis of the similarity matrix, the similarity search, anomaly detection, and pattern discovery tasks in the water quality time-series instance dataset can be easily implemented in the second part. We present a case study of this analysis framework on weekly Dissolved Oxygen (DO) time-series data collected from five monitoring stations on the upper reaches of the Yangtze River, China. It revealed the relationship between water quality in the mainstream and the tributary as well as the main changing patterns of DO. The experimental results show that the proposed analysis framework is a feasible and efficient method to mine hidden and valuable knowledge from historical water quality time-series data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Trust-Enhanced Cloud Service Selection Model Based on QoS Analysis.

PubMed

Pan, Yuchen; Ding, Shuai; Fan, Wenjuan; Li, Jing; Yang, Shanlin

2015-01-01

Cloud computing technology plays a very important role in many areas, such as in the construction and development of the smart city. Meanwhile, numerous cloud services appear on the cloud-based platform. Therefore, how to select trustworthy cloud services remains a significant problem on such platforms and has been extensively investigated owing to the ever-growing needs of users. However, trust relationships in social networks have not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on the trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured based on Jaccard's Coefficient and the numerical distance computed by Pearson Correlation Coefficient. Then, by using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results via adjusting parameters and exhibits high effectiveness. The cloud services ranked by our model also have better QoS properties than those of other methods in the comparison experiments.
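The combination of Jaccard's Coefficient, Pearson correlation and a trust degree described above can be sketched as follows. The abstract does not give the formulas, so everything about the combination is an assumption made for illustration: the linear mix with weight `alpha`, the multiplication by `trust`, the helper names and the QoS values are all hypothetical.

```python
import math


def jaccard(set_a, set_b):
    """Overlap of the sets of services two users have invoked."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation of two equally long value lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def trust_enhanced_similarity(qos_a, qos_b, trust, alpha=0.5):
    """Assumed form: trust * (alpha * Jaccard + (1 - alpha) * Pearson)."""
    common = sorted(qos_a.keys() & qos_b.keys())
    usage = jaccard(set(qos_a), set(qos_b))
    corr = 0.0
    if len(common) >= 2:  # correlation needs at least two shared services
        corr = pearson([qos_a[s] for s in common], [qos_b[s] for s in common])
    return trust * (alpha * usage + (1 - alpha) * corr)


# Invented QoS observations per service for two users, and a trust degree.
qos_a = {"s1": 0.9, "s2": 0.5, "s3": 0.7}
qos_b = {"s1": 0.8, "s2": 0.4, "s3": 0.6, "s4": 0.3}
print(trust_enhanced_similarity(qos_a, qos_b, trust=0.8))
```

In a neighbor-based predictor, scores like this would rank candidate neighbors before their QoS observations are used to fill in a user's missing values.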
Trust-Enhanced Cloud Service Selection Model Based on QoS Analysis

PubMed Central

Pan, Yuchen; Ding, Shuai; Fan, Wenjuan; Li, Jing; Yang, Shanlin

2015-01-01

Cloud computing technology plays a very important role in many areas, such as in the construction and development of the smart city. Meanwhile, numerous cloud services appear on the cloud-based platform. Therefore, how to select trustworthy cloud services remains a significant problem on such platforms and has been extensively investigated owing to the ever-growing needs of users. However, trust relationships in social networks have not been taken into account in existing methods of cloud service selection and recommendation. In this paper, we propose a cloud service selection model based on the trust-enhanced similarity. Firstly, the direct, indirect, and hybrid trust degrees are measured based on the interaction frequencies among users. Secondly, we estimate the overall similarity by combining the experience usability measured based on Jaccard’s Coefficient and the numerical distance computed by Pearson Correlation Coefficient. Then, by using the trust degree to modify the basic similarity, we obtain a trust-enhanced similarity. Finally, we utilize the trust-enhanced similarity to find similar trusted neighbors and predict the missing QoS values as the basis of cloud service selection and recommendation. The experimental results show that our approach is able to obtain optimal results via adjusting parameters and exhibits high effectiveness. The cloud services ranked by our model also have better QoS properties than those of other methods in the comparison experiments. PMID:26606388
DOSim: an R package for similarity between diseases based on Disease Ontology.

PubMed

Li, Jiang; Gong, Binsheng; Chen, Xi; Liu, Tao; Wu, Chao; Zhang, Fan; Li, Chunquan; Li, Xiang; Rao, Shaoqi; Li, Xia

2011-06-29

The construction of the Disease Ontology (DO) has helped promote the investigation of diseases and disease risk factors. DO enables researchers to analyse disease similarity by adopting semantic similarity measures, and has expanded our understanding of the relationships between different diseases and our ability to classify them. Simultaneously, similarities between genes can also be analysed by their associations with similar diseases. As a result, disease heterogeneity is better understood and insights into the molecular pathogenesis of similar diseases have been gained. However, bioinformatics tools that provide easy and straightforward ways to use DO to study disease and gene similarity simultaneously are required. We have developed an R-based software package (DOSim) to compute the similarity between diseases and to measure the similarity between human genes in terms of diseases. DOSim incorporates a DO-based enrichment analysis function that can be used to explore the disease feature of an independent gene set. A multilayered enrichment analysis (GO and KEGG annotation) function that helps users explore the biological meaning implied in a newly detected gene module is also part of the DOSim package. We used the disease similarity application to demonstrate the relationship between 128 different DO cancer terms. The hierarchical clustering of these 128 different cancers showed modular characteristics. In another case study, we used the gene similarity application on 361 obesity-related genes. The results revealed the complex pathogenesis of obesity. In addition, the gene module detection and gene module multilayered annotation functions in DOSim, when applied to these 361 obesity-related genes, helped extend our understanding of the complex pathogenesis of obesity risk phenotypes and the heterogeneity of obesity-related diseases. DOSim can be used to detect disease-driven gene modules, and to annotate the modules for functions and pathways.
The DOSim package can also be used to visualise DO structure. DOSim can reflect the modular characteristic of disease related genes and promote our understanding of the complex pathogenesis of diseases. DOSim is available on the Comprehensive R Archive Network (CRAN) or http://bioinfo.hrbmu.edu.cn/dosim.
Comparison of estimators for rolling samples using Forest Inventory and Analysis data

Treesearch

Devin S. Johnson; Michael S. Williams; Raymond L. Czaplewski

2003-01-01

The performance of three classes of weighted average estimators is studied for an annual inventory design similar to the Forest Inventory and Analysis program of the United States. The first class is based on an ARIMA(0,1,1) time series model. The equal weight, simple moving average is a member of this class. The second class is based on an ARIMA(0,2,2) time series...
NaviGO: interactive tool for visualization and functional similarity and coherence analysis with gene ontology.

PubMed

Wei, Qing; Khan, Ishita K; Ding, Ziyun; Yerneni, Satwica; Kihara, Daisuke

2017-03-20

The number of genomics and proteomics experiments is growing rapidly, producing an ever-increasing amount of data that are awaiting functional interpretation. A number of function prediction algorithms were developed and improved to enable fast and automatic function annotation. With the well-defined structure and manual curation, Gene Ontology (GO) is the most frequently used vocabulary for representing gene functions. To understand relationship and similarity between GO annotations of genes, it is important to have a convenient pipeline that quantifies and visualizes the GO function analyses in a systematic fashion. NaviGO is a web-based tool for interactive visualization, retrieval, and computation of functional similarity and associations of GO terms and genes. Similarity of GO terms and gene functions is quantified with six different scores including protein-protein interaction and context based association scores we have developed in our previous works. Interactive navigation of the GO function space provides intuitive and effective real-time visualization of functional groupings of GO terms and genes as well as statistical analysis of enriched functions. We developed NaviGO, which visualizes and analyses functional similarity and associations of GO terms and genes. The NaviGO webserver is freely available at: http://kiharalab.org/web/navigo .
Identification of uncommon objects in containers

DOEpatents

Bremer, Peer-Timo; Kim, Hyojin; Thiagarajan, Jayaraman J.

2017-09-12

A system for identifying in an image an object that is commonly found in a collection of images and for identifying a portion of an image that represents an object based on a consensus analysis of segmentations of the image. The system collects images of containers that contain objects for generating a collection of common objects within the containers. To process the images, the system generates a segmentation of each image. The image analysis system may also generate multiple segmentations for each image by introducing variations in the selection of voxels to be merged into a segment. The system then generates clusters of the segments based on similarity among the segments. Each cluster represents a common object found in the containers. Once the clustering is complete, the system may be used to identify common objects in images of new containers based on similarity between segments of images and the clusters.
Assessing co-regulation of directly linked genes in biological networks using microarray time series analysis.

PubMed

Del Sorbo, Maria Rosaria; Balzano, Walter; Donato, Michele; Draghici, Sorin

2013-11-01

Differential expression of genes detected with the analysis of high throughput genomic experiments is a commonly used intermediate step for the identification of signaling pathways involved in the response to different biological conditions. The impact analysis was the first approach for the analysis of signaling pathways involved in a certain biological process that was able to take into account not only the magnitude of the expression change of the genes but also the topology of signaling pathways, including the type of each interaction between the genes. In the impact analysis, signaling pathways are represented as weighted directed graphs with genes as nodes and the interactions between genes as edges. Edge weights are represented by a β factor, the regulatory efficiency, which is assumed to be equal to 1 in inductive interactions between genes and equal to -1 in repressive interactions. This study presents a similarity analysis between gene expression time series aimed at finding correspondences with the regulatory efficiency, i.e. the β factor as found in a widely used pathway database. Here, we focused on correlations among genes directly connected in signaling pathways, assuming that the expression variations of upstream genes impact immediately downstream genes in a short time interval and without significant influences from interactions with other genes. Time series were processed using three different similarity metrics. The first metric is based on bit string matching; the second is a specific application of Dynamic Time Warping to detect similarities even in the presence of stretching and delays; the third is a quantitative comparative analysis based on the frequency domain representation of the time series, in which the similarity metric is the correlation between dominant spectral components.
These three approaches are tested on real data and pathways, and a comparison is performed using Information Retrieval benchmark tools, indicating the frequency approach as the best similarity metric among the three, for its ability to detect the correlation based on the correspondence of the most significant frequency components. Copyright © 2013. Published by Elsevier Ireland Ltd.
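The second metric in the abstract above, Dynamic Time Warping, can be illustrated with a minimal sketch. This is the textbook dynamic-programming formulation with an absolute-difference local cost, not the authors' specific application; the function name `dtw_distance` and the toy series are assumptions made for illustration only.

```python
def dtw_distance(a, b):
    """Textbook DTW: minimal cumulative |a_i - b_j| over all monotone alignments."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical expression profiles: the second is the first with a one-step delay.
upstream = [0.0, 1.0, 2.0, 1.0, 0.0]
delayed = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
distance = dtw_distance(upstream, delayed)
```

Because warping can absorb the delay, the two toy profiles align at zero cost, which is the property the abstract refers to as detecting similarity "even in the presence of stretching and delays".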
Subgrouping Automata: automatic sequence subgrouping using phylogenetic tree-based optimum subgrouping algorithm.

PubMed

Seo, Joo-Hyun; Park, Jihyang; Kim, Eun-Mi; Kim, Juhan; Joo, Keehyoung; Lee, Jooyoung; Kim, Byung-Gee

2014-02-01

Sequence subgrouping for a given sequence set can enable various informative tasks such as the functional discrimination of sequence subsets and the functional inference of unknown sequences. Because an identity threshold for sequence subgrouping may vary according to the given sequence set, it is highly desirable to construct a robust subgrouping algorithm which automatically identifies an optimal identity threshold and generates subgroups for a given sequence set. To this end, an automatic sequence subgrouping method, named 'Subgrouping Automata', was constructed. First, the tree analysis module analyzes the structure of the tree and calculates all possible subgroups in each node. The sequence similarity analysis module calculates the average sequence similarity for all subgroups in each node. The representative sequence generation module finds a representative sequence using profile analysis and self-scoring for each subgroup. For all nodes, average sequence similarities are calculated and 'Subgrouping Automata' searches for a node showing the statistically maximum sequence similarity increase using Student's t-value. A node showing the maximum t-value, which gives the most significant differences in average sequence similarity between two adjacent nodes, is determined as an optimum subgrouping node in the phylogenetic tree. Further analysis showed that the optimum subgrouping node from SA prevents under-subgrouping and over-subgrouping. Copyright © 2013. Published by Elsevier Ltd.
Prediction of microRNAs Associated with Human Diseases Based on Weighted k Most Similar Neighbors

PubMed Central

Guo, Maozu; Guo, Yahong; Li, Jinbao; Ding, Jian; Liu, Yong; Dai, Qiguo; Li, Jin; Teng, Zhixia; Huang, Yufei

2013-01-01

Background The identification of human disease-related microRNAs (disease miRNAs) is important for further investigating their involvement in the pathogenesis of diseases. More experimentally validated miRNA-disease associations have been accumulated recently. On the basis of these associations, it is essential to predict disease miRNAs for various human diseases. It is useful in providing reliable disease miRNA candidates for subsequent experimental studies. Methodology/Principal Findings It is known that miRNAs with similar functions are often associated with similar diseases and vice versa. Therefore, the functional similarity of two miRNAs has been successfully estimated by measuring the semantic similarity of their associated diseases. To effectively predict disease miRNAs, we calculated the functional similarity by incorporating the information content of disease terms and phenotype similarity between diseases. Furthermore, the members of miRNA family or cluster are assigned higher weight since they are more probably associated with similar diseases. A new prediction method, HDMP, based on weighted k most similar neighbors is presented for predicting disease miRNAs. Experiments validated that HDMP achieved significantly higher prediction performance than existing methods. In addition, the case studies examining prostatic neoplasms, breast neoplasms, and lung neoplasms, showed that HDMP can uncover potential disease miRNA candidates. Conclusions The superior performance of HDMP can be attributed to the accurate measurement of miRNA functional similarity, the weight assignment based on miRNA family or cluster, and the effective prediction based on weighted k most similar neighbors. The online prediction and analysis tool is freely available at http://nclab.hit.edu.cn/hdmpred. PMID:23950912
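The idea of scoring a candidate by its weighted k most similar neighbors can be sketched as follows. This is a simplified illustration, not the HDMP scoring formula, which the abstract does not give; the function `knn_score`, the `family_weight` parameter, and all miRNA names and similarity values are hypothetical.

```python
def knn_score(candidate, similarity, known_associated, k=2,
              family=None, family_weight=2.0):
    """Sum the similarities of the k nearest neighbors already linked to the disease.

    Neighbors in the same family as the candidate get a higher weight
    (an assumed weighting scheme, for illustration only).
    """
    row = similarity[candidate]
    neighbors = sorted(
        (m for m in row if m != candidate),
        key=lambda m: row[m],
        reverse=True,
    )[:k]
    cand_family = family.get(candidate) if family else None
    score = 0.0
    for m in neighbors:
        if m not in known_associated:
            continue
        same_family = cand_family is not None and family.get(m) == cand_family
        score += (family_weight if same_family else 1.0) * row[m]
    return score


# Hypothetical functional similarities of one candidate to three other miRNAs.
similarity = {"miR-A": {"miR-B": 0.5, "miR-C": 0.25, "miR-D": 0.125}}
known = {"miR-B", "miR-D"}                       # already associated with the disease
family = {"miR-A": "fam1", "miR-B": "fam1"}      # assumed family membership

plain = knn_score("miR-A", similarity, known, k=2)
weighted = knn_score("miR-A", similarity, known, k=2, family=family)
```

With k=2 only miR-B and miR-C are considered; miR-B is a known disease miRNA and shares the candidate's family, so its contribution doubles in the weighted variant.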
Picosecond laser welding of similar and dissimilar materials.

PubMed

Carter, Richard M; Chen, Jianyong; Shephard, Jonathan D; Thomson, Robert R; Hand, Duncan P

2014-07-01

We report picosecond laser welding of similar and dissimilar materials based on plasma formation induced by a tightly focused beam from a 1030 nm, 10 ps, 400 kHz laser system. Specifically, we demonstrate the welding of fused silica, borosilicate, and sapphire to a range of materials including borosilicate, fused silica, silicon, copper, aluminum, and stainless steel. Dissimilar material welding of glass to aluminum and stainless steel has not been previously reported. Analysis of the borosilicate-to-borosilicate weld strength compares well to those obtained using similar welding systems based on femtosecond lasers. There is, however, a strong requirement to prepare surfaces to a high (10-60 nm Ra) flatness to ensure a successful weld.
Protein structure similarity from Principle Component Correlation analysis.

PubMed

Zhou, Xiaobo; Chou, James; Wong, Stephen T C

2006-01-25

Owing to rapid expansion of protein structure databases in recent years, methods of structure comparison are becoming increasingly effective and important in revealing novel information on functional properties of proteins and their roles in the grand scheme of evolutionary biology. Currently, the structural similarity between two proteins is measured by the root-mean-square-deviation (RMSD) in their best-superimposed atomic coordinates. RMSD is the golden rule of measuring structural similarity when the structures are nearly identical; it, however, fails to detect the higher order topological similarities in proteins evolved into different shapes. We propose new algorithms for extracting geometrical invariants of proteins that can be effectively used to identify homologous protein structures or topologies in order to quantify both close and remote structural similarities. We measure structural similarity between proteins by correlating the principle components of their secondary structure interaction matrix. In our approach, the Principle Component Correlation (PCC) analysis, a symmetric interaction matrix for a protein structure is constructed with relationship parameters between secondary elements that can take the form of distance, orientation, or other relevant structural invariants. When using a distance-based construction in the presence or absence of encoded N to C terminal sense, there are strong correlations between the principle components of interaction matrices of structurally or topologically similar proteins. The PCC method is extensively tested for protein structures that belong to the same topological class but are significantly different by RMSD measure. The PCC analysis can also differentiate proteins having similar shapes but different topological arrangements. 
Additionally, we demonstrate that when using two independently defined interaction matrices, comparison of their maximum eigenvalues can be highly effective in clustering structurally or topologically similar proteins. We believe that the PCC analysis of interaction matrix is highly flexible in adopting various structural parameters for protein structure comparison.

Identification, genetic localization, and allelic diversity of selectively amplified microsatellite polymorphic loci in lettuce and wild relatives (Lactuca spp.).

PubMed

Witsenboer, H; Michelmore, R W; Vogel, J

1997-12-01

Selectively amplified microsatellite polymorphic locus (SAMPL) analysis is a method of amplifying microsatellite loci using generic PCR primers. SAMPL analysis uses one AFLP primer in combination with a primer complementary to microsatellite sequences. SAMPL primers based on compound microsatellite sequences provided the clearest amplification patterns. We explored the potential of SAMPL analysis in lettuce to detect PCR-based codominant microsatellite markers. Fifty-eight SAMPLs were identified and placed on the genetic map. Seventeen were codominant. SAMPLs were dispersed with RFLP markers on 11 of the 12 main linkage groups in lettuce, indicating that they have a similar genomic distribution. Some but not all fragments amplified by SAMPL analysis were confirmed to contain microsatellite sequences by Southern hybridization. Forty-five cultivars of lettuce and five wild species of Lactuca were analyzed to determine the allelic diversity for codominant SAMPLs. From 3 to 11 putative alleles were found for each SAMPL; 2-6 alleles were found within Lactuca sativa and 1-3 alleles were found among the crisphead genotypes, the most genetically homogeneous plant type of L. sativa. This allelic diversity is greater than that found for RFLP markers. Numerous new alleles were observed in the wild species; however, there were frequent null alleles. Therefore, SAMPL analysis is more applicable to intraspecific than to interspecific comparisons. A phenetic analysis based on SAMPLs resulted in a dendrogram similar to those based on RFLP and AFLP markers.
Genetic relatedness of Brazilian Colletotrichum truncatum isolates assessed by vegetative compatibility groups and RAPD analysis.

PubMed

Sant'Anna, Juliane R; Miyamoto, Cláudia T; Rosada, Lúcia J; Franco, Claudinéia C S; Kaneshima, Edilson N; Castro-Prado, Marialba A A

2010-01-01

The genetic variation among nine soybean-originating isolates of Colletotrichum truncatum from different Brazilian states was studied. Nitrate non-utilizing (nit) mutants were obtained with potassium chlorate and used to characterize vegetative compatibility reactions, heterokaryosis and RAPD profile. Based on pairings of nit mutants from the different isolates, five vegetative complementation groups (VCG) were identified, and barriers to the formation of heterokaryons were observed among isolates derived from the same geographic area. No complementation was observed among any of the nit mutants recovered from the isolate A, which was designated heterokaryon-self-incompatible. Based on RAPD analysis, a polymorphism was detected among the wild isolate C and its nit1 and NitM mutants. RAPD amplification, with five different primers, also showed polymorphic profiles among Brazilian C. truncatum isolates. Dendrogram analysis resulted in a similarity degree ranging between 0.331 and 0.882 among isolates and identified three RAPD groups. Despite the lack of a correlation between the RAPD analysis and the vegetative compatibility grouping, results demonstrated the potential of VCG analysis to differentiate C. truncatum isolates genotypically similar when compared by RAPD.
A Method for Populating the Knowledge Base of APTAS, a Domain-Oriented Application Composition System

DTIC Science & Technology

1993-12-01

proposed a domain analysis approach called Feature-Oriented Domain Analysis (FODA). The approach identifies prominent features (similarities) and...characteristics of software systems in the domain. Unlike the other domain analysis approaches we have summarized, the researchers described FODA in...Domain Analysis (FODA) Feasibility Study. Technical Report, Software Engineering Institute, Carnegie Mellon University, November 1990. 19. Lee, Kenneth
Optimal Threshold Determination for Interpreting Semantic Similarity and Particularity: Application to the Comparison of Gene Sets and Metabolic Pathways Using GO and ChEBI

PubMed Central

Bettembourg, Charles; Diot, Christian; Dameron, Olivier

2015-01-01

Background The analysis of gene annotations referencing back to Gene Ontology plays an important role in the interpretation of high-throughput experiments results. This analysis typically involves semantic similarity and particularity measures that quantify the importance of the Gene Ontology annotations. However, there is currently no sound method supporting the interpretation of the similarity and particularity values in order to determine whether two genes are similar or whether one gene has some significant particular function. Interpretation is frequently based either on an implicit threshold, or an arbitrary one (typically 0.5). Here we investigate a method for determining thresholds supporting the interpretation of the results of a semantic comparison. Results We propose a method for determining the optimal similarity threshold by minimizing the proportions of false-positive and false-negative similarity matches. We compared the distributions of the similarity values of pairs of similar genes and pairs of non-similar genes. These comparisons were performed separately for all three branches of the Gene Ontology. In all situations, we found overlap between the similar and the non-similar distributions, indicating that some similar genes had a similarity value lower than the similarity value of some non-similar genes. We then extend this method to the semantic particularity measure and to a similarity measure applied to the ChEBI ontology. Thresholds were evaluated over the whole HomoloGene database. For each group of homologous genes, we computed all the similarity and particularity values between pairs of genes. Finally, we focused on the PPAR multigene family to show that the similarity and particularity patterns obtained with our thresholds were better at discriminating orthologs and paralogs than those obtained using default thresholds. Conclusion We developed a method for determining optimal semantic similarity and particularity thresholds. 
We applied this method on the GO and ChEBI ontologies. Qualitative analysis using the thresholds on the PPAR multigene family yielded biologically-relevant patterns. PMID:26230274
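The threshold selection described above can be sketched as a one-dimensional search. The abstract says only that the proportions of false positives and false negatives are minimized; treating the objective as their unweighted sum is an assumption, and the function name `optimal_threshold` and the toy scores are hypothetical.

```python
def optimal_threshold(similar_scores, nonsimilar_scores):
    """Return (threshold, error) minimizing FN proportion + FP proportion.

    A pair is called "similar" when its score is >= threshold.
    Ties are resolved in favor of the lowest threshold.
    """
    candidates = sorted(set(similar_scores) | set(nonsimilar_scores))
    best_t, best_err = None, float("inf")
    for t in candidates:
        # similar pairs wrongly rejected
        fn = sum(s < t for s in similar_scores) / len(similar_scores)
        # non-similar pairs wrongly accepted
        fp = sum(s >= t for s in nonsimilar_scores) / len(nonsimilar_scores)
        if fn + fp < best_err:
            best_t, best_err = t, fn + fp
    return best_t, best_err


# Hypothetical semantic similarity values with overlapping distributions.
similar = [0.9, 0.8, 0.7, 0.4]
nonsimilar = [0.1, 0.2, 0.3, 0.5]
threshold, error = optimal_threshold(similar, nonsimilar)
```

In this toy data one non-similar pair (0.5) scores above one similar pair (0.4), mirroring the overlap reported in the abstract, so no threshold can reach zero error.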
Gender similarities and differences in brain activation strategies: Voxel-based meta-analysis on fMRI studies.

PubMed

AlRyalat, Saif Aldeen

2017-01-01

Gender similarities and differences have long been a matter of debate in almost all human research, especially upon reaching the discussion about brain functions. This large scale meta-analysis was performed on functional MRI studies. It included more than 700 active brain foci from more than 70 different experiments to study gender related similarities and differences in brain activation strategies for three of the main brain functions: Visual-spatial cognition, memory, and emotion. Areas that are significantly activated by both genders (i.e. core areas) for the tested brain function are mentioned, whereas those areas significantly activated exclusively in one gender are the gender specific areas. During visual-spatial cognition tasks, and in addition to the core areas, males significantly activated their left superior frontal gyrus, compared with the left superior parietal lobule in females. For memory tasks, several different brain areas were activated by each gender, but females significantly activated two areas from the limbic system during memory retrieval tasks. For emotional tasks, males tend to recruit their bilateral prefrontal regions, whereas females tend to recruit their bilateral amygdalae. This meta-analysis provides an overview based on functional MRI studies on how males and females use their brain.
Analysis of concrete beams using applied element method

NASA Astrophysics Data System (ADS)

Lincy Christy, D.; Madhavan Pillai, T. M.; Nagarajan, Praveen

2018-03-01

The Applied Element Method (AEM) is a displacement-based method of structural analysis. Some of its features are similar to those of the Finite Element Method (FEM). In AEM, the structure is analysed by dividing it into several elements, as in FEM. In AEM, however, elements are connected by springs instead of nodes as in FEM. In this paper, the background to AEM is discussed and the necessary equations are derived. To illustrate the application of AEM, it has been used to analyse a plain concrete beam with fixed supports. The analysis is limited to 2-dimensional structures. It was found that the number of springs does not have much influence on the results. AEM could predict deflection and reactions with a reasonable degree of accuracy.
Probing multi-scale self-similarity of tissue structures using light scattering spectroscopy: prospects in pre-cancer detection

NASA Astrophysics Data System (ADS)

Chatterjee, Subhasri; Das, Nandan K.; Kumar, Satish; Mohapatra, Sonali; Pradhan, Asima; Panigrahi, Prasanta K.; Ghosh, Nirmalya

2013-02-01

Multi-resolution analysis on the spatial refractive index inhomogeneities in the connective tissue regions of human cervix reveals clear signature of multifractality. We have thus developed an inverse analysis strategy for extraction and quantification of the multifractality of spatial refractive index fluctuations from the recorded light scattering signal. The method is based on Fourier domain pre-processing of light scattering data using Born approximation, and its subsequent analysis through Multifractal Detrended Fluctuation Analysis model. The method has been validated on several mono- and multi-fractal scattering objects whose self-similar properties are user controlled and known a-priori. Following successful validation, this approach has initially been explored for differentiating between different grades of precancerous human cervical tissues.
RNA-Seq-based toxicogenomic assessment of fresh frozen and formalin-fixed tissues yields similar mechanistic insights.

PubMed

Auerbach, Scott S; Phadke, Dhiral P; Mav, Deepak; Holmgren, Stephanie; Gao, Yuan; Xie, Bin; Shin, Joo Heon; Shah, Ruchir R; Merrick, B Alex; Tice, Raymond R

2015-07-01

Formalin-fixed, paraffin-embedded (FFPE) pathology specimens represent a potentially vast resource for transcriptomic-based biomarker discovery. We present here a comparison of results from a whole transcriptome RNA-Seq analysis of RNA extracted from fresh frozen and FFPE livers. The samples were derived from rats exposed to aflatoxin B1 (AFB1) and a corresponding set of control animals. Principal components analysis indicated that samples were separated in the two groups representing presence or absence of chemical exposure, both in fresh frozen and FFPE sample types. Sixty-five percent of the differentially expressed transcripts (AFB1 vs. controls) in fresh frozen samples were also differentially expressed in FFPE samples (overlap significance: P < 0.0001). Genomic signature and gene set analysis of AFB1 differentially expressed transcript lists indicated highly similar results between fresh frozen and FFPE at the level of chemogenomic signatures (i.e., single chemical/dose/duration elicited transcriptomic signatures), mechanistic and pathology signatures, biological processes, canonical pathways and transcription factor networks. Overall, our results suggest that similar hypotheses about the biological mechanism of toxicity would be formulated from fresh frozen and FFPE samples. These results indicate that phenotypically anchored archival specimens represent a potentially informative resource for signature-based biomarker discovery and mechanistic characterization of toxicity. Copyright © 2014 John Wiley & Sons, Ltd.
Structure-wise discrimination of cytosine, thymine, and uracil by proteins in terms of their nonbonded interactions.

PubMed

Usha, S; Selvaraj, S

2014-01-01

The molecular recognition and discrimination of very similar ligand moieties by proteins are important subjects in protein-ligand interaction studies. Specificity in the recognition of molecules is determined by the arrangement of protein and ligand atoms in space. The three pyrimidine bases, viz. cytosine, thymine, and uracil, are structurally similar, but the proteins that bind to them are able to discriminate them and form interactions. Since nonbonded interactions are responsible for molecular recognition processes in biological systems, our work attempts to understand some of the underlying principles of such recognition of pyrimidine molecular structures by proteins. The preferences of the amino acid residues to contact the pyrimidine bases in terms of nonbonded interactions; amino acid residue-ligand atom preferences; main chain and side chain atom contributions of amino acid residues; and solvent-accessible surface area of ligand atoms when forming complexes are analyzed. Our analysis shows that the amino acid residues tyrosine and phenylalanine are highly involved in the pyrimidine interactions. Arginine prefers contacts with the cytosine base. The similarities and differences that exist between the interactions of the amino acid residues with each of the three pyrimidine base atoms in our analysis provide insights that can be exploited in designing specific inhibitors competitive to the ligands.
Pretreatment and integrated analysis of spectral data reveal seaweed similarities based on chemical diversity.

PubMed

Wei, Feifei; Ito, Kengo; Sakata, Kenji; Date, Yasuhiro; Kikuchi, Jun

2015-03-03

Extracting useful information from high dimensionality and large data sets is a major challenge for data-driven approaches. The present study was aimed at developing novel integrated analytical strategies for comprehensively characterizing seaweed similarities based on chemical diversity. The chemical compositions of 107 seaweed and 2 seagrass samples were analyzed using multiple techniques, including Fourier transform infrared (FT-IR) and solid- and solution-state nuclear magnetic resonance (NMR) spectroscopy, thermogravimetry-differential thermal analysis (TG-DTA), inductively coupled plasma-optical emission spectrometry (ICP-OES), CHNS/O total elemental analysis, and isotope ratio mass spectrometry (IR-MS). The spectral data were preprocessed using non-negative matrix factorization (NMF) and NMF combined with multivariate curve resolution-alternating least-squares (MCR-ALS) methods in order to separate individual component information from the overlapping and/or broad spectral peaks. Integrated analysis of the preprocessed chemical data demonstrated distinct discrimination of differential seaweed species. Further network analysis revealed a close correlation between the heavy metal elements and characteristic components of brown algae, such as cellulose, alginic acid, and sulfated mucopolysaccharides, providing a componential basis for its metal-sorbing potential. These results suggest that this integrated analytical strategy is useful for extracting and identifying the chemical characteristics of diverse seaweeds based on large chemical data sets, particularly complicated overlapping spectral data.
RNA-TVcurve: a Web server for RNA secondary structure comparison based on a multi-scale similarity of its triple vector curve representation.

PubMed

Li, Ying; Shi, Xiaohu; Liang, Yanchun; Xie, Juan; Zhang, Yu; Ma, Qin

2017-01-21

RNAs have been found to carry diverse functionalities in nature. Inferring the similarity between two given RNAs is a fundamental step to understand and interpret their functional relationship. The majority of functional RNAs show conserved secondary structures, rather than sequence conservation. Those algorithms relying on sequence-based features usually have limitations in their prediction performance. Hence, integrating RNA structure features is very critical for RNA analysis. Existing algorithms mainly fall into two categories: alignment-based and alignment-free. The alignment-free algorithms of RNA comparison usually have lower time complexity than alignment-based algorithms. An alignment-free RNA comparison algorithm was proposed, in which novel numerical representations RNA-TVcurve (triple vector curve representation) of RNA sequence and corresponding secondary structure features are provided. Then a multi-scale similarity score of two given RNAs was designed based on wavelet decomposition of their numerical representation. In support of RNA mutation and phylogenetic analysis, a web server (RNA-TVcurve) was designed based on this alignment-free RNA comparison algorithm. It provides three functional modules: 1) visualization of numerical representation of RNA secondary structure; 2) detection of single-point mutation based on secondary structure; and 3) comparison of pairwise and multiple RNA secondary structures. The inputs of the web server require RNA primary sequences, while corresponding secondary structures are optional. For the primary sequences alone, the web server can compute the secondary structures using free energy minimization algorithm in terms of RNAfold tool from Vienna RNA package. RNA-TVcurve is the first integrated web server, based on an alignment-free method, to deliver a suite of RNA analysis functions, including visualization, mutation analysis and multiple RNAs structure comparison. 
The comparison results with two popular RNA comparison tools, RNApdist and RNAdistance, showcased that RNA-TVcurve can efficiently capture subtle relationships among RNAs for mutation detection and non-coding RNA classification. All the relevant results were shown in an intuitive graphical manner, and can be freely downloaded from this server. RNA-TVcurve, along with test examples and detailed documents, are available at: http://ml.jlu.edu.cn/tvcurve/ .
Predicting Raters’ Transparency Judgments of English and Chinese Morphological Constituents using Latent Semantic Analysis

PubMed Central

Wang, Hsueh-Cheng; Hsu, Li-Chuan; Tien, Yi-Min; Pomplun, Marc

2013-01-01

The morphological constituents of English compounds (e.g., “butter” and “fly” for “butterfly”) and two-character Chinese compounds may differ in meaning from the whole word. Subjective differences and ambiguity of transparency make the judgments difficult, and a computational alternative based on a general model may be a way to average across subjective differences. The current study proposes two approaches based on Latent Semantic Analysis (Landauer & Dumais, 1997): Model 1 compares the semantic similarity between a compound word and each of its constituents, and Model 2 derives the dominant meaning of a constituent based on a clustering analysis of morphological family members (e.g., “butterfingers” or “buttermilk” for “butter”). The proposed models successfully predicted participants’ transparency ratings, and we recommend that experimenters use Model 1 for English compounds and Model 2 for Chinese compounds, due to raters’ morphological processing in different writing systems. The dominance of lexical meaning, semantic transparency, and the average similarity between all pairs within a morphological family are provided, and practical applications for future studies are discussed. PMID:23784009
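Model 1 above compares a compound with each of its constituents in a semantic space, where similarity is conventionally measured as the cosine between vectors. A minimal sketch follows; the three-dimensional vectors are invented for illustration and are not real Latent Semantic Analysis output, and the function name `cosine` is an assumption.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Made-up vectors standing in for rows of an LSA space.
vectors = {
    "butterfly": [0.1, 0.9, 0.2],
    "butter": [0.9, 0.1, 0.0],
    "fly": [0.2, 0.8, 0.3],
}

# Higher cosine = the constituent is more semantically transparent in the compound.
transparency = {
    part: cosine(vectors["butterfly"], vectors[part]) for part in ("butter", "fly")
}
```

With these invented vectors "fly" scores higher than "butter", which is the kind of per-constituent contrast a transparency rating is meant to capture.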
Tensor-product kernel-based representation encoding joint MRI view similarity.

PubMed

Alvarez-Meza, A; Cardenas-Pena, D; Castro-Ospina, A E; Alvarez, M; Castellanos-Dominguez, G

2014-01-01

To support 3D magnetic resonance image (MRI) analysis, a marginal image similarity (MIS) matrix holding the MR inter-slice relationship along every axis view (Axial, Coronal, and Sagittal) can be estimated. However, mutual inference from MIS view information poses a difficult task since relationships between axes are nonlinear. To overcome this issue, we introduce a Tensor-Product Kernel-based Representation (TKR) that allows encoding brain structure patterns due to patient differences, gathering all MIS matrices into a single joint image similarity framework. The TKR training strategy is carried out in a low dimensional projected space to reduce the influence of voxel-derived noise. Results obtained for classifying the considered patient categories (gender and age) on a real MRI database show that the proposed TKR training approach outperforms the conventional voxel-wise sum of squared differences. The proposed approach may be useful to support MRI clustering and similarity inference tasks, which are required in template-based image segmentation and atlas construction.
Assessing HTS Performance Using BioAssay Ontology: Screening and Analysis of a Bacterial Phospho-N-Acetylmuramoyl-Pentapeptide Translocase Campaign

PubMed Central

Moberg, Andreas; Hansson, Eva; Boyd, Helen

2014-01-01

Abstract With the public availability of biochemical assays and screening data constantly increasing, new applications for data mining and method analysis are evolving in parallel. One example is BioAssay Ontology (BAO) for systematic classification of assays based on screening setup and metadata annotations. In this article we report a high-throughput screening (HTS) against phospho-N-acetylmuramoyl-pentapeptide translocase (MraY), an attractive antibacterial drug target involved in peptidoglycan synthesis. The screen resulted in novel chemistry identification using a fluorescence resonance energy transfer assay. To address a subset of the false positive hits, a frequent hitter analysis was performed using an approach in which MraY hits were compared with hits from similar assays, previously used for HTS. The MraY assay was annotated according to BAO and three internal reference assays, using a similar assay design and detection technology, were identified. Analyzing the assays retrospectively, it was clear that both MraY and the three reference assays all showed a high false positive rate in the primary HTS assays. In the case of MraY, false positives were efficiently identified by applying a method to correct for compound interference at the hit-confirmation stage. Frequent hitter analysis based on the three reference assays with similar assay method identified additional false actives in the primary MraY assay as frequent hitters. This article demonstrates how assays annotated using BAO terms can be used to identify closely related reference assays, and that analysis based on these assays clearly can provide useful data to influence assay design, technology, and screening strategy. PMID:25415593
The Adversarial Route Analysis Tool: A Web Application

DOE Office of Scientific and Technical Information (OSTI.GOV)

Casson, William H. Jr.

2012-08-02

The Adversarial Route Analysis Tool is a type of Google Maps for adversaries. It's a web-based geospatial application similar to Google Maps. It helps the U.S. government plan operations that predict where an adversary might be. It's easily accessible and maintainable, and it's simple to use without much training.
Streaming, Tracking and Reading Achievement: A Multilevel Analysis of Students in 40 Countries

ERIC Educational Resources Information Center

Chiu, Ming Ming; Chow, Bonnie Wing-Yin; Joh, Sung Wook

2017-01-01

Grouping similar students together within schools ("streaming") or classrooms ("tracking") based on past literacy skills (reported by parents), family socioeconomic status (SES) or reading attitudes might affect their reading achievement. Our multilevel analysis of the reading tests of 208,057 fourth-grade students across 40…
Small Oscillations via Conservation of Energy

ERIC Educational Resources Information Center

Troy, Tia; Reiner, Megan; Haugen, Andrew J.; Moore, Nathan T.

2017-01-01

The work describes an analogy-based small oscillations analysis of a standard static equilibrium lab problem. In addition to force analysis, a potential energy function for the system is developed, and by drawing out mathematical similarities to the simple harmonic oscillator, we are able to describe (and experimentally verify) the period of small…
A comparison of two computer-automated semen analysis instruments for the evaluation of sperm motion characteristics in the stallion.

PubMed

Jasko, D J; Lein, D H; Foote, R H

1990-01-01

Two commercially available computer-automated semen analysis instruments (CellSoft Automated Semen Analyzer and HTM-2000 Motion Analyzer) were compared for their ability to report similar results based on the analysis of pre-recorded video tapes of extended, motile stallion semen. The determinations of the percentage of motile cells by these instruments were more similar than the comparisons between subjective estimates and either instrument. However, mean values obtained from the same sample may still differ by as much as 30 percentage units between instruments. Instruments varied with regard to the determinations of mean sperm curvilinear velocity and sperm concentration, but mean sperm linearity determinations were similar between the instruments. We concluded that the determinations of sperm motion characteristics by subjective estimation, CellSoft Automated Semen Analyzer, and HTM-2000 Motility Analyzer are often dissimilar, making direct comparisons of results difficult.
Systolic Processor Array For Recognition Of Spectra

NASA Technical Reports Server (NTRS)

Chow, Edward T.; Peterson, John C.

1995-01-01

Spectral signatures of materials detected and identified quickly. Spectral Analysis Systolic Processor Array (SPA2) relatively inexpensive and satisfies need to analyze large, complex volume of multispectral data generated by imaging spectrometers to extract desired information: computational performance needed to do this in real time exceeds that of current supercomputers. Locates highly similar segments or contiguous subsegments in two different spectra at time. Compares sampled spectra from instruments with data base of spectral signatures of known materials. Computes and reports scores that express degrees of similarity between sampled and data-base spectra.
Certification report for the CALMAC solar powered pump

NASA Technical Reports Server (NTRS)

1978-01-01

The certification of the CALMAC solar powered thermopump is presented. Each element of the specification is delineated, together with the verification, based on analysis, similarity, inspection, or testing.

Measuring content overlap during handoff communication using distributional semantics: An exploratory study.

PubMed

Abraham, Joanna; Kannampallil, Thomas G; Srinivasan, Vignesh; Galanter, William L; Tagney, Gail; Cohen, Trevor

2017-01-01

We develop and evaluate a methodological approach to measure the degree and nature of overlap in handoff communication content within and across clinical professions. This extensible, exploratory approach relies on combining techniques from conversational analysis and distributional semantics. We audio-recorded handoff communication of residents and nurses on the General Medicine floor of a large academic hospital (n=120 resident and n=120 nurse handoffs). We measured semantic similarity, a proxy for content overlap, between resident-resident and nurse-nurse communication using multiple steps: a qualitative conversational content analysis; an automated semantic similarity analysis using Reflective Random Indexing (RRI); and comparing semantic similarity generated by RRI analysis with human ratings of semantic similarity. There was significant association between the semantic similarity as computed by the RRI method and human rating (ρ=0.88). Based on the semantic similarity scores, content overlap was relatively higher for content related to patient active problems, assessment of active problems, patient-identifying information, past medical history, and medications/treatments. In contrast, content overlap was limited on content related to allergies, family-related information, code status, and anticipatory guidance. Our approach using RRI analysis provides new opportunities for characterizing the nature and degree of overlap in handoff communication. Although exploratory, this method provides a basis for identifying content that can be used for determining shared understanding across clinical professions. Additionally, this approach can inform the development of flexibly standardized handoff tools that reflect the clinical content that is most appropriate for fostering shared understanding during transitions of care. Copyright © 2016 Elsevier Inc. All rights reserved.
Agent-patient similarity affects sentence structure in language production: evidence from subject omissions in Mandarin

PubMed Central

Hsiao, Yaling; Gao, Yannan; MacDonald, Maryellen C.

2014-01-01

Interference effects from semantically similar items are well-known in studies of single word production, where the presence of semantically similar distractor words slows picture naming. This article examines the consequences of this interference in sentence production and tests the hypothesis that in situations of high similarity-based interference, producers are more likely to omit one of the interfering elements than when there is low semantic similarity and thus low interference. This work investigated language production in Mandarin, which allows subject noun phrases to be omitted in discourse contexts in which the subject entity has been previously mentioned in the discourse. We hypothesize that Mandarin speakers omit the subject more often when the subject and the object entities are conceptually similar. A corpus analysis of simple transitive sentences found higher rates of subject omission when both the subject and object were animate (potentially yielding similarity-based interference) than when the subject was animate and object was inanimate. A second study manipulated subject-object animacy in a picture description task and replicated this result: participants omitted the animate subject more often when the object was also animate than when it was inanimate. These results suggest that similarity-based interference affects sentence forms, particularly when the agent of the action is mentioned in the sentence. Alternatives and mechanisms for this effect are discussed. PMID:25278915
Measuring User Similarity Using Electric Circuit Analysis: Application to Collaborative Filtering

PubMed Central

Yang, Joonhyuk; Kim, Jinwook; Kim, Wonjoon; Kim, Young Hwan

2012-01-01

We propose a new technique of measuring user similarity in collaborative filtering using electric circuit analysis. Electric circuit analysis is used to measure the potential differences between nodes on an electric circuit. In this paper, by applying this method to transaction networks comprising users and items, i.e., user–item matrix, and by using the full information about the relationship structure of users in the perspective of item adoption, we overcome the limitations of one-to-one similarity calculation approach, such as the Pearson correlation, Tanimoto coefficient, and Hamming distance, in collaborative filtering. We found that electric circuit analysis can be successfully incorporated into recommender systems and has the potential to significantly enhance predictability, especially when combined with user-based collaborative filtering. We also propose four types of hybrid algorithms that combine the Pearson correlation method and electric circuit analysis. One of the algorithms exceeds the performance of the traditional collaborative filtering by 37.5% at most. This work opens new opportunities for interdisciplinary research between physics and computer science and the development of new recommendation systems. PMID:23145095
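For orientation, the three one-to-one measures named in the abstract can be sketched as follows. This is a minimal illustration of the textbook definitions on binary item-adoption vectors, not the authors' circuit-based method; the function names and the two example users are hypothetical.

```python
import math

def pearson(u, v):
    """Pearson correlation between two equal-length rating/adoption vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    # A constant vector has no defined correlation; return 0.0 by convention here.
    return cov / (su * sv) if su and sv else 0.0

def tanimoto(u, v):
    """Tanimoto (Jaccard) coefficient: shared items over items adopted by either user."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0

def hamming(u, v):
    """Hamming distance: number of items on which the two users differ."""
    return sum(1 for a, b in zip(u, v) if a != b)

# Hypothetical users: 1 = item adopted, 0 = not adopted.
alice = [1, 1, 0, 1, 0]
bob = [1, 0, 0, 1, 1]
print(pearson(alice, bob), tanimoto(alice, bob), hamming(alice, bob))
```

Each measure compares only the two users' own vectors, which is the "one-to-one" limitation the abstract contrasts with a network-wide approach.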
Measuring user similarity using electric circuit analysis: application to collaborative filtering.

PubMed

Yang, Joonhyuk; Kim, Jinwook; Kim, Wonjoon; Kim, Young Hwan

2012-01-01

We propose a new technique of measuring user similarity in collaborative filtering using electric circuit analysis. Electric circuit analysis is used to measure the potential differences between nodes on an electric circuit. In this paper, by applying this method to transaction networks comprising users and items, i.e., user-item matrix, and by using the full information about the relationship structure of users in the perspective of item adoption, we overcome the limitations of one-to-one similarity calculation approach, such as the Pearson correlation, Tanimoto coefficient, and Hamming distance, in collaborative filtering. We found that electric circuit analysis can be successfully incorporated into recommender systems and has the potential to significantly enhance predictability, especially when combined with user-based collaborative filtering. We also propose four types of hybrid algorithms that combine the Pearson correlation method and electric circuit analysis. One of the algorithms exceeds the performance of the traditional collaborative filtering by 37.5% at most. This work opens new opportunities for interdisciplinary research between physics and computer science and the development of new recommendation systems.
A Method for Populating the Knowledge Base of AFIT’s Domain-Oriented Application Composition System

DTIC Science & Technology

1993-12-01

Analysis (FODA). The approach identifies prominent features (similarities) and distinctive features (differences) of software systems within an... analysis approaches we have summarized, the researchers described FODA in sufficient detail to use on large domain analysis projects (ones with...Software Technology Center, July 1991. 18. Kang, Kyo C. and others. Feature-Oriented Domain Analysis (FODA) Feasibility Study. Technical Report, Software
Study on store-space assignment based on logistic AGV in e-commerce goods to person picking pattern

NASA Astrophysics Data System (ADS)

Xu, Lijuan; Zhu, Jie

2017-10-01

This paper studies store-space assignment based on logistic AGVs in the e-commerce goods-to-person picking pattern. It establishes a store-space assignment model that minimizes picking cost and designs a store-space assignment algorithm that follows a cluster analysis based on a similarity coefficient. An example analysis then compares the picking cost of the designed algorithm with that of assignment by item number and of storage allocation by ABC classification, and verifies the effectiveness of the designed store-space assignment algorithm.
Assessing similarity analysis of chromatographic fingerprints of Cyclopia subternata extracts as potential screening tool for in vitro glucose utilisation.

PubMed

Schulze, Alexandra E; De Beer, Dalene; Mazibuko, Sithandiwe E; Muller, Christo J F; Roux, Candice; Willenburg, Elize L; Nyunaï, Nyemb; Louw, Johan; Manley, Marena; Joubert, Elizabeth

2016-01-01

Similarity analysis of the phenolic fingerprints of a large number of aqueous extracts of Cyclopia subternata, obtained by high-performance liquid chromatography (HPLC), was evaluated as a potential tool to screen extracts for relative bioactivity. The assessment was based on the (dis)similarity of their fingerprints to that of a reference active extract of C. subternata, proven to enhance glucose uptake in vitro and in vivo. In vitro testing of extracts, selected as being most similar (n = 5; r ≥ 0.962) and most dissimilar (n = 5; r ≤ 0.688) to the reference active extract, showed that no clear pattern in terms of relative glucose uptake efficacy in C2C12 myocytes emerged, irrespective of the dose. Some of the most dissimilar extracts had higher glucose-lowering activity than the reference active extract. Principal component analysis revealed the major compounds responsible for the most variation within the chromatographic fingerprints, as mangiferin, isomangiferin, iriflophenone-3-C-β-D-glucoside-4-O-β-D-glucoside, iriflophenone-3-C-β-D-glucoside, scolymoside, and phloretin-3',5'-di-C-β-D-glucoside. Quantitative analysis of the selected extracts showed that the most dissimilar extracts contained the highest mangiferin and isomangiferin levels, whilst the most similar extracts had the highest scolymoside content. These compounds demonstrated similar glucose uptake efficacy in C2C12 myocytes. It can be concluded that (dis)similarity of chromatographic fingerprints of extracts of unknown activity to that of a proven bioactive extract does not necessarily translate to lower or higher bioactivity.
A dynamic model of reasoning and memory.

PubMed

Hawkins, Guy E; Hayes, Brett K; Heit, Evan

2016-02-01

Previous models of category-based induction have neglected how the process of induction unfolds over time. We conceive of induction as a dynamic process and provide the first fine-grained examination of the distribution of response times observed in inductive reasoning. We used these data to develop and empirically test the first major quantitative modeling scheme that simultaneously accounts for inductive decisions and their time course. The model assumes that knowledge of similarity relations among novel test probes and items stored in memory drive an accumulation-to-bound sequential sampling process: Test probes with high similarity to studied exemplars are more likely to trigger a generalization response, and more rapidly, than items with low exemplar similarity. We contrast data and model predictions for inductive decisions with a recognition memory task using a common stimulus set. Hierarchical Bayesian analyses across 2 experiments demonstrated that inductive reasoning and recognition memory primarily differ in the threshold to trigger a decision: Observers required less evidence to make a property generalization judgment (induction) than an identity statement about a previously studied item (recognition). Experiment 1 and a condition emphasizing decision speed in Experiment 2 also found evidence that inductive decisions use lower quality similarity-based information than recognition. The findings suggest that induction might represent a less cautious form of recognition. We conclude that sequential sampling models grounded in exemplar-based similarity, combined with hierarchical Bayesian analysis, provide a more fine-grained and informative analysis of the processes involved in inductive reasoning than is possible solely through examination of choice data. PsycINFO Database Record (c) 2016 APA, all rights reserved.
Meta-analysis Exploring the Effectiveness of S-1-Based Chemotherapy for Advanced Non-Small Cell Lung Cancer.

PubMed

Sun, Xin; Sun, Li; Zhang, Shu-Ling; Xiong, Zhi-Cheng; Ma, Jie-Tao; Han, Cheng-Bo

2017-01-01

S-1 is a new oral fluoropyrimidine formulation that comprises tegafur, 5-chloro-2,4-dihydroxypyridine, and potassium oxonate. S-1 is designed to enhance antitumor activity and to reduce gastrointestinal toxicity. Several studies have demonstrated that both S-1 monotherapy and S-1 combination regimens showed encouraging efficacies and mild toxicities in the treatment of lung squamous cell carcinoma and adenocarcinoma. However, it is unclear whether S-1 can be used as standard care in advanced non-small cell lung cancer (NSCLC). The purpose of this meta-analysis was to assess the efficacy and safety of S-1-based chemotherapy, compared with standard chemotherapy, in patients with locally advanced or metastatic NSCLC. Thirteen randomized controlled trials (RCTs) involving 2,134 patients with a similar ratio of different pathological types were included. In first-line or second-line chemotherapy, compared with standard chemotherapy, S-1-based chemotherapy showed similar efficacy in terms of median overall survival (mOS), median progression free survival (mPFS), and objective response rate (ORR) (all P > 0.1), and significantly reduced the incidence of grade ≥ 3 hematological toxicities. In patients with locally advanced NSCLC receiving concurrent chemoradiotherapy, compared with standard chemoradiotherapy, significantly improved survival in the S-1-based chemotherapy was noted in terms of mOS and mPFS (risk ratio [RR] = 1.289, P = 0.009; RR = 1.289, P = 0.000, respectively) with lower incidence of grade ≥ 3 neutropenia (RR = 0.453, P = 0.000). The present meta-analysis demonstrates that S-1-based chemotherapy shows similar benefits in advanced NSCLC and improves survival in locally advanced NSCLC, compared with standard treatment.
Preliminary Cost Model for Space Telescopes

NASA Technical Reports Server (NTRS)

Stahl, H. Philip; Prince, F. Andrew; Smart, Christian; Stephens, Kyle; Henrichs, Todd

2009-01-01

Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. However, great care is required. Some space telescope cost models, such as those based only on mass, lack sufficient detail to support such analysis and may lead to inaccurate conclusions. Similarly, using ground based telescope models which include the dome cost will also lead to inaccurate conclusions. This paper reviews current and historical models. Then, based on data from 22 different NASA space telescopes, this paper tests those models and presents preliminary analysis of single and multi-variable space telescope cost models.
3D-quantitative structure-activity relationship studies on benzothiadiazepine hydroxamates as inhibitors of tumor necrosis factor-alpha converting enzyme.

PubMed

Murumkar, Prashant R; Giridhar, Rajani; Yadav, Mange Ram

2008-04-01

A set of 29 benzothiadiazepine hydroxamates having selective tumor necrosis factor-alpha converting enzyme inhibitory activity were used to compare the quality and predictive power of 3D-quantitative structure-activity relationship, comparative molecular field analysis, and comparative molecular similarity indices models for the atom-based, centroid/atom-based, data-based, and docked conformer-based alignment. Removal of two outliers from the initial training set of molecules improved the predictivity of models. Among the 3D-quantitative structure-activity relationship models developed using the above four alignments, the database alignment provided the optimal predictive comparative molecular field analysis model for the training set with cross-validated r(2) (q(2)) = 0.510, non-cross-validated r(2) = 0.972, standard error of estimates (s) = 0.098, and F = 215.44 and the optimal comparative molecular similarity indices model with cross-validated r(2) (q(2)) = 0.556, non-cross-validated r(2) = 0.946, standard error of estimates (s) = 0.163, and F = 99.785. These models also showed the best test set prediction for six compounds with predictive r(2) values of 0.460 and 0.535, respectively. The contour maps obtained from 3D-quantitative structure-activity relationship studies were appraised for activity trends for the molecules analyzed. The comparative molecular similarity indices models exhibited good external predictivity as compared with that of comparative molecular field analysis models. The data generated from the present study helped us to further design and report some novel and potent tumor necrosis factor-alpha converting enzyme inhibitors.
GSP: A web-based platform for designing genome-specific primers in polyploids

USDA-ARS's Scientific Manuscript database

The sequences among subgenomes in a polyploid species have high similarity. This makes it difficult to design genome-specific primers for sequence analysis. We present a web-based platform named GSP for designing genome-specific primers to distinguish subgenome sequences in the polyploid genome backgr...
Visual Reconciliation of Alternative Similarity Spaces in Climate Modeling.

PubMed

Poco, Jorge; Dasgupta, Aritra; Wei, Yaxing; Hargrove, William; Schwalm, Christopher R; Huntzinger, Deborah N; Cook, Robert; Bertini, Enrico; Silva, Claudio T

2014-12-01

Visual data analysis often requires grouping of data objects based on their similarity. In many application domains researchers use algorithms and techniques like clustering and multidimensional scaling to extract groupings from data. While extracting these groups using a single similarity criterion is relatively straightforward, comparing alternative criteria poses additional challenges. In this paper we define visual reconciliation as the problem of reconciling multiple alternative similarity spaces through visualization and interaction. We derive this problem from our work on model comparison in climate science where climate modelers are faced with the challenge of making sense of alternative ways to describe their models: one through the output they generate, another through the large set of properties that describe them. Ideally, they want to understand whether groups of models with similar spatio-temporal behaviors share similar sets of criteria or, conversely, whether similar criteria lead to similar behaviors. We propose a visual analytics solution based on linked views that addresses this problem by allowing the user to dynamically create, modify and observe the interaction among groupings, thereby making the potential explanations apparent. We present case studies that demonstrate the usefulness of our technique in the area of climate science.
Relating Diseases by Integrating Gene Associations and Information Flow through Protein Interaction Network

PubMed Central

Hamaneh, Mehdi Bagheri; Yu, Yi-Kuo

2014-01-01

Identifying similar diseases could potentially provide deeper understanding of their underlying causes, and may even hint at possible treatments. For this purpose, it is necessary to have a similarity measure that reflects the underpinning molecular interactions and biological pathways. We have thus devised a network-based measure that can partially fulfill this goal. Our method assigns weights to all proteins (and consequently their encoding genes) by using information flow from a disease to the protein interaction network and back. Similarity between two diseases is then defined as the cosine of the angle between their corresponding weight vectors. The proposed method also provides a way to suggest disease-pathway associations by using the weights assigned to the genes to perform enrichment analysis for each disease. By calculating pairwise similarities between 2534 diseases, we show that our disease similarity measure is strongly correlated with the probability of finding the diseases in the same disease family and, more importantly, sharing biological pathways. We have also compared our results to those of MimMiner, a text-mining method that assigns pairwise similarity scores to diseases. We find the results of the two methods to be complementary. It is also shown that clustering diseases based on their similarities and performing enrichment analysis for the cluster centers significantly increases the term association rate, suggesting that the cluster centers are better representatives for biological pathways than the diseases themselves. This lends support to the view that our similarity measure is a good indicator of relatedness of biological processes involved in causing the diseases. Although not needed for understanding this paper, the raw results are available for download for further study at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbpmn/DiseaseRelations/. PMID:25360770
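The abstract defines disease similarity as the cosine of the angle between two weight vectors. A minimal sketch of that final step is shown below; the information-flow weighting itself is not reproduced, and the weight values and function name are made up for illustration.

```python
import math

def cosine_similarity(w1, w2):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(w1, w2))
    n1 = math.sqrt(sum(a * a for a in w1))
    n2 = math.sqrt(sum(b * b for b in w2))
    # An all-zero vector has no direction; return 0.0 by convention here.
    if n1 == 0.0 or n2 == 0.0:
        return 0.0
    return dot / (n1 * n2)

# Hypothetical per-protein weights for two diseases (same protein order in both).
disease_a = [0.8, 0.1, 0.0, 0.3]
disease_b = [0.7, 0.0, 0.2, 0.4]
print(cosine_similarity(disease_a, disease_b))
```

Because the cosine ignores vector length, two diseases score as similar when their weight is concentrated on the same proteins, regardless of the overall magnitude of the weights.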
Relating diseases by integrating gene associations and information flow through protein interaction network.

PubMed

Hamaneh, Mehdi Bagheri; Yu, Yi-Kuo

2014-01-01

Identifying similar diseases could potentially provide deeper understanding of their underlying causes, and may even hint at possible treatments. For this purpose, it is necessary to have a similarity measure that reflects the underpinning molecular interactions and biological pathways. We have thus devised a network-based measure that can partially fulfill this goal. Our method assigns weights to all proteins (and consequently their encoding genes) by using information flow from a disease to the protein interaction network and back. Similarity between two diseases is then defined as the cosine of the angle between their corresponding weight vectors. The proposed method also provides a way to suggest disease-pathway associations by using the weights assigned to the genes to perform enrichment analysis for each disease. By calculating pairwise similarities between 2534 diseases, we show that our disease similarity measure is strongly correlated with the probability of finding the diseases in the same disease family and, more importantly, sharing biological pathways. We have also compared our results to those of MimMiner, a text-mining method that assigns pairwise similarity scores to diseases. We find the results of the two methods to be complementary. It is also shown that clustering diseases based on their similarities and performing enrichment analysis for the cluster centers significantly increases the term association rate, suggesting that the cluster centers are better representatives for biological pathways than the diseases themselves. This lends support to the view that our similarity measure is a good indicator of relatedness of biological processes involved in causing the diseases. Although not needed for understanding this paper, the raw results are available for download for further study at ftp://ftp.ncbi.nlm.nih.gov/pub/qmbpmn/DiseaseRelations/.
Case-based lung image categorization and retrieval for interstitial lung diseases: clinical workflows.

PubMed

Depeursinge, Adrien; Vargas, Alejandro; Gaillard, Frédéric; Platon, Alexandra; Geissbuhler, Antoine; Poletti, Pierre-Alexandre; Müller, Henning

2012-01-01

Clinical workflows and user interfaces of image-based computer-aided diagnosis (CAD) for interstitial lung diseases in high-resolution computed tomography are introduced and discussed. Three use cases are implemented to assist students, radiologists, and physicians in the diagnosis workup of interstitial lung diseases. In a first step, the proposed system shows a three-dimensional map of categorized lung tissue patterns with quantification of the diseases based on texture analysis of the lung parenchyma. Then, based on the proportions of abnormal and normal lung tissue as well as clinical data of the patients, retrieval of similar cases is enabled using a multimodal distance aggregating content-based image retrieval (CBIR) and text-based information search. The global system leads to a hybrid detection-CBIR-based CAD, where detection-based and CBIR-based CAD show to be complementary both on the user's side and on the algorithmic side. The proposed approach is in accordance with the classical workflow of clinicians searching for similar cases in textbooks and personal collections. The developed system enables objective and customizable inter-case similarity assessment, and the performance measures obtained with a leave-one-patient-out cross-validation (LOPO CV) are representative of a clinical usage of the system.
A Multidimensional Analysis of the Joint Strike Fighter (JSF) Acquisition Program from the Perspective of Turkey

DTIC Science & Technology

2016-12-01

Conceptual Models,” includes a thorough analysis of Turkey’s involvement in the F-35 program, based on Allison’s Rational Actor and Organizational ...TuAF, but also suggested an organizational structure similar to the U.S. DOD. In May 1949, the Turkish Parliament passed a law to reform the Turkish... organizational behavior model and a governmental politics model provide a base for improved explanations and predictions. (Allison & Zelikow, 1999) 40
Macrolide-based regimens in absence of bacterial co-infection in critically ill H1N1 patients with primary viral pneumonia.

PubMed

Martín-Loeches, I; Bermejo-Martin, J F; Vallés, J; Granada, R; Vidaur, L; Vergara-Serrano, J C; Martín, M; Figueira, J C; Sirvent, J M; Blanquer, J; Suarez, D; Artigas, A; Torres, A; Diaz, E; Rodriguez, A

2013-04-01

To determine whether macrolide-based treatment is associated with mortality in critically ill H1N1 patients with primary viral pneumonia. Secondary analysis of a prospective, observational, multicenter study conducted across 148 Intensive Care Units (ICU) in Spain. Primary viral pneumonia was present in 733 ICU patients with pandemic influenza A (H1N1) virus infection with severe respiratory failure. Macrolide-based treatment was administered to 190 (25.9 %) patients. Patients who received macrolides had chronic obstructive pulmonary disease more often, lower severity on admission (APACHE II score on ICU admission: 13.1 ± 6.8 vs. 14.4 ± 7.4 points, p < 0.05), and multiple organ dysfunction syndrome less often (23.4 vs. 30.1 %, p < 0.05). Length of ICU stay in survivors was not significantly different in patients who received macrolides compared to patients who did not (10 (IQR 4-20) vs. 10 (IQR 5-20), p = 0.9). ICU mortality was 24.1 % (n = 177). Patients with macrolide-based treatment had lower ICU mortality in the univariate analysis (19.2 vs. 28.1 %, p = 0.02); however, a propensity score analysis showed no effect of macrolide-based treatment on ICU mortality (OR = 0.87; 95 % CI 0.55-1.37, p = 0.5). Moreover, the sensitivity analysis revealed very similar results (OR = 0.91; 95 % CI 0.58-1.44, p = 0.7). A separate analysis of patients under mechanical ventilation yielded similar results (OR = 0.77; 95 % CI 0.44-1.35, p = 0.4). Our results suggest that macrolide-based treatment was not associated with improved survival in critically ill H1N1 patients with primary viral pneumonia.
A simulations approach for meta-analysis of genetic association studies based on additive genetic model.

PubMed

John, Majnu; Lencz, Todd; Malhotra, Anil K; Correll, Christoph U; Zhang, Jian-Ping

2018-06-01

Meta-analysis of genetic association studies is being increasingly used to assess phenotypic differences between genotype groups. When the underlying genetic model is assumed to be dominant or recessive, assessing the phenotype differences based on summary statistics, reported for individual studies in a meta-analysis, is a valid strategy. However, when the genetic model is additive, a similar strategy based on summary statistics will lead to biased results. This fact about the additive model is one of the things that we establish in this paper, using simulations. The main goal of this paper is to present an alternate strategy for the additive model based on simulating data for the individual studies. We show that the alternate strategy is far superior to the strategy based on summary statistics.
A Comparison of Mean Phase Difference and Generalized Least Squares for Analyzing Single-Case Data

ERIC Educational Resources Information Center

Manolov, Rumen; Solanas, Antonio

2013-01-01

The present study focuses on single-case data analysis, specifically on two procedures for quantifying differences between baseline and treatment measurements. The first technique tested is based on generalized least squares regression analysis and is compared to a proposed non-regression technique, which allows obtaining similar information. The…

29 CFR 515.4 - Submission of plan.

Code of Federal Regulations, 2013 CFR

2013-07-01

... classifications based upon an analysis of the duties and responsibilities of positions; (2) A compensation... examinations for similar positions, it being understood that such registers may be broken down by States; (5...
29 CFR 515.4 - Submission of plan.

Code of Federal Regulations, 2014 CFR

2014-07-01

... classifications based upon an analysis of the duties and responsibilities of positions; (2) A compensation... examinations for similar positions, it being understood that such registers may be broken down by States; (5...
29 CFR 515.4 - Submission of plan.

Code of Federal Regulations, 2011 CFR

2011-07-01

... classifications based upon an analysis of the duties and responsibilities of positions; (2) A compensation... examinations for similar positions, it being understood that such registers may be broken down by States; (5...
29 CFR 515.4 - Submission of plan.

Code of Federal Regulations, 2012 CFR

2012-07-01

... classifications based upon an analysis of the duties and responsibilities of positions; (2) A compensation... examinations for similar positions, it being understood that such registers may be broken down by States; (5...
Exploratory Item Classification Via Spectral Graph Clustering

PubMed Central

Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

2017-01-01

Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as the hierarchical clustering, K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire. PMID:29033476
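The two stages described in the abstract (build a similarity graph of items, then cluster from its spectral structure) can be illustrated with a deliberately simplified two-cluster sketch. It is not the authors' algorithm: it assumes a ready-made symmetric similarity matrix `W`, splits items by the sign of the Fiedler vector of the graph Laplacian L = D - W, and finds that vector by power iteration in pure Python. All names and the example matrix are hypothetical.

```python
def fiedler_partition(W, iters=500):
    """Split items into two groups by the sign of the Laplacian's Fiedler vector.

    W is a symmetric, non-negative item-similarity matrix with a zero diagonal,
    assumed to describe a connected graph.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    c = 2.0 * max(deg)  # upper bound on the largest eigenvalue of L = D - W
    # Start orthogonal to the constant vector (the trivial eigenvector of L).
    x = [i - (n - 1) / 2.0 for i in range(n)]
    for _ in range(iters):
        # y = (c*I - L) x; its dominant non-constant eigenvector is the Fiedler vector.
        y = [(c - deg[i]) * x[i] + sum(W[i][j] * x[j] for j in range(n))
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]  # project out the constant direction again
        norm = sum(v * v for v in y) ** 0.5
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return [0 if v < 0 else 1 for v in x]

# Hypothetical similarities: items 0-2 resemble each other, as do items 3-5.
hi, lo = 0.9, 0.1
W = [[0.0 if i == j else (hi if (i < 3) == (j < 3) else lo) for j in range(6)]
     for i in range(6)]
labels = fiedler_partition(W)
print(labels)
```

A full spectral clustering pipeline would typically use several eigenvectors followed by a step such as k-means to obtain more than two clusters; the sign split above is only the simplest case.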
Feasibility of wall stress analysis of abdominal aortic aneurysms using three-dimensional ultrasound.

PubMed

Kok, Annette M; Nguyen, V Lai; Speelman, Lambert; Brands, Peter J; Schurink, Geert-Willem H; van de Vosse, Frans N; Lopata, Richard G P

2015-05-01

Abdominal aortic aneurysms (AAAs) are local dilations that can lead to a fatal hemorrhage when ruptured. Wall stress analysis of AAAs is a novel tool that has shown high potential to improve risk stratification. Currently, wall stress analysis of AAAs is based on computed tomography (CT) and magnetic resonance imaging; however, three-dimensional (3D) ultrasound (US) has great advantages over CT and magnetic resonance imaging in terms of costs, speed, and lack of radiation. In this study, the feasibility of 3D US as input for wall stress analysis is investigated first. Second, 3D US-based wall stress analysis was compared with CT-based results. The 3D US and CT data were acquired in 12 patients (diameter, 35-90 mm). US data were segmented manually and compared with automatically acquired CT geometries by calculating the similarity index and Hausdorff distance. Wall stresses were simulated at P = 140 mm Hg and compared between both modalities. The similarity index of US vs CT was 0.75 to 0.91 (n = 12), with a median Hausdorff distance ranging from 4.8 to 13.9 mm, with the higher values found at the proximal and distal sides of the AAA. Wall stresses were in accordance with literature, and a good agreement was found between US- and CT-based median stresses and interquartile stresses, which was confirmed by Bland-Altman and regression analysis (n = 8). Wall stresses based on US were typically higher (+23%), caused by geometric irregularities due to the registration of several 3D volumes and manual segmentation. In future work, an automated US registration and segmentation approach is the essential point of improvement before pursuing large-scale patient studies. This study is a first step toward US-based wall stress analysis, which would be the modality of choice to monitor wall stress development over time because no ionizing radiation and contrast material are involved. Copyright © 2015 Society for Vascular Surgery. Published by Elsevier Inc. All rights reserved.
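The geometry comparison in the record above rests on two quantities, a similarity index and the Hausdorff distance. The following is a minimal sketch of both, assuming that the similarity index is the Dice coefficient (the abstract does not say which index was used) and that each segmentation is given as a list of integer grid coordinates. The function names and the toy point sets are hypothetical.

```python
import math


def dice_index(a, b):
    """Dice similarity index of two collections of grid coordinates (assumed index)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty shapes are identical
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(p, q):
    """Symmetric Hausdorff distance between two non-empty point lists."""
    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(s, d) for d in dst) for s in src)

    return max(directed(p, q), directed(q, p))


# Toy 2D "segmentations": two unit squares of pixels shifted by one column.
us = [(0, 0), (1, 0), (1, 1), (0, 1)]
ct = [(1, 0), (1, 1), (2, 0), (2, 1)]

print(dice_index(us, ct))          # 0.5
print(hausdorff_distance(us, ct))  # 1.0
```

The brute-force Hausdorff loop is quadratic in the number of points, which is fine for a sketch but would likely need a spatial index for full 3D surfaces.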
Graph-based analysis of kinetics on multidimensional potential-energy surfaces.

PubMed

Okushima, T; Niiyama, T; Ikeda, K S; Shimizu, Y

2009-09-01

The aim of this paper is twofold: one is to give a detailed description of an alternative graph-based analysis method, which we call saddle connectivity graph, for analyzing the global topography and the dynamical properties of many-dimensional potential-energy landscapes and the other is to give examples of applications of this method in the analysis of the kinetics of realistic systems. A Dijkstra-type shortest path algorithm is proposed to extract dynamically dominant transition pathways by kinetically defining transition costs. The applicability of this approach is first confirmed by an illustrative example of a low-dimensional random potential. We then show that a coarse-graining procedure tailored for saddle connectivity graphs can be used to obtain the kinetic properties of 13- and 38-atom Lennard-Jones clusters. The coarse-graining method not only reduces the complexity of the graphs, but also, with iterative use, reveals a self-similar hierarchical structure in these clusters. We also propose that the self-similarity is common to many-atom Lennard-Jones clusters.
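As an illustration of the Dijkstra-type search mentioned in the record above, here is a minimal sketch of a shortest-path search on a small weighted graph. The node labels and edge costs are made up for the example; the kinetic definition of transition costs used by the authors is not given in the abstract and is not reproduced here.

```python
import heapq


def dijkstra_path(graph, start, goal):
    """Return (path, total_cost) for the cheapest path, or (None, inf) if unreachable.

    graph maps each node to a dict {neighbour: non-negative cost}.
    """
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        if node == goal:
            break
        for nxt, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if goal not in dist:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1], dist[goal]


# Hypothetical graph: nodes stand for minima, weights for transition costs.
graph = {
    "A": {"B": 1.0, "C": 4.0},
    "B": {"A": 1.0, "C": 1.5, "D": 5.0},
    "C": {"A": 4.0, "B": 1.5, "D": 1.0},
    "D": {"B": 5.0, "C": 1.0},
}

print(dijkstra_path(graph, "A", "D"))  # (['A', 'B', 'C', 'D'], 3.5)
```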
Pangenomic Definition of Prokaryotic Species and the Phylogenetic Structure of Prochlorococcus spp.

PubMed

Moldovan, Mikhail A; Gelfand, Mikhail S

2018-01-01

The pangenome is the collection of all groups of orthologous genes (OGGs) from a set of genomes. We apply the pangenome analysis to propose a definition of prokaryotic species based on identification of lineage-specific gene sets. While being similar to the classical biological definition based on allele flow, it does not rely on DNA similarity levels and does not require analysis of homologous recombination. Hence this definition is relatively objective and independent of arbitrary thresholds. A systematic analysis of 110 accepted species with the largest numbers of sequenced strains yields results largely consistent with the existing nomenclature. However, it has revealed that the abundant marine cyanobacterium Prochlorococcus marinus should be divided into two species. As a control we have confirmed the paraphyletic origin of Yersinia pseudotuberculosis (with embedded, monophyletic Y. pestis) and Burkholderia pseudomallei (with B. mallei). We also demonstrate that by our definition and in accordance with recent studies Escherichia coli and Shigella spp. are one species.
galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.

PubMed

Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M

2004-06-12

The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se
[Local Regression Algorithm Based on Net Analyte Signal and Its Application in Near Infrared Spectral Analysis].

PubMed

Zhang, Hong-guang; Lu, Jian-gang

2016-02-01

To overcome the problems of significant differences among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, the net analyte signal (NAS) method is first used to obtain the net analyte signals of the calibration samples and unknown samples; the Euclidean distance between the net analyte signal of an unknown sample and those of the calibration samples is then calculated and used as the similarity index. According to the defined similarity index, a local calibration set is selected individually for each unknown sample. Finally, a local PLS regression model is built on the local calibration set of each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to those of the global PLS regression method and the conventional local regression algorithm based on spectral Euclidean distance.
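The selection step in the record above (ranking calibration samples by Euclidean distance and keeping the nearest ones as the local calibration set) can be sketched as follows. This is only an illustration under stated assumptions: the short vectors stand in for net analyte signals, the neighbourhood size `k` is a hypothetical parameter, and neither the NAS computation nor the local PLS fit is shown.

```python
import math


def select_local_calibration(unknown, calibration, k):
    """Return indices of the k calibration vectors closest to `unknown`.

    Distance is plain Euclidean distance, used here as the similarity index.
    """
    ranked = sorted(
        range(len(calibration)),
        key=lambda i: math.dist(unknown, calibration[i]),
    )
    return ranked[:k]


# Made-up 2-component signals for four calibration samples.
calibration = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
unknown = (1.0, 1.0)

print(select_local_calibration(unknown, calibration, k=2))  # [1, 3]
```

A regression model would then be fitted only on the returned samples, once per unknown sample.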
Quantification of sensory and food quality: the R-index analysis.

PubMed

Lee, Hye-Seong; van Hout, Danielle

2009-08-01

The accurate quantification of sensory difference/similarity between foods, as well as consumer acceptance/preference and concepts, is greatly needed to optimize and maintain food quality. The R-Index is one class of measures of the degree of difference/similarity, and was originally developed for sensory difference tests for food quality control, product development, and so on. The index is based on signal detection theory and is free of the response bias that can invalidate difference testing protocols, including categorization and same-different and A-Not A tests. It is also a nonparametric analysis, making no assumptions about sensory distributions, and is simple to compute and understand. The R-Index is also flexible in its application. Methods based on R-Index analysis have been used as detection and sensory difference tests, as simple alternatives to hedonic scaling, and for the measurement of consumer concepts. This review indicates the various computational strategies for the R-Index and its practical applications to consumer and sensory measurements in food science.
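One common way to compute an R-Index from rating data is to compare every "signal" rating with every "noise" rating and count the proportion of pairs in which the signal sample received the higher rating, with ties counted as half. The sketch below follows that pairwise form, which is equivalent to the area under the ROC curve; the review covers several computational strategies, so this should be read as one assumed variant. The ratings are invented for the example.

```python
def r_index(signal_ratings, noise_ratings):
    """Pairwise R-Index: P(signal rated above noise), ties counted as 0.5.

    Higher ratings mean the judge was more sure the sample was the signal.
    """
    wins = 0
    ties = 0
    for s in signal_ratings:
        for n in noise_ratings:
            if s > n:
                wins += 1
            elif s == n:
                ties += 1
    return (wins + 0.5 * ties) / (len(signal_ratings) * len(noise_ratings))


# Hypothetical sureness ratings on a 1-4 scale.
signal = [4, 3, 3, 2]
noise = [3, 2, 1, 1]

print(r_index(signal, noise))  # 0.84375
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect discrimination.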
Cooperation and competition in business on example of Internet research of opto-electronic companies

NASA Astrophysics Data System (ADS)

Kaliczyńska, Małgorzata

2006-10-01

Based on findings from earlier studies which showed that links to academic web sites contain important information, this study examines the practicability of using co-link data to describe cooperation and competition in the optoelectronic business. The analysis was based on 32 companies and organizations found in an issue of a specialist magazine. Three search engines (Google, Yahoo! and MSN Search) were used for the research. Assuming that the number of co-links to a pair of Web sites is a measure of the similarity between the two companies, the study searches for sets of companies that are similar to one another. The method applied is multidimensional scaling (MDS), which allows the results of the analysis to be presented on a 2D map.
Characterisation of colletotrichum species associated with anthracnose of banana.

PubMed

Zakaria, Latiffah; Sahak, Shamsiah; Zakaria, Maziah; Salleh, Baharuddin

2009-12-01

A total of 13 Colletotrichum isolates were obtained from different banana cultivars (Musa spp.) with symptoms of anthracnose. Colletotrichum isolates from anthracnose of guava (Psidium guajava) and water apple (Syzygium aqueum) were also included in this study. Based on cultural and morphological characteristics, isolates from banana and guava were identified as Colletotrichum musae and from water apple as Colletotrichum gloeosporioides. Isolates of C. musae from banana and guava had similar banding patterns in a randomly amplified polymorphic DNA (RAPD) analysis with four random primers, and they clustered together in a UPGMA analysis. C. gloeosporioides from water apple formed a separate cluster. Based on the present study, C. musae was frequently isolated from anthracnose of different banana cultivars and the RAPD banding patterns of C. musae isolates were highly similar but showed intraspecific variations.
Carbon Fibers Conductivity Studies

NASA Technical Reports Server (NTRS)

Yang, C. Y.; Butkus, A. M.

1980-01-01

In an attempt to understand the process of electrical conduction in polyacrylonitrile (PAN)-based carbon fibers, calculations were carried out on cluster models of the fiber consisting of carbon, nitrogen, and hydrogen atoms using the modified intermediate neglect of differential overlap (MINDO) molecular orbital (MO) method. The models were developed based on the assumption that PAN carbon fibers obtained with heat treatment temperatures (HTT) below 1000 °C retain nitrogen in a graphite-like lattice. For clusters modeling an edge nitrogen site, analysis of the occupied MOs indicated an electron distribution similar to that of graphite. A similar analysis for the somewhat less stable interior nitrogen site revealed a partially localized π-electron distribution around the nitrogen atom. The differences in bonding trends and structural stability between edge and interior nitrogen clusters led to a two-step process proposed for nitrogen evolution with increasing HTT.
Characterisation of Colletotrichum Species Associated with Anthracnose of Banana

PubMed Central

Zakaria, Latiffah; Sahak, Shamsiah; Zakaria, Maziah; Salleh, Baharuddin

2009-01-01

A total of 13 Colletotrichum isolates were obtained from different banana cultivars (Musa spp.) with symptoms of anthracnose. Colletotrichum isolates from anthracnose of guava (Psidium guajava) and water apple (Syzygium aqueum) were also included in this study. Based on cultural and morphological characteristics, isolates from banana and guava were identified as Colletotrichum musae and from water apple as Colletotrichum gloeosporioides. Isolates of C. musae from banana and guava had similar banding patterns in a randomly amplified polymorphic DNA (RAPD) analysis with four random primers, and they clustered together in a UPGMA analysis. C. gloeosporioides from water apple formed a separate cluster. Based on the present study, C. musae was frequently isolated from anthracnose of different banana cultivars and the RAPD banding patterns of C. musae isolates were highly similar but showed intraspecific variations. PMID:24575184
Community detection in sequence similarity networks based on attribute clustering

DOE PAGES

Chowdhary, Janamejaya; Loeffler, Frank E.; Smith, Jeremy C.

2017-07-24

Networks are powerful tools for the presentation and analysis of interactions in multi-component systems. A commonly studied mesoscopic feature of networks is their community structure, which arises from grouping together similar nodes into one community and dissimilar nodes into separate communities. In this paper, the community structure of protein sequence similarity networks is determined with a new method: Attribute Clustering Dependent Communities (ACDC). Sequence similarity has hitherto typically been quantified by the alignment score or its expectation value. However, pair alignments with the same score or expectation value cannot thus be differentiated. To overcome this deficiency, the method constructs, for pair alignments, an extended alignment metric, the link attribute vector, which includes the score and other alignment characteristics. Rescaling components of the attribute vectors qualitatively identifies a systematic variation of sequence similarity within protein superfamilies. The problem of community detection is then mapped to clustering the link attribute vectors, selection of an optimal subset of links and community structure refinement based on the partition density of the network. ACDC-predicted communities are found to be in good agreement with gold standard sequence databases for which the "ground truth" community structures (or families) are known. ACDC is therefore a community detection method for sequence similarity networks based entirely on pair similarity information. A serial implementation of ACDC is available from https://cmb.ornl.gov/resources/developments
Community detection in sequence similarity networks based on attribute clustering

DOE Office of Scientific and Technical Information (OSTI.GOV)

Chowdhary, Janamejaya; Loeffler, Frank E.; Smith, Jeremy C.

Networks are powerful tools for the presentation and analysis of interactions in multi-component systems. A commonly studied mesoscopic feature of networks is their community structure, which arises from grouping together similar nodes into one community and dissimilar nodes into separate communities. In this paper, the community structure of protein sequence similarity networks is determined with a new method: Attribute Clustering Dependent Communities (ACDC). Sequence similarity has hitherto typically been quantified by the alignment score or its expectation value. However, pair alignments with the same score or expectation value cannot thus be differentiated. To overcome this deficiency, the method constructs, for pair alignments, an extended alignment metric, the link attribute vector, which includes the score and other alignment characteristics. Rescaling components of the attribute vectors qualitatively identifies a systematic variation of sequence similarity within protein superfamilies. The problem of community detection is then mapped to clustering the link attribute vectors, selection of an optimal subset of links and community structure refinement based on the partition density of the network. ACDC-predicted communities are found to be in good agreement with gold standard sequence databases for which the "ground truth" community structures (or families) are known. ACDC is therefore a community detection method for sequence similarity networks based entirely on pair similarity information. A serial implementation of ACDC is available from https://cmb.ornl.gov/resources/developments
Identifying the drivers of liking by investigating the reasons for (dis)liking using CATA in cross-cultural context: a case study on barbecue sauce.

PubMed

Choi, Ji-Hye; Gwak, Mi-Jin; Chung, Seo-Jin; Kim, Kwang-Ok; O'Mahony, Michael; Ishii, Rie; Bae, Ye-Won

2015-06-01

The present study cross-culturally investigated the drivers of liking for traditional and ethnic chicken marinades using descriptive analysis and consumer taste tests incorporating the check-all-that-apply (CATA) method. Seventy-three Koreans and 86 US consumers participated. The tested sauces comprised three tomato-based sauces, a teriyaki-based sauce and a Korean spicy seasoning-based sauce. Chicken breasts were marinated with each of the five barbecue sauces, grilled and served for evaluation. Descriptive analysis and consumer taste tests were conducted. Consumers rated the acceptance on a hedonic scale and checked the reasons for (dis)liking by the CATA method for each sauce. A general linear model, multiple factor analysis and chi-square analysis were conducted using the data. The results showed that the preference orders of the samples between Koreans and US consumers were strikingly similar to each other. However, the reasons for (dis)liking the samples differed cross-culturally. The drivers of liking of two sauces sharing relatively similar sensory profiles but differing significantly in hedonic ratings were effectively delineated by reasons of (dis)liking CATA results. Reasons for (dis)liking CATA proved to be a powerful supporting method to understand the internal drivers of liking which can be overlooked by generic descriptive analysis. © 2014 Society of Chemical Industry.
A generalized association test based on U statistics.

PubMed

Wei, Changshuai; Lu, Qing

2017-07-01

Second generation sequencing technologies are being increasingly used for genetic association studies, where the main research interest is to identify sets of genetic variants that contribute to various phenotypes. The phenotype can be univariate disease status, multivariate responses and even high-dimensional outcomes. Considering the genotype and phenotype as two complex objects, this also poses a general statistical problem of testing association between complex objects. We proposed a similarity-based test, generalized similarity U (GSU), that can test the association between complex objects. We first studied the theoretical properties of the test in a general setting and then focused on the application of the test to sequencing association studies. Based on theoretical analysis, we proposed to use Laplacian Kernel-based similarity for GSU to boost power and enhance robustness. Through simulation, we found that GSU did have advantages over existing methods in terms of power and robustness. We further performed a whole genome sequencing (WGS) scan of Alzheimer's Disease Neuroimaging Initiative data, identifying three genes, APOE, APOC1 and TOMM40, associated with imaging phenotype. We developed a C++ package for analysis of WGS data using GSU. The source code can be downloaded at https://github.com/changshuaiwei/gsu. Contact: weichangshuai@gmail.com; qlu@epi.msu.edu. Supplementary data are available at Bioinformatics online. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
Variation in mitochondrial DNA and allozymes discriminates early and late forms of Chinook salmon Oncorhynchus tshawytscha in the Kenai and Kasilof Rivers, AK

USGS Publications Warehouse

Adams, Noah S.; Spearman, William J.; Burger, Carl V.; Currens, Kenneth P.; Schreck, Carl B.; Li, Hiram W.

1994-01-01

Genetic differences between early and late forms of Alaskan chinook salmon (Oncorhynchus tshawytscha) were identified using two genetic approaches: mitochondrial DNA (mtDNA) analysis, and protein electrophoresis. Study populations consisted of early and late runs in each of the Kenai and Kasilof rivers in Alaska, and a population from the Minam River, Oregon. Two segments of mtDNA were amplified using the polymerase chain reaction (PCR) and digested with 14–16 restriction enzymes. Results showed that early runs were genetically similar to each other but different from the late runs. The late runs were different from each other based on the frequency of the common haplotypes. Frequency differences in shared haplotypes together with the presence of a unique haplotype separated the Minam River stock from those in Alaska. In the protein analysis, each population was examined at 30 allozyme loci. Based on 14 polymorphic loci, Minam River salmon were genetically distinct from the Alaskan populations. Within the Alaskan populations, early runs were most similar to each other but different from the late runs; the late runs were also genetically most similar to each other. Both mtDNA and allozyme analysis suggest that chinook salmon may segregate into genetically different early and late forms within a drainage.

Comparative analysis of chemical similarity methods for modular natural products with a hypothetical structure enumeration algorithm.

PubMed

Skinnider, Michael A; Dejong, Chris A; Franczak, Brian C; McNicholas, Paul D; Magarvey, Nathan A

2017-08-16

Natural products represent a prominent source of pharmaceutically and industrially important agents. Calculating the chemical similarity of two molecules is a central task in cheminformatics, with applications at multiple stages of the drug discovery pipeline. Quantifying the similarity of natural products is a particularly important problem, as the biological activities of these molecules have been extensively optimized by natural selection. The large and structurally complex scaffolds of natural products distinguish their physical and chemical properties from those of synthetic compounds. However, no analysis of the performance of existing methods for molecular similarity calculation specific to natural products has been reported to date. Here, we present LEMONS, an algorithm for the enumeration of hypothetical modular natural product structures. We leverage this algorithm to conduct a comparative analysis of molecular similarity methods within the unique chemical space occupied by modular natural products using controlled synthetic data, and comprehensively investigate the impact of diverse biosynthetic parameters on similarity search. We additionally investigate a recently described algorithm for natural product retrobiosynthesis and alignment, and find that when rule-based retrobiosynthesis can be applied, this approach outperforms conventional two-dimensional fingerprints, suggesting it may represent a valuable approach for the targeted exploration of natural product chemical space and microbial genome mining. Our open-source algorithm is an extensible method of enumerating hypothetical natural product structures with diverse potential applications in bioinformatics.
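Fingerprint-based similarity methods of the kind compared in the record above usually reduce each molecule to a set of "on" bits and then compare the sets. The sketch below uses the Tanimoto (Jaccard) coefficient, which is a standard choice for such fingerprints but is an assumption here, since the abstract does not name the coefficient. The bit positions are invented and do not correspond to any real fingerprinting scheme.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as iterables of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union


# Hypothetical on-bit positions for two molecules.
mol_1 = {1, 4, 7, 9}
mol_2 = {1, 4, 8}

print(tanimoto(mol_1, mol_2))  # 0.4
```

Two shared bits out of five distinct bits give 2/5 = 0.4; identical fingerprints give 1.0 and disjoint ones give 0.0.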
Quantifying the strength of the associations of prototype perceptions with behaviour, behavioural willingness and intentions: a meta-analysis.

PubMed

van Lettow, Britt; de Vries, Hein; Burdorf, Alex; van Empelen, Pepijn

2016-01-01

Prototypes (i.e., social images representing perceptions of typical persons engaging in or refraining from certain behaviour) have been shown to explain health-related behaviours. The present meta-analysis quantified the strength of the associations of prototype perceptions with health motivation and behaviour. Specifically, the analysis addressed (i) the relationship of prototype favourability (i.e., degree of likability) and similarity (i.e., perceived resemblance to the self) with behaviour, willingness and intentions; (ii) the effect of the interaction between favourability and similarity; and (iii) the extent to which health-risk and health-protective prototypes differed in their association with these outcomes. A total of 80 independent studies were identified based on 69 articles. The results indicated that prototype favourability and similarity were related to behaviour, intentions and willingness with small-to-medium effect sizes (r+ = 0.12-0.43). Direct measures of prototype perceptions generally produced larger effects than indirect measures. The interaction between favourability and similarity produced small-to-large effect sizes (r+ = .22-.54). The results suggest that both health-risk and health-protective prototypes might be useful targets for interventions (r+ = .22-.54). In order to increase health-protective behaviours, intentions and behaviour could be targeted by increasing similarity to health-protective prototypes. Health-risk behaviour might be decreased by targeting willingness by modifying health-risk prototype favourability and similarity.
Fingerprint chromatogram analysis of Pseudostellaria heterophylla (Miq.) Pax root by high performance liquid chromatography.

PubMed

Han, Chao; Chen, Junhui; Chen, Bo; Lee, Frank Sen-Chun; Wang, Xiaoru

2006-09-01

A simple and reliable high performance liquid chromatographic (HPLC) method has been developed and validated for the fingerprinting of extracts from the root of Pseudostellaria heterophylla (Miq.) Pax. HPLC with gradient elution was performed on an authentic reference standard of powdered P. heterophylla (Miq.) Pax root and 11 plant samples of the root were collected from different geographic locations. The HPLC chromatograms have been standardized through the selection and identification of reference peaks and the normalization of retention times and peak intensities of all the common peaks. The standardized HPLC fingerprints show high stability and reproducibility, and thus can be used effectively for the screening analysis or quality assessment of the root or its derived products. Similarity index calculations based on cosine angle values or correlation methods have been performed on the HPLC fingerprints. As a group, the fingerprints of the P. heterophylla (Miq.) Pax samples studied are highly correlated with closely similar fingerprints. Within the group, the samples can be further divided into subgroups based on hierarchical clustering analysis (HCA). Sample grouping based on HCA coincides nicely with those based on the geographical origins of the samples. The HPLC fingerprinting techniques thus have high potential in authentication or source-tracing types of applications.
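The two similarity indices named in the record above, the cosine of the angle between two fingerprints and their correlation coefficient, can be sketched as follows. The sketch assumes that each standardized fingerprint has already been reduced to a vector of peak intensities at the common peaks, in the same order; the numbers are made up.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(x) != len(y):
        raise ValueError("fingerprints must have the same number of peaks")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson_correlation(x, y):
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical peak intensities: the sample is the reference at twice the loading.
reference = [10.0, 40.0, 25.0, 5.0]
sample = [20.0, 80.0, 50.0, 10.0]

print(round(cosine_similarity(reference, sample), 6))    # 1.0
print(round(pearson_correlation(reference, sample), 6))  # 1.0
```

Both indices ignore overall scaling, so a sample that differs from the reference only in concentration still scores close to 1.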
Mining author relationship in scholarly networks based on tripartite citation analysis

PubMed Central

Wang, Xiaohan; Yang, Siluo

2017-01-01

Following scholars in Scientometrics as examples, we develop five author relationship networks, namely, co-authorship, author co-citation (AC), author bibliographic coupling (ABC), author direct citation (ADC), and author keyword coupling (AKC). The time frame of data sets is divided into two periods: before 2011 (i.e., T1) and after 2011 (i.e., T2). Through quadratic assignment procedure analysis, we found that some authors have ABC or AC relationships (i.e., potential communication relationship, PCR) but do not have actual collaborations or direct citations (i.e., actual communication relationship, ACR) among them. In addition, we noticed that PCR and AKC are highly correlated and that the old PCR and the new ACR are correlated and consistent. Such facts indicate that PCR tends to produce academic exchanges based on similar themes, and ABC bears more advantages in predicting potential relations. Based on tripartite citation analysis, including AC, ABC, and ADC, we also present an author-relation mining process. Such a process can be used to detect deep and potential author relationships. We analyze the prediction capacity by comparing the T1 and T2 periods, which demonstrates that relation mining can be complementary in identifying authors based on similar themes and discovering more potential collaborations and academic communities. PMID:29117198
Evaluating sufficient similarity for drinking-water disinfection by-product (DBP) mixtures with bootstrap hypothesis test procedures.

PubMed

Feder, Paul I; Ma, Zhenxu J; Bull, Richard J; Teuschler, Linda K; Rice, Glenn

2009-01-01

In chemical mixtures risk assessment, the use of dose-response data developed for one mixture to estimate risk posed by a second mixture depends on whether the two mixtures are sufficiently similar. While evaluations of similarity may be made using qualitative judgments, this article uses nonparametric statistical methods based on the "bootstrap" resampling technique to address the question of similarity among mixtures of chemical disinfectant by-products (DBP) in drinking water. The bootstrap resampling technique is a general-purpose, computer-intensive approach to statistical inference that substitutes empirical sampling for theoretically based parametric mathematical modeling. Nonparametric, bootstrap-based inference involves fewer assumptions than parametric normal theory based inference. The bootstrap procedure is appropriate, at least in an asymptotic sense, whether or not the parametric, distributional assumptions hold, even approximately. The statistical analysis procedures in this article are initially illustrated with data from 5 water treatment plants (Schenck et al., 2009), and then extended using data developed from a study of 35 drinking-water utilities (U.S. EPA/AMWA, 1989), which permits inclusion of a greater number of water constituents and increased structure in the statistical models.
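The resampling idea described in the record above can be illustrated with a generic percentile bootstrap. The sketch below is not the authors' procedure: it assumes a simple statistic (the difference in mean concentration of one constituent between two mixtures), and the data, the function name, and the defaults for `n_boot`, `alpha`, and `seed` are all hypothetical.

```python
import random
import statistics


def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b).

    Each replicate resamples a and b with replacement, independently.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_boot):
        resampled_a = rng.choices(a, k=len(a))
        resampled_b = rng.choices(b, k=len(b))
        diffs.append(statistics.fmean(resampled_a) - statistics.fmean(resampled_b))
    diffs.sort()
    lower_index = int(n_boot * alpha / 2)
    upper_index = n_boot - 1 - lower_index
    return diffs[lower_index], diffs[upper_index]


# Invented concentrations of one constituent at two treatment plants.
plant_a = [12.1, 11.8, 12.6, 13.0, 12.3]
plant_b = [11.9, 12.4, 12.2, 12.8, 12.0]

low, high = bootstrap_mean_diff_ci(plant_a, plant_b)
print(low, high)
```

One possible reading, again an assumption rather than the article's rule, is to compare the resulting interval with a pre-specified margin within which two mixtures would be regarded as sufficiently similar.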
MRTD: man versus machine

NASA Astrophysics Data System (ADS)

van Rheenen, Arthur D.; Taule, Petter; Thomassen, Jan Brede; Madsen, Eirik Blix

2018-04-01

We present Minimum-Resolvable Temperature Difference (MRTD) curves obtained by letting an ensemble of observers judge how many of the six four-bar patterns they can "see" in a set of images taken with different bar-to-background contrasts. The same images are analyzed using elemental signal analysis algorithms and machine-analysis based MRTD curves are obtained. We show that by adjusting the minimum required signal-to-noise ratio the machine-based MRTDs are very similar to the ones obtained with the help of the human observers.
Personalized Medicine in Veterans with Traumatic Brain Injuries

DTIC Science & Technology

2013-05-01

Pair-Group Method using Arithmetic averages (UPGMA) based on cosine correlation of row mean centered log2 signal values; this was the top 50%-tile...clustering was performed by the UPGMA method using Cosine correlation as the similarity metric. For comparative purposes, clustered heat maps included...non-mTBI cases were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity
Personalized Medicine in Veterans with Traumatic Brain Injuries

DTIC Science & Technology

2014-07-01

9 control cases are subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity...in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on cosine correlation of row...of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity
Intrinsic Remediation Engineering Evaluation/Cost Analysis for Car Care Center at Bolling Air Force Base, Washington, DC

DTIC Science & Technology

1997-01-01

supplemented using established literature values for similar aquifer materials. The groundwater sampling activities and analytical results from both...subsurface materials recovered. Observed soil classification types compared very favorably to the soil classifications determined by the CPT tests. 0 2.1.5...other similar substances were handled in a manner consistent with accepted safety procedures and standard operating practices. Well completion materials
STREAM PROCESSING ALGORITHMS FOR DYNAMIC 3D SCENE ANALYSIS

DTIC Science & Technology

2018-02-15

23 9 Ground truth creation based on marked building feature points in two different views 50 frames apart in...between just two views, each row in the current figure represents a similar assessment however between one camera and all other cameras within the dataset...BA4S. While Fig. 44 depicted the epipolar lines for the point correspondences between just two views, the current figure represents a similar
Similarity based false-positive reduction for breast cancer using radiographic and pathologic imaging features

NASA Astrophysics Data System (ADS)

Pai, Akshay; Samala, Ravi K.; Zhang, Jianying; Qian, Wei

2010-03-01

Mammography reading by radiologists and breast tissue image interpretation by pathologists often lead to high False Positive (FP) rates. Similarly, current Computer Aided Diagnosis (CADx) methods tend to concentrate more on sensitivity, thus increasing the FP rates. A novel method is introduced here which employs a similarity-based approach to decrease the FP rate in the diagnosis of microcalcifications. This method employs Principal Component Analysis (PCA) and similarity metrics in order to achieve the proposed goal. The training and testing sets are divided into generalized (Normal and Abnormal) and more specific (Abnormal, Normal, Benign) classes. The performance of this method as a standalone classification system is evaluated in both cases (general and specific). In another approach, the probability of each case belonging to a particular class is calculated. If the probabilities are too close to classify, the augmented CADx system can be instructed to perform a detailed analysis of such cases. For normal cases with high probability, no further processing is necessary, thus reducing the computation time. Hence, this novel method can be employed in cascade with CADx to reduce the FP rate and also avoid unnecessary computational time. Using this methodology, false positive rates of 8% and 11% are achieved for mammography and cellular images, respectively.
Quantitative Outline-based Shape Analysis and Classification of Planetary Craterforms using Supervised Learning Models

NASA Astrophysics Data System (ADS)

Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric

2017-10-01

The shapes of craterforms on planetary surfaces provide rich information about their origins and evolution. While morphologic information provides rich visual clues to geologic processes and properties, the ability to quantitatively communicate this information is less easily accomplished. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach, to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphology, are classified with a 100% rate of success in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.
A cluster pattern algorithm for the analysis of multiparametric cell assays.

PubMed

Kaufman, Menachem; Bloch, David; Zurgil, Naomi; Shafran, Yana; Deutsch, Mordechai

2005-09-01

The issue of multiparametric analysis of complex single cell assays of both static and flow cytometry (SC and FC, respectively) has become common in recent years. In such assays, the analysis of changes, applying common statistical parameters and tests, often fails to detect significant differences between the investigated samples. The cluster pattern similarity (CPS) measure between two sets of gated clusters is based on computing the difference between their density distribution functions' set points. The CPS was applied for the discrimination between two observations in a four-dimensional parameter space. The similarity coefficient (r) ranges from 0 (perfect similarity) to 1 (dissimilar). Three CPS validation tests were carried out: on the same stock samples of fluorescent beads, yielding very low r values (0, 0.066); and on two cell models: mitogenic stimulation of peripheral blood mononuclear cells (PBMC), and apoptosis induction in the Jurkat T cell line by H2O2. In both latter cases, r indicated similarity (r < 0.23) within the same group, and dissimilarity (r > 0.48) otherwise. This classification and algorithm approach offers a measure of similarity between samples. It relies on the multidimensional pattern of the sample parameters. The algorithm compensates for environmental drifts in this apparatus and assay; it also may be applied to more than four dimensions.
Prediction of Protein Structural Classes for Low-Similarity Sequences Based on Consensus Sequence and Segmented PSSM.

PubMed

Liang, Yunyun; Liu, Sanyang; Zhang, Shengli

2015-01-01

Prediction of protein structural classes for low-similarity sequences is useful for understanding fold patterns, regulation, functions, and interactions of proteins. It is well known that feature extraction is significant to prediction of protein structural class and it mainly uses protein primary sequence, predicted secondary structure sequence, and position-specific scoring matrix (PSSM). Currently, prediction solely based on the PSSM has played a key role in improving the prediction accuracy. In this paper, we propose a novel method called CSP-SegPseP-SegACP by fusing consensus sequence (CS), segmented PsePSSM, and segmented autocovariance transformation (ACT) based on PSSM. Three widely used low-similarity datasets (1189, 25PDB, and 640) are adopted in this paper. Then a 700-dimensional (700D) feature vector is constructed and the dimension is decreased to 224D by using principal component analysis (PCA). To verify the performance of our method, rigorous jackknife cross-validation tests are performed on the 1189, 25PDB, and 640 datasets. Comparison of our results with the existing PSSM-based methods demonstrates that our method achieves favorable and competitive performance. This will offer an important complement to other PSSM-based methods for prediction of protein structural classes for low-similarity sequences.
Clusters of community exposure to coastal flooding hazards based on storm and sea level rise scenarios—implications for adaptation networks in the San Francisco Bay region

USGS Publications Warehouse

Hummel, Michelle; Wood, Nathan J.; Schweikert, Amy; Stacey, Mark T.; Jones, Jeanne; Barnard, Patrick L.; Erikson, Li H.

2018-01-01

Sea level is projected to rise over the coming decades, further increasing the extent of flooding hazards in coastal communities. Efforts to address potential impacts from climate-driven coastal hazards have called for collaboration among communities to strengthen the application of best practices. However, communities currently lack practical tools for identifying potential partner communities based on similar hazard exposure characteristics. This study uses statistical cluster analysis to identify similarities in community exposure to flooding hazards for a suite of sea level rise and storm scenarios. We demonstrate this approach using 63 jurisdictions in the San Francisco Bay region of California (USA) and compare 21 distinct exposure variables related to residents, employees, and structures for six hazard scenario combinations of sea level rise and storms. Results indicate that cluster analysis can provide an effective mechanism for identifying community groupings. Cluster compositions changed based on the selected societal variables and sea level rise scenarios, suggesting that a community could participate in multiple networks to target specific issues or policy interventions. The proposed clustering approach can serve as a data-driven foundation to help communities identify other communities with similar adaptation challenges and to enhance regional efforts that aim to facilitate adaptation planning and investment prioritization.
BIOSSES: a semantic sentence similarity estimation system for the biomedical domain.

PubMed

Sogancioglu, Gizem; Öztürk, Hakime; Özgür, Arzucan

2017-07-15

The amount of information available in textual format is rapidly increasing in the biomedical domain. Therefore, natural language processing (NLP) applications are becoming increasingly important to facilitate the retrieval and analysis of these data. Computing the semantic similarity between sentences is an important component in many NLP tasks including text retrieval and summarization. A number of approaches have been proposed for semantic sentence similarity estimation for generic English. However, our experiments showed that such approaches do not effectively cover biomedical knowledge and produce poor results for biomedical text. We propose several approaches for sentence-level semantic similarity computation in the biomedical domain, including string similarity measures and measures based on the distributed vector representations of sentences learned in an unsupervised manner from a large biomedical corpus. In addition, ontology-based approaches are presented that utilize general and domain-specific ontologies. Finally, a supervised regression based model is developed that effectively combines the different similarity computation metrics. A benchmark data set consisting of 100 sentence pairs from the biomedical literature is manually annotated by five human experts and used for evaluating the proposed methods. The experiments showed that the supervised semantic sentence similarity computation approach obtained the best performance (0.836 correlation with gold standard human annotations) and improved over the state-of-the-art domain-independent systems by up to 42.6% in terms of the Pearson correlation metric. A web-based system for biomedical semantic sentence similarity computation, the source code, and the annotated benchmark data set are available at: http://tabilab.cmpe.boun.edu.tr/BIOSSES/ . Contact: gizemsogancioglu@gmail.com or arzucan.ozgur@boun.edu.tr. © The Author 2017. Published by Oxford University Press. All rights reserved.
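The evaluation idea in this record (score sentence pairs, then compare the scores with human annotations via Pearson correlation) can be sketched as follows. The token-level Jaccard measure is only one example of a string similarity measure and is an assumption here; the abstract does not say which string measures BIOSSES uses, and the function names are hypothetical.

```python
import math


def jaccard_similarity(sentence_a, sentence_b):
    """Token-set overlap: |A intersect B| / |A union B| on lowercased tokens."""
    tokens_a = set(sentence_a.lower().split())
    tokens_b = set(sentence_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty sentences are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / (spread_x * spread_y)
```

A system would be scored by calling `pearson(system_scores, human_scores)` over all annotated sentence pairs; a value near 1 means the system ranks and spaces the pairs much as the annotators did.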
A statistical physics perspective on alignment-independent protein sequence comparison.

PubMed

Chattopadhyay, Amit K; Nasiev, Diar; Flower, Darren R

2015-08-01

Within bioinformatics, the textual alignment of amino acid sequences has long dominated the determination of similarity between proteins, with all that implies for shared structure, function and evolutionary descent. Despite the relative success of modern-day sequence alignment algorithms, so-called alignment-free approaches offer a complementary means of determining and expressing similarity, with potential benefits in certain key applications, such as regression analysis of protein structure-function studies, where alignment-based similarity has performed poorly. Here, we offer a fresh, statistical physics-based perspective focusing on the question of alignment-free comparison, in the process adapting results from 'first passage probability distribution' to summarize statistics of ensemble averaged amino acid propensity values. In this article, we introduce and elaborate this approach. © The Author 2015. Published by Oxford University Press.
Open Reading Frame Phylogenetic Analysis on the Cloud

PubMed Central

2013-01-01

Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus. PMID:23671843
NATO Guide for Judgement-Based Operational Analysis in Defence Decision Making (Guide OTAN pour l’analyse operationnelle basee sur le jugement dans la prise de decision de defense). Analyst-Oriented Volume: Code of Best Practice for Soft Operational Analysis

DTIC Science & Technology

2012-06-01

Military Operational Research, with special theme ‘The use of ‘soft’ methods in OR’. OR52 (7 – 9 September 2010, Royal Holloway University of London...on human judgement. Judgement-based OA applies the methods of ‘Soft Operational Research’ developed in academia. It has appeared, however, that the...similarity between judgemental methods in operational research practice and a number of other modes of professional analytical practice. The closest
Molecular analysis of the bacterial microbiome in the forestomach fluid from the dromedary camel (Camelus dromedarius).

PubMed

Bhatt, Vaibhav D; Dande, Suchitra S; Patil, Nitin V; Joshi, Chaitanya G

2013-04-01

Rumen microorganisms play an important role in ruminant digestion and absorption of nutrients and have great potential applications in the fields of rumen adjustment, food fermentation, and biomass utilization. In order to investigate the composition of microorganisms in the rumen of the camel (Camelus dromedarius), this study examines microbial diversity using a culture-independent approach. It includes a comparison of the rumen samples investigated in the present study with other currently available metagenomes to reveal potential differences in rumen microbial systems. Pyrosequencing-based metagenomics was applied to analyze phylogenetic and metabolic profiles with MG-RAST, a web-based tool. Pyrosequencing of the camel rumen sample yielded 8,979,755 nucleotides assembled to 41,905 sequence reads with an average read length of 214 nucleotides. Taxonomic analysis of metagenomic reads indicated Bacteroidetes (55.5 %), Firmicutes (22.7 %) and Proteobacteria (9.2 %) phyla as the predominant camel rumen taxa. At a finer phylogenetic resolution, Bacteroides species dominated the camel rumen metagenome. Functional analysis revealed that clustering-based subsystems and carbohydrate metabolism were the most abundant SEED subsystems, representing 17 and 13 % of the camel metagenome, respectively. A high taxonomic and functional similarity of the camel rumen was found with the cow metagenome, which is not surprising given that both are mammalian herbivores with similar digestive tract structures and functions. The combined pyrosequencing approach and the subsystems-based annotations available in the SEED database allowed us to understand the metabolic potential of these microbiomes. Altogether, these data suggest that agricultural and animal husbandry practices can impose significant selective pressures on the rumen microbiota regardless of rumen type. The present study provides a baseline for understanding the complexity of camel rumen microbial ecology while also highlighting striking similarities and differences when compared to other animal gastrointestinal environments.

Characterisation of two-stage ignition in diesel engine-relevant thermochemical conditions using direct numerical simulation

DOE PAGES

Krisman, Alex; Hawkes, Evatt R.; Talei, Mohsen; ...

2016-08-30

With the goal of providing a more detailed fundamental understanding of ignition processes in diesel engines, this study reports analysis of a direct numerical simulation (DNS) database. In the DNS, a pseudo turbulent mixing layer of dimethyl ether (DME) at 400 K and air at 900 K is simulated at a pressure of 40 atmospheres. At these conditions, DME exhibits a two-stage ignition and resides within the negative temperature coefficient (NTC) regime of ignition delay times, similar to diesel fuel. The analysis reveals a complex ignition process with several novel features. Autoignition occurs as a distributed, two-stage event. The high-temperature stage of ignition establishes edge flames that have a hybrid premixed/autoignition flame structure similar to that previously observed for lifted laminar flames at similar thermochemical conditions. In conclusion, a combustion mode analysis based on key radical species illustrates the multi-stage and multi-mode nature of the ignition process and highlights the substantial modelling challenge presented by diesel combustion.
Genome-wide analysis of the homeodomain-leucine zipper (HD-ZIP) gene family in peach (Prunus persica).

PubMed

Zhang, C H; Ma, R J; Shen, Z J; Sun, X; Korir, N K; Yu, M L

2014-04-08

In this study, 33 homeodomain-leucine zipper (HD-ZIP) genes were identified in peach using the HD-ZIP amino acid sequences of Arabidopsis thaliana as a probe. Based on the phylogenetic analysis and the individual gene or protein characteristics, the HD-ZIP gene family in peach can be classified into 4 subfamilies, HD-ZIP I, II, III, and IV, containing 14, 7, 4, and 8 members, respectively. The most closely related peach HD-ZIP members within the same subfamilies shared very similar gene structure in terms of either intron/exon numbers or lengths. Almost all members of the same subfamily shared common motif compositions, thereby implying that the HD-ZIP proteins within the same subfamily may have functional similarity. The 33 peach HD-ZIP genes were distributed across scaffolds 1 to 7. Although the primary structure varied among HD-ZIP family proteins, their tertiary structures were similar. The results from this study will be useful in selecting candidate genes from specific subfamilies for functional analysis.
[Study on action mechanism and material base of compound Danshen dripping pills in treatment of carotid atherosclerosis based on techniques of gene expression profile and molecular fingerprint].

PubMed

Zhou, Wei; Song, Xiang-gang; Chen, Chao; Wang, Shu-mei; Liang, Sheng-wang

2015-08-01

The action mechanism and material base of compound Danshen dripping pills in the treatment of carotid atherosclerosis were investigated based on gene expression profiles and molecular fingerprints. First, gene expression profiles of human atherosclerotic carotid artery tissues and histologically normal tissues were collected and screened using significance analysis of microarray (SAM) to identify differentially expressed genes; the differential genes were then analyzed by Gene Ontology (GO) analysis and KEGG pathway analysis. To avoid missing genes whose differential expression is not outstanding but which are biologically important, Gene Set Enrichment Analysis (GSEA) was performed, and 7 chemical ingredients with higher negative enrichment scores were obtained by the Cmap method, implying that they could reversely regulate the gene expression profiles of pathological tissues. Last, based on the hypothesis that similar structures have similar activities, 336 ingredients of compound Danshen dripping pills were compared with the 7 drug molecules using a 2D molecular fingerprint method. The results showed that 147 differential genes, including 60 up-regulated genes and 87 down-regulated genes, were screened out by SAM. In GO analysis, Biological Process (BP) is mainly concerned with biological adhesion, response to wounding and inflammatory response; Cellular Component (CC) is mainly concerned with extracellular region, extracellular space and plasma membrane; while Molecular Function (MF) is mainly concerned with antigen binding, metalloendopeptidase activity and peptide binding. KEGG pathway analysis is mainly concerned with the JAK-STAT, RIG-I like receptor and PPAR signaling pathways. There were 10 compounds, such as hexadecane, with Tanimoto coefficients greater than 0.85, which implied that they may be the active ingredients (AIs) of compound Danshen dripping pills in the treatment of carotid atherosclerosis (CAs). The present method can be applied to research on the material base and molecular action mechanism of TCM.
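The fingerprint comparison in this record relies on the Tanimoto coefficient with a cutoff of 0.85. A minimal sketch follows, assuming each 2D fingerprint is represented as the set of its "on" bit positions; the function names, the set representation, and the toy fingerprints are illustrative assumptions, not the authors' implementation.

```python
def tanimoto(fingerprint_a, fingerprint_b):
    """Tanimoto coefficient: shared on-bits divided by all on-bits in either."""
    bits_a, bits_b = set(fingerprint_a), set(fingerprint_b)
    union_size = len(bits_a | bits_b)
    if union_size == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union_size


def similar_ingredients(ingredients, references, threshold=0.85):
    """Return (name, best score) for ingredients exceeding the threshold
    against at least one reference molecule."""
    hits = []
    for name, fingerprint in ingredients.items():
        best = max(
            (tanimoto(fingerprint, ref) for ref in references.values()),
            default=0.0,
        )
        if best > threshold:
            hits.append((name, best))
    return hits


# Hypothetical fingerprints given as sets of on-bit indices.
reference_drugs = {"reference_1": set(range(19))}
candidate_ingredients = {
    "ingredient_a": set(range(20)),   # shares 19 of 20 bits with reference_1
    "ingredient_b": {0, 1, 40, 41},   # shares only 2 bits
}
candidate_hits = similar_ingredients(candidate_ingredients, reference_drugs)
```

Only `ingredient_a` passes the cutoff in this toy input. In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets.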
Effectiveness of human anatomy education for pharmacy students via the Internet.

PubMed

Limpach, Aimee L; Bazrafshan, Parham; Turner, Paul D; Monaghan, Michael S

2008-12-15

To evaluate the overall effectiveness of a human anatomy course taught to distance-based and campus-based pharmacy students. A retrospective analysis of students' grades and course evaluations from 2003 through 2006 was conducted. No significant differences in student performance by pathway were found for the 2003-2005 academic years (p > 0.05). However, distance-based students' percentage and letter grades were significantly higher in 2006 (p = 0.013 and p = 0.004, respectively). Comparison of course and instructor evaluations showed that students in the distance course held similar or more positive perceptions of the course than their campus peers. Similar performance by campus and distance students enrolled in a human anatomy course suggests that a distance-based course can be used successfully to teach human anatomy to pharmacy students.
Comparison of Atmospheric Pressure Chemical Ionization and Field Ionization Mass Spectrometry for the Analysis of Large Saturated Hydrocarbons.

PubMed

Jin, Chunfen; Viidanoja, Jyrki; Li, Mingzhe; Zhang, Yuyang; Ikonen, Elias; Root, Andrew; Romanczyk, Mark; Manheim, Jeremy; Dziekonski, Eric; Kenttämaa, Hilkka I

2016-11-01

Direct infusion atmospheric pressure chemical ionization mass spectrometry (APCI-MS) was compared to field ionization mass spectrometry (FI-MS) for the determination of hydrocarbon class distributions in lubricant base oils. When positive ion mode APCI with oxygen as the ion source gas was employed to ionize saturated hydrocarbon model compounds (M) in hexane, only stable [M - H]+ ions were produced. Ion-molecule reaction studies performed in a linear quadrupole ion trap suggested that fragment ions of ionized hexane can ionize saturated hydrocarbons via hydride abstraction with minimal fragmentation. Hence, APCI-MS shows potential as an alternative to FI-MS in lubricant base oil analysis. Indeed, the APCI-MS method gave similar average molecular weights and hydrocarbon class distributions as FI-MS for three lubricant base oils. However, the reproducibility of the APCI-MS method was found to be substantially better than that of FI-MS. The paraffinic content determined using the APCI-MS and FI-MS methods for the base oils was similar. The average number of carbons in paraffinic chains followed the same increasing trend from low viscosity to high viscosity base oils for the two methods.
Label-free integrative pharmacology on-target of drugs at the β2-adrenergic receptor

NASA Astrophysics Data System (ADS)

Ferrie, Ann M.; Sun, Haiyan; Fang, Ye

2011-07-01

We describe a label-free integrative pharmacology on-target (iPOT) method to assess the pharmacology of drugs at the β2-adrenergic receptor. This method combines dynamic mass redistribution (DMR) assays using an array of probe molecule-hijacked cells with similarity analysis. The whole cell DMR assays track cell system-based, ligand-directed, and kinetics-dependent biased activities of the drugs, and translate their on-target pharmacology into numerical descriptors which are subject to similarity analysis. We demonstrate that the approach establishes an effective link between the label-free pharmacology and in vivo therapeutic indications of drugs.
Sensitivity and Uncertainty Analysis of the GFR MOX Fuel Subassembly

NASA Astrophysics Data System (ADS)

Lüley, J.; Vrban, B.; Čerba, Š.; Haščík, J.; Nečas, V.; Pelloni, S.

2014-04-01

We performed sensitivity and uncertainty analysis as well as benchmark similarity assessment of the MOX fuel subassembly designed for the Gas-Cooled Fast Reactor (GFR) as a representative material of the core. Material composition was defined for each assembly ring separately allowing us to decompose the sensitivities not only for isotopes and reactions but also for spatial regions. This approach was confirmed by direct perturbation calculations for chosen materials and isotopes. Similarity assessment identified only ten partly comparable benchmark experiments that can be utilized in the field of GFR development. Based on the determined uncertainties, we also identified main contributors to the calculation bias.
A New Classification of Diabetic Gait Pattern Based on Cluster Analysis of Biomechanical Data

PubMed Central

Sawacha, Zimi; Guarneri, Gabriella; Avogaro, Angelo; Cobelli, Claudio

2010-01-01

Background: The diabetic foot, one of the most serious complications of diabetes mellitus and a major risk factor for plantar ulceration, is determined mainly by peripheral neuropathy. Neuropathic patients exhibit decreased stability while standing as well as during dynamic conditions. A new methodology for diabetic gait pattern classification based on cluster analysis has been proposed that aims to identify groups of subjects with similar patterns of gait and verify if three-dimensional gait data are able to distinguish diabetic gait patterns from those of the control subjects. Method: The gait of 20 nondiabetic individuals and 46 diabetes patients with and without peripheral neuropathy was analyzed [mean age 59.0 (2.9) and 61.1 (4.4) years, mean body mass index (BMI) 24.0 (2.8) and 26.3 (2.0)]. K-means cluster analysis was applied to classify the subjects' gait patterns through the analysis of their ground reaction forces, joint and segment (trunk, hip, knee, ankle) angles, and moments. Results: Cluster analysis classification led to the definition of four well-separated clusters: one aggregating just neuropathic subjects, one aggregating both neuropathics and non-neuropathics, one including only diabetes patients, and one including either controls or diabetic and neuropathic subjects. Conclusions: Cluster analysis was useful in grouping subjects with similar gait patterns and provided evidence that there were subgroups that might otherwise not be observed if a group ensemble was presented for any specific variable. In particular, we observed the presence of neuropathic subjects with a gait similar to the controls and diabetes patients with a long disease duration with a gait as altered as the neuropathic one. PMID:20920432
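The K-means step named in this record can be sketched with Lloyd's algorithm. This is a minimal illustration under stated assumptions: each subject is reduced to a short numeric feature vector, the first k points seed the centroids (a deterministic choice made here for reproducibility, not the study's initialization), and the toy data are hypothetical rather than gait measurements.

```python
import math


def kmeans(points, k, max_iterations=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the assignment stops changing."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [list(p) for p in points[:k]]  # assumption: seed with first k points
    assignment = None
    for _ in range(max_iterations):
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [p for p, label in zip(points, assignment) if label == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical two-feature vectors forming two obvious groups.
toy_points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
toy_labels, toy_centroids = kmeans(toy_points, k=2)
```

Real biomechanical features would need to be scaled to comparable ranges first, since Euclidean distance lets large-valued variables dominate the grouping.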
Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls

PubMed Central

Liu, Li; Sabo, Aniko; Neale, Benjamin M.; Nagaswamy, Uma; Stevens, Christine; Lim, Elaine; Bodea, Corneliu A.; Muzny, Donna; Reid, Jeffrey G.; Banks, Eric; Coon, Hillary; DePristo, Mark; Dinh, Huyen; Fennel, Tim; Flannick, Jason; Gabriel, Stacey; Garimella, Kiran; Gross, Shannon; Hawes, Alicia; Lewis, Lora; Makarov, Vladimir; Maguire, Jared; Newsham, Irene; Poplin, Ryan; Ripke, Stephan; Shakir, Khalid; Samocha, Kaitlin E.; Wu, Yuanqing; Boerwinkle, Eric; Buxbaum, Joseph D.; Cook, Edwin H.; Devlin, Bernie; Schellenberg, Gerard D.; Sutcliffe, James S.; Daly, Mark J.; Gibbs, Richard A.; Roeder, Kathryn

2013-01-01

We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analyses of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD. PMID:23593035
Psychometric analysis in support of shortening the Scale for the Assessment of Negative Symptoms.

PubMed

Levine, Stephen Z; Leucht, Stefan

2013-09-01

Despite recent emphasis on the measurement and treatment of negative symptoms, studies of the Scale for the Assessment of Negative Symptoms (SANS) identify different symptom clusters, offer mixed support for its psychometric properties and suggest that it be shortened. The current study objective is to examine the psychometric properties of the SANS and the feasibility of a short research version of the SANS. Data were re-analyzed from three clinical trials that compared placebo and amisulpride for up to 60 days. Participants had chronic schizophrenia and predominantly negative symptoms (n=487). Baseline data were examined with exploratory factor analysis and Item Response Theory (IRT) to identify a short SANS. The short and original SANS were compared with confirmatory factor analysis at endpoint and, on symptom response, with mixed modeling. Results showed that at baseline the SANS consisted of three factors labeled Affective-flattening, Asociality and Alogia-inattentiveness. IRT suggested a short SANS with 11 items and 3 response options. Comparisons of the original and short SANS showed: the short version was a better fit to the data based on confirmatory factor analysis at endpoint; similar significant (p<.001) correlations between the baseline and subsequent scores; similar reliability; and similar significance (p<.05) on response based on mixed modeling. It is concluded that a short SANS is feasible to assess predominantly negative symptoms in chronic schizophrenia in research settings. Copyright © 2012 Elsevier B.V. and ECNP. All rights reserved.
Smart Extraction and Analysis System for Clinical Research.

PubMed

Afzal, Muhammad; Hussain, Maqbool; Khan, Wajahat Ali; Ali, Taqdir; Jamshed, Arif; Lee, Sungyoung

2017-05-01

With the increasing use of electronic health records (EHRs), there is a growing need to expand the utilization of EHR data to support clinical research. The key challenge in achieving this goal is the unavailability of smart systems and methods to overcome the issue of data preparation, structuring, and sharing for smooth clinical research. We developed a robust analysis system called the smart extraction and analysis system (SEAS) that consists of two subsystems: (1) the information extraction system (IES), for extracting information from clinical documents, and (2) the survival analysis system (SAS), for a descriptive and predictive analysis to compile the survival statistics and predict the future chance of survivability. The IES subsystem is based on a novel permutation-based pattern recognition method that extracts information from unstructured clinical documents. Similarly, the SAS subsystem is based on a classification and regression tree (CART)-based prediction model for survival analysis. SEAS is evaluated and validated on a real-world case study of head and neck cancer. The overall information extraction accuracy of the system for semistructured text is recorded at 99%, while that for unstructured text is 97%. Furthermore, the automated, unstructured information extraction has reduced the average time spent on manual data entry by 75%, without compromising the accuracy of the system. Moreover, around 88% of patients are found in a terminal or dead state for the highest clinical stage of disease (level IV). Similarly, there is an ∼36% probability of a patient being alive if at least one of the lifestyle risk factors was positive. We presented our work on the development of SEAS to replace costly and time-consuming manual methods with smart automatic extraction of information and survival prediction methods. SEAS has reduced the time and energy of human resources spent unnecessarily on manual tasks.
A transversal approach to predict gene product networks from ontology-based similarity

PubMed Central

Chabalier, Julie; Mosser, Jean; Burgun, Anita

2007-01-01

Background Interpretation of transcriptomic data is usually made through a "standard" approach which consists in clustering the genes according to their expression patterns and exploiting Gene Ontology (GO) annotations within each expression cluster. This approach makes it difficult to underline functional relationships between gene products that belong to different expression clusters. To address this issue, we propose a transversal analysis that aims to predict functional networks based on a combination of GO processes and data expression. Results The transversal approach presented in this paper consists in computing the semantic similarity between gene products in a Vector Space Model. Through a weighting scheme over the annotations, we take into account the representativity of the terms that annotate a gene product. Comparing annotation vectors results in a matrix of gene product similarities. Combined with expression data, the matrix is displayed as a set of functional gene networks. The transversal approach was applied to 186 genes related to the enterocyte differentiation stages. This approach resulted in 18 functional networks proved to be biologically relevant. These results were compared with those obtained through a standard approach and with an approach based on information content similarity. Conclusion Complementary to the standard approach, the transversal approach offers new insight into the cellular mechanisms and reveals new research hypotheses by combining gene product networks based on semantic similarity, and data expression. PMID:17605807
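The vector-space comparison described in the abstract above can be illustrated with a minimal sketch. It assumes an IDF-style weight for each annotation term (the abstract does not state the actual weighting scheme), and the gene and term names are hypothetical placeholders rather than data from the study.

```python
import math
from collections import Counter


def idf_weights(annotations):
    """Weight each term by how rare it is across gene products (assumed scheme)."""
    n = len(annotations)
    doc_freq = Counter(t for terms in annotations.values() for t in set(terms))
    return {t: math.log(n / count) + 1.0 for t, count in doc_freq.items()}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_matrix(annotations):
    """Return sorted gene names and the matrix of pairwise cosine similarities."""
    weights = idf_weights(annotations)
    vectors = {g: {t: weights[t] for t in set(terms)}
               for g, terms in annotations.items()}
    genes = sorted(vectors)
    matrix = [[cosine(vectors[a], vectors[b]) for b in genes] for a in genes]
    return genes, matrix


# Hypothetical annotations: gene product -> list of annotation terms.
example = {
    "geneA": ["term1", "term2"],
    "geneB": ["term1", "term2"],
    "geneC": ["term3"],
}
genes, matrix = similarity_matrix(example)
for gene, row in zip(genes, matrix):
    print(gene, [round(x, 3) for x in row])
```

Gene products with identical annotation sets score close to 1 and those with no shared terms score 0; combining the matrix with expression data, as the abstract describes, is outside the scope of this sketch.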
Differentiation of bacterial feeding nematodes in soil ecological studies by means of arbitrarily primed PCR

USGS Publications Warehouse

Van Der Knaap, Esther; Rodriguez, Russell J.; Freckman, Diana W.

1993-01-01

Arbitrarily-primed polymerase chain reaction (ap-PCR) was used to differentiate closely related bacterial-feeding nematodes of the genera Caenorhabditis, Acrobeloides, Cephalobus and Zeldia. Average percentage similarity of bands generated by ap-PCR with seven different primers between 14 isolates of Caenorhabditis elegans was > 90%, whereas between C. elegans, C. briggsae and C. remanei similarity was < 20%. Based on intra- and inter-specific similarity between Caenorhabditis isolates, analysis of Acrobeloides, Cephalobus and Zeldia isolates revealed either similar or different genotypes. Distinct genotypes were verified by morphological analyses. In addition, the genotypes obtained from single egg-derived nematode populations were also obtained from ap-PCR analysis of single worms. Due to the difficulty of identification of soil nematodes, the ap-PCR offers potential as a rapid and reliable technique to assess biodiversity. Ap-PCR will make it feasible, for the first time, to study the ecological interactions of unique nematode genotypes in soil habitats.
Basfia succiniciproducens gen. nov., sp. nov., a new member of the family Pasteurellaceae isolated from bovine rumen.

PubMed

Kuhnert, Peter; Scholten, Edzard; Haefner, Stefan; Mayor, Désirée; Frey, Joachim

2010-01-01

Gram-negative, coccoid, non-motile bacteria that are catalase-, urease- and indole-negative, facultatively anaerobic and oxidase-positive were isolated from the bovine rumen using an improved selective medium for members of the Pasteurellaceae. All strains produced significant amounts of succinic acid under anaerobic conditions with glucose as substrate. Phenotypic characterization and multilocus sequence analysis (MLSA) using 16S rRNA, rpoB, infB and recN genes were performed on seven independent isolates. All four genes showed high sequence similarity to their counterparts in the genome sequence of the patent strain MBEL55E, but less than 95 % 16S rRNA gene sequence similarity to any other species of the Pasteurellaceae. Genetically these strains form a very homogeneous group in individual as well as combined phylogenetic trees, clearly separated from other genera of the family from which they can also be separated based on phenotypic markers. Genome relatedness as deduced from the recN gene showed high interspecies similarities, but again low similarity to any of the established genera of the family. No toxicity towards bovine, human or fish cells was observed and no RTX toxin genes were detected in members of the new taxon. Based on phylogenetic clustering in the MLSA analysis, the low genetic similarity to other genera and the phenotypic distinction, we suggest to classify these bovine rumen isolates as Basfia succiniciproducens gen. nov., sp. nov. The type strain is JF4016(T) (=DSM 22022(T) =CCUG 57335(T)).
What computational non-targeted mass spectrometry-based metabolomics can gain from shotgun proteomics.

PubMed

Hamzeiy, Hamid; Cox, Jürgen

2017-02-01

Computational workflows for mass spectrometry-based shotgun proteomics and untargeted metabolomics share many steps. Despite the similarities, untargeted metabolomics is lagging behind in terms of reliable fully automated quantitative data analysis. We argue that metabolomics will strongly benefit from the adaptation of successful automated proteomics workflows to metabolomics. MaxQuant is a popular platform for proteomics data analysis and is widely considered to be superior in achieving high precursor mass accuracies through advanced nonlinear recalibration, usually leading to five to ten-fold better accuracy in complex LC-MS/MS runs. This translates to a sharp decrease in the number of peptide candidates per measured feature, thereby strongly improving the coverage of identified peptides. We argue that similar strategies can be applied to untargeted metabolomics, leading to equivalent improvements in metabolite identification. Copyright © 2016 The Author(s). Published by Elsevier Ltd. All rights reserved.
Mechlorethamine-based drug structures for intervention of central nervous system tumors.

PubMed

Bartzatt, Ronald

2013-06-01

Tumors of the central nervous system are the third most common type of childhood cancers. Brain tumors occur in children and adults; however, pediatric patients require a different treatment process. Thirteen drugs similar to mechlorethamine are analyzed in this study. These drugs possess molecular properties enabling substantial and successful access to tumors of the central nervous system. All drugs exhibit zero violations of the Rule of 5, which indicates favorable bioavailability. Ranges in Log P, formula weight, and polar surface area for these drugs are: 1.554 to 3.52, 156.06 to 460.45, and 3.238 Å² to 45.471 Å², respectively. Hierarchical cluster analysis determined that agents 7 and 12 are most similar to the parent compound mechlorethamine. The mean values of Log P, formula weight, polar surface area, and molecular volume are 2.25, 268.51, 16.57 Å², and 227.01 Å³, respectively. Principal component analysis indicates that agents 7 and 12 are most similar to mechlorethamine and multiple regression analysis of molecular properties produced a model to enable the design of similar alkylating agents. Values of Log (Cbrain/Cblood) indicate these agents will have very high permeation into the central nervous system.
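As a small illustration of the Rule of 5 screen mentioned in the abstract above, the sketch below counts violations using the commonly cited Lipinski thresholds. The Log P and formula weight are the mean values quoted in the abstract; the hydrogen-bond donor and acceptor counts are hypothetical placeholders, since the abstract does not report them.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count violations of Lipinski's Rule of 5 for one compound."""
    checks = [
        mol_weight > 500,   # molecular weight above 500 Da
        log_p > 5,          # octanol-water Log P above 5
        h_donors > 5,       # more than 5 hydrogen-bond donors
        h_acceptors > 10,   # more than 10 hydrogen-bond acceptors
    ]
    return sum(checks)


# Mean Log P and formula weight from the abstract; donor and acceptor
# counts are hypothetical placeholders used only for illustration.
violations = rule_of_five_violations(
    mol_weight=268.51, log_p=2.25, h_donors=1, h_acceptors=3
)
print("Rule of 5 violations:", violations)
```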
A Corpus-Based View of Lexical Gender in Written Business English

ERIC Educational Resources Information Center

Fuertes-Olivera, Pedro A.

2007-01-01

This article investigates lexical gender in specialized communication. The key method of analysis is that of forms of address, professional titles, and "generic man" in a 10 million word corpus of written Business English. After a brief introduction and literature review on both gender in specialized communication and similar corpus-based views of…
Analysis of the Microstructure of Titles in the INSPEC Data-Base

ERIC Educational Resources Information Center

Lynch, Michael F.; And Others

1973-01-01

A high degree of constancy has been found in the microstructure of titles of samples of the INSPEC data base taken over a three-year period. Character and digram frequencies are relatively stable, while variable-length character-strings characterizing samples separated by three years in time show close similarities. (2 references) (Author/SJ)
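A minimal sketch of how character and digram frequencies of titles might be tabulated follows. Upper-casing the text and counting digrams within each title only are assumptions made here; the sample titles are placeholders, not INSPEC records.

```python
from collections import Counter


def char_and_digram_counts(titles):
    """Count single characters and adjacent character pairs (digrams)."""
    chars, digrams = Counter(), Counter()
    for title in titles:
        text = title.upper()  # assumption: case is folded before counting
        chars.update(text)
        digrams.update(text[i:i + 2] for i in range(len(text) - 1))
    return chars, digrams


def relative(counter):
    """Convert raw counts to relative frequencies."""
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()} if total else {}


# Placeholder titles for illustration only.
sample = ["Analysis of titles", "Microstructure of titles"]
chars, digrams = char_and_digram_counts(sample)
print(digrams.most_common(5))
```

Comparing the `relative` frequency tables from samples taken in different years is one simple way to check the kind of stability the abstract reports.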
Landowner attitudes and perceptions of forest and wildlife management in rural northern Missouri

Treesearch

Brian E. Schweiss; John Dwyer

2008-01-01

Improving Missouri's forest lands depends on private landowners. Cluster analysis was used to combine nonindustrial private forest landowners with similar interests based on attitudinal information gathered from a mail questionnaire to forest landowners in Macon County, MO. Clusters were analyzed based on objective data gathered in the questionnaire. Seven types of...
Chemical Fingerprint Analysis and Quantitative Analysis of Rosa rugosa by UPLC-DAD.

PubMed

Mansur, Sanawar; Abdulla, Rahima; Ayupbec, Amatjan; Aisa, Haji Akbar

2016-12-21

A method based on ultra performance liquid chromatography with a diode array detector (UPLC-DAD) was developed for quantitative analysis of five active compounds and chemical fingerprint analysis of Rosa rugosa. Ten batches of R. rugosa collected from different plantations in the Xinjiang region of China were used to establish the fingerprint. The feasibility and advantages of the UPLC fingerprint were verified for its similarity evaluation by systematically comparing chromatograms with professional analytical software recommended by the State Food and Drug Administration (SFDA) of China. In quantitative analysis, the five compounds showed good regression (R² = 0.9995) within the test ranges, and the recovery of the method was in the range of 94.2%-103.8%. The similarities of liquid chromatography fingerprints of 10 batches of R. rugosa were more than 0.981. The developed UPLC fingerprint method is simple, reliable, and validated for the quality control and identification of R. rugosa. Additionally, simultaneous quantification of five major bioactive ingredients in the R. rugosa samples was conducted to interpret the consistency of the quality test. The results indicated that the UPLC fingerprint, as a characteristic distinguishing method combining similarity evaluation and quantification analysis, can be successfully used to assess the quality and to identify the authenticity of R. rugosa.

EHME: a new word database for research in Basque language.

PubMed

Acha, Joana; Laka, Itziar; Landa, Josu; Salaburu, Pello

2014-11-14

This article presents EHME, the frequency dictionary of Basque structure, an online program that enables researchers in psycholinguistics to extract word and nonword stimuli, based on a broad range of statistics concerning the properties of Basque words. The database consists of 22.7 million tokens, and properties available include morphological structure frequency and word-similarity measures, apart from classical indexes: word frequency, orthographic structure, orthographic similarity, bigram and biphone frequency, and syllable-based measures. Measures are indexed at the lemma, morpheme and word level. We include reliability and validation analysis. The application is freely available, and enables the user to extract words based on concrete statistical criteria, as well as to obtain statistical characteristics from a list of words.
Computational Aeroacoustic Analysis of Slat Trailing-Edge Flow

NASA Technical Reports Server (NTRS)

Singer, Bart A.; Lockard, David P.; Brentner, Kenneth S.; Khorrami, Mehdi R.; Berkman, Mert E.; Choudhari, Meelan

2000-01-01

An acoustic analysis based on the Ffowcs Williams and Hawkings equation was performed for a high-lift system. As input, the acoustic analysis used unsteady flow data obtained from a highly resolved, time-dependent, Reynolds-averaged Navier-Stokes calculation. The analysis strongly suggests that vortex shedding from the trailing edge of the slat results in a high-amplitude, high-frequency acoustic signal, similar to that which was observed in a corresponding experimental study of the high-lift system.
simDEF: definition-based semantic similarity measure of gene ontology terms for functional similarity analysis of genes.

PubMed

Pesaranghader, Ahmad; Matwin, Stan; Sokolova, Marina; Beiko, Robert G

2016-05-01

Measures of protein functional similarity are essential tools for function prediction, evaluation of protein-protein interactions (PPIs) and other applications. Several existing methods perform comparisons between proteins based on the semantic similarity of their GO terms; however, these measures are highly sensitive to modifications in the topological structure of GO, tend to be focused on specific analytical tasks and concentrate on the GO terms themselves rather than considering their textual definitions. We introduce simDEF, an efficient method for measuring semantic similarity of GO terms using their GO definitions, which is based on the Gloss Vector measure commonly used in natural language processing. The simDEF approach builds optimized definition vectors for all relevant GO terms, and expresses the similarity of a pair of proteins as the cosine of the angle between their definition vectors. Relative to existing similarity measures, when validated on a yeast reference database, simDEF improves correlation with sequence homology by up to 50%, shows a correlation improvement >4% with gene expression in the biological process hierarchy of GO and increases PPI predictability by > 2.5% in F1 score for molecular function hierarchy. Datasets, results and source code are available at http://kiwi.cs.dal.ca/Software/simDEF CONTACT: ahmad.pgh@dal.ca or beiko@cs.dal.ca Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Fold independent structural comparisons of protein-ligand binding sites for exploring functional relationships.

PubMed

Gold, Nicola D; Jackson, Richard M

2006-02-03

The rapid growth in protein structural data and the emergence of structural genomics projects have increased the need for automatic structure analysis and tools for function prediction. Small molecule recognition is critical to the function of many proteins; therefore, determination of ligand binding site similarity is important for understanding ligand interactions and may allow their functional classification. Here, we present a binding sites database (SitesBase) that, given a known protein-ligand binding site, allows rapid retrieval of other binding sites with similar structure independent of overall sequence or fold similarity. However, each match is also annotated with sequence similarity and fold information to aid interpretation of structure and functional similarity. Similarity in ligand binding sites can indicate common binding modes and recognition of similar molecules, allowing potential inference of function for an uncharacterised protein or providing additional evidence of common function where sequence or fold similarity is already known. Alternatively, the resource can provide valuable information for detailed studies of molecular recognition including structure-based ligand design and in understanding ligand cross-reactivity. Here, we show examples of atomic similarity between superfamily or more distant fold relatives as well as between seemingly unrelated proteins. Assignment of unclassified proteins to structural superfamilies is also undertaken and in most cases substantiates assignments made using sequence similarity. Correct assignment is also possible where sequence similarity fails to find significant matches, illustrating the potential use of binding site comparisons for newly determined proteins.
[A retrieval method of drug molecules based on graph collapsing].

PubMed

Qu, J W; Lv, X Q; Liu, Z M; Liao, Y; Sun, P H; Wang, B; Tang, Z

2018-04-18

To establish a compact and efficient hypergraph representation and a graph-similarity-based retrieval method of molecules to achieve effective and efficient medicine information retrieval. Chemical structural formula (CSF) was a primary search target as a unique and precise identifier for each compound at the molecular level in the research field of medicine information retrieval. To retrieve medicine information effectively and efficiently, a complete workflow of the graph-based CSF retrieval system was introduced. This system accepted the photos taken from smartphones and the sketches drawn on tablet personal computers as CSF inputs, and formalized the CSFs with the corresponding graphs. Then this paper proposed a compact and efficient hypergraph representation for molecules on the basis of analyzing factors that directly affected the efficiency of graph matching. According to the characteristics of CSFs, a hierarchical collapsing method combining graph isomorphism and frequent subgraph mining was adopted. There was yet a fundamental challenge, subgraph overlapping during the collapsing procedure, which hindered the method from establishing the correct compact hypergraph of an original CSF graph. Therefore, a graph-isomorphism-based algorithm was proposed to select dominant acyclic subgraphs on the basis of overlapping analysis. Finally, the spatial similarity among graphical CSFs was evaluated by multi-dimensional measures of similarity. To evaluate the performance of the proposed method, the proposed system was firstly compared with Wikipedia Chemical Structure Explorer (WCSE), the state-of-the-art system that allowed CSF similarity searching within Wikipedia molecules dataset, on retrieval accuracy. The system achieved higher values on mean average precision, discounted cumulative gain, rank-biased precision, and expected reciprocal rank than WCSE from the top-2 to the top-10 retrieved results. 
Specifically, the system achieved 10%, 1.41, 6.42%, and 1.32% higher than WCSE on these metrics for top-10 retrieval results, respectively. Moreover, several retrieval cases were presented to intuitively compare with WCSE. The results of the above comparative study demonstrated that the proposed method outperformed the existing method with regard to accuracy and effectiveness. This paper proposes a graph-similarity-based retrieval approach for medicine information. To obtain satisfactory retrieval results, an isomorphism-based algorithm is proposed for dominant subgraph selection based on the subgraph overlapping analysis, as well as an effective and efficient hypergraph representation of molecules. Experiment results demonstrate the effectiveness of the proposed approach.
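Three of the ranking metrics named in the abstract above can be sketched for a single ranked result list as follows. This is a simplified illustration with binary relevance labels, not the evaluation code used in the study; normalizing average precision by the number of relevant items retrieved and the default persistence of 0.8 are assumptions.

```python
import math


def average_precision(relevance):
    """Mean of precision@k over the ranks k that hold a relevant item."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def dcg(relevance):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevance, start=1))


def rank_biased_precision(relevance, persistence=0.8):
    """Rank-biased precision; persistence models the chance of reading on."""
    return (1 - persistence) * sum(
        rel * persistence ** (rank - 1)
        for rank, rel in enumerate(relevance, start=1)
    )


# Hypothetical relevance labels for the top-3 results of one query.
ranking = [1, 0, 1]
print(average_precision(ranking), dcg(ranking), rank_biased_precision(ranking))
```

Mean average precision is then the mean of `average_precision` over all queries.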
FunSimMat: a comprehensive functional similarity database

PubMed Central

Schlicker, Andreas; Albrecht, Mario

2008-01-01

Functional similarity based on Gene Ontology (GO) annotation is used in diverse applications like gene clustering, gene expression data analysis, protein interaction prediction and evaluation. However, there exists no comprehensive resource of functional similarity values although such a database would facilitate the use of functional similarity measures in different applications. Here, we describe FunSimMat (Functional Similarity Matrix, http://funsimmat.bioinf.mpi-inf.mpg.de/), a large new database that provides several different semantic similarity measures for GO terms. It offers various precomputed functional similarity values for proteins contained in UniProtKB and for protein families in Pfam and SMART. The web interface allows users to efficiently perform both semantic similarity searches with GO terms and functional similarity searches with proteins or protein families. All results can be downloaded in tab-delimited files for use with other tools. An additional XML–RPC interface gives automatic online access to FunSimMat for programs and remote services. PMID:17932054
A comparison of analysis methods to estimate contingency strength.

PubMed

Lloyd, Blair P; Staubitz, Johanna L; Tapp, Jon T

2018-05-09

To date, several data analysis methods have been used to estimate contingency strength, yet few studies have compared these methods directly. To compare the relative precision and sensitivity of four analysis methods (i.e., exhaustive event-based, nonexhaustive event-based, concurrent interval, concurrent+lag interval), we applied all methods to a simulated data set in which several response-dependent and response-independent schedules of reinforcement were programmed. We evaluated the degree to which contingency strength estimates produced from each method (a) corresponded with expected values for response-dependent schedules and (b) showed sensitivity to parametric manipulations of response-independent reinforcement. Results indicated both event-based methods produced contingency strength estimates that aligned with expected values for response-dependent schedules, but differed in sensitivity to response-independent reinforcement. The precision of interval-based methods varied by analysis method (concurrent vs. concurrent+lag) and schedule type (continuous vs. partial), and showed similar sensitivities to response-independent reinforcement. Recommendations and considerations for measuring contingencies are identified. © 2018 Society for the Experimental Analysis of Behavior.
Self-similarity analysis of eubacteria genome based on weighted graph.

PubMed

Qi, Zhao-Hui; Li, Ling; Zhang, Zhi-Meng; Qi, Xiao-Qin

2011-07-07

We introduce a weighted graph model to investigate the self-similarity characteristics of eubacteria genomes. Similarity comparisons of genomes are usually made to determine the evolutionary distance among different genomes. Few studies focus on the overall statistical characteristics of each gene compared with other genes in the same genome. In our model, each genome is represented as a weighted graph, whose topology describes the similarity relationship among genes in the same genome. Based on weighted graph theory, we extract some quantified statistical variables from the topology, and give the distribution of some variables derived from the largest social structure in the topology. The 23 eubacteria recently studied by Sorimachi and Okayasu are clearly classified into two different groups by their double logarithmic point-plots describing the similarity relationship among genes of the largest social structure in the genome. The results show that the proposed model may provide some new insights into the structures and evolution patterns determined from the complete genomes. Copyright © 2011 Elsevier Ltd. All rights reserved.
Novel Approach to Classify Plants Based on Metabolite-Content Similarity.

PubMed

Liu, Kang; Abdullah, Azian Azamimi; Huang, Ming; Nishioka, Takaaki; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

2017-01-01

Secondary metabolites are bioactive substances with diverse chemical structures. Depending on the ecological environment within which they are living, higher plants use different combinations of secondary metabolites for adaptation (e.g., defense against attacks by herbivores or pathogenic microbes). This suggests that the similarity in metabolite content is applicable to assess phylogenic similarity of higher plants. However, such a chemical taxonomic approach has limitations of incomplete metabolomics data. We propose an approach for successfully classifying 216 plants based on their known incomplete metabolite content. Structurally similar metabolites have been clustered using the network clustering algorithm DPClus. Plants have been represented as binary vectors, implying relations with structurally similar metabolite groups, and classified using Ward's method of hierarchical clustering. Despite incomplete data, the resulting plant clusters are consistent with the known evolutional relations of plants. This finding reveals the significance of metabolite content as a taxonomic marker. We also discuss the predictive power of metabolite content in exploring nutritional and medicinal properties in plants. As a byproduct of our analysis, we could predict some currently unknown species-metabolite relations.
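The two steps named in the abstract above, encoding plants as binary vectors over metabolite groups and clustering them with Ward's method, can be sketched in plain Python. The plant and group names are hypothetical, the metabolite groups are taken as given (the DPClus step is not reproduced), and the naive merge loop is written for readability rather than for data sets of realistic size.

```python
def binary_vectors(plant_groups):
    """Encode each plant as a 0/1 vector over all metabolite groups."""
    groups = sorted(set().union(*plant_groups.values()))
    names = sorted(plant_groups)
    vectors = [[1 if g in plant_groups[n] else 0 for g in groups] for n in names]
    return names, vectors


def ward_clusters(vectors, n_clusters):
    """Naive agglomerative clustering using Ward's minimum-variance criterion."""
    # Each cluster is a pair: (member indices, centroid).
    clusters = [([i], [float(x) for x in v]) for i, v in enumerate(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * sq_dist
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((ma + mb, centroid))
    return [sorted(members) for members, _ in clusters]


# Hypothetical plants mapped to the metabolite groups they contain.
plants = {
    "plant1": {"group1", "group2"},
    "plant2": {"group1", "group2", "group3"},
    "plant3": {"group4", "group5"},
    "plant4": {"group5"},
}
names, vectors = binary_vectors(plants)
for members in ward_clusters(vectors, 2):
    print([names[i] for i in members])
```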
Novel Approach to Classify Plants Based on Metabolite-Content Similarity

PubMed Central

Abdullah, Azian Azamimi; Huang, Ming; Nishioka, Takaaki

2017-01-01

Secondary metabolites are bioactive substances with diverse chemical structures. Depending on the ecological environment within which they are living, higher plants use different combinations of secondary metabolites for adaptation (e.g., defense against attacks by herbivores or pathogenic microbes). This suggests that the similarity in metabolite content is applicable to assess phylogenic similarity of higher plants. However, such a chemical taxonomic approach has limitations of incomplete metabolomics data. We propose an approach for successfully classifying 216 plants based on their known incomplete metabolite content. Structurally similar metabolites have been clustered using the network clustering algorithm DPClus. Plants have been represented as binary vectors, implying relations with structurally similar metabolite groups, and classified using Ward's method of hierarchical clustering. Despite incomplete data, the resulting plant clusters are consistent with the known evolutional relations of plants. This finding reveals the significance of metabolite content as a taxonomic marker. We also discuss the predictive power of metabolite content in exploring nutritional and medicinal properties in plants. As a byproduct of our analysis, we could predict some currently unknown species-metabolite relations. PMID:28164123
MultiWaveLink: An interactive data base for the coordination of multiwavelength and multifacility observations

NASA Technical Reports Server (NTRS)

Cordova, F. A.

1993-01-01

MultiWaveLink is an interactive, computerized data base that was developed to facilitate a multi-wavelength approach to studying astrophysical sources. It can be used to access information about multiwavelength resources (observers, telescopes, data bases and analysis facilities) or to organize observing campaigns that require either many telescopes operating in different spectral regimes or a network of similar telescopes circumspanning the Earth.
Rethinking Critical Mathematics: A Comparative Analysis of Critical, Reform, and Traditional Geometry Instructional Texts

ERIC Educational Resources Information Center

Brantlinger, Andrew

2011-01-01

This paper presents findings from a comparative analysis of three similar secondary geometry texts, one critical unit, one standards-based reform unit, and one specialist chapter. I developed the critical unit as I took the tenets of critical mathematics (CM) and substantiated them in printed curricular materials in which to teach as part of a…
Disability in Physical Education Textbooks: An Analysis of Image Content

ERIC Educational Resources Information Center

Taboas-Pais, Maria Ines; Rey-Cao, Ana

2012-01-01

The aim of this paper is to show how images of disability are portrayed in physical education textbooks for secondary schools in Spain. The sample was composed of 3,316 images published in 36 textbooks by 10 publishing houses. A content analysis was carried out using a coding scheme based on categories employed in other similar studies and adapted…
Electrostatic Similarity Analysis of Human β-Defensin Binding in the Melanocortin System

PubMed Central

Nix, Matthew A.; Kaelin, Christopher B.; Palomino, Rafael; Miller, Jillian L.; Barsh, Gregory S.; Millhauser, Glenn L.

2015-01-01

Summary The β-defensins are a class of small cationic proteins that serve as components of numerous systems in vertebrate biology, including the immune and melanocortin systems. Human β-defensin 3 (HBD3), which is produced in the skin, has been found to bind to melanocortin receptors 1 and 4 through complementary electrostatics, a unique mechanism of ligand-receptor interaction. This finding indicates that electrostatics alone, and not specific amino acid contact points, could be sufficient for function in this ligand-receptor system, and further suggests that other small peptide ligands could interact with these receptors in a similar fashion. Here, we conducted molecular-similarity analyses and functional studies of additional members of the human β-defensin family, examining their potential as ligands of melanocortin-1 receptor, through selection based on their electrostatic similarity to HBD3. Using Poisson-Boltzmann electrostatic calculations and molecular-similarity analysis, we identified members of the human β-defensin family that are both similar and dissimilar to HBD3 in terms of electrostatic potential. Synthesis and functional testing of a subset of these β-defensins showed that peptides with an HBD3-like electrostatic character bound to melanocortin receptors with high affinity, whereas those that were anticorrelated to HBD3 showed no binding affinity. These findings expand on the central role of electrostatics in the control of this ligand-receptor system and further demonstrate the utility of employing molecular-similarity analysis. Additionally, we identified several new potential ligands of melanocortin-1 receptor, which may have implications for our understanding of the role defensins play in melanocortin physiology. PMID:26536271
Path Similarity Analysis: A Method for Quantifying Macromolecular Pathways

PubMed Central

Seyler, Sean L.; Kumar, Avishek; Thorpe, M. F.; Beckstein, Oliver

2015-01-01

Diverse classes of proteins function through large-scale conformational changes and various sophisticated computational algorithms have been proposed to enhance sampling of these macromolecular transition paths. Because such paths are curves in a high-dimensional space, it has been difficult to quantitatively compare multiple paths, a necessary prerequisite to, for instance, assess the quality of different algorithms. We introduce a method named Path Similarity Analysis (PSA) that enables us to quantify the similarity between two arbitrary paths and extract the atomic-scale determinants responsible for their differences. PSA utilizes the full information available in 3N-dimensional configuration space trajectories by employing the Hausdorff or Fréchet metrics (adopted from computational geometry) to quantify the degree of similarity between piecewise-linear curves. It thus completely avoids relying on projections into low dimensional spaces, as used in traditional approaches. To elucidate the principles of PSA, we quantified the effect of path roughness induced by thermal fluctuations using a toy model system. Using, as an example, the closed-to-open transitions of the enzyme adenylate kinase (AdK) in its substrate-free form, we compared a range of protein transition path-generating algorithms. Molecular dynamics-based dynamic importance sampling (DIMS) MD and targeted MD (TMD) and the purely geometric FRODA (Framework Rigidity Optimized Dynamics Algorithm) were tested along with seven other methods publicly available on servers, including several based on the popular elastic network model (ENM). PSA with clustering revealed that paths produced by a given method are more similar to each other than to those from another method and, for instance, that the ENM-based methods produced relatively similar paths. 
PSA applied to ensembles of DIMS MD and FRODA trajectories of the conformational transition of diphtheria toxin, a particularly challenging example, showed that the geometry-based FRODA occasionally sampled the pathway space of force field-based DIMS MD. For the AdK transition, the new concept of a Hausdorff-pair map enabled us to extract the molecular structural determinants responsible for differences in pathways, namely a set of conserved salt bridges whose charge-charge interactions are fully modelled in DIMS MD but not in FRODA. PSA has the potential to enhance our understanding of transition path sampling methods, validate them, and to provide a new approach to analyzing conformational transitions. PMID:26488417
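As an illustration of the Hausdorff metric mentioned in the abstract above, the following minimal sketch computes a discrete Hausdorff distance between two paths given as lists of points. It is not the authors' PSA implementation: it compares path vertices only (not the full piecewise-linear curves), uses plain Euclidean distance, and the toy 2D paths and function names are hypothetical.

```python
import math

def directed_hausdorff(path_a, path_b):
    """Largest distance from any point of path_a to its nearest point in path_b."""
    return max(min(math.dist(p, q) for q in path_b) for p in path_a)

def hausdorff(path_a, path_b):
    """Symmetric discrete Hausdorff distance between two paths."""
    return max(directed_hausdorff(path_a, path_b),
               directed_hausdorff(path_b, path_a))

# Two toy "paths" in a 2D configuration space (hypothetical data).
path_1 = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
path_2 = [(0.0, 1.0), (1.0, 1.0), (2.0, 3.0)]
d = hausdorff(path_1, path_2)
print(d)
```

For a real trajectory each point would be a 3N-dimensional conformation, and the pair of points that realizes the maximum is the kind of information a Hausdorff-pair analysis could start from.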
Comparison of detrending methods for fluctuation analysis in hydrology

NASA Astrophysics Data System (ADS)

Zhang, Qiang; Zhou, Yu; Singh, Vijay P.; Chen, Yongqin David

2011-03-01

Trends within a hydrologic time series can significantly influence the scaling results of fluctuation analysis, such as rescaled range (RS) analysis and (multifractal) detrended fluctuation analysis (MF-DFA). Therefore, removal of trends is important in the study of scaling properties of the time series. In this study, three detrending methods, including adaptive detrending algorithm (ADA), Fourier-based method, and average removing technique, were evaluated by analyzing numerically generated series and observed streamflow series with an obvious, relatively regular periodic trend. Results indicated that: (1) the Fourier-based detrending method and ADA were similar in detrending practices, and given proper parameters, these two methods can produce similarly satisfactory results; (2) series detrended by the Fourier-based detrending method and ADA lose the fluctuation information at larger time scales, and the location of crossover points is heavily impacted by the chosen parameters of these two methods; and (3) the average removing method has an advantage over the other two methods, i.e., the fluctuation information at larger time scales is kept well, an indication of relatively reliable performance in detrending. In addition, the average removing method performed reasonably well in detrending a time series with regular periods or trends. In this sense, the average removing method should be preferred in the study of scaling properties of hydrometeorological series with a relatively regular periodic trend using MF-DFA.
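The abstract above does not define the average removing technique. The sketch below assumes it means subtracting, from each value, the mean of all values at the same phase of the cycle (for example, the long-term mean for each calendar day). The function name, the period and the data are hypothetical.

```python
def remove_periodic_mean(series, period):
    """Subtract from each value the mean of all values at the same phase of the cycle."""
    sums = [0.0] * period
    counts = [0] * period
    for i, x in enumerate(series):
        sums[i % period] += x
        counts[i % period] += 1
    # Phases with no observations (series shorter than the period) get a zero mean.
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [x - means[i % period] for i, x in enumerate(series)]

# Three cycles of a toy series with period 3 (hypothetical data).
series = [10, 20, 30, 12, 22, 32, 8, 18, 28]
anomalies = remove_periodic_mean(series, period=3)
print(anomalies)
```

The residual series keeps the cycle-to-cycle variation, which may be why fluctuation information at larger time scales survives this kind of detrending.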
Bias-correction of PERSIANN-CDR Extreme Precipitation Estimates Over the United States

NASA Astrophysics Data System (ADS)

Faridzad, M.; Yang, T.; Hsu, K. L.; Sorooshian, S.

2017-12-01

Ground-based precipitation measurements can be sparse or even nonexistent over remote regions, which makes extreme event analysis difficult. PERSIANN-CDR (CDR), with 30+ years of daily rainfall information, provides an opportunity to study precipitation for regions where ground measurements are limited. In this study, the use of CDR annual extreme precipitation for frequency analysis of extreme events over limited/ungauged basins is explored. The adjustment of CDR is implemented in two steps: (1) calculate the CDR bias correction factor at limited gauge locations based on linear regression analysis of gauge and CDR annual maximum precipitation; and (2) extend the bias correction factor to locations where gauges are not available. The correction factors are estimated at gauge sites over various catchments, elevation zones, and climate regions, and the results were generalized to ungauged sites based on regional and climatic similarity. Case studies were conducted on 20 basins with diverse climates and altitudes in the Eastern and Western US. Cross-validation reveals that the bias correction factors estimated on limited calibration data can be extended to regions with similar characteristics. The adjusted CDR estimates also consistently outperform gauge interpolation at validation sites. It is suggested that the CDR with bias adjustment has potential for frequency analysis of extreme events, especially for regions with limited gauge observations.
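Step (1) above can be sketched as a regression of gauge annual maxima on CDR annual maxima. The abstract does not give the regression form, so this sketch assumes a least-squares fit through the origin whose slope serves as a multiplicative correction factor. The function name and all values are hypothetical.

```python
def correction_factor(cdr_maxima, gauge_maxima):
    """Slope of a least-squares line through the origin: gauge ~ factor * cdr."""
    numerator = sum(c * g for c, g in zip(cdr_maxima, gauge_maxima))
    denominator = sum(c * c for c in cdr_maxima)
    return numerator / denominator

# Annual maximum daily precipitation in mm at one gauge site (hypothetical data).
cdr = [40.0, 50.0, 60.0]
gauge = [60.0, 75.0, 90.0]

factor = correction_factor(cdr, gauge)
adjusted = [factor * c for c in cdr]  # bias-adjusted CDR annual maxima
print(factor, adjusted)
```

Step (2) would then assign such factors to ungauged sites grouped by catchment, elevation zone or climate region, which this sketch does not attempt.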
GeneBee-net: Internet-based server for analyzing biopolymers

DOE Office of Scientific and Technical Information (OSTI.GOV)

Brodsky, L.I.; Ivanov, V.V.; Nikolaev, V.K.

This work describes a network server for searching databanks of biopolymer structures and performing other biocomputing procedures; it is available via direct Internet connection. Basic server procedures are dedicated to homology (similarity) search of sequence and 3D structure of proteins. The homologies found could be used to build multiple alignments, predict protein and RNA secondary structure, and construct phylogenetic trees. In addition to traditional methods of sequence similarity search, the authors propose "non-matrix" (correlational) search. An analogous approach is used to identify regions of similar tertiary structure of proteins. Algorithm concepts and usage examples are presented for new methods. Service logic is based upon interaction of a client program and server procedures. The client program allows the compilation of queries and the processing of results of an analysis.
A Taxonomy-Based Approach to Shed Light on the Babel of Mathematical Models for Rice Simulation

NASA Technical Reports Server (NTRS)

Confalonieri, Roberto; Bregaglio, Simone; Adam, Myriam; Ruget, Francoise; Li, Tao; Hasegawa, Toshihiro; Yin, Xinyou; Zhu, Yan; Boote, Kenneth; Buis, Samuel;

2016-01-01

For most biophysical domains, differences in model structures are seldom quantified. Here, we used a taxonomy-based approach to characterise thirteen rice models. Classification keys and binary attributes for each key were identified, and models were categorised into five clusters using a binary similarity measure and the unweighted pair-group method with arithmetic mean. Principal component analysis was performed on model outputs at four sites. Results indicated that (i) differences in structure often resulted in similar predictions and (ii) similar structures can lead to large differences in model outputs. User subjectivity during calibration may have hidden expected relationships between model structure and behaviour. This explanation, if confirmed, highlights the need for shared protocols to reduce the degrees of freedom during calibration, and to limit, in turn, the risk that user subjectivity influences model performance.
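The clustering step described above (binary attributes, a binary similarity measure, and the unweighted pair-group method with arithmetic mean) can be sketched as follows. The abstract does not say which binary similarity measure was used, so the simple matching coefficient is assumed here. The model names and attribute vectors are hypothetical.

```python
from itertools import combinations

def simple_matching(a, b):
    """Fraction of attributes on which two binary vectors agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def upgma(items, n_clusters):
    """Average-linkage (UPGMA) agglomerative clustering down to n_clusters groups."""
    clusters = [[name] for name in items]

    def avg_dist(c1, c2):
        # UPGMA distance: mean dissimilarity over all cross-cluster member pairs.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(1 - simple_matching(items[i], items[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]))
        clusters[a] = clusters[a] + clusters[b]  # merge the closest pair
        del clusters[b]                          # b > a, so index a stays valid
    return [sorted(c) for c in clusters]

# Binary attribute vectors for four models (hypothetical data).
models = {
    "model_A": [1, 1, 0, 0, 1],
    "model_B": [1, 1, 0, 0, 0],
    "model_C": [0, 0, 1, 1, 0],
    "model_D": [0, 0, 1, 1, 1],
}
groups = upgma(models, n_clusters=2)
print(groups)
```

Recomputing every average from the original pairwise dissimilarities keeps the sketch short, at the cost of efficiency on larger inputs.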

Gathering Real World Evidence with Cluster Analysis for Clinical Decision Support.

PubMed

Xia, Eryu; Liu, Haifeng; Li, Jing; Mei, Jing; Li, Xuejun; Xu, Enliang; Li, Xiang; Hu, Gang; Xie, Guotong; Xu, Meilin

2017-01-01

Clinical decision support systems are information technology systems that assist clinical decision-making tasks, which have been shown to enhance clinical performance. Cluster analysis, which groups similar patients together, aims to separate patient cases into phenotypically heterogeneous groups and to define therapeutically homogeneous patient subclasses. Useful as it is, the application of cluster analysis in clinical decision support systems is rarely reported. Here, we describe the usage of cluster analysis in clinical decision support systems, by first dividing patient cases into similar groups and then providing diagnosis or treatment suggestions based on the group profiles. This integration provides data for clinical decisions and compiles a wide range of clinical practices to inform the performance of individual clinicians. We also include an example usage of the system under the scenario of blood lipid management in type 2 diabetes. These efforts represent a step toward promoting patient-centered care and enabling precision medicine.

Sizing and Lifecycle Cost Analysis of an Ares V Composite Interstage

NASA Technical Reports Server (NTRS)

Mann, Troy; Smeltzer, Stan; Grenoble, Ray; Mason, Brian; Rosario, Sev; Fairbairn, Bob

2012-01-01

The Interstage Element of the Ares V launch vehicle was sized using a commercially available structural sizing software tool. Two different concepts were considered, a metallic design and a composite design. Both concepts were sized using similar levels of analysis fidelity and included the influence of design details on each concept. Additionally, the impact of the different manufacturing techniques and failure mechanisms for composite and metallic construction were considered. Significant details were included in analysis models of each concept, including penetrations for human access, joint connections, as well as secondary loading effects. The designs and results of the analysis were used to determine lifecycle cost estimates for the two Interstage designs. Lifecycle cost estimates were based on industry provided cost data for similar launch vehicle components. The results indicated that significant mass as well as cost savings are attainable for the chosen composite concept as compared with a metallic option.
Rapid differentiation of Chinese hop varieties (Humulus lupulus) using volatile fingerprinting by HS-SPME-GC-MS combined with multivariate statistical analysis.

PubMed

Liu, Zechang; Wang, Liping; Liu, Yumei

2018-01-18

Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.
[Cloning and bioinformatic analysis and expression analysis of beta-glucuronidase in Scutellaria baicalensis].

PubMed

Guo, Shuang-shuang; Cheng, Lin; Yang, Li-min; Han, Mei

2015-11-01

The β-glucuronidase gene (sbGUS) cDNA was cloned for the first time from Scutellaria baicalensis leaf by RT-PCR, with GenBank accession number KR364726. The full-length cDNA of sbGUS was 1584 bp with an open reading frame (ORF) encoding an unstable protein of 527 amino acids. The bioinformatic analysis showed that the sbGUS-encoded protein had an isoelectric point (pI) of 5.55 and a calculated molecular weight of about 58.7248 kDa, with a transmembrane region and signal peptide, and had conserved domains of the glycoside hydrolase superfamily and an unintegrated trans-glycosidase catalytic structure. In the secondary structure, the percentages of alpha helix, extended strand, β-extended and random coil were 25.62%, 28.84%, 13.28% and 32.26%, respectively. The homology analysis indicated 98.93% nucleotide sequence similarity and 98.29% amino acid sequence similarity with S. baicalensis (BAA97804.1), with differences at nine positions. The expression level of sbGUS was the highest in root based on a real-time PCR analysis, followed by flower and stem, and the lowest was in stem. The results provide a foundation for exploring the molecular function of sbGUS involved in baicalcin biosynthesis based on synthetic biology approach in S. baicalensis plants.
Nateglinide versus repaglinide for type 2 diabetes mellitus in China.

PubMed

Li, Chanjuan; Xia, Jielai; Zhang, Gaokui; Wang, Suzhen; Wang, Ling

2009-12-01

The purpose of this study is to evaluate the efficacy and safety of nateglinide tablets in comparison with repaglinide tablets as control in treating type 2 diabetes mellitus in China. Pooled analysis with the analysis of covariance (ANCOVA) method was applied to assess efficacy and safety based on original data collected from four independent randomized clinical trials with similar research protocols. In addition, meta-analysis was applied based on the outcomes of the four studies. The results of the meta-analysis were comparable to those obtained by pooled analysis. The means of HbA(1c) and fasting blood glucose (FBG) in both the nateglinide and repaglinide groups were reduced significantly after 12 weeks, but there were no statistically significant differences in reduction between the two groups. The adverse reaction rates were 9.89 and 6.51% in the nateglinide and repaglinide groups respectively, with the rate difference showing no statistical significance, and the odds ratio of adverse reaction rate (95% confidence interval) was 1.59 (0.99, 2.55). Both nateglinide and repaglinide have similarly significant effects on reducing HbA(1c) and FBG. The adverse reaction rate in the nateglinide group was higher than that in the repaglinide group, but the difference was not statistically significant, as revealed in the four clinical trials detailed below.
Target Identification Using Harmonic Wavelet Based ISAR Imaging

NASA Astrophysics Data System (ADS)

Shreyamsha Kumar, B. K.; Prabhakar, B.; Suryanarayana, K.; Thilagavathi, V.; Rajagopal, R.

2006-12-01

A new approach has been proposed to reduce the computations involved in ISAR imaging, which uses harmonic wavelet (HW)-based time-frequency representation (TFR). Since the HW-based TFR falls into the category of nonparametric time-frequency (T-F) analysis tools, it is computationally efficient compared to parametric T-F analysis tools such as adaptive joint time-frequency transform (AJTFT), adaptive wavelet transform (AWT), and evolutionary AWT (EAWT). Further, the performance of the proposed method of ISAR imaging is compared with ISAR imaging by other nonparametric T-F analysis tools such as short-time Fourier transform (STFT) and Choi-Williams distribution (CWD). In ISAR imaging, the use of HW-based TFR provides similar/better results with significant (92%) computational advantage compared to that obtained by CWD. The ISAR images thus obtained are identified using a neural network-based classification scheme with feature set invariant to translation, rotation, and scaling.
Classification of upper Mississippi River pools based on contiguous aquatic/geomorphic habitats

USGS Publications Warehouse

Koel, Todd M.

2001-01-01

Navigation pools of the upper Mississippi River (UMR) vary greatly in terms of available contiguous aquatic/geomorphic habitats. These habitats are critical for the biotic diversity and overall productivity of the floodplain corridor of each pool. In this study, similarities among pools 4-26 and an open river reach (river kilometer 47-129) of the UMR were determined from multivariate analysis of eleven habitat types that were hydrologically-contiguous (non-leveed). Isolated floodplain habitats were not included in final analyses because this isolation limits their contribution to overall riverine productivity, in part due to a lack of hydrological connectivity to the main channel during the flood pulse. Cluster analysis based on simple Euclidean distance was used to produce two major pool groups and five pool subgroups. Important habitat variables in defining pool groups, as interpreted from principal components analysis (PCA) axis 1, were contiguous floodplain shallow aquatic area and contiguous impounded area. The habitat variable most important in defining pool subgroups, as interpreted from PCA axis 2, was tertiary channel. Most notably, pool 6 was more similar to pools 14-24 than other upper pools, and pools 19 and 25 were more similar to pools 4-13 than other lower pools. These results were quite different from those of two previous investigators, primarily because only areas of non-isolated aquatic habitat were considered.
Clustering change patterns using Fourier transformation with time-course gene expression data.

PubMed

Kim, Jaehee

2011-01-01

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
Estimation of the potential leakage of the chemical munitions based on two hydrodynamical models implemented for the Baltic Sea

NASA Astrophysics Data System (ADS)

Jakacki, Jaromir; Golenko, Mariya

2014-05-01

Two hydrodynamical models (Princeton Ocean Model (POM) and Parallel Ocean Program (POP)) have been implemented for the Baltic Sea area that contains the locations of chemical munitions dumped during World War II. The models have been configured based on similar data sources: bathymetry, initial conditions and external forces were implemented based on identical data. The horizontal resolutions of the models are also very similar. Several simulations with different initial conditions have been performed. The bottom currents from both models have been compared and analysed. Based on this comparison, the dangerous area and critical time have been estimated. Lagrangian particle tracking and a passive tracer were also implemented, and based on these results the probability of dangerous doses appearing and its time evolution have been presented. This work has been performed in the frame of the MODUM project financially supported by NATO.
The Role of Sex of Peers and Gender-Typed Activities in Young Children’s Peer Affiliative Networks: A Longitudinal Analysis of Selection and Influence

PubMed Central

Martin, Carol Lynn; Kornienko, Olga; Schaefer, David R.; Hanish, Laura D.; Fabes, Richard A.; Goble, Priscilla

2012-01-01

A stochastic actor-based model was used to investigate the origins of sex segregation by examining how similarity in sex and time spent in gender-typed activities affected affiliation network selection and how peers influenced children’s (N = 292; M age = 4.3 years) activity involvement. Gender had powerful effects on interactions through direct and indirect pathways. Children selected playmates of the same-sex and with similar levels of gender-typed activities. Selection based on gender-typed activities partially mediated selection based on sex. Children influenced one another’s engagement in gender-typed activities. When mechanisms producing sex segregation were compared, the largest contributor was selection based on sex; less was due to activity-based selection and peer influence. Implications for sex segregation and gender development are discussed. PMID:23252713
Beyond conventional dose-response curves: Sensorgram comparison in SPR allows single concentration activity and similarity assessment.

PubMed

Gassner, C; Karlsson, R; Lipsmeier, F; Moelleken, J

2018-05-30

Previously we have introduced two SPR-based assay principles (dual-binding assay and bridging assay), which allow the determination of two out of three possible interaction parameters for bispecific molecules within one assay setup: two individual interactions to both targets, and/or one simultaneous/overall interaction, which potentially reflects the inter-dependency of both individual binding events. However, activity and similarity are determined by comparing report points over a concentration range, which also mirrors the way data is generated by conventional ELISA-based methods. So far, binding kinetics have not been specifically considered in generic approaches for activity assessment. Here, we introduce an improved slope-ratio model which, together with a sensorgram comparison based similarity assessment, allows the development of a detailed, USP-conformal ligand binding assay using only a single sample concentration. We compare this novel analysis method to the usual concentration-range approach for both SPR-based assay principles and discuss its impact on data quality and increased sample throughput. Copyright © 2018 Elsevier B.V. All rights reserved.
Unsupervised user similarity mining in GSM sensor networks.

PubMed

Shad, Shafqat Ali; Chen, Enhong

2013-01-01

Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for potential applications like early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All the mentioned applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, prediction of the user's actual movement, and context awareness. However, significant places extraction and prediction of the user's actual movement for mobility profile building are nontrivial tasks. In this paper, we present a user similarity mining-based methodology through user mobility profile building, using the semantic tagging information provided by the user and basic GSM network architecture properties, based on an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology successfully converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining.
Using register data to deduce patterns of social exchange.

PubMed

Jansson, Fredrik

2017-07-01

This paper presents a novel method for deducing propensities for social exchange between individuals based on the choices they make, and based on factors such as country of origin, sex, school grades and socioeconomic background. The objective here is to disentangle the effect of social ties from the other factors, in order to find patterns of social exchange. This is done through a control-treatment design for analysing available data, where the 'treatment' is similarity of choices between socially connected individuals, and the control is similarity of choices between non-connected individuals. Structural dependencies are controlled for and effects from different classes are pooled through a mix of methods from network and meta-analysis. The method is demonstrated and tested on Swedish register data on students at upper secondary school. The results show that having similar grades is a predictor of social exchange. Also, previous results from Norwegian data are replicated, showing that students cluster based on country of origin.
AR(p)-based detrended fluctuation analysis

NASA Astrophysics Data System (ADS)

Alvarez-Ramirez, J.; Rodriguez, E.

2018-07-01

Autoregressive models are commonly used for modeling time series from nature, economics and finance. This work explored simple autoregressive AR(p) models to remove long-term trends in detrended fluctuation analysis (DFA). Crude oil prices and bitcoin exchange rate were considered, with the former corresponding to a mature market and the latter to an emergent market. Results showed that AR(p)-based DFA performs similarly to traditional DFA. However, the former provides information on the stability of long-term trends, which is valuable for understanding and quantifying the dynamics of complex time series from financial systems.
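For reference, the traditional DFA that the abstract above uses as a baseline can be sketched as follows. This is not the AR(p)-based variant, whose details the abstract does not give: the sketch detrends each window of the integrated series with a least-squares straight line. The function names, window sizes and the synthetic white-noise input are all assumptions.

```python
import math
import random

def dfa_fluctuation(series, window):
    """RMS fluctuation F(n) of the integrated series after linear detrending per window."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for x in series:                      # integrate the mean-centred series
        total += x - mean
        profile.append(total)
    t = list(range(window))
    t_mean = sum(t) / window
    t_var = sum((ti - t_mean) ** 2 for ti in t)
    sq_sum, count = 0.0, 0
    for start in range(0, len(profile) - window + 1, window):
        seg = profile[start:start + window]
        y_mean = sum(seg) / window
        slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, seg)) / t_var
        for ti, yi in zip(t, seg):        # residual from the local straight line
            sq_sum += (yi - (y_mean + slope * (ti - t_mean))) ** 2
            count += 1
    return math.sqrt(sq_sum / count)

def dfa_exponent(series, windows):
    """Least-squares slope of log F(n) versus log n (the scaling exponent)."""
    xs = [math.log(n) for n in windows]
    ys = [math.log(dfa_fluctuation(series, n)) for n in windows]
    x_mean, y_mean = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
            / sum((x - x_mean) ** 2 for x in xs))

random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4096)]   # uncorrelated test signal
alpha = dfa_exponent(noise, windows=[16, 32, 64, 128, 256])
print(alpha)
```

For uncorrelated noise the exponent is expected to lie near 0.5. An AR(p)-based variant would presumably replace the straight-line fit with an autoregressive model of the trend.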
IsoCleft Finder – a web-based tool for the detection and analysis of protein binding-site geometric and chemical similarities

PubMed Central

Najmanovich, Rafael

2013-01-01

IsoCleft Finder is a web-based tool for the detection of local geometric and chemical similarities between potential small-molecule binding cavities and a non-redundant dataset of ligand-bound known small-molecule binding-sites. The non-redundant dataset developed as part of this study is composed of 7339 entries representing unique Pfam/PDB-ligand (hetero group code) combinations with known levels of cognate ligand similarity. The query cavity can be uploaded by the user or detected automatically by the system using existing PDB entries as well as user-provided structures in PDB format. In all cases, the user can refine the definition of the cavity interactively via a browser-based Jmol 3D molecular visualization interface. Furthermore, users can restrict the search to a subset of the dataset using a cognate-similarity threshold. Local structural similarities are detected using the IsoCleft software and ranked according to two criteria (number of atoms in common and Tanimoto score of local structural similarity) and the associated Z-score and p-value measures of statistical significance. The results, including predicted ligands, target proteins, similarity scores, number of atoms in common, etc., are shown in a powerful interactive graphical interface. This interface permits the visualization of target ligands superimposed on the query cavity and additionally provides a table of pairwise ligand topological similarities. Similarities between top scoring ligands serve as an additional tool to judge the quality of the results obtained. We present several examples where IsoCleft Finder provides useful functional information. IsoCleft Finder results are complementary to existing approaches for the prediction of protein function from structure, rational drug design and x-ray crystallography. IsoCleft Finder can be found at: http://bcb.med.usherbrooke.ca/isocleftfinder. PMID:24555058
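One of the ranking criteria named above is a Tanimoto score. As a generic illustration only, the sketch below computes the set-based Tanimoto coefficient (shared elements divided by elements in the union). How IsoCleft itself matches atoms and computes its score is not described in the abstract, and the atom labels here are hypothetical.

```python
def tanimoto(set_a, set_b):
    """Set-based Tanimoto coefficient: |A and B| / |A or B| (0.0 for two empty sets)."""
    union = len(set_a | set_b)
    return len(set_a & set_b) / union if union else 0.0

# Atom labels in two binding sites (hypothetical data).
site_query = {"N", "O", "C1", "C2"}
site_known = {"O", "C1", "C2", "S"}
score = tanimoto(site_query, site_known)
print(score)
```

The score runs from 0 (nothing in common) to 1 (identical sets), which makes it convenient for ranking hits next to the raw count of atoms in common.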
Metabolomic approach for discrimination of processed ginseng genus (Panax ginseng and Panax quinquefolius) using UPLC-QTOF MS

PubMed Central

Park, Hee-Won; In, Gyo; Kim, Jeong-Han; Cho, Byung-Goo; Han, Gyeong-Ho; Chang, Il-Moo

2013-01-01

Discriminating between two herbal medicines (Panax ginseng and Panax quinquefolius), with similar chemical and physical properties but different therapeutic effects, is a very serious and difficult problem. Differentiation between two processed ginseng genera is even more difficult because the characteristics of their appearance are very similar. An ultraperformance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-QTOF MS)-based metabolomic technique was applied for the metabolite profiling of 40 processed P. ginseng and processed P. quinquefolius. Currently known biomarkers such as ginsenoside Rf and F11 have been used for the analysis using the UPLC-photodiode array detector. However, this method was not able to fully discriminate between the two processed ginseng genera. Thus, an optimized UPLC-QTOF-based metabolic profiling method was adapted for the analysis and evaluation of two processed ginseng genera. As a result, all known biomarkers were identified by the proposed metabolomics, and additional potential biomarkers were extracted from the huge amounts of global analysis data. Therefore, it is expected that such metabolomics techniques would be widely applied to the ginseng research field. PMID:24558312
Metadata from data: identifying holidays from anesthesia data.

PubMed

Starnes, Joseph R; Wanderer, Jonathan P; Ehrenfeld, Jesse M

2015-05-01

The increasingly large databases available to researchers necessitate high-quality metadata that is not always available. We describe a method for generating this metadata independently. Cluster analysis and expectation-maximization were used to separate days into holidays/weekends and regular workdays using anesthesia data from Vanderbilt University Medical Center from 2004 to 2014. This classification was then used to describe differences between the two sets of days over time. We evaluated 3802 days and correctly categorized 3797 based on anesthesia case time (representing an error rate of 0.13%). Use of other metrics for categorization, such as billed anesthesia hours and number of anesthesia cases per day, led to similar results. Analysis of the two categories showed that surgical volume increased more quickly with time for non-holidays than holidays (p < 0.001). We were able to successfully generate metadata from data by distinguishing holidays based on anesthesia data. This data can then be used for economic analysis and scheduling purposes. It is possible that the method can be expanded to similar bimodal and multimodal variables.
Clustering Financial Time Series by Network Community Analysis

NASA Astrophysics Data System (ADS)

Piccardi, Carlo; Calatroni, Lisa; Bertoni, Fabio

In this paper, we describe a method for clustering financial time series which is based on community analysis, a recently developed approach for partitioning the nodes of a network (graph). A network with N nodes is associated to the set of N time series. The weight of the link (i, j), which quantifies the similarity between the two corresponding time series, is defined according to a metric based on symbolic time series analysis, which has recently proved effective in the context of financial time series. Then, searching for network communities allows one to identify groups of nodes (and then time series) with strong similarity. A quantitative assessment of the significance of the obtained partition is also provided. The method is applied to two distinct case-studies concerning the US and Italy Stock Exchange, respectively. In the US case, the stability of the partitions over time is also thoroughly investigated. The results favorably compare with those obtained with the standard tools typically used for clustering financial time series, such as the minimal spanning tree and the hierarchical tree.
Floral and Vegetative Morphometrics of Five Pleurothallis (Orchidaceae) Species: Correlation with Taxonomy, Phylogeny, Genetic Variability and Pollination Systems

PubMed Central

BORBA, EDUARDO L.; SHEPHERD, GEORGE J.; BERG, CÁSSIO VAN DEN; SEMIR, JOÃO

2002-01-01

Morphometric analyses of vegetative and floral characters were conducted in 21 populations of five Pleurothallis (Orchidaceae) species occurring in Brazilian ‘campo rupestre’ vegetation. A phylogenetic analysis of this species group was also carried out using nuclear ribosomal DNA internal transcribed spacers (ITS1 and ITS2). Results of the ordination and cluster analyses agree with species’ delimitation revealed by taxonomic and allozyme studies. The groups formed in ordination analysis correspond to the pollinator groups determined in a previous pollination study. Relationships among the species in the cluster analysis using only vegetative characters are similar to those found in a previous allozyme study, but those indicated by cluster analysis using only floral characters differ. These results support the hypothesis that floral similarities are due to convergence driven by similar pollination mechanisms, and therefore floral traits may not be good indicators of phylogenetic relationships in this group. The results of the phylogenetic analysis support this conclusion to some extent. There is no correlation between genetic (allozyme) and morphological variability in the populations nor in the way this variability is distributed among conspecific populations. We describe a new subspecies of Pleurothallis ochreata based on differences in vegetative and chemical characters as well as geographic distribution. Absence of differentiation in floral characters, attraction of the same pollinator species, interfertility and genetic similarity support the argument for subspecific rather than specific status. PMID:12197519
Using Response Surface Analysis to Interpret the Impact of Parent–Offspring Personality Similarity on Adolescent Externalizing Problems

PubMed Central

Laceulle, Odillia M.; Van Aken, Marcel A.G.; Ormel, Johan

2017-01-01

Abstract Personality similarity between parent and offspring has been suggested to play an important role in offspring's development of externalizing problems. Nonetheless, much remains unknown regarding the nature of this association. This study aimed to investigate the effects of parent–offspring similarity at different levels of personality traits, comparing expectations based on evolutionary and goodness‐of‐fit perspectives. Two waves of data from the TRAILS study (N = 1587, 53% girls) were used to study parent–offspring similarity at different levels of personality traits at age 16 predicting externalizing problems at age 19. Polynomial regression analyses and Response Surface Analyses were used to disentangle effects of different levels and combinations of parents and offspring personality similarity. Although several facets of the offspring's personality had an impact on offspring's externalizing problems, few similarity effects were found. Therefore, there is little support for assumptions based on either an evolutionary or a goodness‐of‐fit perspective. Instead, our findings point in the direction that offspring personality, and at similar levels also parent personality might impact the development of externalizing problems during late adolescence. © 2017 The Authors. European Journal of Personality published by John Wiley & Sons Ltd on behalf of European Association of Personality Psychology PMID:28303077
Breast and ovarian cancer risks to carriers of the BRCA1 5382insC and 185delAG and BRCA2 6174delT mutations: a combined analysis of 22 population based studies

PubMed Central

Antoniou, A; Pharoah, P; Narod, S; Risch, H; Eyfjord, J; Hopper, J; Olsson, H; Johannsson, O; Borg, A; Pasini, B; Radice, P; Manoukian, S; Eccles, D; Tang, N; Olah, E; Anton-Culver, H; Warner, E; Lubinski, J; Gronwald, J; Gorski, B; Tulinius, H; Thorlacius, S; Eerola, H; Nevanlinna, H; Syrjakoski, K; Kallioniemi, O; Thompson, D; Evans, C; Peto, J; Lalloo, F; Evans, D; Easton, D

2005-01-01

A recent report estimated the breast cancer risks in carriers of the three Ashkenazi founder mutations to be higher than previously published estimates derived from population based studies. In an attempt to confirm this, the breast and ovarian cancer risks associated with the three Ashkenazi founder mutations were estimated using families included in a previous meta-analysis of population based studies. The estimated breast cancer risks for each of the founder BRCA1 and BRCA2 mutations were similar to the corresponding estimates based on all BRCA1 or BRCA2 mutations in the meta-analysis. These estimates appear to be consistent with the observed prevalence of the mutations in the Ashkenazi Jewish population. PMID:15994883

Engineering Adipose-like Tissue in vitro and in vivo Utilizing Human Bone Marrow and Adipose-derived Mesenchymal Stem Cells with Silk Fibroin 3D Scaffolds

PubMed Central

Mauney, Joshua R; Nguyen, Trang; Gillen, Kelly; Kirker-Head, Carl; Gimble, Jeffrey M.; Kaplan, David L.

2009-01-01

Biomaterials derived from silk fibroin prepared by aqueous (AB) and organic (HFIP) solvent based processes, along with collagen (COL) and poly-lactic acid (PLA) based scaffolds were studied in vitro and in vivo for their utility in adipose tissue engineering strategies. For in vitro studies, human bone marrow and adipose-derived mesenchymal stem cells (hMSCs and hASCs) were seeded on the various biomaterials and cultured for 21 days in the presence of adipogenic stimulants (AD) or maintained as noninduced controls. Alamar Blue analysis revealed each biomaterial supported initial attachment of hMSCs and hASCs to similar levels for all matrices except COL in which higher levels were observed. hASCs and hMSCs cultured on all biomaterials in the presence of AD showed significant upregulation of adipogenic mRNA transcript levels (LPL, GLUT4, FABP4, PPARγ, adipsin, ACS) to similar extents when compared to noninduced controls. Similarly, Oil-Red O analysis of hASC or hMSC-seeded scaffolds displayed substantial amounts of lipid accumulating adipocytes following cultivation with AD. The data revealed AB and HFIP scaffolds supported similar extents of lipid accumulating cells while PLA and COL scaffolds qualitatively displayed lower and higher extents by comparison, respectively. Following a 4 week implantation period in a rat muscle pouch defect model, both AB and HFIP scaffolds supported in vivo adipogenesis either alone or seeded with hASCs or hMSCs as assessed by Oil-Red O analysis; however, the presence of exogenous cell sources substantially increased the extent and frequency of adipogenesis observed. In contrast, COL and PLA scaffolds underwent rapid scaffold degradation and were irretrievable following the implantation period. The results suggest that macroporous 3D AB and HFIP silk fibroin scaffolds offer an important platform for cell-based adipose tissue engineering applications, and in particular, provide longer-term structural integrity to promote the maintenance of soft tissue in vivo. PMID:17765303
Molecular description of α-keto-based inhibitors of cruzain with activity against Chagas disease combining 3D-QSAR studies and molecular dynamics.

PubMed

Saraiva, Ádria P B; Miranda, Ricardo M; Valente, Renan P P; Araújo, Jéssica O; Souza, Rutelene N B; Costa, Clauber H S; Oliveira, Amanda R S; Almeida, Michell O; Figueiredo, Antonio F; Ferreira, João E V; Alves, Cláudio Nahum; Honorio, Kathia M

2018-04-22

In this work, a group of α-keto-based inhibitors of the cruzain enzyme with anti-chagas activity was selected for a three-dimensional quantitative structure-activity relationship study (3D-QSAR) combined with molecular dynamics (MD). Firstly, statistical models based on Partial Least Square (PLS) regression were developed employing comparative molecular field analysis (CoMFA) and comparative molecular similarity indices analysis (CoMSIA) descriptors. Validation parameters (q² and r²) for the models were, respectively, 0.910 and 0.997 (CoMFA) and 0.913 and 0.992 (CoMSIA). In addition, external validation for the models using a test group revealed r²(pred) = 0.728 (CoMFA) and 0.971 (CoMSIA). The most relevant aspect in this study was the generation of molecular fields in both favorable and unfavorable regions based on the models developed. These fields are important to interpret modifications necessary to enhance the biological activities of the inhibitors. This analysis was restricted to the inhibitors in a fixed conformation, not interacting with their target, the cruzain enzyme. Then, MD was employed taking into account important variables such as time and temperature. MD helped describe the behavior of the inhibitors, and their properties showed results similar to those generated by the 3D-QSAR study. © 2018 John Wiley & Sons A/S.
A Knowledge-Based System For Analysis, Intervention Planning and Prevention of Defects in Immovable Cultural Heritage Objects and Monuments

NASA Astrophysics Data System (ADS)

Valach, J.; Cacciotti, R.; Kuneš, P.; Čerňanský, M.; Bláha, J.

2012-04-01

The paper presents a project aiming to develop a knowledge-based system for documentation and analysis of defects of cultural heritage objects and monuments. The MONDIS information system concentrates knowledge on damage of immovable structures due to various causes, and preventive/remedial actions performed to protect/repair them, where possible. The currently built system is to provide for understanding of causal relationships between a defect, materials, external load, and environment of built object. Foundation for the knowledge-based system will be the systemized and formalized knowledge on defects and their mitigation acquired in the process of analysis of a representative set of cases documented in the past. On the basis of design comparability, used technologies, materials and the nature of the external forces and surroundings, the developed software system has the capacity to indicate the most likely risks of new defect occurrence or the extension of the existing ones. The system will also allow for a comparison of the actual failure with similar cases documented and will propose a suitable technical intervention plan. The system will provide conservationists, administrators and owners of historical objects with a toolkit for defect documentation for their objects. Also, advanced artificial intelligence methods will offer accumulated knowledge to users and will also enable them to get oriented in relevant techniques of preventive interventions and reconstructions based on similarity with their case.
Data-Driven Hierarchical Structure Kernel for Multiscale Part-Based Object Recognition

PubMed Central

Wang, Botao; Xiong, Hongkai; Jiang, Xiaoqian; Zheng, Yuan F.

2017-01-01

Detecting generic object categories in images and videos is a fundamental issue in computer vision. However, it faces challenges from inter- and intraclass diversity, as well as distortions caused by viewpoints, poses, deformations, and so on. To handle object variations, this paper constructs a structure kernel and proposes a multiscale part-based model incorporating the discriminative power of kernels. The structure kernel measures the resemblance of part-based objects in three aspects: 1) the global similarity term to measure the resemblance of the global visual appearance of relevant objects; 2) the part similarity term to measure the resemblance of the visual appearance of distinctive parts; and 3) the spatial similarity term to measure the resemblance of the spatial layout of parts. In essence, the deformation of parts in the structure kernel is penalized in a multiscale space with respect to horizontal displacement, vertical displacement, and scale difference. Part similarities are combined with different weights, which are optimized efficiently to maximize the intraclass similarities and minimize the interclass similarities by the normalized stochastic gradient ascent algorithm. In addition, the parameters of the structure kernel are learned during the training process with regard to the distribution of the data in a more discriminative way. With flexible part sizes on scale and displacement, it can be more robust to the intraclass variations, poses, and viewpoints. Theoretical analysis and experimental evaluations demonstrate that the proposed multiscale part-based representation model with structure kernel exhibits accurate and robust performance, and outperforms state-of-the-art object classification approaches. PMID:24808345
Analysis of combined data from heterogeneous study designs: an applied example from the patient navigation research program.

PubMed

Roetzheim, Richard G; Freund, Karen M; Corle, Don K; Murray, David M; Snyder, Frederick R; Kronman, Andrea C; Jean-Pierre, Pascal; Raich, Peter C; Holden, Alan Ec; Darnell, Julie S; Warren-Mears, Victoria; Patierno, Steven

2012-04-01

The Patient Navigation Research Program (PNRP) is a cooperative effort of nine research projects, with similar clinical criteria but with different study designs. To evaluate projects such as PNRP, it is desirable to perform a pooled analysis to increase power relative to the individual projects. There is no agreed-upon prospective methodology, however, for analyzing combined data arising from different study designs. Expert opinions were thus solicited from the members of the PNRP Design and Analysis Committee. To review possible methodologies for analyzing combined data arising from heterogeneous study designs. The Design and Analysis Committee critically reviewed the pros and cons of five potential methods for analyzing combined PNRP project data. The conclusions were based on simple consensus. The five approaches reviewed included the following: (1) analyzing and reporting each project separately, (2) combining data from all projects and performing an individual-level analysis, (3) pooling data from projects having similar study designs, (4) analyzing pooled data using a prospective meta-analytic technique, and (5) analyzing pooled data utilizing a novel simulated group-randomized design. Methodologies varied in their ability to incorporate data from all PNRP projects, to appropriately account for differing study designs, and to accommodate differing project sample sizes. The conclusions reached were based on expert opinion and not derived from actual analyses performed. The ability to analyze pooled data arising from differing study designs may provide pertinent information to inform programmatic, budgetary, and policy perspectives. Multisite community-based research may not lend itself well to the more stringent explanatory and pragmatic standards of a randomized controlled trial design. Given our growing interest in community-based population research, the challenges inherent in the analysis of heterogeneous study design are likely to become more salient. 
Discussion of the analytic issues faced by the PNRP and the methodological approaches we considered may be of value to other prospective community-based research programs.
Comparative analysis of aging policy reforms in Argentina, Chile, Costa Rica, and Mexico.

PubMed

Calvo, Esteban; Berho, Maureen; Roqué, Mónica; Amaro, Juan Sebastián; Morales, Fernando; Rivera, Emiliana; Gutiérrez Robledo, Luis Miguel F; López, Elizabeth Caro; Canals, Bernardita; Kornfeld, Rosa

2018-04-16

This investigation uses case studies and comparative analysis to review and analyze aging policy in Argentina, Chile, Costa Rica, and Mexico, and uncovers similarities and relevant trends in the substance of historical and current aging policy across countries. Initial charity-based approaches to poverty and illness have been gradually replaced by a rights-based approach considering broader notions of well-being, and recent reforms emphasize the need for national, intersectoral, evidence-based policy. The results of this study have implications for understanding aging policy in Latin America from a welfare regime and policymakers' perspective, identifying priorities for intervention, and informing policy reforms in developing countries worldwide.
Gravity Scaling of a Power Reactor Water Shield

NASA Technical Reports Server (NTRS)

Reid, Robert S.; Pearson, J. Boise

2007-01-01

A similarity analysis on a water-based reactor shield examined the effect of gravity on free convection between a reactor shield inner and outer vessel boundaries. Two approaches established similarity between operation on the Earth and the Moon: 1) direct scaling of Rayleigh number equating gravity-surface heat flux products, 2) temperature difference between the wall and thermal boundary layer held constant. Nusselt number for natural convection (laminar and turbulent) is assumed of form Nu = C Ra^n.
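The first scaling approach can be written out as a short worked relation. This is only a sketch: it assumes the flux-based (modified) Rayleigh number, which the abstract does not state explicitly. The symbols β, k, ν, α, L and q'' stand for the usual fluid properties, length scale, and surface heat flux, and the gravity values are standard approximate figures, not numbers taken from the report.

```latex
% Assumed: flux-based (modified) Rayleigh number Ra^*; same fluid and geometry on Earth (E) and Moon (M).
\begin{gather}
\mathrm{Nu} = C\,\mathrm{Ra}^{n}, \qquad
\mathrm{Ra}^{*} = \frac{g\,\beta\,q''\,L^{4}}{k\,\nu\,\alpha} \\
\mathrm{Ra}^{*}_{E} = \mathrm{Ra}^{*}_{M}
\;\Longrightarrow\; g_{E}\,q''_{E} = g_{M}\,q''_{M} \\
q''_{E} = \frac{g_{M}}{g_{E}}\,q''_{M}
\approx \frac{1.62}{9.81}\,q''_{M} \approx 0.165\,q''_{M}
\end{gather}
```

Under these assumptions, an Earth-based test would reproduce the lunar Rayleigh number with roughly one sixth of the lunar surface heat flux.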
Predicting new drug indications from network analysis

NASA Astrophysics Data System (ADS)

Mohd Ali, Yousoff Effendy; Kwa, Kiam Heong; Ratnavelu, Kurunathan

This work adapts centrality measures commonly used in social network analysis to identify drugs with better positions in drug-side effect network and drug-indication network for the purpose of drug repositioning. Our basic hypothesis is that drugs having similar phenotypic profiles such as side effects may also share similar therapeutic properties based on related mechanism of action and vice versa. The networks were constructed from Side Effect Resource (SIDER) 4.1 which contains 1430 unique drugs with side effects and 1437 unique drugs with indications. Within the giant components of these networks, drugs were ranked based on their centrality scores whereby 18 prominent drugs from the drug-side effect network and 15 prominent drugs from the drug-indication network were identified. Indications and side effects of prominent drugs were deduced from the profiles of their neighbors in the networks and compared to existing clinical studies while an optimum threshold of similarity among drugs was sought for. The threshold can then be utilized for predicting indications and side effects of all drugs. Similarities of drugs were measured by the extent to which they share phenotypic profiles and neighbors. To improve the likelihood of accurate predictions, only profiles such as side effects of common or very common frequencies were considered. In summary, our work is an attempt to offer an alternative approach to drug repositioning using centrality measures commonly used for analyzing social networks.
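The ranking step described above can be illustrated with a minimal sketch. It assumes Jaccard overlap of side-effect profiles as the similarity measure and plain degree centrality as the score; the abstract names neither choice, and the drug names, profiles, and threshold are hypothetical toy values rather than SIDER data.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of side effects two drugs have in common (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_drug_network(profiles, threshold):
    """Link two drugs when their profile similarity reaches the threshold."""
    edges = {}
    for d1, d2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[d1], profiles[d2])
        if score >= threshold:
            edges[(d1, d2)] = score
    return edges


def degree_centrality(profiles, edges):
    """Fraction of the other drugs each drug is linked to."""
    n = len(profiles)
    degree = {drug: 0 for drug in profiles}
    for d1, d2 in edges:
        degree[d1] += 1
        degree[d2] += 1
    return {drug: (deg / (n - 1) if n > 1 else 0.0) for drug, deg in degree.items()}


# Hypothetical toy profiles, for illustration only.
profiles = {
    "drug_a": {"nausea", "headache", "rash"},
    "drug_b": {"nausea", "headache"},
    "drug_c": {"nausea", "dizziness"},
    "drug_d": {"insomnia"},
}
edges = build_drug_network(profiles, threshold=0.3)
centrality = degree_centrality(profiles, edges)
ranking = sorted(centrality, key=centrality.get, reverse=True)
```

In this toy network the drug that shares profiles with the most other drugs ranks first, which is the sense in which a "prominent" drug is identified; other centrality measures would slot into the same ranking step.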
Receptor-driven, multimodal mapping of the human amygdala.

PubMed

Kedo, Olga; Zilles, Karl; Palomero-Gallagher, Nicola; Schleicher, Axel; Mohlberg, Hartmut; Bludau, Sebastian; Amunts, Katrin

2018-05-01

The human amygdala consists of subdivisions contributing to various functions. However, principles of structural organization at the cellular and molecular level are not well understood. Thus, we re-analyzed the cytoarchitecture of the amygdala and generated cytoarchitectonic probabilistic maps of ten subdivisions in stereotaxic space based on novel workflows and mapping tools. This parcellation was then used as a basis for analyzing the receptor expression for 15 receptor types. Receptor fingerprints, i.e., the characteristic balance between densities of all receptor types, were generated in each subdivision to comprehensively visualize differences and similarities in receptor architecture between the subdivisions. Fingerprints of the central and medial nuclei and the anterior amygdaloid area were highly similar. Fingerprints of the lateral, basolateral and basomedial nuclei were also similar to each other, while those of the remaining nuclei were distinct in shape. Similarities were further investigated by a hierarchical cluster analysis: a two-cluster solution subdivided the phylogenetically older part (central, medial nuclei, anterior amygdaloid area) from the remaining parts of the amygdala. A more fine-grained three-cluster solution replicated our previous parcellation including a laterobasal, superficial and centromedial group. Furthermore, it helped to better characterize the paralaminar nucleus with a molecular organization in-between the laterobasal and the superficial group. The multimodal cyto- and receptor-architectonic analysis of the human amygdala provides new insights into its microstructural organization, intersubject variability, localization in stereotaxic space and principles of receptor-based neurochemical differences.
Empirical evaluation of grouping of lower urinary tract symptoms: principal component analysis of Tampere Ageing Male Urological Study data.

PubMed

Pöyhönen, Antti; Häkkinen, Jukka T; Koskimäki, Juha; Hakama, Matti; Tammela, Teuvo L J; Auvinen, Anssi

2013-03-01

WHAT'S KNOWN ON THE SUBJECT? AND WHAT DOES THE STUDY ADD?: The ICS has divided LUTS into three groups: storage, voiding and post-micturition symptoms. The classification is based on anatomical, physiological and urodynamic considerations of a theoretical nature. We used principal component analysis (PCA) to determine the inter-correlations of various LUTS, which is a novel approach to research and can strengthen existing knowledge of the phenomenology of LUTS. After we had completed our analyses, another study was published that used a similar approach and results were very similar to those of the present study. We evaluated the constellation of LUTS using PCA of the data from a population-based study that included >4000 men. In our analysis, three components emerged from the 12 LUTS: voiding, storage and incontinence components. Our results indicated that incontinence may be separate from the other storage symptoms and post-micturition symptoms should perhaps be regarded as voiding symptoms. To determine how lower urinary tract symptoms (LUTS) relate to each other and assess if the classification proposed by the International Continence Society (ICS) is consistent with empirical findings. The information on urinary symptoms for this population-based study was collected using a self-administered postal questionnaire in 2004. The questionnaire was sent to 7470 men, aged 30-80 years, from Pirkanmaa County (Finland), of whom 4384 (58.7%) returned the questionnaire. The Danish Prostatic Symptom Score-1 questionnaire was used to evaluate urinary symptoms. Principal component analysis (PCA) was used to evaluate the inter-correlations among various urinary symptoms. The PCA produced a grouping of 12 LUTS into three categories consisting of voiding, storage and incontinence symptoms. Post-micturition symptoms were related to voiding symptoms, but incontinence symptoms were separate from storage symptoms. 
In the analyses by age group, similar categorization was found at ages 40, 50, 60 and 80 years, but only two groups of symptoms emerged among men aged 70 years. The prevalence among men aged 30 was too low for meaningful analysis. This population-based study suggests that LUTS can be divided into three subgroups consisting of voiding, storage and incontinence symptoms based on their inter-correlations. Our empirical findings suggest an alternative grouping of LUTS. The potential utility of such an approach requires careful consideration. © 2012 BJU International.
Assessing Low-Intensity Relationships in Complex Networks

PubMed Central

Spitz, Andreas; Gimmler, Anna; Stoeck, Thorsten; Zweig, Katharina Anna; Horvát, Emőke-Ágnes

2016-01-01

Many large network data sets are noisy and contain links representing low-intensity relationships that are difficult to differentiate from random interactions. This is especially relevant for high-throughput data from systems biology, large-scale ecological data, but also for Web 2.0 data on human interactions. In these networks with missing and spurious links, it is possible to refine the data based on the principle of structural similarity, which assesses the shared neighborhood of two nodes. By using similarity measures to globally rank all possible links and choosing the top-ranked pairs, true links can be validated, missing links inferred, and spurious observations removed. While many similarity measures have been proposed to this end, there is no general consensus on which one to use. In this article, we first contribute a set of benchmarks for complex networks from three different settings (e-commerce, systems biology, and social networks) and thus enable a quantitative performance analysis of classic node similarity measures. Based on this, we then propose a new methodology for link assessment called z* that assesses the statistical significance of the number of their common neighbors by comparison with the expected value in a suitably chosen random graph model and which is a consistently top-performing algorithm for all benchmarks. In addition to a global ranking of links, we also use this method to identify the most similar neighbors of each single node in a local ranking, thereby showing the versatility of the method in two distinct scenarios and augmenting its applicability. Finally, we perform an exploratory analysis on an oceanographic plankton data set and find that the distribution of microbes follows similar biogeographic rules as those of macroorganisms, a result that rejects the global dispersal hypothesis for microbes. PMID:27096435
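The idea of scoring a node pair by how far its common-neighbor count exceeds a random expectation can be sketched as follows. This is not the authors' z* method: the abstract only mentions "a suitably chosen random graph model", so the sketch assumes a simple hypergeometric null in which one node's neighbors are drawn uniformly at random from all other nodes. The graph is a hypothetical toy example.

```python
import math
from itertools import combinations


def common_neighbor_z(adj, u, v):
    """z-score of the common-neighbor count of u and v under an assumed hypergeometric null."""
    nu = adj[u] - {v}          # neighbors of u other than v
    nv = adj[v] - {u}          # neighbors of v other than u
    pool = len(adj) - 2        # candidate neighbors: every node except u and v
    if pool < 2:
        return 0.0
    observed = len(nu & nv)
    p = len(nu) / pool
    draws = len(nv)
    mean = draws * p
    var = draws * p * (1 - p) * (pool - draws) / (pool - 1)
    if var == 0:
        return 0.0
    return (observed - mean) / math.sqrt(var)


# Hypothetical undirected toy graph as a symmetric adjacency map.
adj = {
    0: {2, 3},
    1: {2, 3},
    2: {0, 1},
    3: {0, 1, 4},
    4: {3, 5},
    5: {4},
}

# Global ranking of all node pairs, most significant first.
ranked = sorted(
    combinations(sorted(adj), 2),
    key=lambda pair: common_neighbor_z(adj, *pair),
    reverse=True,
)
```

The same score restricted to pairs containing one fixed node would give the local ranking the abstract describes.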
A comprehensive physiologically based pharmacokinetic ...

EPA Pesticide Factsheets

Published physiologically based pharmacokinetic (PBPK) models from peer-reviewed articles are often well-parameterized, thoroughly-vetted, and can be utilized as excellent resources for the construction of models pertaining to related chemicals. Specifically, chemical-specific parameters and in vivo pharmacokinetic data used to calibrate these published models can act as valuable starting points for model development of new chemicals with similar molecular structures. A knowledgebase for published PBPK-related articles was compiled to support PBPK model construction for new chemicals based on their close analogues within the knowledgebase, and a web-based interface was developed to allow users to query those close analogues. A list of 689 unique chemicals and their corresponding 1751 articles was created after analysis of 2,245 PBPK-related articles. For each model, the PMID, chemical name, major metabolites, species, gender, life stages and tissue compartments were extracted from the published articles. PaDEL-Descriptor, a Chemistry Development Kit based software, was used to calculate molecular fingerprints. Tanimoto index was implemented in the user interface as measurement of structural similarity. The utility of the PBPK knowledgebase and web-based user interface was demonstrated using two case studies with ethylbenzene and gefitinib. Our PBPK knowledgebase is a novel tool for ranking chemicals based on similarities to other chemicals associated with existi
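The Tanimoto index mentioned above is simple to state in code. The sketch below assumes each fingerprint is represented as the set of its "on" bit positions; the chemical names and bit positions are hypothetical placeholders, not PaDEL-Descriptor output or entries from the knowledgebase.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto index: shared on-bits divided by on-bits present in either fingerprint."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def rank_analogues(query_fp, knowledgebase):
    """Return (name, similarity) pairs, most similar chemical first."""
    scored = [(name, tanimoto(query_fp, fp)) for name, fp in knowledgebase.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical fingerprints given as sets of on-bit indices.
knowledgebase = {
    "chem_x": {1, 2, 3, 5},
    "chem_y": {2, 7},
    "chem_z": {8, 9},
}
ranked = rank_analogues({1, 2, 3}, knowledgebase)
```

A query for close analogues then amounts to taking the top entries of the ranked list, optionally above a similarity cutoff.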
Reading Guided by Automated Graphical Representations: How Model-Based Text Visualizations Facilitate Learning in Reading Comprehension Tasks

ERIC Educational Resources Information Center

Pirnay-Dummer, Pablo; Ifenthaler, Dirk

2011-01-01

Our study integrates automated natural language-oriented assessment and analysis methodologies into feasible reading comprehension tasks. With the newly developed T-MITOCAR toolset, prose text can be automatically converted into an association net which has similarities to a concept map. The "text to graph" feature of the software is based on…
Interactive decision support in hepatic surgery

PubMed Central

Dugas, Martin; Schauer, Rolf; Volk, Andreas; Rau, Horst

2002-01-01

Background Hepatic surgery is characterized by complicated operations with a significant peri- and postoperative risk for the patient. We developed a web-based, high-granular research database for comprehensive documentation of all relevant variables to evaluate new surgical techniques. Methods To integrate this research system into the clinical setting, we designed an interactive decision support component. The objective is to provide relevant information for the surgeon and the patient to assess preoperatively the risk of a specific surgical procedure. Based on five established predictors of patient outcomes, the risk assessment tool searches for similar cases in the database and aggregates the information to estimate the risk for an individual patient. Results The physician can verify the analysis and exclude manually non-matching cases according to his expertise. The analysis is visualized by means of a Kaplan-Meier plot. To evaluate the decision support component we analyzed data on 165 patients diagnosed with hepatocellular carcinoma (period 1996–2000). The similarity search provides a two-peak distribution indicating there are groups of similar patients and singular cases which are quite different to the average. The results of the risk estimation are consistent with the observed survival data, but must be interpreted with caution because of the limited number of matching reference cases. Conclusion Critical issues for the decision support system are clinical integration, a transparent and reliable knowledge base and user feedback. PMID:12003639
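The aggregation step behind a Kaplan-Meier plot can be sketched briefly. The example assumes the similar cases have already been selected and that each one is given as a follow-up time plus a flag saying whether the event was observed or the case was censored; the function name and the data are hypothetical, not taken from the described system.

```python
def kaplan_meier(cases):
    """Kaplan-Meier estimate from (time, event_observed) pairs.

    Returns a list of (time, survival_probability) steps, one per time
    at which at least one event was observed.
    """
    ordered = sorted(cases)
    at_risk = len(ordered)
    survival = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        events = 0
        removed = 0
        # Group all cases that share the same follow-up time.
        while i < len(ordered) and ordered[i][0] == t:
            events += 1 if ordered[i][1] else 0
            removed += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Both events and censored cases leave the risk set after time t.
        at_risk -= removed
    return curve


# Hypothetical matched cases: (months of follow-up, event observed?).
matched_cases = [(5, True), (8, False), (12, True), (12, True), (20, False)]
curve = kaplan_meier(matched_cases)
```

With only a handful of matched cases each step of the curve is large, which illustrates why the abstract urges caution when few reference cases match.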
PROBLEMS AND METHODOLOGY OF THE PETROLOGIC ANALYSIS OF COAL FACIES.

USGS Publications Warehouse

Chao, Edward C.T.

1983-01-01

This condensed synthesis gives a broad outline of the methodology of coal facies analysis, procedures for constructing sedimentation and geochemical formation curves, and micro- and macrostratigraphic analysis. The hypothetical coal bed profile has a 3-fold cycle of material characteristics. Based on studies of other similar profiles of the same coal bed, and on field studies of the sedimentary rock types and their facies interpretation, one can assume that the 3-fold subdivision is of regional significance.
Analysis of BSRT Profiles in the LHC at Injection

DOE Office of Scientific and Technical Information (OSTI.GOV)

Fitterer, M.; Stancari, G.; Papadopoulou, S.

The beam synchrotron radiation telescope (BSRT) at the LHC records profiles of the transverse beam distribution, which can provide useful additional insight into its evolution. A Python class has been developed [1] that reads in the BSRT profiles, usually stored in binary format, runs different analysis tools, and generates plots of the statistical parameters and profiles as well as videos of the profiles. The detailed analysis will be described in this note. The analysis is based on the data obtained at injection energy (450 GeV) during MD1217 [2] and MD1415 [3], which will also be used as an illustrative example. A similar approach is also taken with a MATLAB-based analysis described in [4].
Extensional channel flow revisited: a dynamical systems perspective

PubMed Central

Meseguer, Alvaro; Mellibovsky, Fernando; Weidman, Patrick D.

2017-01-01

Extensional self-similar flows in a channel are explored numerically for arbitrary stretching–shrinking rates of the confining parallel walls. The present analysis embraces time integrations, and continuations of steady and periodic solutions unfolded in the parameter space. Previous studies focused on the analysis of branches of steady solutions for particular stretching–shrinking rates, although recent studies focused also on the dynamical aspects of the problems. We have adopted a dynamical systems perspective, analysing the instabilities and bifurcations the base state undergoes when increasing the Reynolds number. It has been found that the base state becomes unstable for small Reynolds numbers, and a transitional region including complex dynamics takes place at intermediate Reynolds numbers, depending on the wall acceleration values. The base flow instabilities are constitutive parts of different codimension-two bifurcations that control the dynamics in parameter space. For large Reynolds numbers, the restriction to self-similarity results in simple flows with no realistic behaviour, but the flows obtained in the transition region can be a valuable tool for the understanding of the dynamics of realistic Navier–Stokes solutions. PMID:28690413
Variable-intercept panel model for deformation zoning of a super-high arch dam.

PubMed

Shi, Zhongwen; Gu, Chongshi; Qin, Dong

2016-01-01

This study determines dam deformation similarity indexes based on an analysis of deformation zoning features and panel data clustering theory, with comprehensive consideration to the actual deformation law of super-high arch dams and the spatial-temporal features of dam deformation. Measurement methods of these indexes are studied. Based on the established deformation similarity criteria, the principle used to determine the number of dam deformation zones is constructed through entropy weight method. This study proposes the deformation zoning method for super-high arch dams and the implementation steps, analyzes the effect of special influencing factors of different dam zones on the deformation, introduces dummy variables that represent the special effect of dam deformation, and establishes a variable-intercept panel model for deformation zoning of super-high arch dams. Based on different patterns of the special effect in the variable-intercept panel model, two panel analysis models were established to monitor fixed and random effects of dam deformation. Hausman test method of model selection and model effectiveness assessment method are discussed. Finally, the effectiveness of established models is verified through a case study.
DeltaSA tool for source apportionment benchmarking, description and sensitivity analysis

NASA Astrophysics Data System (ADS)

Pernigotti, D.; Belis, C. A.

2018-05-01

DeltaSA is an R-package and a Java on-line tool developed at the EC-Joint Research Centre to assist and benchmark source apportionment applications. Its key functionalities support two critical tasks in this kind of study: the assignment of a factor to a source in factor analytical models (source identification) and the model performance evaluation. The source identification is based on the similarity between a given factor and source chemical profiles from public databases. The model performance evaluation is based on statistical indicators used to compare model output with reference values generated in intercomparison exercises. The reference values are calculated as the ensemble average of the results reported by participants that have passed a set of testing criteria based on chemical profiles and time series similarity. In this study, a sensitivity analysis of the model performance criteria is accomplished using the results of a synthetic dataset where "a priori" references are available. The consensus modulated standard deviation punc gives the best choice for the model performance evaluation when a conservative approach is adopted.

Multivariate approaches for stability control of the olive oil reference materials for sensory analysis - part I: framework and fundamentals.

PubMed

Valverde-Som, Lucia; Ruiz-Samblás, Cristina; Rodríguez-García, Francisco P; Cuadros-Rodríguez, Luis

2018-02-09

Virgin olive oil is the only food product for which sensory analysis is regulated to classify it in different quality categories. To harmonize the results of the sensory method, the use of standards or reference materials is crucial. The stability of sensory reference materials is required to enable their suitable control, aiming to confirm that their specific target values are maintained on an ongoing basis. Currently, such stability is monitored by means of sensory analysis, and the sensory panels are in the paradoxical situation of controlling the standards that are devoted to controlling the panels. In the present study, several approaches based on similarity analysis are exploited. For each approach, the specific methodology to build a proper multivariate control chart to monitor the stability of the sensory properties is explained and discussed. The normalized Euclidean and Mahalanobis distances, the so-called nearness and hardiness indices respectively, have been defined as new similarity indices to range the values from 0 to 1. Also, the squared mean from Hotelling's T²-statistic and Q²-statistic has been proposed as another similarity index. © 2018 Society of Chemical Industry.
K2 and K2*: efficient alignment-free sequence similarity measurement based on Kendall statistics.

PubMed

Lin, Jie; Adjeroh, Donald A; Jiang, Bing-Hua; Jiang, Yue

2018-05-15

Alignment-free sequence comparison methods can compute the pairwise similarity between a huge number of sequences much faster than sequence-alignment based methods. We propose a new non-parametric alignment-free sequence comparison method, called K2, based on the Kendall statistics. Compared with other state-of-the-art alignment-free comparison methods, K2 demonstrates competitive performance in generating the phylogenetic tree, in evaluating functionally related regulatory sequences, and in computing the edit distance (similarity/dissimilarity) between sequences. Furthermore, the K2 approach is much faster than the other methods. An improved method, K2*, is also proposed, which is able to determine the appropriate algorithmic parameter (length) automatically, without first considering different values. Comparative analysis with the state-of-the-art alignment-free sequence similarity methods demonstrates the superiority of the proposed approaches, especially with increasing sequence length or increasing dataset sizes. The K2 and K2* approaches are implemented in the R language as a package that is freely available (http://community.wvu.edu/daadjeroh/projects/K2/K2_1.0.tar.gz). Contact: yueljiang@163.com. Supplementary data are available at Bioinformatics online.
Hyperspectral remote sensing image retrieval system using spectral and texture features.

PubMed

Zhang, Jing; Geng, Wenhao; Liang, Xi; Li, Jiafeng; Zhuo, Li; Zhou, Qianlan

2017-06-01

Although many content-based image retrieval systems have been developed, few studies have focused on hyperspectral remote sensing images. In this paper, a hyperspectral remote sensing image retrieval system based on spectral and texture features is proposed. The main contributions are fourfold: (1) considering the "mixed pixel" in the hyperspectral image, endmembers as spectral features are extracted by an improved automatic pixel purity index algorithm, then the texture features are extracted with the gray level co-occurrence matrix; (2) similarity measurement is designed for the hyperspectral remote sensing image retrieval system, in which the similarity of spectral features is measured with the spectral information divergence and spectral angle match mixed measurement and in which the similarity of textural features is measured with Euclidean distance; (3) considering the limited ability of the human visual system, the retrieval results are returned after synthesizing true color images based on the hyperspectral image characteristics; (4) the retrieval results are optimized by adjusting the feature weights of similarity measurements according to the user's relevance feedback. The experimental results on NASA data sets show that our system achieves retrieval performance comparable or superior to that of existing hyperspectral analysis schemes.
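The spectral similarity in item (2) can be illustrated with a minimal sketch. It assumes the "mixed measurement" takes the commonly used form SID × tan(SAM); the abstract does not state the exact combination, so that choice, the function names, and the `eps` smoothing constant are all assumptions made for illustration, not the authors' implementation.

```python
import math


def spectral_angle(x, y):
    """Spectral angle (in radians) between two spectra of equal length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cos_theta)


def spectral_information_divergence(x, y, eps=1e-12):
    """Symmetric relative entropy between spectra normalized to sum to 1.

    eps is an assumed smoothing constant that avoids log(0) for empty bands.
    """
    sum_x, sum_y = sum(x), sum(y)
    p = [a / sum_x + eps for a in x]
    q = [b / sum_y + eps for b in y]
    return sum(
        pi * math.log(pi / qi) + qi * math.log(qi / pi)
        for pi, qi in zip(p, q)
    )


def sid_sam(x, y):
    """Assumed mixed measure: SID multiplied by the tangent of the angle."""
    return spectral_information_divergence(x, y) * math.tan(spectral_angle(x, y))


# Hypothetical endmember spectra (reflectance per band).
reference = [0.10, 0.25, 0.40, 0.35]
candidate = [0.12, 0.22, 0.41, 0.30]
print(sid_sam(reference, candidate))
```

Smaller values indicate more similar spectra; both factors are close to zero when one spectrum is a scaled copy of the other, so the measure is insensitive to overall brightness.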
PROSPECT improves cis-acting regulatory element prediction by integrating expression profile data with consensus pattern searches

PubMed Central

Fujibuchi, Wataru; Anderson, John S. J.; Landsman, David

2001-01-01

Consensus pattern and matrix-based searches designed to predict cis-acting transcriptional regulatory sequences have historically been subject to large numbers of false positives. We sought to decrease false positives by incorporating expression profile data into a consensus pattern-based search method. We have systematically analyzed the expression phenotypes of over 6000 yeast genes, across 121 expression profile experiments, and correlated them with the distribution of 14 known regulatory elements over sequences upstream of the genes. Our method is based on a metric we term probabilistic element assessment (PEA), which is a ranking of potential sites based on sequence similarity in the upstream regions of genes with similar expression phenotypes. For eight of the 14 known elements that we examined, our method had a much higher selectivity than a naïve consensus pattern search. Based on our analysis, we have developed a web-based tool called PROSPECT, which allows consensus pattern-based searching of gene clusters obtained from microarray data. PMID:11574681
HPLC fingerprint analysis combined with chemometrics for pattern recognition of ginger.

PubMed

Feng, Xu; Kong, Weijun; Wei, Jianhe; Ou-Yang, Zhen; Yang, Meihua

2014-03-01

Ginger, the fresh rhizome of Zingiber officinale Rosc. (Zingiberaceae), has been used worldwide; however, for a long time, there has been no internationally approved standard for its quality control. This study aimed to establish an efficacious combined method and pattern recognition technique for quality control of ginger. A simple, accurate and reliable method based on high-performance liquid chromatography with photodiode array (HPLC-PDA) detection was developed for establishing the chemical fingerprints of 10 batches of ginger from different markets in China. The method was validated in terms of precision, reproducibility and stability, and the relative standard deviations were all less than 1.57%. On the basis of this method, the fingerprints of 10 batches of ginger samples were obtained, which showed 16 common peaks. Coupled with similarity evaluation software, the similarities between each fingerprint of the sample and the simulative mean chromatogram were in the range of 0.998-1.000. Then, the chemometric techniques, including similarity analysis, hierarchical clustering analysis and principal component analysis, were applied to classify the ginger samples. Consistent results were obtained to show that ginger samples could be successfully classified into two groups. This study revealed that the HPLC-PDA method was simple, sensitive and reliable for fingerprint analysis and, moreover, for pattern recognition and quality control of ginger.
[Analysis of different forms Linderae Radix based on HPLC and NIRS fingerprints].

PubMed

Du, Wei-Feng; Yue, Xian-Ke; Wu, Yao; Ge, Wei-Hong; Lu, Tu-Lin; Wang, Zhi-Min

2016-10-01

Three different forms of Linderae Radix were evaluated by HPLC combined with NIRS fingerprint. The Linderae Radix was divided into three forms, including spindle root, straight root and old root. The HPLC fingerprints were developed, and then cluster analysis was performed using the SPSS software. The near-infrared spectra of Linderae Radix were collected, and a discriminant analysis model was then established. The similarity values of the spindle root and straight root were all above 0.990, while the similarity value of the old root was less than 0.850. Two forms of Linderae Radix were obviously divided into three parts by the NIRS model and Cluster analysis. The results of HPLC and FT-NIR analysis showed the quality of Linderae Radix old root was different from the spindle root and straight root. The combined use of the two methods could identify different forms of Linderae Radix quickly and accurately. Copyright© by the Chinese Pharmaceutical Association.
Non-linear analytic and coanalytic problems ( L_p-theory, Clifford analysis, examples)

NASA Astrophysics Data System (ADS)

Dubinskii, Yu A.; Osipenko, A. S.

2000-02-01

Two kinds of new mathematical model of variational type are put forward: non-linear analytic and coanalytic problems. The formulation of these non-linear boundary-value problems is based on a decomposition of the complete scale of Sobolev spaces into the "orthogonal" sum of analytic and coanalytic subspaces. A similar decomposition is considered in the framework of Clifford analysis. Explicit examples are presented.
On the feasibility of automatically selecting similar patients in highly individualized radiotherapy dose reconstruction for historic data of pediatric cancer survivors.

PubMed

Virgolin, Marco; van Dijk, Irma W E M; Wiersma, Jan; Ronckers, Cécile M; Witteveen, Cees; Bel, Arjan; Alderliesten, Tanja; Bosman, Peter A N

2018-04-01

The aim of this study is to establish the first step toward a novel and highly individualized three-dimensional (3D) dose distribution reconstruction method, based on CT scans and organ delineations of recently treated patients. Specifically, the feasibility of automatically selecting the CT scan of a recently treated childhood cancer patient who is similar to a given historically treated child who suffered from Wilms' tumor is assessed. A cohort of 37 recently treated children between 2 and 6 years old is considered. Five potential notions of ground-truth similarity are proposed, each focusing on different anatomical aspects. These notions are automatically computed from CT scans of the abdomen and 3D organ delineations (liver, spleen, spinal cord, external body contour). The first is based on deformable image registration, the second on the Dice similarity coefficient, the third on the Hausdorff distance, the fourth on pairwise organ distances, and the last is computed by means of the overlap volume histogram. The relationship between typically available features of historically treated patients and the proposed ground-truth notions of similarity is studied by adopting state-of-the-art machine learning techniques, including random forest. Also, the feasibility of automatically selecting the most similar patient is assessed by comparing ground-truth rankings of similarity with predicted rankings. Similarities (mainly) based on the external abdomen shape and on the pairwise organ distances are highly correlated (Pearson r_p ≥ 0.70) and are successfully modeled with random forests based on historically recorded features (pseudo-R² ≥ 0.69). In contrast, similarities based on the shape of internal organs cannot be modeled. For the similarities that random forest can reliably model, an estimation of feature relevance indicates that abdominal diameters and weight are the most important.
Experiments on automatically selecting similar patients lead to coarse, yet quite robust results: the most similar patient is retrieved only 22% of the time; however, the error in worst-case scenarios is limited, with the fourth most similar patient being retrieved. Results demonstrate that automatically selecting similar patients is feasible when focusing on the shape of the external abdomen and on the position of internal organs. Moreover, whereas the common practice in phantom-based dose reconstruction is to select a representative phantom using age, height, and weight as discriminant factors for any treatment scenario, our analysis on abdominal tumor treatment for children shows that the most relevant features are weight and the anterior-posterior and left-right abdominal diameters. © 2018 American Association of Physicists in Medicine.
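Of the five similarity notions listed in this abstract, the Dice similarity coefficient is the simplest to state: twice the overlap of two regions divided by the sum of their sizes. The sketch below assumes each organ delineation is given as a collection of voxel index tuples; that representation, the function name, and the toy data are illustrative assumptions, not the study's pipeline.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient 2|A ∩ B| / (|A| + |B|) for two voxel sets."""
    a, b = set(mask_a), set(mask_b)
    if not a and not b:
        # Two empty delineations are treated as identical by convention here.
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Hypothetical liver delineations of two patients as (x, y, z) voxel indices.
liver_patient_1 = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
liver_patient_2 = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}

# Overlap is 2 voxels, sizes are 4 and 3, so the result is 4/7 (about 0.571).
print(dice_coefficient(liver_patient_1, liver_patient_2))
```

A value of 1 means the two delineations coincide exactly and 0 means they do not overlap at all.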
Characterizing Chemical Similarity with Vibrational Spectroscopy: New Insights into the Substituent Effects in Monosubstituted Benzenes.

PubMed

Tao, Yunwen; Zou, Wenli; Cremer, Dieter; Kraka, Elfi

2017-10-26

A novel approach is presented to assess chemical similarity based on the local vibrational mode analysis developed by Konkoli and Cremer. The local mode frequency shifts are introduced as similarity descriptors that are sensitive to any electronic structure change. In this work, 59 different monosubstituted benzenes are compared. For a subset of 43 compounds, for which experimental data was available, the ortho-/para- and meta-directing effect in electrophilic aromatic substitution reactions could be correctly reproduced, proving the robustness of the new similarity index. For the remaining 16 compounds, the directing effect was predicted. The new approach is broadly applicable to all compounds for which either experimental or calculated vibrational frequency information is available.
Authentication of commercial spices based on the similarities between gas chromatographic fingerprints.

PubMed

Matsushita, Takaya; Zhao, Jing Jing; Igura, Noriyuki; Shimoda, Mitsuya

2018-06-01

A simple and solvent-free method was developed for the authentication of commercial spices. The similarities between gas chromatographic fingerprints were measured using similarity indices and multivariate data analyses, as morphological differentiation between dried powders and small spice particles was challenging. The volatile compounds present in 11 spices (i.e. allspice, anise, black pepper, caraway, clove, coriander, cumin, dill, fennel, star anise, and white pepper) were extracted by headspace solid-phase microextraction, and analysed by gas chromatography-mass spectrometry. The largest 10 peaks were selected from each total ion chromatogram, and a total of 65 volatiles were tentatively identified. The similarity indices (i.e. the congruence coefficients) were calculated using the data matrices of the identified compound relative peak areas to differentiate between two sets of fingerprints. Where pairs of similar fingerprints produced high congruence coefficients (>0.80), distinctive volatile markers were employed to distinguish between these samples. In addition, hierarchical cluster analysis and principal component analysis were performed to visualise the similarity among fingerprints, and the analysed spices were grouped and characterised according to their distinctive major components. This method is suitable for screening unknown spices, and can therefore be employed to evaluate the quality and authenticity of various spices. © 2017 Society of Chemical Industry.
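The congruence coefficient used as the similarity index here can be sketched as follows. The sketch assumes the standard (Tucker) definition, the uncentered cosine between two vectors of relative peak areas; the function name and the peak-area values are hypothetical, and only the 0.80 threshold is taken from the abstract.

```python
import math


def congruence_coefficient(x, y):
    """Congruence coefficient: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    if len(x) != len(y):
        raise ValueError("fingerprints must cover the same compounds")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0.0:
        raise ValueError("a fingerprint with all-zero peak areas has no direction")
    return numerator / denominator


# Hypothetical relative peak areas for the same four compounds in two spices.
spice_a = [0.40, 0.30, 0.20, 0.10]
spice_b = [0.38, 0.32, 0.18, 0.12]

coefficient = congruence_coefficient(spice_a, spice_b)
# Pairs above the 0.80 threshold would need distinctive volatile markers.
print(coefficient, coefficient > 0.80)
```

Because the vectors are not mean-centered, the coefficient compares the overall shape of two fingerprints rather than their deviations from an average, which suits non-negative peak-area data.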
Quantifying humpback whale song sequences to understand the dynamics of song exchange at the ocean basin scale.

PubMed

Garland, Ellen C; Noad, Michael J; Goldizen, Anne W; Lilley, Matthew S; Rekdahl, Melinda L; Garrigue, Claire; Constantine, Rochelle; Daeschler Hauser, Nan; Poole, M Michael; Robbins, Jooke

2013-01-01

Humpback whales have a continually evolving vocal sexual display, or "song," that appears to undergo both evolutionary and "revolutionary" change. All males within a population adhere to the current content and arrangement of the song. Populations within an ocean basin share similarities in their songs; this sharing is complex as multiple variations of the song (song types) may be present within a region at any one time. To quantitatively investigate the similarity of song types, songs were compared at both the individual singer and population level using the Levenshtein distance technique and cluster analysis. The highly stereotyped sequences of themes from the songs of 211 individuals from populations within the western and central South Pacific region from 1998 through 2008 were grouped together based on the percentage of song similarity, and compared to qualitatively assigned song types. The analysis produced clusters of highly similar songs that agreed with previous qualitative assignments. Each cluster contained songs from multiple populations and years, confirming the eastward spread of song types and their progressive evolution through the study region. Quantifying song similarity and exchange will assist in understanding broader song dynamics and contribute to the use of vocal displays as population identifiers.
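The Levenshtein distance counts the minimum number of insertions, deletions, and substitutions needed to turn one sequence into another. The sketch below applies it to theme sequences; the theme labels are invented, and normalizing by the longer sequence to obtain a percentage similarity is an assumption, since the abstract does not give the exact normalization used in the study.

```python
def levenshtein(seq_a, seq_b):
    """Minimum number of single-element edits turning seq_a into seq_b."""
    previous = list(range(len(seq_b) + 1))
    for i, item_a in enumerate(seq_a, start=1):
        current = [i]
        for j, item_b in enumerate(seq_b, start=1):
            substitution_cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity_percent(seq_a, seq_b):
    """Assumed normalization: 100 * (1 - distance / length of longer sequence)."""
    longest = max(len(seq_a), len(seq_b))
    if longest == 0:
        return 100.0
    return 100.0 * (1.0 - levenshtein(seq_a, seq_b) / longest)


# Hypothetical theme sequences from two singers.
song_1 = ["A", "B", "C", "D"]
song_2 = ["A", "C", "D", "E"]

print(levenshtein(song_1, song_2))         # 2 (drop "B", append "E")
print(similarity_percent(song_1, song_2))  # 50.0
```

A matrix of such pairwise similarities between songs is the kind of input a cluster analysis can then group into song types.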
Similarity Evaluation of Different Origins and Species of Dendrobiums by GC-MS and FTIR Analysis of Polysaccharides

PubMed Central

Chen, Nai-Dong; Chen, Nai-Fu; Li, Jun; Cao, Cai-Yun; Wang, Jin-Mei; Huang, He-Ping

2015-01-01

A GC-MS method combined with FTIR techniques for the analysis of polysaccharides was applied to evaluate the similarity between wild (W) and tissue-cultured (TC) Dendrobium huoshanense (DHS), Dendrobium officinale (DO), and Dendrobium moniliforme (DM) as well as 3 wild Dendrobium spp.: Dendrobium henanense (DHN), Dendrobium loddigesii (DL), and Dendrobium crepidatum (DC). Eight monosaccharides involving xylose, arabinose, rhamnose, glucose, mannose, fructose, galactose, and galacturonic acid were identified in the polysaccharide from each Dendrobium sample, while the contents of the monosugars varied remarkably across origins and species. Further similarity evaluation based on GC-MS data showed that the r_cor values of different origins of DHS, DO, and DM were 0.831, 0.865, and 0.884, respectively, while the r_cor values ranged from 0.475 to 0.837 across species. FTIR files of the polysaccharides revealed that the similarity coefficients between W and TC-DHS, DO, and DM were 88.7%, 86.8%, and 88.5%, respectively, in contrast to the similarity coefficients varying from 57.4% to 82.6% across species. These results suggested that the structural similarity of polysaccharides between different origins of the investigated Dendrobiums might be higher than what we had supposed. PMID:26539215
Advanced Taste Sensors Based on Artificial Lipids with Global Selectivity to Basic Taste Qualities and High Correlation to Sensory Scores

PubMed Central

Kobayashi, Yoshikazu; Habara, Masaaki; Ikezazki, Hidekazu; Chen, Ronggang; Naito, Yoshinobu; Toko, Kiyoshi

2010-01-01

Effective R&D and strict quality control of a broad range of foods, beverages, and pharmaceutical products require objective taste evaluation. Advanced taste sensors using artificial-lipid membranes have been developed based on concepts of global selectivity and high correlation with human sensory score. These sensors respond similarly to similar basic tastes, which they quantify with high correlations to sensory score. Using these unique properties, these sensors can quantify the basic tastes of saltiness, sourness, bitterness, umami, astringency and richness without multivariate analysis or artificial neural networks. This review describes all aspects of these taste sensors based on artificial lipid, ranging from the response principle and optimal design methods to applications in the food, beverage, and pharmaceutical markets. PMID:22319306
A new collaborative recommendation approach based on users clustering using artificial bee colony algorithm.

PubMed

Ju, Chunhua; Xu, Chonghuan

2013-01-01

Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on K-means clustering algorithm. In the process of clustering, we use artificial bee colony (ABC) algorithm to overcome the local optimal problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on users clustering algorithm outperforms many other recommendation methods.
A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm

PubMed Central

Ju, Chunhua

2013-01-01

Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on K-means clustering algorithm. In the process of clustering, we use artificial bee colony (ABC) algorithm to overcome the local optimal problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on users clustering algorithm outperforms many other recommendation methods. PMID:24381525
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

PubMed

Ma, Yan; Mazumdar, Madhu

2011-10-30

Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform as well as REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
Statistically advanced, self-similar, radial probability density functions of atmospheric and under-expanded hydrogen jets

NASA Astrophysics Data System (ADS)

Ruggles, Adam J.

2015-11-01

This paper presents improved statistical insight regarding the self-similar scalar mixing process of atmospheric hydrogen jets and the downstream region of under-expanded hydrogen jets. Quantitative planar laser Rayleigh scattering imaging is used to probe both jets. The self-similarity of statistical moments up to the sixth order (beyond the literature established second order) is documented in both cases. This is achieved using a novel self-similar normalization method that facilitated a degree of statistical convergence that is typically limited to continuous, point-based measurements. This demonstrates that image-based measurements of a limited number of samples can be used for self-similar scalar mixing studies. Both jets exhibit the same radial trends of these moments demonstrating that advanced atmospheric self-similarity can be applied in the analysis of under-expanded jets. Self-similar histograms away from the centerline are shown to be the combination of two distributions. The first is attributed to turbulent mixing. The second, a symmetric Poisson-type distribution centered on zero mass fraction, progressively becomes the dominant and eventually sole distribution at the edge of the jet. This distribution is attributed to shot noise-affected pure air measurements, rather than a diffusive superlayer at the jet boundary. This conclusion is reached after a rigorous measurement uncertainty analysis and inspection of pure air data collected with each hydrogen data set. A threshold based upon the measurement noise analysis is used to separate the turbulent and pure air data, and thusly estimate intermittency. Beta-distributions (four parameters) are used to accurately represent the turbulent distribution moments. This combination of measured intermittency and four-parameter beta-distributions constitutes a new, simple approach to model scalar mixing. 
Comparisons between global moments from the data and moments calculated using the proposed model show excellent agreement. This was attributed to the high quality of the measurements which reduced the width of the correctly identified, noise-affected pure air distribution, with respect to the turbulent mixing distribution. The ignitability of the atmospheric jet is determined using the flammability factor calculated from both kernel density estimated (KDE) PDFs and PDFs generated using the newly proposed model. Agreement between contours from both approaches is excellent. Ignitability of the under-expanded jet is also calculated using KDE PDFs. Contours are compared with those calculated by applying the atmospheric model to the under-expanded jet. Once again, agreement is excellent. This work demonstrates that self-similar scalar mixing statistics and ignitability of atmospheric jets can be accurately described by the proposed model. This description can be applied with confidence to under-expanded jets, which are more realistic of leak and fuel injection scenarios.
Morphodynamic modeling of erodible laminar channels.

PubMed

Devauchelle, Olivier; Josserand, Christophe; Lagrée, Pierre-Yves; Zaleski, Stéphane

2007-11-01

A two-dimensional model for the erosion generated by viscous free-surface flows, based on the shallow-water equations and the lubrication approximation, is presented. It has a family of self-similar solutions for straight erodible channels, with an aspect ratio that increases in time. It is also shown, through a simplified stability analysis, that a laminar river can generate various bar instabilities very similar to those observed in natural rivers. This theoretical similarity reflects the meandering and braiding tendencies of laminar rivers indicated by F. Métivier and P. Meunier [J. Hydrol. 27, 22 (2003)]. Finally, we propose a simple scenario for the transition between patterns observed in experimental erodible channels.
Axiomatic Analysis of Co-occurrence Similarity Functions

DTIC Science & Technology

2012-02-01

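Formally, the similarity $\mathrm{COS}_W(q, u)$ of a target node $u$ to the query $q$ based on weight matrix $W$ is:
$$\mathrm{COS}_W(q, u) = \sum_{c \in \Gamma(q) \cap \Gamma(u)} \frac{W_{qc} W_{uc}}{\lVert W_{q:} \rVert_2 \, \lVert W_{u:} \rVert_2}$$
...where $W_{q:}$ and $W_{u:}$ are the $q$th and $u$th rows of the $W$ matrix, respectively.
Symbol definitions: $q$, query item with respect to which similarities of other...
Per-common-neighbor terms of the similarity functions (table fragment):
... $W_{qc} W_{uc}$
AA: $\frac{1}{\log |\Gamma(c)|}$
COS: $\frac{W_{qc} W_{uc}}{\lVert W_{q:} \rVert_2 \, \lVert W_{u:} \rVert_2}$
FRW: $\frac{W_{qc}}{\sum_j W_{qj}} \cdot \frac{W_{uc}}{\sum_i W_{ic}}$
JAC: $\frac{1}{|\Gamma(q) \cup \Gamma(u)|}$
BRW: $\frac{W_{uc}}{\sum_j W_{uj}} \cdot \frac{W_{qc}}{\sum_i W_{ic}}$
PMI: $\frac{1}{|\Gamma(q)| \, |\Gamma(u)|}$
MMT: $\frac{W_{qc}}{\sum_j W_{qj}} W_{uc}$ ...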
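The cosine co-occurrence similarity defined in this record sums the products of weights over the neighbors shared by the query and the target, then divides by the Euclidean norms of their two rows. A minimal sketch follows, assuming the weight matrix is stored sparsely as a dictionary of rows and that Γ(x) is the set of columns with a stored weight in row x; the storage format, names, and values are illustrative assumptions.

```python
import math


def cos_similarity(weights, q, u):
    """Cosine co-occurrence similarity of target u to query q.

    weights maps each item to a dict {neighbor: weight}; the keys of a row
    play the role of the neighborhood set of that item.
    """
    row_q, row_u = weights[q], weights[u]
    common = row_q.keys() & row_u.keys()
    numerator = sum(row_q[c] * row_u[c] for c in common)
    norm_q = math.sqrt(sum(w * w for w in row_q.values()))
    norm_u = math.sqrt(sum(w * w for w in row_u.values()))
    return numerator / (norm_q * norm_u)


# Hypothetical weights: two items that share one context, "c2".
W = {
    "q": {"c1": 1.0, "c2": 2.0},
    "u": {"c2": 2.0, "c3": 1.0},
}

# Shared term is 2*2 = 4 and each row norm is sqrt(5), so the result is about 0.8.
print(cos_similarity(W, "q", "u"))
```

Restricting the sum to shared neighbors gives the same value as a dense dot product, because every other term has a zero factor.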
Droplet-based microfluidic analysis and screening of single plant cells.

PubMed

Yu, Ziyi; Boehm, Christian R; Hibberd, Julian M; Abell, Chris; Haseloff, Jim; Burgess, Steven J; Reyna-Llorens, Ivan

2018-01-01

Droplet-based microfluidics has been used to facilitate high-throughput analysis of individual prokaryote and mammalian cells. However, there is a scarcity of similar workflows applicable to rapid phenotyping of plant systems where phenotyping analyses typically are time-consuming and low-throughput. We report on-chip encapsulation and analysis of protoplasts isolated from the emergent plant model Marchantia polymorpha at processing rates of >100,000 cells per hour. We use our microfluidic system to quantify the stochastic properties of a heat-inducible promoter across a population of transgenic protoplasts to demonstrate its potential for assessing gene expression activity in response to environmental conditions. We further demonstrate on-chip sorting of droplets containing YFP-expressing protoplasts from wild type cells using dielectrophoresis force. This work opens the door to droplet-based microfluidic analysis of plant cells for applications ranging from high-throughput characterisation of DNA parts to single-cell genomics to selection of rare plant phenotypes.

Physics faculty beliefs and values about the teaching and learning of problem solving. II. Procedures for measurement and analysis

NASA Astrophysics Data System (ADS)

Henderson, Charles; Yerushalmi, Edit; Kuo, Vince H.; Heller, Kenneth; Heller, Patricia

2007-12-01

To identify and describe the basis upon which instructors make curricular and pedagogical decisions, we have developed an artifact-based interview and an analysis technique based on multilayered concept maps. The policy capturing technique used in the interview asks instructors to make judgments about concrete instructional artifacts similar to those they likely encounter in their teaching environment. The analysis procedure alternatively employs both an a priori systems view analysis and an emergent categorization to construct a multilayered concept map, which is a hierarchically arranged set of concept maps where child maps include more details than parent maps. Although our goal was to develop a model of physics faculty beliefs about the teaching and learning of problem solving in the context of an introductory calculus-based physics course, the techniques described here are applicable to a variety of situations in which instructors make decisions that influence teaching and learning.
Artificial neural networks for document analysis and recognition.

PubMed

Marinai, Simone; Gori, Marco; Soda, Giovanni; Society, Computer

2005-01-01

Artificial neural networks have been extensively applied to document analysis and recognition. Most efforts have been devoted to the recognition of isolated handwritten and printed characters, with widely recognized successful results. However, many other document processing tasks, like preprocessing, layout analysis, character segmentation, word recognition, and signature verification, have also been addressed effectively, with very promising results. This paper surveys the most significant problems in the area of offline document image processing where connectionist-based approaches have been applied. Similarities and differences between approaches belonging to different categories are discussed. Particular emphasis is given to the crucial role of prior knowledge in the conception of both appropriate architectures and learning algorithms. Finally, the paper provides a critical analysis of the reviewed approaches and outlines the most promising research directions in the field. In particular, a second generation of connectionist-based models is foreseen, based on appropriate graphical representations of the learning environment.
Quality evaluation of Shenmaidihuang Pills based on the chromatographic fingerprints and simultaneous determination of seven bioactive constituents.

PubMed

Liu, Sifei; Zhang, Guangrui; Qiu, Ying; Wang, Xiaobo; Guo, Lihan; Zhao, Yanxin; Tong, Meng; Wei, Lan; Sun, Lixin

2016-12-01

In this study, we aimed to establish a comprehensive and practical quality evaluation system for Shenmaidihuang pills. A simple and reliable high-performance liquid chromatography coupled with photodiode array detection method was developed both for fingerprint analysis and quantitative determination. In fingerprint analysis, relative retention time and relative peak area were used to identify the common peaks in 18 samples for investigation. Twenty one peaks were selected as the common peaks to evaluate the similarities of 18 Shenmaidihuang pills samples with different manufacture dates. Furthermore, similarity analysis was applied to evaluate the similarity of samples. Hierarchical cluster analysis and principal component analysis were also performed to evaluate the variation of Shenmaidihuang pills. In quantitative analysis, linear regressions, injection precisions, recovery, repeatability and sample stability were all tested and good results were obtained to simultaneously determine the seven identified compounds, namely, 5-hydroxymethylfurfural, morroniside, loganin, paeonol, paeoniflorin, psoralen, isopsoralen in Shenmaidihuang pills. The contents of some analytes in different batches of samples indicated significant difference, especially for 5-hydroxymethylfurfural. So, it was concluded that the chromatographic fingerprint method obtained by high-performance liquid chromatography coupled with photodiode array detection associated with multiple compounds determination is a powerful and meaningful tool to comprehensively conduct the quality control of Shenmaidihuang pills. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Similarity analysis between quantum images

NASA Astrophysics Data System (ADS)

Zhou, Ri-Gui; Liu, XingAo; Zhu, Changming; Wei, Lai; Zhang, Xiafen; Ian, Hou

2018-06-01

Similarity analysis between quantum images is essential in quantum image processing, as it underpins other fields such as quantum image matching and quantum pattern recognition. In this paper, a quantum scheme based on a novel quantum image representation and the quantum amplitude amplification algorithm is proposed. At the end of the paper, three examples and simulation experiments show that the measurement result must be 0 when two images are the same, and that the measurement result has a high probability of being 1 when two images are different.
HYPO: A Precedent-Based Legal Reasoner

DTIC Science & Technology

1987-11-01

identify legal issues in the analysis of law school examination fact patterns involving the contracts law of offer and acceptance. The program...primarily used if-then rules and an ATN to represent its legal knowledge of contract law. It used heuristics for distinguishing "hard" and "easy" legal...similar to Gardner’s. Also, the use of rules in Meldman (to define the elements of a claim) and Gardner (to define ingredients of contract law) are similar
An Energy-Based Similarity Measure for Time Series

NASA Astrophysics Data System (ADS)

Boudraa, Abdel-Ouahab; Cexus, Jean-Christophe; Groussat, Mathieu; Brunagel, Pierre

2007-12-01

A new similarity measure, called SimilB, for time series analysis, based on the cross-Ψ_B-energy operator (2004), is introduced. Ψ_B is a nonlinear measure which quantifies the interaction between two time series. Compared to Euclidean distance (ED) or the Pearson correlation coefficient (CC), SimilB includes the temporal information and relative changes of the time series using the first and second derivatives of the time series. SimilB is well suited for both nonstationary and stationary time series and particularly those presenting discontinuities. Some new properties of Ψ_B are presented. Particularly, we show that Ψ_B as similarity measure is robust to both scale and time shift. SimilB is illustrated with synthetic time series and an artificial dataset and compared to the CC and the ED measures.
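For orientation, the two baseline measures that the abstract compares against, Euclidean distance (ED) and the Pearson correlation coefficient (CC), can be sketched as follows. This is not SimilB itself, whose definition is not given in this record; the function names and the toy series are assumptions made for illustration.

```python
import math


def euclidean_distance(x, y):
    """Point-wise Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson_cc(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy series: y is x scaled by a factor of 2.
x = [0.0, 1.0, 0.0, -1.0, 0.0]
y = [2.0 * v for v in x]
ed_xy = euclidean_distance(x, y)  # grows with the scale difference
cc_xy = pearson_cc(x, y)          # unaffected by positive rescaling
```

Neither baseline uses the ordering of the samples: shuffling both series with the same permutation leaves ED and CC unchanged, which is the temporal information the abstract says SimilB adds through derivatives.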
Similar protein expression profiles of ovarian and endometrial high-grade serous carcinomas.

PubMed

Hiramatsu, Kosuke; Yoshino, Kiyoshi; Serada, Satoshi; Yoshihara, Kosuke; Hori, Yumiko; Fujimoto, Minoru; Matsuzaki, Shinya; Egawa-Takata, Tomomi; Kobayashi, Eiji; Ueda, Yutaka; Morii, Eiichi; Enomoto, Takayuki; Naka, Tetsuji; Kimura, Tadashi

2016-03-01

Ovarian and endometrial high-grade serous carcinomas (HGSCs) have similar clinical and pathological characteristics; however, exhaustive protein expression profiling of these cancers has yet to be reported. We performed protein expression profiling on 14 cases of HGSCs (7 ovarian and 7 endometrial) and 18 endometrioid carcinomas (9 ovarian and 9 endometrial) using iTRAQ-based exhaustive and quantitative protein analysis. We identified 828 tumour-expressed proteins and evaluated the statistical similarity of protein expression profiles between ovarian and endometrial HGSCs using unsupervised hierarchical cluster analysis (P<0.01). Using 45 statistically highly expressed proteins in HGSCs, protein ontology analysis detected two enriched terms and proteins composing each term: IMP2 and MCM2. Immunohistochemical analyses confirmed the higher expression of IMP2 and MCM2 in ovarian and endometrial HGSCs as well as in tubal and peritoneal HGSCs than in endometrioid carcinomas (P<0.01). The knockdown of either IMP2 or MCM2 by siRNA interference significantly decreased the proliferation rate of ovarian HGSC cell line (P<0.01). We demonstrated the statistical similarity of the protein expression profiles of ovarian and endometrial HGSC beyond the organs. We suggest that increased IMP2 and MCM2 expression may underlie some of the rapid HGSC growth observed clinically.
Trends in public perceptions and preferences on energy and environmental policy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Farhar, B.C.

1993-02-01

This report presents selected results from a secondary analysis of public opinion surveys, taken at the national and state/local levels, relevant to energy and environmental policy choices. The data base used in the analysis includes about 2000 items from nearly 600 separate surveys conducted between 1979 and 1992. Answers to word-for-word questions were traced over time, permitting trend analysis. Patterns of response were also identified for findings from similarly worded survey items. The analysis identifies changes in public opinion concerning energy during the past 10 to 15 years.
Determining similarity in histological images using graph-theoretic description and matching methods for content-based image retrieval in medical diagnostics.

PubMed

Sharma, Harshita; Alekseychuk, Alexander; Leskovsky, Peter; Hellwich, Olaf; Anand, R S; Zerbe, Norman; Hufnagl, Peter

2012-10-04

Computer-based analysis of digitalized histological images has been gaining increasing attention, due to their extensive use in research and routine practice. The article aims to contribute towards the description and retrieval of histological images by employing a structural method using graphs. Due to their expressive ability, graphs are considered as a powerful and versatile representation formalism and have obtained a growing consideration especially by the image processing and computer vision community. The article describes a novel method for determining similarity between histological images through graph-theoretic description and matching, for the purpose of content-based retrieval. A higher order (region-based) graph-based representation of breast biopsy images has been attained and a tree-search based inexact graph matching technique has been employed that facilitates the automatic retrieval of images structurally similar to a given image from large databases. The results obtained and evaluation performed demonstrate the effectiveness and superiority of graph-based image retrieval over a common histogram-based technique. The employed graph matching complexity has been reduced compared to the state-of-the-art optimal inexact matching methods by applying a pre-requisite criterion for matching of nodes and a sophisticated design of the estimation function, especially the prognosis function. The proposed method is suitable for the retrieval of similar histological images, as suggested by the experimental and evaluation results obtained in the study. It is intended for the use in Content Based Image Retrieval (CBIR)-requiring applications in the areas of medical diagnostics and research, and can also be generalized for retrieval of different types of complex images. The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1224798882787923.
Determining similarity in histological images using graph-theoretic description and matching methods for content-based image retrieval in medical diagnostics

PubMed Central

2012-01-01

Background Computer-based analysis of digitalized histological images has been gaining increasing attention, due to their extensive use in research and routine practice. The article aims to contribute towards the description and retrieval of histological images by employing a structural method using graphs. Due to their expressive ability, graphs are considered as a powerful and versatile representation formalism and have obtained a growing consideration especially by the image processing and computer vision community. Methods The article describes a novel method for determining similarity between histological images through graph-theoretic description and matching, for the purpose of content-based retrieval. A higher order (region-based) graph-based representation of breast biopsy images has been attained and a tree-search based inexact graph matching technique has been employed that facilitates the automatic retrieval of images structurally similar to a given image from large databases. Results The results obtained and evaluation performed demonstrate the effectiveness and superiority of graph-based image retrieval over a common histogram-based technique. The employed graph matching complexity has been reduced compared to the state-of-the-art optimal inexact matching methods by applying a pre-requisite criterion for matching of nodes and a sophisticated design of the estimation function, especially the prognosis function. Conclusion The proposed method is suitable for the retrieval of similar histological images, as suggested by the experimental and evaluation results obtained in the study. It is intended for the use in Content Based Image Retrieval (CBIR)-requiring applications in the areas of medical diagnostics and research, and can also be generalized for retrieval of different types of complex images. Virtual Slides The virtual slide(s) for this article can be found here: http://www.diagnosticpathology.diagnomx.eu/vs/1224798882787923. PMID:23035717
Improving protein complex classification accuracy using amino acid composition profile.

PubMed

Huang, Chien-Hung; Chou, Szu-Yu; Ng, Ka-Lok

2013-09-01

Protein complex prediction approaches are based on the assumptions that complexes have dense protein-protein interactions and high functional similarity between their subunits. We investigated those assumptions by studying the subunits' interaction topology, sequence similarity and molecular function for human and yeast protein complexes. Inclusion of amino acids' physicochemical properties can provide better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Adopting amino acid composition profile information with the SVM classifier serves as an effective post-processing step for complexes classification. Improvement is based on primary sequence information only, which is easy to obtain. Copyright © 2013 Elsevier Ltd. All rights reserved.
Unraveling systematic inventory of Echinops (Asteraceae) with special reference to nrDNA ITS sequence-based molecular typing of Echinops abuzinadianus.

PubMed

Ali, M A; Al-Hemaid, F M; Lee, J; Hatamleh, A A; Gyulai, G; Rahman, M O

2015-10-02

The present study explored the systematic inventory of Echinops L. (Asteraceae) of Saudi Arabia, with special reference to the molecular typing of Echinops abuzinadianus Chaudhary, an endemic species to Saudi Arabia, based on the internal transcribed spacer (ITS) sequences (ITS1-5.8S-ITS2) of nuclear ribosomal DNA. A sequence similarity search using BLAST and a phylogenetic analysis of the ITS sequence of E. abuzinadianus revealed a high level of sequence similarity with E. glaberrimus DC. (section Ritropsis). The novel primary sequence and the secondary structure of ITS2 of E. abuzinadianus could potentially be used for molecular genotyping.
Design Oriented Structural Modeling for Airplane Conceptual Design Optimization

NASA Technical Reports Server (NTRS)

Livne, Eli

1999-01-01

The main goal for research conducted with the support of this grant was to develop design oriented structural optimization methods for the conceptual design of airplanes. Traditionally in conceptual design airframe weight is estimated based on statistical equations developed over years of fitting airplane weight data in data bases of similar existing airplanes. Utilization of such regression equations for the design of new airplanes can be justified only if the new airplanes use structural technology similar to the technology on the airplanes in those weight data bases. If any new structural technology is to be pursued or any new unconventional configurations designed the statistical weight equations cannot be used. In such cases any structural weight estimation must be based on rigorous "physics based" structural analysis and optimization of the airframes under consideration. Work under this grant progressed to explore airframe design-oriented structural optimization techniques along two lines of research: methods based on "fast" design oriented finite element technology and methods based on equivalent plate / equivalent shell models of airframes, in which the vehicle is modelled as an assembly of plate and shell components, each simulating a lifting surface or nacelle / fuselage pieces. Since response to changes in geometry is essential in conceptual design of airplanes, as is the capability to optimize the shape itself, research supported by this grant sought to develop efficient techniques for parametrization of airplane shape and sensitivity analysis with respect to shape design variables. Towards the end of the grant period a prototype automated structural analysis code designed to work with the NASA Aircraft Synthesis conceptual design code ACS= was delivered to NASA Ames.
Fluorescence-based methods for detecting caries lesions: systematic review, meta-analysis and sources of heterogeneity.

PubMed

Gimenez, Thais; Braga, Mariana Minatel; Raggio, Daniela Procida; Deery, Chris; Ricketts, David N; Mendes, Fausto Medeiros

2013-01-01

Fluorescence-based methods have been proposed to aid caries lesion detection. Summarizing and analysing findings of studies about fluorescence-based methods could clarify their real benefits. We aimed to perform a comprehensive systematic review and meta-analysis to evaluate the accuracy of fluorescence-based methods in detecting caries lesions. Two independent reviewers searched PubMed, Embase and Scopus through June 2012 to identify papers/articles published. Other sources were checked to identify non-published literature. STUDY ELIGIBILITY CRITERIA, PARTICIPANTS AND DIAGNOSTIC METHODS: The eligibility criteria were studies that: (1) have assessed the accuracy of fluorescence-based methods of detecting caries lesions on occlusal, approximal or smooth surfaces, in both primary or permanent human teeth, in the laboratory or clinical setting; (2) have used a reference standard; and (3) have reported sufficient data relating to the sample size and the accuracy of methods. A diagnostic 2×2 table was extracted from included studies to calculate the pooled sensitivity, specificity and overall accuracy parameters (Diagnostic Odds Ratio and Summary Receiver-Operating curve). The analyses were performed separately for each method and different characteristics of the studies. The quality of the studies and heterogeneity were also evaluated. Seventy five studies met the inclusion criteria from the 434 articles initially identified. The search of the grey or non-published literature did not identify any further studies. In general, the analysis demonstrated that the fluorescence-based methods tend to have similar accuracy for all types of teeth, dental surfaces or settings. There was a trend of better performance of fluorescence methods in detecting more advanced caries lesions. We also observed moderate to high heterogeneity and evidence of publication bias. Fluorescence-based devices have similar overall performance; however, better accuracy in detecting more advanced caries lesions has been observed.
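The per-study quantities derived from a diagnostic 2×2 table can be sketched as below. This is a minimal illustration, assuming the usual cell layout (true positives, false positives, false negatives, true negatives); the function name and the counts are hypothetical, and the pooling across studies that the review performs requires a meta-analytic model that is not shown here.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures for a single 2x2 diagnostic table.

    tp/fn: diseased surfaces the method did / did not flag;
    fp/tn: sound surfaces the method did / did not flag.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # The diagnostic odds ratio is undefined when fp or fn is zero;
    # no continuity correction is applied in this sketch.
    dor = float("inf") if fp * fn == 0 else (tp * tn) / (fp * fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "dor": dor,
    }


# Hypothetical counts for one study, not taken from the review.
result = diagnostic_accuracy(tp=80, fp=10, fn=20, tn=90)
```

With these made-up counts the method detects 80 of 100 lesions and clears 90 of 100 sound surfaces, so the odds of a positive reading are 36 times higher for diseased than for sound surfaces.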
Molecular characterization and phylogenetic relationships among microsporidian isolates infecting silkworm, Bombyx mori using small subunit rRNA (SSU-rRNA) gene sequence analysis.

PubMed

Nath, B Surendra; Gupta, S K; Bajpai, A K

2012-12-01

The life cycle, spore morphology, pathogenicity, tissue specificity, mode of transmission and small subunit rRNA (SSU-rRNA) gene sequence analysis of the five new microsporidian isolates viz., NIWB-11bp, NIWB-12n, NIWB-13md, NIWB-14b and NIWB-15mb identified from the silkworm, Bombyx mori have been studied along with type species, NIK-1s_mys. The life cycle of the microsporidians identified exhibited sequential developmental cycles that are similar to the general developmental cycle of the genus Nosema. The spores showed considerable variations in their shape, length and width. The pathogenicity observed was dose-dependent and differed among the microsporidian isolates; NIWB-15mb was found to be more virulent than the other isolates. All of the microsporidians were found to infect most of the tissues examined and showed gonadal infection and transovarial transmission in the infected silkworms. The SSU-rRNA sequence based phylogenetic tree placed NIWB-14b, NIWB-12n and NIWB-11bp in a separate branch along with other Nosema species and Nosema bombycis, while NIWB-15mb and NIWB-13md together formed another cluster along with other Nosema species. NIK-1s_mys revealed a signature sequence similar to the standard type species, N. bombycis, indicating that NIK-1s_mys is similar to N. bombycis. Based on phylogenetic relationships, branch length information based on genetic distance and nucleotide differences, we conclude that the microsporidian isolates identified are distinctly different from the other known species and belong to the genus Nosema. This SSU-rRNA gene sequence analysis method is found to be a more useful approach in detecting different and closely related microsporidians of this economically important domestic insect.
Analysis of the human diseasome using phenotype similarity between common, genetic, and infectious diseases

NASA Astrophysics Data System (ADS)

Hoehndorf, Robert; Schofield, Paul N.; Gkoutos, Georgios V.

2015-06-01

Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 6,000 diseases. We evaluate our text-mined phenotypes by demonstrating that they can correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that have similar signs and symptoms cluster together, and we use this network to identify closely related diseases based on common etiological, anatomical as well as physiological underpinnings.
Hyperspectral Image Denoising Using a Nonlocal Spectral Spatial Principal Component Analysis

NASA Astrophysics Data System (ADS)

Li, D.; Xu, L.; Peng, J.; Ma, J.

2018-04-01

Hyperspectral image (HSI) denoising is a critical research area in image processing due to its importance in improving the quality of HSIs, since noise has a negative impact on object detection, classification, and other downstream tasks. In this paper, we develop a noise reduction method based on principal component analysis (PCA) for hyperspectral imagery, which depends on the assumption that the noise can be removed by selecting the leading principal components. The main contribution of the paper is to introduce the spectral spatial structure and nonlocal similarity of the HSIs into the PCA denoising model. PCA with spectral spatial structure can exploit the spectral correlation and spatial correlation of an HSI by using 3D blocks instead of 2D patches. Nonlocal similarity means the similarity between the reference pixel and other pixels in a nonlocal area, where the Mahalanobis distance is used to estimate the spatial spectral similarity by calculating the distance between 3D blocks. The proposed method is tested on both simulated and real hyperspectral images; the results demonstrate that the proposed method is superior to several other popular methods in HSI denoising.
Early adolescent friendships and academic adjustment: examining selection and influence processes with longitudinal social network analysis.

PubMed

Shin, Huiyoung; Ryan, Allison M

2014-11-01

This study investigated early adolescent friendship selection and social influence with regard to academic motivation (self-efficacy and intrinsic value), engagement (effortful and disruptive behavior), and achievement (GPA calculated from report card grades) among 6th graders (N = 587, 50% girls at Wave 1; N = 576, 52% girls at Wave 2) followed from fall to spring within 1 academic year. A stochastic actor-based model of social network analysis was used to overcome methodological limitations of prior research on friends, peer groups, and academic adjustment. Evidence that early adolescents sought out friends who were similar to themselves (selection) was found in regard to academic self-efficacy, and a similar trend was found for achievement. Evidence that friends became more similar to their friends over time (influence) was found for all aspects of academic adjustment except academic self-efficacy. Collectively, results indicate that selection effects were not as pervasive as influence effects in explaining similarity among friends in academic adjustment. (PsycINFO Database Record (c) 2014 APA, all rights reserved).
Verification of Sulfate Attack Penetration Rates for Saltstone Disposal Unit Modeling

DOE Office of Scientific and Technical Information (OSTI.GOV)

Flach, G. P.

Recent Special Analysis modeling of Saltstone Disposal Units considers sulfate attack on concrete and utilizes degradation rates estimated from Cementitious Barriers Partnership software simulations. This study provides an independent verification of those simulation results using an alternative analysis method and an independent characterization data source. The sulfate penetration depths estimated herein are similar to the best-estimate values in SRNL-STI-2013-00118 Rev. 2 and well below the nominal values subsequently used to define Saltstone Special Analysis base cases.
Strong ion and weak acid analysis in severe preeclampsia: potential clinical significance.

PubMed

Ortner, C M; Combrinck, B; Allie, S; Story, D; Landau, R; Cain, K; Dyer, R A

2015-08-01

The influence of common disturbances seen in preeclampsia, such as changes in strong ions and weak acids (particularly albumin) on acid-base status, has not been fully elucidated. The aims of this study were to provide a comprehensive acid-base analysis in severe preeclampsia and to identify potential new biological predictors of disease severity. Fifty women with severe preeclampsia, 25 healthy non-pregnant- and 46 healthy pregnant controls (26-40 weeks' gestation), were enrolled in this prospective case-control study. Acid-base analysis was performed by applying the physicochemical approach of Stewart and Gilfix. Mean [sd] base excess was similar in preeclamptic- and healthy pregnant women (-3.3 [2.3], and -2.8 [1.5] mEq/L respectively). In preeclampsia, there were greater offsetting contributions to the base excess, in the form of hyperchloraemia (BE(Cl) -2 [2.3] vs -0.4 [2.3] mEq/L, P<0.001) and hypoalbuminaemia (BE(Alb) 3.6 [1] vs 2.1 [0.8] mEq/L, P<0.001). In preeclampsia, hypoalbuminaemic metabolic alkalosis was associated with a non-reassuring/abnormal fetal heart tracing (P<0.001). Quantitative analysis in healthy pregnancy revealed respiratory and hypoalbuminaemic alkalosis that was metabolically offset by acidosis, secondary to unmeasured anions and dilution. While the overall base excess in severe preeclampsia is similar to that in healthy pregnancy, preeclampsia is associated with a greater imbalance offsetting hypoalbuminaemic alkalosis and hyperchloraemic acidosis. Rather than the absolute value of base excess, the magnitude of these opposing contributors may be a better indicator of the severity of this disease. Hypoalbuminaemic alkalosis may also be a predictor of fetal compromise. clinicaltrials.gov: NCT 02164370. © The Author 2015. Published by Oxford University Press on behalf of the British Journal of Anaesthesia. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

SHOP: scaffold HOPping by GRID-based similarity searches.

PubMed

Bergmann, Rikke; Linusson, Anna; Zamora, Ismael

2007-05-31

A new GRID-based method for scaffold hopping (SHOP) is presented. In a fully automatic manner, scaffolds were identified in a database based on three types of 3D-descriptors. SHOP's ability to recover scaffolds was assessed and validated by searching a database spiked with fragments of known ligands of three different protein targets relevant for drug discovery using a rational approach based on statistical experimental design. Five out of eight and seven out of eight thrombin scaffolds and all seven HIV protease scaffolds were recovered within the top 10 and 31 out of 31 neuraminidase scaffolds were in the 31 top-ranked scaffolds. SHOP also identified new scaffolds with substantially different chemotypes from the queries. Docking analysis indicated that the new scaffolds would have similar binding modes to those of the respective query scaffolds observed in X-ray structures. The databases contained scaffolds from published combinatorial libraries to ensure that identified scaffolds could be feasibly synthesized.
Non-Intrusive Gaze Tracking Using Artificial Neural Networks

DTIC Science & Technology

1994-01-05

We have developed an artificial neural network based gaze tracking system which can be customized to individual users. A three layer feed forward...empirical analysis of the performance of a large number of artificial neural network architectures for this task. Suggestions for further explorations...for neurally based gaze trackers are presented, and are related to other similar artificial neural network applications such as autonomous road following.
Turbulent Flow Over Large Roughness Elements: Effect of Frontal and Plan Solidity on Turbulence Statistics and Structure

NASA Astrophysics Data System (ADS)

Placidi, M.; Ganapathisubramani, B.

2018-04-01

Wind-tunnel experiments were carried out on fully-rough boundary layers with large roughness (δ/h ≈ 10, where h is the height of the roughness elements and δ is the boundary-layer thickness). Twelve different surface conditions were created by using LEGO™ bricks of uniform height. Six cases are tested for a fixed plan solidity (λ_P) with variations in frontal density (λ_F), while the other six cases have varying λ_P for fixed λ_F. Particle image velocimetry and floating-element drag-balance measurements were performed. The current results complement those contained in Placidi and Ganapathisubramani (J Fluid Mech 782:541-566, 2015), extending the previous analysis to the turbulence statistics and spatial structure. Results indicate that mean velocity profiles in defect form agree with Townsend's similarity hypothesis with varying λ_F; however, the agreement is worse for cases with varying λ_P. The streamwise and wall-normal turbulent stresses, as well as the Reynolds shear stresses, show a lack of similarity across most examined cases. This suggests that the critical height of the roughness for which outer-layer similarity holds depends not only on the height of the roughness, but also on the local wall morphology. A new criterion based on shelter solidity, defined as the sheltered plan area per unit wall-parallel area, which is similar to the 'effective shelter area' in Raupach and Shaw (Boundary-Layer Meteorol 22:79-90, 1982), is found to capture the departure of the turbulence statistics from outer-layer similarity. Despite this lack of similarity reported in the turbulence statistics, proper orthogonal decomposition analysis, as well as two-point spatial correlations, show that some form of universal flow structure is present, as all cases exhibit virtually identical proper orthogonal decomposition mode shapes and correlation fields. Finally, reduced models based on proper orthogonal decomposition reveal that the small scales of the turbulence play a significant role in assessing outer-layer similarity.
Analysis of 3D face forms for proper sizing and CAD of spectacle frames.

PubMed

Kouchi, Makiko; Mochimaru, Masaaki

2004-11-01

Three-dimensional morphological variations in the human face were analysed using digital models of the human face, and the usefulness of such analysis in designing industrial products was demonstrated by validating spectacle frame designs based on an original sizing system developed based on the analysis. A normalized model of the three-dimensional face form was made for each of 56 young adult Japanese males. The morphological distances between subjects were defined, and subjects were divided into four groups based on analysis of the distance matrix. A prototype spectacle frame was designed for the average form of each of the four groups. Tightening force of the prototype frames was adjusted using the materialized average forms with soft material placed at the nasal bridge and side of the head. Four prototype frames as well as a conventional frame were evaluated using sensory evaluation and physical measurement of the pressure and slip in 38 young adult male subjects. For each of the 38 subjects, prototype frames were ranked according to the morphological similarity of the subjects and the average form of the four groups: the frame designed for the average form of the group most similar to the subject was #1, the frame designed for the average form of the next most similar group was #2, and so on. For the groups with smaller or narrower faces, new frame #1 was most preferred and had the best overall fit, smallest slip sensation and largest pressure sensation. The groups with larger or wider faces preferred tighter frames than new frame #1, because they were concerned that the frames might slip, although the frames did not. Most of the subjects habitually wore spectacles, and the reason that groups with larger or wider faces preferred tighter frames was thought to be that they were accustomed to tighter fitting frames.
Understanding human activity patterns based on space-time-semantics

NASA Astrophysics Data System (ADS)

Huang, Wei; Li, Songnian

2016-11-01

Understanding human activity patterns plays a key role in various applications in an urban environment, such as transportation planning and traffic forecasting, urban planning, public health and safety, and emergency response. Most existing studies that model human activity patterns focus mainly on spatiotemporal dimensions and lack consideration of the underlying semantic context. In fact, what people do and discuss at a place, from which one can infer what is happening there, cannot simply be neglected because it is the root of human mobility patterns. We believe that the geo-tagged semantic context, representing what individuals do and discuss at a place and a specific time, drives the formation of specific human activity patterns. In this paper, we aim to model human activity patterns not only based on space and time but also with consideration of associated semantics, and attempt to prove the hypothesis that similar mobility patterns may have different motivations. We develop a spatiotemporal-semantic model to quantitatively express human activity patterns based on topic models, leading to an analysis of space, time and semantics. A case study is conducted using Twitter data in Toronto based on our model. By computing the similarities between users in terms of spatiotemporal pattern, semantic pattern and spatiotemporal-semantic pattern, we find that only a small number of users (2.72%) have very similar activity patterns, while the majority (87.14%) show different activity patterns (i.e., similar spatiotemporal patterns and different semantic patterns, similar semantic patterns and different spatiotemporal patterns, or different in both). The population of users that has very similar activity patterns decreases by 56.41% after incorporating semantic information in the corresponding spatiotemporal patterns, which can quantitatively prove the hypothesis.
Cluster analysis and prediction of treatment outcomes for chronic rhinosinusitis.

PubMed

Soler, Zachary M; Hyer, J Madison; Rudmik, Luke; Ramakrishnan, Viswanathan; Smith, Timothy L; Schlosser, Rodney J

2016-04-01

Current clinical classifications of chronic rhinosinusitis (CRS) have weak prognostic utility regarding treatment outcomes. Simplified discriminant analysis based on unsupervised clustering has identified novel phenotypic subgroups of CRS, but prognostic utility is unknown. We sought to determine whether discriminant analysis allows prognostication in patients choosing surgery versus continued medical management. A multi-institutional prospective study of patients with CRS in whom initial medical therapy failed who then self-selected continued medical management or surgical treatment was used to separate patients into 5 clusters based on a previously described discriminant analysis using total Sino-Nasal Outcome Test-22 (SNOT-22) score, age, and missed productivity. Patients completed the SNOT-22 at baseline and for 18 months of follow-up. Baseline demographic and objective measures included olfactory testing, computed tomography, and endoscopy scoring. SNOT-22 outcomes for surgical versus continued medical treatment were compared across clusters. Data were available on 690 patients. Baseline differences in demographics, comorbidities, objective disease measures, and patient-reported outcomes were similar to previous clustering reports. Three of 5 clusters identified by means of discriminant analysis had improved SNOT-22 outcomes with surgical intervention when compared with continued medical management (surgery was a mean of 21.2 points better across these 3 clusters at 6 months, P < .05). These differences were sustained at 18 months of follow-up. Two of 5 clusters had similar outcomes when comparing surgery with continued medical management. A simplified discriminant analysis based on 3 common clinical variables is able to cluster patients and provide prognostic information regarding surgical treatment versus continued medical management in patients with CRS. Copyright © 2015 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. 
A Comparison of Functional Models for Use in the Function-Failure Design Method

NASA Technical Reports Server (NTRS)

Stock, Michael E.; Stone, Robert B.; Tumer, Irem Y.

2006-01-01

When failure analysis and prevention, guided by historical design knowledge, are coupled with product design at its conception, shorter design cycles are possible. By decreasing the design time of a product in this manner, design costs are reduced and the product will better suit the customer's needs. Prior work indicates that similar failure modes occur with products (or components) with similar functionality. To capitalize on this finding, a knowledge base of historical failure information linked to functionality is assembled for use by designers. One possible use for this knowledge base is within the Elemental Function-Failure Design Method (EFDM). This design methodology and failure analysis tool begins at conceptual design and keeps the designer cognizant of failures that are likely to occur based on the product's functionality. The EFDM offers potential improvement over current failure analysis methods, such as FMEA, FMECA, and Fault Tree Analysis, because it can be implemented hand in hand with other conceptual design steps and carried throughout a product's design cycle. These other failure analysis methods can only truly be effective after a physical design has been completed. The EFDM, however, is only as good as the knowledge base that it draws from, and therefore it is of utmost importance to develop a knowledge base that will be suitable for use across a wide spectrum of products. One fundamental question that arises in using the EFDM is: At what level of detail should functional descriptions of components be encoded? This paper explores two approaches to populating a knowledge base with actual failure occurrence information from Bell 206 helicopters. Functional models expressed at various levels of detail are investigated to determine the necessary detail for an applicable knowledge base that can be used by designers in both new designs and redesigns. High level and more detailed functional descriptions are derived for each failed component based on NTSB accident reports. To best record this data, standardized functional and failure mode vocabularies are used. Two separate function-failure knowledge bases are then created and compared. Results indicate that encoding failure data using more detailed functional models allows for a more robust knowledge base. Interestingly, however, when applying the EFDM, high level descriptions continue to produce useful results when using the knowledge base generated from the detailed functional models.
A simplified and efficient method for the analysis of fatty acid methyl esters suitable for large clinical studies.

PubMed

Masood, Athar; Stark, Ken D; Salem, Norman

2005-10-01

Conventional sample preparation for fatty acid analysis is a complicated, multiple-step process, and gas chromatography (GC) analysis alone can require >1 h per sample to resolve fatty acid methyl esters (FAMEs). Fast GC analysis was adapted to human plasma FAME analysis using a modified polyethylene glycol column with smaller internal diameters, thinner stationary phase films, increased carrier gas linear velocity, and faster temperature ramping. Our results indicated that fast GC analyses were comparable to conventional GC in peak resolution. A conventional transesterification method based on Lepage and Roy was simplified to a one-step method with the elimination of the neutralization and centrifugation steps. A robotics-amenable method was also developed, with lower methylation temperatures and in an open-tube format using multiple reagent additions. The simplified methods produced results that were quantitatively similar and with similar coefficients of variation as compared with the original Lepage and Roy method. The present streamlined methodology is suitable for the direct fatty acid analysis of human plasma, is appropriate for research studies, and will facilitate large clinical trials and make possible population studies.
Assessment of repeatability of composition of perfumed waters by high-performance liquid chromatography combined with numerical data analysis based on cluster analysis (HPLC UV/VIS - CA).

PubMed

Ruzik, L; Obarski, N; Papierz, A; Mojski, M

2015-06-01

High-performance liquid chromatography (HPLC) with UV/VIS spectrophotometric detection combined with the chemometric method of cluster analysis (CA) was used for the assessment of repeatability of composition of nine types of perfumed waters. In addition, the chromatographic method of separating components of the perfume waters under analysis was subjected to an optimization procedure. The chromatograms thus obtained were used as sources of data for the chemometric method of cluster analysis (CA). The result was a classification of a set comprising 39 perfumed water samples with a similar composition at a specified level of probability (level of agglomeration). A comparison of the classification with the manufacturer's declarations reveals a good degree of consistency and demonstrates similarity between samples in different classes. A combination of the chromatographic method with cluster analysis (HPLC UV/VIS - CA) makes it possible to quickly assess the repeatability of composition of perfumed waters at selected levels of probability. © 2014 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
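As a rough illustration of the cluster-analysis step described in this abstract, the sketch below groups samples by the distance between their chromatographic peak-area vectors and stops merging at a chosen agglomeration level. The abstract does not state the linkage rule, distance metric, or data layout, so single linkage, Euclidean distance, the function name `single_linkage_clusters`, and the sample values are all assumptions made for illustration.

```python
from math import dist


def single_linkage_clusters(profiles, threshold):
    """Agglomerative single-linkage clustering, stopped at a distance threshold.

    profiles: dict mapping sample name -> list of peak areas (hypothetical data).
    threshold: merging stops once the closest pair of clusters is farther apart
               than this value (the assumed "level of agglomeration").
    """
    clusters = [{name} for name in profiles]
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest to each other.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


# Hypothetical peak-area vectors for four perfumed-water samples.
samples = {
    "A1": [1.0, 0.2, 0.0],
    "A2": [1.1, 0.2, 0.0],
    "B1": [0.0, 0.9, 0.5],
    "B2": [0.1, 1.0, 0.5],
}
groups = single_linkage_clusters(samples, threshold=0.3)
```

Raising the threshold merges more samples into the same class, which is one way to read the abstract's "selected levels of probability"; the mapping between a distance threshold and a probability level is not given in the source.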
Anti-inflammatory drugs and prediction of new structures by comparative analysis.

PubMed

Bartzatt, Ronald

2012-01-01

Nonsteroidal anti-inflammatory drugs (NSAIDs) are a group of agents important for their analgesic, anti-inflammatory, and antipyretic properties. This study presents several approaches to predict and elucidate new molecular structures of NSAIDs based on 36 known and proven anti-inflammatory compounds. Based on 36 known NSAIDs, the mean value of Log P is found to be 3.338 (standard deviation = 1.237), the mean value of polar surface area is 63.176 Å² (standard deviation = 20.951 Å²), and the mean value of molecular weight is 292.665 (standard deviation = 55.627). Nine molecular properties are determined for these 36 NSAID agents, including Log P, number of -OH and -NHn, violations of Rule of 5, number of rotatable bonds, and number of oxygens and nitrogens. Statistical analysis of these nine molecular properties provides numerical parameters to conform to in the design of novel NSAID drug candidates. Multiple regression analysis is accomplished using these properties of 36 agents followed with examples of predicted molecular weight based on minimum and maximum property values. Hierarchical cluster analysis indicated that licofelone, tolfenamic acid, meclofenamic acid, droxicam, and aspirin are substantially distinct from all remaining NSAIDs. Analysis of similarity (ANOSIM) produced R = 0.4947, which indicates low to moderate level of dissimilarity between these 36 NSAIDs. Non-hierarchical K-means cluster analysis separated the 36 NSAIDs into four groups having members of greatest similarity. Likewise, discriminant analysis divided the 36 agents into two groups indicating the greatest level of distinction (discrimination) based on nine properties. These two multivariate methods together provide investigators a means to compare novel drug designs with 36 proven compounds and ascertain which of those are most analogous in pharmacodynamics. In addition, artificial neural network modeling is demonstrated as an approach to predict numerous molecular properties of new drug designs, based on neural training from 36 proven NSAIDs. Comprehensive and effective approaches are presented in this study for the design of new NSAID-type agents, which are important for inhibition of COX-2 and COX-1 isoenzymes.
Meta-Storms: efficient search for similar microbial communities based on a novel indexing scheme and similarity score for metagenomic data.

PubMed

Su, Xiaoquan; Xu, Jian; Ning, Kang

2012-10-01

It has long been intriguing scientists to effectively compare different microbial communities (also referred as 'metagenomic samples' here) in a large scale: given a set of unknown samples, find similar metagenomic samples from a large repository and examine how similar these samples are. With the current metagenomic samples accumulated, it is possible to build a database of metagenomic samples of interests. Any metagenomic samples could then be searched against this database to find the most similar metagenomic sample(s). However, on one hand, current databases with a large number of metagenomic samples mostly serve as data repositories that offer few functionalities for analysis; and on the other hand, methods to measure the similarity of metagenomic data work well only for small set of samples by pairwise comparison. It is not yet clear, how to efficiently search for metagenomic samples against a large metagenomic database. In this study, we have proposed a novel method, Meta-Storms, that could systematically and efficiently organize and search metagenomic data. It includes the following components: (i) creating a database of metagenomic samples based on their taxonomical annotations, (ii) efficient indexing of samples in the database based on a hierarchical taxonomy indexing strategy, (iii) searching for a metagenomic sample against the database by a fast scoring function based on quantitative phylogeny and (iv) managing database by index export, index import, data insertion, data deletion and database merging. We have collected more than 1300 metagenomic data from the public domain and in-house facilities, and tested the Meta-Storms method on these datasets. Our experimental results show that Meta-Storms is capable of database creation and effective searching for a large number of metagenomic samples, and it could achieve similar accuracies compared with the current popular significance testing-based methods. 
The Meta-Storms method would serve as a suitable database management and search system to quickly identify similar metagenomic samples from a large pool of samples. Contact: ningkang@qibebt.ac.cn. Supplementary data are available at Bioinformatics online.
Cryptanalysis of Chatterjee-Sarkar Hierarchical Identity-Based Encryption Scheme at PKC 06

NASA Astrophysics Data System (ADS)

Park, Jong Hwan; Lee, Dong Hoon

In 2006, Chatterjee and Sarkar proposed a hierarchical identity-based encryption (HIBE) scheme which can support an unbounded number of identity levels. This property is particularly useful in providing forward secrecy by embedding time components within hierarchical identities. In this paper we show that their scheme does not provide the claimed property. Our analysis shows that if the number of identity levels becomes larger than the value of a fixed public parameter, an unintended receiver can reconstruct a new valid ciphertext and decrypt the ciphertext using his or her own private key. The analysis is similarly applied to a multi-receiver identity-based encryption scheme presented as an application of Chatterjee and Sarkar's HIBE scheme.
An Analysis of Defense Information and Information Technology Articles: A Sixteen-Year Perspective

DTIC Science & Technology

2009-03-01

exploratory,” or “subjective” (Denzin & Lincoln, 2000). Existing Research This research is based on content analysis methodologies utilized by Carter...same codes (Denzin & Lincoln, 2000). Different analysts should code the same text in a similar manner (Weber, 1990). Typically, researchers compute...chosen. Krippendorff recommends an agreement level of at least .70 (Krippendorff, 2004). Some scholars use a cut-off rate of .80 (Denzin & Lincoln
47 CFR 36.161 - Tangible assets-Account 2680.

Code of Federal Regulations, 2010 CFR

2010-10-01

... 47 Telecommunication 2 2010-10-01 2010-10-01 false Tangible assets-Account 2680. 36.161 Section 36.161 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON CARRIER SERVICES... costs of capital leases are apportioned among the operations based on similar plant owned or by analysis...
Validating a Geographical Image Retrieval System.

ERIC Educational Resources Information Center

Zhu, Bin; Chen, Hsinchun

2000-01-01

Summarizes a prototype geographical image retrieval system that demonstrates how to integrate image processing and information analysis techniques to support large-scale content-based image retrieval. Describes an experiment to validate the performance of this image retrieval system against that of human subjects by examining similarity analysis…
47 CFR 36.161 - Tangible assets-Account 2680.

Code of Federal Regulations, 2011 CFR

2011-10-01

... 47 Telecommunication 2 2011-10-01 2011-10-01 false Tangible assets-Account 2680. 36.161 Section 36.161 Telecommunication FEDERAL COMMUNICATIONS COMMISSION (CONTINUED) COMMON CARRIER SERVICES... costs of capital leases are apportioned among the operations based on similar plant owned or by analysis...
Fatality Reduction by Air Bags: Analyses of Accident Data through Early 1996

DOT National Transportation Integrated Search

1996-08-01

The fatality risk of front-seat occupants of passenger cars and light trucks equipped with air bags is compared to the corresponding risk in similar vehicles without air bags, based on statistical analysis of Fatal Accident Reporting System (FARS)dat...
Taxonomic evaluation of Streptomyces hirsutus and related species using multi-locus sequence analysis

USDA-ARS's Scientific Manuscript database

Phylogenetic analyses of species of Streptomyces based on 16S rRNA gene sequences resulted in a statistically well-supported clade (100% bootstrap value) containing 8 species having very similar gross morphology. These species, including Streptomyces bambergiensis, Streptomyces chlorus, Streptomyces...
Analysis of Arterial Mechanics During Head-down Tilt Bed Rest

NASA Technical Reports Server (NTRS)

Elliot, Morgan; Martin, David S.; Westby, Christian M.; Stenger, Michael B.; Platts, Steve

2014-01-01

Arterial health may be affected by microgravity or ground-based analogs of spaceflight, as shown by an increase in thoracic aorta stiffness [1]. Head-down tilt bed rest (HDTBR) is often used as a ground-based simulation of spaceflight because it induces physiological changes similar to those that occur in space [2, 3]. This abstract details an analysis of arterial stiffness (a subclinical measure of atherosclerosis), the distensibility coefficient (DC), and the pressure-strain elastic modulus (PSE) of the arterial walls during HDTBR. This project may help determine how spaceflight differentially affects arterial function in the upper vs. lower body.
Information categorization approach to literary authorship disputes

NASA Astrophysics Data System (ADS)

Yang, Albert C.-C.; Peng, C.-K.; Yien, H.-W.; Goldberger, Ary L.

2003-11-01

Scientific analysis of the linguistic styles of different authors has generated considerable interest. We present a generic approach to measuring the similarity of two symbolic sequences that requires minimal background knowledge about a given human language. Our analysis is based on word rank order-frequency statistics and phylogenetic tree construction. We demonstrate the applicability of this method to historic authorship questions related to the classic Chinese novel “The Dream of the Red Chamber,” to the plays of William Shakespeare, and to the Federalist papers. This method may also provide a simple approach to other large databases based on their information content.
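The idea of comparing two texts through word rank order-frequency statistics can be sketched as follows. This is a simplified stand-in rather than the authors' measure: the abstract does not give the exact formula or weighting, so the mean absolute rank difference over shared words, the whitespace tokenization, and the names `rank_table` and `rank_distance` are assumptions.

```python
from collections import Counter


def rank_table(text):
    """Map each word to its frequency rank (1 = most frequent)."""
    counts = Counter(text.lower().split())
    # Ties in frequency are broken alphabetically so the ranking is deterministic.
    ordered = sorted(counts, key=lambda w: (-counts[w], w))
    return {word: rank for rank, word in enumerate(ordered, start=1)}


def rank_distance(text_a, text_b):
    """Mean absolute rank difference over shared words, scaled to lie between 0 and 1.

    Returns 1.0 when the two texts share no words at all.
    """
    ranks_a, ranks_b = rank_table(text_a), rank_table(text_b)
    shared = ranks_a.keys() & ranks_b.keys()
    if not shared:
        return 1.0
    longest = max(len(ranks_a), len(ranks_b))
    total = sum(abs(ranks_a[w] - ranks_b[w]) for w in shared)
    return total / (len(shared) * longest)
```

A matrix of such pairwise distances between texts is the kind of input a tree-construction method would take, which is how the abstract's phylogenetic step could follow from it.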

EUGENE'HOM: A generic similarity-based gene finder using multiple homologous sequences.

PubMed

Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas

2003-07-01

EUGENE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGENE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGENE'HOM to handle sequences from a variety of organisms. The current target of EUGENE'HOM is plant sequences. The EUGENE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl.
Fast gene ontology based clustering for microarray experiments.

PubMed

Ovaska, Kristian; Laakso, Marko; Hautaniemi, Sampsa

2008-11-21

Analysis of a microarray experiment often results in a list of hundreds of disease-associated genes. In order to suggest common biological processes and functions for these genes, Gene Ontology annotations with statistical testing are widely used. However, these analyses can produce a very large number of significantly altered biological processes. Thus, it is often challenging to interpret GO results and identify novel testable biological hypotheses. We present fast software for advanced gene annotation using semantic similarity for Gene Ontology terms combined with clustering and heat map visualisation. The methodology allows rapid identification of genes sharing the same Gene Ontology cluster. Our R based semantic similarity open-source package has a speed advantage of over 2000-fold compared to existing implementations. From the resulting hierarchical clustering dendrogram genes sharing a GO term can be identified, and their differences in the gene expression patterns can be seen from the heat map. These methods facilitate advanced annotation of genes resulting from data analysis.
Evaluation of clinical outcomes among nonvalvular atrial fibrillation patients treated with rivaroxaban or warfarin, stratified by renal function.

PubMed

Weir, Matthew R; Haskell, Lloyd; Berger, Jeffrey S; Ashton, Veronica; Laliberté, François; Crivera, Concetta; Brown, Kip; Lefebvre, Patrick; Schein, Jeffrey

2018-05-01

Renal dysfunction increases the risk of thromboembolic and bleeding events in patients with nonvalvular atrial fibrillation (NVAF). Adult NVAF patients with ≥ 6 months prior to first warfarin or rivaroxaban dispensing were selected from the IMS Health Real-World Data Adjudicated Claims database (05/2011 - 06/2015) with electronic medical records. Ischemic stroke events, thromboembolic events (venous thromboembolism, myocardial infarction, or ischemic stroke), and major bleeding events were compared between patients by renal function identified by 1) relevant ICD-9-CM diagnosis codes and 2) estimated creatinine clearance (eCrCl). Baseline confounders were adjusted using inverse probability of treatment weights. The diagnosis-based analysis included 39,872 rivaroxaban and 48,637 warfarin users (3,572 and 8,230 with renal dysfunction, respectively). The eCrCl-based analysis included 874 rivaroxaban and 1,069 warfarin users (66 and 208 with eCrCl < 60 mL/min, respectively). In the diagnosis-based analysis, rivaroxaban users with renal dysfunction had a significantly lower stroke rate (HR = 0.55, p = 0.0004) compared to warfarin users; rivaroxaban users with and without renal dysfunction had significantly lower thromboembolic event rates (HR = 0.62, p < 0.0001; and HR = 0.64, p < 0.0001, respectively), and similar major bleeding rates to warfarin users. In the eCrCl-based analysis, rivaroxaban users with eCrCl ≥ 60 mL/min had a significantly lower thromboembolic event rate, but other outcomes were not statistically significant. Rivaroxaban-treated NVAF patients with diagnosed renal dysfunction had a significantly lower stroke rate compared to warfarin-treated patients. Regardless of renal dysfunction diagnoses, rivaroxaban users had lower thromboembolic event rates compared to warfarin users, and a similar rate of major bleeding. eCrCl-based analysis was limited by a small sample size. .
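The adjustment by inverse probability of treatment weights mentioned in this abstract can be illustrated with a small sketch. The abstract does not say which weight form was used or how propensity scores were estimated, so the unstabilized weights below (1/p for treated, 1/(1-p) for untreated) and the helper names `iptw_weights` and `weighted_event_rate` are assumptions; the propensity scores are taken as given from some model not shown here.

```python
def iptw_weights(treated, propensity):
    """Unstabilized inverse probability of treatment weights.

    treated: list of 0/1 flags (1 = received the treatment of interest).
    propensity: estimated probability of receiving that treatment,
                given baseline covariates (assumed to come from a fitted model).
    """
    weights = []
    for t, p in zip(treated, propensity):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / p if t else 1.0 / (1.0 - p))
    return weights


def weighted_event_rate(events, weights):
    """Weighted proportion of patients with an event (events are 0/1 flags)."""
    return sum(e * w for e, w in zip(events, weights)) / sum(weights)
```

Patients who received a treatment they were unlikely to get are weighted up, so that the weighted treatment groups resemble each other on the measured baseline confounders.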
SGFSC: speeding the gene functional similarity calculation based on hash tables.

PubMed

Tian, Zhen; Wang, Chunyu; Guo, Maozu; Liu, Xiaoyan; Teng, Zhixia

2016-11-04

In recent years, many measures of gene functional similarity have been proposed and widely used in all kinds of essential research. These methods are mainly divided into two categories: pairwise approaches and group-wise approaches. However, a common problem with these methods is their time consumption, especially when measuring the gene functional similarities of a large number of gene pairs. The problem of computational efficiency for pairwise approaches is even more prominent because they are dependent on the combination of semantic similarity. Therefore, the efficient measurement of gene functional similarity remains a challenging problem. To speed current gene functional similarity calculation methods, a novel two-step computing strategy is proposed: (1) establish a hash table for each method to store essential information obtained from the Gene Ontology (GO) graph and (2) measure gene functional similarity based on the corresponding hash table. There is no need to traverse the GO graph repeatedly for each method with the help of the hash table. The analysis of time complexity shows that the computational efficiency of these methods is significantly improved. We also implement a novel Speeding Gene Functional Similarity Calculation tool, namely SGFSC, which is bundled with seven typical measures using our proposed strategy. Further experiments show the great advantage of SGFSC in measuring gene functional similarity on the whole genomic scale. The proposed strategy is successful in speeding current gene functional similarity calculation methods. SGFSC is an efficient tool that is freely available at http://nclab.hit.edu.cn/SGFSC . The source code of SGFSC can be downloaded from http://pan.baidu.com/s/1dFFmvpZ .
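The two-step strategy described here (build a hash table once, then answer similarity queries from it) can be sketched as below. This is not SGFSC's code: the toy ontology, the function names, and the Jaccard overlap of ancestor sets are assumptions chosen for illustration, and the abstract does not say what information each of the seven bundled measures stores.

```python
def build_ancestor_table(parents):
    """Step 1: precompute, for every term, the set holding the term and all its ancestors.

    parents: dict mapping term id -> list of direct parent ids
             (a toy acyclic graph, not real Gene Ontology data).
    """
    table = {}

    def ancestors(term):
        # Each term is resolved once; later lookups reuse the stored set.
        if term not in table:
            found = {term}
            for parent in parents.get(term, []):
                found |= ancestors(parent)
            table[term] = found
        return table[term]

    for term in parents:
        ancestors(term)
    return table


def term_similarity(table, term_a, term_b):
    """Step 2: Jaccard overlap of two precomputed ancestor sets (illustrative measure)."""
    a, b = table[term_a], table[term_b]
    return len(a & b) / len(a | b)


# Hypothetical four-term ontology.
toy_parents = {
    "root": [],
    "metabolism": ["root"],
    "lipid": ["metabolism"],
    "sugar": ["metabolism"],
}
toy_table = build_ancestor_table(toy_parents)
```

After the table is built, each query is a pair of dictionary lookups plus set operations, so the graph is not traversed again for every gene pair.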
An Experimental Comparison of Similarity Assessment Measures for 3D Models on Constrained Surface Deformation

NASA Astrophysics Data System (ADS)

Quan, Lulin; Yang, Zhixin

2010-05-01

To address issues in design customization, this paper describes the specification and application of constrained surface deformation, and reports an experimental performance comparison of three prevailing similarity assessment algorithms in the constrained surface deformation domain. Constrained surface deformation is a promising method that supports various downstream applications of customized design. Similarity assessment is regarded as the key technology for inspecting the success of a new design by measuring the level of difference between the deformed new design and the initial sample model, and indicating whether that difference is within the allowed limit. According to our theoretical analysis and pre-experiments, three similarity assessment algorithms are suitable for this domain: the shape histogram based method, the skeleton based method, and the U system moment based method. We analyze their basic functions and implementation methodologies in detail, and run a series of experiments in various situations to test their accuracy and efficiency using precision-recall diagrams. A shoe model is chosen as an industrial example for the experiments. The results show that the shape histogram based method performed best in the comparison. Based on this result, we propose a novel approach that integrates surface constraints and shape histogram description with an adaptive weighting method, which emphasizes the role of constraints during the assessment. Limited initial experimental results demonstrate that our algorithm outperforms the other three algorithms. A clear direction for future development is also drawn at the end of the paper.
Traditional Indian medicine (TIM) and traditional Korean medicine (TKM): a constitutional-based concept and comparison.

PubMed

Kang, Young Min; Komakech, Richard; Karigar, Chandrakant Shivappa; Saqib, Asma

2017-06-01

Traditional and complementary medicine (T&CM) plays an integral role in providing health care worldwide. It is based on sound fundamental principles and centuries of practice. This study compared traditional Indian medicine (TIM) and traditional Korean medicine (TKM) based on data obtained from peer-reviewed articles, respective government institutional reports and World Health Organization reports. Although TIM and TKM each have unique qualities, including different histories of origin, they have a lot in common. Apart from Homeopathy in TIM, both systems hinge on a similar body constitutional-based concept and similar disease diagnosis methods, mainly auscultation, palpation, visual inspection, and interrogation. Similarly, the treatment methods of TIM and TKM follow similar patterns involving the use of medicinal herbs, moxibustion, acupuncture, cupping, and manual therapy. Both systems are mainly practiced in well-established hospitals by T&CM doctors who have undergone an average of 6-7 years of specialized training. However, unlike TIM, which has less insurance coverage, the popularity of TKM is mainly due to its wide national insurance coverage. These two medical traditions occupy an increasingly large portion of the global market. However, TIM, especially Ayurveda, has gained more global recognition than TKM, although Sasang Constitutional Medicine in TKM is becoming more popular. This comparative analysis between TIM and TKM may provide a vital and insightful contribution towards the constitutional-based concept for further development and future studies in T&CM.
Functional diversity of fish in tropical estuaries: A traits-based approach of communities in Pernambuco, Brazil

NASA Astrophysics Data System (ADS)

Silva-Júnior, C. A. B.; Mérigot, B.; Lucena-Frédou, F.; Ferreira, B. P.; Coxey, M. S.; Rezende, S. M.; Frédou, T.

2017-11-01

Environmental changes and human activities may have strong impacts on biodiversity and ecosystem functioning. While biodiversity assessment is traditionally based on species richness and composition, there is growing interest in taking functional diversity into account to assess and manage species communities. Despite the economic importance of tropical estuaries, functional diversity quantified by a traits-based approach is still poorly documented there. In this study, the functional diversity of fishes was investigated within four estuaries in Pernambuco state, northeast Brazil. These areas are subject to different levels of human impact (e.g. mangrove deforestation, shrimp farming, fishing) and environmental conditions. Fishes were collected during 34 scientific surveys. A total of 122 species were identified and 12 functional traits were quantified describing two main functions: food acquisition and locomotion. Fish abundance and functional dissimilarity data were combined in a multivariate analysis, the Double Principal Coordinate Analysis (DPCoA), to identify the functional typology of fish assemblages according to the estuary. Results showed that Itapissuma, the largest estuary with a wider mangrove forest area, differs from the other three estuaries, showing higher mean values per sample of species richness S and quadratic entropy Q. Similarly, it presented a different functional typology (the first two axes of the DPCoA account for 68.7% of total inertia, while those of a traditional PCA based solely on species abundances provided only 17.4%). Conversely, Suape, Sirinhaém, and to a lesser extent Rio Formoso, showed more similarity in their diversity. This result was attributed to their predominantly marine-influenced hydrological features and to similar levels of species abundances and morphological traits. Overall, this study, which combines diversity indices with a recent multivariate analysis to assess species contributions to functional typology, deepens diversity assessment by providing additional information on the functional pattern of fish assemblages.
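The quadratic entropy Q mentioned in this abstract is commonly computed as Rao's quadratic entropy, Q = sum over i and j of d_ij * p_i * p_j, where p_i is the relative abundance of species i and d_ij is the functional dissimilarity between species i and j. The sketch below assumes that standard definition (the abstract does not say which variant was used); the function name and the toy sample are hypothetical.

```python
def rao_quadratic_entropy(abundances, dissimilarity):
    """Rao's Q = sum_i sum_j d_ij * p_i * p_j for a single sample."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("sample has no individuals")
    # Convert raw counts to relative abundances.
    p = [a / total for a in abundances]
    n = len(p)
    return sum(dissimilarity[i][j] * p[i] * p[j]
               for i in range(n) for j in range(n))


# Hypothetical toy data: three species, symmetric dissimilarities, zero diagonal.
d = [
    [0.0, 0.5, 1.0],
    [0.5, 0.0, 0.8],
    [1.0, 0.8, 0.0],
]
q_even = rao_quadratic_entropy([10, 10, 10], d)   # evenly mixed assemblage
q_single = rao_quadratic_entropy([30, 0, 0], d)   # one species only
```

Under this definition a sample dominated by a single species has Q = 0, while a sample that spreads abundance over functionally distant species has a higher Q.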
A New Feature-Enhanced Speckle Reduction Method Based on Multiscale Analysis for Ultrasound B-Mode Imaging.

PubMed

Kang, Jinbum; Lee, Jae Young; Yoo, Yangmo

2016-06-01

Effective speckle reduction in ultrasound B-mode imaging is important for enhancing the image quality and improving the accuracy in image analysis and interpretation. In this paper, a new feature-enhanced speckle reduction (FESR) method based on multiscale analysis and feature enhancement filtering is proposed for ultrasound B-mode imaging. In FESR, clinical features (e.g., boundaries and borders of lesions) are selectively emphasized by edge, coherence, and contrast enhancement filtering from fine to coarse scales while simultaneously suppressing speckle development via robust diffusion filtering. In the simulation study, the proposed FESR method showed statistically significant improvements in edge preservation, mean structure similarity, speckle signal-to-noise ratio, and contrast-to-noise ratio (CNR) compared with other speckle reduction methods, e.g., oriented speckle reducing anisotropic diffusion (OSRAD), nonlinear multiscale wavelet diffusion (NMWD), the Laplacian pyramid-based nonlinear diffusion and shock filter (LPNDSF), and the Bayesian nonlocal means filter (OBNLM). Similarly, the FESR method outperformed the OSRAD, NMWD, LPNDSF, and OBNLM methods in terms of CNR, i.e., 10.70 ± 0.06 versus 9.00 ± 0.06, 9.78 ± 0.06, 8.67 ± 0.04, and 9.22 ± 0.06 in the phantom study, respectively. Reconstructed B-mode images that were developed using the five speckle reduction methods were reviewed by three radiologists for evaluation based on each radiologist's diagnostic preferences. All three radiologists showed a significant preference for the abdominal liver images obtained using the FESR methods in terms of conspicuity, margin sharpness, artificiality, and contrast, p<0.0001. For the kidney and thyroid images, the FESR method showed similar improvement over other methods. However, the FESR method did not show statistically significant improvement compared with the OBNLM method in margin sharpness for the kidney and thyroid images. 
These results demonstrate that the proposed FESR method can improve the image quality of ultrasound B-mode imaging by enhancing the visualization of lesion features while effectively suppressing speckle noise.
The ability of human nuclear DNA to cause false positive low-abundance heteroplasmy calls varies across the mitochondrial genome.

PubMed

Albayrak, Levent; Khanipov, Kamil; Pimenova, Maria; Golovko, George; Rojas, Mark; Pavlidis, Ioannis; Chumakov, Sergei; Aguilar, Gerardo; Chávez, Arturo; Widger, William R; Fofanov, Yuriy

2016-12-12

Low-abundance mutations in mitochondrial populations (mutations with minor allele frequency ≤ 1%) are associated with cancer, aging, and neurodegenerative disorders. While recent progress in high-throughput sequencing technology has significantly improved the heteroplasmy identification process, the ability of this technology to detect low-abundance mutations can be affected by the presence of similar sequences originating from nuclear DNA (nDNA). To determine to what extent nDNA can cause false positive low-abundance heteroplasmy calls, we have identified mitochondrial locations of all subsequences that are common or similar (one mismatch allowed) between nDNA and mitochondrial DNA (mtDNA). The analysis revealed up to a 25-fold variation in the lengths of the longest common and longest similar (one mismatch allowed) subsequences across the mitochondrial genome. The size of the longest subsequences shared between nDNA and mtDNA in several regions of the mitochondrial genome was found to be as low as 11 bases, which not only allows using these regions to design new, very specific PCR primers, but also supports the hypothesis of the non-random introduction of mtDNA into the human nuclear DNA. Analysis of the mitochondrial locations of the subsequences shared between nDNA and mtDNA suggested that even very short (36 bases) single-end sequencing reads can be used to identify low-abundance variation in 20.4% of the mitochondrial genome. For longer (76 and 150 bases) reads, the proportion of the mitochondrial genome where nDNA presence will not interfere was found to be 44.5% and 67.9%, respectively, while low-abundance mutations at 100% of locations can be identified using single reads 417 bases long. This observation suggests that the analysis of low-abundance variations in mitochondrial populations can be extended to a variety of large data collections such as the NCBI Sequence Read Archive, the European Nucleotide Archive, The Cancer Genome Atlas, and the International Cancer Genome Consortium.
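The core quantity in this abstract, the longest subsequence shared between two sequences, can be illustrated with a small dynamic-programming sketch. It handles exact matches only (the one-mismatch variant described above is not covered), it is quadratic in sequence length and so not suited to genome-scale data, and the function name and the two short fragments are hypothetical.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous run shared by a and b (exact matches only)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous pair of positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical toy fragments, not real genomic sequence.
mt_fragment = "ACGTTGCAAC"
nuclear_fragment = "TTTGCAAGGA"
shared = longest_common_substring(mt_fragment, nuclear_fragment)
```

In this simplified exact-match view, a read that is longer than the longest shared run covering a position cannot match both sources perfectly, which is the intuition behind the read-length thresholds reported in the abstract.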
Identification of atypical flight patterns

NASA Technical Reports Server (NTRS)

Statler, Irving C. (Inventor); Ferryman, Thomas A. (Inventor); Amidan, Brett G. (Inventor); Whitney, Paul D. (Inventor); White, Amanda M. (Inventor); Willse, Alan R. (Inventor); Cooley, Scott K. (Inventor); Jay, Joseph Griffith (Inventor); Lawrence, Robert E. (Inventor); Mosbrucker, Chris (Inventor)

2005-01-01

Method and system for analyzing aircraft data, including multiple selected flight parameters for a selected phase of a selected flight, and for determining when the selected phase of the selected flight is atypical, when compared with corresponding data for the same phase for other similar flights. A flight signature is computed using continuous-valued and discrete-valued flight parameters for the selected flight parameters and is optionally compared with a statistical distribution of other observed flight signatures, yielding atypicality scores for the same phase for other similar flights. A cluster analysis is optionally applied to the flight signatures to define an optimal collection of clusters. A level of atypicality for a selected flight is estimated, based upon an index associated with the cluster analysis.
Aquatic exercise training and stable heart failure: A systematic review and meta-analysis.

PubMed

Adsett, Julie A; Mudge, Alison M; Morris, Norman; Kuys, Suzanne; Paratz, Jennifer D

2015-01-01

A meta-analysis and review of the evidence was conducted to determine the efficacy of aquatic exercise training for individuals with heart failure compared to traditional land-based programmes. A systematic search was conducted for studies published prior to March 2014, using MEDLINE, PUBMED, Cochrane Library, CINAHL and PEDro databases. Key words and synonyms relating to aquatic exercise and heart failure comprised the search strategy. Interventions included aquatic exercise or a combination of aquatic plus land-based training, whilst comparator protocols included usual care, no exercise or land-based training alone. The primary outcome of interest was exercise performance. Studies reporting on muscle strength, quality of life and a range of haemodynamic and physiological parameters were also reviewed. Eight studies met criteria, accounting for 156 participants. Meta-analysis identified studies including aquatic exercise to be superior to comparator protocols for 6 minute walk test (p < 0.004) and peak power (p < 0.044). Compared to land-based training programmes, aquatic exercise training provided similar benefits for VO(2peak), muscle strength and quality of life, though was not superior. Cardiac dimensions, left ventricular ejection fraction, cardiac output and BNP were not influenced by aquatic exercise training. For those with stable heart failure, aquatic exercise training can improve exercise capacity, muscle strength and quality of life similar to land-based training programmes. This form of exercise may provide a safe and effective alternative for those unable to participate in traditional exercise programmes. Crown Copyright © 2015. Published by Elsevier Ireland Ltd. All rights reserved.
Self-similar cosmological solutions with dark energy. I. Formulation and asymptotic analysis

NASA Astrophysics Data System (ADS)

Harada, Tomohiro; Maeda, Hideki; Carr, B. J.

2008-01-01

Based on the asymptotic analysis of ordinary differential equations, we classify all spherically symmetric self-similar solutions to the Einstein equations which are asymptotically Friedmann at large distances and contain a perfect fluid with equation of state p=(γ-1)μ with 0<γ<2/3. This corresponds to a “dark energy” fluid and the Friedmann solution is accelerated in this case due to antigravity. This extends the previous analysis of spherically symmetric self-similar solutions for fluids with positive pressure (γ>1). However, in the latter case there is an additional parameter associated with the weak discontinuity at the sonic point and the solutions are only asymptotically “quasi-Friedmann,” in the sense that they exhibit an angle deficit at large distances. In the 0<γ<2/3 case, there is no sonic point and there exists a one-parameter family of solutions which are genuinely asymptotically Friedmann at large distances. We find eight classes of asymptotic behavior: Friedmann or quasi-Friedmann or quasistatic or constant-velocity at large distances, quasi-Friedmann or positive-mass singular or negative-mass singular at small distances, and quasi-Kantowski-Sachs at intermediate distances. The self-similar asymptotically quasistatic and quasi-Kantowski-Sachs solutions are analytically extendible and of great cosmological interest. We also investigate their conformal diagrams. The results of the present analysis are utilized in an accompanying paper to obtain and physically interpret numerical solutions.
Self-similar cosmological solutions with dark energy. I. Formulation and asymptotic analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harada, Tomohiro; Maeda, Hideki; Centro de Estudios Cientificos

2008-01-15

Based on the asymptotic analysis of ordinary differential equations, we classify all spherically symmetric self-similar solutions to the Einstein equations which are asymptotically Friedmann at large distances and contain a perfect fluid with equation of state p=(γ-1)μ with 0<γ<2/3. This corresponds to a 'dark energy' fluid and the Friedmann solution is accelerated in this case due to antigravity. This extends the previous analysis of spherically symmetric self-similar solutions for fluids with positive pressure (γ>1). However, in the latter case there is an additional parameter associated with the weak discontinuity at the sonic point and the solutions are only asymptotically 'quasi-Friedmann', in the sense that they exhibit an angle deficit at large distances. In the 0<γ<2/3 case, there is no sonic point and there exists a one-parameter family of solutions which are genuinely asymptotically Friedmann at large distances. We find eight classes of asymptotic behavior: Friedmann or quasi-Friedmann or quasistatic or constant-velocity at large distances, quasi-Friedmann or positive-mass singular or negative-mass singular at small distances, and quasi-Kantowski-Sachs at intermediate distances. The self-similar asymptotically quasistatic and quasi-Kantowski-Sachs solutions are analytically extendible and of great cosmological interest. We also investigate their conformal diagrams. The results of the present analysis are utilized in an accompanying paper to obtain and physically interpret numerical solutions.
Use of an expert system data analysis manager for space shuttle main engine test evaluation

NASA Technical Reports Server (NTRS)

Abernethy, Ken

1988-01-01

The ability to articulate, collect, and automate the application of the expertise needed for the analysis of space shuttle main engine (SSME) test data would be of great benefit to NASA liquid rocket engine experts. This paper describes a project whose goal is to build a rule-based expert system which incorporates such expertise. Experiential expertise, collected directly from the experts currently involved in SSME data analysis, is used to build a rule base to identify engine anomalies similar to those analyzed previously. Additionally, an alternate method of expertise capture is being explored. This method would generate rules inductively based on calculations made using a theoretical model of the SSME's operation. The latter rules would be capable of diagnosing anomalies which may not have appeared before, but whose effects can be predicted by the theoretical model.
Scalar Similarity for Relaxed Eddy Accumulation Methods

NASA Astrophysics Data System (ADS)

Ruppert, Johannes; Thomas, Christoph; Foken, Thomas

2006-07-01

The relaxed eddy accumulation (REA) method allows the measurement of trace gas fluxes when no fast sensors are available for eddy covariance measurements. The flux parameterisation used in REA is based on the assumption of scalar similarity, i.e., similarity of the turbulent exchange of two scalar quantities. In this study changes in scalar similarity between carbon dioxide, sonic temperature and water vapour were assessed using scalar correlation coefficients and spectral analysis. The influence on REA measurements was assessed by simulation. The evaluation is based on observations over grassland, irrigated cotton plantation and spruce forest. Scalar similarity between carbon dioxide, sonic temperature and water vapour showed a distinct diurnal pattern and change within the day. Poor scalar similarity was found to be linked to dissimilarities in the energy contained in the low frequency part of the turbulent spectra (< 0.01 Hz). The simulations of REA showed significant change in b-factors throughout the diurnal course. The b-factor is part of the REA parameterisation scheme and describes a relation between the concentration difference and the vertical flux of a trace gas. The diurnal course of b-factors for carbon dioxide, sonic temperature and water vapour matched well. Relative flux errors induced in REA by varying scalar similarity were generally below ±10%. Systematic underestimation of the flux of up to -40% was found for the use of REA applying a hyperbolic deadband (HREA). This underestimation was related to poor scalar similarity between the scalar of interest and the scalar used as proxy for the deadband definition.
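The b-factor described in this abstract comes from the usual REA parameterisation, F = b * sigma_w * (c_up - c_down), where sigma_w is the standard deviation of vertical wind speed and c_up and c_down are the mean scalar concentrations in updrafts and downdrafts. A minimal sketch of how b could be back-calculated from a proxy scalar with a known covariance flux follows. It assumes no deadband (so it does not reproduce the hyperbolic deadband variant, HREA), and the function name and the four-sample series are hypothetical.

```python
from statistics import mean, pstdev


def rea_b_factor(w, c):
    """b = cov(w, c) / (sigma_w * (mean c in updrafts - mean c in downdrafts))."""
    w_mean, c_mean = mean(w), mean(c)
    w_fluct = [x - w_mean for x in w]
    # Eddy covariance flux of the proxy scalar: mean of w' * c'.
    flux = mean(wf * (x - c_mean) for wf, x in zip(w_fluct, c))
    sigma_w = pstdev(w)
    # Conditional sampling by the sign of the vertical wind fluctuation.
    c_up = mean(x for wf, x in zip(w_fluct, c) if wf > 0)
    c_down = mean(x for wf, x in zip(w_fluct, c) if wf < 0)
    return flux / (sigma_w * (c_up - c_down))


# Hypothetical toy series: vertical wind (m/s) and a proxy scalar.
w_series = [1.0, -1.0, 2.0, -2.0]
c_series = [3.0, 1.0, 4.0, 0.0]
b = rea_b_factor(w_series, c_series)
```

A b-factor obtained this way from one scalar is applied to another only under the scalar similarity assumption, which is why changes in similarity during the day translate into flux errors.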
Volatility and correlation-based systemic risk measures in the US market

NASA Astrophysics Data System (ADS)

Civitarese, Jamil

2016-10-01

This paper deals with the problem of how to use simple systemic risk measures to assess portfolio risk characteristics. Using three simple examples taken from previous literature, one based on raw and partial correlations, another based on the eigenvalue decomposition of the covariance matrix and the last one based on an eigenvalue entropy, a Granger-causation analysis revealed some of them are not always a good measure of risk in the S&P 500 and in the VIX. The measures selected do not Granger-cause the VIX index in all windows selected; therefore, in the sense of risk as volatility, the indicators are not always suitable. Nevertheless, their results towards returns are similar to previous works that accept them. A deeper analysis has shown that any symmetric measure based on eigenvalue decomposition of correlation matrices, however, is not useful as a measure of "correlation" risk. The empirical counterpart analysis of this proposition stated that negative correlations are usually small and, therefore, do not heavily distort the behavior of the indicator.
User Evaluation of the NASA Technical Report Server Recommendation Service

NASA Technical Reports Server (NTRS)

Nelson, Michael L.; Bollen, Johan; Calhoun, JoAnne R.; Mackey, Calvin E.

2004-01-01

We present the user evaluation of two recommendation server methodologies implemented for the NASA Technical Report Server (NTRS). One methodology for generating recommendations uses log analysis to identify co-retrieval events on full-text documents. For comparison, we used the Vector Space Model (VSM) as the second methodology. We calculated cosine similarities and used the top 10 most similar documents (based on metadata) as 'recommendations'. We then ran an experiment with NASA Langley Research Center (LaRC) staff members to gather their feedback on which method produced the most 'quality' recommendations. We found that in most cases VSM outperformed log analysis of co-retrievals. However, analyzing the data revealed the evaluations may have been structurally biased in favor of the VSM generated recommendations. We explore some possible methods for combining log analysis and VSM generated recommendations and suggest areas of future work.
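The VSM step described in this abstract (cosine similarity over metadata, keeping the top 10) can be sketched as follows. This is an illustration under stated assumptions, not the NTRS implementation: it uses raw term counts where the real system may have used other term weights, and the function names, record identifiers and metadata strings are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_id, docs, k=10):
    """Return the ids of the k documents most similar to docs[query_id]."""
    vectors = {doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()}
    query = vectors[query_id]
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in vectors.items() if doc_id != query_id]
    # Highest similarity first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical metadata strings for three reports.
reports = {
    "r1": "wind tunnel test of wing model",
    "r2": "wing model wind tunnel results",
    "r3": "lunar soil sample analysis",
}
top = recommend("r1", reports, k=2)
```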
Spatial characterization of acid rain stress in Canadian Shield Lakes

NASA Technical Reports Server (NTRS)

Tanis, Fred J.

1987-01-01

An analysis was performed to interpret the spatial aspects of lake acidification. Three types of relationships were investigated based upon the August to May seasonal scene pairing. In the first type of analysis ANOVA was used to examine the mean Thematic Mapper band one count by ecophysical strata. The primary difference in the two ecophysical strata is the soil type and depth over the underlying bedrock. Examination of the August to May difference values for TM band one produced similar results. Group A and B strata were the same as above. The third type of analysis examined the relationship between values of the August to May difference from polygons which have similar ecophysical properties with the exception of sulfate deposition. For this case lakes were selected from units with sandy soils over granitic rock types and the sulfate deposition was 1.5 or 2.5 g/sq m/yr.
Unsupervised User Similarity Mining in GSM Sensor Networks

PubMed Central

Shad, Shafqat Ali; Chen, Enhong

2013-01-01

Mobility data has attracted researchers for the past few years because of its rich context and spatiotemporal nature; this information can be used for applications such as early warning systems, route prediction, traffic management, advertisement, social networking, and community finding. All of these applications are based on mobility profile building and user trend analysis, where mobility profile building is done through significant places extraction, prediction of the user's actual movement, and context awareness. However, significant places extraction and prediction of the user's actual movement for mobility profile building are not trivial tasks. In this paper, we present a user similarity mining-based methodology that builds user mobility profiles from the semantic tagging information provided by users and from basic GSM network architecture properties, using an unsupervised clustering approach. As the mobility information is in low-level raw form, our proposed methodology converts it to high-level meaningful information by using the cell-Id location information rather than previously used location capturing methods like GPS, Infrared, and Wifi for profile mining and user similarity mining. PMID:23576905

Modeling and statistical analysis of non-Gaussian random fields with heavy-tailed distributions.

PubMed

Nezhadhaghighi, Mohsen Ghasemi; Nakhlband, Abbas

2017-04-01

In this paper, we investigate and develop an alternative approach to the numerical analysis and characterization of random fluctuations with the heavy-tailed probability distribution function (PDF), such as turbulent heat flow and solar flare fluctuations. We identify the heavy-tailed random fluctuations based on the scaling properties of the tail exponent of the PDF, power-law growth of qth order correlation function, and the self-similar properties of the contour lines in two-dimensional random fields. Moreover, this work leads to a substitution for the fractional Edwards-Wilkinson (EW) equation that works in the presence of μ-stable Lévy noise. Our proposed model explains the configuration dynamics of the systems with heavy-tailed correlated random fluctuations. We also present an alternative solution to the fractional EW equation in the presence of μ-stable Lévy noise in the steady state, which is implemented numerically, using the μ-stable fractional Lévy motion. Based on the analysis of the self-similar properties of contour loops, we numerically show that the scaling properties of contour loop ensembles can qualitatively and quantitatively distinguish non-Gaussian random fields from Gaussian random fluctuations.
Chemometric analysis of correlations between electronic absorption characteristics and structural and/or physicochemical parameters for ampholytic substances of biological and pharmaceutical relevance.

PubMed

Judycka-Proma, U; Bober, L; Gajewicz, A; Puzyn, T; Błażejowski, J

2015-03-05

Forty ampholytic compounds of biological and pharmaceutical relevance were subjected to chemometric analysis based on unsupervised and supervised learning algorithms. This enabled relations to be found between empirical spectral characteristics derived from electronic absorption data and structural and physicochemical parameters predicted by quantum chemistry methods or phenomenological relationships based on additivity rules. It was found that the energies of long wavelength absorption bands are correlated through multiparametric linear relationships with parameters reflecting the bulkiness features of the absorbing molecules as well as their nucleophilicity and electrophilicity. These dependences enable the quantitative analysis of spectral features of the compounds, as well as a comparison of their similarities and certain pharmaceutical and biological features. Three QSPR models to predict the energies of long-wavelength absorption in buffers with pH=2.5 and pH=7.0, as well as in methanol, were developed and validated in this study. These models can be further used to predict the long-wavelength absorption energies of untested substances (if they are structurally similar to the training compounds). Copyright © 2014 Elsevier B.V. All rights reserved.
Artistic image analysis using graph-based learning approaches.

PubMed

Carneiro, Gustavo

2013-08-01

We introduce a new methodology for the problem of artistic image analysis, which, among other tasks, involves the automatic identification of visual classes present in an art work. In this paper, we advocate the idea that artistic image analysis must explore a graph that captures the network of artistic influences by computing the similarities in terms of appearance and manual annotation. One of the novelties of our methodology is the proposed formulation, which is a principled way of combining these two similarities in a single graph. Using this graph, we show that an efficient random walk algorithm based on an inverted label propagation formulation produces more accurate annotation and retrieval results compared with the following baseline algorithms: bag of visual words, label propagation, matrix completion, and structural learning. We also show that the proposed approach leads to more efficient inference and training procedures. This experiment is run on a database containing 988 artistic images (with 49 visual classification problems divided into a multiclass problem with 27 classes and 48 binary problems), where we show the inference and training running times, and quantitative comparisons with respect to several retrieval and annotation performance measures.
Automated 3D renal segmentation based on image partitioning

NASA Astrophysics Data System (ADS)

Yeghiazaryan, Varduhi; Voiculescu, Irina D.

2016-03-01

Despite several decades of research into segmentation techniques, automated medical image segmentation is barely usable in a clinical context, and still at vast user time expense. This paper illustrates unsupervised organ segmentation through the use of a novel automated labelling approximation algorithm followed by a hypersurface front propagation method. The approximation stage relies on a pre-computed image partition forest obtained directly from CT scan data. We have implemented all procedures to operate directly on 3D volumes, rather than slice-by-slice, because our algorithms are dimensionality-independent. The results picture segmentations which identify kidneys, but can easily be extrapolated to other body parts. Quantitative analysis of our automated segmentation compared against hand-segmented gold standards indicates an average Dice similarity coefficient of 90%. Results were obtained over volumes of CT data with 9 kidneys, computing both volume-based similarity measures (such as the Dice and Jaccard coefficients, true positive volume fraction) and size-based measures (such as the relative volume difference). The analysis considered both healthy and diseased kidneys, although extreme pathological cases were excluded from the overall count. Such cases are difficult to segment both manually and automatically due to the large amplitude of Hounsfield unit distribution in the scan, and the wide spread of the tumorous tissue inside the abdomen. In the case of kidneys that have maintained their shape, the similarity range lies around the values obtained for inter-operator variability. Whilst the procedure is fully automated, our tools also provide a light level of manual editing.
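The volume-based overlap measures named in this abstract have simple set definitions: Dice = 2|A∩B| / (|A| + |B|) and Jaccard = |A∩B| / |A∪B|, where A and B are the voxel sets of the automated and the hand-segmented volumes. A minimal sketch follows. It represents each segmentation as a set of (x, y, z) voxel coordinates, which is an assumption made for brevity, and the toy volumes are hypothetical.

```python
def dice_coefficient(a, b):
    """Dice similarity: 2 * |A and B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # two empty segmentations agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard_coefficient(a, b):
    """Jaccard similarity: |A and B| / |A or B|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical toy volumes: four voxels each, three of them shared.
automated = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
gold = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}
dice = dice_coefficient(automated, gold)
jaccard = jaccard_coefficient(automated, gold)
```

The two measures are monotonically related by J = D / (2 - D), so they rank segmentations identically and differ only in scale.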
Redescription of Haemogregarina garnhami (Apicomplexa: Adeleorina) from the blood of Psammophis schokari (Serpentes: Colubridae) as Hepatozoon garnhami n. comb. based on molecular, morphometric and morphologic characters.

PubMed

Abdel-Baki, Abdel-Azeem S; Al-Quraishy, Saleh; Zhang, J Y

2014-06-01

Hepatozoon garnhami n. comb. was redescribed from Schokari sand snakes (Psammophis schokari) collected from Riyadh city in Saudi Arabia. Gametocytes were found in the peripheral blood of 2 of 15 snakes examined. Based on the similar morphological and morphometric characteristics, the same host and a similar host habitat environment, it can be concluded for the first time that the present species is conspecific with Haemogregarina garnhami previously reported from Psammophis schokari aegyptius. To further characterize this parasite, the partial 18S rRNA gene was amplified and sequenced. The sequence analysis also showed that Haemogregarina garnhami should be reassigned into the genus Hepatozoon as Hepatozoon garnhami, which has 99.5% (859/863 bp) sequence similarity to Hepatozoon ayorgbor, infecting the erythrocytes of Python regius in Ghana. Phylogenetic analysis showed that H. garnhami formed a mixed clade with Hepatozoon spp. from geckos, snakes and rodents, and that ophidian Hepatozoon spp. did not form a separate phylogenetic unit. Also, Psammophis schokari-infecting Hepatozoon contained several different genetic lineages. To our knowledge, the present work extends the geographic distribution of H. garnhami and is the first report of Hepatozoon infection in snakes from Saudi Arabia.
Using Cluster Analysis to Compartmentalize a Large Managed Wetland Based on Physical, Biological, and Climatic Geospatial Attributes.

PubMed

Hahus, Ian; Migliaccio, Kati; Douglas-Mankin, Kyle; Klarenberg, Geraldine; Muñoz-Carpena, Rafael

2018-04-27

Hierarchical and partitional cluster analyses were used to compartmentalize Water Conservation Area 1, a managed wetland within the Arthur R. Marshall Loxahatchee National Wildlife Refuge in southeast Florida, USA, based on physical, biological, and climatic geospatial attributes. Single, complete, average, and Ward's linkages were tested during the hierarchical cluster analyses, with average linkage providing the best results. In general, the partitional method, partitioning around medoids, found clusters that were more evenly sized and more spatially aggregated than those resulting from the hierarchical analyses. However, hierarchical analysis appeared to be better suited to identify outlier regions that were significantly different from other areas. The clusters identified by geospatial attributes were similar to clusters developed for the interior marsh in a separate study using water quality attributes, suggesting that similar factors have influenced variations in both the set of physical, biological, and climatic attributes selected in this study and water quality parameters. However, geospatial data allowed further subdivision of several interior marsh clusters identified from the water quality data, potentially indicating zones with important differences in function. Identification of these zones can be useful to managers and modelers by informing the distribution of monitoring equipment and personnel as well as delineating regions that may respond similarly to future changes in management or climate.
Investigating the long-term course of schizophrenia by sequence analysis.

PubMed

An der Heiden, Wolfram; Häfner, Heinz

2015-08-30

In the present study we set out to explore the long-term clinical course of schizophrenia in a holistic manner by adopting sequence analysis. Our aim was to identify course types of illness by means of cluster analysis. The study was based on course and outcome data for 107 patients followed up over 134 months after first admission in the ABC Schizophrenia Study. Focusing on the main syndromes (positive, negative, depressive and unspecific symptoms) and their combinations we looked for similarities in individual illness courses using the 'optimal matching' method. A cluster analysis performed on the resulting similarity matrix yielded two main groups (an 'improving' and a 'chronic' group), which comprised a total of six different types of illness course. The course types differed in both quantitative (frequency of syndromes and syndrome combinations) and qualitative terms (clinical presentation, sequence of syndromes). Cluster membership was only rarely, but clearly, associated with sociodemographic characteristics, treatment data and other illness variables. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
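As an illustration of the 'optimal matching' step, the following is a minimal sketch, not the authors' implementation. It assumes each illness course is coded as a string with one syndrome letter per observation period, and it uses hypothetical costs (1.0 for an insertion or deletion, 2.0 for a substitution); the sequences and names are invented for the example.

```python
def optimal_matching_distance(seq_a, seq_b, indel_cost=1.0, sub_cost=2.0):
    """Edit distance between two state sequences via dynamic programming.

    The costs are illustrative assumptions, not values from the study.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * indel_cost          # delete everything in seq_a
    for j in range(1, cols):
        dist[0][j] = j * indel_cost          # insert everything in seq_b
    for i in range(1, rows):
        for j in range(1, cols):
            same = seq_a[i - 1] == seq_b[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost,                      # deletion
                dist[i][j - 1] + indel_cost,                      # insertion
                dist[i - 1][j - 1] + (0.0 if same else sub_cost), # match or substitution
            )
    return dist[-1][-1]


# Hypothetical courses: P positive, N negative, D depressive, U unspecific, - symptom-free
courses = {"a": "PPNN--", "b": "PPNNN-", "c": "DDDDUU"}
names = sorted(courses)
# Pairwise distance matrix that a cluster analysis could take as input
matrix = {
    (x, y): optimal_matching_distance(courses[x], courses[y])
    for x in names
    for y in names
}
```

Courses that differ in a single period end up close together, while courses with no shared states are far apart, which is the property the subsequent clustering relies on.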
Diffusion of Super-Gaussian Profiles

ERIC Educational Resources Information Center

Rosenberg, C.-J.; Anderson, D.; Desaix, M.; Johannisson, P.; Lisak, M.

2007-01-01

The present analysis describes an analytically simple and systematic approximation procedure for modelling the free diffusive spreading of initially super-Gaussian profiles. The approach is based on a self-similar ansatz for the evolution of the diffusion profile, and the parameter functions involved in the modelling are determined by suitable…
14 CFR 35.38 - Lightning strike.

Code of Federal Regulations, 2011 CFR

2011-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.38 Lightning strike. The applicant must demonstrate, by tests, analysis based on tests, or experience on similar designs, that the propeller can withstand a lightning strike without causing a major or hazardous propeller effect. The limit to which the propeller has...
14 CFR 35.38 - Lightning strike.

Code of Federal Regulations, 2010 CFR

2010-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.38 Lightning strike. The applicant must demonstrate, by tests, analysis based on tests, or experience on similar designs, that the propeller can withstand a lightning strike without causing a major or hazardous propeller effect. The limit to which the propeller has...
14 CFR 35.38 - Lightning strike.

Code of Federal Regulations, 2013 CFR

2013-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.38 Lightning strike. The applicant must demonstrate, by tests, analysis based on tests, or experience on similar designs, that the propeller can withstand a lightning strike without causing a major or hazardous propeller effect. The limit to which the propeller has...
14 CFR 35.38 - Lightning strike.

Code of Federal Regulations, 2012 CFR

2012-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.38 Lightning strike. The applicant must demonstrate, by tests, analysis based on tests, or experience on similar designs, that the propeller can withstand a lightning strike without causing a major or hazardous propeller effect. The limit to which the propeller has...
14 CFR 35.38 - Lightning strike.

Code of Federal Regulations, 2014 CFR

2014-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.38 Lightning strike. The applicant must demonstrate, by tests, analysis based on tests, or experience on similar designs, that the propeller can withstand a lightning strike without causing a major or hazardous propeller effect. The limit to which the propeller has...
Net Venn - An integrated network analysis web platform for gene lists

USDA-ARS's Scientific Manuscript database

Many lists containing biological identifiers such as gene lists have been generated in various genomics projects. Identifying the overlap among gene lists can enable us to understand the similarities and differences between the datasets. Here, we present an interactome network-based web application...
Physical-chemical property based sequence motifs and methods regarding same

DOEpatents

Braun, Werner [Friendswood, TX; Mathura, Venkatarajan S [Sarasota, FL; Schein, Catherine H [Friendswood, TX

2008-09-09

A data analysis system, program, and/or method, e.g., a data mining/data exploration method, using physical-chemical property motifs. For example, a sequence database may be searched for identifying segments thereof having physical-chemical properties similar to the physical-chemical property motifs.
Content Structure as a Design Strategy Variable in Concept Acquisition.

ERIC Educational Resources Information Center

Tennyson, Robert D.; Tennyson, Carol L.

Three methods of sequencing coordinate concepts (simultaneous, collective, and successive) were investigated with a Bayesian, computer-based, adaptive control system. The data analysis showed that when coordinate concepts are taught simultaneously (contextually similar concepts presented at the same time), student performance is superior to either…
The role of sex of peers and gender-typed activities in young children's peer affiliative networks: a longitudinal analysis of selection and influence.

PubMed

Martin, Carol Lynn; Kornienko, Olga; Schaefer, David R; Hanish, Laura D; Fabes, Richard A; Goble, Priscilla

2013-01-01

A stochastic actor-based model was used to investigate the origins of sex segregation by examining how similarity in sex of peers and time spent in gender-typed activities affected affiliation network selection and how peers influenced children's (N = 292; Mage = 4.3 years) activity involvement. Gender had powerful effects on interactions through direct and indirect pathways. Children selected playmates of the same sex and with similar levels of gender-typed activities. Selection based on gender-typed activities partially mediated selection based on sex of peers. Children influenced one another's engagement in gender-typed activities. When mechanisms producing sex segregation were compared, the largest contributor was selection based on sex of peers; less was due to activity-based selection and peer influence. Implications for sex segregation and gender development are discussed. © 2012 The Authors. Child Development © 2012 Society for Research in Child Development, Inc.
Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key.

PubMed

Haupt, V Joachim; Daminelli, Simone; Schroeder, Michael

2013-01-01

Drug repositioning applies established drugs to new disease indications with increasing success. A pre-requisite for drug repurposing is drug promiscuity (polypharmacology) - a drug's ability to bind to several targets. There is a long standing debate on the reasons for drug promiscuity. Based on large compound screens, hydrophobicity and molecular weight have been suggested as key reasons. However, the results are sometimes contradictory and leave space for further analysis. Protein structures offer a structural dimension to explain promiscuity: Can a drug bind multiple targets because the drug is flexible or because the targets are structurally similar or even share similar binding sites? We present a systematic study of drug promiscuity based on structural data of PDB target proteins with a set of 164 promiscuous drugs. We show that there is no correlation between the degree of promiscuity and ligand properties such as hydrophobicity or molecular weight but a weak correlation to conformational flexibility. However, we do find a correlation between promiscuity and structural similarity as well as binding site similarity of protein targets. In particular, 71% of the drugs have at least two targets with similar binding sites. In order to overcome issues in detection of remotely similar binding sites, we employed a score for binding site similarity: LigandRMSD measures the similarity of the aligned ligands and uncovers remote local similarities in proteins. It can be applied to arbitrary structural binding site alignments. Three representative examples, namely the anti-cancer drug methotrexate, the natural product quercetin and the anti-diabetic drug acarbose are discussed in detail. Our findings suggest that global structural and binding site similarity play a more important role to explain the observed drug promiscuity in the PDB than physicochemical drug properties like hydrophobicity or molecular weight. Additionally, we find ligand flexibility to have a minor influence.
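The abstract does not give the exact definition of LigandRMSD, so the following is only a sketch of the general idea: a plain root-mean-square deviation between two copies of a ligand after their binding sites have been superimposed. It assumes the atoms are already paired in the same order; the function name and coordinates are hypothetical.

```python
import math


def ligand_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired 3-D atom coordinates.

    Assumes both ligands are already in the same frame (for example after a
    binding site alignment) and that atom i in coords_a pairs with atom i in
    coords_b. This is a generic RMSD, not the published LigandRMSD score.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical two-atom ligand seen in two superimposed binding sites
pose_1 = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pose_2 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
deviation = ligand_rmsd(pose_1, pose_2)
```

A small value means the ligand sits in nearly the same place and pose in both aligned sites, which is the sense in which such a score can flag similar binding sites even when the overall proteins differ.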
Geographical markers for Saccharomyces cerevisiae strains with similar technological origins domesticated for rice-based ethnic fermented beverages production in North East India.

PubMed

Jeyaram, Kumaraswamy; Tamang, Jyoti Prakash; Capece, Angela; Romano, Patrizia

2011-11-01

Autochthonous strains of Saccharomyces cerevisiae from traditional starters used for the production of rice-based ethnic fermented beverage in North East India were examined for their genetic polymorphism using mitochondrial DNA-RFLP and electrophoretic karyotyping. Mitochondrial DNA-RFLP analysis of S. cerevisiae strains with similar technological origins from hamei starter of Manipur and marcha starter of Sikkim revealed widely separated clusters based on their geographical origin. Electrophoretic karyotyping showed high polymorphism amongst the hamei strains within similar mitochondrial DNA-RFLP cluster and one unique karyotype of marcha strain was widely distributed in the Sikkim-Himalayan region. We conceptualized the possibility of separate domestication events for hamei strains in Manipur (located in the Indo-Burma biodiversity hotspot) and marcha strains in Sikkim (located in Himalayan biodiversity hotspot), as a consequence of less homogeneity in the genomic structure between these two groups, their clear separation being based on geographical origin, but not on technological origin and low strain level diversity within each group. The molecular markers developed based on HinfI-mtDNA-RFLP profile and the chromosomal doublets in chromosome VIII position of Sikkim-Himalayan strains could be effectively used as geographical markers for authenticating the above starter strains and differentiating them from other commercial strains.
Niche segregation among sympatric Amazonian teiid lizards.

PubMed

Vitt, L J; Sartorius, S S; Avila-Pires, T C S; Espósito, M C; Miles, D B

2000-02-01

We examined standard niche axes (time, place, and food) for three sympatric teiid lizards in the Amazon rain forest. Activity times during the day were similar among species. Ameiva ameiva were in more open microhabitats and had higher body temperatures compared with the two species of Kentropyx. Microhabitat overlaps were low and not significantly different from simulations based on Monte Carlo analysis. Grasshoppers, crickets, and spiders were important in the diets of all three species and many relatively abundant prey were infrequently eaten (e.g., ants). Dietary overlaps were most similar between the two species of Kentropyx even though microhabitat overlaps were relatively low. A Monte Carlo analysis on prey types revealed that dietary overlaps were higher at all ranks than simulated overlaps indicating that use of prey is not random. Although prey size was correlated with lizard body size, there were no species differences in adjusted prey size. A. ameiva ate more prey items at a given body size than either species of Kentropyx. Body size varies among species, with A. ameiva being the largest and K. altamazonica the smallest. The two species of Kentropyx are most distant morphologically, with A. ameiva intermediate. The most distant species morphologically are the most similar in terms of prey types. A morphological analysis including 15 species from four genera revealed patterns of covariation that reflected phylogenetic affinities (i.e., taxonomic patterns are evident). A cluster analysis revealed that A. ameiva, K. pelviceps, and K. altamazonica were in the same morphological group and that within that group, A. ameiva differed from the rest of the species. In addition, K. pelviceps and K. altamazonica were distinguishable from other species of Kentropyx based on morphology.

Flight Versus Ground Out-of-hospital Rapid Sequence Intubation Success: a Systematic Review and Meta-analysis.

PubMed

Fouche, Pieter F; Stein, Christopher; Simpson, Paul; Carlson, Jestin N; Zverinova, Kristina M; Doi, Suhail A

2018-01-29

Endotracheal intubation (ETI) is a critical procedure performed by both air medical and ground based emergency medical services (EMS). Previous work has suggested that ETI success rates are greater for air medical providers. However, air medical providers may have greater airway experience, enhanced airway education, and access to alternative ETI options such as rapid sequence intubation (RSI). We sought to analyze the impact of the type of EMS on RSI success. A systematic literature search of Medline, Embase, and the Cochrane Library was conducted and eligibility, data extraction, and assessment of risk of bias were assessed independently by two reviewers. A bias-adjusted meta-analysis using a quality-effects model was conducted for the primary outcomes of overall intubation success and first-pass intubation success. Forty-nine studies were included in the meta-analysis. There was no difference in the overall success between flight and ground based EMS; 97% (95% CI 96-98) vs. 98% (95% CI 91-100), and no difference in first-pass success for flight compared to ground based RSI; 82% (95% CI 73-89) vs. 82% (95% CI 70-93). Compared to flight non-physicians, flight physicians have higher overall success 99% (95% CI 98-100) vs. 96% (95% CI 94-97) and first-pass success 89% (95% CI 77-98) vs. 71% (95% CI 57-84). Ground-based physicians and non-physicians have a similar overall success 98% (95% CI 88-100) vs. 98% (95% CI 95-100), but no analysis for physician ground first pass was possible. Both overall and first-pass success of RSI did not differ between flight and road based EMS. Flight physicians have a higher overall and first-pass success compared to flight non-physicians and all ground based EMS, but no such differences are seen for ground EMS. Our results suggest that ground EMS can use RSI with similar outcomes compared to their flight counterparts.
High order cell-centered scheme totally based on cell average

NASA Astrophysics Data System (ADS)

Liu, Ze-Yu; Cai, Qing-Dong

2018-05-01

This work clarifies the concept of cell average by pointing out the differences between cell average and cell centroid value, which are the averaged cell-centered value and the pointwise cell-centered value, respectively. Interpolation based on cell averages is constructed, and a high order QUICK-like numerical scheme is designed for such interpolation. A new approach to error analysis, similar to Taylor's expansion, is introduced in this work.
Formation of dehydroalanine from mimosine and cysteine: artifacts in gas chromatography/mass spectrometry based metabolomics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kim, Young-Mo; Metz, Thomas O.; Hu, Zeping

2011-08-15

Trimethylsilylation is a chemical derivatization procedure routinely applied in gas chromatography-mass spectrometry (GC-MS)-based metabolomics. In this report, through de novo structural elucidation and comparison with authentic standards, we demonstrate that mimosine can be completely converted into dehydroalanine and 3,4-dihydroxypyridine during the trimethylsilylating process. Similarly, dehydroalanine can be formed from derivatization of cysteine. This conversion is a potential interference in GC-MS-based global metabolomics, as well as in analysis of amino acids.
Development and Validation of Methods for Applying Pharmacokinetic Data in Risk Assessment. Volume 3. Tetrachloroethylene

DTIC Science & Technology

1990-12-01

Armstrong Aerospace Medical Research Laboratory, Wright-Patterson Air Force Base, and Drs. Melvin Andersen and Michael Gargas, formerly with the Harry G... based on the arterial blood concentration surrogate were more similar to those derived in the traditional manner than were the estimates based on... pharmacokinetic modeling. Prepared by Office of Risk Analysis, Oak Ridge National Laboratory, Oak Ridge, Tennessee. Prepared under Contract No. DE-AC05-84
SibRank: Signed bipartite network analysis for neighbor-based collaborative ranking

NASA Astrophysics Data System (ADS)

Shams, Bita; Haratizadeh, Saman

2016-09-01

Collaborative ranking is an emerging field of recommender systems that utilizes users' preference data rather than rating values. Unfortunately, neighbor-based collaborative ranking has gained little attention despite its greater flexibility and justifiability. This paper proposes a novel framework, called SibRank, that seeks to improve on state-of-the-art neighbor-based collaborative ranking methods. SibRank represents users' preferences as a signed bipartite network and finds similar users through a novel personalized ranking algorithm in signed networks.
Sequence Alignment to Predict Across Species Susceptibility ...

EPA Pesticide Factsheets

Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev
Saponin Profile of Wild Asparagus Species.

PubMed

Jaramillo-Carmona, Sara; Rodriguez-Arcos, Rocío; Jiménez-Araujo, Ana; López, Sergio; Gil, Juan; Moreno, Roberto; Guillén-Bejarano, Rafael

2017-03-01

The aim of this work was to study the saponin profiles from spears of different wild asparagus species in the context of their genetic diversity aside from geographical seed origin. They included Asparagus pseudoscaber Grecescu, Asparagus maritimus (L.) Mill., Asparagus brachyphyllus Turcz., Asparagus prostratus Dumort., and Asparagus officinalis L. The saponin analysis by LC-MS has shown that the saponin profile from wild asparagus is similar to that previously described for triguero asparagus from Huétor-Tájar landrace (triguero HT), which had not ever been reported in the edible part of asparagus. All the samples, except A. officinalis, were characterized by having saponins distinct from protodioscin, and the total saponin contents were 10-fold higher than those described for commercial hybrids of green asparagus. In particular, A. maritimus from different origins were rich in saponins previously found in triguero HT. These findings supported a previous suggestion, based on genetic analysis, about A. maritimus being the origin of triguero HT. Multivariate statistics including principal component analysis and hierarchical clustering analysis were used to define both similarities and differences among samples. The results showed that the greatest variance of the tested wild asparagus could be attributed to differences in the concentration of particular saponins and this knowledge could be a tool for identifying similar species. © 2017 Institute of Food Technologists®.
4-D spatiotemporal analysis of ultrasound contrast agent dispersion for prostate cancer localization: a feasibility study.

PubMed

Schalk, Stefan G; Demi, Libertario; Smeenge, Martijn; Mills, David M; Wallace, Kirk D; de la Rosette, Jean J M C H; Wijkstra, Hessel; Mischi, Massimo

2015-05-01

Currently, nonradical treatment for prostate cancer is hampered by the lack of reliable diagnostics. Contrast-ultrasound dispersion imaging (CUDI) has recently shown great potential as a prostate cancer imaging technique. CUDI estimates the local dispersion of intravenously injected contrast agents, imaged by transrectal dynamic contrast-enhanced ultrasound (DCE-US), to detect angiogenic processes related to tumor growth. The best CUDI results have so far been obtained by similarity analysis of the contrast kinetics in neighboring pixels. To date, CUDI has been investigated in 2-D only. In this paper, an implementation of 3-D CUDI based on spatiotemporal similarity analysis of 4-D DCE-US is described. Different from 2-D methods, 3-D CUDI permits analysis of the entire prostate using a single injection of contrast agent. To perform 3-D CUDI, a new strategy was designed to estimate the similarity in the contrast kinetics at each voxel, and data processing steps were adjusted to the characteristics of 4-D DCE-US images. The technical feasibility of 4-D DCE-US in 3-D CUDI was assessed and confirmed. Additionally, in a preliminary validation in two patients, dispersion maps by 3-D CUDI were quantitatively compared with those by 2-D CUDI and with 12-core systematic biopsies with promising results.
Identification among morphologically similar Argyreia (Convolvulaceae) based on leaf anatomy and phenetic analyses.

PubMed

Traiperm, Paweena; Chow, Janene; Nopun, Possathorn; Staples, G; Swangpol, Sasivimon C

2017-12-01

The genus Argyreia Lour. is one of the species-rich Asian genera in the family Convolvulaceae. Several species complexes were recognized in which taxon delimitation was imprecise, especially when examining herbarium materials without fully developed open flowers. The main goal of this study is to investigate and describe leaf anatomy for some morphologically similar Argyreia using epidermal peeling, leaf and petiole transverse sections, and scanning electron microscopy. Phenetic analyses including cluster analysis and principal component analysis were used to investigate the similarity of these morpho-types. Anatomical differences observed between the morpho-types include epidermal cell walls and the trichome types on the leaf epidermis. Additional differences in the leaf and petiole transverse sections include the epidermal cell shape of the adaxial leaf blade, the leaf margins, and the petiole transverse sectional outline. The phenogram from cluster analysis using the UPGMA method represented four groups with an R value of 0.87. Moreover, the important quantitative and qualitative leaf anatomical traits of the four groups were confirmed by the principal component analysis of the first two components. The results from phenetic analyses confirmed the anatomical differentiation between the morpho-types. Leaf anatomical features regarded as particularly informative for morpho-type differentiation can be used to supplement macro morphological identification.
Conveyor Performance based on Motor DC 12 Volt Eg-530ad-2f using K-Means Clustering

NASA Astrophysics Data System (ADS)

Arifin, Zaenal; Artini, Sri DP; Much Ibnu Subroto, Imam

2017-04-01

In industrial production, controlled equipment is needed to improve output, and separation has become part of the production process. Separation is carried out according to defined criteria to obtain an optimum result, and knowing the performance characteristics of the controlled equipment used in separation makes such a result easier to reach. Cluster analysis is a popular method for dividing data into smaller segments: it partitions a set of objects into k groups whose members have homogeneous or similar values, with similarity within a group defined by chosen criteria. The work in this paper uses the K-Means method to cluster the loading performance of a conveyor driven by a 12 volt DC motor (eg-530-2f). The technique gives a complete clustering of the data for a prototype conveyor, driven by the DC motor, that separates goods by height. The parameters involved are voltage, current, and travel time. These parameters yield two clusters: an optimal cluster centered at 10.50 volt, 0.3 ampere, and 10.58 second, and a non-optimal cluster centered at 10.88 volt, 0.28 ampere, and 40.43 second.
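The K-Means step can be sketched as follows. This is a minimal illustration of Lloyd's algorithm with k = 2, not the authors' code: the readings are invented (voltage in volt, current in ampere, travel time in second), the starting centers are picked by hand, and the features are left unscaled, so travel time dominates the Euclidean distance.

```python
import math


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its members
        new_centers = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else centers[k]
            for k, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters


# Hypothetical conveyor readings: (voltage, current, travel time)
readings = [
    (10.4, 0.31, 10.2), (10.6, 0.29, 11.0), (10.5, 0.30, 10.6),
    (10.9, 0.28, 40.0), (10.8, 0.27, 41.0), (10.9, 0.29, 40.2),
]
centers, clusters = kmeans(readings, [readings[0], readings[3]])
```

In practice the features would usually be standardized before clustering so that voltage and current contribute as much as travel time; whether the original study did so is not stated in the abstract.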
Identification of (R)-selective ω-aminotransferases by exploring evolutionary sequence space.

PubMed

Kim, Eun-Mi; Park, Joon Ho; Kim, Byung-Gee; Seo, Joo-Hyun

2018-03-01

Several (R)-selective ω-aminotransferases (R-ωATs) have been reported. The existence of additional R-ωATs having different sequence characteristics from previous ones is highly expected. In addition, it is generally accepted that R-ωATs are variants of aminotransferase group III. Based on these backgrounds, sequences in RefSeq database were scored using family profiles of branched-chain amino acid aminotransferase (BCAT) and d-alanine aminotransferase (DAT) to predict and identify putative R-ωATs. Sequences with two profile analysis scores were plotted on two-dimensional score space. Candidates with relatively similar scores in both BCAT and DAT profiles (i.e., profile analysis score using BCAT profile was similar to profile analysis score using DAT profile) were selected. Experimental results for selected candidates showed that putative R-ωATs from Saccharopolyspora erythraea (R-ωAT_Sery), Bacillus cellulosilyticus (R-ωAT_Bcel), and Bacillus thuringiensis (R-ωAT_Bthu) had R-ωAT activity. Additional experiments revealed that R-ωAT_Sery also possessed DAT activity while R-ωAT_Bcel and R-ωAT_Bthu had BCAT activity. Selecting putative R-ωATs from regions with similar profile analysis scores identified potential R-ωATs. Therefore, R-ωATs could be efficiently identified by using simple family profile analysis and exploring evolutionary sequence space. Copyright © 2017 Elsevier Inc. All rights reserved.
24 CFR 965.406 - Benefit/cost analysis for similar projects.

Code of Federal Regulations, 2010 CFR

2010-04-01

... 24 Housing and Urban Development 4 2010-04-01 2010-04-01 false Benefit/cost analysis for similar... Existing PHA-Owned Projects § 965.406 Benefit/cost analysis for similar projects. PHAs with more than one project of similar design and utilities service may prepare a benefit/cost analysis for a representative...
Using a high-dimensional graph of semantic space to model relationships among words

PubMed Central

Jackson, Alice F.; Bolger, Donald J.

2014-01-01

The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD). PMID:24860525
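The idea of reading semantic similarity off shared first-order connections can be sketched as below. This is an illustrative toy, not the GOLD implementation: the three-sentence corpus is invented, the window is one adjacent word (loosely in the spirit of smallGOLD), and normalizing the intersection by the union (a Jaccard ratio) is an assumption made here for the example.

```python
from collections import defaultdict


def build_cooccurrence_graph(sentences, window=1):
    """Map each word to the set of words seen within `window` positions of it."""
    neighbors = defaultdict(set)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            for other in tokens[max(0, i - window): i + window + 1]:
                if other != word:
                    neighbors[word].add(other)
    return neighbors


def neighbor_overlap(graph, w1, w2):
    """Shared first-order neighbors divided by all neighbors of either word."""
    a, b = graph.get(w1, set()), graph.get(w2, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy corpus
corpus = ["the cat drinks milk", "the dog drinks water", "the cat chases the dog"]
graph = build_cooccurrence_graph(corpus)
directly_linked = "dog" in graph["cat"]            # associative (first-order) link
shared = neighbor_overlap(graph, "cat", "dog")     # overlap of neighbor sets
```

In this toy corpus "cat" and "dog" never appear next to each other, so they have no direct edge, yet they share most of their neighbors; that is the kind of second-order signal the abstract hypothesizes for semantic similarity, as opposed to the direct edges expected to carry associative relatedness.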
Using a high-dimensional graph of semantic space to model relationships among words.

PubMed

Jackson, Alice F; Bolger, Donald J

2014-01-01

The GOLD model (Graph Of Language Distribution) is a network model constructed based on co-occurrence in a large corpus of natural language that may be used to explore what information may be present in a graph-structured model of language, and what information may be extracted through theoretically-driven algorithms as well as standard graph analysis methods. The present study will employ GOLD to examine two types of relationship between words: semantic similarity and associative relatedness. Semantic similarity refers to the degree of overlap in meaning between words, while associative relatedness refers to the degree to which two words occur in the same schematic context. It is expected that a graph structured model of language constructed based on co-occurrence should easily capture associative relatedness, because this type of relationship is thought to be present directly in lexical co-occurrence. However, it is hypothesized that semantic similarity may be extracted from the intersection of the set of first-order connections, because two words that are semantically similar may occupy similar thematic or syntactic roles across contexts and thus would co-occur lexically with the same set of nodes. Two versions of the GOLD model that differed in terms of the co-occurrence window, bigGOLD at the paragraph level and smallGOLD at the adjacent word level, were directly compared to the performance of a well-established distributional model, Latent Semantic Analysis (LSA). The superior performance of the GOLD models (big and small) suggests that a single acquisition and storage mechanism, namely co-occurrence, can account for associative and conceptual relationships between words and is more psychologically plausible than models using singular value decomposition (SVD).
MotionFlow: Visual Abstraction and Aggregation of Sequential Patterns in Human Motion Tracking Data.

PubMed

Jang, Sujin; Elmqvist, Niklas; Ramani, Karthik

2016-01-01

Pattern analysis of human motions, which is useful in many research areas, requires understanding and comparison of different styles of motion patterns. However, working with human motion tracking data to support such analysis poses great challenges. In this paper, we propose MotionFlow, a visual analytics system that provides an effective overview of various motion patterns based on an interactive flow visualization. This visualization formulates a motion sequence as transitions between static poses, and aggregates these sequences into a tree diagram to construct a set of motion patterns. The system also allows the users to directly reflect the context of data and their perception of pose similarities in generating representative pose states. We provide local and global controls over the partition-based clustering process. To support the users in organizing unstructured motion data into pattern groups, we designed a set of interactions that enables searching for similar motion sequences from the data, detailed exploration of data subsets, and creating and modifying the group of motion patterns. To evaluate the usability of MotionFlow, we conducted a user study with six researchers with expertise in gesture-based interaction design. They used MotionFlow to explore and organize unstructured motion tracking data. Results show that the researchers were able to easily learn how to use MotionFlow, and the system effectively supported their pattern analysis activities, including leveraging their perception and domain knowledge.
Characterization of distinct classes of differential gene expression in osteoblast cultures from non-syndromic craniosynostosis bone.

PubMed

Rojas-Peña, Monica L; Olivares-Navarrete, Rene; Hyzy, Sharon; Arafat, Dalia; Schwartz, Zvi; Boyan, Barbara D; Williams, Joseph; Gibson, Greg

2014-01-01

Craniosynostosis, the premature fusion of one or more skull sutures, occurs in approximately 1 in 2500 infants, with the majority of cases non-syndromic and of unknown etiology. Two common reasons proposed for premature suture fusion are abnormal compression forces on the skull and rare genetic abnormalities. Our goal was to evaluate whether different sub-classes of disease can be identified based on total gene expression profiles. RNA-Seq data were obtained from 31 human osteoblast cultures derived from bone biopsy samples collected between 2009 and 2011, representing 23 craniosynostosis fusions and 8 normal cranial bones or long bones. No differentiation between regions of the skull was detected, but variance component analysis of gene expression patterns nevertheless supports transcriptome-based classification of craniosynostosis. Cluster analysis showed 4 distinct groups of samples; 1 predominantly normal and 3 craniosynostosis subtypes. Similar constellations of sub-types were also observed upon re-analysis of a similar dataset of 199 calvarial osteoblast cultures. Annotation of gene function of differentially expressed transcripts strongly implicates physiological differences with respect to cell cycle and cell death, stromal cell differentiation, extracellular matrix (ECM) components, and ribosomal activity. Based on these results, we propose non-syndromic craniosynostosis cases can be classified by differences in their gene expression patterns and that these may provide targets for future clinical intervention.
Characterization of Distinct Classes of Differential Gene Expression in Osteoblast Cultures from Non-Syndromic Craniosynostosis Bone

PubMed Central

Rojas-Peña, Monica L.; Olivares-Navarrete, Rene; Hyzy, Sharon; Arafat, Dalia; Schwartz, Zvi; Boyan, Barbara D.; Williams, Joseph; Gibson, Greg

2014-01-01

Craniosynostosis, the premature fusion of one or more skull sutures, occurs in approximately 1 in 2500 infants, with the majority of cases non-syndromic and of unknown etiology. Two common reasons proposed for premature suture fusion are abnormal compression forces on the skull and rare genetic abnormalities. Our goal was to evaluate whether different sub-classes of disease can be identified based on total gene expression profiles. RNA-Seq data were obtained from 31 human osteoblast cultures derived from bone biopsy samples collected between 2009 and 2011, representing 23 craniosynostosis fusions and 8 normal cranial bones or long bones. No differentiation between regions of the skull was detected, but variance component analysis of gene expression patterns nevertheless supports transcriptome-based classification of craniosynostosis. Cluster analysis showed 4 distinct groups of samples; 1 predominantly normal and 3 craniosynostosis subtypes. Similar constellations of sub-types were also observed upon re-analysis of a similar dataset of 199 calvarial osteoblast cultures. Annotation of gene function of differentially expressed transcripts strongly implicates physiological differences with respect to cell cycle and cell death, stromal cell differentiation, extracellular matrix (ECM) components, and ribosomal activity. Based on these results, we propose non-syndromic craniosynostosis cases can be classified by differences in their gene expression patterns and that these may provide targets for future clinical intervention. PMID:25184005
Toxmatch-a new software tool to aid in the development and evaluation of chemically similar groups.

PubMed

Patlewicz, G; Jeliazkova, N; Gallegos Saliner, A; Worth, A P

2008-01-01

Chemical similarity is a widely used concept in toxicology, and is based on the hypothesis that similar compounds should have similar biological activities. This forms the underlying basis for performing read-across, forming chemical groups and developing (Quantitative) Structure-Activity Relationships ((Q)SARs). Chemical similarity is often perceived as structural similarity but in fact there are a number of other approaches that can be used to assess similarity. A systematic similarity analysis usually comprises two main steps. Firstly the chemical structures to be compared need to be characterised in terms of relevant descriptors which encode their physicochemical, topological, geometrical and/or surface properties. A second step involves a quantitative comparison of those descriptors using similarity (or dissimilarity) indices. This work outlines the use of chemical similarity principles in the formation of endpoint specific chemical groupings. Examples are provided to illustrate the development and evaluation of chemical groupings using a new software application called Toxmatch that was recently commissioned by the European Chemicals Bureau (ECB), of the European Commission's Joint Research Centre. Insights from using this software are highlighted with specific focus on the prospective application of chemical groupings under the new chemicals legislation, REACH.
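The two-step procedure described above (characterise structures by descriptors, then compare the descriptors with a similarity index) can be sketched as follows. The Tanimoto coefficient on binary feature sets is used here only as one well-known similarity index; the abstract does not say which indices Toxmatch implements, and the feature identifiers below are hypothetical placeholders, not real descriptors.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of binary features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty descriptions are identical
    return len(a & b) / len(union)


# Step 1 (assumed already done): each compound is described by the set of
# structural features it contains. The feature names are hypothetical.
compound_x = {"aromatic_ring", "hydroxyl", "chloro"}
compound_y = {"aromatic_ring", "hydroxyl", "nitro"}

# Step 2: quantitative comparison of the descriptors.
print(tanimoto(compound_x, compound_y))
```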
Extending 'Deep Blue' aerosol retrieval coverage to cases of absorbing aerosols above clouds: sensitivity analysis and first case studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sayer, Andrew M.; Hsu, C.; Bettenhausen, Corey

Cases of absorbing aerosols above clouds (AAC), such as smoke or mineral dust, are omitted from most routinely-processed space-based aerosol optical depth (AOD) data products, including those from the Moderate Resolution Imaging Spectroradiometer (MODIS). This study presents a sensitivity analysis and preliminary algorithm to retrieve above-cloud AOD and liquid cloud optical depth (COD) for AAC cases from MODIS or similar
Regional L-Moment-Based Flood Frequency Analysis in the Upper Vistula River Basin, Poland

NASA Astrophysics Data System (ADS)

Rutkowska, A.; Żelazny, M.; Kohnová, S.; Łyp, M.; Banasik, K.

2017-02-01

The Upper Vistula River basin was divided into pooling groups with similar dimensionless frequency distributions of annual maximum river discharge. The cluster analysis and the Hosking and Wallis (HW) L-moment-based method were used to divide the set of 52 mid-sized catchments into disjoint clusters with similar morphometric, land use, and rainfall variables, and to test the homogeneity within clusters. Finally, three and four pooling groups were obtained alternatively. Two methods for identification of the regional distribution function were used, the HW method and the method of Kjeldsen and Prosdocimi based on a bivariate extension of the HW measure. Subsequently, the flood quantile estimates were calculated using the index flood method. The ordinary least squares (OLS) and the generalised least squares (GLS) regression techniques were used to relate the index flood to catchment characteristics. Predictive performance of the regression scheme for the southern part of the Upper Vistula River basin was improved by using GLS instead of OLS. The results of the study can be recommended for the estimation of flood quantiles at ungauged sites, in flood risk mapping applications, and in engineering hydrology to help design flood protection structures.
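The L-moment statistics underlying this kind of regional analysis can be computed from an annual maximum series as in the sketch below. It uses the standard unbiased probability-weighted-moment estimators; the function name and the sample discharges are hypothetical, and the homogeneity test and distribution selection of the Hosking and Wallis procedure are not reproduced here.

```python
def sample_l_moments(values):
    """Return (l1, L-CV, L-skewness) from unbiased probability weighted moments."""
    x = sorted(values)
    n = len(x)
    if n < 3:
        raise ValueError("at least three observations are required")
    # b_r = (1/n) * sum over sorted x of [(j-1)...(j-r)] / [(n-1)...(n-r)] * x_(j),
    # written with the zero-based index i = j - 1.
    b0 = sum(x) / n
    b1 = sum(i * x[i] for i in range(n)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * x[i] for i in range(n)) / (n * (n - 1) * (n - 2))
    l1 = b0                      # L-location (mean)
    l2 = 2 * b1 - b0             # L-scale
    l3 = 6 * b2 - 6 * b1 + b0    # third L-moment
    return l1, l2 / l1, l3 / l2  # mean, L-CV, L-skewness


# Hypothetical annual maximum discharges (m^3/s) for one catchment.
annual_maxima = [120.0, 95.0, 210.0, 160.0, 130.0, 340.0, 110.0, 180.0]
mean, l_cv, l_skew = sample_l_moments(annual_maxima)
print(mean, l_cv, l_skew)
```

Because L-CV and L-skewness are dimensionless, they can be compared across catchments of different size, which is what makes pooling sites into groups meaningful.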

SMMRNA: a database of small molecule modulators of RNA

PubMed Central

Mehta, Ankita; Sonam, Surabhi; Gouri, Isha; Loharch, Saurabh; Sharma, Deepak K.; Parkesh, Raman

2014-01-01

We have developed SMMRNA, an interactive database, available at http://www.smmrna.org, with special focus on small molecule ligands targeting RNA. Currently, SMMRNA consists of ∼770 unique ligands along with structural images of RNA molecules. Each ligand in SMMRNA contains information such as Kd, Ki, IC50, ΔTm, molecular weight (MW), hydrogen donor and acceptor count, XlogP, number of rotatable bonds, number of aromatic rings and 2D and 3D structures. These parameters can be explored using text search, advanced search, substructure and similarity-based analysis tools that are embedded in SMMRNA. A structure editor is provided for 3D visualization of ligands. Advanced analysis can be performed using substructure and OpenBabel-based chemical similarity fingerprints. An upload facility for both RNA and ligands is also provided. The physicochemical properties of the ligands were further examined using OpenBabel descriptors, hierarchical clustering, binning partition and multidimensional scaling. We have also generated a 3D conformation database of ligands to support the structure and ligand-based screening. SMMRNA provides a comprehensive resource for further design, development and refinement of small molecule modulators for selective targeting of RNA molecules. PMID:24163098
A break-even analysis of a community rehabilitation falls prevention service.

PubMed

Comans, Tracy; Brauer, Sandy; Haines, Terry

2009-06-01

To identify and compare the minimum number of clients that a multidisciplinary falls prevention service delivered through domiciliary or centre-based care needs to treat to allow the service to reach a 'break-even' point. A break-even analysis was undertaken for each of two models of care for a multidisciplinary community rehabilitation falls prevention service. The two models comprised either a centre-based group exercise and education program or a similar program delivered individually in the client's home. The service consisted of a physiotherapist, occupational therapist and therapy assistant. The participants were adults aged over 65 years who had experienced previous falls. Costs were based on the actual cost of running a community rehabilitation team located in Brisbane. Benefits were obtained by estimating the savings gained to society from the number of falls prevented by the program on the basis of the falls reduction rates obtained in similar multidisciplinary programs. It is estimated that a multi-disciplinary community falls prevention team would need to see 57 clients per year to make the service break-even using a centre-based model of care and 78 clients for a domiciliary-based model. The service this study was based on has the capability to see around 300 clients per year in a centre-based service or 200-250 clients per year in a home-based service. Based on the best available estimates of costs of falls, multidisciplinary falls prevention teams in the community targeting people at high risk of falls are worthwhile funding from a societal viewpoint.
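The break-even reasoning above reduces to dividing the annual cost of the service by the societal saving attributable to each client treated. The sketch below assumes a fixed annual service cost and a constant benefit per client; the function name and all monetary figures are hypothetical and are not taken from the study.

```python
import math


def break_even_clients(annual_service_cost, saving_per_fall_prevented,
                       falls_prevented_per_client):
    """Smallest whole number of clients whose benefit covers the annual cost."""
    benefit_per_client = saving_per_fall_prevented * falls_prevented_per_client
    if benefit_per_client <= 0:
        raise ValueError("benefit per client must be positive")
    return math.ceil(annual_service_cost / benefit_per_client)


# Hypothetical inputs, for illustration only.
clients_needed = break_even_clients(
    annual_service_cost=100_000,      # cost of running the team for one year
    saving_per_fall_prevented=5_000,  # societal cost avoided per fall
    falls_prevented_per_client=0.5,   # falls avoided per client treated
)
print(clients_needed)
```

Under these assumptions, a model of care with a higher cost per client or a lower falls reduction rate needs more clients to break even, which is the direction of the difference reported between the domiciliary and centre-based models.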
Body-Earth Mover's Distance: A Matching-Based Approach for Sleep Posture Recognition.

PubMed

Xu, Xiaowei; Lin, Feng; Wang, Aosen; Hu, Yu; Huang, Ming-Chun; Xu, Wenyao

2016-10-01

Sleep posture is a key component in sleep quality assessment and pressure ulcer prevention. Currently, body pressure analysis has been a popular method for sleep posture recognition. In this paper, a matching-based approach, Body-Earth Mover's Distance (BEMD), for sleep posture recognition is proposed. BEMD treats pressure images as weighted 2D shapes, and combines EMD and Euclidean distance for similarity measure. Compared with existing work, sleep posture recognition is achieved with posture similarity rather than multiple features for specific postures. A pilot study is performed with 14 persons for six different postures. The experimental results show that the proposed BEMD can achieve 91.21% accuracy, which outperforms the previous method with an improvement of 8.01%.
Multi-hazard Assessment and Scenario Toolbox (MhAST): A Framework for Analyzing Compounding Effects of Multiple Hazards

NASA Astrophysics Data System (ADS)

Sadegh, M.; Moftakhari, H.; AghaKouchak, A.

2017-12-01

Many natural hazards are driven by multiple forcing variables, and concurrent or consecutive extreme events significantly increase the risk of infrastructure/system failure. It is a common practice to use univariate analysis based upon a perceived ruling driver to estimate design quantiles and/or return periods of extreme events. A multivariate analysis, however, permits modeling simultaneous occurrence of multiple forcing variables. In this presentation, we introduce the Multi-hazard Assessment and Scenario Toolbox (MhAST) that comprehensively analyzes marginal and joint probability distributions of natural hazards. MhAST also offers a wide range of scenarios of return period and design levels and their likelihoods. The contribution of this study is four-fold: 1. comprehensive analysis of marginal and joint probability of multiple drivers through 17 continuous distributions and 26 copulas, 2. multiple scenario analysis of concurrent extremes based upon the most likely joint occurrence, one ruling variable, and weighted random sampling of joint occurrences with similar exceedance probabilities, 3. weighted average scenario analysis based on an expected event, and 4. uncertainty analysis of the most likely joint occurrence scenario using a Bayesian framework.
A self-similar hierarchy of the Korean stock market

NASA Astrophysics Data System (ADS)

Lim, Gyuchang; Min, Seungsik; Yoo, Kun-Woo

2013-01-01

A scaling analysis is performed on market values of stocks listed on Korean stock exchanges such as the KOSPI and the KOSDAQ. Different from previous studies on price fluctuations, market capitalizations are dealt with in this work. First, we show that the sum of the two stock exchanges shows a clear rank-size distribution, i.e., Zipf's law, just as each separate one does. Second, by abstracting Zipf's law as a γ-sequence, we define a self-similar hierarchy consisting of many levels, with the numbers of firms at each level forming a geometric sequence. We also use two exponential functions to describe the hierarchy and derive a scaling law from them. Lastly, we propose a self-similar hierarchical process and perform an empirical analysis on our data set. Based on our findings, we argue that all money invested in the stock market is distributed in a hierarchical way and that a slight difference exists between the two exchanges.
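A rank-size relation of the Zipf type, size ∝ rank^(−γ), can be checked by fitting a straight line to log(size) against log(rank). The sketch below uses ordinary least squares on the log-log values, which is a simple illustrative estimator and not necessarily the fitting procedure used in the study; the synthetic market capitalizations are hypothetical.

```python
import math


def zipf_exponent(sizes):
    """Estimate gamma in size ~ rank**(-gamma) by least squares in log-log space."""
    ranked = sorted(sizes, reverse=True)
    if len(ranked) < 2 or ranked[-1] <= 0:
        raise ValueError("need at least two strictly positive sizes")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope  # the fitted slope is negative for a decaying rank-size curve


# Hypothetical capitalizations that follow size = 1000 / rank exactly.
market_caps = [1000.0 / rank for rank in range(1, 51)]
print(zipf_exponent(market_caps))
```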
Statistical Inference for Porous Materials using Persistent Homology.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Moon, Chul; Heath, Jason E.; Mitchell, Scott A.

2017-12-01

We propose a porous materials analysis pipeline using persistent homology. We first compute persistent homology of binarized 3D images of sampled material subvolumes. For each image we compute sets of homology intervals, which are represented as summary graphics called persistence diagrams. We convert persistence diagrams into image vectors in order to analyze the similarity of the homology of the material images using the mature tools for image analysis. Each image is treated as a vector and we compute its principal components to extract features. We fit a statistical model using the loadings of principal components to estimate material porosity, permeability, anisotropy, and tortuosity. We also propose an adaptive version of the structural similarity index (SSIM), a similarity metric for images, as a measure to determine the statistical representative elementary volumes (sREV) for persistent homology. Thus we provide a capability for making a statistical inference of the fluid flow and transport properties of porous materials based on their geometry and connectivity.
Integrated analysis of drug-induced gene expression profiles predicts novel hERG inhibitors.

PubMed

Babcock, Joseph J; Du, Fang; Xu, Kaiping; Wheelan, Sarah J; Li, Min

2013-01-01

Growing evidence suggests that drugs interact with diverse molecular targets mediating both therapeutic and toxic effects. Prediction of these complex interactions from chemical structures alone remains challenging, as compounds with different structures may possess similar toxicity profiles. In contrast, predictions based on systems-level measurements of drug effect may reveal pharmacologic similarities not evident from structure or known therapeutic indications. Here we utilized drug-induced transcriptional responses in the Connectivity Map (CMap) to discover such similarities among diverse antagonists of the human ether-à-go-go related (hERG) potassium channel, a common target of promiscuous inhibition by small molecules. Analysis of transcriptional profiles generated in three independent cell lines revealed clusters enriched for hERG inhibitors annotated using a database of experimental measurements (hERGcentral) and clinical indications. As a validation, we experimentally identified novel hERG inhibitors among the unannotated drugs in these enriched clusters, suggesting transcriptional responses may serve as predictive surrogates of cardiotoxicity complementing existing functional assays.
Integrated Analysis of Drug-Induced Gene Expression Profiles Predicts Novel hERG Inhibitors

PubMed Central

Babcock, Joseph J.; Du, Fang; Xu, Kaiping; Wheelan, Sarah J.; Li, Min

2013-01-01

Growing evidence suggests that drugs interact with diverse molecular targets mediating both therapeutic and toxic effects. Prediction of these complex interactions from chemical structures alone remains challenging, as compounds with different structures may possess similar toxicity profiles. In contrast, predictions based on systems-level measurements of drug effect may reveal pharmacologic similarities not evident from structure or known therapeutic indications. Here we utilized drug-induced transcriptional responses in the Connectivity Map (CMap) to discover such similarities among diverse antagonists of the human ether-à-go-go related (hERG) potassium channel, a common target of promiscuous inhibition by small molecules. Analysis of transcriptional profiles generated in three independent cell lines revealed clusters enriched for hERG inhibitors annotated using a database of experimental measurements (hERGcentral) and clinical indications. As a validation, we experimentally identified novel hERG inhibitors among the unannotated drugs in these enriched clusters, suggesting transcriptional responses may serve as predictive surrogates of cardiotoxicity complementing existing functional assays. PMID:23936032
Removing Grit During Wastewater Treatment: CFD Analysis of HDVS Performance.

PubMed

Meroney, Robert N; Sheker, Robert E

2016-05-01

Computational Fluid Dynamics (CFD) was used to simulate the grit and sand separation effectiveness of a typical hydrodynamic vortex separator (HDVS) system. The analysis examined the influences on the separator efficiency of: flow rate, fluid viscosities, total suspended solids (TSS), and particle size and distribution. It was found that separator efficiency for a wide range of these independent variables could be consolidated into a few curves based on the particle fall velocity to separator inflow velocity ratio, Ws/Vin. Based on CFD analysis it was also determined that systems of different sizes with length scale ratios ranging from 1 to 10 performed similarly when Ws/Vin and TSS were held constant. The CFD results have also been compared to a limited range of experimental data.
Networks of plants: how to measure similarity in vegetable species.

PubMed

Vivaldo, Gianna; Masi, Elisa; Pandolfi, Camilla; Mancuso, Stefano; Caldarelli, Guido

2016-06-07

Despite the common misconception of nearly static organisms, plants do interact continuously with the environment and with each other. It is fair to assume that during their evolution they developed particular features to overcome similar problems and to exploit possibilities from the environment. In this paper we introduce various quantitative measures, based on recent advancements in complex network theory, that make it possible to measure the effective similarities of various species. By using this approach on the similarity in fruit-typology ecological traits we obtain a clear plant classification in a way similar to traditional taxonomic classification. This result is not trivial, since a similar analysis done on the basis of diaspore morphological properties does not provide any clear parameter to classify plant species. Complex network theory can then be used in order to determine which feature amongst many can be used to distinguish scope and possibly evolution of plants. Future uses of this approach range from functional classification to quantitative determination of plant communities in nature.
Genetic analysis of biodegradation of tetralin by a Sphingomonas strain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hernaez, M.J.; Santero, E.; Reineke, W.

Tetralin (1,2,3,4-tetrahydronaphthalene) is produced for industrial purposes from naphthalene by catalytic hydrogenation or from anthracene by cracking. A strain designated TFA which very efficiently utilizes tetralin has been isolated from the Rhine river. The strain has been identified as Sphingomonas macrogoltabidus, based on 16S rDNA sequence similarity. Genetic analysis of tetralin biodegradation has been performed by insertion mutagenesis and by physical analysis and analysis of complementation between the mutants. The genes involved in tetralin utilization are clustered in a region of 9 kb, comprising at least five genes grouped in two divergently transcribed operons.
Identification of Bacillus Probiotics Isolated from Soil Rhizosphere Using 16S rRNA, recA, rpoB Gene Sequencing and RAPD-PCR.

PubMed

Mohkam, Milad; Nezafat, Navid; Berenjian, Aydin; Mobasher, Mohammad Ali; Ghasemi, Younes

2016-03-01

Some Bacillus species, especially the Bacillus subtilis and Bacillus pumilus groups, have highly similar 16S rRNA gene sequences, which makes them hard to distinguish by 16S rDNA sequence analysis. To overcome this drawback, rpoB and recA sequence analysis along with randomly amplified polymorphic DNA (RAPD) fingerprinting was examined as an alternative method for differentiating Bacillus species. The 16S rRNA, rpoB and recA genes were amplified via a polymerase chain reaction using their specific primers. The resulting PCR amplicons were sequenced, and phylogenetic analysis was performed with MEGA 6 software. Identification based on 16S rRNA gene sequencing was underpinned by rpoB and recA gene sequencing as well as the RAPD-PCR technique. Subsequently, concatenation and phylogenetic analysis showed that the extent of diversity and similarity was better captured by rpoB and recA primers, a result also reinforced by RAPD-PCR methods. However, these approaches failed to identify one isolate, an issue that was resolved by combining them with the phenotypical method. Overall, RAPD fingerprinting, rpoB and recA along with concatenated gene sequence analysis discriminated closely related Bacillus species, which highlights the significance of the multigenic method in more precisely distinguishing Bacillus strains. This research emphasizes the benefit of RAPD fingerprinting and rpoB and recA sequence analysis over 16S rRNA gene sequence analysis for suitable and effective identification of Bacillus species as recommended for probiotic products.
Liquid chromatography tandem mass spectrometry determination of chemical markers and principal component analysis of Vitex agnus-castus L. fruits (Verbenaceae) and derived food supplements.

PubMed

Mari, Angela; Montoro, Paola; Pizza, Cosimo; Piacente, Sonia

2012-11-01

A validated analytical method for the quantitative determination of seven chemical markers occurring in a hydroalcoholic extract of Vitex agnus-castus fruits by liquid chromatography electrospray triple quadrupole tandem mass spectrometry (LC/ESI/(QqQ)MSMS) is reported. To carry out a comparative study, five commercial food supplements corresponding to hydroalcoholic extracts of V. agnus-castus fruits were analysed under the same chromatographic conditions of the crude extract. Principal component analysis (PCA), based only on the variation of the amount of the seven chemical markers, was applied in order to find similarities between the hydroalcoholic extract and the food supplements. A second PCA analysis was carried out considering the whole spectroscopic data deriving from liquid chromatography electrospray linear ion trap mass spectrometry (LC/ESI/(LIT)MS) analysis. High similarity between the two PCA was observed, showing the possibility to select one of these two approaches for future applications in the field of comparative analysis of food supplements and quality control procedures. Copyright © 2012 Elsevier B.V. All rights reserved.
Network neighborhood analysis with the multi-node topological overlap measure.

PubMed

Li, Ai; Horvath, Steve

2007-01-15

The goal of neighborhood analysis is to find a set of genes (the neighborhood) that is similar to an initial 'seed' set of genes. Neighborhood analysis methods for network data are important in systems biology. If individual network connections are susceptible to noise, it can be advantageous to define neighborhoods on the basis of a robust interconnectedness measure, e.g. the topological overlap measure. Since the use of multiple nodes in the seed set may lead to more informative neighborhoods, it can be advantageous to define multi-node similarity measures. The pairwise topological overlap measure is generalized to multiple network nodes and subsequently used in a recursive neighborhood construction method. A local permutation scheme is used to determine the neighborhood size. Using four network applications and a simulated example, we provide empirical evidence that the resulting neighborhoods are biologically meaningful, e.g. we use neighborhood analysis to identify brain cancer related genes. An executable Windows program and tutorial for multi-node topological overlap measure (MTOM) based analysis can be downloaded from the webpage (http://www.genetics.ucla.edu/labs/horvath/MTOM/).
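For orientation, the pairwise topological overlap measure that the paper generalizes can be sketched for an unweighted, undirected network. The formula below is the commonly used pairwise form, (shared neighbors + a_ij) / (min(k_i, k_j) + 1 − a_ij); the multi-node generalization and the recursive neighborhood construction are not reproduced, and the small adjacency matrix is a hypothetical example.

```python
def topological_overlap(adj):
    """Pairwise topological overlap for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    # Node degree, ignoring any self-connection on the diagonal.
    degree = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                tom[i][j] = 1.0
                continue
            # Number of neighbors that nodes i and j have in common.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            tom[i][j] = (shared + adj[i][j]) / (
                min(degree[i], degree[j]) + 1 - adj[i][j])
    return tom


# Hypothetical 4-gene network: a triangle (0, 1, 2) plus gene 3 attached to gene 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
tom = topological_overlap(adjacency)
print(tom[0][1], tom[0][3])
```

Genes 0 and 3 are not directly connected but still receive a non-zero overlap through their shared neighbor, which illustrates why the measure is less sensitive to a single missing or spurious connection than the raw adjacency.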
Development and evaluation of an automatic labeling technique for spring small grains

NASA Technical Reports Server (NTRS)

Crist, E. P.; Malila, W. A. (Principal Investigator)

1981-01-01

A labeling technique is described which seeks to associate a sampling entity with a particular crop or crop group based on similarity of growing season and temporal-spectral patterns of development. Human analysts provide contextual information, after which labeling decisions are made automatically. Results of a test of the technique on a large, multi-year data set are reported. Grain labeling accuracies are similar to those achieved by human analysis techniques, while non-grain accuracies are lower. Recommendations for improvements and implications of the test results are discussed.
Stockholm Syndrome and Child Sexual Abuse

ERIC Educational Resources Information Center

Julich, Shirley

2005-01-01

This article, based on an analysis of unstructured interviews, identifies that the emotional bond between survivors of child sexual abuse and the people who perpetrated the abuse against them is similar to that of the powerful bi-directional relationship central to Stockholm Syndrome as described by Graham (1994). Aspects of Stockholm Syndrome…
14 CFR 35.36 - Bird impact.

Code of Federal Regulations, 2013 CFR

2013-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.36 Bird impact. The applicant must demonstrate, by tests or analysis based on tests or experience on similar designs, that the propeller can withstand the impact of a... without causing a major or hazardous propeller effect. This section does not apply to fixed-pitch wood...
14 CFR 35.36 - Bird impact.

Code of Federal Regulations, 2014 CFR

2014-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.36 Bird impact. The applicant must demonstrate, by tests or analysis based on tests or experience on similar designs, that the propeller can withstand the impact of a... without causing a major or hazardous propeller effect. This section does not apply to fixed-pitch wood...
14 CFR 35.36 - Bird impact.

Code of Federal Regulations, 2012 CFR

2012-01-01

... STANDARDS: PROPELLERS Tests and Inspections § 35.36 Bird impact. The applicant must demonstrate, by tests or analysis based on tests or experience on similar designs, that the propeller can withstand the impact of a... without causing a major or hazardous propeller effect. This section does not apply to fixed-pitch wood...
Characterization of the Serralysin-like gene of 'Ca. Liberibacter solanacearum' associated with Potato Zebra Chip disease

USDA-ARS's Scientific Manuscript database

The non-culturable bacterium ‘Candidatus Liberibacter solanacearum’ (Lso) is the causative agent of zebra chip disease in potato. Computational analysis of the Lso genome revealed a serralysin-like gene based on conserved domains characteristic of genes encoding metalloprotease enzymes similar to se...

14 CFR 25.562 - Emergency landing dynamic conditions.

Code of Federal Regulations, 2010 CFR

2010-01-01

....562 Emergency landing dynamic conditions. (a) The seat and restraint system in the airplane must be... 14 Aeronautics and Space 1 2010-01-01 2010-01-01 false Emergency landing dynamic conditions. 25... successfully complete dynamic tests or be demonstrated by rational analysis based on dynamic tests of a similar...
Host specificity and phylogenetic relationships of chicken and turkey parvoviruses

USDA-ARS's Scientific Manuscript database

Previous reports indicate that the newly discovered chicken parvoviruses (ChPV) and turkey parvoviruses (TuPV) are very similar to each other, yet they represent different species within a new genus of Parvoviridae. Currently, strain classification is based on the phylogenetic analysis of a 561 bas...
FTree query construction for virtual screening: a statistical analysis.

PubMed

Gerlach, Christof; Broughton, Howard; Zaliani, Andrea

2008-02-01

FTrees (FT) is a known chemoinformatic tool able to condense molecular descriptions into a graph object and to search for actives in large databases using graph similarity. The query graph is classically derived from a known active molecule, or a set of actives, for which a similar compound has to be found. Recently, FT similarity has been extended to fragment space, widening its capabilities. If a user were able to build a knowledge-based FT query from information other than a known active structure, the similarity search could be combined with other, normally separate, fields like de-novo design or pharmacophore searches. With this aim in mind, we performed a comprehensive analysis of several databases in terms of FT description and provide a basic statistical analysis of the FT spaces so far at hand. Vendors' catalogue collections and MDDR as a source of potential or known "actives", respectively, have been used. With the results reported herein, a set of ranges, mean values and standard deviations for several query parameters are presented in order to set a reference guide for the users. Applications on how to use this information in FT query building are also provided, using a newly built 3D-pharmacophore from 57 5HT-1F agonists and a published one which was used for virtual screening for tRNA-guanine transglycosylase (TGT) inhibitors.
FTree query construction for virtual screening: a statistical analysis

NASA Astrophysics Data System (ADS)

Gerlach, Christof; Broughton, Howard; Zaliani, Andrea

2008-02-01

FTrees (FT) is a known chemoinformatic tool able to condense molecular descriptions into a graph object and to search for actives in large databases using graph similarity. The query graph is classically derived from a known active molecule, or a set of actives, for which a similar compound has to be found. Recently, FT similarity has been extended to fragment space, widening its capabilities. If a user were able to build a knowledge-based FT query from information other than a known active structure, the similarity search could be combined with other, normally separate, fields like de-novo design or pharmacophore searches. With this aim in mind, we performed a comprehensive analysis of several databases in terms of FT description and provide a basic statistical analysis of the FT spaces so far at hand. Vendors' catalogue collections and MDDR as a source of potential or known "actives", respectively, have been used. With the results reported herein, a set of ranges, mean values and standard deviations for several query parameters are presented in order to set a reference guide for the users. Applications on how to use this information in FT query building are also provided, using a newly built 3D-pharmacophore from 57 5HT-1F agonists and a published one which was used for virtual screening for tRNA-guanine transglycosylase (TGT) inhibitors.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family.

PubMed

Danisman, Selahattin; van Dijk, Aalt D J; Bimbo, Andrea; van der Wal, Froukje; Hennig, Lars; de Folter, Stefan; Angenent, Gerco C; Immink, Richard G H

2013-12-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein-protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein-protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family.
Analysis of functional redundancies within the Arabidopsis TCP transcription factor family

PubMed Central

Danisman, Selahattin; de Folter, Stefan; Immink, Richard G. H.

2013-01-01

Analyses of the functions of TEOSINTE-LIKE1, CYCLOIDEA, and PROLIFERATING CELL FACTOR1 (TCP) transcription factors have been hampered by functional redundancy between its individual members. In general, putative functionally redundant genes are predicted based on sequence similarity and confirmed by genetic analysis. In the TCP family, however, identification is impeded by relatively low overall sequence similarity. In a search for functionally redundant TCP pairs that control Arabidopsis leaf development, this work performed an integrative bioinformatics analysis, combining protein sequence similarities, gene expression data, and results of pair-wise protein–protein interaction studies for the 24 members of the Arabidopsis TCP transcription factor family. For this, the work completed any lacking gene expression and protein–protein interaction data experimentally and then performed a comprehensive prediction of potential functional redundant TCP pairs. Subsequently, redundant functions could be confirmed for selected predicted TCP pairs by genetic and molecular analyses. It is demonstrated that the previously uncharacterized class I TCP19 gene plays a role in the control of leaf senescence in a redundant fashion with TCP20. Altogether, this work shows the power of combining classical genetic and molecular approaches with bioinformatics predictions to unravel functional redundancies in the TCP transcription factor family. PMID:24129704
Markov model plus k-word distributions: a synergy that produces novel statistical measures for sequence comparison.

PubMed

Dai, Qi; Yang, Yanchun; Wang, Tianming

2008-10-15

Many proposed statistical measures can efficiently compare biological sequences to further infer their structures, functions and evolutionary information. They are related in spirit because all the ideas for sequence comparison try to use the information on the k-word distributions, Markov model or both. Motivated by adding k-word distributions to Markov model directly, we investigated two novel statistical measures for sequence comparison, called wre.k.r and S2.k.r. The proposed measures were tested by similarity search, evaluation on functionally related regulatory sequences and phylogenetic analysis. This offers a systematic and quantitative experimental assessment of our measures. Moreover, we compared our results with those of alignment-based and alignment-free methods. We grouped our experiments into two sets. The first one, performed via ROC (receiver operating characteristic) analysis, aims at assessing the intrinsic ability of our statistical measures to search for similar sequences from a database and discriminate functionally related regulatory sequences from unrelated sequences. The second one aims at assessing how well our statistical measures perform in phylogenetic analysis. The experimental assessment demonstrates that our similarity measures, which incorporate k-word distributions into the Markov model, are more efficient.
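As a minimal illustration of the k-word idea mentioned in the abstract above, the sketch below counts overlapping k-words in two sequences and compares the resulting frequency vectors with a plain Euclidean distance. It is not the wre.k.r or S2.k.r measure from the paper, whose definitions are not given here; the function names `kword_freqs` and `kword_distance` are hypothetical.

```python
from collections import Counter
from math import sqrt


def kword_freqs(seq, k):
    """Return relative frequencies of all overlapping k-words in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def kword_distance(seq_a, seq_b, k=2):
    """Euclidean distance between k-word frequency vectors (illustrative only)."""
    fa, fb = kword_freqs(seq_a, k), kword_freqs(seq_b, k)
    words = set(fa) | set(fb)  # union of k-words seen in either sequence
    return sqrt(sum((fa.get(w, 0.0) - fb.get(w, 0.0)) ** 2 for w in words))
```

Identical sequences give a distance of zero, and sequences that share no k-words give the largest distances, which is the basic behaviour any alignment-free k-word measure builds on.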
An ASIC-chip for stereoscopic depth analysis in video-real-time based on visual cortical cell behavior.

PubMed

Wörgötter, F

1999-10-01

In a stereoscopic system both eyes or cameras have a slightly different view. As a consequence small variations between the projected images exist ("disparities") which are spatially evaluated in order to retrieve depth information. We will show that two related algorithmic versions can be designed which recover disparity. Both approaches are based on the comparison of filter outputs from filtering the left and the right image. The difference of the phase components between left and right filter responses encodes the disparity. One approach uses regular Gabor filters and computes the spatial phase differences in a conventional way as described already in 1988 by Sanger. Novel to this approach, however, is that we formulate it in a way which is fully compatible with neural operations in the visual cortex. The second approach uses the apparently paradoxical similarity between the analysis of visual disparities and the determination of the azimuth of a sound source. Animals determine the direction of the sound from the temporal delay between the left and right ear signals. Similarly, in our second approach we transpose the spatially defined problem of disparity analysis into the temporal domain and utilize two resonators implemented in the form of causal (electronic) filters to determine the disparity as local temporal phase differences between the left and right filter responses. This approach permits video real-time analysis of stereo image sequences (see movies at http://www.neurop.ruhr-uni-bochum.de/Real-Time-Stereo), and an FPGA-based PC-board has been developed which performs stereo-analysis at full PAL resolution in video real-time. An ASIC chip will be available in March 2000.
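The statement that the phase difference between left and right filter responses encodes the disparity can be illustrated with a one-dimensional toy example. The sketch below is an assumption-laden simplification, not the hardware or neural formulation of the paper: it applies a single complex Gabor filter to two sampled signals and divides the wrapped phase difference by the filter frequency. The names `gabor_response` and `phase_disparity`, the sign convention, and all parameter values are hypothetical.

```python
import cmath
import math


def gabor_response(signal, center, omega, sigma):
    """Complex Gabor filter response of a 1D signal at position `center`."""
    total = 0j
    for x, value in enumerate(signal):
        u = x - center
        envelope = math.exp(-u * u / (2.0 * sigma * sigma))  # Gaussian window
        total += value * envelope * cmath.exp(-1j * omega * u)
    return total


def phase_disparity(left, right, center, omega, sigma):
    """Estimate disparity as the local phase difference divided by omega."""
    r_left = gabor_response(left, center, omega, sigma)
    r_right = gabor_response(right, center, omega, sigma)
    # phase(r_left * conj(r_right)) is the phase difference wrapped to (-pi, pi].
    dphi = cmath.phase(r_left * r_right.conjugate())
    return dphi / omega


# Toy input (assumed): the right signal is the left signal shifted by 2 samples.
omega, sigma, shift = 0.5, 8.0, 2.0
left = [math.cos(omega * x) for x in range(128)]
right = [math.cos(omega * (x - shift)) for x in range(128)]
estimate = phase_disparity(left, right, center=64, omega=omega, sigma=sigma)
```

For this toy input the estimate should come out close to the 2-sample shift. Because the phase is wrapped, a single filter of frequency omega can only recover shifts smaller than pi / omega in magnitude.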
Semantic Similarity in Biomedical Ontologies

PubMed Central

Pesquita, Catia; Faria, Daniel; Falcão, André O.; Lord, Phillip; Couto, Francisco M.

2009-01-01

In recent years, ontologies have become a mainstream topic in biomedical research. When biological entities are described using a common schema, such as an ontology, they can be compared by means of their annotations. This type of comparison is called semantic similarity, since it assesses the degree of relatedness between two entities by the similarity in meaning of their annotations. The application of semantic similarity to biomedical ontologies is recent; nevertheless, several studies have been published in the last few years describing and evaluating diverse approaches. Semantic similarity has become a valuable tool for validating the results drawn from biomedical studies such as gene clustering, gene expression data analysis, prediction and validation of molecular interactions, and disease gene prioritization. We review semantic similarity measures applied to biomedical ontologies and propose their classification according to the strategies they employ: node-based versus edge-based and pairwise versus groupwise. We also present comparative assessment studies and discuss the implications of their results. We survey the existing implementations of semantic similarity measures, and we describe examples of applications to biomedical research. This will clarify how biomedical researchers can benefit from semantic similarity measures and help them choose the approach most suitable for their studies. Biomedical ontologies are evolving toward increased coverage, formality, and integration, and their use for annotation is increasingly becoming a focus of both effort by biomedical experts and application of automated annotation procedures to create corpora of higher quality and completeness than are currently available. Given that semantic similarity measures are directly dependent on these evolutions, we can expect to see them gaining more relevance and even becoming as essential as sequence similarity is today in biomedical research. PMID:19649320
Examining the relationship between pre- and postimplant geometry in prostate low-dose-rate brachytherapy and its correlation with dosimetric quality using the similarity concept.

PubMed

Todor, Dorin A; Anscher, Mitchell S; Karlin, Jeremy D; Hagan, Michael P

2014-01-01

This is a retrospective study in which we define multiple metrics for similarity and then inquire on the relationship between similarity and currently used dosimetric quantities describing preimplant and postimplant plans. We analyzed a unique cohort of 94 consecutively performed prostate seed implant patients, associated with excellent dosimetric and clinical outcomes. For each patient, an ultrasound (US) preimplant and two CT postimplant (Day 0 and Day 30) studies were available. Measures for similarity were created and computed using feature vectors based on two classes of moments: first, invariant to rotation and translation, and the second polar-radius moments invariant to rotation, translation, and scaling. Both similarity measures were calibrated using controlled perturbations (random and systematic) of seed positions and contours in different size implants, thus producing meaningful numerical threshold values used in the clinical analysis. An important finding is that similarity, for both seed distributions and contours, improves significantly when scaling invariance is added to translation and rotation. No correlation between seed and contours similarity was found. In the setting of preplanned prostate seed implants using preloaded needles, based on our data, similarity between preimplant and postimplant plans does not correlate with either minimum dose to 90% of the volume of the prostate or analogous similarity metrics for prostate contours. We have developed novel tools and metrics, which will allow practitioners to better understand the relationship between preimplant and postimplant plans. Geometrical similarity between a preplan and an actual implant, although useful, does not seem to be necessary to achieve minimum dose to 90% of the volume of the prostate-good dosimetric implants. Copyright © 2014 American Brachytherapy Society. All rights reserved.
Empirical analysis of web-based user-object bipartite networks

NASA Astrophysics Data System (ADS)

Shang, Ming-Sheng; Lü, Linyuan; Zhang, Yi-Cheng; Zhou, Tao

2010-05-01

Understanding the structure and evolution of web-based user-object networks is a significant task since they play a crucial role in e-commerce nowadays. This letter reports the empirical analysis on two large-scale web sites, audioscrobbler.com and del.icio.us, where users are connected with music groups and bookmarks, respectively. The degree distributions and degree-degree correlations for both users and objects are reported. We propose a new index, named collaborative similarity, to quantify the diversity of tastes based on the collaborative selection. Accordingly, the correlation between degree and selection diversity is investigated. We report some novel phenomena well characterizing the selection mechanism of web users and outline the relevance of these phenomena to the information recommendation problem.
Algorithm of reducing the false positives in IDS based on correlation Analysis

NASA Astrophysics Data System (ADS)

Liu, Jianyi; Li, Sida; Zhang, Ru

2018-03-01

This paper proposes an algorithm for reducing false positives in IDS based on correlation analysis. First, the algorithm analyzes the distinguishing characteristics of false positives and real alarms and performs a preliminary screening of the false positives; it then applies attribute similarity clustering to the alarms, further reducing their number; finally, according to the characteristics of multi-step attacks, it associates the alarms by their causal relationships. The paper also proposes a reverse causation algorithm based on the attack association method proposed by predecessors, turning alarm information into a complete attack path. Experiments show that the algorithm reduces the number of alarms, improves the efficiency of alarm processing, and contributes to identifying attack purposes and improving alarm accuracy.
Selecting relevant 3D image features of margin sharpness and texture for lung nodule retrieval.

PubMed

Ferreira, José Raniery; de Azevedo-Marques, Paulo Mazzoncini; Oliveira, Marcelo Costa

2017-03-01

Lung cancer is the leading cause of cancer-related deaths in the world. Its diagnosis is a challenging task for specialists due to several aspects of the classification of lung nodules. Therefore, it is important to integrate content-based image retrieval methods into the lung nodule classification process, since they are capable of retrieving similar, previously diagnosed cases from databases. However, this mechanism depends on extracting relevant image features in order to obtain high efficiency. The goal of this paper is to perform the selection of 3D image features of margin sharpness and texture that can be relevant to the retrieval of similar cancerous and benign lung nodules. A total of 48 3D image attributes were extracted from the nodule volume. Border sharpness features were extracted from perpendicular lines drawn over the lesion boundary. Second-order texture features were extracted from a cooccurrence matrix. Relevant features were selected by a correlation-based method and a statistical significance analysis. Retrieval performance was assessed according to the nodule's potential malignancy on the 10 most similar cases and by the parameters of precision and recall. Statistically significant features reduced retrieval performance. The correlation-based method selected 2 margin sharpness attributes and 6 texture attributes and obtained higher precision compared to all 48 extracted features on similar nodule retrieval. Feature space dimensionality reduction of 83% obtained higher retrieval performance and proved to be a computationally low-cost method of retrieving similar nodules for the diagnosis of lung cancer.
A new strategy for statistical analysis-based fingerprint establishment: Application to quality assessment of Semen sojae praeparatum.

PubMed

Guo, Hui; Zhang, Zhen; Yao, Yuan; Liu, Jialin; Chang, Ruirui; Liu, Zhao; Hao, Hongyuan; Huang, Taohong; Wen, Jun; Zhou, Tingting

2018-08-30

Semen sojae praeparatum with homology of medicine and food is a famous traditional Chinese medicine. A simple and effective quality fingerprint analysis, coupled with chemometrics methods, was developed for quality assessment of Semen sojae praeparatum. First, similarity analysis (SA) and hierarchical clustering analysis (HCA) were applied to select the qualitative markers, which obviously influence the quality of Semen sojae praeparatum. 21 chemicals were selected and characterized by high resolution ion trap/time-of-flight mass spectrometry (LC-IT-TOF-MS). Subsequently, principal components analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were conducted to select the quantitative markers of Semen sojae praeparatum samples from different origins. Moreover, 11 compounds with statistical significance were determined quantitatively, which provided accurate and informative data for quality evaluation. This study proposes a new strategy for "statistic analysis-based fingerprint establishment", which would be a valuable reference for further study. Copyright © 2018 Elsevier Ltd. All rights reserved.
Mouse Vk gene classification by nucleic acid sequence similarity.

PubMed

Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

1989-01-01

Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.
Finding text in color images

NASA Astrophysics Data System (ADS)

Zhou, Jiangying; Lopresti, Daniel P.; Tasdizen, Tolga

1998-04-01

In this paper, we consider the problem of locating and extracting text from WWW images. A previous algorithm based on color clustering and connected components analysis works well as long as the color of each character is relatively uniform and the typography is fairly simple. It breaks down quickly, however, when these assumptions are violated. In this paper, we describe more robust techniques for dealing with this challenging problem. We present an improved color clustering algorithm that measures similarity based on both RGB and spatial proximity. Layout analysis is also incorporated to handle more complex typography. These changes significantly enhance the performance of our text detection procedure.
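One simple way to measure similarity on both RGB and spatial proximity is to fold the pixel position into the distance used for clustering. The sketch below is only an assumed form of such a distance, not the algorithm of the paper: the function name `rgb_spatial_distance` and the trade-off parameter `spatial_weight` are hypothetical.

```python
from math import sqrt


def rgb_spatial_distance(p, q, spatial_weight=0.5):
    """Distance between pixels p and q given as (x, y, r, g, b) tuples.

    spatial_weight is a hypothetical trade-off parameter, not from the paper:
    0 ignores position entirely, larger values favour spatially compact clusters.
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    dr, dg, db = p[2] - q[2], p[3] - q[3], p[4] - q[4]
    color_sq = dr * dr + dg * dg + db * db
    spatial_sq = dx * dx + dy * dy
    return sqrt(color_sq + spatial_weight * spatial_sq)
```

With a distance of this kind, two pixels of the same color that lie far apart are no longer treated as identical, which is one plausible way to keep separate characters of similar color from merging into one cluster.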
Differential spatial activity patterns of acupuncture by a machine learning based analysis

NASA Astrophysics Data System (ADS)

You, Youbo; Bai, Lijun; Xue, Ting; Zhong, Chongguang; Liu, Zhenyu; Tian, Jie

2011-03-01

Acupoint specificity, lying at the core of Traditional Chinese Medicine, underlies the theoretical basis of acupuncture application. However, recent studies have reported that acupuncture stimulation at nonacupoint and acupoint can both evoke similar signal intensity decreases in multiple regions, and that these regions overlap spatially. We used a machine learning based Support Vector Machine (SVM) approach to elucidate the specific neural response pattern induced by acupuncture stimulation. Group analysis demonstrated that stimulation at two different acupoints (belonging to the same nerve segment but different meridians) could elicit distinct neural response patterns. Our findings may provide evidence for acupoint specificity.
Design and optimization of liquid core optical ring resonator for refractive index sensing.

PubMed

Lin, Nai; Jiang, Lan; Wang, Sumei; Xiao, Hai; Lu, Yongfeng; Tsai, Hai-Lung

2011-07-10

This study performs a detailed theoretical analysis of refractive index (RI) sensors based on whispering gallery modes (WGMs) in liquid core optical ring resonators (LCORRs). Both TE- and TM-polarized WGMs of various orders are considered. The analysis shows that WGMs of higher orders need thicker walls to achieve a near-zero thermal drift, but WGMs of different orders exhibit a similar RI sensing performance at the thermostable wall thicknesses. The RI detection limit is very low at the thermostable thickness. The theoretical predictions should provide general guidance in the development of LCORR-based thermostable RI sensors. © 2011 Optical Society of America
EUGÈNE'HOM: a generic similarity-based gene finder using multiple homologous sequences

PubMed Central

Foissac, Sylvain; Bardou, Philippe; Moisan, Annick; Cros, Marie-Josée; Schiex, Thomas

2003-01-01

EUGÈNE'HOM is a gene prediction software for eukaryotic organisms based on comparative analysis. EUGÈNE'HOM is able to take into account multiple homologous sequences from more or less closely related organisms. It integrates the results of TBLASTX analysis, splice site and start codon prediction and a robust coding/non-coding probabilistic model which allows EUGÈNE'HOM to handle sequences from a variety of organisms. The current target of EUGÈNE'HOM is plant sequences. The EUGÈNE'HOM web site is available at http://genopole.toulouse.inra.fr/bioinfo/eugene/EuGeneHom/cgi-bin/EuGeneHom.pl. PMID:12824408
Elastic, Frictional, Strength and Dynamic Characteristics of the Bell Shape Shock Absorbers Made of MR Wire Material

NASA Astrophysics Data System (ADS)

Lazutkin, G. V.; Davydov, D. P.; Boyarov, K. V.; Volkova, T. V.

2018-01-01

The results of the mechanical characteristic experimental studies are presented for the shock absorbers of DKU type with the elastic elements of the bell shape made of MR material and obtained by the cold pressing of mutually crossing wire spirals with their inclusion in the array of reinforcing wire harnesses. The design analysis and the technology of MR production based on the methods of similarity theory and dimensional analysis revealed the dimensionless determined and determining parameters of elastic frictional, dynamic and strength characteristics under the static and dynamic loading of vibration isolators. The main similarity criteria of mechanical characteristics for vibration isolators and their graphical and analytical representation are determined, taking into account the coefficients of these (affine) transformations of the hysteresis loop family field.

Promoting Metacognition in Introductory Calculus-based Physics Labs

NASA Astrophysics Data System (ADS)

Grennell, Drew; Boudreaux, Andrew

2010-10-01

In the Western Washington University physics department, a project is underway to develop research-based laboratory curriculum for the introductory calculus-based course. Instructional goals not only include supporting students' conceptual understanding and reasoning ability, but also providing students with opportunities to engage in metacognition. For the latter, our approach has been to scaffold reflective thinking with guided questions. Specific instructional strategies include analysis of alternate reasoning presented in fictitious dialogues and comparison of students' initial ideas with their lab group's final, consensus understanding. Assessment of student metacognition includes pre- and post-course data from selected questions on the CLASS survey, analysis of written lab worksheets, and student opinion surveys. CLASS results are similar to those of a traditional physics course, and analysis of lab sheets shows that students struggle to engage in a metacognitive process. Future directions include video studies, as well as use of additional written assessments adapted from educational psychology.
Location and Size Planning of Distributed Photovoltaic Generation in Distribution network System Based on K-means Clustering Analysis

NASA Astrophysics Data System (ADS)

Lu, Siqi; Wang, Xiaorong; Wu, Junyong

2018-01-01

The paper presents a method, based on a data-driven K-means clustering analysis algorithm, to generate planning scenarios for the location and size planning of distributed photovoltaic (PV) units in the network. Taking the power losses of the network, the installation and maintenance costs of distributed PV, the profit of distributed PV and the voltage offset as objectives, and the locations and sizes of distributed PV as decision variables, a Pareto optimal front is obtained through a self-adaptive genetic algorithm (GA), and solutions are ranked by the technique for order preference by similarity to an ideal solution (TOPSIS). Finally, after detailed analysis, the planning schemes at the top of the ranking list are selected according to different planning emphases. The proposed method is applied to a 10-kV distribution network in Gansu Province, China, and the results are discussed.
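The TOPSIS ranking step named in the abstract above can be sketched in a few lines. This is the textbook form of TOPSIS with vector normalisation, not necessarily the exact variant used in the paper; the function name `topsis`, the weights, and the example numbers are assumptions made for illustration.

```python
from math import sqrt


def topsis(matrix, weights, benefit):
    """Return TOPSIS closeness scores for each alternative (row).

    matrix  : list of rows, one per alternative, one column per criterion
    weights : criterion weights (assumed to sum to 1)
    benefit : True where larger is better, False where smaller is better
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Vector-normalise each column, then apply the weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    # Ideal and anti-ideal points, criterion by criterion.
    cols = list(zip(*v))
    ideal = [max(c) if b else min(c) for c, b in zip(cols, benefit)]
    worst = [min(c) if b else max(c) for c, b in zip(cols, benefit)]
    scores = []
    for row in v:
        d_best = sqrt(sum((x - s) ** 2 for x, s in zip(row, ideal)))
        d_worst = sqrt(sum((x - s) ** 2 for x, s in zip(row, worst)))
        scores.append(d_worst / (d_best + d_worst))  # closeness in [0, 1]
    return scores


# Hypothetical candidate plans: (power loss, cost, profit); only profit is a benefit.
plans = [[1.0, 10.0, 5.0], [2.0, 20.0, 3.0], [1.5, 15.0, 4.0]]
scores = topsis(plans, weights=[0.4, 0.3, 0.3], benefit=[False, False, True])
ranking = sorted(range(len(plans)), key=lambda i: scores[i], reverse=True)
```

A score near 1 means the alternative is close to the ideal point and far from the anti-ideal point; in the made-up example the first plan is best on every criterion and therefore ranks first.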
Cloud GIS Based Watershed Management

NASA Astrophysics Data System (ADS)

Bediroğlu, G.; Colak, H. E.

2017-11-01

In this study, we built a Cloud GIS based watershed management system using a Cloud Computing architecture. Cloud GIS is used as SAAS (Software as a Service) and DAAS (Data as a Service). We ran GIS analyses on the cloud to test SAAS and deployed GIS datasets on the cloud to test DAAS. We used a hybrid cloud computing model, drawing on ready-made web based mapping services hosted on the cloud (World Topology, Satellite Imageries). After creating geodatabases covering Hydrology (Rivers, Lakes), Soil Maps, Climate Maps, Rain Maps, Geology and Land Use, we uploaded them to the system. The watershed of the study area was determined on the cloud using ready-hosted topology maps. After uploading all the datasets to the system, we applied various GIS analyses and queries. Results show that Cloud GIS technology brings speed and efficiency to watershed management studies. In addition, the system can easily be implemented for similar land analysis and management studies.
Robust demarcation of basal cell carcinoma by dependent component analysis-based segmentation of multi-spectral fluorescence images.

PubMed

Kopriva, Ivica; Persin, Antun; Puizina-Ivić, Neira; Mirić, Lina

2010-07-02

This study was designed to demonstrate robust performance of the novel dependent component analysis (DCA)-based approach to demarcation of the basal cell carcinoma (BCC) through unsupervised decomposition of the red-green-blue (RGB) fluorescent image of the BCC. Robustness to intensity fluctuation is due to the scale invariance property of DCA algorithms, which exploit spectral and spatial diversities between the BCC and the surrounding tissue. The filtering-based DCA approach used here represents an extension of independent component analysis (ICA) and is necessary in order to account for the statistical dependence induced by spectral similarity between the BCC and surrounding tissue. This similarity generates weak edges, which represent a challenge for other segmentation methods as well. By comparative performance analysis with state-of-the-art image segmentation methods such as active contours (level set), K-means clustering, non-negative matrix factorization, ICA and ratio imaging, we experimentally demonstrate good performance of DCA-based BCC demarcation in two demanding scenarios where the intensity of the fluorescent image has been varied by almost two orders of magnitude. Copyright 2010 Elsevier B.V. All rights reserved.
Point-by-point compositional analysis for atom probe tomography.

PubMed

Stephenson, Leigh T; Ceguerra, Anna V; Li, Tong; Rojhirunsakool, Tanaporn; Nag, Soumya; Banerjee, Rajarshi; Cairney, Julie M; Ringer, Simon P

2014-01-01

This new alternate approach to data processing for analyses that traditionally employed grid-based counting methods is necessary because it removes a user-imposed coordinate system that not only limits an analysis but also may introduce errors. We have modified the widely used "binomial" analysis for APT data by replacing grid-based counting with coordinate-independent nearest neighbour identification, improving the measurements and the statistics obtained, allowing quantitative analysis of smaller datasets, and datasets from non-dilute solid solutions. It also allows better visualisation of compositional fluctuations in the data. Our modifications include: (i) using spherical k-atom blocks identified by each detected atom's first k nearest neighbours; (ii) 3D data visualisation of block composition and nearest neighbour anisotropy; and (iii) using z-statistics to directly compare experimental and expected composition curves. Similar modifications may be made to other grid-based counting analyses (contingency table, Langer-Bar-on-Miller, sinusoidal model) and could be instrumental in developing novel data visualisation options.
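The replacement of grid-based counting by nearest-neighbour blocks can be pictured with a small brute-force sketch. It is a toy under stated assumptions rather than the authors' implementation: the function name `knn_block_compositions` is hypothetical, the block is assumed to exclude the central atom, and a real APT dataset would need a spatial index instead of the quadratic search used here.

```python
from math import dist


def knn_block_compositions(atoms, k, species):
    """For each atom, return the fraction of `species` among its k nearest neighbours.

    atoms : list of ((x, y, z), label) tuples; requires k <= len(atoms) - 1
    """
    fractions = []
    for i, (pos_i, _) in enumerate(atoms):
        # Brute-force neighbour search; fine for a toy example only.
        others = [(dist(pos_i, pos_j), label)
                  for j, (pos_j, label) in enumerate(atoms) if j != i]
        others.sort(key=lambda t: t[0])  # nearest first
        block = others[:k]
        fractions.append(sum(1 for _, label in block if label == species) / k)
    return fractions
```

Each atom gets its own block, so the resulting composition values do not depend on where a counting grid happens to be placed.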
Predicate Oriented Pattern Analysis for Biomedical Knowledge Discovery

PubMed Central

Shen, Feichen; Liu, Hongfang; Sohn, Sunghwan; Larson, David W.; Lee, Yugyung

2017-01-01

In the current biomedical data movement, numerous efforts have been made to convert and normalize large amounts of traditional structured and unstructured data (e.g., EHRs, reports) to semi-structured data (e.g., RDF, OWL). With the increasing amount of semi-structured data coming into the biomedical community, data integration and knowledge discovery from heterogeneous domains have become important research problems. At the application level, detection of related concepts among medical ontologies is an important goal of life science research. It is more crucial to figure out how different concepts are related within a single ontology or across multiple ontologies by analysing predicates in different knowledge bases. However, the world today is one of information explosion, and it is extremely difficult for biomedical researchers to find existing or potential predicates to perform linking among cross domain concepts without any support from schema pattern analysis. Therefore, there is a need for a mechanism to do predicate oriented pattern analysis to partition heterogeneous ontologies into closer small topics and do query generation to discover cross domain knowledge from each topic. In this paper, we present such a model, which performs predicate oriented pattern analysis based on the close relationships between predicates and generates a similarity matrix. Based on this similarity matrix, we apply an innovative unsupervised learning algorithm to partition large data sets into smaller and closer topics and generate meaningful queries to fully discover knowledge over a set of interlinked data sources. We have implemented a prototype system named BmQGen and evaluated the proposed model with a colorectal surgical cohort from the Mayo Clinic. PMID:28983419
Comparison of a High-Resolution Melting Assay to Next-Generation Sequencing for Analysis of HIV Diversity

PubMed Central

Cousins, Matthew M.; Ou, San-San; Wawer, Maria J.; Munshaw, Supriya; Swan, David; Magaret, Craig A.; Mullis, Caroline E.; Serwadda, David; Porcella, Stephen F.; Gray, Ronald H.; Quinn, Thomas C.; Donnell, Deborah; Eshleman, Susan H.

2012-01-01

Next-generation sequencing (NGS) has recently been used for analysis of HIV diversity, but this method is labor-intensive, costly, and requires complex protocols for data analysis. We compared diversity measures obtained using NGS data to those obtained using a diversity assay based on high-resolution melting (HRM) of DNA duplexes. The HRM diversity assay provides a single numeric score that reflects the level of diversity in the region analyzed. HIV gag and env from individuals in Rakai, Uganda, were analyzed in a previous study using NGS (n = 220 samples from 110 individuals). Three sequence-based diversity measures were calculated from the NGS sequence data (percent diversity, percent complexity, and Shannon entropy). The amplicon pools used for NGS were analyzed with the HRM diversity assay. HRM scores were significantly associated with sequence-based measures of HIV diversity for both gag and env (P < 0.001 for all measures). The level of diversity measured by the HRM diversity assay and NGS increased over time in both regions analyzed (P < 0.001 for all measures except for percent complexity in gag), and similar amounts of diversification were observed with both methods (P < 0.001 for all measures except for percent complexity in gag). Diversity measures obtained using the HRM diversity assay were significantly associated with those from NGS, and similar increases in diversity over time were detected by both methods. The HRM diversity assay is faster and less expensive than NGS, facilitating rapid analysis of large studies of HIV diversity and evolution. PMID:22785188
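Of the three sequence-based diversity measures listed in the abstract above, Shannon entropy is the easiest to show in code. The sketch below computes it from the frequencies of distinct sequence variants in a sample; the log base, any normalisation, and the function name `shannon_entropy` are assumptions, since the abstract does not give the exact definition used in the study.

```python
from collections import Counter
from math import log


def shannon_entropy(variants):
    """Shannon entropy (natural log) of the variant frequency distribution.

    variants : non-empty list of sequence reads or haplotype strings
    """
    counts = Counter(variants)
    total = sum(counts.values())
    # H = -sum(p * ln p) over the observed variant frequencies p.
    return -sum((n / total) * log(n / total) for n in counts.values())
```

A sample containing a single variant has zero entropy, and the value grows as more variants appear at more even frequencies, which is why entropy serves as a diversity score.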
A web server for analysis, comparison and prediction of protein ligand binding sites.

PubMed

Singh, Harinder; Srivastava, Hemant Kumar; Raghava, Gajendra P S

2016-03-25

One of the major challenges in the field of systems biology is to understand the interaction between a wide range of proteins and ligands. In the past, methods have been developed for predicting binding sites in a protein for a limited number of ligands. In order to address this problem, we developed a web server named 'LPIcom' to help users understand protein-ligand interaction. Analysis, comparison and prediction modules are available in the 'LPIcom' server to predict protein-ligand interacting residues for 824 ligands. Each ligand must have at least 30 protein binding sites in PDB. The analysis module of the server can identify residues preferred in interaction and the binding motif for a given ligand; for example, the residues glycine, lysine and arginine are preferred in ATP binding sites. The comparison module of the server allows comparing protein-binding sites of multiple ligands to understand the similarity between ligands based on their binding sites. This module indicates that ATP, ADP and GTP ligands are in the same cluster and thus their binding sites or interacting residues exhibit a high level of similarity. A propensity-based prediction module has been developed for predicting ligand-interacting residues in a protein for more than 800 ligands. In addition, a number of web-based tools have been integrated to help users create web logos and two-sample logos between ligand-interacting and non-interacting residues. In summary, this manuscript presents a web server for analysis of ligand-interacting residues. This server is available for public use from URL http://crdd.osdd.net/raghava/lpicom.
Significant impact of amount of PCR input templates on various PCR-based DNA methylation analysis and countermeasure.

PubMed

Liu, Zhaojun; Zhou, Jing; Gu, Liankun; Deng, Dajun

2016-08-30

Methylation changes of CpG islands can be determined using PCR-based assays. However, the exact impact of the amount of input templates (TAIT) on DNA methylation analysis has not been previously recognized. Using COL2A1 gene as an input reference, TAIT difference between human tissues with methylation-positive and -negative detection was calculated for two representative genes GFRA1 and P16. Results revealed that TAIT in GFRA1 methylation-positive frozen samples (n = 332) was significantly higher than the methylation-negative ones (n = 44) (P < 0.001). Similar difference was found in P16 methylation analysis. The TAIT-related effect was also observed in methylation-specific PCR (MSP) and denatured high performance liquid chromatography (DHPLC) analysis. Further study showed that the minimum TAIT for a successful MethyLight PCR reaction should be ≥ 9.4 ng (CtCOL2A1 ≤ 29.3), when the cutoff value of the methylated-GFRA1 proportion for methylation-positive detection was set at 1.6%. After TAIT of the methylation non-informative frozen samples (n = 94; CtCOL2A1 > 29.3) was increased above the minimum TAIT, the methylation-positive rate increased from 72.3% to 95.7% for GFRA1 and 26.6% to 54.3% for P16, respectively (Ps < 0.001). Similar results were observed in the FFPE samples. In conclusion, TAIT critically affects results of various PCR-based DNA methylation analyses. Characterization of the minimum TAIT for target CpG islands is essential to avoid false-negative results.
Providing a Theoretical Basis for Nanotoxicity Risk Analysis Departing from Traditional Physiologically-Based Pharmacokinetic (PBPK) Modeling

DTIC Science & Technology

2010-09-01

estimation of total exposure at any toxicological endpoint in the body. This effort is a significant contribution as it highlights future research needs...rigorous modeling of the nanoparticle transport by including physico-chemical properties of engineered particles. Similarly, toxicological dose-response...exposure risks as compared to larger sized particles of the same material. Although the toxicology of a base material may be thoroughly defined, the
Common disease signatures from gene expression analysis in Huntington's disease human blood and brain.

PubMed

Mina, Eleni; van Roon-Mom, Willeke; Hettne, Kristina; van Zwet, Erik; Goeman, Jelle; Neri, Christian; A C 't Hoen, Peter; Mons, Barend; Roos, Marco

2016-08-01

Huntington's disease (HD) is a devastating brain disorder with no effective treatment or cure available. The scarcity of brain tissue makes it hard to study changes in the brain and impossible to perform longitudinal studies. However, peripheral pathology in HD suggests that it is possible to study the disease using peripheral tissue as a monitoring tool for disease progression and/or efficacy of novel therapies. In this study, we investigated if blood can be used to monitor disease severity and progression in brain. Since previous attempts using only gene expression proved unsuccessful, we compared blood and brain Huntington's disease signatures in a functional context. Microarray HD gene expression profiles from three brain regions were compared to the transcriptome of HD blood generated by next generation sequencing. The comparison was performed with a combination of weighted gene co-expression network analysis and literature based functional analysis (Concept Profile Analysis). Uniquely, our comparison of blood and brain datasets was not based on (the very limited) gene overlap but on the similarity between the gene annotations in four different semantic categories: "biological process", "cellular component", "molecular function" and "disease or syndrome". We identified signatures in HD blood reflecting a broad pathophysiological spectrum, including alterations in the immune response, sphingolipid biosynthetic processes, lipid transport, cell signaling, protein modification, spliceosome, RNA splicing, vesicle transport, cell signaling and synaptic transmission. Part of this spectrum was reminiscent of the brain pathology. The HD signatures in caudate nucleus and BA4 exhibited the highest similarity with blood, irrespective of the category of semantic annotations used. BA9 exhibited an intermediate similarity, while cerebellum had the least similarity. 
We present two signatures that were shared between blood and brain: immune response and spinocerebellar ataxias. Our results demonstrate that HD blood exhibits dysregulation that is similar to brain at a functional level, but not necessarily at the level of individual genes. We report two common signatures that can be used to monitor the pathology in brain of HD patients in a non-invasive manner. Our results are an exemplar of how signals in blood data can be used to represent brain disorders. Our methodology can be used to study disease specific signatures in diseases where heterogeneous tissues are involved in the pathology.
Discovering relevance knowledge in data: a growing cell structures approach.

PubMed

Azuaje, F; Dubitzky, W; Black, N; Adamson, K

2000-01-01

Both information retrieval and case-based reasoning systems rely on effective and efficient selection of relevant data. Typically, relevance in such systems is approximated by similarity or indexing models. However, the definition of what makes data items similar or how they should be indexed is often nontrivial and time-consuming. Based on growing cell structure artificial neural networks, this paper presents a method that automatically constructs a case retrieval model from existing data. Within the case-based reasoning (CBR) framework, the method is evaluated for two medical prognosis tasks, namely, colorectal cancer survival and coronary heart disease risk prognosis. The results of the experiments suggest that the proposed method is effective and robust. To gain a deeper insight and understanding of the underlying mechanisms of the proposed model, a detailed empirical analysis of the model's structural and behavioral properties is also provided.
Similarity spectra analysis of high-performance jet aircraft noise.

PubMed

Neilsen, Tracianne B; Gee, Kent L; Wall, Alan T; James, Michael M

2013-04-01

Noise measured in the vicinity of an F-22A Raptor has been compared to similarity spectra found previously to represent mixing noise from large-scale and fine-scale turbulent structures in laboratory-scale jet plumes. Comparisons have been made for three engine conditions using ground-based sideline microphones, which covered a large angular aperture. Even though the nozzle geometry is complex and the jet is nonideally expanded, the similarity spectra do agree with large portions of the measured spectra. Toward the sideline, the fine-scale similarity spectrum is used, while the large-scale similarity spectrum provides a good fit to the area of maximum radiation. Combinations of the two similarity spectra are shown to match the data in between those regions. Surprisingly, a combination of the two is also shown to match the data at the farthest aft angle. However, at high frequencies the degree of congruity between the similarity and the measured spectra changes with engine condition and angle. At the higher engine conditions, there is a systematically shallower measured high-frequency slope, with the largest discrepancy occurring in the regions of maximum radiation.
Application of the SCALE TSUNAMI Tools for the Validation of Criticality Safety Calculations Involving 233U

DOE Office of Scientific and Technical Information (OSTI.GOV)

Mueller, Don; Rearden, Bradley T; Hollenbach, Daniel F

2009-02-01

The Radiochemical Development Facility at Oak Ridge National Laboratory has been storing solid materials containing 233U for decades. Preparations are under way to process these materials into a form that is inherently safe from a nuclear criticality safety perspective. This will be accomplished by down-blending the 233U materials with depleted or natural uranium. At the request of the U.S. Department of Energy, a study has been performed using the SCALE sensitivity and uncertainty analysis tools to demonstrate how these tools could be used to validate nuclear criticality safety calculations of selected process and storage configurations. ISOTEK nuclear criticality safety staff provided four models that are representative of the criticality safety calculations for which validation will be needed. The SCALE TSUNAMI-1D and TSUNAMI-3D sequences were used to generate energy-dependent k_eff sensitivity profiles for each nuclide and reaction present in the four safety analysis models, also referred to as the applications, and in a large set of critical experiments. The SCALE TSUNAMI-IP module was used together with the sensitivity profiles and the cross-section uncertainty data contained in the SCALE covariance data files to propagate the cross-section uncertainties (Δσ/σ) to k_eff uncertainties (Δk/k) for each application model. The SCALE TSUNAMI-IP module was also used to evaluate the similarity of each of the 672 critical experiments with each application. Results of the uncertainty analysis and similarity assessment are presented in this report. A total of 142 experiments were judged to be similar to application 1, and 68 experiments were judged to be similar to application 2. None of the 672 experiments were judged to be adequately similar to applications 3 and 4. Discussion of the uncertainty analysis and similarity assessment is provided for each of the four applications.
Example upper subcritical limits (USLs) were generated for application 1 based on trending of the energy of average lethargy of neutrons causing fission, trending of the TSUNAMI similarity parameters, and use of data adjustment techniques.
Genome analysis of Betanodavirus from cultured marine fish species in Malaysia.

PubMed

Ransangan, Julian; Manin, Benny Obrain

2012-04-23

Betanodavirus is the causative agent of the viral nervous necrosis (VNN) or viral encephalopathy and retinopathy disease in marine fish. This disease is responsible for most of the mass mortalities that occurred in marine fish hatcheries in Malaysia. The genome of this virus consists of two positive-sense RNA molecules which are the RNA1 and RNA2. The RNA1 molecule contains the RdRp gene which encodes for the RNA-dependent RNA polymerase and the RNA2 molecule contains the Cp gene which encodes for the viral coat protein. In this study, total RNAs were extracted from 32 fish specimens representing the four most cultured marine fish species in Malaysia. The fish specimens were collected from different hatcheries and aquaculture farms in Malaysia. The RNA1 was successfully amplified using three pairs of overlapping PCR primers whereas the RNA2 was amplified using a pair of primers. The nucleotide analysis of RdRp gene revealed that the Betanodavirus in Malaysia were 94.5-99.7% similar to the RGNNV genotype, 79.8-82.1% similar to SJNNV genotype, 81.5-82.4% similar to BFNNV genotype and 79.8-80.7% similar to TPNNV genotype. However, they showed lower similarities to FHV (9.4-14.2%) and BBV (7.2-15.7%), respectively. Similarly, the Cp gene revealed that the viruses showed high nucleotide similarity to RGNNV (95.9-99.8%), SJNNV (72.2-77.4%), BFNNV (80.9-83.5%), TPNNV (77.2-78.1%) and TNV (75.1-76.5%). However, as in the RdRp gene, the coat protein gene was highly dissimilar to FHV (3.0%) and BBV (2.6-4.1%), respectively. Based on the genome analysis, the Betanodavirus infecting cultured marine fish species in Malaysia belong to the RGNNV genotype. However, the phylogenetic analysis of the genes revealed that the viruses can be further divided into nine sub-groups. This has been expected since various marine fish species of different origins are cultured in Malaysia. Copyright © 2011 Elsevier B.V. All rights reserved.
Convective heat transfer in MHD slip flow over a stretching surface in the presence of carbon nanotubes

NASA Astrophysics Data System (ADS)

Ul Haq, Rizwan; Nadeem, Sohail; Khan, Z. H.; Noor, N. F. M.

2015-01-01

In the present study, thermal conductivity and viscosity of both single-wall and multiple-wall Carbon Nanotubes (CNT) within the base fluids (water, engine oil and ethylene glycol) of similar volume have been investigated when the fluid is flowing over a stretching surface. The magnetohydrodynamic (MHD) and viscous dissipation effects are also incorporated in the present phenomena. Experimental data consisting of the thermo-physical properties of each base fluid and CNT have been considered. The mathematical model has been constructed and, by employing a similarity transformation, the system of partial differential equations is transformed into a system of non-linear ordinary differential equations. The results of local skin friction and local Nusselt number are plotted for each base fluid by considering both Single Wall Carbon Nanotube (SWCNT) and Multiple-Wall Carbon Nanotubes (MWCNT). The behavior of fluid flow for water-based SWCNT and MWCNT is analyzed through streamlines. Concluding remarks have been drawn from the whole analysis, and it is found that engine oil-based CNT have higher skin friction and heat transfer rate as compared to water and ethylene glycol-based CNT.
Bacillus Strains Most Closely Related to Bacillus nealsonii Are Not Effectively Circumscribed within the Taxonomic Species Definition

PubMed Central

Peak, K. Kealy; Duncan, Kathleen E.; Luna, Vicki A.; King, Debra S.; McCarthy, Peter J.; Cannons, Andrew C.

2011-01-01

Bacillus strains with >99.7% 16S rRNA gene sequence similarity were characterized with DNA:DNA hybridization, cellular fatty acid (CFA) analysis, and testing of 100 phenotypic traits. When paired with the most closely related type strain, percent DNA:DNA similarities (% S) for six Bacillus strains were all far below the recommended 70% threshold value for species circumscription with Bacillus nealsonii. An apparent genomic group of four Bacillus strain pairings with 94%–70% S was contradicted by the failure of the strains to cluster in CFA- and phenotype-based dendrograms as well as by their differentiation with 9–13 species level discriminators such as nitrate reduction, temperature range, and acid production from carbohydrates. The novel Bacillus strains were monophyletic and very closely related based on 16S rRNA gene sequence. Coherent genomic groups were not however supported by similarly organized phenotypic clusters. Therefore, the strains were not effectively circumscribed within the taxonomic species definition. PMID:22046187
Lipid composition analysis of milk fats from different mammalian species: potential for use as human milk fat substitutes.

PubMed

Zou, Xiaoqiang; Huang, Jianhua; Jin, Qingzhe; Guo, Zheng; Liu, Yuanfa; Cheong, Lingzhi; Xu, Xuebing; Wang, Xingguo

2013-07-24

The lipid compositions of commercial milks from cow, buffalo, donkey, sheep, and camel were compared with that of human milk fat (HMF) based on total and sn-2 fatty acid, triacylglycerol (TAG), phospholipid, and phospholipid fatty acid compositions and melting and crystallization profiles, and their degrees of similarity were digitized and differentiated by an evaluation model. The results showed that these milk fats had high degrees of similarity to HMF in total fatty acid composition. However, the degrees of similarity in other chemical aspects were low, indicating that these milk fats did not meet the requirements of human milk fat substitutes (HMFSs). However, an economically feasible solution to make these milks useful as raw materials for infant formula production could be to modify these fats, and a possible method is blending of polyunsaturated fatty acids (PUFA) and 1,3-dioleoyl-2-palmitoylglycerol (OPO) enriched fats and minor lipids based on the corresponding chemical compositions of HMF.
Pansharpening on the Narrow VNIR and SWIR Spectral Bands of SENTINEL-2

NASA Astrophysics Data System (ADS)

Vaiopoulos, A. D.; Karantzalos, K.

2016-06-01

In this paper results from the evaluation of several state-of-the-art pansharpening techniques are presented for the VNIR and SWIR bands of Sentinel-2. A procedure for the pansharpening is also proposed which aims at respecting the closest spectral similarities between the higher and lower resolution bands. The evaluation included 21 different fusion algorithms and three evaluation frameworks based both on standard quantitative image similarity indexes and qualitative evaluation from remote sensing experts. The overall analysis of the evaluation results indicated that remote sensing experts disagreed with the outcomes and method ranking from the quantitative assessment. The employed image quality similarity indexes and quantitative evaluation framework based on both high and reduced resolution data from the literature didn't manage to highlight/evaluate mainly the spatial information that was injected to the lower resolution images. Regarding the SWIR bands none of the methods managed to deliver significantly better results than a standard bicubic interpolation on the original low resolution bands.
EnsembleGraph: Interactive Visual Analysis of Spatial-Temporal Behavior for Ensemble Simulation Data

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shu, Qingya; Guo, Hanqi; Che, Limei

We present a novel visualization framework, EnsembleGraph, for analyzing ensemble simulation data, in order to help scientists understand behavior similarities between ensemble members over space and time. A graph-based representation is used to visualize individual spatiotemporal regions with similar behaviors, which are extracted by hierarchical clustering algorithms. A user interface with multiple-linked views is provided, which enables users to explore, locate, and compare regions that have similar behaviors between ensemble members, and then investigate and analyze the selected regions in detail. The driving application of this paper is the study of regional emission influences over tropospheric ozone, which is based on ensemble simulations conducted with different anthropogenic emission absences using the MOZART-4 (model of ozone and related tracers, version 4) model. We demonstrate the effectiveness of our method by visualizing the MOZART-4 ensemble simulation data and evaluating the relative regional emission influences on tropospheric ozone concentrations. Positive feedback from domain experts and two case studies demonstrate the efficiency of our method.

Plant seed species identification from chemical fingerprints: a high-throughput application of direct analysis in real time mass spectrometry.

PubMed

Lesiak, Ashton D; Cody, Robert B; Dane, A John; Musah, Rabi A

2015-09-01

Plant species identification based on the morphological features of plant parts is a well-established science in botany. However, species identification from seeds has largely been unexplored, despite the fact that the seeds contain all of the genetic information that distinguishes one plant from another. Using seeds of genus Datura plants, we show here that the mass spectrum-derived chemical fingerprints for seeds of the same species are similar. On the other hand, seeds from different species within the same genus display distinct chemical signatures, even though they may contain similar characteristic biomarkers. The intraspecies chemical signature similarities on the one hand, and interspecies fingerprint differences on the other, can be processed by multivariate statistical analysis methods to enable rapid species-level identification and differentiation. The chemical fingerprints can be acquired rapidly and in a high-throughput manner by direct analysis in real time mass spectrometry (DART-MS) analysis of the seeds in their native form, without use of a solvent extract. Importantly, knowledge of the identity of the detected molecules is not required for species level identification. However, confirmation of the presence within the seeds of various characteristic tropane and other alkaloids, including atropine, scopolamine, scopoline, tropine, tropinone, and tyramine, was accomplished by comparison of the in-source collision-induced dissociation (CID) fragmentation patterns of authentic standards, to the fragmentation patterns observed in the seeds when analyzed under similar in-source CID conditions. The advantages, applications, and implications of the chemometric processing of DART-MS derived seed chemical signatures for species level identification and differentiation are discussed.
Detecting Network Communities: An Application to Phylogenetic Analysis

PubMed Central

Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.

2011-01-01

This paper proposes a new method to identify communities in generally weighted complex networks and applies it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix. The most relevant networks are found when the network topology changes abruptly, revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202
Trust, confidence, procedural fairness, outcome fairness, moral conviction, and the acceptance of GM field experiments.

PubMed

Siegrist, Michael; Connor, Melanie; Keller, Carmen

2012-08-01

In 2005, Swiss citizens endorsed a moratorium on gene technology, resulting in the prohibition of the commercial cultivation of genetically modified crops and the growth of genetically modified animals until 2013. However, scientific research was not affected by this moratorium, and in 2008, GMO field experiments were conducted that allowed us to examine the factors that influence their acceptance by the public. In this study, trust and confidence items were analyzed using principal component analysis. The analysis revealed the following three factors: "economy/health and environment" (value similarity based trust), "trust and honesty of industry and scientists" (value similarity based trust), and "competence" (confidence). The results of a regression analysis showed that all the three factors significantly influenced the acceptance of GM field experiments. Furthermore, risk communication scholars have suggested that fairness also plays an important role in the acceptance of environmental hazards. We, therefore, included measures for outcome fairness and procedural fairness in our model. However, the impact of fairness may be moderated by moral conviction. That is, fairness may be significant for people for whom GMO is not an important issue, but not for people for whom GMO is an important issue. The regression analysis showed that, in addition to the trust and confidence factors, moral conviction, outcome fairness, and procedural fairness were significant predictors. The results suggest that the influence of procedural fairness is even stronger for persons having high moral convictions compared with persons having low moral convictions. © 2012 Society for Risk Analysis.
Perfume Fragrance Discrimination Using Resistance And Capacitance Responses Of Polymer Sensors

NASA Astrophysics Data System (ADS)

Lima, John Paul Hempel; Vandendriessche, Thomas; Fonseca, Fernando J.; Lammertyn, Jeroen; Nicolai, Bart M.; de Andrade, Adnei Melges

2009-05-01

This work shows a comparison between electrical resistance and capacitance responses of ethanol and five different fragrances using an electronic nose based on conducting polymers. Gas chromatography-mass spectrometry (GC-MS) measurements were performed to evaluate the main differences between the analytes. It is shown that although the fragrances are quite similar in their compositions, the sensors are able to discriminate them through principal component analysis (PCA) and artificial neural network (ANN) analysis.
Finding an appropriate equation to measure similarity between binary vectors: case studies on Indonesian and Japanese herbal medicines.

PubMed

Wijaya, Sony Hartono; Afendi, Farit Mochamad; Batubara, Irmanida; Darusman, Latifah K; Altaf-Ul-Amin, Md; Kanaya, Shigehiko

2016-12-07

The binary similarity and dissimilarity measures have critical roles in the processing of data consisting of binary vectors in various fields including bioinformatics and chemometrics. These metrics express the similarity and dissimilarity values between two binary vectors in terms of the positive matches, absence mismatches or negative matches. To our knowledge, there is no published work presenting a systematic way of finding an appropriate equation to measure binary similarity that performs well for a certain data type or application. A proper method to select a suitable binary similarity or dissimilarity measure is needed to obtain better classification results. In this study, we proposed a novel approach to select binary similarity and dissimilarity measures. We collected 79 binary similarity and dissimilarity equations by extensive literature search and implemented those equations as an R package called bmeasures. We applied these metrics to quantify the similarity and dissimilarity between herbal medicine formulas belonging to the Indonesian Jamu and Japanese Kampo separately. We assessed the capability of binary equations to classify herbal medicine pairs into match and mismatch efficacies based on their similarity or dissimilarity coefficients using the Receiver Operating Characteristic (ROC) curve analysis. According to the area under the ROC curve results, we found that the Indonesian Jamu and Japanese Kampo datasets yielded different rankings of binary similarity and dissimilarity measures. Out of all the equations, the Forbes-2 similarity and the Variant of Correlation similarity measures are recommended for studying the relationship between Jamu formulas and Kampo formulas, respectively. The selection of binary similarity and dissimilarity measures for multivariate analysis is data dependent. The proposed method can be used to find the most suitable binary similarity and dissimilarity equation for a particular data set.
Our finding suggests that all four types of matching quantities in the Operational Taxonomic Unit (OTU) table are important to calculate the similarity and dissimilarity coefficients between herbal medicine formulas. Also, the binary similarity and dissimilarity measures that include the negative match quantity d achieve better capability to separate herbal medicine pairs compared to equations that exclude d.
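The four matching quantities referred to above can be made concrete with a minimal sketch. It assumes the common convention a = both present, b = present only in the first vector, c = present only in the second, d = both absent, and it shows four well-known textbook coefficients (Jaccard, Dice, simple matching, Russell-Rao). It is not the bmeasures package, and the function names and example vectors are hypothetical.

```python
def match_counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 vectors.

    a: 1/1 positive matches, b: 1/0 mismatches,
    c: 0/1 mismatches,       d: 0/0 negative matches.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


# Coefficients that ignore d in the numerator (asymmetric in presence/absence).
# Jaccard and Dice are undefined (division by zero) when a + b + c == 0.
def jaccard(a, b, c, d):
    return a / (a + b + c)


def dice(a, b, c, d):
    return 2 * a / (2 * a + b + c)


def russell_rao(a, b, c, d):
    return a / (a + b + c + d)


# A coefficient that counts negative matches d as agreement.
def simple_matching(a, b, c, d):
    return (a + d) / (a + b + c + d)


# Made-up presence/absence vectors for two formulas over six ingredients.
x = [1, 1, 0, 0, 1, 0]
y = [1, 0, 0, 1, 1, 0]
counts = match_counts(x, y)  # a=2, b=1, c=1, d=2
scores = {f.__name__: f(*counts)
          for f in (jaccard, dice, russell_rao, simple_matching)}
```

With these vectors the simple matching coefficient is higher than Jaccard because the two shared absences count as agreement, which is the kind of difference the inclusion or exclusion of d makes.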
Quantifying light-dependent circadian disruption in humans and animal models.

PubMed

Rea, Mark S; Figueiro, Mariana G

2014-12-01

Although circadian disruption is an accepted term, little has been done to develop methods to quantify the degree of disruption or entrainment individual organisms actually exhibit in the field. A variety of behavioral, physiological and hormonal responses vary in amplitude over a 24-h period and the degree to which these circadian rhythms are synchronized to the daily light-dark cycle can be quantified with a technique known as phasor analysis. Several studies have been carried out using phasor analysis in an attempt to measure circadian disruption exhibited by animals and by humans. To perform these studies, species-specific light measurement and light delivery technologies had to be developed based upon a fundamental understanding of circadian phototransduction mechanisms in the different species. When both nocturnal rodents and diurnal humans experienced different species-specific light-dark shift schedules, they showed, based upon phasor analysis of the light-dark and activity-rest patterns, similar levels of light-dependent circadian disruption. Indeed, both rodents and humans show monotonically increasing and quantitatively similar levels of light-dependent circadian disruption with increasing shift-nights per week. Thus, phasor analysis provides a method for quantifying circadian disruption in the field and in the laboratory as well as a bridge between ecological measurements of circadian entrainment in humans and parametric studies of circadian disruption in animal models, including nocturnal rodents.
A study of ignition of metal impregnated carbons: the influence of oxygen content in the activated carbon matrix.

PubMed

van der Merwe, M M; Bandosz, T J

2005-02-01

A study of the reason for the early ignition of coconut-based impregnated carbon in comparison with the peat-based impregnated carbon was conducted. The surface features of carbons were evaluated using various physicochemical methods. The metal analysis of the initial carbon indicated that the content of potassium was higher in the coconut-based carbon. The surface functional group analysis revealed the presence of similar surface species; however, the peat-based carbon was more acidic in its chemical nature. Since the oxygen content was higher in the peat-based carbon, the early ignition of the coconut-based material was attributed to its higher affinity to chemisorb oxygen, which leads to exothermic effects. This conclusion was confirmed by performing oxidation of coconut-based carbon prior to impregnation. This process increased the ignition temperature for Cu/Cr impregnated coconut-based material from 186 to 289 degrees C and for the Cu/Zn/Mo impregnated carbon from 235 to 324 degrees C.
Modified multidimensional scaling approach to analyze financial markets.

PubMed

Yin, Yi; Shang, Pengjian

2014-06-01

Detrended cross-correlation coefficient (σDCCA) and dynamic time warping (DTW) are introduced as dissimilarity measures, while multidimensional scaling (MDS) is employed to translate the dissimilarities between daily price returns of 24 stock markets. We first propose MDS based on σDCCA dissimilarity and MDS based on DTW dissimilarity, while MDS based on Euclidean dissimilarity is also employed to provide a reference for comparisons. We apply these methods in order to further visualize the clustering between stock markets. Moreover, we compare MDS with an alternative visualization method, the "Unweighed Average" clustering method. The MDS analysis and the "Unweighed Average" clustering method are employed based on the same dissimilarity. Through the results, we find that MDS gives us a more intuitive mapping for observing stable or emerging clusters of stock markets with similar behavior, while the MDS analysis based on σDCCA dissimilarity can provide clearer, more detailed, and more accurate information on the classification of the stock markets than the MDS analysis based on Euclidean dissimilarity. The MDS analysis based on DTW dissimilarity reveals more about the correlations between stock markets. It also yields richer results on the clustering of stock markets and is much more intensive than the MDS analysis based on Euclidean dissimilarity. In addition, the graphs obtained by applying MDS methods based on σDCCA dissimilarity and DTW dissimilarity may also guide the construction of multivariate econometric models.
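To make the DTW dissimilarity concrete, here is a minimal sketch of the textbook dynamic-programming recurrence with an absolute-difference local cost, plus a helper that builds the pairwise dissimilarity matrix an MDS routine would take as input. The function names are hypothetical, and the abstract does not specify the authors' local cost or any windowing constraint, so those choices are assumptions.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(x), len(y)
    # cost[i][j] = best cumulative cost aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # x advances, y repeats
                cost[i][j - 1],      # y advances, x repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def dissimilarity_matrix(series):
    """Symmetric matrix of pairwise DTW distances for a list of series."""
    k = len(series)
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            matrix[i][j] = matrix[j][i] = dtw_distance(series[i], series[j])
    return matrix
```

Unlike the Euclidean distance, this measure tolerates series that follow the same shape at slightly different speeds, which is why two return series of different lengths can still have a dissimilarity of zero.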
Analysis of genetic association using hierarchical clustering and cluster validation indices.

PubMed

Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

2017-10-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes, based on some criterion of similarity, is an important task. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.

PubMed

Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak

2006-06-06

To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. 
The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence information. This method may yield further information about biological evolution, such as the history of horizontal transfer of each gene, by studying the detailed structure of the phylogenetic tree constructed by the kernel-based method.
A novel computer based expert decision making model for prostate cancer disease management.

PubMed

Richman, Martin B; Forman, Ernest H; Bayazit, Yildirim; Einstein, Douglas B; Resnick, Martin I; Stovsky, Mark D

2005-12-01

We propose a strategic, computer based, prostate cancer decision making model based on the analytic hierarchy process. We developed a model that improves physician-patient joint decision making and enhances the treatment selection process by making this critical decision rational and evidence based. Two groups (patient and physician-expert) completed a clinical study comparing an initial disease management choice with the highest ranked option generated by the computer model. Participants made pairwise comparisons to derive priorities for the objectives and subobjectives related to the disease management decision. The weighted comparisons were then applied to treatment options to yield prioritized rank lists that reflect the likelihood that a given alternative will achieve the participant treatment goal. Aggregate data were evaluated by inconsistency ratio analysis and sensitivity analysis, which assessed the influence of individual objectives and subobjectives on the final rank list of treatment options. Inconsistency ratios less than 0.05 were reliably generated, indicating that judgments made within the model were mathematically rational. The aggregate prioritized list of treatment options was tabulated for the patient and physician groups with similar outcomes for the 2 groups. Analysis of the major defining objectives in the treatment selection decision demonstrated the same rank order for the patient and physician groups with cure, survival and quality of life being more important than controlling cancer, preventing major complications of treatment, preventing blood transfusion complications and limiting treatment cost. Analysis of subobjectives, including quality of life and sexual dysfunction, produced similar priority rankings for the patient and physician groups. 
Concordance between initial treatment choice and the highest weighted model option differed between the groups with the patient group having 59% concordance and the physician group having only 42% concordance. This study successfully validated the usefulness of a computer based prostate cancer management decision making model to produce individualized, rational, clinically appropriate disease management decisions without physician bias.
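The pairwise-comparison step of the analytic hierarchy process can be sketched as follows: priorities are taken as the principal eigenvector of a reciprocal comparison matrix (approximated here by power iteration), and an inconsistency ratio compares the matrix's consistency index with a random index. The function name, the iteration count, and the random-index table (commonly quoted values for 3 to 5 items) are assumptions for illustration and are not taken from the study.

```python
# Commonly quoted random-index values (assumed here, not from the abstract).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_priorities(matrix, iterations=100):
    """Return (priorities, inconsistency_ratio) for a reciprocal matrix.

    matrix[i][j] states how strongly item i is preferred over item j,
    with matrix[j][i] == 1 / matrix[i][j].
    """
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]  # normalize so priorities sum to 1

    # Estimate the principal eigenvalue from the converged vector.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return w, consistency_index / RANDOM_INDEX[n]
```

A perfectly consistent set of judgments gives a ratio of 0; the abstract's threshold of 0.05 would be checked against the second return value.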
Mechanisms of Hydrocarbon Based Polymer Etch

NASA Astrophysics Data System (ADS)

Lane, Barton; Ventzek, Peter; Matsukuma, Masaaki; Suzuki, Ayuta; Koshiishi, Akira

2015-09-01

Dry etch of hydrocarbon based polymers is important for semiconductor device manufacturing. The etch mechanisms for oxygen rich plasma etch of hydrocarbon based polymers have been studied, but the mechanism for lean chemistries has received little attention. We report on an experimental and analytic study of the mechanism for etching of a hydrocarbon based polymer using an Ar/O2 chemistry in a single frequency 13.56 MHz test bed. The experimental study employs an analysis of transients from sequential oxidation and Ar sputtering steps using OES and surface analytics to constrain conceptual models for the etch mechanism. The conceptual model is consistent with observations from MD studies and surface analysis performed by Vegh et al. and Oehrlein et al. and other similar studies. Parameters of the model are fit using published data and the experimentally observed time scales.
A novel iris transillumination grading scale allowing flexible assessment with quantitative image analysis and visual matching.

PubMed

Wang, Chen; Brancusi, Flavia; Valivullah, Zaheer M; Anderson, Michael G; Cunningham, Denise; Hedberg-Buenz, Adam; Power, Bradley; Simeonov, Dimitre; Gahl, William A; Zein, Wadih M; Adams, David R; Brooks, Brian

2018-01-01

To develop a sensitive scale of iris transillumination suitable for clinical and research use, with the capability of either quantitative analysis or visual matching of images. Iris transillumination photographic images were used from 70 study subjects with ocular or oculocutaneous albinism. Subjects represented a broad range of ocular pigmentation. A subset of images was subjected to image analysis and ranking by both expert and nonexpert reviewers. Quantitative ordering of images was compared with ordering by visual inspection. Images were binned to establish an 8-point scale. Ranking consistency was evaluated using the Kendall rank correlation coefficient (Kendall's tau). Visual ranking results were assessed using Kendall's coefficient of concordance (Kendall's W) analysis. There was a high degree of correlation among the image analysis, expert-based and non-expert-based image rankings. Pairwise comparisons of the quantitative ranking with each reviewer generated an average Kendall's tau of 0.83 ± 0.04 (SD). Inter-rater correlation was also high with Kendall's W of 0.96, 0.95, and 0.95 for nonexpert, expert, and all reviewers, respectively. The current standard for assessing iris transillumination is expert assessment of clinical exam findings. We adapted an image-analysis technique to generate quantitative transillumination values. Quantitative ranking was shown to be highly similar to a ranking produced by both expert and nonexpert reviewers. This finding suggests that the image characteristics used to quantify iris transillumination do not require expert interpretation. Inter-rater rankings were also highly similar, suggesting that varied methods of transillumination ranking are robust in terms of producing reproducible results.
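The two agreement statistics named in the abstract can be sketched in a few lines. The version below assumes rankings without ties (tau-a, and W without a tie correction), and the function names are illustrative; the abstract does not state which variants or software the authors used.

```python
def kendall_tau(a, b):
    """Kendall rank correlation between two rankings of the same items."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (a[i] - a[j]) * (b[i] - b[j])
            score += (s > 0) - (s < 0)  # +1 concordant pair, -1 discordant
    return score / (n * (n - 1) / 2)


def kendall_w(rankings):
    """Kendall's coefficient of concordance for m raters ranking n items.

    rankings is a list of m lists, each holding the ranks 1..n given by
    one rater to the same n items.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))
```

Tau compares one pair of rankings at a time, as in the pairwise comparison of the quantitative ordering with each reviewer, while W summarizes agreement across a whole group of reviewers.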
Genetic algorithms as global random search methods

NASA Technical Reports Server (NTRS)

Peck, Charles C.; Dhawan, Atam P.

1995-01-01

Genetic algorithm behavior is described in terms of the construction and evolution of the sampling distributions over the space of candidate solutions. This novel perspective is motivated by analysis indicating that the schema theory is inadequate for completely and properly explaining genetic algorithm behavior. Based on the proposed theory, it is argued that the similarities of candidate solutions should be exploited directly, rather than encoding candidate solutions and then exploiting their similarities. Proportional selection is characterized as a global search operator, and recombination is characterized as the search process that exploits similarities. Sequential algorithms and many deletion methods are also analyzed. It is shown that by properly constraining the search breadth of recombination operators, convergence of genetic algorithms to a global optimum can be ensured.
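The two operators the abstract characterizes, proportional selection and recombination, can be sketched as below. This is a generic textbook form (fitness-proportional sampling and one-point crossover on list-encoded candidates) with hypothetical function names; it is not the constrained recombination scheme the report analyzes.

```python
import random


def selection_probabilities(fitness):
    """Sampling distribution induced by proportional selection."""
    total = sum(fitness)
    return [f / total for f in fitness]


def proportional_select(population, fitness, rng):
    """Draw a new population, sampling each candidate in proportion to fitness."""
    return rng.choices(population, weights=fitness, k=len(population))


def one_point_crossover(parent_a, parent_b, rng):
    """Recombine two equal-length candidates at a random cut point."""
    cut = rng.randrange(1, len(parent_a))
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b
```

Seen this way, selection reshapes the sampling distribution over the whole candidate space, while crossover only produces children that share segments with their parents, which matches the abstract's description of recombination as the operator that exploits similarities.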
A rational model of function learning.

PubMed

Lucas, Christopher G; Griffiths, Thomas L; Williams, Joseph J; Kalish, Michael L

2015-10-01

Theories of how people learn relationships between continuous variables have tended to focus on two possibilities: one, that people are estimating explicit functions, or two, that they are performing associative learning supported by similarity. We provide a rational analysis of function learning, drawing on work on regression in machine learning and statistics. Using the equivalence of Bayesian linear regression and Gaussian processes, which provide a probabilistic basis for similarity-based function learning, we show that learning explicit rules and using similarity can be seen as two views of one solution to this problem. We use this insight to define a rational model of human function learning that combines the strengths of both approaches and accounts for a wide variety of experimental results.
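The equivalence the abstract relies on can be checked numerically. The sketch below predicts a new point in two ways: from the posterior mean over the weights of a linear function (the "explicit rule" view) and from a Gaussian process whose kernel is the inner product of the same features (the "similarity" view). The feature map [1, x], the unit-variance Gaussian prior on the weights, and all function names are assumptions chosen for illustration.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def features(x):
    """Feature map for a straight line: intercept and slope."""
    return [1.0, x]


def weight_space_mean(xs, ys, x_star, noise_var):
    """Prediction from the posterior mean over weights (explicit function)."""
    phi = [features(x) for x in xs]
    d = len(phi[0])
    gram = [[sum(p[i] * p[j] for p in phi) + (noise_var if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    rhs = [sum(p[i] * y for p, y in zip(phi, ys)) for i in range(d)]
    w = solve(gram, rhs)
    return sum(wi * fi for wi, fi in zip(w, features(x_star)))


def function_space_mean(xs, ys, x_star, noise_var):
    """Prediction from a Gaussian process with kernel k(x, x') = phi(x).phi(x')."""
    def kernel(u, v):
        return sum(p * q for p, q in zip(features(u), features(v)))
    n = len(xs)
    k = [[kernel(xs[i], xs[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(k, list(ys))
    return sum(kernel(x_star, xi) * ai for xi, ai in zip(xs, alpha))
```

Both functions return the same prediction up to rounding, which is the sense in which rule learning and similarity-based learning can be two views of one solution.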
SALAD database: a motif-based database of protein annotations for plant comparative genomics

PubMed Central

Mihara, Motohiro; Itoh, Takeshi; Izawa, Takeshi

2010-01-01

Proteins often have several motifs with distinct evolutionary histories. Proteins with similar motifs have similar biochemical properties and thus related biological functions. We constructed a unique comparative genomics database termed the SALAD database (http://salad.dna.affrc.go.jp/salad/) from plant-genome-based proteome data sets. We extracted evolutionarily conserved motifs by MEME software from 209 529 protein-sequence annotation groups selected by BLASTP from the proteome data sets of 10 species: rice, sorghum, Arabidopsis thaliana, grape, a lycophyte, a moss, 3 algae, and yeast. Similarity clustering of each protein group was performed by pairwise scoring of the motif patterns of the sequences. The SALAD database provides a user-friendly graphical viewer that displays a motif pattern diagram linked to the resulting bootstrapped dendrogram for each protein group. Amino-acid-sequence-based and nucleotide-sequence-based phylogenetic trees for motif combination alignment, a logo comparison diagram for each clade in the tree, and a Pfam-domain pattern diagram are also available. We also developed a viewer named ‘SALAD on ARRAYs’ to view arbitrary microarray data sets of paralogous genes linked to the same dendrogram in a window. The SALAD database is a powerful tool for comparing protein sequences and can provide valuable hints for biological analysis. PMID:19854933
DiseaseConnect: a comprehensive web server for mechanism-based disease–disease connections

PubMed Central

Liu, Chun-Chi; Tseng, Yu-Ting; Li, Wenyuan; Wu, Chia-Yu; Mayzus, Ilya; Rzhetsky, Andrey; Sun, Fengzhu; Waterman, Michael; Chen, Jeremy J. W.; Chaudhary, Preet M.; Loscalzo, Joseph; Crandall, Edward; Zhou, Xianghong Jasmine

2014-01-01

DiseaseConnect (http://disease-connect.org) is a web server for the analysis and visualization of comprehensive knowledge on mechanism-based disease connectivity. The traditional disease classification system groups diseases with similar clinical symptoms and phenotypic traits. Thus, diseases with entirely different pathologies could be grouped together, leading to a similar treatment design. Such problems could be avoided if diseases were classified based on their molecular mechanisms. Connecting diseases with similar pathological mechanisms could inspire novel strategies for the effective repositioning of existing drugs and therapies. Although there have been several studies attempting to generate disease connectivity networks, they have not yet utilized the enormous and rapidly growing public repositories of disease-related omics data and literature, two primary resources capable of providing insights into disease connections at an unprecedented level of detail. DiseaseConnect, the first public web server of its kind, integrates comprehensive omics and literature data, including a large amount of gene expression data, the Genome-Wide Association Studies catalog, and text-mined knowledge, to discover disease–disease connectivity via common molecular mechanisms. Moreover, clinical comorbidity data and a comprehensive compilation of known drug–disease relationships are additionally utilized for advancing the understanding of the disease landscape and for facilitating the mechanism-based development of new drug treatments. PMID:24895436
RoleSim and RoleMatch: Role-Based Similarity and Graph Matching

ERIC Educational Resources Information Center

Lee, Victor Eugene

2012-01-01

With the rise of the internet, mobile communications, electronic transactions, and personal broadcasting, the scale of connectedness has grown immensely. Not only can an individual interact with thousands and millions of others, but details about those interactions are being stored in databases, for later retrieval and analysis. Two key concepts…
