Sample records for cluster analysis shows

  1. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials

    PubMed Central

    Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2016-01-01

    Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group. PMID:27177885

  2. Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.

    PubMed

    Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W

    2017-06-01

    Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group.
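
    As a rough illustration of the unadjusted cluster-level analysis named in the abstract above, the sketch below uses the common two-stage form: each cluster is first summarised by the mean of its observed outcomes (complete records), and the arms are then compared on those cluster means. The function name, the record layout, the pooled-variance standard error and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from math import sqrt
from statistics import mean, variance


def cluster_level_estimate(records):
    """Two-stage unadjusted cluster-level analysis on complete records.

    records: iterable of (cluster_id, arm, outcome), where arm is 0 (control)
    or 1 (intervention) and outcome is None when missing.
    Returns (estimated intervention effect, pooled standard error).
    """
    outcomes_by_cluster = {}
    for cluster_id, arm, outcome in records:
        if outcome is None:  # complete records analysis: drop missing outcomes
            continue
        outcomes_by_cluster.setdefault((arm, cluster_id), []).append(outcome)

    # Stage 1: one summary value (the mean) per cluster.
    cluster_means = {0: [], 1: []}
    for (arm, _cluster_id), values in outcomes_by_cluster.items():
        cluster_means[arm].append(mean(values))

    # Stage 2: compare the cluster means between arms (pooled two-sample SE).
    n0, n1 = len(cluster_means[0]), len(cluster_means[1])
    effect = mean(cluster_means[1]) - mean(cluster_means[0])
    pooled_var = (
        (n0 - 1) * variance(cluster_means[0]) + (n1 - 1) * variance(cluster_means[1])
    ) / (n0 + n1 - 2)
    return effect, sqrt(pooled_var * (1 / n0 + 1 / n1))


# Hypothetical toy trial: two clusters per arm, some outcomes missing.
toy_records = [
    ("c1", 0, 10), ("c1", 0, 12), ("c1", 0, None),
    ("c2", 0, 14), ("c2", 0, 16),
    ("c3", 1, 15), ("c3", 1, 17),
    ("c4", 1, 19), ("c4", 1, None), ("c4", 1, 21),
]
effect, se = cluster_level_estimate(toy_records)
print(effect, se)
```

    The sketch simply discards missing outcomes, which, according to the abstract, is unbiased only when both arms share the same missingness mechanism and there is no covariate-by-intervention interaction.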

  3. A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

    PubMed

    Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2015-01-01

    Normal-tension glaucoma (NTG) is a heterogeneous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogeneous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and was composed of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group, composed of younger and more myopic patients, may show unique features unlike the other 2 groups.
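
    The hierarchical cluster analysis described above can be sketched as a bottom-up (agglomerative) merge of the closest groups. The abstract does not state the linkage rule or distance measure, so average linkage with Euclidean distance is assumed here, and the two-feature points are invented, pre-standardised values rather than SD-OCT measurements.

```python
from math import dist


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage rule).

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def average_linkage(a, b):
        # Mean pairwise Euclidean distance between members of two clusters.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, p, q = min(
            (average_linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical standardised feature pairs forming three obvious groups.
points = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0), (10.1, 0.3)]
groups = agglomerative(points, 3)
print(groups)
```

    In practice the variables would be standardised first, since parameters such as thickness and area are on different scales; that step is omitted here.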

  4. Variable number of tandem repeats and pulsed-field gel electrophoresis cluster analysis of enterohemorrhagic Escherichia coli serovar O157 strains.

    PubMed

    Yokoyama, Eiji; Uchimura, Masako

    2007-11-01

    Ninety-five enterohemorrhagic Escherichia coli serovar O157 strains, including 30 strains isolated from 13 intrafamily outbreaks and 14 strains isolated from 3 mass outbreaks, were studied by pulsed-field gel electrophoresis (PFGE) and variable number of tandem repeats (VNTR) typing, and the resulting data were subjected to cluster analysis. Cluster analysis of the VNTR typing data revealed that 57 (60.0%) of 95 strains, including all epidemiologically linked strains, formed clusters with at least 95% similarity. Cluster analysis of the PFGE patterns revealed that 67 (70.5%) of 95 strains, including all but 1 of the epidemiologically linked strains, formed clusters with 90% similarity. The number of epidemiologically unlinked strains forming clusters was significantly less by VNTR cluster analysis than by PFGE cluster analysis. The congruence value between PFGE and VNTR cluster analysis was low and did not show an obvious correlation. With two-step cluster analysis, the number of clustered epidemiologically unlinked strains by PFGE cluster analysis that were divided by subsequent VNTR cluster analysis was significantly higher than the number by VNTR cluster analysis that were divided by subsequent PFGE cluster analysis. These results indicate that VNTR cluster analysis is more efficient than PFGE cluster analysis as an epidemiological tool to trace the transmission of enterohemorrhagic E. coli O157.

  5. Transcriptional and Chromatin Dynamics of Muscle Regeneration After Severe Trauma

    DTIC Science & Technology

    2016-10-12

    performed pathway analysis of the time-clustered RNA-Seq data [16] and showed an initial burst of pro-inflammatory and immune-response transcripts in the...143 showed dynamic behavior (See Methods) and analysis of the dynamic miRNAs reinforced many of the results observed from the RNA-Seq datasets...excellent agreement was viewed. Hierarchical clustering of the datasets through time revealed 5 clusters, and gene ontology (GO) analysis of the

  6. Orbit Clustering Based on Transfer Cost

    NASA Technical Reports Server (NTRS)

    Gustafson, Eric D.; Arrieta-Camacho, Juan J.; Petropoulos, Anastassios E.

    2013-01-01

    We propose using cluster analysis to perform quick screening for combinatorial global optimization problems. The key missing component currently preventing cluster analysis from use in this context is the lack of a useable metric function that defines the cost to transfer between two orbits. We study several proposed metrics and clustering algorithms, including k-means and the expectation maximization algorithm. We also show that proven heuristic methods such as the Q-law can be modified to work with cluster analysis.

  7. A novel polyketide biosynthesis gene cluster is involved in fruiting body morphogenesis in the filamentous fungi Sordaria macrospora and Neurospora crassa.

    PubMed

    Nowrousian, Minou

    2009-04-01

    During fungal fruiting body development, hyphae aggregate to form multicellular structures that protect and disperse the sexual spores. Analysis of microarray data revealed a gene cluster strongly upregulated during fruiting body development in the ascomycete Sordaria macrospora. Real time PCR analysis showed that the genes from the orthologous cluster in Neurospora crassa are also upregulated during development. The cluster encodes putative polyketide biosynthesis enzymes, including a reducing polyketide synthase. Analysis of knockout strains of a predicted dehydrogenase gene from the cluster showed that mutants in N. crassa and S. macrospora are delayed in fruiting body formation. In addition to the upregulated cluster, the N. crassa genome comprises another cluster containing a polyketide synthase gene, and five additional reducing polyketide synthase (rpks) genes that are not part of clusters. To study the role of these genes in sexual development, expression of the predicted rpks genes in S. macrospora (five genes) and N. crassa (six genes) was analyzed; all but one are upregulated during sexual development. Analysis of knockout strains for the N. crassa rpks genes showed that one of them is essential for fruiting body formation. These data indicate that polyketides produced by RPKSs are involved in sexual development in filamentous ascomycetes.

  8. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    PubMed

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.

  9. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering

    PubMed Central

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor

    2015-01-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice. PMID:25560745

  10. Cluster analysis of autoantibodies in 852 patients with systemic lupus erythematosus from a single center.

    PubMed

    Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat

    2014-07-01

    Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.

  11. Cluster analysis of particulate matter (PM10) and black carbon (BC) concentrations

    NASA Astrophysics Data System (ADS)

    Žibert, Janez; Pražnikar, Jure

    2012-09-01

    The monitoring of air-pollution constituents like particulate matter (PM10) and black carbon (BC) can provide information about air quality and the dynamics of emissions. Air quality depends on natural and anthropogenic sources of emissions as well as the weather conditions. For a one-year period the diurnal concentrations of PM10 and BC in the Port of Koper were analysed by clustering days into similar groups according to the similarity of the BC and PM10 hourly derived day-profiles without any prior assumptions about working and non-working days, weather conditions or hot and cold seasons. The analysis was performed by using k-means clustering with the squared Euclidean distance as the similarity measure. The analysis showed that 10 clusters in the BC case produced 3 clusters with just one member day and 7 clusters that each encompass more than one day with similar BC profiles. Similar results were found in the PM10 case, where one cluster has a single-member day, while 7 clusters contain several member days. The clustering analysis revealed that the clusters with less pronounced bimodal patterns and low hourly and average daily concentrations for both types of measurements include the most days in the one-year analysis. A typical day profile of the BC measurements includes a bimodal pattern with morning and evening peaks, while the PM10 measurements reveal a less pronounced bimodality. There are also clusters with single-peak day-profiles. The BC data in such cases exhibit morning peaks, while the PM10 data consist of noon or afternoon single peaks. Single pronounced peaks can be explained by appropriate cluster wind speed profiles. The analysis also revealed some special day-profiles. The BC cluster with a high midnight peak on 30/04/2010 and the PM10 cluster with the highest observed concentration of PM10 on 01/05/2010 (208.0 μg m⁻³) coincide with 1 May, which is a national holiday in Slovenia and has a very strong tradition of bonfire parties. The clustering of the diurnal concentrations showed that various different day-profiles are present in the cold period, while this is not the case for the hot season. Additional analysis of ship traffic and rainfall data showed that there is no statistically significant difference between the ship gross (bruto) registered tonnage (BRT) values in the case of BC and PM10 clusters, but that there are statistically significant differences between the rainfall in the BC and PM10 clusters. The wind-rose for the clusters that included the most days in the sampling period indicated that PM10 and BC emitted from the Port of Koper were mainly transported in the west direction over the sea and in the east direction, where there is no populated area. The presented analysis showed that both BC and PM10 concentrations were driven by rain intensity and wind speed.

  12. Application of clustering methods: Regularized Markov clustering (R-MCL) for analyzing dengue virus similarity

    NASA Astrophysics Data System (ADS)

    Lestari, D.; Raharjo, D.; Bustamam, A.; Abdillah, B.; Widhianto, W.

    2017-07-01

    Dengue virus consists of 10 different constituent proteins and is classified into 4 major serotypes (DEN 1 - DEN 4). This study was designed to cluster 30 protein sequences of dengue virus taken from the Virus Pathogen Database and Analysis Resource (VIPR) using the Regularized Markov Clustering (R-MCL) algorithm and then to analyze the result. Using a Python 3.4 program, the R-MCL algorithm produces 8 clusters, several of which have more than one centroid. The number of centroids shows the density level of interaction. Protein interactions that are connected in a tissue form a complex protein that serves as a specific biological process unit. Analysis of the result shows that R-MCL clustering produces clusters of the dengue virus family based on the similar roles of their constituent proteins, regardless of serotype.

  13. A hybrid monkey search algorithm for clustering analysis.

    PubMed

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm, based on the search operator of the artificial bee colony algorithm, for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.
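
    The dependence of k-means on its starting point, as noted in the abstract above, can be seen in a small one-dimensional sketch. The code below is plain Lloyd's algorithm with squared Euclidean distance, not the hybrid monkey algorithm proposed in the paper; the function name, the data and both sets of initial centroids are invented for illustration.

```python
def kmeans_1d(values, initial_centroids, max_iter=100):
    """Lloyd's k-means in one dimension.

    Returns (final centroids, within-cluster sum of squared errors).
    """
    centroids = list(initial_centroids)
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest centroid.
        groups = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda k: (v - centroids[k]) ** 2)
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:  # converged
            break
        centroids = updated
    sse = sum(min((v - c) ** 2 for c in centroids) for v in values)
    return centroids, sse


# Three well-separated groups of three values each.
values = [0, 1, 2, 10, 11, 12, 20, 21, 22]

good_centroids, good_sse = kmeans_1d(values, [1, 11, 21])  # one start per group
bad_centroids, bad_sse = kmeans_1d(values, [0, 1, 2])      # all starts in one group

print(good_centroids, good_sse)
print(bad_centroids, bad_sse)
```

    With the second initialisation the algorithm converges to a solution that merges two of the true groups and has a much larger sum of squared errors, which is the local-optimum behaviour that motivates global search heuristics.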

  14. Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes.

    PubMed

    Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F

    2017-02-01

    Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.

  15. [Styles of interpersonal conflict in patients with panic disorder, alcoholism, rheumatoid arthritis and healthy controls: a cluster analysis study].

    PubMed

    Eher, R; Windhaber, J; Rau, H; Schmitt, M; Kellner, E

    2000-05-01

    Conflict and conflict resolution in intimate relationships are not only among the most important factors influencing relationship satisfaction but are also seen in association with clinical symptoms. Styles of conflict were assessed in patients suffering from panic disorder with and without agoraphobia, in alcoholics and in patients suffering from rheumatoid arthritis. 176 patients and healthy controls filled out the Styles of Conflict Inventory and questionnaires concerning severity of clinical symptoms. A cluster analysis revealed 5 types of conflict management. Healthy controls showed predominantly assertive and constructive styles, whereas patients with panic disorder showed high levels of cognitive and/or behavioral aggression. Alcoholics showed high levels of repressed aggression, and patients with rheumatoid arthritis often did not exhibit any aggression during conflict. Five clusters of conflict patterns were identified by cluster analysis. Each patient group showed a considerably different pattern of conflict management.

  16. ICAP - An Interactive Cluster Analysis Procedure for analyzing remotely sensed data

    NASA Technical Reports Server (NTRS)

    Wharton, S. W.; Turner, B. J.

    1981-01-01

    An Interactive Cluster Analysis Procedure (ICAP) was developed to derive classifier training statistics from remotely sensed data. ICAP differs from conventional clustering algorithms by allowing the analyst to optimize the cluster configuration by inspection, rather than by manipulating process parameters. Control of the clustering process alternates between the algorithm, which creates new centroids and forms clusters, and the analyst, who can evaluate and elect to modify the cluster structure. Clusters can be deleted, or lumped together pairwise, or new centroids can be added. A summary of the cluster statistics can be requested to facilitate cluster manipulation. The principal advantage of this approach is that it allows prior information (when available) to be used directly in the analysis, since the analyst interacts with ICAP in a straightforward manner, using basic terms with which he is more likely to be familiar. Results from testing ICAP showed that an informed use of ICAP can improve classification, as compared to an existing cluster analysis procedure.

  17. Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

    NASA Astrophysics Data System (ADS)

    Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

    2017-08-01

    Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples by using simple sequence repeat (SSR) markers. Clustering analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals. Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.

  18. Unsupervised analysis of small animal dynamic Cerenkov luminescence imaging

    NASA Astrophysics Data System (ADS)

    Spinelli, Antonello E.; Boschi, Federico

    2011-12-01

    Clustering analysis (CA) and principal component analysis (PCA) were applied to dynamic Cerenkov luminescence images (dCLI). In order to investigate the performance of the proposed approaches, two distinct dynamic data sets obtained by injecting mice with 32P-ATP and 18F-FDG were acquired using the IVIS 200 optical imager. The k-means clustering algorithm was applied to dCLI and was implemented using Interactive Data Language (IDL) 8.1. We show that cluster analysis allows us to obtain good agreement between the clustered and the corresponding emission regions like the bladder, the liver, and the tumor. We also show a good correspondence between the time activity curves of the different regions obtained by using CA and manual region of interest analysis on dCLIT and PCA images. We conclude that CA provides an automatic unsupervised method for the analysis of preclinical dynamic Cerenkov luminescence image data.

  19. Cluster analysis of multiple planetary flow regimes

    NASA Technical Reports Server (NTRS)

    Mo, Kingtse; Ghil, Michael

    1987-01-01

    A modified cluster analysis method was developed to identify spatial patterns of planetary flow regimes, and to study transitions between them. This method was applied first to a simple deterministic model and second to Northern Hemisphere (NH) 500 mb data. The dynamical model is governed by the fully-nonlinear, equivalent-barotropic vorticity equation on the sphere. Clusters of points in the model's phase space are associated either with a few persistent or with many transient events. Two stationary clusters have patterns similar to unstable stationary model solutions, zonal, or blocked. Transient clusters of wave trains serve as way stations between the stationary ones. For the NH data, cluster analysis was performed in the subspace of the first seven empirical orthogonal functions (EOFs). Stationary clusters are found in the low-frequency band of more than 10 days, and transient clusters in the bandpass frequency window between 2.5 and 6 days. In the low-frequency band three pairs of clusters determine, respectively, EOFs 1, 2, and 3. They exhibit well-known regional features, such as blocking, the Pacific/North American (PNA) pattern and wave trains. Both model and low-pass data show strong bimodality. Clusters in the bandpass window show wave-train patterns in the two jet exit regions. They are related, as in the model, to transitions between stationary clusters.

  20. ClusterViz: A Cytoscape APP for Cluster Analysis of Biological Network.

    PubMed

    Wang, Jianxin; Zhong, Jiancheng; Chen, Gang; Li, Min; Wu, Fang-xiang; Pan, Yi

    2015-01-01

    Cluster analysis of biological networks is one of the most important approaches for identifying functional modules and predicting protein functions. Furthermore, visualization of clustering results is crucial to uncover the structure of biological networks. In this paper, ClusterViz, an APP of Cytoscape 3 for cluster analysis and visualization, has been developed. In order to reduce complexity and enable extendibility for ClusterViz, we designed the architecture of ClusterViz based on the framework of Open Services Gateway Initiative. According to the architecture, the implementation of ClusterViz is partitioned into three modules: the interface of ClusterViz, the clustering algorithms, and visualization and export. ClusterViz facilitates the comparison of the results of different algorithms to do further related analysis. Three commonly used clustering algorithms, FAG-EC, EAGLE and MCODE, are included in the current version. Because the clustering-algorithms module adopts an abstract algorithm interface, more clustering algorithms can be included in the future. To illustrate the usability of ClusterViz, we provided three examples with detailed steps from important scientific articles, which show that our tool has helped several research teams do their research work on the mechanism of the biological networks.

  1. Neuro- and social-cognitive clustering highlights distinct profiles in adults with anorexia nervosa.

    PubMed

    Renwick, Beth; Musiat, Peter; Lose, Anna; DeJong, Hannah; Broadbent, Hannah; Kenyon, Martha; Loomes, Rachel; Watson, Charlotte; Ghelani, Shreena; Serpell, Lucy; Richards, Lorna; Johnson-Sabine, Eric; Boughton, Nicky; Treasure, Janet; Schmidt, Ulrike

    2015-01-01

    This study aimed to explore the neuro- and social-cognitive profile of a consecutive series of adult outpatients with anorexia nervosa (AN) when compared with widely available age and gender matched historical control data. The relationship between performance profiles, clinical characteristics, service utilization, and treatment adherence was also investigated. Consecutively recruited outpatients with a broad diagnosis of AN (restricting subtype AN-R: n = 44, binge-purge subtype AN-BP: n = 33 or Eating Disorder Not Otherwise Specified-AN subtype EDNOS-AN: n = 23) completed a comprehensive set of neurocognitive (set-shifting, central coherence) and social-cognitive measures (Emotional Theory of Mind). Data were subjected to hierarchical cluster analysis and a discriminant function analysis. Three separate, meaningful clusters emerged. Cluster 1 (n = 45) showed overall average to high average neuro- and social-cognitive performance, Cluster 2 (n = 38) showed mixed performance characterized by distinct strengths and weaknesses, and Cluster 3 (n = 17) showed poor overall performance (Autism Spectrum disorder (ASD) like cluster). The three clusters did not differ in terms of eating disorder symptoms, comorbid features or service utilization and treatment adherence. A discriminant function analysis confirmed that the clusters were best characterized by performance in perseveration and set-shifting measures. The findings suggest that considerable neuro- and social-cognitive heterogeneity exists in patients with AN, with a subset showing ASD-like features. The value of this method of profiling in predicting longer term patient outcomes and in guiding development of etiologically targeted treatments remains to be seen. © 2014 Wiley Periodicals, Inc.

  2. An effective fuzzy kernel clustering analysis approach for gene expression data.

    PubMed

    Sun, Lin; Xu, Jiucheng; Yin, Jiaojiao

    2015-01-01

    Fuzzy clustering is an important tool for analyzing microarray data. A major problem in applying fuzzy clustering method to microarray gene expression data is the choice of parameters with cluster number and centers. This paper proposes a new approach to fuzzy kernel clustering analysis (FKCA) that identifies desired cluster number and obtains more steady results for gene expression data. First of all, to optimize characteristic differences and estimate optimal cluster number, Gaussian kernel function is introduced to improve spectrum analysis method (SAM). By combining subtractive clustering with max-min distance mean, maximum distance method (MDM) is proposed to determine cluster centers. Then, the corresponding steps of improved SAM (ISAM) and MDM are given respectively, whose superiority and stability are illustrated through performing experimental comparisons on gene expression data. Finally, by introducing ISAM and MDM into FKCA, an effective improved FKCA algorithm is proposed. Experimental results from public gene expression data and UCI database show that the proposed algorithms are feasible for cluster analysis, and the clustering accuracy is higher than the other related clustering algorithms.
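
    The abstract above mentions introducing a Gaussian kernel function but does not give its form. As a rough illustration only, the sketch below uses the standard Gaussian (RBF) kernel, exp(-||x - y||^2 / (2 sigma^2)), to build a pairwise similarity matrix. The function names, the `sigma` parameter and the toy samples are assumptions made for this example and are not taken from the paper.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Standard Gaussian (RBF) kernel between two equal-length vectors."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))


def kernel_matrix(samples, sigma=1.0):
    """Pairwise kernel similarities; the diagonal is 1 by construction."""
    return [[gaussian_kernel(a, b, sigma) for b in samples] for a in samples]


# Hypothetical expression profiles (three genes, two conditions each).
profiles = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
similarities = kernel_matrix(profiles, sigma=1.0)
```

    Nearby profiles receive a similarity close to 1 and distant ones close to 0; how quickly similarity decays is controlled by the assumed `sigma`.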

  3. Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

    PubMed

    Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

    2014-04-29

    To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.

  4. High-throughput analysis of the satellitome illuminates satellite DNA evolution

    NASA Astrophysics Data System (ADS)

    Ruiz-Ruano, Francisco J.; López-León, María Dolores; Cabrero, Josefa; Camacho, Juan Pedro M.

    2016-07-01

    Satellite DNA (satDNA) is a major component yet the great unknown of eukaryote genomes and clearly underrepresented in genome sequencing projects. Here we show the high-throughput analysis of satellite DNA content in the migratory locust by means of the bioinformatic analysis of Illumina reads with the RepeatExplorer and RepeatMasker programs. This unveiled 62 satDNA families and we propose the term “satellitome” for the whole collection of different satDNA families in a genome. The finding that satDNAs were present in many contigs of the migratory locust draft genome indicates that they show many genomic locations invisible by fluorescent in situ hybridization (FISH). The cytological pattern of five satellites showing common descent (belonging to the SF3 superfamily) suggests that non-clustered satDNAs can become clustered through local amplification at any of the many genomic loci resulting from previous dissemination of short satDNA arrays. The fact that all kinds of satDNA (micro-, mini- and satellites) can show the non-clustered and clustered states suggests that all these elements are mostly similar, except for repeat length. Finally, the presence of VNTRs in bacteria, showing similar properties to non-clustered satDNAs in eukaryotes, suggests that this kind of tandem repeat shows common properties in all living beings.

  5. An investigation about the structures, thermodynamics and kinetics of the formic acid involved molecular clusters

    NASA Astrophysics Data System (ADS)

    Zhang, Rui; Jiang, Shuai; Liu, Yi-Rong; Wen, Hui; Feng, Ya-Juan; Huang, Teng; Huang, Wei

    2018-05-01

    Despite the very important role of atmospheric aerosol nucleation in climate change and air quality, the detailed aerosol nucleation mechanism is still unclear. Here we investigated the formic acid (FA) involved multicomponent nucleation molecular clusters including sulfuric acid (SA), dimethylamine (DMA) and water (W) through a quantum chemical method. The thermodynamics and kinetics analysis was based on the global minima given by Basin-Hopping (BH) algorithm coupled with Density Functional Theory (DFT) and subsequent benchmarked calculations. Then the interaction analysis based on ElectroStatic Potential (ESP), Topological and Atomic Charges analysis was made to characterize the binding features of the clusters. The results show that FA binds weakly with the other molecules in the cluster while W binds more weakly. Further kinetic analysis of the time evolution of the clusters shows that, despite formic acid's weak interaction with other nucleation precursors, its effect on the sulfuric acid dimer steady-state concentration cannot be neglected because of its high concentration in the atmosphere.

  6. Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.

    PubMed

    Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

    2018-01-01

    In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.

  7. Cluster subgroups based on overall pressure pain sensitivity and psychosocial factors in chronic musculoskeletal pain: Differences in clinical outcomes.

    PubMed

    Almeida, Suzana C; George, Steven Z; Leite, Raquel D V; Oliveira, Anamaria S; Chaves, Thais C

    2018-05-17

    We aimed to empirically derive psychosocial and pain sensitivity subgroups using cluster analysis within a sample of individuals with chronic musculoskeletal pain (CMP) and to investigate derived subgroups for differences in pain and disability outcomes. Eighty female participants with CMP answered psychosocial and disability scales and were assessed for pressure pain sensitivity. A cluster analysis was used to derive subgroups, and analysis of variance (ANOVA) was used to investigate differences between subgroups. Psychosocial factors (kinesiophobia, pain catastrophizing, anxiety, and depression) and overall pressure pain threshold (PPT) were entered into the cluster analysis. Three subgroups were empirically derived: cluster 1 (high pain sensitivity and high psychosocial distress; n = 12) characterized by low overall PPT and high psychosocial scores; cluster 2 (high pain sensitivity and intermediate psychosocial distress; n = 39) characterized by low overall PPT and intermediate psychosocial scores; and cluster 3 (low pain sensitivity and low psychosocial distress; n = 29) characterized by high overall PPT and low psychosocial scores compared to the other subgroups. Cluster 1 showed higher values for mean pain intensity (F(2,77) = 10.58, p < 0.001) compared with cluster 3, and cluster 1 showed higher values for disability (F(2,77) = 3.81, p = 0.03) compared with both clusters 2 and 3. Only cluster 1 was distinct from cluster 3 according to both pain and disability outcomes. Pain catastrophizing, depression, and anxiety were the psychosocial variables that best differentiated the subgroups. Overall, these results call attention to the importance of considering pain sensitivity and psychosocial variables to obtain a more comprehensive characterization of CMP patients' subtypes.
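
    The F(2,77) values reported above follow the usual one-way ANOVA layout: with three clusters and 80 participants (12 + 39 + 29), the degrees of freedom are 3 - 1 = 2 between groups and 80 - 3 = 77 within groups. The sketch below is a minimal, assumed illustration of how such an F statistic is computed. The function name and the toy scores are hypothetical, and it does not reproduce the study's data or its post hoc comparisons.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of non-empty numeric samples."""
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three small groups (not data from the study).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(scores)
```

    The p-value would then come from the F distribution with (df_between, df_within) degrees of freedom, which this stdlib-only sketch leaves out.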

  8. Another collision for the Coma cluster

    NASA Technical Reports Server (NTRS)

    Vikhlinin, A.; Forman, W.; Jones, C.

    1996-01-01

    The wavelet transform analysis of the ROSAT position sensitive proportional counter (PSPC) images of the Coma cluster is presented. The analysis shows, on small scales, a substructure dominated by two extended sources surrounding the two bright clusters NGC 4874 and NGC 4889. On scales of about 2 arcmin to 3 arcmin, the analysis reveals a tail of X-ray emission originating near the cluster center, curving to the south and east for approximately 25 arcmin and ending near the galaxy NGC 4911. The results are interpreted in terms of a merger of a group, having a core mass of approximately 10^13 solar masses, with the main body of the Coma cluster.

  9. Cluster Correspondence Analysis.

    PubMed

    van de Velden, M; D'Enza, A Iodice; Palumbo, F

    2017-03-01

    A method is proposed that combines dimension reduction and cluster analysis for categorical data by simultaneously assigning individuals to clusters and optimal scaling values to categories in such a way that a single between variance maximization objective is achieved. In a unified framework, a brief review of alternative methods is provided and we show that the proposed method is equivalent to GROUPALS applied to categorical data. Performance of the methods is appraised by means of a simulation study. The results of the joint dimension reduction and clustering methods are compared with the so-called tandem approach, a sequential analysis of dimension reduction followed by cluster analysis. The tandem approach is conjectured to perform worse when variables are added that are unrelated to the cluster structure. Our simulation study confirms this conjecture. Moreover, the results of the simulation study indicate that the proposed method also consistently outperforms alternative joint dimension reduction and clustering methods.

  10. Towards Effective Clustering Techniques for the Analysis of Electric Power Grids

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hogan, Emilie A.; Cotilla Sanchez, Jose E.; Halappanavar, Mahantesh

    2013-11-30

    Clustering is an important data analysis technique with numerous applications in the analysis of electric power grids. Standard clustering techniques are oblivious to the rich structural and dynamic information available for power grids. Therefore, by exploiting the inherent topological and electrical structure in the power grid data, we propose new methods for clustering with applications to model reduction, locational marginal pricing, phasor measurement unit (PMU or synchrophasor) placement, and power system protection. We focus our attention on model reduction for analysis based on time-series information from synchrophasor measurement devices, and spectral techniques for clustering. By comparing different clustering techniques on two instances of realistic power grids we show that the solutions are related and therefore one could leverage that relationship for a computational advantage. Thus, by contrasting different clustering techniques we make a case for exploiting structure inherent in the data with implications for several domains including power systems.

  11. Analysis of local bond-orientational order for liquid gallium at ambient pressure: Two types of cluster structures.

    PubMed

    Chen, Lin-Yuan; Tang, Ping-Han; Wu, Ten-Ming

    2016-07-14

    In terms of the local bond-orientational order (LBOO) parameters, a cluster approach to analyze local structures of simple liquids was developed. In this approach, a cluster is defined as a combination of neighboring seeds having at least nb local-orientational bonds and their nearest neighbors, and a cluster ensemble is a collection of clusters with a specified nb and number of seeds ns. This cluster analysis was applied to investigate the microscopic structures of liquid Ga at ambient pressure (AP). The liquid structures studied were generated through ab initio molecular dynamics simulations. By scrutinizing the static structure factors (SSFs) of cluster ensembles with different combinations of nb and ns, we found that liquid Ga at AP contained two types of cluster structures, one characterized by sixfold orientational symmetry and the other showing fourfold orientational symmetry. The SSFs of cluster structures with sixfold orientational symmetry were akin to the SSF of a hard-sphere fluid. On the contrary, the SSFs of cluster structures showing fourfold orientational symmetry behaved similarly as the anomalous SSF of liquid Ga at AP, which is well known for exhibiting a high-q shoulder. The local structures of a highly LBOO cluster whose SSF displayed a high-q shoulder were found to be more similar to the structure of β-Ga than those of other solid phases of Ga. More generally, the cluster structures showing fourfold orientational symmetry have an inclination to resemble more to β-Ga.

  12. The Psychology of Yoga Practitioners: A Cluster Analysis.

    PubMed

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-11-01

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.

  13. The Psychology of Yoga Practitioners: A Cluster Analysis.

    PubMed

    Genovese, Jeremy E C; Fondran, Kristine M

    2017-03-30

    Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.

  14. Computational gene expression profiling under salt stress reveals patterns of co-expression

    PubMed Central

    Sanchita; Sharma, Ashok

    2016-01-01

    Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters, which were common in multiple algorithms were taken further for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411

  15. Modest validity and fair reproducibility of dietary patterns derived by cluster analysis.

    PubMed

    Funtikova, Anna N; Benítez-Arciniega, Alejandra A; Fitó, Montserrat; Schröder, Helmut

    2015-03-01

    Cluster analysis is widely used to analyze dietary patterns. We aimed to analyze the validity and reproducibility of the dietary patterns defined by cluster analysis derived from a food frequency questionnaire (FFQ). We hypothesized that the dietary patterns derived by cluster analysis have fair to modest reproducibility and validity. Dietary data were collected from 107 individuals from population-based survey, by an FFQ at baseline (FFQ1) and after 1 year (FFQ2), and by twelve 24-hour dietary recalls (24-HDR). Repeatability and validity were measured by comparing clusters obtained by the FFQ1 and FFQ2 and by the FFQ2 and 24-HDR (reference method), respectively. Cluster analysis identified a "fruits & vegetables" and a "meat" pattern in each dietary data source. Cluster membership was concordant for 66.7% of participants in FFQ1 and FFQ2 (reproducibility), and for 67.0% in FFQ2 and 24-HDR (validity). Spearman correlation analysis showed reasonable reproducibility, especially in the "fruits & vegetables" pattern, and lower validity also especially in the "fruits & vegetables" pattern. κ statistic revealed a fair validity and reproducibility of clusters. Our findings indicate a reasonable reproducibility and fair to modest validity of dietary patterns derived by cluster analysis. Copyright © 2015 Elsevier Inc. All rights reserved.
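
    The concordance percentages and κ statistic reported above can be illustrated with a small sketch. It assumes that the clusters from the two data sources have already been matched by name (here, the "fruits & vegetables" and "meat" patterns), since raw cluster labels are otherwise arbitrary. The function names and the eight hypothetical memberships are invented for this example and are not the study's data.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of participants given the same cluster label in both sources."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the marginal label frequencies of each source.
    p_expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both sources used one and the same label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical memberships for eight participants (not data from the study).
ffq1 = ["fv", "fv", "meat", "meat", "fv", "meat", "fv", "meat"]
ffq2 = ["fv", "meat", "meat", "meat", "fv", "fv", "fv", "meat"]

agreement = percent_agreement(ffq1, ffq2)  # 6 of 8 match
kappa = cohen_kappa(ffq1, ffq2)
```

    In this toy case the raw agreement is 0.75 but κ is 0.5, because half of the agreement would be expected by chance when both patterns are equally common.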

  16. Using cluster analysis to identify phenotypes and validation of mortality in men with COPD.

    PubMed

    Chen, Chiung-Zuei; Wang, Liang-Yi; Ou, Chih-Ying; Lee, Cheng-Hung; Lin, Chien-Chung; Hsiue, Tzuen-Ren

    2014-12-01

    Cluster analysis has been proposed to examine phenotypic heterogeneity in chronic obstructive pulmonary disease (COPD). The aim of this study was to use cluster analysis to define COPD phenotypes and validate them by assessing their relationship with mortality. Male subjects with COPD were recruited to identify and validate COPD phenotypes. Seven variables were assessed for their relevance to COPD: age, FEV(1) % predicted, BMI, history of severe exacerbations, mMRC, SpO(2), and Charlson index. COPD groups were identified by cluster analysis and validated prospectively against mortality during a 4-year follow-up. Analysis of 332 COPD subjects identified five clusters from cluster A to cluster E. Assessment of the predictive validity of these clusters of COPD showed that cluster E patients had higher all cause mortality (HR 18.3, p < 0.0001), and respiratory cause mortality (HR 21.5, p < 0.0001) than those in the other four groups. Cluster E patients also had higher all cause mortality (HR 14.3, p = 0.0002) and respiratory cause mortality (HR 10.1, p = 0.0013) than patients in cluster D alone. COPD with severe airflow limitation, many symptoms, and a history of frequent severe exacerbations was a novel and distinct clinical phenotype predicting mortality in men with COPD.

  17. Analysis of 3D vortex motion in a dusty plasma

    NASA Astrophysics Data System (ADS)

    Mulsow, M.; Himpel, M.; Melzer, A.

    2017-12-01

    Dust clusters of about 50-1000 particles have been confined near the sheath region of a gaseous radio-frequency plasma discharge. These compact clusters exhibit a vortex motion which has been reconstructed in full three dimensions from stereoscopy. Smaller clusters are found to show a competition between solid-like cluster structure and vortex motion, whereas larger clusters feature very pronounced vortices. From the three-dimensional analysis, the dust flow field has been found to be nearly incompressible. The vortices in all observed clusters are essentially poloidal. The dependence of the vorticity on the cluster size is discussed. Finally, the vortex motion has been quantitatively attributed to radial gradients of the ion drag force.

  18. Interactive K-Means Clustering Method Based on User Behavior for Different Analysis Target in Medicine.

    PubMed

    Lei, Yang; Yu, Dai; Bin, Zhang; Yang, Yang

    2017-01-01

    Clustering algorithms are a basis of data analysis and are widely used in analysis systems. However, with high-dimensional data, a clustering algorithm may overlook the business relations between dimensions, especially in medical fields. As a result, the clustering result often does not meet the users' business goals. If the clustering process can incorporate the users' knowledge, that is, the doctor's knowledge or the analysis intent, the result can be more satisfactory. In this paper, we propose an interactive K-means clustering method to improve users' satisfaction with the result. The core of the method is to collect the user's feedback on the clustering result and use it to optimize that result. A particle swarm optimization algorithm is then used to optimize the parameters, especially the weight settings in the clustering algorithm, so that it reflects the user's business preference as closely as possible. With these optimized and adjusted parameters, the clustering result can come closer to the user's requirements. Finally, we test our method on a breast cancer example. The experiments show better performance for our algorithm.

  19. Stability-based validation of dietary patterns obtained by cluster analysis.

    PubMed

    Sauvageot, Nicolas; Schritz, Anna; Leite, Sonia; Alkerwi, Ala'a; Stranges, Saverio; Zannad, Faiez; Streel, Sylvie; Hoge, Axelle; Donneau, Anne-Françoise; Albert, Adelin; Guillaume, Michèle

    2017-01-14

    Cluster analysis is a data-driven method used to create clusters of individuals sharing similar dietary habits. However, this method requires specific choices from the user which have an influence on the results. Therefore, there is a need of an objective methodology helping researchers in their decisions during cluster analysis. The objective of this study was to use such a methodology based on stability of clustering solutions to select the most appropriate clustering method and number of clusters for describing dietary patterns in the NESCAV study (Nutrition, Environment and Cardiovascular Health), a large population-based cross-sectional study in the Greater Region (N = 2298). Clustering solutions were obtained with K-means, K-medians and Ward's method and a number of clusters varying from 2 to 6. Their stability was assessed with three indices: adjusted Rand index, Cramer's V and misclassification rate. The most stable solution was obtained with K-means method and a number of clusters equal to 3. The "Convenient" cluster characterized by the consumption of convenient foods was the most prevalent with 46% of the population having this dietary behaviour. In addition, a "Prudent" and a "Non-Prudent" patterns associated respectively with healthy and non-healthy dietary habits were adopted by 25% and 29% of the population. The "Convenient" and "Non-Prudent" clusters were associated with higher cardiovascular risk whereas the "Prudent" pattern was associated with a decreased cardiovascular risk. Associations with others factors showed that the choice of a specific dietary pattern is part of a wider lifestyle profile. This study is of interest for both researchers and public health professionals. From a methodological standpoint, we showed that using stability of clustering solutions could help researchers in their choices. From a public health perspective, this study showed the need of targeted health promotion campaigns describing the benefits of healthy dietary patterns.
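
    Of the three stability indices named in this abstract, the adjusted Rand index is the easiest to show in a few lines. The sketch below implements the standard pair-counting formula for comparing two partitions of the same individuals. It is an assumed illustration with a hypothetical function name and toy labels, and it does not reproduce the study's resampling procedure or its other two indices.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same number of items")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")
    # Pairs that fall in the same cluster in both partitions.
    together_in_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / comb(n, 2)
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one single cluster
    return (together_in_both - expected) / (maximum - expected)


# Hypothetical cluster labels from two runs on the same four individuals.
run_1 = [0, 0, 1, 1]
run_2 = [1, 1, 0, 0]  # same grouping, cluster names swapped
run_3 = [0, 1, 0, 1]  # grouping that cuts across run_1

same_grouping = adjusted_rand_index(run_1, run_2)
crossed_grouping = adjusted_rand_index(run_1, run_3)
```

    Because the index depends only on which pairs of individuals are grouped together, it is unaffected by how the clusters happen to be numbered, which is what makes it usable for comparing solutions across repeated runs.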

  20. Ecological tolerances of Miocene larger benthic foraminifera from Indonesia

    NASA Astrophysics Data System (ADS)

    Novak, Vibor; Renema, Willem

    2018-01-01

    To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis which includes environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples; clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 with dominance of BW samples, and cluster 5 showing a mixed composition with both MCS and BW samples. The results of cluster analysis were afterwards subjected to indicator species analysis resulting in the interpretation that generated three groups among LBF taxa: typical assemblage indicators, regularly occurring taxa and rare taxa. By interpreting the results of detrended correspondence analysis, cluster analysis and indicator species analysis, along with environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.

  1. FAST TRACK COMMUNICATION: Poly(methyl methacrylate)-palladium clusters nanocomposite formation by supersonic cluster beam deposition: a method for microstructured metallization of polymer surfaces

    NASA Astrophysics Data System (ADS)

    Ravagnan, Luca; Divitini, Giorgio; Rebasti, Sara; Marelli, Mattia; Piseri, Paolo; Milani, Paolo

    2009-04-01

    Nanocomposite films were fabricated by supersonic cluster beam deposition (SCBD) of palladium clusters on poly(methyl methacrylate) (PMMA) surfaces. The evolution of the electrical conductance with cluster coverage and microscopy analysis show that Pd clusters are implanted in the polymer and form a continuous layer extending for several tens of nanometres beneath the polymer surface. This allows the deposition, using stencil masks, of cluster-assembled Pd microstructures on PMMA showing a remarkably high adhesion compared with metallic films obtained by thermal evaporation. These results suggest that SCBD is a promising tool for the fabrication of metallic microstructures on flexible polymeric substrates.

  2. Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-means Cluster Analysis.

    PubMed

    Craen, Saskia de; Commandeur, Jacques J F; Frank, Laurence E; Heiser, Willem J

    2006-06-01

    K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these populations showed a significant effect of lack of sphericity and group size. This effect was, however, not as large as expected, with still a recovery index of more than 0.5 in the "worst case scenario." An interaction effect between the two data aspects was also found. The decreasing trend in the recovery of clusters for increasing departures from sphericity is different for equal and unequal group sizes.
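
    The tendency described above comes from the way K-means works: each point is assigned to the nearest center by squared Euclidean distance, and each center is then moved to the mean of its points. A minimal sketch of that loop (Lloyd's algorithm) is given below under stated assumptions: the initial centers are supplied by the caller, and the function names and toy points are hypothetical. It is not the simulation design used in the study.

```python
def squared_distance(p, q):
    """Squared Euclidean distance, the quantity K-means minimizes."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centers, max_iter=100):
    """Lloyd's algorithm: alternate assignment and center update until stable."""
    centers = [tuple(c) for c in initial_centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest center for every point.
        new_labels = [
            min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous center
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical two-dimensional points forming two compact groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, initial_centers=[(0, 0), (10, 10)])
```

    Because every dimension contributes equally to the distance and each point goes to its nearest center, elongated or very unequal groups tend to be split along roughly spherical boundaries.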

  3. Sensory Clusters of Toddlers with Autism Spectrum Disorders: Differences in Affective Symptoms

    ERIC Educational Resources Information Center

    Ben-Sasson, A.; Cermak, S. A.; Orsmond, G. I.; Tager-Flusberg, H.; Kadlec, M. B.; Carter, A. S.

    2008-01-01

    Background: Individuals with autism spectrum disorders (ASDs) show variability in their sensory behaviors. In this study we identified clusters of toddlers with ASDs who shared sensory profiles and examined differences in affective symptoms across these clusters. Method: Using cluster analysis 170 toddlers with ASDs were grouped based on parent…

  4. Identification of different nutritional status groups in institutionalized elderly people by cluster analysis.

    PubMed

    López-Contreras, María José; López, Maria Ángeles; Canteras, Manuel; Candela, María Emilia; Zamora, Salvador; Pérez-Llamas, Francisca

    2014-03-01

    To apply a cluster analysis to groups of individuals of similar characteristics in an attempt to identify undernutrition or the risk of undernutrition in this population. A cross-sectional study. Seven public nursing homes in the province of Murcia, on the Mediterranean coast of Spain. 205 subjects aged 65 and older (131 women and 74 men). Dietary intake (energy and nutrients), anthropometric (body mass index, skinfold thickness, mid-arm muscle circumference, mid-arm muscle area, corrected arm muscle area, waist to hip ratio) and biochemical and haematological (serum albumin, transferrin, total cholesterol, total lymphocyte count). Variables were analyzed by cluster analysis. The results of the cluster analysis, including intake, anthropometric and analytical data showed that, of the 205 elderly subjects, 66 (32.2%) were overweight/obese, 72 (35.1%) had an adequate nutritional status and 67 (32.7%) were undernourished or at risk of undernutrition. The undernourished or at risk of undernutrition group showed the lowest values for dietary intake and the anthropometric and analytical parameters measured. Our study shows that cluster analysis is a useful statistical method for assessing the nutritional status of institutionalized elderly populations. In contrast, use of the specific reference values frequently described in the literature might fail to detect real cases of undernourishment or those at risk of undernutrition. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.

  5. Clinical Phenotype of Diabetic Peripheral Neuropathy and Relation to Symptom Patterns: Cluster and Factor Analysis in Patients with Type 2 Diabetes in Korea.

    PubMed

    Won, Jong Chul; Im, Yong-Jin; Lee, Ji-Hyun; Kim, Chong Hwa; Kwon, Hyuk Sang; Cha, Bong-Yun; Park, Tae Sun

    2017-01-01

    Diabetic peripheral neuropathy (DPN) is the most common complication of diabetes. However, patients usually suffer not only from diverse sensory deficits but also from neuropathy-related discomfort. The aim of this study is to identify distinct groups of patients with DPN with respect to its clinical impacts on symptom patterns and comorbidities. A hierarchical cluster analysis and factor analysis were performed to identify relevant subgroups of patients with DPN (n = 1338) and symptom patterns. Patients with DPN were divided into three clusters: asymptomatic (cluster 1, n = 448, 33.5%), moderate symptoms with disturbed sleep (cluster 2, n = 562, 42.0%), and severe symptoms with decreased quality of life (cluster 3, n = 328, 24.5%). Patients in cluster 3, compared with clusters 1 and 2, were characterized by higher levels of HbA1c and more severe pain and physical impairments. Patients in cluster 2 had moderate pain levels but disturbed sleep patterns comparable to those in cluster 3. The frequency of symptoms on each item of MNSI by "painful" symptom pattern showed a similar distribution pattern with increasing intensities along the three clusters. Cluster and factor analysis endorsed the use of comprehensive and symptomatic subgrouping to individualize the evaluation of patients with DPN.

  6. Diversity in aconitine alkaloid profile of Aconitum plants in Hokkaido contrasts with their genetic similarity.

    PubMed

    Kakiuchi, Nobuko; Atsumi, Toshiyuki; Higuchi, Mari; Kamikawa, Shohei; Miyako, Haruka; Wakita, Yuriko; Ohtsuka, Isao; Hayashi, Shigeki; Hishida, Atsuyuki; Kawahara, Nobuo; Nishizawa, Makoto; Yamagishi, Takashi; Kadota, Yuichi

    2015-01-01

    Aconite tuber is a representative crude drug for warming the body internally in Japanese Kampo medicine and Chinese traditional medicine. The crude drug is used in major prescriptions for the aged. Varieties of Aconitum plants are distributed throughout the Japanese Islands, especially Hokkaido. With the aim of identifying the medicinal potential of Aconitum plants from Hokkaido, 107 specimens were collected from 36 sites in the summer of 2011 and 2012. Their nuclear DNA region, internal transcribed spacer (ITS), and aconitine alkaloid contents were analyzed. Phylogenetic analysis of ITS by maximum parsimony analysis showed that the majority of the specimens were grouped into one cluster (cluster I), separated from the other cluster (cluster II) consisting of alpine specimens. The aconitine alkaloid content of the tuberous roots of 76 specimens showed two aspects: specimens from the same collection site showed similar aconitine alkaloid profiles, and cluster I specimens from different habitats showed various alkaloid profiles. Environmental pressure of each habitat is presumed to have caused the morphology and aconitine alkaloid profile of these genetically similar specimens to diversify.

  7. Differences in Pedaling Technique in Cycling: A Cluster Analysis.

    PubMed

    Lanferdini, Fábio J; Bini, Rodrigo R; Figueiredo, Pedro; Diefenthaeler, Fernando; Mota, Carlos B; Arndt, Anton; Vaz, Marco A

    2016-10-01

    To employ cluster analysis to assess if cyclists would opt for different strategies in terms of neuromuscular patterns when pedaling at the power output of their second ventilatory threshold (PO VT2) compared with cycling at their maximal power output (PO MAX). Twenty athletes performed an incremental cycling test to determine their power output (PO MAX and PO VT2; first session), and pedal forces, muscle activation, muscle-tendon unit length, and vastus lateralis architecture (fascicle length, pennation angle, and muscle thickness) were recorded (second session) in PO MAX and PO VT2. Athletes were assigned to 2 clusters based on the behavior of outcome variables at PO VT2 and PO MAX using cluster analysis. Clusters 1 (n = 14) and 2 (n = 6) showed similar power output and oxygen uptake. Cluster 1 presented larger increases in pedal force and knee power than cluster 2, without differences for the index of effectiveness. Cluster 1 presented less variation in knee angle, muscle-tendon unit length, pennation angle, and tendon length than cluster 2. However, clusters 1 and 2 showed similar muscle thickness, fascicle length, and muscle activation. When cycling at PO VT2 vs PO MAX, cyclists could opt for keeping a constant knee power and pedal-force production, associated with an increase in tendon excursion and a constant fascicle length. Increases in power output lead to greater variations in knee angle, muscle-tendon unit length, tendon length, and pennation angle of vastus lateralis for a similar knee-extensor activation and smaller pedal-force changes in cyclists from cluster 2 than in cluster 1.

  8. An improved clustering algorithm based on reverse learning in intelligent transportation

    NASA Astrophysics Data System (ADS)

    Qiu, Guoqing; Kou, Qianqian; Niu, Ting

    2017-05-01

    With the development of artificial intelligence and data mining technology, big data has gradually entered people's field of vision. In the processing of large data sets, clustering is an important method. The reverse learning method is introduced into the clustering process of the PAM clustering algorithm to overcome the limitations of one-time clustering in unsupervised clustering learning and to increase the diversity of the clusters, so as to improve the quality of clustering. The algorithm analysis and experimental results show that the algorithm is feasible.

  9. Interactive visual exploration and refinement of cluster assignments.

    PubMed

    Kern, Michael; Lex, Alexander; Gehlenborg, Nils; Johnson, Chris R

    2017-09-12

    With ever-increasing amounts of data produced in biology research, scientists are in need of efficient data analysis methods. Cluster analysis, combined with visualization of the results, is one such method that can be used to make sense of large data volumes. At the same time, cluster analysis is known to be imperfect and depends on the choice of algorithms, parameters, and distance measures. Most clustering algorithms don't properly account for ambiguity in the source data, as records are often assigned to discrete clusters, even if an assignment is unclear. While there are metrics and visualization techniques that allow analysts to compare clusterings or to judge cluster quality, there is no comprehensive method that allows analysts to evaluate, compare, and refine cluster assignments based on the source data, derived scores, and contextual data. In this paper, we introduce a method that explicitly visualizes the quality of cluster assignments, allows comparisons of clustering results and enables analysts to manually curate and refine cluster assignments. Our methods are applicable to matrix data clustered with partitional, hierarchical, and fuzzy clustering algorithms. Furthermore, we enable analysts to explore clustering results in context of other data, for example, to observe whether a clustering of genomic data results in a meaningful differentiation in phenotypes. Our methods are integrated into Caleydo StratomeX, a popular, web-based, disease subtype analysis tool. We show in a usage scenario that our approach can reveal ambiguities in cluster assignments and produce improved clusterings that better differentiate genotypes and phenotypes.

  10. Globular Cluster Abundances from High-resolution, Integrated-light Spectroscopy. II. Expanding the Metallicity Range for Old Clusters and Updated Analysis Techniques

    NASA Astrophysics Data System (ADS)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew

    2017-01-01

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = -0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements. This paper includes data gathered with the 6.5 m Magellan Telescopes located at Las Campanas Observatory, Chile.

  11. Clinical Characteristics of Exacerbation-Prone Adult Asthmatics Identified by Cluster Analysis.

    PubMed

    Kim, Mi Ae; Shin, Seung Woo; Park, Jong Sook; Uh, Soo Taek; Chang, Hun Soo; Bae, Da Jeong; Cho, You Sook; Park, Hae Sim; Yoon, Ho Joo; Choi, Byoung Whui; Kim, Yong Hoon; Park, Choon Sik

    2017-11-01

    Asthma is a heterogeneous disease characterized by various types of airway inflammation and obstruction. Therefore, it is classified into several subphenotypes, such as early-onset atopic, obese non-eosinophilic, benign, and eosinophilic asthma, using cluster analysis. A number of asthmatics frequently experience exacerbation over a long-term follow-up period, but the exacerbation-prone subphenotype has rarely been evaluated by cluster analysis. This prompted us to identify clusters reflecting asthma exacerbation. A uniform cluster analysis method was applied to 259 adult asthmatics who were regularly followed up for over 1 year using 12 variables, selected on the basis of their contribution to asthma phenotypes. After clustering, clinical profiles and exacerbation rates during follow-up were compared among the clusters. Four subphenotypes were identified: cluster 1 comprised patients with early-onset atopic asthma with preserved lung function, cluster 2 late-onset non-atopic asthma with impaired lung function, cluster 3 early-onset atopic asthma with severely impaired lung function, and cluster 4 late-onset non-atopic asthma with well-preserved lung function. The patients in clusters 2 and 3 were identified as exacerbation-prone asthmatics, showing a higher risk of asthma exacerbation. Two different phenotypes of exacerbation-prone asthma were identified among Korean asthmatics using cluster analysis; both were characterized by impaired lung function, but the age at asthma onset and atopic status were different between the two. Copyright © 2017 The Korean Academy of Asthma, Allergy and Clinical Immunology · The Korean Academy of Pediatric Allergy and Respiratory Disease

  12. Identification of five chronic obstructive pulmonary disease subgroups with different prognoses in the ECLIPSE cohort using cluster analysis.

    PubMed

    Rennard, Stephen I; Locantore, Nicholas; Delafont, Bruno; Tal-Singer, Ruth; Silverman, Edwin K; Vestbo, Jørgen; Miller, Bruce E; Bakke, Per; Celli, Bartolomé; Calverley, Peter M A; Coxson, Harvey; Crim, Courtney; Edwards, Lisa D; Lomas, David A; MacNee, William; Wouters, Emiel F M; Yates, Julie C; Coca, Ignacio; Agustí, Alvar

    2015-03-01

    Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease that likely includes clinically relevant subgroups. To identify subgroups of COPD in ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) subjects using cluster analysis and to assess clinically meaningful outcomes of the clusters during 3 years of longitudinal follow-up. Factor analysis was used to reduce 41 variables determined at recruitment in 2,164 patients with COPD to 13 main factors, and the variables with the highest loading were used for cluster analysis. Clusters were evaluated for their relationship with clinically meaningful outcomes during 3 years of follow-up. The relationships among clinical parameters were evaluated within clusters. Five subgroups were distinguished using cross-sectional clinical features. These groups differed regarding outcomes. Cluster A included patients with milder disease and had fewer deaths and hospitalizations. Cluster B had less systemic inflammation at baseline but had notable changes in health status and emphysema extent. Cluster C had many comorbidities, evidence of systemic inflammation, and the highest mortality. Cluster D had low FEV1, severe emphysema, and the highest exacerbation and COPD hospitalization rate. Cluster E was intermediate for most variables and may represent a mixed group that includes further clusters. The relationships among clinical variables within clusters differed from that in the entire COPD population. Cluster analysis using baseline data in ECLIPSE identified five COPD subgroups that differ in outcomes and inflammatory biomarkers and show different relationships between clinical parameters, suggesting the clusters represent clinically and biologically different subtypes of COPD.

  13. GLOBULAR CLUSTER ABUNDANCES FROM HIGH-RESOLUTION, INTEGRATED-LIGHT SPECTROSCOPY. II. EXPANDING THE METALLICITY RANGE FOR OLD CLUSTERS AND UPDATED ANALYSIS TECHNIQUES

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Colucci, Janet E.; Bernstein, Rebecca A.; McWilliam, Andrew

    2017-01-10

    We present abundances of globular clusters (GCs) in the Milky Way and Fornax from integrated-light (IL) spectra. Our goal is to evaluate the consistency of the IL analysis relative to standard abundance analysis for individual stars in those same clusters. This sample includes an updated analysis of seven clusters from our previous publications and results for five new clusters that expand the metallicity range over which our technique has been tested. We find that the [Fe/H] measured from IL spectra agrees to ∼0.1 dex for GCs with metallicities as high as [Fe/H] = −0.3, but the abundances measured for more metal-rich clusters may be underestimated. In addition we systematically evaluate the accuracy of abundance ratios, [X/Fe], for Na I, Mg I, Al I, Si I, Ca I, Ti I, Ti II, Sc II, V I, Cr I, Mn I, Co I, Ni I, Cu I, Y II, Zr I, Ba II, La II, Nd II, and Eu II. The elements for which the IL analysis gives results that are most similar to analysis of individual stellar spectra are Fe I, Ca I, Si I, Ni I, and Ba II. The elements that show the greatest differences include Mg I and Zr I. Some elements show good agreement only over a limited range in metallicity. More stellar abundance data in these clusters would enable more complete evaluation of the IL results for other important elements.

  14. DICON: interactive visual analysis of multidimensional clusters.

    PubMed

    Cao, Nan; Gotz, David; Sun, Jimeng; Qu, Huamin

    2011-12-01

    Clustering as a fundamental data analysis technique has been widely used in many analytic applications. However, it is often difficult for users to understand and evaluate multidimensional clustering results, especially the quality of clusters and their semantics. For large and complex data, high-level statistical information about the clusters is often needed for users to evaluate cluster quality while a detailed display of multidimensional attributes of the data is necessary to understand the meaning of clusters. In this paper, we introduce DICON, an icon-based cluster visualization that embeds statistical information into a multi-attribute display to facilitate cluster interpretation, evaluation, and comparison. We design a treemap-like icon to represent a multidimensional cluster, and the quality of the cluster can be conveniently evaluated with the embedded statistical information. We further develop a novel layout algorithm which can generate similar icons for similar clusters, making comparisons of clusters easier. User interaction and clutter reduction are integrated into the system to help users more effectively analyze and refine clustering results for large datasets. We demonstrate the power of DICON through a user study and a case study in the healthcare domain. Our evaluation shows the benefits of the technique, especially in support of complex multidimensional cluster analysis. © 2011 IEEE

  15. Morphological and Inter Simple Sequence Repeat (ISSR) markers analyses of Corynespora cassiicola isolates from rubber plantations in Malaysia.

    PubMed

    Nghia, Nguyen Anh; Kadir, Jugah; Sunderasan, E; Puad Abdullah, Mohd; Malik, Adam; Napis, Suhaimi

    2008-10-01

    Morphological features and Inter Simple Sequence Repeat (ISSR) polymorphism were employed to analyse 21 Corynespora cassiicola isolates obtained from a number of Hevea clones grown in rubber plantations in Malaysia. The C. cassiicola isolates used in this study were collected from several states in Malaysia from 1998 to 2005. The morphology of the isolates was characteristic of that previously described for C. cassiicola. Variations in colony and conidial morphology were observed not only among isolates but also within a single isolate with no inclination to either clonal or geographical origin of the isolates. ISSR analysis delineated the isolates into two distinct clusters. The dendrogram created from UPGMA analysis based on Nei and Li's coefficient (calculated from the binary matrix data of 106 amplified DNA bands generated from 8 ISSR primers) showed that cluster 1 encompasses 12 isolates from the states of Johor and Selangor (this cluster was further split into 2 sub clusters, 1A and 1B, with sub cluster 1B consisting of a unique isolate, CKT05D), while cluster 2 comprises 9 isolates that were obtained from the other states. Detached leaf assay performed on selected Hevea clones showed that the pathogenicity of representative isolates from cluster 1 (with the exception of CKT05D) resembled that of race 1; and isolates in cluster 2 showed pathogenicity similar to race 2 of the fungus that was previously identified in Malaysia. The isolate CKT05D from sub cluster 1B showed pathogenicity dissimilar to either race 1 or race 2.
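
    For readers unfamiliar with the coefficient named in this abstract, Nei and Li's similarity between two band profiles is twice the number of shared bands divided by the total number of bands in both profiles. The sketch below computes it from presence/absence vectors; the isolate names and band patterns are hypothetical and are not the study's 106-band matrix.

```python
def nei_li_similarity(profile_a, profile_b):
    """Nei and Li's coefficient: 2 * shared bands / (bands in a + bands in b)."""
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    total = sum(profile_a) + sum(profile_b)
    return 2 * shared / total if total else 0.0


# Hypothetical presence (1) / absence (0) scores for six ISSR bands.
isolate_x = [1, 1, 0, 1, 0, 1]
isolate_y = [1, 0, 0, 1, 1, 1]

similarity = nei_li_similarity(isolate_x, isolate_y)
# A distance such as 1 - similarity is the kind of input UPGMA would then cluster.
distance = 1 - similarity
```

    Computing this value for every pair of isolates gives the pairwise matrix from which a UPGMA dendrogram can be built.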

  16. Multiscale visual quality assessment for cluster analysis with self-organizing maps

    NASA Astrophysics Data System (ADS)

    Bernard, Jürgen; von Landesberger, Tatiana; Bremm, Sebastian; Schreck, Tobias

    2011-01-01

    Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better understanding the characteristics and relationships among the found clusters. While promising approaches to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained clustering results. However, due to the nature of the clustering process, quality is an important aspect, as for most practical data sets, typically many different clusterings are possible. Being aware of clustering quality is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with refined parameters, among others. In this work, we present an encompassing suite of visual tools for quality assessment of an important visual cluster algorithm, namely, the Self-Organizing Map (SOM) technique. We define, measure, and visualize the notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with output of additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated system, apply it on experimental data sets, and show its applicability.

  17. Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.

    PubMed

    van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim

    2017-01-01

    In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by differing responses of their disease to external influences. However, both cluster outcomes based on this dataset showed poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
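
    The "silhouette measure" cited in this abstract compares, for each record, the mean distance to its own cluster (a) with the mean distance to the nearest other cluster (b), giving (b - a) / max(a, b). The sketch below averages that score over a toy data set; the points and labels are hypothetical, Euclidean distance is an assumption, and the abstract does not say which software or distance the authors used.

```python
import math


def mean_silhouette(points, labels):
    """Average silhouette width; requires at least two distinct labels."""
    scores = []
    for i, p in enumerate(points):
        own = [
            math.dist(p, q)
            for j, q in enumerate(points)
            if labels[j] == labels[i] and j != i
        ]
        if not own:
            scores.append(0.0)  # common convention for a single-member cluster
            continue
        a = sum(own) / len(own)  # mean distance within the record's own cluster
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels)
            if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical one-dimensional records forming two obvious groups.
points = [(0,), (1,), (10,), (11,)]
well_separated = mean_silhouette(points, [0, 0, 1, 1])
badly_assigned = mean_silhouette(points, [0, 1, 0, 1])
```

    Values close to 1 indicate compact, well-separated clusters, while values near 0 or below, such as the 0.2 reported above, point to weak or overlapping structure.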

  18. Descriptive Statistics and Cluster Analysis for Extreme Rainfall in Java Island

    NASA Astrophysics Data System (ADS)

    E Komalasari, K.; Pawitan, H.; Faqih, A.

    2017-03-01

    This study aims to describe the regional pattern of extreme rainfall based on maximum daily rainfall for the period 1983 to 2012 on Java Island. Descriptive statistical analysis was performed to obtain the central tendency, variation and distribution of the maximum precipitation data. Mean and median are used to measure central tendency, while the interquartile range (IQR) and standard deviation are used to measure variation. In addition, skewness and kurtosis are used to describe the shape of the distribution of the rainfall data. Cluster analysis using squared Euclidean distance and Ward's method is applied to perform regional grouping. The results show that the mean of maximum daily rainfall in the Java region during 1983-2012 is around 80-181 mm, with a median between 75 and 160 mm and a standard deviation between 17 and 82. Cluster analysis produces four clusters and shows that the western area of Java tends to have higher annual maxima of daily rainfall than the northern area, and more variability in annual maximum values.
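
    The descriptive measures listed in this abstract can be sketched with the Python standard library as follows. The rainfall values are invented for illustration, the moment-based skewness and excess kurtosis formulas are one common choice among several, and nothing here reproduces the study's data or software.

```python
import statistics


def describe(values):
    """Central tendency, spread and shape of one station's annual maxima."""
    mean = statistics.fmean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    n = len(values)
    # Central moments (population form) used for the shape measures.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


def squared_euclidean(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical annual maximum daily rainfall (mm) at a single station.
annual_maxima_mm = [60, 75, 80, 90, 95, 110, 120, 170]
summary = describe(annual_maxima_mm)
```

    Stations could then be compared on a vector of such summaries with `squared_euclidean`, the distance the abstract pairs with Ward's method for the regional grouping.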

  19. Prediction models for clustered data: comparison of a random intercept and standard regression model

    PubMed Central

    2013-01-01

    Background: When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Methods: Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. Results: The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. Conclusion: The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters. PMID:23414436

  20. Prediction models for clustered data: comparison of a random intercept and standard regression model.

    PubMed

    Bouwmeester, Walter; Twisk, Jos W R; Kappen, Teus H; van Klei, Wilton A; Moons, Karel G M; Vergouwe, Yvonne

    2013-02-15

    When study data are clustered, standard regression analysis is considered inappropriate and analytical techniques for clustered data need to be used. For prediction research in which the interest of predictor effects is on the patient level, random effect regression models are probably preferred over standard regression analysis. It is well known that the random effect parameter estimates and the standard logistic regression parameter estimates are different. Here, we compared random effect and standard logistic regression models for their ability to provide accurate predictions. Using an empirical study on 1642 surgical patients at risk of postoperative nausea and vomiting, who were treated by one of 19 anesthesiologists (clusters), we developed prognostic models either with standard or random intercept logistic regression. External validity of these models was assessed in new patients from other anesthesiologists. We supported our results with simulation studies using intra-class correlation coefficients (ICC) of 5%, 15%, or 30%. Standard performance measures and measures adapted for the clustered data structure were estimated. The model developed with random effect analysis showed better discrimination than the standard approach, if the cluster effects were used for risk prediction (standard c-index of 0.69 versus 0.66). In the external validation set, both models showed similar discrimination (standard c-index 0.68 versus 0.67). The simulation study confirmed these results. For datasets with a high ICC (≥15%), model calibration was only adequate in external subjects, if the used performance measure assumed the same data structure as the model development method: standard calibration measures showed good calibration for the standard developed model, calibration measures adapting the clustered data structure showed good calibration for the prediction model with random intercept. 
    The models with random intercept discriminate better than the standard model only if the cluster effect is used for predictions. The prediction model with random intercept had good calibration within clusters.
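
    To give a feel for the ICC values used in the simulations (5%, 15%, 30%), the sketch below converts an ICC into a random-intercept variance and back. It assumes the widely used latent-variable definition for logistic random-intercept models, ICC = var_u / (var_u + pi^2 / 3); the abstract does not state which definition the authors applied, so this is an illustration rather than their method.

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# residual variance under the latent-variable formulation (an assumption here).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def intercept_variance(icc):
    """Random-intercept variance implied by a given ICC (0 <= icc < 1)."""
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1 - icc)


def icc_from_variance(var_u):
    """ICC implied by a given random-intercept variance."""
    return var_u / (var_u + LOGISTIC_RESIDUAL_VARIANCE)


# The three ICC levels mentioned in the abstract.
variances = {icc: intercept_variance(icc) for icc in (0.05, 0.15, 0.30)}
```

    Under this definition a larger ICC means a larger spread of cluster-specific intercepts, which is why using the cluster effect matters more for prediction as the ICC grows.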

  1. Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques

    NASA Astrophysics Data System (ADS)

    Gulgundi, Mohammad Shahid; Shetty, Amba

    2018-03-01

    Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of Bengaluru city using multivariate statistical techniques. A water quality index rating was calculated for the pre- and post-monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show poorer quality for drinking purposes than the pre-monsoon samples. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical cluster analysis (CA) grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. Discriminant analysis (DA) was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between two clusters and accounting for 89% spatial assignation of cases. Principal component analysis was applied to the dataset obtained from the two clusters, which evolved three factors in each cluster, explaining 85.4 and 84% of the total variance, respectively. Varifactors obtained from principal component analysis showed that groundwater quality variation is mainly explained by dissolution of minerals from rock water interactions in the aquifer, effect of anthropogenic activities and ion exchange processes in water.

  2. Evaluation of Potential Locations for Siting Small Modular Reactors near Federal Energy Clusters to Support Federal Clean Energy Goals

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Belles, Randy J.; Omitaomu, Olufemi A.

    2014-09-01

    Geographic information systems (GIS) technology was applied to analyze federal energy demand across the contiguous US. Several federal energy clusters were previously identified, including Hampton Roads, Virginia, which was subsequently studied in detail. This study provides an analysis of three additional diverse federal energy clusters. The analysis shows that there are potential sites in various federal energy clusters that could be evaluated further for placement of an integral pressurized-water reactor (iPWR) to support meeting federal clean energy goals.

  3. Cross sectional TEM analysis of duplex HIPIMS and DC magnetron sputtered Mo and W doped carbon coatings

    NASA Astrophysics Data System (ADS)

    Sharp, J.; Castillo Muller, I.; Mandal, P.; Abbas, A.; West, G.; Rainforth, W. M.; Ehiasarian, A.; Hovsepian, P.

    2015-10-01

    A FIB lift-out sample was made from a wear-resistant carbon coating deposited by high power impulse magnetron sputtering (HIPIMS) with Mo and W. TEM analysis found columnar grains extending the whole ∼1800 nm thick film. Within the grains, the carbon was found to be organised into clusters showing some onion-like structure, with amorphous material between them; energy dispersive X-ray spectroscopy (EDS) found these clusters to be Mo- and W-rich in a later, thinner sample of the same material. Electron energy-loss spectroscopy (EELS) showed no difference in C-K edge, implying the bonding type to be the same in cluster and matrix. These clusters were arranged into stripes parallel to the film plane, of spacing 7-8 nm; there was a modulation in spacing between clusters within these stripes that produced a second, coarser set of striations of spacing ∼37 nm.

  4. On the blind use of statistical tools in the analysis of globular cluster stars

    NASA Astrophysics Data System (ADS)

    D'Antona, Francesca; Caloi, Vittoria; Tailo, Marco

    2018-04-01

    As with most data analysis methods, the Bayesian method must be handled with care. We show that its application to determine stellar evolution parameters within globular clusters can lead to paradoxical results if used without the necessary precautions. This is a cautionary tale on the use of statistical tools for big data analysis.

  5. The Technical and Biological Reproducibility of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Based Typing: Employment of Bioinformatics in a Multicenter Study.

    PubMed

    Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian

    2016-01-01

    The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Technical and biological reproducibility ranged between 96.8-99.4% and 47.6-94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable.

  6. The association between content of the elements S, Cl, K, Fe, Cu, Zn and Br in normal and cirrhotic liver tissue from Danes and Greenlandic Inuit examined by dual hierarchical clustering analysis.

    PubMed

    Laursen, Jens; Milman, Nils; Pind, Niels; Pedersen, Henrik; Mulvad, Gert

    2014-01-01

    Meta-analysis of previous studies evaluating associations between content of elements sulphur (S), chlorine (Cl), potassium (K), iron (Fe), copper (Cu), zinc (Zn) and bromine (Br) in normal and cirrhotic autopsy liver tissue samples. Normal liver samples from 45 Greenlandic Inuit, median age 60 years and from 71 Danes, median age 61 years. Cirrhotic liver samples from 27 Danes, median age 71 years. Element content was measured using X-ray fluorescence spectrometry. Dual hierarchical clustering analysis, creating a dual dendrogram, one clustering element contents according to calculated similarities, one clustering elements according to correlation coefficients between the element contents, both using Euclidean distance and Ward's procedure. One dendrogram separated subjects into 7 clusters showing no differences in ethnicity, gender or age. The analysis discriminated between elements in normal and cirrhotic livers. The other dendrogram clustered elements into four clusters: sulphur and chlorine; copper and bromine; potassium and zinc; iron. There were significant correlations between the elements in normal liver samples: S was associated with Cl, K, Br and Zn; Cl with S and Br; K with S, Br and Zn; Cu with Br; Zn with S and K; Br with S, Cl, K and Cu. Fe did not show significant associations with any other element. In contrast to simple statistical methods, which analyse the content of elements separately one by one, dual hierarchical clustering analysis incorporates all elements at the same time and can be used to examine the linkage and interplay between multiple elements in tissue samples. Copyright © 2013 Elsevier GmbH. All rights reserved.

  7. Periorbital melasma: Hierarchical cluster analysis of clinical features in Asian patients.

    PubMed

    Jung, Y S; Bae, J M; Kim, B J; Kang, J-S; Cho, S B

    2017-11-01

    Studies have shown melasma lesions to be distributed across the face in centrofacial, malar, and mandibular patterns. Meanwhile, however, melasma lesions of the periorbital area have yet to be thoroughly described. We analyzed normal and ultraviolet light-exposed photographs of patients with melasma. The periorbital melasma lesions were measured according to anatomical reference points and a hierarchical cluster analysis was performed. The periorbital melasma lesions showed clinical features of fine and homogenous melasma pigmentation, involving both the upper and lower eyelids that extended to other anatomical sites with a darker and coarser appearance. The hierarchical cluster analysis indicated that patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. Significant differences between cluster 1 and cluster 2 were found in lateral distance and inferolateral distance, but not in medial distance and superior distance. Comparing the two clusters, patients in cluster 2 were found to be significantly older and more commonly accompanied by melasma lesions of the temple and medial cheek. Our hierarchical cluster analysis of periorbital melasma lesions demonstrated that Asian patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. Changing cluster composition in cluster randomised controlled trials: design and analysis considerations

    PubMed Central

    2014-01-01

    Background: There are many methodological challenges in the conduct and analysis of cluster randomised controlled trials, but one that has received little attention is that of post-randomisation changes to cluster composition. To illustrate this, we focus on the issue of cluster merging, considering the impact on the design, analysis and interpretation of trial outcomes. Methods: We explored the effects of merging clusters on study power using standard methods of power calculation. We assessed the potential impacts on study findings of both homogeneous cluster merges (involving clusters randomised to the same arm of a trial) and heterogeneous merges (involving clusters randomised to different arms of a trial) by simulation. To determine the impact on bias and precision of treatment effect estimates, we applied standard methods of analysis to different populations under analysis. Results: Cluster merging produced a systematic reduction in study power. This effect depended on the number of merges and was most pronounced when variability in cluster size was at its greatest. Simulations demonstrated that the impact on analysis was minimal when cluster merges were homogeneous, with impact on study power being balanced by a change in observed intracluster correlation coefficient (ICC). We found a decrease in study power when cluster merges were heterogeneous, and the estimate of treatment effect was attenuated. Conclusions: Examples of cluster merges found in previously published reports of cluster randomised trials were typically homogeneous rather than heterogeneous. Simulations demonstrated that trial findings in such cases would be unbiased. However, simulations also showed that any heterogeneous cluster merges would introduce bias that would be hard to quantify, as well as having negative impacts on the precision of estimates obtained. Further methodological development is warranted to better determine how to analyse such trials appropriately. Interim recommendations include avoidance of cluster merges where possible, discontinuation of clusters following heterogeneous merges, allowance for potential loss of clusters and additional variability in cluster size in the original sample size calculation, and use of appropriate ICC estimates that reflect cluster size. PMID:24884591
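
    The link between cluster merging, cluster-size variability and power reported in this abstract can be illustrated with a small calculation. The sketch below is not the authors' code: it assumes the commonly used design-effect approximation for unequal cluster sizes, DE = 1 + ((cv^2 + 1) * m_bar - 1) * ICC, holds the ICC fixed for simplicity (the abstract notes that the observed ICC can change after merges), and uses hypothetical cluster sizes.

```python
from statistics import mean, pstdev


def design_effect(cluster_sizes, icc):
    """Design effect allowing for unequal cluster sizes (assumed formula):
    DE = 1 + ((cv**2 + 1) * m_bar - 1) * icc,
    where m_bar is the mean cluster size and cv its coefficient of variation.
    """
    m_bar = mean(cluster_sizes)
    cv = pstdev(cluster_sizes) / m_bar
    return 1 + ((cv ** 2 + 1) * m_bar - 1) * icc


def effective_sample_size(cluster_sizes, icc):
    """Total number of participants divided by the design effect."""
    return sum(cluster_sizes) / design_effect(cluster_sizes, icc)


# Hypothetical trial: 10 clusters of 20 participants, ICC = 0.05.
before = [20] * 10
# Two pairs of clusters merge, leaving 8 clusters of unequal size.
after = [40, 40] + [20] * 6

ess_before = effective_sample_size(before, icc=0.05)
ess_after = effective_sample_size(after, icc=0.05)
print(f"effective sample size before merging: {ess_before:.1f}")
print(f"effective sample size after merging:  {ess_after:.1f}")
```

    With the same 200 participants, the merged design has fewer, larger and more variable clusters, so its effective sample size is smaller. This matches the direction of the effect described in the abstract, though not its simulation-based magnitude.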

  9. [Identification of different Citrus sinensis (L.) Osbeck trees varieties using Fourier transform infrared spectroscopy and hierarchical cluster analysis].

    PubMed

    Yi, Shi-Lai; Deng, Lie; He, Shao-Lan; Shi, You-Ming; Zheng, Yong-Qiang; Lu, Qiang; Xie, Rang-Jin; Wei, Xian-Guoi; Li, Song-Wei; Jian, Shui-Xian

    2012-11-01

    Spring leaf samples of seven Citrus sinensis (L.) Osbeck varieties were studied by Fourier transform infrared (FTIR) spectroscopy. The FTIR spectra of the leaves of all seven varieties were composed mainly of the absorption bands of cellulose and polysaccharides, and the wavenumbers of the characteristic absorption peaks were similar across the spectra. However, there were some differences in peak shape and relative absorption intensity. The most conspicuous differences appeared in the second derivative spectra in the region between 1500 and 700 cm(-1). Hierarchical cluster analysis (HCA) of the second derivative spectra between 1500 and 700 cm(-1) showed that the Citrus sinensis (L.) Osbeck varieties clustered according to their genetic relationships. The results showed that FTIR spectroscopy combined with hierarchical cluster analysis could be used to identify and classify citrus varieties rapidly, and that the method could be extended to the study of young leaves of orange seedlings.

  10. [Optimization of cluster analysis based on drug resistance profiles of MRSA isolates].

    PubMed

    Tani, Hiroya; Kishi, Takahiko; Gotoh, Minehiro; Yamagishi, Yuka; Mikamo, Hiroshige

    2015-12-01

    We examined 402 methicillin-resistant Staphylococcus aureus (MRSA) strains isolated from clinical specimens in our hospital between November 19, 2010 and December 27, 2011 to evaluate the similarity between cluster analysis of drug susceptibility tests and pulsed-field gel electrophoresis (PFGE). The results showed that the 402 strains tested were classified into 27 PFGE patterns (151 subtypes of patterns). Cluster analyses of drug susceptibility tests with the cut-off distance yielding a similar classification capability showed favorable results. Using the MIC method, in which minimum inhibitory concentration (MIC) values were used directly, the level of agreement with PFGE was 74.2% when 15 drugs were tested. The Unweighted Pair Group Method with Arithmetic mean (UPGMA) was effective when the cut-off distance was 16. Using the SIR method, in which susceptible (S), intermediate (I), and resistant (R) were coded as 0, 2, and 3, respectively, according to the Clinical and Laboratory Standards Institute (CLSI) criteria, the level of agreement with PFGE was 75.9% when the number of drugs tested was 17, the method used for clustering was UPGMA, and the cut-off distance was 3.6. In addition, to assess the reproducibility of the results, 10 strains were randomly sampled from the overall test and subjected to cluster analysis. This was repeated 100 times under the same conditions. The results indicated good reproducibility, with the level of agreement with PFGE showing a mean of 82.0%, standard deviation of 12.1%, and mode of 90.0% for the MIC method and a mean of 80.0%, standard deviation of 13.4%, and mode of 90.0% for the SIR method. In summary, cluster analysis of drug susceptibility tests is useful for the epidemiological analysis of MRSA.
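
    The SIR coding and UPGMA cut-off described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the abstract does not name the distance metric, so Euclidean distance is assumed here, and the four susceptibility profiles are hypothetical. Only the S/I/R codes (0, 2, 3) and the cut-off value of 3.6 come from the abstract.

```python
from itertools import combinations

SIR_CODE = {"S": 0, "I": 2, "R": 3}  # coding reported in the abstract


def encode(profile):
    """Turn a string such as 'SSRIR' into a numeric vector."""
    return [SIR_CODE[c] for c in profile]


def euclidean(a, b):
    # Assumption: the abstract does not state which distance was used.
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def upgma(profiles, cutoff):
    """Average-linkage (UPGMA) agglomerative clustering.

    Merging stops once the closest pair of clusters is farther apart than
    `cutoff`. Returns a sorted list of clusters, each a sorted list of indices.
    """
    points = [encode(p) for p in profiles]
    clusters = [[i] for i in range(len(points))]

    def avg_dist(c1, c2):
        # UPGMA distance: mean of all pairwise distances between members.
        total = sum(euclidean(points[i], points[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (a, b), d = min(
            (((i, j), avg_dist(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(sorted(c) for c in clusters)


# Hypothetical profiles for four isolates tested against five drugs.
isolates = ["SSSSS", "SSSSI", "RRRRR", "RRRIR"]
clusters_at_cutoff = upgma(isolates, cutoff=3.6)
print(clusters_at_cutoff)
```

    Lowering the cut-off splits the isolates into more groups, and raising it merges them, which is why the abstract reports the cut-off alongside each agreement figure.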

  11. Functional analysis of the upstream regulatory region of chicken miR-17-92 cluster.

    PubMed

    Cheng, Min; Zhang, Wen-jian; Xing, Tian-yu; Yan, Xiao-hong; Li, Yu-mao; Li, Hui; Wang, Ning

    2016-08-01

    The miR-17-92 cluster plays important roles in cell proliferation, differentiation, apoptosis, animal development and tumorigenesis. The transcriptional regulation of the miR-17-92 cluster has been extensively studied in mammals, but not in birds. To date, the genomic structure of the avian miR-17-92 cluster has not been fully determined. The promoter location and sequence of the miR-17-92 cluster have not been determined, due to the existence of a genomic gap sequence upstream of the miR-17-92 cluster in all the birds whose genomes have been sequenced. In this study, genome walking was used to close the genomic gap upstream of the chicken miR-17-92 cluster. In addition, bioinformatics analysis, reporter gene assay and truncation mutagenesis were used to investigate the functional role of the genomic gap sequence. Genome walking analysis showed that the gap region was 1704 bp long, and its GC content was 80.11%. Bioinformatics analysis showed that in the gap region, there was a 200 bp conserved sequence among the tested 10 species (Gallus gallus, Homo sapiens, Pan troglodytes, Bos taurus, Sus scrofa, Rattus norvegicus, Mus musculus, Possum, Danio rerio, Rana nigromaculata), which is the core promoter region of the mammalian miR-17-92 host gene (MIR17HG). A promoter luciferase reporter gene vector of the gap region was constructed and a reporter assay was performed. The result showed that the promoter activity of pGL3-cMIR17HG (-4228/-2506) was 417 times that of the negative control (empty pGL3 basic vector), suggesting that the chicken miR-17-92 cluster promoter exists in the gap region. To gain further insight into the promoter structure, two different truncations of the cloned gap sequence were generated by PCR. One had a truncation of 448 bp at the 5'-end and the other had a truncation of 894 bp at the 3'-end. Further reporter analysis showed that compared with the promoter activity of pGL3-cMIR17HG (-4228/-2506), the reporter activities of the 5'-end truncation and the 3'-end truncation were reduced by 19.82% and 60.14%, respectively. These data demonstrated that the important promoter region of the chicken miR-17-92 cluster is located in the -3400/-2506 bp region. Our results lay the foundation for revealing the transcriptional regulatory mechanisms of the chicken miR-17-92 cluster.

  12. Coagulation-fragmentation for a finite number of particles and application to telomere clustering in the yeast nucleus

    NASA Astrophysics Data System (ADS)

    Hozé, Nathanaël; Holcman, David

    2012-01-01

    We develop a coagulation-fragmentation model to study a system composed of a small number of stochastic objects moving in a confined domain, that can aggregate upon binding to form local clusters of arbitrary sizes. A cluster can also dissociate into two subclusters with a uniform probability. To study the statistics of clusters, we combine a Markov chain analysis with a partition number approach. Interestingly, we obtain explicit formulas for the size and the number of clusters in terms of hypergeometric functions. Finally, we apply our analysis to study the statistical physics of telomeres (ends of chromosomes) clustering in the yeast nucleus and show that the diffusion-coagulation-fragmentation process can predict the organization of telomeres.
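
    The kind of process described in this abstract can be illustrated with a toy simulation. The sketch below is a simplified discrete-time chain, not the authors' model: it ignores diffusion and the confined geometry, the coagulation probability `p_coag` and the particle count of 32 are illustrative assumptions, and only the rule that a cluster splits into two subclusters with uniform probability is taken from the abstract.

```python
import random


def step(clusters, p_coag, rng):
    """One move of a toy coagulation-fragmentation chain on cluster sizes."""
    clusters = list(clusters)
    if rng.random() < p_coag:
        # Coagulation: two randomly chosen clusters merge into one.
        if len(clusters) >= 2:
            i, j = rng.sample(range(len(clusters)), 2)
            merged = clusters[i] + clusters[j]
            clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
            clusters.append(merged)
    else:
        # Fragmentation: a cluster of size n splits into (k, n - k),
        # with k drawn uniformly from 1..n-1.
        splittable = [k for k, c in enumerate(clusters) if c >= 2]
        if splittable:
            k = rng.choice(splittable)
            size = clusters.pop(k)
            left = rng.randint(1, size - 1)
            clusters += [left, size - left]
    return clusters


def simulate(n_particles, n_steps, p_coag=0.5, seed=0):
    """Start from isolated particles and record the state after every step."""
    rng = random.Random(seed)
    clusters = [1] * n_particles
    history = [clusters]
    for _ in range(n_steps):
        clusters = step(clusters, p_coag, rng)
        history.append(clusters)
    return history


history = simulate(n_particles=32, n_steps=1000)
mean_cluster_count = sum(len(state) for state in history) / len(history)
print(f"time-averaged number of clusters: {mean_cluster_count:.2f}")
```

    Every state is a partition of the fixed particle number, which is the setting in which a partition-number approach to the cluster statistics applies.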

  13. Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

    PubMed

    Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

    2017-07-01

    Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data-driven approaches for water quality assessment. Cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal waters of the Bohai Sea and North Yellow Sea of China, and apply the clustering results to evaluate the water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
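
    Why correlation between variables matters for the choice of distance can be shown with a two-variable example. The sketch below is illustrative only and is not the authors' implementation: the sample values are hypothetical, and the covariance matrix is inverted by hand for the 2x2 case.

```python
from statistics import mean


def covariance_2d(points):
    """Sample covariance terms (sxx, sxy, syy) of two-variable data."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy


def euclidean(a, b):
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def mahalanobis(a, b, cov):
    """sqrt(d^T S^-1 d), using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = a[0] - b[0], a[1] - b[1]
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return q ** 0.5


# Hypothetical measurements of two strongly correlated variables.
samples = [(1, 2.0), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8)]
cov = covariance_2d(samples)

a = (3, 6)
along = (4, 8)    # displaced along the correlation trend
across = (4, 4)   # displaced against the trend

print(euclidean(a, along), euclidean(a, across))
print(mahalanobis(a, along, cov), mahalanobis(a, across, cov))
```

    Both displaced points are equally far from `a` in Euclidean terms, but the Mahalanobis distance treats the point that breaks the correlation pattern as much farther away. A hierarchical clustering built on that distance would therefore group samples differently when variables are correlated.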

  14. Phylodynamic Analysis Reveals CRF01_AE Dissemination between Japan and Neighboring Asian Countries and the Role of Intravenous Drug Use in Transmission

    PubMed Central

    Shiino, Teiichiro; Hattori, Junko; Yokomaku, Yoshiyuki; Iwatani, Yasumasa; Sugiura, Wataru

    2014-01-01

    Background: One major circulating HIV-1 subtype in Southeast Asian countries is CRF01_AE, but little is known about its epidemiology in Japan. We conducted a molecular phylodynamic study of patients newly diagnosed with CRF01_AE from 2003 to 2010. Methods: Plasma samples from patients registered in the Japanese Drug Resistance HIV-1 Surveillance Network were analyzed for protease-reverse transcriptase sequences; all sequences underwent subtyping and phylogenetic analysis using distance-matrix-based, maximum likelihood and Bayesian coalescent Markov Chain Monte Carlo (MCMC) phylogenetic inferences. Transmission clusters were identified using the interior branch test and depth-first searches for sub-tree partitions. Times of most recent common ancestor (tMRCAs) of significant clusters were estimated using Bayesian MCMC analysis. Results: Among 3618 patients registered in our network, 243 were infected with CRF01_AE. The majority of individuals with CRF01_AE were Japanese, predominantly male, and reported heterosexual contact as their risk factor. We found 5 large clusters with ≥5 members and 25 small clusters consisting of pairs of individuals with highly related CRF01_AE strains. The earliest cluster showed a tMRCA of 1996, and consisted of individuals with their known risk as heterosexual contacts. The other four large clusters showed later tMRCAs between 2000 and 2002 with members including intravenous drug users (IVDU) and non-Japanese, but not men who have sex with men (MSM). In contrast, small clusters included a high frequency of individuals reporting MSM risk factors. Phylogenetic analysis also showed that some individuals were infected with HIV strains that had spread in East and South-eastern Asian countries. Conclusions: Introduction of CRF01_AE viruses into Japan is estimated to have occurred in the 1990s. CRF01_AE spread via heterosexual behavior, then among persons connected with non-Japanese, IVDU, and MSM. Phylogenetic analysis demonstrated that some viral variants are largely restricted to Japan, while others have a broad geographic distribution. PMID:25025900

  15. Cluster Analysis of Acute Care Use Yields Insights for Tailored Pediatric Asthma Interventions.

    PubMed

    Abir, Mahshid; Truchil, Aaron; Wiest, Dawn; Nelson, Daniel B; Goldstick, Jason E; Koegel, Paul; Lozon, Marie M; Choi, Hwajung; Brenner, Jeffrey

    2017-09-01

    We undertake this study to understand patterns of pediatric asthma-related acute care use to inform interventions aimed at reducing potentially avoidable hospitalizations. Hospital claims data from 3 Camden city facilities for 2010 to 2014 were used to perform cluster analysis classifying patients aged 0 to 17 years according to their asthma-related hospital use. Clusters were based on 2 variables: asthma-related ED visits and hospitalizations. Demographics and a number of sociobehavioral and use characteristics were compared across clusters. Children who met the criteria (3,170) were included in the analysis. An examination of a scree plot showing the decline in within-cluster heterogeneity as the number of clusters increased confirmed that clusters of pediatric asthma patients according to hospital use exist in the data. Five clusters of patients with distinct asthma-related acute care use patterns were observed. Cluster 1 (62% of patients) showed the lowest rates of acute care use. These patients were least likely to have a mental health-related diagnosis, were less likely to have visited multiple facilities, and had no hospitalizations for asthma. Cluster 2 (19% of patients) had a low number of asthma ED visits and a one-time hospitalization. Cluster 3 (11% of patients) had a high number of ED visits and low hospitalization rates, and the highest rates of multiple facility use. Cluster 4 (7% of patients) had moderate ED use for both asthma and other illnesses, and high rates of asthma hospitalizations; nearly one quarter received care at all facilities, and 1 in 10 had a mental health diagnosis. Cluster 5 (1% of patients) had extreme rates of acute care use. Differences observed between groups across multiple sociobehavioral factors suggest these clusters may represent children who differ along multiple dimensions, in addition to patterns of service use, with implications for tailored interventions. Copyright © 2017 American College of Emergency Physicians. Published by Elsevier Inc. All rights reserved.

  16. Analysis of Rainfall and PM2.5 Data Using Clustered Trajectory Analysis for National Park Sites in the Western U.S.

    NASA Astrophysics Data System (ADS)

    Solorzano, N. N.; Hafner, W.; Jaffe, D.

    2005-12-01

    We calculated daily kinematic back-trajectories using the NOAA-HYSPLIT model to analyze 7 years of PM2.5 data from National Park sites in the Western U.S. (Glacier N.P., Mount Rainier N.P., Sequoia N.P., Rocky Mountain N.P. and Denali N.P.). The back-trajectories were clustered using a k-means clustering algorithm to segregate the trajectories into 6 main transport patterns. We calculated trajectory clusters for 1, 5 and 10 days to represent short, medium and long-range flow patterns. Some trajectory types and clusters show marked seasonality. Generally, faster flow patterns are more prevalent in winter and slower/stagnant patterns are more prevalent in summer. In addition, we found significant inter-annual variability that may be important for explaining variations in rainfall and/or pollutant concentrations. The 5 and 10-day analyses revealed that, for the 4 non-Alaskan sites, trajectories from Asia tend to be less frequent in the summer, compared to the rest of the year. The clusters of different duration show very different predictive power for rainfall and PM2.5. We found that the 1-day clusters are a better predictor for precipitation and PM2.5 concentrations, as compared to the 5 and 10-day clusters. At each of the sites, there is at least one cluster with an average PM2.5 concentration that is different from the average for the site, indicating distinctive transport patterns. The same is true for 5 and 10-day clusters. Interestingly, only one site, Mount Rainier N.P., shows seasonal differences in PM2.5 concentrations between the clusters that differ from the average.

  17. Multivariate Statistical Analysis: a tool for groundwater quality assessment in the hidrogeologic region of the Ring of Cenotes, Yucatan, Mexico.

    NASA Astrophysics Data System (ADS)

    Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.

    2014-12-01

    The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted; protecting this resource is therefore important because it is the only source of potable water for the entire State. Through the assessment of groundwater quality we can gain some knowledge about the main processes governing water chemistry as well as spatial patterns which are important to establish protection zones. In this work multivariate statistical techniques are used to assess the groundwater quality of the supply wells (30 to 40 meters deep) in the hydrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied to groundwater chemistry data of the study area. Results of principal component analysis show that the main sources of variation in the data are sea water intrusion, the interaction of the water with the carbonate rocks of the system, and some pollution processes. The cluster analysis shows that the data can be divided into four clusters. The spatial distribution of the clusters seems to be random, but is consistent with sea water intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied in the groundwater quality assessment of this karstic aquifer.

  18. First CCD UBVI photometric analysis of six open cluster candidates

    NASA Astrophysics Data System (ADS)

    Piatti, A. E.; Clariá, J. J.; Ahumada, A. V.

    2011-04-01

    We have obtained CCD UBVI_KC photometry down to V ∼ 22 for the open cluster candidates Haffner 3, Haffner 5, NGC 2368, Haffner 25, Hogg 3 and Hogg 4 and their surrounding fields. None of these objects have been photometrically studied so far. Our analysis shows that these stellar groups are not genuine open clusters since no clear main sequences or other meaningful features can be seen in their colour-magnitude and colour-colour diagrams. We checked for possible differential reddening across the studied fields that could be hiding the characteristics of real open clusters. However, the dust in the directions to these objects appears to be uniformly distributed. Moreover, star counts carried out within and outside the open cluster candidate fields do not support the hypothesis that these objects are real open clusters or even open cluster remnants.

  19. Emergy-based comparative analysis on industrial clusters: economic and technological development zone of Shenyang area, China.

    PubMed

    Liu, Zhe; Geng, Yong; Zhang, Pan; Dong, Huijuan; Liu, Zuoxi

    2014-09-01

    In China, local governments in many areas prefer to give priority to the development of heavy industrial clusters in pursuit of high gross domestic product (GDP) growth as a political achievement, which usually results in higher costs from ecological degradation and environmental pollution. Therefore, effective methods and a reasonable evaluation system are urgently needed to evaluate the overall efficiency of industrial clusters. The emergy method links economic and ecological systems together and can evaluate the contribution of ecological products and services as well as the load placed on environmental systems. This method has been successfully applied in many case studies of ecosystems but seldom to industrial clusters. This study applied the methodology of emergy analysis to assess the efficiency of industrial clusters through a series of emergy-based indices as well as the proposed indicators. A case study of Shenyang Economic Technological Development Area (SETDA) was investigated to show the emergy method's practical potential to evaluate industrial clusters to inform environmental policy making. The results of our study showed that the industrial cluster of electric equipment and electronic manufacturing produced the most economic value and had the highest efficiency of energy utilization among the four industrial clusters. However, the sustainability index of the industrial cluster of food and beverage processing was better than that of the other industrial clusters.

  20. Dynamical Organization of Syntaxin-1A at the Presynaptic Active Zone

    PubMed Central

    Ullrich, Alexander; Böhme, Mathias A.; Schöneberg, Johannes; Depner, Harald; Sigrist, Stephan J.; Noé, Frank

    2015-01-01

    Synaptic vesicle fusion is mediated by SNARE proteins forming in between synaptic vesicle (v-SNARE) and plasma membrane (t-SNARE), one of which is Syntaxin-1A. Although exocytosis mainly occurs at active zones, Syntaxin-1A appears to cover the entire neuronal membrane. By using STED super-resolution light microscopy and image analysis of Drosophila neuro-muscular junctions, we show that Syntaxin-1A clusters are more abundant and have an increased size at active zones. A computational particle-based model of syntaxin cluster formation and dynamics is developed. The model is parametrized to reproduce Syntaxin cluster-size distributions found by STED analysis, and successfully reproduces existing FRAP results. The model shows that the neuronal membrane is adjusted in a way to strike a balance between having most syntaxins stored in large clusters, while still keeping a mobile fraction of syntaxins free or in small clusters that can efficiently search the membrane or be traded between clusters. This balance is subtle and can be shifted toward almost no clustering and almost complete clustering by modifying the syntaxin interaction energy on the order of only 1 kBT. This capability appears to be exploited at active zones. The larger active-zone syntaxin clusters are more stable and provide regions of high docking and fusion capability, whereas the smaller clusters outside may serve as flexible reserve pool or sites of spontaneous ectopic release. PMID:26367029

  1. Cluster size selectivity in the product distribution of ethene dehydrogenation on niobium clusters.

    PubMed

    Parnis, J Mark; Escobar-Cabrera, Eric; Thompson, Matthew G K; Jacula, J Paul; Lafleur, Rick D; Guevara-García, Alfredo; Martínez, Ana; Rayner, David M

    2005-08-18

    Ethene reactions with niobium atoms and clusters containing up to 25 constituent atoms have been studied in a fast-flow metal cluster reactor. The clusters react with ethene at about the gas-kinetic collision rate, indicating a barrierless association process as the cluster removal step. Exceptions are Nb8 and Nb10, for which a significantly diminished rate is observed, reflecting some cluster size selectivity. Analysis of the experimental primary product masses indicates dehydrogenation of ethene for all clusters save Nb10, yielding either Nb(n)C2H2 or Nb(n)C2. Over the range Nb-Nb6, the extent of dehydrogenation increases with cluster size, then decreases for larger clusters. For many clusters, secondary and tertiary product masses are also observed, showing varying degrees of dehydrogenation corresponding to net addition of C2H4, C2H2, or C2. With Nb atoms and several small clusters, formal addition of at least six ethene molecules is observed, suggesting a polymerization process may be active. Kinetic analysis of the Nb atom and several Nb(n) cluster reactions with ethene shows that the process is consistent with sequential addition of ethene units at rates corresponding approximately to the gas-kinetic collision frequency for several consecutive reacting ethene molecules. Some variation in the rate of ethene pick up is found, which likely reflects small energy barriers or steric constraints associated with individual mechanistic steps. Density functional calculations of structures of Nb clusters up to Nb(6), and the reaction products Nb(n)C2H2 and Nb(n)C2 (n = 1...6) are presented. Investigation of the thermochemistry for the dehydrogenation of ethene to form molecular hydrogen, for the Nb atom and clusters up to Nb6, demonstrates that the exergonicity of the formation of Nb(n)C2 species increases with cluster size over this range, which supports the proposal that the extent of dehydrogenation is determined primarily by thermodynamic constraints. 
    Analysis of the structural variations present in the cluster species studied shows an increase in C-H bond lengths with cluster size that closely correlates with the increased thermodynamic drive to full dehydrogenation. This correlation strongly suggests that all steps in the reaction are barrierless, and that weakening of the C-H bonds is directly reflected in the thermodynamics of the overall dehydrogenation process. It is also demonstrated that reaction exergonicity in the initial partial dehydrogenation step must be carried through as excess internal energy into the second dehydrogenation step.

  2. Relation between financial market structure and the real economy: comparison between clustering methods.

    PubMed

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T

    2015-01-01

    We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging.

  3. Relation between Financial Market Structure and the Real Economy: Comparison between Clustering Methods

    PubMed Central

    Musmeci, Nicoló; Aste, Tomaso; Di Matteo, T.

    2015-01-01

    We quantify the amount of information filtered by different hierarchical clustering methods on correlations between stock returns comparing the clustering structure with the underlying industrial activity classification. We apply, for the first time to financial data, a novel hierarchical clustering approach, the Directed Bubble Hierarchical Tree and we compare it with other methods including the Linkage and k-medoids. By taking the industrial sector classification of stocks as a benchmark partition, we evaluate how the different methods retrieve this classification. The results show that the Directed Bubble Hierarchical Tree can outperform other methods, being able to retrieve more information with fewer clusters. Moreover, we show that the economic information is hidden at different levels of the hierarchical structures depending on the clustering method. The dynamical analysis on a rolling window also reveals that the different methods show different degrees of sensitivity to events affecting financial markets, like crises. These results can be of interest for all the applications of clustering methods to portfolio optimization and risk hedging. PMID:25786703
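
    As a rough illustration of linkage-based hierarchical clustering on stock-return correlations, the sketch below converts correlations to distances and merges clusters by single linkage. It is not the Directed Bubble Hierarchical Tree. The distance transform d = sqrt(2(1 - rho)), the function names, and the toy correlation matrix are all assumptions made for this example and are not taken from the record above.

```python
from math import sqrt


def correlation_distance(rho):
    """Map a correlation in [-1, 1] to a distance in [0, 2] (assumed transform)."""
    return sqrt(2.0 * (1.0 - rho))


def single_linkage(corr, k):
    """Agglomerate items into k clusters using single linkage on correlation distance."""
    n = len(corr)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")
    clusters = [{i} for i in range(n)]
    while len(clusters) > k:
        best = None
        # Find the pair of clusters whose closest members are nearest to each other.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(
                    correlation_distance(corr[i][j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] |= clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Toy correlation matrix for four hypothetical stocks (invented values).
toy_corr = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
toy_clusters = single_linkage(toy_corr, 2)
print(toy_clusters)
```

    The resulting partition could then be compared with a benchmark partition such as a sector classification.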

  4. An investigation on thermal patterns in Iran based on spatial autocorrelation

    NASA Astrophysics Data System (ADS)

    Fallah Ghalhari, Gholamabbas; Dadashi Roudbari, Abbasali

    2018-02-01

    The present study investigated temporal-spatial and monthly patterns of temperature in Iran using new spatial statistical methods such as cluster and outlier analysis and hotspot analysis. To do so, the monthly average temperatures of 122 synoptic stations were assessed. Statistical analysis showed that January, with 120.75%, had the most fluctuation among the studied months. Global Moran's Index revealed that yearly changes of temperature in Iran followed a strong spatially clustered pattern. Findings showed that the biggest thermal cluster pattern in Iran, 0.975388, occurred in May. Cluster and outlier analyses showed that thermal homogeneity in Iran decreases in cold months, while it increases in warm months. This is due to the radiation angle and synoptic systems, which strongly influence thermal order in Iran. Elevation, however, plays the most notable role, as shown by a geographically weighted regression model. Hotspot analysis showed that hot thermal patterns (very hot, hot, and semi-hot) were dominant in the South, covering 33.5% of the area (about 552,145.3 km²). Regions such as mountain foot and lowlands lack any significant spatial autocorrelation, covering 25.2% (about 415,345.1 km²). The last is the cold thermal area (very cold, cold, and semi-cold), with about 25.2% covering about 552,145.3 km² of the whole area of Iran.
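
    For readers unfamiliar with the global Moran's index, a minimal sketch of the standard formula I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2 follows, where W is the sum of all spatial weights. The function name, the chain-shaped neighbour matrix, and the station values are illustrative assumptions and are not data from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n-by-n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial proximity.
    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    den = sum(d * d for d in dev)
    return (n / w_sum) * num / den


# Four hypothetical stations on a line; each is a neighbour of the adjacent ones.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain_weights)       # similar values are adjacent
alternating = morans_i([1, 4, 1, 4], chain_weights)  # dissimilar values are adjacent
print(smooth, alternating)
```

    Positive values indicate that similar values sit next to each other (spatial clustering), while negative values indicate a checkerboard-like pattern.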

  5. The study of structures and properties of PdnHm(n=1-10, m=1,2) clusters by density functional theory

    NASA Astrophysics Data System (ADS)

    Wen, Jun-Qing; Chen, Guo-Xiang; Zhang, Jian-Min; Wu, Hua

    2018-04-01

    The geometrical evolution, local relative stability, magnetism and charge transfer characteristics of PdnHm (n = 1-10, m = 1,2) have been systematically calculated using density functional theory. The results show that the most stable geometries of PdnH and PdnH2 (n = 1-10) can be obtained by doping one or two H atoms on the sides of Pdn clusters, except for Pd6H and Pd6H2. Doping one or two H atoms on Pdn clusters does not change the basic framework of Pdn. The stability analysis shows that Pd2H, Pd4H, Pd7H, Pd2H2, Pd4H2 and Pd7H2 clusters have higher local relative stability than neighboring clusters. The analysis of magnetic properties demonstrates that absorption of hydrogen atoms decreases the average atomic magnetic moments compared with pure Pdn clusters. More charges transfer from H atoms to Pd atoms for Pd6H and Pd6H2 clusters, demonstrating that the adsorption of hydrogen atoms changes from side adsorption to surface adsorption.

  6. Analysis of Basis Weight Uniformity of Microfiber Nonwovens and Its Impact on Permeability and Filtration Properties

    NASA Astrophysics Data System (ADS)

    Amirnasr, Elham

    It is widely recognized that nonwoven basis weight non-uniformity affects various properties of nonwovens. However, few studies can be found on this topic. The development of uniformity definitions and measurement methods, and the study of their impact on web properties such as filtration and air permeability, would be beneficial both in industrial applications and in academia. They can be utilized as a quality control tool and would provide insights into nonwoven behaviors that cannot be explained by average values alone. Therefore, to quantify nonwoven web basis weight uniformity, we pursue the development of an optical analytical tool. The quadrant method and clustering analysis were utilized in an image analysis scheme to help define "uniformity" and its spatial variation. Implementing the quadrant method in an image analysis system allows the establishment of a uniformity index that can be used to quantify the degree of uniformity. Clustering analysis was also modified and verified using uniform and random simulated images with known parameters. The number of clusters and cluster properties such as cluster size, membership and density were determined. We also utilized this new measurement method to evaluate the uniformity of nonwovens produced with different processes and investigated the impact of uniformity on filtration and permeability. The results show that the uniformity index computed from the quadrant method covers a good range for the non-uniformity of nonwoven webs. Clustering analysis was also applied to reference nonwovens with known visual uniformity. From the clustering analysis results, cluster size is a promising uniformity parameter: non-uniform nonwovens were shown to have larger cluster sizes than uniform nonwovens. We also attempted to find a relationship between web properties and the uniformity index (as a web characteristic). To achieve this, the filtration properties, air permeability, solidity and uniformity index of meltblown and spunbond samples were measured. Results of the filtration tests show some deviation between theoretical and experimental filtration efficiency when different types of fiber diameter are considered. This deviation can occur due to variation in basis weight non-uniformity, so an appropriate theory is required to predict the variation of filtration efficiency with respect to the non-uniformity of nonwoven filter media. The results of the air permeability tests showed that the uniformity index determined by the quadrant method and the measured properties are related. In other words, air permeability decreases as the uniformity index of the nonwoven web increases.

  7. Dimensional assessment of personality pathology in patients with eating disorders.

    PubMed

    Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L

    1999-02-22

    This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.

  8. Developmental analysis of the dopamine-containing neurons of the Drosophila brain

    PubMed Central

    Hartenstein, Volker; Cruz, Louie; Lovick, Jennifer K.; Guo, Ming

    2016-01-01

    The Drosophila dopaminergic (DA) system consists of a relatively small number of neurons clustered throughout the brain and ventral nerve cord. Previous work shows that clusters of DA neurons innervate different brain compartments, which in part accounts for functional diversity of the DA system. In this paper, we analyzed the association between DA neuron clusters and specific brain lineages, developmental and structural units of the Drosophila brain which provide a framework of connections that can be followed throughout development. The hatching larval brain contains six groups of primary DA neurons (born in the embryo), which we assign to six distinct lineages. We show that all larval DA clusters persist into the adult brain. Some clusters increase in cell number during late larval stages while others do not become DA-positive until early pupa. Ablating neuroblasts with hydroxyurea (HU) prior to the onset of larval proliferation (which generates secondary neurons) confirms that these added DA clusters are primary neurons born in the embryo, rather than secondary neurons. A single cluster that becomes DA-positive in the late pupa, PAM1/lineage DALcm1/2, forms part of a secondary lineage which can be ablated by larval HU application. By supplying lineage information for each DA cluster, our analysis promotes further developmental and functional analyses of this important system of neurons. PMID:27350102

  9. A Cluster Analysis of Tic Symptoms in Children and Adults with Tourette Syndrome: Clinical Correlates and Treatment Outcome

    PubMed Central

    McGuire, Joseph F.; Nyirabahizi, Epiphanie; Kircanski, Katharina; Piacentini, John; Peterson, Alan L.; Woods, Douglas W.; Wilhelm, Sabine; Walkup, John T.; Scahill, Lawrence

    2013-01-01

    Cluster analytic methods have examined the symptom presentation of chronic tic disorders (CTDs), with limited agreement across studies. The present study investigated patterns, clinical correlates, and treatment outcome of tic symptoms. A total of 239 youth and adults with CTDs completed a battery of assessments at baseline to determine diagnoses, tic severity, and clinical characteristics. Participants were randomly assigned to receive either a comprehensive behavioral intervention for tics (CBIT) or psychoeducation and supportive therapy (PST). A cluster analysis was conducted on the baseline Yale Global Tic Severity Scale (YGTSS) symptom checklist to identify the constellations of tic symptoms. Four tic clusters were identified: Impulse Control and Complex Phonic Tics; Complex Motor Tics; Simple Head Motor/Vocal Tics; and Primarily Simple Motor Tics. Frequencies of tic symptoms showed few differences across youth and adults. Tic clusters had small associations with clinical characteristics and showed no associations with the presence of coexisting psychiatric conditions. Cluster membership scores did not predict treatment response to CBIT or tic severity reductions. Tic symptoms cluster distinctly, with few differences across youth and adults or coexisting conditions. This study, which is the first to examine tic clusters in relation to treatment, suggested that tic symptom profiles respond equally well to CBIT. PMID:24144615

  10. Identifying the heterogeneity of young adult rhinitis through cluster analysis in the Isle of Wight birth cohort.

    PubMed

    Kurukulaaratchy, Ramesh J; Zhang, Hongmei; Patil, Veeresh; Raza, Abid; Karmaus, Wilfried; Ewart, Susan; Arshad, S Hasan

    2015-01-01

    Rhinitis affects many young adults and often shows comorbidity with asthma. We hypothesized that young adult rhinitis, like asthma, exhibits clinical heterogeneity identifiable by means of cluster analysis. Participants in the Isle of Wight birth cohort (n = 1456) were assessed at 1, 2, 4, 10, and 18 years of age. Cluster analysis was performed on those with rhinitis at age 18 years (n = 468) by using 13 variables defining clinical characteristics. Four clusters were identified. Patients in cluster 1 (n = 128 [27.4%]; ie, moderate childhood-onset rhinitis) had high atopy and eczema prevalence and high total IgE levels but low asthma prevalence. They showed the best lung function at 18 years of age, with normal fraction of exhaled nitric oxide (Feno), low bronchial hyperresponsiveness (BHR), and low bronchodilator reversibility (BDR) but high rhinitis symptoms and treatment. Patients in cluster 2 (n = 199 [42.5%]; ie, mild-adolescence-onset female rhinitis) had the lowest prevalence of comorbid atopy, asthma, and eczema. They had normal lung function and low BHR, BDR, Feno values, and total IgE levels plus low rhinitis symptoms, severity, and treatment. Patients in cluster 3 (n = 59 [12.6%]; ie, severe earliest-onset rhinitis with asthma) had the youngest rhinitis onset plus the highest comorbid asthma (of simultaneous onset) and atopy. They showed the most obstructed lung function with high BHR, BDR, and Feno values plus high rhinitis symptoms, severity, and treatment. Patients in cluster 4 (n = 82 [17.5%]; ie, moderate childhood-onset male rhinitis with asthma) had high atopy, intermediate asthma, and low eczema. They had impaired lung function with high Feno values and total IgE levels but intermediate BHR and BDR. They had moderate rhinitis symptoms. Clinically distinctive adolescent rhinitis clusters are apparent with varying sex and asthma associations plus differing rhinitis severity and treatment needs.

  11. [IR study on a series of tungsten clusters].

    PubMed

    Yu, R; Chen, J; Lu, S

    2000-10-01

    In this paper, an IR study of a series of tungsten clusters that contain a [W2S4]2+ or [W2MM'S4]4+ (M,M'=Cu,Ag) core is reported. Based on the results of X-ray structural analysis and the IR spectra of the clusters, some characteristic IR absorptions of the clusters were assigned. The IR spectra of these clusters show that structural variation is significantly reflected in the spectra.

  12. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanfilippo, Antonio P.; Chikkagoudar, Satish

    We describe an approach to analyzing trade data which uses clustering to detect similarities across shipping manifest records, classification to evaluate clustering results and categorize new unseen shipping data records, and visual analytics to support situation awareness in dynamic decision making to monitor and warn against the movement of radiological threat materials through search, analysis and forecasting capabilities. The evaluation of clustering results through classification and systematic inspection of the clusters shows that the clusters have strong semantic cohesion and offer novel ways to detect transactions related to nuclear smuggling.

  13. Cluster analysis as a prediction tool for pregnancy outcomes.

    PubMed

    Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

    2015-03-01

    Considering the specific physiological changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women in early pregnancy can be considered crucial. The paper demonstrates the use of a method based on an approach from intelligent data mining, cluster analysis. Cluster analysis is a statistical method which makes it possible to group individuals based on sets of identifying variables. The method was chosen to determine the possibility of classifying pregnant women in early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. A total of 222 pregnant women from two general obstetric offices were recruited. The main focus was on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women in early pregnancy to predict certain outcomes.

  14. Text mining to decipher free-response consumer complaints: insights from the NHTSA vehicle owner's complaint database.

    PubMed

    Ghazizadeh, Mahtab; McDonald, Anthony D; Lee, John D

    2014-09-01

    This study applies text mining to extract clusters of vehicle problems and associated trends from free-response data in the National Highway Traffic Safety Administration's vehicle owner's complaint database. As the automotive industry adopts new technologies, it is important to systematically assess the effect of these changes on traffic safety. Driving simulators, naturalistic driving data, and crash databases all contribute to a better understanding of how drivers respond to changing vehicle technology, but other approaches, such as automated analysis of incident reports, are needed. Free-response data from incidents representing two severity levels (fatal incidents and incidents involving injury) were analyzed using a text mining approach: latent semantic analysis (LSA). LSA and hierarchical clustering identified clusters of complaints for each severity level, which were compared and analyzed across time. Cluster analysis identified eight clusters of fatal incidents and six clusters of incidents involving injury. Comparisons showed that although the airbag clusters across the two severity levels have the same most frequent terms, the circumstances around the incidents differ. The time trends show clear increases in complaints surrounding the Ford/Firestone tire recall and the Toyota unintended acceleration recall. Increases in complaints may be partially driven by these recall announcements and the associated media attention. Text mining can reveal useful information from free-response databases that would otherwise be prohibitively time-consuming and difficult to summarize manually. Text mining can extend human analysis capabilities for large free-response databases to support earlier detection of problems and more timely safety interventions.

  15. Radiogenomic analysis of lower grade glioma: a pilot multi-institutional study shows an association between quantitative image features and tumor genomics

    NASA Astrophysics Data System (ADS)

    Mazurowski, Maciej A.; Clark, Kal; Czarnek, Nicholas M.; Shamsesfandabadi, Parisa; Peters, Katherine B.; Saha, Ashirbani

    2017-03-01

    Recent studies showed that genomic analysis of lower grade gliomas can be very effective for stratification of patients into groups with different prognosis and proposed specific genomic classifications. In this study, we explore the association of one of those genomic classifications with imaging parameters to determine whether imaging could serve a similar role to genomics in cancer patient treatment. Specifically, we analyzed imaging and genomics data for 110 patients from 5 institutions from The Cancer Genome Atlas and The Cancer Imaging Archive datasets. The analyzed imaging data contained preoperative FLAIR sequence for each patient. The images were analyzed using the in-house algorithms which quantify 2D and 3D aspects of the tumor shape. Genomic data consisted of a cluster of clusters classification proposed in a very recent and leading publication in the field of lower grade glioma genomics. Our statistical analysis showed that there is a strong association between the tumor cluster-of-clusters subtype and two imaging features: bounding ellipsoid volume ratio and angular standard deviation. This result shows high promise for the potential use of imaging as a surrogate measure for genomics in the decision process regarding treatment of lower grade glioma patients.

  16. Challenges in microarray class discovery: a comprehensive examination of normalization, gene selection and clustering

    PubMed Central

    2010-01-01

    Background: Cluster analysis, and in particular hierarchical clustering, is widely used to extract information from gene expression data. The aim is to discover new classes, or sub-classes, of either individuals or genes. Performing a cluster analysis commonly involves decisions on how to handle missing values, standardize the data and select genes. In addition, pre-processing, involving various types of filtration and normalization procedures, can have an effect on the ability to discover biologically relevant classes. Here we consider cluster analysis in a broad sense and perform a comprehensive evaluation that covers several aspects of cluster analyses, including normalization. Results: We evaluated 2780 cluster analysis methods on seven publicly available 2-channel microarray data sets with common reference designs. Each cluster analysis method differed in data normalization (5 normalizations were considered), missing value imputation (2), standardization of data (2), gene selection (19) or clustering method (11). The cluster analyses are evaluated using known classes, such as cancer types, and the adjusted Rand index. The performances of the different analyses vary between the data sets and it is difficult to give general recommendations. However, normalization, gene selection and clustering method are all variables that have a significant impact on the performance. In particular, gene selection is important and it is generally necessary to include a relatively large number of genes in order to get good performance. Selecting genes with high standard deviation or using principal component analysis are shown to be the preferred gene selection methods. Hierarchical clustering using Ward's method, k-means clustering and Mclust are the clustering methods considered in this paper that achieve the highest adjusted Rand index. Normalization can have a significant positive impact on the ability to cluster individuals, and there are indications that background correction is preferable, in particular if the gene selection is successful. However, this is an area that needs to be studied further in order to draw any general conclusions. Conclusions: The choice of cluster analysis, and in particular gene selection, has a large impact on the ability to cluster individuals correctly based on expression profiles. Normalization has a positive effect, but the relative performance of different normalizations is an area that needs more research. In summary, although clustering, gene selection and normalization are considered standard methods in bioinformatics, our comprehensive analysis shows that selecting the right methods, and the right combinations of methods, is far from trivial and that much is still unexplored in what is considered to be the most basic analysis of genomic data. PMID:20937082
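
    The adjusted Rand index used above to score a clustering against known classes can be computed from pair counts. The following is a minimal sketch of the standard pair-counting formula. The function name and the toy label vectors are assumptions for illustration and do not come from the study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    n = len(labels_true)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Pairs of items placed together in both, in the first, and in the second labeling.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (together_both - expected) / (maximum - expected)


# Toy labelings of four samples (invented): label names do not matter, only the grouping.
perfect = adjusted_rand_index([0, 0, 1, 1], ["b", "b", "a", "a"])
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(perfect, crossed)
```

    A value of 1 means the two partitions agree exactly, values near 0 are what chance agreement would give, and negative values indicate agreement worse than chance.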

  17. Pathological and non-pathological variants of restrictive eating behaviors in middle childhood: A latent class analysis.

    PubMed

    Schmidt, Ricarda; Vogel, Mandy; Hiemisch, Andreas; Kiess, Wieland; Hilbert, Anja

    2018-08-01

    Although restrictive eating behaviors are very common during early childhood, their precise nature and clinical correlates remain unclear. Especially, there is little evidence on restrictive eating behaviors in older children and their associations with children's shape concern. The present population-based study sought to delineate subgroups of restrictive eating patterns in N = 799 7-14 year old children. Using Latent Class Analysis, children were classified based on six restrictive eating behaviors (for example, picky eating, food neophobia, and eating-related anxiety) and shape concern, separately in three age groups. For cluster validation, sociodemographic and objective anthropometric data, parental feeding practices, and general and eating disorder psychopathology were used. The results showed a 3-cluster solution across all age groups: an asymptomatic class (Cluster 1), a class with restrictive eating behaviors without shape concern (Cluster 2), and a class showing restrictive eating behaviors with prominent shape concern (Cluster 3). The clusters differed in all variables used for validation. Particularly, the proportion of children with symptoms of avoidant/restrictive food intake disorder was greater in Cluster 2 than Clusters 1 and 3. The study underlined the importance of considering shape concern to distinguish between different phenotypes of children's restrictive eating patterns. Longitudinal data are needed to evaluate the clusters' predictive effects on children's growth and development of clinical eating disorders.

  18. A WISE Survey of New Star Clusters in the Central Plane Region of the Milky Way

    NASA Astrophysics Data System (ADS)

    Ryu, Jinhyuk; Lee, Myung Gyoon

    2018-04-01

    We present the discovery of new star clusters in the central plane region (|l| < 30° and |b| < 6°) of the Milky Way. In order to overcome the extinction problem and the spatial limit of previous surveys, we use the Wide-field Infrared Survey Explorer (WISE) data to find clusters. We also use other infrared survey data in the archive for additional analysis. We find 923 new clusters, of which 202 clusters are embedded clusters. These clusters are concentrated toward the Galactic plane and show a symmetric distribution with respect to the Galactic latitude. The embedded clusters show a stronger concentration to the Galactic plane than the nonembedded clusters. The new clusters are found more in the first Galactic quadrant, while previously known clusters are found more in the fourth Galactic quadrant. The spatial distribution of the combined sample of known clusters and new clusters is approximately symmetric with respect to the Galactic longitude. We estimate reddenings, distances, and relative ages of the 15 class A clusters using theoretical isochrones. Ten of them are relatively old (age >800 Myr) and five are young (age ≈4 Myr).

  19. Subspace K-means clustering.

    PubMed

    Timmerman, Marieke E; Ceulemans, Eva; De Roover, Kim; Van Leeuwen, Karla

    2013-12-01

    To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study, in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima but that the problem can be reasonably dealt with by using partitions of various cluster procedures as a starting point for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that subspace K-means analysis provides a rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).
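
    For orientation, the sketch below shows ordinary K-means (Lloyd's algorithm), one of the competitor methods named above. It is not subspace K-means. It takes the starting centroids as an explicit argument, which reflects the point that the outcome depends on the starting partition. The function names and toy points are assumptions made for this example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, max_iter=100):
    """Plain K-means: alternate nearest-centroid assignment and centroid update."""
    if max_iter < 1:
        raise ValueError("max_iter must be at least 1")
    centroids = [tuple(float(x) for x in c) for c in initial_centroids]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return labels, centroids


# Toy two-dimensional data with two obvious groups (invented values).
toy_points = [(0, 0), (0, 1), (10, 10), (10, 11)]
toy_labels, toy_centroids = kmeans(toy_points, [(0, 0), (10, 10)])
print(toy_labels, toy_centroids)
```

    Running the same function from several different starting centroids and keeping the solution with the smallest within-cluster sum of squares is a common way to reduce the effect of local minima.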

  20. Chemotaxonomy of heterocystous cyanobacteria using FAME profiling as species markers.

    PubMed

    Shukla, Ekta; Singh, Satya Shila; Singh, Prashant; Mishra, Arun Kumar

    2012-07-01

    The fatty acid methyl ester (FAME) analysis of the 12 heterocystous cyanobacterial strains showed different fatty acid profiles based on the presence/absence and the percentage of 13 different types of fatty acids. The major fatty acids, viz. palmitic acid (16:0), hexadecadienoic acid (16:2), stearic acid (18:0), oleic acid (18:1), linoleic acid (18:2), and linolenic acid (18:3), were present in all the strains except Cylindrospermum musicola, in which oleic acid (18:1) was absent. All the strains showed high levels of polyunsaturated fatty acids (PUFAs; 41-68.35%), followed by saturated fatty acids (SAFAs; 1.82-40.66%) and monounsaturated fatty acids (0.85-24.98%). The highest percentage of PUFAs and of the essential fatty acid linolenic acid (18:3) was found in Scytonema bohnerii, which can be used as a fatty acid supplement for medical and biotechnological purposes. The cluster analysis based on FAME profiling suggests the presence of two distinct clusters, with Euclidean distance ranging from 0 to 25. S. bohnerii of cluster I was distantly related to the other strains of cluster II. The genotypes of cluster II were further divided into two subclusters, i.e., IIa, with C. musicola showing great divergence from the other genotypes, and IIb, which was further subdivided into two groups. Subsubcluster IIb(1) was represented by one genotype, Anabaena sp., whereas subsubcluster IIb(2) comprised two groups: one group of three genotypes with significant similarity among themselves was distantly related to the other group of six closely related genotypes. To test the validity of the fatty acid profiles as a marker, a cluster analysis was also performed on the basis of morphological attributes. Our results suggest that FAME profiling might be used as a species marker in polyphasic taxonomy and in the study of phylogenetic relationships.
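
    A minimal sketch of distance-based hierarchical clustering of fatty acid profiles, in the spirit of the analysis described above. The abstract states only that Euclidean distance was used; the single-linkage rule, the strain names, and the three-value profiles (SAFA %, MUFA %, PUFA %) below are assumptions invented for illustration and are not data from the study.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def agglomerate(profiles, n_clusters):
    """Single-linkage agglomerative clustering.

    profiles: dict mapping a strain name to a tuple of percentages.
    Returns a list of sets of strain names, one set per cluster.
    """
    clusters = [{name} for name in profiles]
    while len(clusters) > n_clusters:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(
                    euclidean(profiles[a], profiles[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] |= clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return clusters


# Hypothetical profiles (SAFA %, MUFA %, PUFA %); values are made up.
profiles = {
    "strain_1": (10.0, 5.0, 68.0),
    "strain_2": (35.0, 20.0, 45.0),
    "strain_3": (38.0, 18.0, 44.0),
    "strain_4": (30.0, 24.0, 46.0),
}
clusters = agglomerate(profiles, n_clusters=2)
print(clusters)
```

    With these invented values, the PUFA-rich profile ends up alone in one cluster while the other three merge, which mirrors the kind of two-cluster split the abstract reports without reproducing its actual dendrogram.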

  1. Rhodium clustering process on defective (8,0) SWCNT: Analysis of chemical and physical properties using density functional theory

    NASA Astrophysics Data System (ADS)

    Ambrusi, Ruben E.; Luna, C. Romina; Sandoval, Mario G.; Bechthold, Pablo; Pronsato, M. Estela; Juan, Alfredo

    2017-12-01

    Spin-polarized density functional theory is used to study the effect of a single vacancy in an (8,0) single-walled carbon nanotube (SWCNT) on the Rh clustering process. The vacancy is considered oxygenated and non-oxygenated and, in each case, different Rhn cluster sizes (n = 1-4) are taken into account. For the analysis of these systems some physical and chemical properties are calculated, such as binding energy (Eb), work function (WF), magnetic moment, charge transfer, bond length, band gap (Eg), and density of states (DOS). From this analysis it can be concluded that a single Rh atom and the Rh2 dimer are adsorbed on the vacancy without oxygen, whereas Rh3 and Rh4 clusters prefer to be adsorbed on the oxygenated vacancy. In all cases, Rh adsorption induces a magnetic moment. When the Rh atom and Rh2 dimer are bonded to the defective SWCNT, they show a semiconductor behavior that could be of interest for spintronics. In the case of Rh3 and Rh4 clusters our results show a metallic behavior, suggesting that these systems are good candidates for nanotube contacts.

  2. Peeking Network States with Clustered Patterns

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kim, Jinoh; Sim, Alex

    2015-10-20

    Network traffic monitoring has long been a core element for effective network management and security. However, it is still a challenging task with a high degree of complexity for comprehensive analysis when considering multiple variables and ever-increasing traffic volumes to monitor. For example, one of the widely considered approaches is to scrutinize probabilistic distributions, but it poses a scalability concern and multivariate analysis is not generally supported due to the exponential increase of the complexity. In this work, we propose a novel method for network traffic monitoring based on clustering, one of the powerful deep-learning techniques. We show that the new approach enables us to recognize clustered results as patterns representing the network states, which can then be utilized to evaluate “similarity” of network states over time. In addition, we define a new quantitative measure for the similarity between two compared network states observed in different time windows, as a supportive means for intuitive analysis. Finally, we demonstrate the clustering-based network monitoring with public traffic traces, and show that the proposed approach using the clustering method has a great opportunity for feasible, cost-effective network monitoring.

  3. Development of an automated energy audit protocol for office buildings

    NASA Astrophysics Data System (ADS)

    Deb, Chirag

    This study aims to enhance the building energy audit process and to reduce the time and cost of conducting a full physical audit. For this, a total of 5 Energy Service Companies in Singapore have collaborated and provided energy audit reports for 62 office buildings. Several statistical techniques are adopted to analyse these reports. These techniques comprise cluster analysis and the development of prediction models to predict energy savings for buildings. The cluster analysis shows that there are 3 clusters of buildings experiencing different levels of energy savings. To understand the effect of building variables on the change in energy use intensity (EUI), a robust iterative process for selecting the appropriate variables is developed. The results show that 4 variables should be used for clustering: gross floor area (GFA), non-air-conditioning energy consumption, average chiller plant efficiency and installed capacity of chillers. This analysis is extended to the development of prediction models using linear regression and artificial neural networks (ANN). An exhaustive variable selection algorithm is developed to select the input variables for the two energy saving prediction models. The results show that the ANN prediction model can predict the energy saving potential of a given building with an accuracy of +/-14.8%.

  4. Application of clustering analysis in the prediction of photovoltaic power generation based on neural network

    NASA Astrophysics Data System (ADS)

    Cheng, K.; Guo, L. M.; Wang, Y. K.; Zafar, M. T.

    2017-11-01

    In order to select effective samples from the large volume of multi-year PV power generation data and to improve the accuracy of PV power generation forecasting models, this paper studies the application of clustering analysis in this field and establishes a forecasting model based on a neural network. Based on three different types of weather (sunny, cloudy and rainy days), this research screens samples of historical data by the clustering analysis method. After screening, it establishes BP neural network prediction models using the screened data as training data. The six types of photovoltaic power generation prediction models, before and after the data screening, are then compared. Results show that the prediction model combining clustering analysis and BP neural networks is an effective method to improve the precision of photovoltaic power generation prediction.

  5. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles.

    PubMed

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F; Perez, Danny

    2017-10-21

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  6. Cluster analysis of accelerated molecular dynamics simulations: A case study of the decahedron to icosahedron transition in Pt nanoparticles

    NASA Astrophysics Data System (ADS)

    Huang, Rao; Lo, Li-Ta; Wen, Yuhua; Voter, Arthur F.; Perez, Danny

    2017-10-01

    Modern molecular-dynamics-based techniques are extremely powerful to investigate the dynamical evolution of materials. With the increase in sophistication of the simulation techniques and the ubiquity of massively parallel computing platforms, atomistic simulations now generate very large amounts of data, which have to be carefully analyzed in order to reveal key features of the underlying trajectories, including the nature and characteristics of the relevant reaction pathways. We show that clustering algorithms, such as the Perron Cluster Cluster Analysis, can provide reduced representations that greatly facilitate the interpretation of complex trajectories. To illustrate this point, clustering tools are used to identify the key kinetic steps in complex accelerated molecular dynamics trajectories exhibiting shape fluctuations in Pt nanoclusters. This analysis provides an easily interpretable coarse representation of the reaction pathways in terms of a handful of clusters, in contrast to the raw trajectory that contains thousands of unique states and tens of thousands of transitions.

  7. Toward a Richer View of the Scientific Method: The Role of Conceptual Analysis

    ERIC Educational Resources Information Center

    Machado, Armando; Silva, Francisco J.

    2007-01-01

    Within the complex set of activities that comprise the scientific method, three clusters of activities can be recognized: experimentation, mathematization, and conceptual analysis. In psychology, the first two of these clusters are well-known and valued, but the third seems less known and valued. The authors show the value of these three clusters…

  8. Cluster Approach to Network Interaction in Pedagogical University

    ERIC Educational Resources Information Center

    Chekaleva, Nadezhda V.; Makarova, Natalia S.; Drobotenko, Yulia B.

    2016-01-01

    The study presented in the article is devoted to the analysis of theory and practice of network interaction within the framework of education clusters. Education clusters are considered to be a novel form of network interaction in pedagogical education in Russia. The aim of the article is to show the advantages and disadvantages of the cluster…

  9. Methane Production in Dairy Cows Correlates with Rumen Methanogenic and Bacterial Community Structure.

    PubMed

    Danielsson, Rebecca; Dicksved, Johan; Sun, Li; Gonda, Horacio; Müller, Bettina; Schnürer, Anna; Bertilsson, Jan

    2017-01-01

    Methane (CH4) is produced as an end product from feed fermentation in the rumen. Yield of CH4 varies between individuals despite identical feeding conditions. To get a better understanding of factors behind the individual variation, 73 dairy cows given the same feed but differing in CH4 emissions were investigated with focus on fiber digestion, fermentation end products and bacterial and archaeal composition. In total 21 cows (12 Holstein, 9 Swedish Red) identified as persistent low, medium or high CH4 emitters over a 3 month period were furthermore chosen for analysis of microbial community structure in rumen fluid. This was assessed by sequencing the V4 region of the 16S rRNA gene and by quantitative PCR (qPCR) of targeted Methanobrevibacter groups. The results showed a positive correlation between low CH4 emitters and higher abundance of the Methanobrevibacter ruminantium clade. Principal coordinate analysis (PCoA) on operational taxonomic unit (OTU) level of bacteria showed two distinct clusters (P < 0.01) that were related to CH4 production. One cluster was associated with low CH4 production (referred to as cluster L) whereas the other cluster was associated with high CH4 production (cluster H), and the medium emitters occurred in both clusters. The differences between clusters were primarily linked to differential abundances of certain OTUs belonging to Prevotella. Moreover, several OTUs belonging to the family Succinivibrionaceae were dominant in samples belonging to cluster L. Fermentation pattern of volatile fatty acids showed that proportion of propionate was higher in cluster L, while proportion of butyrate was higher in cluster H. No difference was found in milk production or organic matter digestibility between cows. Cows in cluster L had lower CH4/kg energy corrected milk (ECM) compared to cows in cluster H, 8.3 compared to 9.7 g CH4/kg ECM, showing that low CH4 cows utilized the feed more efficiently for milk production, which might indicate a more efficient microbial population or host genetic differences that are reflected in bacterial and archaeal (or methanogen) populations.

  10. Methane Production in Dairy Cows Correlates with Rumen Methanogenic and Bacterial Community Structure

    PubMed Central

    Danielsson, Rebecca; Dicksved, Johan; Sun, Li; Gonda, Horacio; Müller, Bettina; Schnürer, Anna; Bertilsson, Jan

    2017-01-01

    Methane (CH4) is produced as an end product from feed fermentation in the rumen. Yield of CH4 varies between individuals despite identical feeding conditions. To get a better understanding of factors behind the individual variation, 73 dairy cows given the same feed but differing in CH4 emissions were investigated with focus on fiber digestion, fermentation end products and bacterial and archaeal composition. In total 21 cows (12 Holstein, 9 Swedish Red) identified as persistent low, medium or high CH4 emitters over a 3 month period were furthermore chosen for analysis of microbial community structure in rumen fluid. This was assessed by sequencing the V4 region of the 16S rRNA gene and by quantitative PCR (qPCR) of targeted Methanobrevibacter groups. The results showed a positive correlation between low CH4 emitters and higher abundance of the Methanobrevibacter ruminantium clade. Principal coordinate analysis (PCoA) on operational taxonomic unit (OTU) level of bacteria showed two distinct clusters (P < 0.01) that were related to CH4 production. One cluster was associated with low CH4 production (referred to as cluster L) whereas the other cluster was associated with high CH4 production (cluster H), and the medium emitters occurred in both clusters. The differences between clusters were primarily linked to differential abundances of certain OTUs belonging to Prevotella. Moreover, several OTUs belonging to the family Succinivibrionaceae were dominant in samples belonging to cluster L. Fermentation pattern of volatile fatty acids showed that proportion of propionate was higher in cluster L, while proportion of butyrate was higher in cluster H. No difference was found in milk production or organic matter digestibility between cows. Cows in cluster L had lower CH4/kg energy corrected milk (ECM) compared to cows in cluster H, 8.3 compared to 9.7 g CH4/kg ECM, showing that low CH4 cows utilized the feed more efficiently for milk production, which might indicate a more efficient microbial population or host genetic differences that are reflected in bacterial and archaeal (or methanogen) populations. PMID:28261182

  11. [Visual field progression in glaucoma: cluster analysis].

    PubMed

    Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

    2012-11-01

    Visual field progression analysis is one of the key points in glaucoma monitoring, but the distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening on pointwise linear regression analysis. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and sLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis. However, for best results, it is preferable to compare the analyses of several tests in combination with morphologic exam. Copyright © 2012 Elsevier Masson SAS. All rights reserved.

  12. Reconstruction of sediment transport pathways in modern microtidal sand flat by multiple classification analysis

    NASA Astrophysics Data System (ADS)

    Yamashita, S.; Nakajo, T.; Naruse, H.

    2009-12-01

    In this study, we statistically classified the grain size distribution of the bottom surface sediment on a microtidal sand flat to analyze the depositional processes of the sediment. Multiple classification analysis revealed that two types of sediment populations exist in the bottom surface sediment. Then, we employed the sediment trend model developed by Gao and Collins (1992) for the estimation of sediment transport pathways. As a result, we found that statistical discrimination of the bottom surface sediment provides useful information for the sediment trend model while dealing with various types of sediment transport processes. The microtidal sand flat along the Kushida River estuary, Ise Bay, central Japan, was investigated, and 102 bottom surface sediment samples were obtained. Then, their grain size distribution patterns were measured by the settling tube method, and each grain size distribution parameter (mud and gravel contents, mean grain size, coefficient of variance (CV), skewness, kurtosis, 5, 25, 50, 75, and 95 percentile) was calculated. Here, CV is the normalized sorting value divided by the mean grain size. Two classical statistical methods, principal component analysis (PCA) and fuzzy cluster analysis, were applied. The results of PCA showed that the bottom surface sediment of the study area is mainly characterized by grain size (mean grain size and 5-95 percentile) and the CV value, indicating predominantly large absolute values of factor loadings in principal component (PC) 1. PC1 is interpreted as being indicative of the grain-size trend, in which a finer grain-size distribution indicates better size sorting. The frequency distribution of PC1 has a bimodal shape and suggests the existence of two types of sediment populations. Therefore, we applied fuzzy cluster analysis, the results of which revealed two groupings of the sediment (Cluster 1 and Cluster 2). Cluster 1 shows a lower value of PC1, indicating coarse and poorly sorted sediments. Cluster 1 sediments are distributed around the branched channel from Kushida River and show an expanding distribution from the river mouth toward the northeast direction. Cluster 2 shows a higher value of PC1, indicating fine and well-sorted sediments; this cluster is distributed in a distant area from the river mouth, including the offshore region. Therefore, Cluster 1 and Cluster 2 are interpreted as being deposited by fluvial and wave processes, respectively. Finally, on the basis of this distribution pattern, the sediment trend model was applied in areas dominated separately by fluvial and wave processes. Resultant sediment transport patterns showed good agreement with those obtained by field observations. The results of this study provide an important insight into the numerical models of sediment transport.
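
    The step from a bimodal PC1 distribution to two fuzzy clusters can be sketched as follows. This is a generic one-dimensional fuzzy c-means with fuzzifier m = 2, not necessarily the variant used in the study; the function name, the evenly spaced initial centers, and the PC1 scores are assumptions invented for illustration.

```python
def fuzzy_c_means_1d(xs, c=2, m=2.0, n_iter=100, tol=1e-9):
    """Fuzzy c-means on scalar values (requires c >= 2 and m > 1).

    Returns (centers, u), where u[k][i] is the membership of xs[k] in
    cluster i as computed in the final iteration; each row of u sums to 1.
    """
    lo, hi = min(xs), max(xs)
    # Assumed initialisation: centers evenly spaced over the data range.
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    u = []
    for _ in range(n_iter):
        u = []
        for x in xs:
            d = [abs(x - ctr) for ctr in centers]
            if 0.0 in d:
                # Point sits exactly on a center: full membership there.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                # u_i is proportional to d_i^(-2/(m-1)), normalised to sum to 1.
                inv = [di ** (-2.0 / (m - 1.0)) for di in d]
                total = sum(inv)
                row = [v / total for v in inv]
            u.append(row)
        # Centers are membership-weighted means with weights u^m.
        new_centers = [
            sum(row[i] ** m * x for row, x in zip(u, xs))
            / sum(row[i] ** m for row in u)
            for i in range(c)
        ]
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


# Hypothetical PC1 scores with a bimodal shape (values are made up).
scores = [-2.1, -1.9, -2.0, -1.8, 1.9, 2.0, 2.2, 2.1]
centers, u = fuzzy_c_means_1d(scores)
print(centers)
print([round(row[0], 3) for row in u])
```

    Unlike a hard partition, each sample keeps a graded membership in both clusters, which is convenient where the two sediment populations mix spatially.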

  13. The Technical and Biological Reproducibility of Matrix-Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry (MALDI-TOF MS) Based Typing: Employment of Bioinformatics in a Multicenter Study

    PubMed Central

    Oberle, Michael; Wohlwend, Nadia; Jonas, Daniel; Maurer, Florian P.; Jost, Geraldine; Tschudin-Sutter, Sarah; Vranckx, Katleen; Egli, Adrian

    2016-01-01

    Background The technical, biological, and inter-center reproducibility of matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI TOF MS) typing data has not yet been explored. The aim of this study is to compare typing data from multiple centers employing bioinformatics using bacterial strains from two past outbreaks and non-related strains. Material/Methods Participants received twelve extended spectrum betalactamase-producing E. coli isolates and followed the same standard operating procedure (SOP) including a full-protein extraction protocol. All laboratories provided visually read spectra via flexAnalysis (Bruker, Germany). Raw data from each laboratory allowed calculating the technical and biological reproducibility between centers using BioNumerics (Applied Maths NV, Belgium). Results Technical and biological reproducibility ranged between 96.8–99.4% and 47.6–94.4%, respectively. The inter-center reproducibility showed a comparable clustering among identical isolates. Principal component analysis indicated a higher tendency to cluster within the same center. Therefore, we used a discriminant analysis, which completely separated the clusters. Next, we defined a reference center and performed a statistical analysis to identify specific peaks to identify the outbreak clusters. Finally, we used a classifier algorithm and a linear support vector machine on the determined peaks as classifier. A validation showed that within the set of the reference center, the identification of the cluster was 100% correct with a large contrast between the score with the correct cluster and the next best scoring cluster. Conclusions Based on the sufficient technical and biological reproducibility of MALDI-TOF MS based spectra, detection of specific clusters is possible from spectra obtained from different centers. However, we believe that a shared SOP and a bioinformatics approach are required to make the analysis robust and reliable. PMID:27798637

  14. Cluster randomised trials in the medical literature: two bibliometric surveys

    PubMed Central

    Bland, J Martin

    2004-01-01

    Background Several reviews of published cluster randomised trials have reported that about half did not take clustering into account in the analysis, which was thus incorrect and potentially misleading. In this paper I ask whether cluster randomised trials are increasing in both number and quality of reporting. Methods Computer search for papers on cluster randomised trials since 1980, hand search of trial reports published in selected volumes of the British Medical Journal over 20 years. Results There has been a large increase in the numbers of methodological papers and of trial reports using the term 'cluster random' in recent years, with about equal numbers of each type of paper. The British Medical Journal contained more such reports than any other journal. In this journal there was a corresponding increase over time in the number of trials where subjects were randomised in clusters. In 2003 all reports showed awareness of the need to allow for clustering in the analysis. In 1993 and before clustering was ignored in most such trials. Conclusion Cluster trials are becoming more frequent and reporting is of higher quality. Perhaps statistician pressure works. PMID:15310402
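
    Why ignoring clustering is "incorrect and potentially misleading" can be seen from the standard design effect for equal-sized clusters, 1 + (m - 1) × ICC, where m is the cluster size and ICC the intracluster correlation coefficient. The sketch below is a textbook illustration, not a calculation from this paper; the function names and the example numbers are assumptions.

```python
def design_effect(cluster_size, icc):
    """Variance inflation from cluster randomisation (equal cluster sizes)."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_total, cluster_size, icc):
    """Number of independent observations that n_total clustered ones are worth."""
    return n_total / design_effect(cluster_size, icc)


# Illustrative numbers only: 400 subjects in clusters of 20, ICC = 0.05.
print(design_effect(20, 0.05))
print(effective_sample_size(400, 20, 0.05))
```

    An analysis that treats the subjects as independent behaves as if the design effect were 1, so its standard errors are too small whenever the ICC is positive and clusters contain more than one subject.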

  15. Sputum neutrophil counts are associated with more severe asthma phenotypes using cluster analysis.

    PubMed

    Moore, Wendy C; Hastie, Annette T; Li, Xingnan; Li, Huashi; Busse, William W; Jarjour, Nizar N; Wenzel, Sally E; Peters, Stephen P; Meyers, Deborah A; Bleecker, Eugene R

    2014-06-01

    Clinical cluster analysis from the Severe Asthma Research Program (SARP) identified 5 asthma subphenotypes that represent the severity spectrum of early-onset allergic asthma, late-onset severe asthma, and severe asthma with chronic obstructive pulmonary disease characteristics. Analysis of induced sputum from a subset of SARP subjects showed 4 sputum inflammatory cellular patterns. Subjects with concurrent increases in eosinophil (≥2%) and neutrophil (≥40%) percentages had characteristics of very severe asthma. To better understand interactions between inflammation and clinical subphenotypes, we integrated inflammatory cellular measures and clinical variables in a new cluster analysis. Participants in SARP who underwent sputum induction at 3 clinical sites were included in this analysis (n = 423). Fifteen variables, including clinical characteristics and blood and sputum inflammatory cell assessments, were selected using factor analysis for unsupervised cluster analysis. Four phenotypic clusters were identified. Cluster A (n = 132) and B (n = 127) subjects had mild-to-moderate early-onset allergic asthma with paucigranulocytic or eosinophilic sputum inflammatory cell patterns. In contrast, these inflammatory patterns were present in only 7% of cluster C (n = 117) and D (n = 47) subjects who had moderate-to-severe asthma with frequent health care use despite treatment with high doses of inhaled or oral corticosteroids and, in cluster D, reduced lung function. The majority of these subjects (>83%) had sputum neutrophilia either alone or with concurrent sputum eosinophilia. Baseline lung function and sputum neutrophil percentages were the most important variables determining cluster assignment. This multivariate approach identified 4 asthma subphenotypes representing the severity spectrum from mild-to-moderate allergic asthma with minimal or eosinophil-predominant sputum inflammation to moderate-to-severe asthma with neutrophil-predominant or mixed granulocytic inflammation. Published by Mosby, Inc.

  16. Sputum neutrophils are associated with more severe asthma phenotypes using cluster analysis

    PubMed Central

    Moore, Wendy C.; Hastie, Annette T.; Li, Xingnan; Li, Huashi; Busse, William W.; Jarjour, Nizar N.; Wenzel, Sally E.; Peters, Stephen P.; Meyers, Deborah A.; Bleecker, Eugene R.

    2013-01-01

    Background Clinical cluster analysis from the Severe Asthma Research Program (SARP) identified five asthma subphenotypes that represent the severity spectrum of early onset allergic asthma, late onset severe asthma and severe asthma with COPD characteristics. Analysis of induced sputum from a subset of SARP subjects showed four sputum inflammatory cellular patterns. Subjects with concurrent increases in eosinophils (≥2%) and neutrophils (≥40%) had characteristics of very severe asthma. Objective To better understand interactions between inflammation and clinical subphenotypes we integrated inflammatory cellular measures and clinical variables in a new cluster analysis. Methods Participants in SARP at three clinical sites who underwent sputum induction were included in this analysis (n=423). Fifteen variables including clinical characteristics and blood and sputum inflammatory cell assessments were selected by factor analysis for unsupervised cluster analysis. Results Four phenotypic clusters were identified. Cluster A (n=132) and B (n=127) subjects had mild-moderate early onset allergic asthma with paucigranulocytic or eosinophilic sputum inflammatory cell patterns. In contrast, these inflammatory patterns were present in only 7% of Cluster C (n=117) and D (n=47) subjects who had moderate-severe asthma with frequent health care utilization despite treatment with high doses of inhaled or oral corticosteroids, and in Cluster D, reduced lung function. The majority of these subjects (>83%) had sputum neutrophilia either alone or with concurrent sputum eosinophilia. Baseline lung function and sputum neutrophils were the most important variables determining cluster assignment. Conclusion This multivariate approach identified four asthma subphenotypes representing the severity spectrum from mild-moderate allergic asthma with minimal or eosinophilic predominant sputum inflammation to moderate-severe asthma with neutrophilic predominant or mixed granulocytic inflammation. PMID:24332216

  17. Cluster Subcutaneous Allergen Specific Immunotherapy for the Treatment of Allergic Rhinitis: A Systematic Review and Meta-Analysis

    PubMed Central

    Sun, Yueqi; Luo, Xi; Li, Huabin

    2014-01-01

    Background Although allergen specific immunotherapy (SIT) represents the only immune-modifying and curative option available for patients with allergic rhinitis (AR), the optimal schedule for specific subcutaneous immunotherapy (SCIT) is still unknown. The objective of this study is to systematically assess the efficacy and safety of cluster SCIT for patients with AR. Methods By searching PubMed, EMBASE and the Cochrane clinical trials database from 1980 through May 10th, 2013, we collected and analyzed the randomized controlled trials (RCTs) of cluster SCIT to assess its efficacy and safety. Results Eight trials involving 567 participants were included in this systematic review. Our meta-analysis showed that cluster SCIT has a similar effect to conventional SCIT in reducing both rhinitis symptoms and the requirement for anti-allergic medication, but when cluster SCIT was compared with placebo, no statistically significant reduction was found in symptom scores or medication scores. Some caution is required in this interpretation as there was significant heterogeneity between studies. Data relating to the Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ) in 3 included studies were analyzed, and they consistently point to the efficacy of cluster SCIT in improving quality of life compared to placebo. Regarding the safety of cluster SCIT, meta-analysis showed no differences in the incidence of either local or systemic adverse reactions between the cluster group and the control group. Conclusion Based on the current limited evidence, we still could not conclude affirmatively that cluster SCIT was a safe and efficacious option for the treatment of AR patients. Further large-scale, well-designed RCTs on this topic are still needed. PMID:24489740

  18. Characteristics of airflow and particle deposition in COPD current smokers

    NASA Astrophysics Data System (ADS)

    Zou, Chunrui; Choi, Jiwoong; Haghighi, Babak; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long

    2017-11-01

    A recent imaging-based cluster analysis of computed tomography (CT) lung images in a chronic obstructive pulmonary disease (COPD) cohort identified four clusters, viz. disease sub-populations. Cluster 1 had relatively normal airway structures; Cluster 2 had wall thickening; Cluster 3 exhibited decreased wall thickness and luminal narrowing; Cluster 4 had a significant decrease of luminal diameter and a significant reduction of lung deformation, thus having relatively low pulmonary functions. To better understand the characteristics of airflow and particle deposition in these clusters, we performed computational fluid and particle dynamics analyses on representative cluster patients and healthy controls using CT-based airway models and subject-specific 3D-1D coupled boundary conditions. The results show that particle deposition in central airways of cluster 4 patients was noticeably increased especially with increasing particle size despite reduced vital capacity as compared to other clusters and healthy controls. This may be attributable in part to significant airway constriction in cluster 4. This study demonstrates the potential application of cluster-guided CFD analysis in disease populations. NIH Grants U01HL114494 and S10-RR022421, and FDA Grant U01FD005837.

  19. Countries population determination to test rice crisis indicator at national level using k-means cluster analysis

    NASA Astrophysics Data System (ADS)

    Hidayat, Y.; Purwandari, T.; Sukono; Ariska, Y. D.

    2017-01-01

    This study aimed to identify the population of countries that are similar to Indonesia on three characteristics: democratic atmosphere, rice consumption, and purchasing power for rice. The result is useful as reference material for research testing the strength and predictability of the rice crisis indicator Unprecedented Restlessness (UR). Countries similar to Indonesia were identified using a multivariate method, non-hierarchical k-means cluster analysis, with 38 countries as the data population. The analysis was repeated until it yielded a number of clusters that discriminated on the three characteristics and showed high similarity within clusters. The results show that 6 clusters adequately discriminate among the characteristics of the clusters formed. However, to answer the purpose of the study, only one cluster was selected, according to the following criteria for the population of countries similar to Indonesia: the cluster contains Indonesia, it includes countries that did and did not suffer a rice crisis in 2008, and it has the most members among such clusters. These criteria are met by cluster 2, which consists of 22 countries, namely Indonesia, Brazil, Costa Rica, Djibouti, Dominican Republic, Ecuador, Fiji, Guinea-Bissau, Haiti, India, Jamaica, Japan, Korea South, Madagascar, Malaysia, Mali, Nicaragua, Panama, Peru, Senegal, Sierra Leone and Suriname.
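
    The k-means procedure named in this record can be sketched as follows. This is a minimal illustration of plain Lloyd's algorithm, not the authors' implementation: the three-column scores, the function names `kmeans` and `within_ss`, and the choice of starting centroids are all hypothetical. Rerunning it for several values of k and comparing the within-cluster sum of squares mirrors, loosely, the repeated analysis the abstract describes.

```python
import math


def kmeans(points, init, max_iter=100):
    """Plain Lloyd's algorithm; `init` holds the starting centroids."""
    centroids = [tuple(c) for c in init]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def within_ss(points, labels, centroids):
    """Total squared distance of points to their assigned centroids."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical standardized scores: (democracy, rice consumption, purchasing power).
scores = [
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.0), (0.0, 0.0, 0.2),
    (2.0, 2.1, 1.9), (2.2, 1.9, 2.0), (1.9, 2.0, 2.1),
]

# Repeat the analysis for several k and record the within-cluster scatter.
wss_by_k = {}
for k in (1, 2, 3):
    labels_k, centroids_k = kmeans(scores, init=scores[:k])
    wss_by_k[k] = within_ss(scores, labels_k, centroids_k)

labels, centroids = kmeans(scores, init=scores[:2])
print(labels, wss_by_k)
```

    Starting centroids are passed in explicitly so the run is deterministic; a real analysis would normally try several random starts.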

  20. Scoring clustering solutions by their biological relevance.

    PubMed

    Gat-Viks, I; Sharan, R; Shamir, R

    2003-12-12

    A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.
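
    The quantity this record maximizes, the ratio of between-group to within-group variance estimators, can be illustrated on values that are already projected onto the real line. The sketch below computes only that ratio (the classical one-way ANOVA F statistic); the projection step and the non-parametric test from the abstract are not reproduced, and the function name and data are hypothetical. It assumes at least two groups and non-zero within-group variance.

```python
from statistics import mean


def variance_ratio(values, labels):
    """Between-group over within-group variance estimator for 1-D values."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values()) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g) / (n - k)
    return between / within


# Hypothetical projected attribute values for six genes.
projected = [1.0, 1.2, 0.8, 3.0, 3.1, 2.9]
coherent = variance_ratio(projected, [0, 0, 0, 1, 1, 1])   # clusters match the values
scrambled = variance_ratio(projected, [0, 1, 0, 1, 0, 1])  # clusters ignore the values
print(coherent, scrambled)
```

    A clustering that agrees with the attribute values gives a much larger ratio than one that does not, which is the intuition behind scoring solutions this way.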

  1. Cluster analysis for determining distribution center location

    NASA Astrophysics Data System (ADS)

    Lestari Widaningrum, Dyah; Andika, Aditya; Murphiyanto, Richard Dimas Julian

    2017-12-01

    Determination of distribution facilities is highly important to survive in the high level of competition in today’s business world. Companies can operate multiple distribution centers to mitigate supply chain risk. Thus, new problems arise, namely how many facilities should be provided and where. This study examines a fast-food restaurant brand located in Greater Jakarta. The brand ranks among the top 5 fast-food restaurant chains by retail sales. The study had three stages: compiling spatial data, cluster analysis, and network analysis. The cluster analysis results are used to consider the location of an additional distribution center. The network analysis results show a more efficient process, with a shorter distance in the distribution process.

  2. Template growth of Au, Ni and Ni–Au nanoclusters on hexagonal boron nitride/Rh(111): a combined STM, TPD and AES study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wu, Fanglue; Huang, Dali; Yue, Yuan

    In this study, the template growth of Au, Ni, and Ni–Au bimetallic nanoclusters on hexagonal boron nitride/Rh(111), i.e. h-BN/Rh(111), was investigated via scanning tunneling microscopy (STM), temperature programmed-desorption (TPD), and Auger electron spectroscopy (AES). STM study shows that template growth of Au clusters on h-BN/Rh(111) forms mainly well-dispersed monolayer clusters. In contrast, Ni forms large multilayer clusters showing a relatively high diffusivity on h-BN/Rh(111) substrate. Ni–Au bimetallic clusters are effectively formed first by Au deposition followed by Ni deposition, with the Au clusters functioning as nucleation sites for the subsequently deposited Ni. Further structural analysis was carried out via TPD and AES. The resulting TPD and AES data show the surface composition and charge transfer between Au and Ni of the bimetallic clusters. These results suggest that the h-BN/Rh(111) substrate represents a unique candidate for supporting Ni–Au bimetallic clusters in further catalytic reactions.

  3. Template growth of Au, Ni and Ni–Au nanoclusters on hexagonal boron nitride/Rh(111): a combined STM, TPD and AES study

    DOE PAGES

    Wu, Fanglue; Huang, Dali; Yue, Yuan; ...

    2017-09-12

    In this study, the template growth of Au, Ni, and Ni–Au bimetallic nanoclusters on hexagonal boron nitride/Rh(111), i.e. h-BN/Rh(111), was investigated via scanning tunneling microscopy (STM), temperature programmed-desorption (TPD), and Auger electron spectroscopy (AES). STM study shows that template growth of Au clusters on h-BN/Rh(111) forms mainly well-dispersed monolayer clusters. In contrast, Ni forms large multilayer clusters showing a relatively high diffusivity on h-BN/Rh(111) substrate. Ni–Au bimetallic clusters are effectively formed first by Au deposition followed by Ni deposition, with the Au clusters functioning as nucleation sites for the subsequently deposited Ni. Further structural analysis was carried out via TPD and AES. The resulting TPD and AES data show the surface composition and charge transfer between Au and Ni of the bimetallic clusters. These results suggest that the h-BN/Rh(111) substrate represents a unique candidate for supporting Ni–Au bimetallic clusters in further catalytic reactions.

  4. Comparative genomics reveals phylogenetic distribution patterns of secondary metabolites in Amycolatopsis species.

    PubMed

    Adamek, Martina; Alanjary, Mohammad; Sales-Ortells, Helena; Goodfellow, Michael; Bull, Alan T; Winkler, Anika; Wibberg, Daniel; Kalinowski, Jörn; Ziemert, Nadine

    2018-06-01

    Genome mining tools have enabled us to predict biosynthetic gene clusters that might encode compounds with valuable functions for industrial and medical applications. With the continuously increasing number of genomes sequenced, we are confronted with an overwhelming number of predicted clusters. In order to guide the effective prioritization of biosynthetic gene clusters towards finding the most promising compounds, knowledge about diversity, phylogenetic relationships and distribution patterns of biosynthetic gene clusters is necessary. Here, we provide a comprehensive analysis of the model actinobacterial genus Amycolatopsis and its potential for the production of secondary metabolites. A phylogenetic characterization, together with a pan-genome analysis, showed that within this highly diverse genus, four major lineages could be distinguished which differed in their potential to produce secondary metabolites. Furthermore, we were able to distinguish gene cluster families whose distribution correlated with phylogeny, indicating that vertical gene transfer plays a major role in the evolution of secondary metabolite gene clusters. Still, the vast majority of the diverse biosynthetic gene clusters were derived from clusters unique to the genus, and also unique in comparison to a database of known compounds. Our study on the locations of biosynthetic gene clusters in the genomes of Amycolatopsis strains showed that clusters acquired by horizontal gene transfer tend to be incorporated into non-conserved regions of the genome, thereby allowing us to distinguish core and hypervariable regions in Amycolatopsis genomes. Using a comparative genomics approach, it was possible to determine the potential of the genus Amycolatopsis to produce a huge diversity of secondary metabolites. Furthermore, the analysis demonstrates that horizontal and vertical gene transfer play an important role in the acquisition and maintenance of valuable secondary metabolites. Our results cast light on the interconnections between secondary metabolite gene clusters and provide a way to prioritize biosynthetic pathways in the search and discovery of novel compounds.

  5. Model-based clustering for RNA-seq data.

    PubMed

    Si, Yaqing; Liu, Peng; Li, Pinghua; Brutnell, Thomas P

    2014-01-15

    RNA-seq technology has been widely adopted as an attractive alternative to microarray-based methods to study global gene expression. However, robust statistical tools to analyze these complex datasets are still lacking. By grouping genes with similar expression profiles across treatments, cluster analysis provides insight into gene functions and networks, and hence is an important technique for RNA-seq data analysis. In this manuscript, we derive clustering algorithms based on appropriate probability models for RNA-seq data. An expectation-maximization algorithm and another two stochastic versions of expectation-maximization algorithms are described. In addition, a strategy for initialization based on likelihood is proposed to improve the clustering algorithms. Moreover, we present a model-based hybrid-hierarchical clustering method to generate a tree structure that allows visualization of relationships among clusters as well as flexibility of choosing the number of clusters. Results from both simulation studies and analysis of a maize RNA-seq dataset show that our proposed methods provide better clustering results than alternative methods such as the K-means algorithm and hierarchical clustering methods that are not based on probability models. An R package, MBCluster.Seq, has been developed to implement our proposed algorithms. This R package provides fast computation and is publicly available at http://www.r-project.org

  6. [Cluster analysis applicability to fitness evaluation of cosmonauts on long-term missions of the International space station].

    PubMed

    Egorov, A D; Stepantsov, V I; Nosovskiĭ, A M; Shipov, A A

    2009-01-01

    Cluster analysis was applied to evaluate locomotion training (running and running intermingled with walking) of 13 cosmonauts on long-term ISS missions by the parameters of duration (min), distance (m) and intensity (km/h). Based on the results of analyses, the cosmonauts were distributed into three steady groups of 2, 5 and 6 persons. Distance and speed showed a statistical rise (p < 0.03) from group 1 to group 3. Duration of physical locomotion training was not statistically different in the groups (p = 0.125). Therefore, cluster analysis is an adequate method of evaluating fitness of cosmonauts on long-term missions.

  7. Finding Groups Using Model-based Cluster Analysis: Heterogeneous Emotional Self-regulatory Processes and Heavy Alcohol Use Risk

    PubMed Central

    Mun, Eun-Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

    2010-01-01

    Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of non-nested models using the Bayesian Information Criterion (BIC) to compare multiple models and identify the optimum number of clusters. The current study clustered 36 young men and women based on their baseline heart rate (HR) and HR variability (HRV), chronic alcohol use, and reasons for drinking. Two cluster groups were identified and labeled High Alcohol Risk and Normative groups. Compared to the Normative group, individuals in the High Alcohol Risk group had higher levels of alcohol use and more strongly endorsed disinhibition and suppression reasons for use. The High Alcohol Risk group showed significant HRV changes in response to positive and negative emotional and appetitive picture cues, compared to neutral cues. In contrast, the Normative group showed a significant HRV change only to negative cues. Findings suggest that the individuals with autonomic self-regulatory difficulties may be more susceptible to heavy alcohol use and use alcohol for emotional regulation. PMID:18331138
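
    The BIC-based choice of the number of clusters described in this record can be sketched as below. The log-likelihood values, the assumed dimension of 4 variables, and the helper names are hypothetical; only the sample size of 36 comes from the abstract. The sketch uses the convention BIC = k ln(n) - 2 ln(L), where smaller is better; some software reports the negative of this quantity, in which case larger is better.

```python
import math


def n_free_params(n_components, dim):
    """Free parameters of an unconstrained multivariate normal mixture."""
    means = n_components * dim
    covariances = n_components * dim * (dim + 1) // 2
    mixing_weights = n_components - 1
    return means + covariances + mixing_weights


def bic(log_likelihood, n_params, n_obs):
    """BIC = k * ln(n) - 2 * ln(L); smaller is better under this convention."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


N_OBS = 36  # sample size reported in the abstract
DIM = 4     # hypothetical number of clustering variables

# Hypothetical maximized log-likelihoods for 1, 2 and 3 mixture components.
log_likelihoods = {1: -210.0, 2: -170.0, 3: -165.0}

bic_by_g = {
    g: bic(ll, n_free_params(g, DIM), N_OBS) for g, ll in log_likelihoods.items()
}
best_g = min(bic_by_g, key=bic_by_g.get)
print(bic_by_g, best_g)
```

    With these made-up numbers the third component improves the likelihood too little to pay for its extra parameters, so the two-component model is preferred.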

  8. AGN Feedback in Clusters of Galaxies

    DTIC Science & Technology

    2010-01-01

    cooling non-radiatively or being heated to higher temperatures. Throughout this paper, we use the term “cooling flow” to indicate clusters with...taurus cluster [51] and M87/Virgo [24]. Concentric ripple-like features are also seen surrounding the center of Abell 2052, but current analysis shows that...2002) Chandra Imaging of the X-ray Core of the Virgo Cluster. ApJ 579:560-570. 37. Fujita Y et al. (2002) Chandra Observations of the Disruption of the

  9. A cluster analysis of tic symptoms in children and adults with Tourette syndrome: clinical correlates and treatment outcome.

    PubMed

    McGuire, Joseph F; Nyirabahizi, Epiphanie; Kircanski, Katharina; Piacentini, John; Peterson, Alan L; Woods, Douglas W; Wilhelm, Sabine; Walkup, John T; Scahill, Lawrence

    2013-12-30

    Cluster analytic methods have examined the symptom presentation of chronic tic disorders (CTDs), with limited agreement across studies. The present study investigated patterns, clinical correlates, and treatment outcome of tic symptoms. 239 youth and adults with CTDs completed a battery of assessments at baseline to determine diagnoses, tic severity, and clinical characteristics. Participants were randomly assigned to receive either a comprehensive behavioral intervention for tics (CBIT) or psychoeducation and supportive therapy (PST). A cluster analysis was conducted on the baseline Yale Global Tic Severity Scale (YGTSS) symptom checklist to identify the constellations of tic symptoms. Four tic clusters were identified: Impulse Control and Complex Phonic Tics; Complex Motor Tics; Simple Head Motor/Vocal Tics; and Primarily Simple Motor Tics. Frequencies of tic symptoms showed few differences across youth and adults. Tic clusters had small associations with clinical characteristics and showed no associations with the presence of coexisting psychiatric conditions. Cluster membership scores did not predict treatment response to CBIT or tic severity reductions. Tic symptoms distinctly cluster with little difference across youth and adults, or coexisting conditions. This study, which is the first to examine tic clusters and response to treatment, suggested that tic symptom profiles respond equally well to CBIT. ClinicalTrials.gov identifiers: NCT00218777; NCT00231985. © 2013 Elsevier Ireland Ltd. All rights reserved.

  10. Network-constrained spatio-temporal clustering analysis of traffic collisions in Jianghan District of Wuhan, China

    PubMed Central

    Fan, Yaxin; Zhu, Xinyan; Guo, Wei; Guo, Tao

    2018-01-01

    The analysis of traffic collisions is essential for urban safety and the sustainable development of the urban environment. Reducing the road traffic injuries and the financial losses caused by collisions is the most important goal of traffic management. In addition, traffic collisions are a major cause of traffic congestion, which is a serious issue that affects everyone in society. Therefore, traffic collision analysis is essential for all parties, including drivers, pedestrians, and traffic officers, to understand the road risks at a finer spatio-temporal scale. However, traffic collisions in the urban context are dynamic and complex. Thus, it is important to detect how the collision hotspots evolve over time through spatio-temporal clustering analysis. In addition, traffic collisions are not isolated events in space. The characteristics of the traffic collisions and their surrounding locations also influence the clusters. This work explores the spatio-temporal clustering patterns of traffic collisions by combining a set of network-constrained methods. These methods were tested using the traffic collision data in Jianghan District of Wuhan, China. The results demonstrated that these methods offer different perspectives on the spatio-temporal clustering patterns. The weighted network kernel density estimation provides an intuitive way to incorporate attribute information. The network cross K-function shows that there are varying clustering tendencies between traffic collisions and different types of POIs. The proposed network differential Local Moran’s I and network local indicators of mobility association provide straightforward and quantitative measures of the hotspot changes. This case study shows that these methods could help researchers, practitioners, and policy-makers to better understand the spatio-temporal clustering patterns of traffic collisions. PMID:29672551

  11. Phenotypes of sleeplessness: stressing the need for psychodiagnostics in the assessment of insomnia.

    PubMed

    van de Laar, Merijn; Leufkens, Tim; Bakker, Bart; Pevernagie, Dirk; Overeem, Sebastiaan

    2017-09-01

    Insomnia is too general a term for various subtypes that might have different etiologies and therefore require different types of treatment. In this explorative study we used cluster analysis to distinguish different phenotypes in 218 patients with insomnia, taking into account several factors including sleep variables and characteristics related to personality and psychiatric comorbidity. Three clusters emerged from the analysis. The 'moderate insomnia with low psychopathology'-cluster was characterized by relatively normal personality traits, as well as normal levels of anxiety and depressive symptoms in the presence of moderate insomnia severity. The 'severe insomnia with moderate psychopathology'-cluster showed relatively high scores on the Insomnia Severity Index and scores on the sleep log that were indicative of severe insomnia. Anxiety and depressive symptoms were slightly above the cut-off and they were characterized by below average self-sufficiency and less goal-directed behavior. The 'early onset insomnia with high psychopathology'-cluster showed a much younger age and earlier insomnia onset than the other two groups. Anxiety and depressive symptoms were well above the cut-off score and the group consisted of a higher percentage of subjects with comorbid psychiatric disorders. This cluster showed a 'typical psychiatric' personality profile. Our findings stress the need for psychodiagnostic procedures next to a sleep-related diagnostic approach, especially in the younger insomnia patients. Specific treatment suggestions are given based on the three phenotypes.

  12. Spatio-Temporal Analysis of Smear-Positive Tuberculosis in the Sidama Zone, Southern Ethiopia

    PubMed Central

    Dangisso, Mesay Hailu; Datiko, Daniel Gemechu; Lindtjørn, Bernt

    2015-01-01

    Background Tuberculosis (TB) is a disease of public health concern, with a varying distribution across settings depending on socio-economic status, HIV burden, availability and performance of the health system. Ethiopia is a country with a high burden of TB, with regional variations in TB case notification rates (CNRs). However, TB program reports are often compiled and reported at higher administrative units that do not show the burden at lower units, so there is limited information about the spatial distribution of the disease. We therefore aim to assess the spatial distribution and presence of the spatio-temporal clustering of the disease in different geographic settings over 10 years in the Sidama Zone in southern Ethiopia. Methods Retrospective space-time and spatial analyses were carried out at the kebele level (the lowest administrative unit within a district) to identify spatial and space-time clusters of smear-positive pulmonary TB (PTB). Scan statistics, Global Moran’s I, and Getis and Ord (Gi*) statistics were all used to help analyze the spatial distribution and clusters of the disease across settings. Results A total of 22,545 smear-positive PTB cases notified over 10 years were used for spatial analysis. In a purely spatial analysis, we identified the most likely cluster of smear-positive PTB in 192 kebeles in eight districts (RR= 2, p<0.001), with 12,155 observed and 8,668 expected cases. The Gi* statistic also identified the clusters in the same areas, and the spatial clusters showed stability in most areas in each year during the study period. The space-time analysis also detected the most likely cluster in 193 kebeles in the same eight districts (RR= 1.92, p<0.001), with 7,584 observed and 4,738 expected cases in 2003-2012. Conclusion The study found variations in CNRs and significant spatio-temporal clusters of smear-positive PTB in the Sidama Zone. The findings can be used to guide TB control programs to devise effective TB control strategies for the geographic areas characterized by the highest CNRs. Further studies are required to understand the factors associated with clustering based on individual level locations and investigation of cases. PMID:26030162
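
    One of the statistics named in this record, Global Moran's I, can be sketched in a few lines. The four-unit chain of neighbouring areas, the binary adjacency weights, and the notification rates below are hypothetical and are not taken from the study; the formula is the standard I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, with W the sum of all weights and m the mean.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a square spatial weight matrix."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    w_total = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    sum_sq = sum(d * d for d in dev)
    return (n / w_total) * cross / sum_sq


# Hypothetical chain of four areas: each area borders the next one only.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

# Hypothetical case notification rates per 100,000.
clustered_rates = [120.0, 110.0, 40.0, 35.0]    # high rates sit next to each other
alternating_rates = [120.0, 40.0, 110.0, 35.0]  # high and low rates alternate

clustered_i = morans_i(clustered_rates, adjacency)
alternating_i = morans_i(alternating_rates, adjacency)
print(clustered_i, alternating_i)
```

    Positive values indicate that similar rates tend to be neighbours, negative values that neighbours tend to differ; significance testing, which the study would also need, is not shown.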

  13. Analysis of candidates for interacting galaxy clusters. I. A1204 and A2029/A2033

    NASA Astrophysics Data System (ADS)

    Gonzalez, Elizabeth Johana; de los Rios, Martín; Oio, Gabriel A.; Lang, Daniel Hernández; Tagliaferro, Tania Aguirre; Domínguez R., Mariano J.; Castellón, José Luis Nilo; Cuevas L., Héctor; Valotto, Carlos A.

    2018-04-01

    Context. Merging galaxy clusters allow for the study of different mass components, dark and baryonic, separately. Also, their occurrence makes it possible to test the ΛCDM scenario, which can be used to put constraints on the self-interacting cross-section of the dark-matter particle. Aims. It is necessary to perform a homogeneous analysis of these systems. Hence, based on a recently presented sample of candidates for interacting galaxy clusters, we present the analysis of two of these cataloged systems. Methods. In this work, the first of a series devoted to characterizing galaxy clusters in merger processes, we perform a weak lensing analysis of clusters A1204 and A2029/A2033 to derive the total masses of each identified interacting structure together with a dynamical study based on a two-body model. We also describe the gas and the mass distributions in the field through a lensing and an X-ray analysis. This is the first of a series of works which will analyze this type of system in order to characterize them. Results. Neither merging cluster candidate shows evidence of having had a recent merger event. Nevertheless, there is dynamical evidence that these systems could be interacting or could interact in the future. Conclusions. It is necessary to include more constraints in order to improve the methodology of classifying merging galaxy clusters. Characterization of these clusters is important in order to properly understand the nature of these systems and their connection with dynamical studies.

  14. Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis.

    PubMed

    Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T

    2010-01-01

    Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.

  15. Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

    PubMed

    Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

    2018-06-07

    Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
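
    The selection step this record describes, keeping only the clusters whose centres lie nearest to a query point, can be sketched as follows. This is an illustration under stated assumptions: the cluster labels are taken as given (the abstract obtains them by hierarchical clustering), the PLS-DA model that would then be fitted on the selected samples is omitted, and all names and data are hypothetical.

```python
import math


def cluster_centres(samples, labels):
    """Mean vector of each cluster, keyed by cluster label."""
    groups = {}
    for x, lab in zip(samples, labels):
        groups.setdefault(lab, []).append(x)
    return {
        lab: tuple(sum(col) / len(rows) for col in zip(*rows))
        for lab, rows in groups.items()
    }


def nearest_clusters(query, centres, n_nearest):
    """Labels of the `n_nearest` clusters whose centres are closest to `query`."""
    ranked = sorted(centres, key=lambda lab: math.dist(query, centres[lab]))
    return ranked[:n_nearest]


def local_training_set(query, samples, labels, n_nearest):
    """Samples belonging to the clusters nearest to `query`."""
    keep = set(nearest_clusters(query, cluster_centres(samples, labels), n_nearest))
    return [x for x, lab in zip(samples, labels) if lab in keep]


# Hypothetical two-feature spectra summaries in three clusters.
samples = [(0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (10.0, 0.0), (10.2, 0.2)]
labels = ["a", "a", "b", "b", "c", "c"]
query = (4.8, 4.7)

centres = cluster_centres(samples, labels)
closest = nearest_clusters(query, centres, n_nearest=1)
local = local_training_set(query, samples, labels, n_nearest=2)
print(closest, local)
```

    A classifier trained only on `local` sees data that is closer to unimodal around the query, which is the motivation the abstract gives for the approach.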

  16. Determining the Optimal Number of Clusters with the Clustergram

    NASA Technical Reports Server (NTRS)

    Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.

    2011-01-01

    Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.

  17. Characterizing the course of back pain after osteoporotic vertebral fracture: a hierarchical cluster analysis of a prospective cohort study.

    PubMed

    Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki

    2017-09-23

    This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS for 128 patients was used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significantly high baseline pain intensity, a higher degree of angular instability, and a higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.

  18. Using data mining to segment healthcare markets from patients' preference perspectives.

    PubMed

    Liu, Sandra S; Chen, Jie

    2009-01-01

    This paper aims to provide an example of how to use data mining techniques to identify patient segments regarding preferences for healthcare attributes and their demographic characteristics. Data were derived from a number of individuals who received in-patient care at a health network in 2006. Data mining and conventional hierarchical clustering with average linkage and Pearson correlation procedures are employed and compared to show how each procedure best determines segmentation variables. Data mining tools identified three differentiable segments by means of cluster analysis. These three clusters have significantly different demographic profiles. The study reveals, when compared with traditional statistical methods, that data mining provides an efficient and effective tool for market segmentation. When there are numerous cluster variables involved, researchers and practitioners need to incorporate factor analysis for reducing variables to clearly and meaningfully understand clusters. Interests and applications in data mining are increasing in many businesses. However, this technology is seldom applied to healthcare customer experience management. The paper shows that efficient and effective application of data mining methods can aid the understanding of patient healthcare preferences.
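
The conventional procedure named in this abstract, hierarchical clustering with average linkage and a Pearson correlation measure, can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the `ratings` table is made-up data, the helper names `pearson` and `average_linkage` are hypothetical, and using 1 - r as the distance is an assumption about how the correlation is turned into a dissimilarity.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Distance between two profiles is 1 - Pearson r (an assumption);
    distance between two clusters is the mean of all pairwise profile
    distances, which is the definition of average linkage.
    """
    def d(i, j):
        return 1.0 - pearson(profiles[i], profiles[j])

    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(d(i, j) for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up preference ratings: rows are respondents, columns are attributes.
ratings = [
    [5, 4, 1, 2],
    [4, 5, 2, 1],
    [1, 2, 5, 4],
    [2, 1, 4, 5],
    [3, 5, 3, 1],
]
segments = average_linkage(ratings, n_clusters=2)
print(segments)
```

Correlation-based distance groups respondents by the shape of their preference profile rather than by its overall level, which is one plausible reason to pair it with average linkage in a segmentation setting.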

  19. Cluster headache and the hypocretin receptor 2 reconsidered: a genetic association study and meta-analysis.

    PubMed

    Weller, Claudia M; Wilbrink, Leopoldine A; Houwing-Duistermaat, Jeanine J; Koelewijn, Stephany C; Vijfhuizen, Lisanne S; Haan, Joost; Ferrari, Michel D; Terwindt, Gisela M; van den Maagdenberg, Arn M J M; de Vries, Boukje

    2015-08-01

    Cluster headache is a severe neurological disorder with a complex genetic background. A missense single nucleotide polymorphism (rs2653349; p.Ile308Val) in the HCRTR2 gene that encodes the hypocretin receptor 2 is the only genetic factor that is reported to be associated with cluster headache in different studies. However, as there are conflicting results between studies, we re-evaluated its role in cluster headache. We performed a genetic association analysis for rs2653349 in our large Leiden University Cluster headache Analysis (LUCA) program study population. Systematic selection of the literature yielded three additional studies comprising five study populations, which were included in our meta-analysis. Data were extracted according to predefined criteria. A total of 575 cluster headache patients from our LUCA study and 874 controls were genotyped for HCRTR2 SNP rs2653349 but no significant association with cluster headache was found (odds ratio 0.91 (95% confidence intervals 0.75-1.10), p = 0.319). In contrast, the meta-analysis that included in total 1167 cluster headache cases and 1618 controls from the six study populations, which were part of four different studies, showed association of the single nucleotide polymorphism with cluster headache (random effect odds ratio 0.69 (95% confidence intervals 0.53-0.90), p = 0.006). The association became weaker, as the odds ratio increased to 0.80, when the meta-analysis was repeated without the initial single South European study with the largest effect size. Although we did not find evidence for association of rs2653349 in our LUCA study, which is the largest investigated study population thus far, our meta-analysis provides genetic evidence for a role of HCRTR2 in cluster headache. Regardless, we feel that the association should be interpreted with caution as meta-analyses with individual populations that have limited power have diminished validity. © International Headache Society 2014.

  20. Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

    PubMed Central

    Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru

    2015-01-01

    Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225

  1. COVARIATE-ADAPTIVE CLUSTERING OF EXPOSURES FOR AIR POLLUTION EPIDEMIOLOGY COHORTS

    PubMed Central

    Keller, Joshua P.; Drton, Mathias; Larson, Timothy; Kaufman, Joel D.; Sandler, Dale P.; Szpiro, Adam A.

    2017-01-01

    Cohort studies in air pollution epidemiology aim to establish associations between health outcomes and air pollution exposures. Statistical analysis of such associations is complicated by the multivariate nature of the pollutant exposure data as well as the spatial misalignment that arises from the fact that exposure data are collected at regulatory monitoring network locations distinct from cohort locations. We present a novel clustering approach for addressing this challenge. Specifically, we present a method that uses geographic covariate information to cluster multi-pollutant observations and predict cluster membership at cohort locations. Our predictive k-means procedure identifies centers using a mixture model and is followed by multi-class spatial prediction. In simulations, we demonstrate that predictive k-means can reduce misclassification error by over 50% compared to ordinary k-means, with minimal loss in cluster representativeness. The improved prediction accuracy results in large gains of 30% or more in power for detecting effect modification by cluster in a simulated health analysis. In an analysis of the NIEHS Sister Study cohort using predictive k-means, we find that the association between systolic blood pressure (SBP) and long-term fine particulate matter (PM2.5) exposure varies significantly between different clusters of PM2.5 component profiles. Our cluster-based analysis shows that for subjects assigned to a cluster located in the Midwestern U.S., a 10 μg/m3 difference in exposure is associated with 4.37 mmHg (95% CI, 2.38, 6.35) higher SBP. PMID:28572869

  2. Relation between the Dynamics of Glassy Clusters and Characteristic Features of their Energy Landscape

    NASA Astrophysics Data System (ADS)

    De, Sandip; Schaefer, Bastian; Sadeghi, Ali; Sicher, Michael; Kanhere, D. G.; Goedecker, Stefan

    2014-02-01

    Based on a recently introduced metric for measuring distances between configurations, we introduce distance-energy (DE) plots to characterize the potential energy surface of clusters. Producing such plots is computationally feasible on the density functional level since it requires only a few hundred stable low energy configurations including the global minimum. By using standard criteria based on disconnectivity graphs and the dynamics of Lennard-Jones clusters, we show that the DE plots convey the necessary information about the character of the potential energy surface and allow us to distinguish between glassy and nonglassy systems. We then apply this analysis to real clusters at the density functional theory level and show that both glassy and nonglassy clusters can be found in simulations. It turns out that among our investigated clusters only those can be synthesized experimentally which exhibit a nonglassy landscape.

  3. Sirenomelia in Argentina: Prevalence, geographic clusters and temporal trends analysis.

    PubMed

    Groisman, Boris; Liascovich, Rosa; Gili, Juan Antonio; Barbero, Pablo; Bidondo, María Paz

    2016-07-01

    Sirenomelia is a severe malformation of the lower body characterized by a single medial lower limb and a variable combination of visceral abnormalities. Given that sirenomelia is a very rare birth defect, epidemiological studies are scarce. The aim of this study is to evaluate prevalence, geographic clusters and time trends of sirenomelia in Argentina, using data from the National Network of Congenital Anomalies of Argentina (RENAC) from November 2009 until December 2014. This is a descriptive study using data from the RENAC, a hospital-based surveillance system for newborns affected with major morphological congenital anomalies. We calculated sirenomelia prevalence throughout the period, searched for geographical clusters, and evaluated time trends. The prevalence of confirmed cases of sirenomelia throughout the period was 2.35 per 100,000 births. Cluster analysis showed no statistically significant geographical aggregates. Time-trends analysis showed that the prevalence was higher in years 2009 to 2010. The observed prevalence was higher than that observed in previous epidemiological studies in other geographic regions. We observed a likely real increase in the initial period of our study. We used strict diagnostic criteria, excluding cases that only had clinical diagnosis of sirenomelia. Therefore, real prevalence could be even higher. This study did not show any geographic clusters. Because the etiology of sirenomelia has not yet been established, studies of epidemiological features of this defect may contribute to defining its causes. Birth Defects Research (Part A) 106:604-611, 2016. © 2016 Wiley Periodicals, Inc.

  4. Structural and electronic properties Te₆²⁺ and Te₈²⁺: A DFT study

    NASA Astrophysics Data System (ADS)

    Sharma, Tamanna; Tamboli, Rohit; Kanhere, D. G.; Sharma, Raman

    2018-05-01

    Structural and electronic properties of tellurium clusters (Teₙ) and their cations (Teₙ²⁺) (n = 6, 8) have been studied theoretically using VASP within the generalized gradient approximation. Ground state geometries and higher energy isomers of these clusters have been examined on the basis of total free energy calculations. The lowest energy isomers of the neutral clusters are ring-like structures, whereas the lowest energy isomers of the cations are polyhedral cages. The HOMO-LUMO gap in the cationic clusters is small compared to that of the neutral clusters. Removal of two electrons from the neutral cluster raises the free energy. Analysis of the free energy, HOMO-LUMO gap and density of states (DOS) shows that the neutral clusters are more stable than their cations.

  5. HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

    NASA Astrophysics Data System (ADS)

    Schellenberger, G.; Reiprich, T. H.

    2017-08-01

    The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects, that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias to our hydrostatic masses; possible biases in this mass-mass comparison are discussed including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).

  6. Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

    PubMed

    Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

    2015-01-01

    In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.

  7. Phylogenetic relationship of Ornithobacterium rhinotracheale strains.

    PubMed

    DE Oca-Jimenez, Roberto Montes; Vega-Sanchez, Vicente; Morales-Erasto, Vladimir; Salgado-Miranda, Celene; Blackall, Patrick J; Soriano-Vargas, Edgardo

    2018-04-10

    The bacterium Ornithobacterium rhinotracheale is associated with respiratory disease in wild birds and poultry. In this study, the phylogenetic analysis of nine reference strains of O. rhinotracheale belonging to serovars A to I, and eight Mexican isolates belonging to serovar A, was performed. The analysis was extended to include available sequences from another 23 strains available in the public domain. The analysis showed that the 40 sequences formed six clusters, I to VI. All eight Mexican field isolates were placed in cluster I. One of the reference strains appears to present genetic diversity not previously recognized and was placed in a new genetic cluster. In conclusion, the phylogenetic analysis of O. rhinotracheale strains, based on the 16S rRNA gene, is a suitable tool for epidemiologic studies.

  8. Text grouping in patent analysis using adaptive K-means clustering algorithm

    NASA Astrophysics Data System (ADS)

    Shanie, Tiara; Suprijadi, Jadi; Zulhanif

    2017-03-01

    Patents are one form of intellectual property. Analyzing patents is a prerequisite for understanding the development of technology in each country and in the world today. This study uses patent documents about green tea retrieved from the Espacenet server. Patent documents related to tea technology are still widely scattered, which makes information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into specific groups according to the related terms they contain. This study applies the proposed statistical text mining method to the title text of green tea patents in two phases: data preparation and data analysis. The data preparation phase uses text mining methods, and the data analysis phase uses statistics. The statistical analysis uses a cluster analysis algorithm, the Adaptive K-Means Clustering Algorithm. Based on the maximum silhouette value, the analysis yields 87 clusters, associated with fifteen terms contained therein, that can be used for information retrieval.
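
Choosing the number of clusters by the maximum silhouette value, as this abstract describes, can be illustrated with a small stdlib-only sketch. It uses plain Lloyd's k-means with a deterministic farthest-point seeding, not the Adaptive K-Means algorithm of the paper, and the 2-D `points` are made-up stand-ins for document term vectors; the function names `kmeans` and `mean_silhouette` are hypothetical.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def mean_silhouette(points, labels):
    """Average silhouette width; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Made-up data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (5, 9), (5, 10), (6, 9), (6, 10)]
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(best_k, scores)
```

On real title text the points would be high-dimensional term-weight vectors and the candidate range for k would be much wider, but the selection rule (keep the k with the largest mean silhouette) stays the same.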

  9. An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

    PubMed

    Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

    2015-08-07

    Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.

  10. Cluster and constraint analysis in tetrahedron packings

    NASA Astrophysics Data System (ADS)

    Jin, Weiwei; Lu, Peng; Liu, Lufeng; Li, Shuixiang

    2015-04-01

    The disordered packings of tetrahedra often show no obvious macroscopic orientational or positional order for a wide range of packing densities, and it has been found that the local order in particle clusters is the main order form of tetrahedron packings. Therefore, a cluster analysis is carried out to investigate the local structures and properties of tetrahedron packings in this work. We obtain a cluster distribution of differently sized clusters, and peaks are observed at two special clusters, i.e., dimer and wagon wheel. We then calculate the amounts of dimers and wagon wheels, which are observed to have linear or approximate linear correlations with packing density. Following our previous work, the amount of particles participating in dimers is used as an order metric to evaluate the order degree of the hierarchical packing structure of tetrahedra, and an order map is consequently depicted. Furthermore, a constraint analysis is performed to determine the isostatic or hyperstatic region in the order map. We employ a Monte Carlo algorithm to test jamming and then suggest a new maximally random jammed packing of hard tetrahedra from the order map with a packing density of 0.6337.

  11. Comparisons of non-Gaussian statistical models in DNA methylation analysis.

    PubMed

    Ma, Zhanyu; Teschendorff, Andrew E; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-06-16

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.

  12. Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis

    PubMed Central

    Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun

    2014-01-01

    As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687

  13. Clustering of health-related behaviors, health outcomes and demographics in Dutch adolescents: a cross-sectional study.

    PubMed

    Busch, Vincent; Van Stel, Henk F; Schrijvers, Augustinus J P; de Leeuw, Johannes R J

    2013-12-04

    Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were deduced via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Four distinct behavioral patterns describe the analyzed individual behaviors: (1) risk-prone behavior, (2) bully behavior, (3) problematic screen time use, and (4) sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning.

  14. Clustering of health-related behaviors, health outcomes and demographics in Dutch adolescents: a cross-sectional study

    PubMed Central

    2013-01-01

    Background. Recent studies show several health-related behaviors to cluster in adolescents. This has important implications for public health. Interrelated behaviors have been shown to be most effectively targeted by multimodal interventions addressing wider-ranging improvements in lifestyle instead of via separate interventions targeting individual behaviors. However, few previous studies have taken into account a broad, multi-disciplinary range of health-related behaviors and connected these behavioral patterns to health-related outcomes. This paper presents an analysis of the clustering of a broad range of health-related behaviors with relevant demographic factors and several health-related outcomes in adolescents. Methods. Self-report questionnaire data were collected from a sample of 2,690 Dutch high school adolescents. Behavioral patterns were deduced via Principal Components Analysis. Subsequently a Two-Step Cluster Analysis was used to identify groups of adolescents with similar behavioral patterns and health-related outcomes. Results. Four distinct behavioral patterns describe the analyzed individual behaviors: (1) risk-prone behavior, (2) bully behavior, (3) problematic screen time use, and (4) sedentary behavior. Subsequent cluster analysis identified four clusters of adolescents. Multi-problem behavior was associated with problematic physical and psychosocial health outcomes, as opposed to those exhibiting relatively few unhealthy behaviors. These associations were relatively independent of demographics such as ethnicity, gender and socio-economic status. Conclusions. The results show that health-related behaviors tend to cluster, indicating that specific behavioral patterns underlie individual health behaviors. In addition, specific patterns of health-related behaviors were associated with specific health outcomes and demographic factors. In general, unhealthy behavior on account of multiple health-related behaviors was associated with both poor psychosocial and physical health. These findings have significant meaning for future public health programs, which should be more tailored with use of such knowledge on behavioral clustering via e.g. Transfer Learning. PMID:24305509

  15. Chemical characteristics for different parts of Panax notoginseng using pressurized liquid extraction and HPLC-ELSD.

    PubMed

    Wan, J B; Yang, F Q; Li, S P; Wang, Y T; Cui, X M

    2006-08-28

    The chemical characteristics for different parts of Panax notoginseng, including root, fibre root, rhizome, stem, leaf, flower and seed, were determined using high performance liquid chromatography-evaporative light scattering detection (HPLC-ELSD) and pressurized liquid extraction (PLE). Eight major saponins, namely notoginsenoside R1 and ginsenosides Rg1, Re, Rb1, Rc, Rb2, Rb3 and Rd, were also quantitatively compared among the different parts of P. notoginseng. The chromatograms showed a significant difference between the underground (root, fibre root, rhizome) and aerial (leaf and flower) parts of P. notoginseng, although the entire chromatographic patterns were highly similar among the tested samples within the underground parts (0.965 ± 0.029, n = 12) and within the aerial parts (0.987 ± 0.014, n = 5). Notably, no saponin was detected in the seed of P. notoginseng. Hierarchical clustering analysis based on the eight investigated saponins, or on the ratios of contents for ginsenoside Rg1/Rb1 and ginsenoside Rb3/Rb1, showed that the samples from different parts of P. notoginseng were divided into three main clusters. One cluster was the underground parts, which were rich in protopanaxatriol and protopanaxadiol type saponins. The leaf and flower were in the same cluster, which contained protopanaxadiol type saponins only. In particular, ginsenosides Rc, Rb2 and Rb3, rare in the underground parts, were abundant in the aerial parts of P. notoginseng. The stem of P. notoginseng formed another cluster. Based on the cluster analysis, the chemical characteristics for different parts of P. notoginseng were revealed. They are a composite cluster (underground parts), a protopanaxadiol cluster (aerial parts) and an interim cluster (stem) lying between the two typical clusters. The result shows that the chemical characteristics of the underground parts and aerial parts of P. notoginseng are obviously different, which is helpful for pharmacological evaluation and quality control of P. notoginseng.

  16. Interactive visual exploration and analysis of origin-destination data

    NASA Astrophysics Data System (ADS)

    Ding, Linfang; Meng, Liqiu; Yang, Jian; Krisp, Jukka M.

    2018-05-01

    In this paper, we propose a visual analytics approach for the exploration of spatiotemporal interaction patterns of massive origin-destination data. Firstly, we visually query the movement database for data at certain time windows. Secondly, we conduct interactive clustering to allow the users to select input variables/features (e.g., origins, destinations, distance, and duration) and to adjust clustering parameters (e.g. distance threshold). The agglomerative hierarchical clustering method is applied for the multivariate clustering of the origin-destination data. Thirdly, we design a parallel coordinates plot for visualizing the precomputed clusters and for further exploration of interesting clusters. Finally, we propose a gradient line rendering technique to show the spatial and directional distribution of origin-destination clusters on a map view. We implement the visual analytics approach in a web-based interactive environment and apply it to real-world floating car data from Shanghai. The experiment results show the origin/destination hotspots and their spatial interaction patterns. They also demonstrate the effectiveness of our proposed approach.

  17. The ergot alkaloid gene cluster in Claviceps purpurea: extension of the cluster sequence and intra species evolution.

    PubMed

    Haarmann, Thomas; Machado, Caroline; Lübbe, Yvonne; Correia, Telmo; Schardl, Christopher L; Panaccione, Daniel G; Tudzynski, Paul

    2005-06-01

    The genomic region of Claviceps purpurea strain P1 containing the ergot alkaloid gene cluster [Tudzynski, P., Hölter, K., Correia, T., Arntz, C., Grammel, N., Keller, U., 1999. Evidence for an ergot alkaloid gene cluster in Claviceps purpurea. Mol. Gen. Genet. 261, 133-141] was explored by chromosome walking, and additional genes probably involved in the ergot alkaloid biosynthesis have been identified. The putative cluster sequence (extending over 68.5kb) contains 4 different nonribosomal peptide synthetase (NRPS) genes and several putative oxidases. Northern analysis showed that most of the genes were co-regulated (repressed by high phosphate), and identified probable flanking genes by lack of co-regulation. Comparison of the cluster sequences of strain P1, an ergotamine producer, with that of strain ECC93, an ergocristine producer, showed high conservation of most of the cluster genes, but significant variation in the NRPS modules, strongly suggesting that evolution of these chemical races of C. purpurea is determined by evolution of NRPS module specificity.

  18. MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

    PubMed Central

    Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

    2014-01-01

    Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS), and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; the clinical and validity scales of the MMPI-2 were used as predictors in a hierarchical cluster analysis. Results. The analysis identified three clusters of personality profile: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499

  19. Clustering of Multivariate Geostatistical Data

    NASA Astrophysics Data System (ADS)

    Fouedjio, Francky

    2017-04-01

    Multivariate data indexed by geographical coordinates have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations belonging to the same cluster have a certain degree of homogeneity while data locations in different clusters are as different as possible. However, groups of data locations created through classical clustering techniques turn out to show poor spatial contiguity, a feature obviously inconvenient for many geoscience applications. In this work, we develop a clustering method that overcomes this problem by accounting for the spatial dependence structure of the data, thus reinforcing the spatial contiguity of the resulting clusters. The capability of the proposed clustering method to provide spatially contiguous and meaningful clusters of data locations is assessed using both synthetic and real datasets. Keywords: clustering, geostatistics, spatial contiguity, spatial dependence.

  20. Functional grouping of similar genes using eigenanalysis on minimum spanning tree based neighborhood graph.

    PubMed

    Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita

    2016-04-01

    Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms. Copyright © 2016 Elsevier Ltd. All rights reserved.
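
    A neighborhood graph built from several rounds of MST can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it assumes Euclidean distances, and it assumes each round computes an MST on the edges not used by earlier rounds and that the graph is the union of those trees. The function names are hypothetical, and the eigenanalysis step is not shown.

```python
import math


def mst_edges(points, banned):
    """Prim's algorithm on the complete Euclidean graph, skipping banned edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest allowed edge weight linking each vertex to the tree
    link = [-1] * n         # tree vertex at the other end of that edge
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        if best[u] == math.inf:
            break           # remaining vertices cannot be reached without banned edges
        in_tree[u] = True
        if link[u] >= 0:
            edges.append((min(u, link[u]), max(u, link[u])))
        for v in range(n):
            if not in_tree[v] and (min(u, v), max(u, v)) not in banned:
                w = math.dist(points[u], points[v])
                if w < best[v]:
                    best[v], link[v] = w, u
    return edges


def k_round_mst_graph(points, rounds):
    """Union of the edges of `rounds` successive, edge-disjoint spanning trees."""
    used = set()
    for _ in range(rounds):
        used.update(mst_edges(points, used))
    return used


# Hypothetical expression profiles reduced to 2-D points for illustration.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
print(sorted(k_round_mst_graph(points, rounds=2)))
```

    With more rounds the graph becomes denser, so each point is linked to more of its neighborhood before any spectral analysis is applied.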

  1. Analyzing Patients' Values by Applying Cluster Analysis and LRFM Model in a Pediatric Dental Clinic in Taiwan

    PubMed Central

    Lin, Shih-Yen; Liu, Chih-Wei

    2014-01-01

    This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs. PMID:25045741
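
    The loyal/lost rule described in the abstract can be written as a short sketch. The cluster names and numbers below are hypothetical, and the label "other" for mixed clusters is an assumption; only the comparison against the average L, R, and F values comes from the abstract.

```python
def classify_clusters(cluster_means, overall):
    """Label clusters by comparing their mean L, R, F with the overall averages.

    "loyal": L, R and F are all above the respective averages.
    "lost":  none of L, R and F is above its average.
    "other": anything in between (label chosen here for illustration).
    """
    labels = {}
    for name, means in cluster_means.items():
        above = [means[key] > overall[key] for key in ("L", "R", "F")]
        if all(above):
            labels[name] = "loyal"
        elif not any(above):
            labels[name] = "lost"
        else:
            labels[name] = "other"
    return labels


# Hypothetical averages and cluster means, not values from the study.
overall = {"L": 10.0, "R": 5.0, "F": 4.0}
cluster_means = {
    "c1": {"L": 12.0, "R": 6.0, "F": 5.0},
    "c2": {"L": 8.0, "R": 4.0, "F": 3.0},
    "c3": {"L": 12.0, "R": 4.0, "F": 5.0},
}
print(classify_clusters(cluster_means, overall))
```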

  2. Variable selection based on clustering analysis for improvement of polyphenols prediction in green tea using synchronous fluorescence spectra

    NASA Astrophysics Data System (ADS)

    Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi

    2018-04-01

    Synchronous fluorescence spectra, combined with multivariate analysis, were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral intervals selection method called clustering based partial least square (CL-PLS), which selected informative wavelengths by combining clustering concept and partial least square (PLS) methods to improve models’ performance by synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were carried out to cluster full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. Correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least square (VIP-PLS), selectivity ratio partial least square (SR-PLS), interval partial least square (iPLS) models and full spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.

  3. Variable selection based on clustering analysis for improvement of polyphenols prediction in green tea using synchronous fluorescence spectra.

    PubMed

    Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi

    2018-03-13

    Synchronous fluorescence spectra, combined with multivariate analysis, were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral intervals selection method called clustering based partial least square (CL-PLS), which selected informative wavelengths by combining clustering concept and partial least square (PLS) methods to improve models' performance by synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were carried out to cluster full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. Correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least square (VIP-PLS), selectivity ratio partial least square (SR-PLS), interval partial least square (iPLS) models and full spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.

  4. Analyzing patients' values by applying cluster analysis and LRFM model in a pediatric dental clinic in Taiwan.

    PubMed

    Wu, Hsin-Hung; Lin, Shih-Yen; Liu, Chih-Wei

    2014-01-01

    This study combines cluster analysis and LRFM (length, recency, frequency, and monetary) model in a pediatric dental clinic in Taiwan to analyze patients' values. A two-stage approach by self-organizing maps and K-means method is applied to segment 1,462 patients into twelve clusters. The average values of L, R, and F excluding monetary covered by national health insurance program are computed for each cluster. In addition, customer value matrix is used to analyze customer values of twelve clusters in terms of frequency and monetary. Customer relationship matrix considering length and recency is also applied to classify different types of customers from these twelve clusters. The results show that three clusters can be classified into loyal patients with L, R, and F values greater than the respective average L, R, and F values, while three clusters can be viewed as lost patients without any variable above the average values of L, R, and F. When different types of patients are identified, marketing strategies can be designed to meet different patients' needs.

  5. Behavioral Health Risk Profiles of Undergraduate University Students in England, Wales, and Northern Ireland: A Cluster Analysis.

    PubMed

    El Ansari, Walid; Ssewanyana, Derrick; Stock, Christiane

    2018-01-01

    Limited research has explored clustering of lifestyle behavioral risk factors (BRFs) among university students. This study aimed to explore clustering of BRFs, composition of clusters, and the association of the clusters with self-rated health and perceived academic performance. We assessed BRFs, namely tobacco smoking, physical inactivity, alcohol consumption, illicit drug use, unhealthy nutrition, and inadequate sleep, using a self-administered general Student Health Survey among 3,706 undergraduates at seven UK universities. A two-step cluster analysis generated: Cluster 1 (the high physically active and health conscious) with very high health awareness/consciousness, good nutrition, and physical activity (PA), and relatively low alcohol, tobacco, and other drug (ATOD) use. Cluster 2 (the abstinent) had very low ATOD use, high health awareness, good nutrition, and medium high PA. Cluster 3 (the moderately health conscious) included the highest regard for healthy eating, second highest fruit/vegetable consumption, and moderately high ATOD use. Cluster 4 (the risk taking) showed the highest ATOD use, were the least health conscious, least fruit consuming, and attached the least importance to eating healthily. Compared to the healthy cluster (Cluster 1), students in other clusters had lower self-rated health, and particularly, students in the risk taking cluster (Cluster 4) reported lower academic performance. These associations were stronger for men than for women. Of the four clusters, Cluster 4 had the youngest students. Our results suggested that prevention among university students should address multiple BRFs simultaneously, with particular focus on the younger students.

  6. Gene duplications in prokaryotes can be associated with environmental adaptation

    PubMed Central

    2010-01-01

    Background: Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Results: Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with, for example, energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Conclusions: Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment. PMID:20961426

  7. Gene duplications in prokaryotes can be associated with environmental adaptation.

    PubMed

    Bratlie, Marit S; Johansen, Jostein; Sherman, Brad T; Huang, Da Wei; Lempicki, Richard A; Drabløs, Finn

    2010-10-20

    Gene duplication is a normal evolutionary process. If there is no selective advantage in keeping the duplicated gene, it is usually reduced to a pseudogene and disappears from the genome. However, some paralogs are retained. These gene products are likely to be beneficial to the organism, e.g. in adaptation to new environmental conditions. The aim of our analysis is to investigate the properties of paralog-forming genes in prokaryotes, and to analyse the role of these retained paralogs by relating gene properties to life style of the corresponding prokaryotes. Paralogs were identified in a number of prokaryotes, and these paralogs were compared to singletons of persistent orthologs based on functional classification. This showed that the paralogs were associated with, for example, energy production, cell motility, ion transport, and defence mechanisms. A statistical overrepresentation analysis of gene and protein annotations was based on paralogs of the 200 prokaryotes with the highest fraction of paralog-forming genes. Biclustering of overrepresented gene ontology terms versus species was used to identify clusters of properties associated with clusters of species. The clusters were classified using similarity scores on properties and species to identify interesting clusters, and a subset of clusters were analysed by comparison to literature data. This analysis showed that paralogs often are associated with properties that are important for survival and proliferation of the specific organisms. This includes processes like ion transport, locomotion, chemotaxis and photosynthesis. However, the analysis also showed that the gene ontology terms sometimes were too general, imprecise or even misleading for automatic analysis. Properties described by gene ontology terms identified in the overrepresentation analysis are often consistent with individual prokaryote lifestyles and are likely to give a competitive advantage to the organism. Paralogs and singletons dominate different categories of functional classification, where paralogs in particular seem to be associated with processes involving interaction with the environment.

  8. Analysis of genetic diversity in banana cultivars (Musa cvs.) from the South of Oman using AFLP markers and classification by phylogenetic, hierarchical clustering and principal component analyses*

    PubMed Central

    Opara, Umezuruike Linus; Jacobson, Dan; Al-Saady, Nadiya Abubakar

    2010-01-01

    Banana is an important crop grown in Oman and there is a dearth of information on its genetic diversity to assist in crop breeding and improvement programs. This study employed amplified fragment length polymorphism (AFLP) to investigate the genetic variation in local banana cultivars from the southern region of Oman. Using 12 primer combinations, a total of 1094 bands were scored, of which 1012 were polymorphic. Eighty-two unique markers were identified, which revealed the distinct separation of the seven cultivars. The results obtained show that AFLP can be used to differentiate the banana cultivars. Further classification by phylogenetic, hierarchical clustering and principal component analyses showed significant differences between the clusters found with molecular markers and those clusters created by previous studies using morphological analysis. Based on the analytical results, a consensus dendrogram of the banana cultivars is presented. PMID:20443211

  9. Atlas-guided cluster analysis of large tractography datasets.

    PubMed

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.

  10. A stellar census in globular clusters with MUSE: The contribution of rotation to cluster dynamics studied with 200 000 stars

    NASA Astrophysics Data System (ADS)

    Kamann, S.; Husser, T.-O.; Dreizler, S.; Emsellem, E.; Weilbacher, P. M.; Martens, S.; Bacon, R.; den Brok, M.; Giesers, B.; Krajnović, D.; Roth, M. M.; Wendt, M.; Wisotzki, L.

    2018-02-01

    This is the first of a series of papers presenting the results from our survey of 25 Galactic globular clusters with the MUSE integral-field spectrograph. In combination with our dedicated algorithm for source deblending, MUSE provides unique multiplex capabilities in crowded stellar fields and allows us to acquire samples of up to 20 000 stars within the half-light radius of each cluster. The present paper focuses on the analysis of the internal dynamics of 22 out of the 25 clusters, using about 500 000 spectra of 200 000 individual stars. Thanks to the large stellar samples per cluster, we are able to perform a detailed analysis of the central rotation and dispersion fields using both radial profiles and two-dimensional maps. The velocity dispersion profiles we derive show a good general agreement with existing radial velocity studies but typically reach closer to the cluster centres. By comparison with proper motion data, we derive or update the dynamical distance estimates to 14 clusters. Compared to previous dynamical distance estimates for 47 Tuc, our value is in much better agreement with other methods. We further find significant (>3σ) rotation in the majority (13/22) of our clusters. Our analysis seems to confirm earlier findings of a link between rotation and the ellipticities of globular clusters. In addition, we find a correlation between the strengths of internal rotation and the relaxation times of the clusters, suggesting that the central rotation fields are relics of the cluster formation that are gradually dissipated via two-body relaxation.

  11. Groundwater source contamination mechanisms: Physicochemical profile clustering, risk factor analysis and multivariate modelling

    NASA Astrophysics Data System (ADS)

    Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.

    2014-04-01

    An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.

  12. Atomistic cluster alignment method for local order mining in liquids and glasses

    NASA Astrophysics Data System (ADS)

    Fang, X. W.; Wang, C. Z.; Yao, Y. X.; Ding, Z. J.; Ho, K. M.

    2010-11-01

    An atomistic cluster alignment method is developed to identify and characterize the local atomic structural order in liquids and glasses. With the “order mining” idea for structurally disordered systems, the method can detect the presence of any type of local order in the system and can quantify the structural similarity between a given set of templates and the aligned clusters in a systematic and unbiased manner. Moreover, population analysis can also be carried out for various types of clusters in the system. The advantages of the method in comparison with other previously developed analysis methods are illustrated by performing the structural analysis for four prototype systems (i.e., pure Al, pure Zr, Zr35Cu65, and Zr36Ni64). The results show that the cluster alignment method can identify various types of short-range orders (SROs) in these systems correctly while some of these SROs are difficult to capture by most of the currently available analysis methods (e.g., Voronoi tessellation method). Such a full three-dimensional atomistic analysis method is generic and can be applied to describe the magnitude and nature of noncrystalline ordering in many disordered systems.

  13. Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.

    PubMed

    Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio

    Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.

  14. Candidatus Frankia Datiscae Dg1, the Actinobacterial Microsymbiont of Datisca glomerata, Expresses the Canonical nod Genes nodABC in Symbiosis with Its Host Plant

    PubMed Central

    Persson, Tomas; Battenberg, Kai; Demina, Irina V.; Vigil-Stenman, Theoden; Vanden Heuvel, Brian; Pujic, Petar; Facciotti, Marc T.; Wilbanks, Elizabeth G.; O'Brien, Anna; Fournier, Pascale; Cruz Hernandez, Maria Antonia; Mendoza Herrera, Alberto; Médigue, Claudine; Normand, Philippe; Pawlowski, Katharina; Berry, Alison M.

    2015-01-01

    Frankia strains are nitrogen-fixing soil actinobacteria that can form root symbioses with actinorhizal plants. Phylogenetically, symbiotic frankiae can be divided into three clusters, and this division also corresponds to host specificity groups. The strains of cluster II, which form symbioses with actinorhizal Rosales and Cucurbitales, thus displaying a broad host range, show surprisingly low genetic diversity and to date cannot be cultured. The genome of the first representative of this cluster, Candidatus Frankia datiscae Dg1 (Dg1), a microsymbiont of Datisca glomerata, was recently sequenced. A phylogenetic analysis of 50 different housekeeping genes of Dg1 and three published Frankia genomes showed that cluster II is basal among the symbiotic Frankia clusters. Detailed analysis showed that nodules of D. glomerata, independent of the origin of the inoculum, contain several closely related cluster II Frankia operational taxonomic units. Actinorhizal plants and legumes both belong to the nitrogen-fixing plant clade, and bacterial signaling in both groups involves the common symbiotic pathway also used by arbuscular mycorrhizal fungi. However, so far, no molecules resembling rhizobial Nod factors could be isolated from Frankia cultures. Alone among Frankia genomes available to date, the genome of Dg1 contains the canonical nod genes nodA, nodB and nodC known from rhizobia, and these genes are arranged in two operons which are expressed in D. glomerata nodules. Furthermore, Frankia Dg1 nodC was able to partially complement a Rhizobium leguminosarum A34 nodC::Tn5 mutant. Phylogenetic analysis showed that Dg1 Nod proteins are positioned at the root of both α- and β-rhizobial NodABC proteins. NodA-like acyl transferases were found across the phylum Actinobacteria, but among Proteobacteria only in nodulators. Taken together, our evidence indicates an Actinobacterial origin of rhizobial Nod factors. PMID:26020781

  15. Application of cluster analysis to geochemical compositional data for identifying ore-related geochemical anomalies

    NASA Astrophysics Data System (ADS)

    Zhou, Shuguang; Zhou, Kefa; Wang, Jinlin; Yang, Genfang; Wang, Shanshan

    2017-12-01

    Cluster analysis is a well-known technique that is used to analyze various types of data. In this study, cluster analysis is applied to geochemical data that describe 1444 stream sediment samples collected in northwestern Xinjiang with a sample spacing of approximately 2 km. Three algorithms (the hierarchical, k-means, and fuzzy c-means algorithms) and six data transformation methods (the z-score standardization, ZST; the logarithmic transformation, LT; the additive log-ratio transformation, ALT; the centered log-ratio transformation, CLT; the isometric log-ratio transformation, ILT; and no transformation, NT) are compared in terms of their effects on the cluster analysis of the geochemical compositional data. The study shows that, on the one hand, the ZST does not affect the results of column- or variable-based (R-type) cluster analysis, whereas the other methods, including the LT, the ALT, and the CLT, have substantial effects on the results. On the other hand, the results of the row- or observation-based (Q-type) cluster analysis obtained from the geochemical data after applying NT and the ZST are relatively poor. However, we derive some improved results from the geochemical data after applying the CLT, the ILT, the LT, and the ALT. Moreover, the k-means and fuzzy c-means clustering algorithms are more reliable than the hierarchical algorithm when they are used to cluster the geochemical data. We apply cluster analysis to the geochemical data to explore for Au deposits within the study area, and we obtain a good correlation between the results retrieved by combining the CLT or the ILT with the k-means or fuzzy c-means algorithms and the potential zones of Au mineralization. Therefore, we suggest that the combination of the CLT or the ILT with the k-means or fuzzy c-means algorithms is an effective tool to identify potential zones of mineralization from geochemical data.
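
    The centered log-ratio transformation (CLT) named above has a standard definition: each part of a composition is replaced by its logarithm minus the mean logarithm of that sample. A minimal sketch follows; the element concentrations are hypothetical, and the clustering step itself (k-means or fuzzy c-means) is not shown.

```python
import math


def clr(composition):
    """Centered log-ratio transform of one sample with strictly positive parts."""
    logs = [math.log(value) for value in composition]
    centre = sum(logs) / len(logs)   # log of the sample's geometric mean
    return [value - centre for value in logs]


# Hypothetical concentrations for one stream sediment sample.
sample = [1.0, 2.0, 7.0]
rescaled = [10.0 * value for value in sample]

print(clr(sample))
print(clr(rescaled))   # same values: the transform ignores the overall scale
```

    Because the transformed values of a sample sum to zero and do not change when the sample is rescaled, distances computed on them reflect ratios between elements rather than absolute totals, which is one plausible reason log-ratio transforms behave differently from no transformation in observation-based clustering.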

  16. Detection of Functional Change Using Cluster Trend Analysis in Glaucoma.

    PubMed

    Gardiner, Stuart K; Mansberger, Steven L; Demirel, Shaban

    2017-05-01

    Global analyses using mean deviation (MD) assess visual field progression, but can miss localized changes. Pointwise analyses are more sensitive to localized progression, but more variable so require confirmation. This study assessed whether cluster trend analysis, averaging information across subsets of locations, could improve progression detection. A total of 133 test-retest eyes were tested 7 to 10 times. Rates of change and P values were calculated for possible re-orderings of these series to generate global analysis ("MD worsening faster than x dB/y with P < y"), pointwise and cluster analyses ("n locations [or clusters] worsening faster than x dB/y with P < y") with specificity exactly 95%. These criteria were applied to 505 eyes tested over a mean of 10.5 years, to find how soon each detected "deterioration," and compared using survival models. This was repeated including two subsequent visual fields to determine whether "deterioration" was confirmed. The best global criterion detected deterioration in 25% of eyes in 5.0 years (95% confidence interval [CI], 4.7-5.3 years), compared with 4.8 years (95% CI, 4.2-5.1) for the best cluster analysis criterion, and 4.1 years (95% CI, 4.0-4.5) for the best pointwise criterion. However, for pointwise analysis, only 38% of these changes were confirmed, compared with 61% for clusters and 76% for MD. The time until 25% of eyes showed subsequently confirmed deterioration was 6.3 years (95% CI, 6.0-7.2) for global, 6.3 years (95% CI, 6.0-7.0) for pointwise, and 6.0 years (95% CI, 5.3-6.6) for cluster analyses. Although the specificity is still suboptimal, cluster trend analysis detects subsequently confirmed deterioration sooner than either global or pointwise analyses.
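
    The idea of a cluster trend, averaging locations within a subset and then estimating a rate of change, can be sketched as below. The abstract does not state the regression method, so ordinary least squares is an assumption here, and the location names, visit times and sensitivities are hypothetical.

```python
def slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx

def cluster_trend(times, series_by_location, cluster):
    """Average the locations in a cluster at each visit, then fit a slope."""
    means = [
        sum(series_by_location[loc][i] for loc in cluster) / len(cluster)
        for i in range(len(times))
    ]
    return slope(times, means)

# Hypothetical data: visit times in years, sensitivities in dB.
times = [0.0, 1.0, 2.0, 3.0]
series = {"a": [30.0, 29.0, 28.0, 27.0], "b": [28.0, 28.0, 28.0, 28.0]}
rate = cluster_trend(times, series, ["a", "b"])  # dB per year
```

    A criterion of the form quoted in the abstract would then compare such rates (together with their P values, which this sketch does not compute) against chosen thresholds.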

  17. Penicillin production in industrial strain Penicillium chrysogenum P2niaD18 is not dependent on the copy number of biosynthesis genes.

    PubMed

    Ziemons, Sandra; Koutsantas, Katerina; Becker, Kordula; Dahlmann, Tim; Kück, Ulrich

    2017-02-16

    Multi-copy gene integration into microbial genomes is a conventional tool for obtaining improved gene expression. For Penicillium chrysogenum, the fungal producer of the beta-lactam antibiotic penicillin, many production strains carry multiple copies of the penicillin biosynthesis gene cluster. This discovery led to the generally accepted view that high penicillin titers are the result of multiple copies of penicillin genes. Here we investigated strain P2niaD18, a production line that carries only two copies of the penicillin gene cluster. We performed pulsed-field gel electrophoresis (PFGE), quantitative RT-PCR (qRT-PCR), and penicillin bioassays to investigate production, deletion and overexpression strains generated in the P. chrysogenum P2niaD18 background, in order to determine the copy number of the penicillin biosynthesis gene cluster, the expression of one penicillin biosynthesis gene, and the penicillin titer. Analysis of production and recombinant strains showed that the enhanced penicillin titer did not depend on the copy number of the penicillin gene cluster. Our assumption was strengthened by results with a penicillin null strain lacking pcbC encoding isopenicillin N synthase. Reintroduction of one or two copies of the cluster into the pcbC deletion strain restored high transcriptional expression of the pcbC gene, but recombinant strains showed no significantly different penicillin titer compared to parental strains. Here we present a molecular genetic analysis of production and recombinant strains in the P2niaD18 background carrying different copy numbers of the penicillin biosynthesis gene cluster. Our analysis shows that the enhanced penicillin titer does not strictly depend on the copy number of the cluster. Based on these overall findings, we hypothesize that instead, complex regulatory mechanisms are prominently implicated in increased penicillin biosynthesis in production strains.

  18. Denaturing gradient gel electrophoresis profiles of bacteria from the saliva of twenty four different individuals form clusters that showed no relationship to the yeasts present.

    PubMed

    M Weerasekera, Manjula; H Sissons, Chris; Wong, Lisa; A Anderson, Sally; R Holmes, Ann; D Cannon, Richard

    2017-10-01

    The aim was to investigate the relationship between groups of bacteria identified by cluster analysis of the DGGE fingerprints and the amounts and diversity of yeast present. Bacterial and yeast populations in saliva samples from 24 adults were analysed using denaturing gradient gel electrophoresis (DGGE) of the bacteria present and by yeast culture. Eubacterial DGGE banding patterns showed considerable variation between individuals. Seventy-one different amplicon bands were detected; the band number per saliva sample ranged from 21 to 39 (mean±SD=29.3±4.9). Cluster and principal component analysis of the bacterial DGGE patterns yielded three major clusters containing 20 of the samples. Seventeen of the 24 (71%) saliva samples were yeast positive with concentrations up to 10^3 cfu/mL. Candida albicans was the predominant species in saliva samples although six other yeast species, including Candida dubliniensis, Candida tropicalis, Candida krusei, Candida guilliermondii, Candida rugosa and Saccharomyces cerevisiae, were identified. The presence, concentration, and species of yeast in samples showed no clear relationship to the bacterial clusters. Despite indications of in vitro bacteria-yeast interactions, there was a lack of association between the presence, identity and diversity of yeasts and the bacterial DGGE fingerprint clusters in saliva. This suggests significant ecological individual-specificity of these associations in highly complex in vivo oral biofilm systems under normal oral conditions. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Cluster formation and drag reduction-proposed mechanism of particle recirculation within the partition column of the bottom spray fluid-bed coater.

    PubMed

    Wang, Li Kun; Heng, Paul Wan Sia; Liew, Celine Valeria

    2015-04-01

    Bottom spray fluid-bed coating is a common technique for coating multiparticulates. Under the quality-by-design framework, particle recirculation within the partition column is one of the main variability sources affecting particle coating and coat uniformity. However, the occurrence and mechanism of particle recirculation within the partition column of the coater are not well understood. The purpose of this study was to visualize and define particle recirculation within the partition column. Based on different combinations of partition gap setting, air accelerator insert diameter, and particle size fraction, particle movements within the partition column were captured using a high-speed video camera. The particle recirculation probability and voidage information were mapped using a visiometric process analyzer. High-speed images showed that particles contributing to the recirculation phenomenon were behaving as clustered colonies. Fluid dynamics analysis indicated that particle recirculation within the partition column may be attributed to the combined effect of cluster formation and drag reduction. Both visiometric process analysis and particle coating experiments showed that smaller particles had greater propensity toward cluster formation than larger particles. The influence of cluster formation on coating performance and possible solutions to cluster formation were further discussed. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.

  20. Analysis of cytokine release assay data using machine learning approaches.

    PubMed

    Xiong, Feiyu; Janko, Marco; Walker, Mindi; Makropoulos, Dorie; Weinstock, Daniel; Kam, Moshe; Hrebien, Leonid

    2014-10-01

    The possible onset of Cytokine Release Syndrome (CRS) is an important consideration in the development of monoclonal antibody (mAb) therapeutics. In this study, several machine learning approaches are used to analyze CRS data. The analyzed data come from a human blood in vitro assay which was used to assess the potential of mAb-based therapeutics to produce cytokine release similar to that induced by Anti-CD28 superagonistic (Anti-CD28 SA) mAbs. The data contain 7 mAbs and two negative controls, a total of 423 samples coming from 44 donors. Three (3) machine learning approaches were applied in combination to observations obtained from that assay, namely (i) Hierarchical Cluster Analysis (HCA); (ii) Principal Component Analysis (PCA) followed by K-means clustering; and (iii) Decision Tree Classification (DTC). All three approaches were able to identify the treatment that caused the most severe cytokine response. HCA was able to provide information about the expected number of clusters in the data. PCA coupled with K-means clustering allowed classification of treatments sample by sample, and visualizing clusters of treatments. DTC models showed the relative importance of various cytokines such as IFN-γ, TNF-α and IL-10 to CRS. The use of these approaches in tandem provides better selection of parameters for one method based on outcomes from another, and an overall improved analysis of the data through complementary approaches. Moreover, the DTC analysis showed in addition that IL-17 may be correlated with CRS reactions, although this correlation has not yet been corroborated in the literature. Copyright © 2014 Elsevier B.V. All rights reserved.

  1. Patterns of Childhood Abuse and Neglect in a Representative German Population Sample

    PubMed Central

    Schilling, Christoph; Weidner, Kerstin; Brähler, Elmar; Glaesmer, Heide; Häuser, Winfried; Pöhlmann, Karin

    2016-01-01

    Background Different types of childhood maltreatment, like emotional abuse, emotional neglect, physical abuse, physical neglect and sexual abuse are interrelated because of their co-occurrence. Different patterns of childhood abuse and neglect are associated with the degree of severity of mental disorders in adulthood. The purpose of this study was (a) to identify different patterns of childhood maltreatment in a representative German community sample, (b) to replicate the patterns of childhood neglect and abuse recently found in a clinical German sample, (c) to examine whether participants reporting exposure to specific patterns of child maltreatment would report different levels of psychological distress, and (d) to compare the results of the typological approach and the results of a cumulative risk model based on our data set. Methods In a cross-sectional survey conducted in 2010, a representative random sample of 2504 German participants aged between 14 and 92 years completed the Childhood Trauma Questionnaire (CTQ). General anxiety and depression were assessed by standardized questionnaires (GAD-2, PHQ-2). Cluster analysis was conducted with the CTQ-subscales to identify different patterns of childhood maltreatment. Results Three different patterns of childhood abuse and neglect could be identified by cluster analysis. Cluster one showed low values on all CTQ-scales. Cluster two showed high values in emotional and physical neglect. Only cluster three showed high values in physical and sexual abuse. The three patterns of childhood maltreatment showed different degrees of depression (PHQ-2) and anxiety (GAD-2). Cluster one showed lowest levels of psychological distress, cluster three showed highest levels of mental distress. 
    Conclusion The results show that different types of childhood maltreatment are interrelated and can be grouped into specific patterns of childhood abuse and neglect, which are associated with differing severity of psychological distress in adulthood. The results correspond to those recently found in a German clinical sample and support a typological approach in the research of maltreatment. While cumulative risk models focus on the number of maltreatment types, the typological approach takes the number as well as the severity of the maltreatment types into account. Thus, specific patterns of maltreatment can be examined with regard to specific long-term psychological consequences. PMID:27442446

  2. Improved Ant Colony Clustering Algorithm and Its Performance Study

    PubMed Central

    Gao, Wei

    2016-01-01

    Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533

  3. Development of small scale cluster computer for numerical analysis

    NASA Astrophysics Data System (ADS)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two personal computers were successfully networked together to form a small-scale cluster. Each computer has a multicore processor with four cores, giving the cluster eight processors in total. The cluster runs Ubuntu 14.04 Linux with an MPI implementation (MPICH2). Two main tests were conducted to evaluate the cluster: a communication test and a performance test. The communication test checked that the computers can pass the required information without any problem; it used a simple MPI "Hello" program written in C. In addition, a performance test was carried out to show that the cluster's calculation performance is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that the time required to solve the problem decreases as processors are added: the calculation time is halved when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer from common hardware that offers higher computing power than a single-CPU computer, which can benefit research requiring high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics analysis.
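
    The scaling claim in this abstract (run time halves when the processor count doubles) corresponds to linear speedup. A small sketch of how speedup and parallel efficiency could be computed from measured run times follows; the timings are hypothetical placeholders, not values reported by the authors.

```python
def speedup_and_efficiency(timings):
    """Map processor count -> (speedup, efficiency) relative to the 1-processor run."""
    base = timings[1]
    return {
        procs: (base / seconds, base / (seconds * procs))
        for procs, seconds in sorted(timings.items())
    }

# Hypothetical wall-clock times in seconds for 1, 2, 4 and 8 processors.
timings = {1: 80.0, 2: 40.0, 4: 20.0, 8: 10.0}
results = speedup_and_efficiency(timings)
```

    With these idealized timings every efficiency equals 1.0; real measurements would typically fall below that because of communication overhead between the two machines.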

  4. Epidemiological study of phylogenetic transmission clusters in a local HIV-1 epidemic reveals distinct differences between subtype B and non-B infections.

    PubMed

    Chalmet, Kristen; Staelens, Delfien; Blot, Stijn; Dinakis, Sylvie; Pelgrom, Jolanda; Plum, Jean; Vogelaers, Dirk; Vandekerckhove, Linos; Verhofstede, Chris

    2010-09-07

    The number of HIV-1 infected individuals in the Western world continues to rise. More in-depth understanding of regional HIV-1 epidemics is necessary for the optimal design and adequate use of future prevention strategies. The use of a combination of phylogenetic analysis of HIV sequences, with data on patients' demographics, infection route, clinical information and laboratory results, will allow a better characterization of individuals responsible for local transmission. Baseline HIV-1 pol sequences, obtained through routine drug-resistance testing, from 506 patients, newly diagnosed between 2001 and 2009, were used to construct phylogenetic trees and identify transmission-clusters. Patients' demographics, laboratory and clinical data, were retrieved anonymously. Statistical analysis was performed to identify subtype-specific and transmission-cluster-specific characteristics. Multivariate analysis showed significant differences between the 59.7% of individuals with subtype B infection and the 40.3% non-B infected individuals, with regard to route of transmission, origin, infection with Chlamydia (p = 0.01) and infection with Hepatitis C virus (p = 0.017). More and larger transmission-clusters were identified among the subtype B infections (p < 0.001). Overall, in multivariate analysis, clustering was significantly associated with Caucasian origin, infection through homosexual contact and younger age (all p < 0.001). Bivariate analysis additionally showed a correlation between clustering and syphilis (p < 0.001), higher CD4 counts (p = 0.002), Chlamydia infection (p = 0.013) and primary HIV (p = 0.017). Combination of phylogenetics with demographic information, laboratory and clinical data, revealed that HIV-1 subtype B infected Caucasian men-who-have-sex-with-men with high prevalence of sexually transmitted diseases, account for the majority of local HIV-transmissions. 
    This finding elucidates observed epidemiological trends through molecular analysis, and justifies sustained focus in prevention on this high risk group.

  5. Identification of clusters of individuals relevant to temporomandibular disorders and other chronic pain conditions: the OPPERA study

    PubMed Central

    Bair, Eric; Gaynor, Sheila; Slade, Gary D.; Ohrbach, Richard; Fillingim, Roger B.; Greenspan, Joel D.; Dubner, Ronald; Smith, Shad B.; Diatchenko, Luda; Maixner, William

    2016-01-01

    The classification of most chronic pain disorders gives emphasis to anatomical location of the pain to distinguish one disorder from the other (eg, back pain vs temporomandibular disorder [TMD]) or to define subtypes (eg, TMD myalgia vs arthralgia). However, anatomical criteria overlook etiology, potentially hampering treatment decisions. This study identified clusters of individuals using a comprehensive array of biopsychosocial measures. Data were collected from a case–control study of 1031 chronic TMD cases and 3247 TMD-free controls. Three subgroups were identified using supervised cluster analysis (referred to as the adaptive, pain-sensitive, and global symptoms clusters). Compared with the adaptive cluster, participants in the pain-sensitive cluster showed heightened sensitivity to experimental pain, and participants in the global symptoms cluster showed both greater pain sensitivity and greater psychological distress. Cluster membership was strongly associated with chronic TMD: 91.5% of TMD cases belonged to the pain-sensitive and global symptoms clusters, whereas 41.2% of controls belonged to the adaptive cluster. Temporomandibular disorder cases in the pain-sensitive and global symptoms clusters also showed greater pain intensity, jaw functional limitation, and more comorbid pain conditions. Similar results were obtained when the same methodology was applied to a smaller case–control study consisting of 199 chronic TMD cases and 201 TMD-free controls. During a median 3-year follow-up period of TMD-free individuals, participants in the global symptoms cluster had greater risk of developing first-onset TMD (hazard ratio = 2.8) compared with participants in the other 2 clusters. Cross-cohort predictive modeling was used to demonstrate the reliability of the clusters. PMID:26928952

  6. Geographic atrophy phenotype identification by cluster analysis.

    PubMed

    Monés, Jordi; Biarnés, Marc

    2018-03-01

    To identify ocular phenotypes in patients with geographic atrophy secondary to age-related macular degeneration (GA) using a data-driven cluster analysis. This was a retrospective analysis of data from a prospective, natural history study of patients with GA who were followed for ≥6 months. Cluster analysis was used to identify subgroups within the population based on the presence of several phenotypic features: soft drusen, reticular pseudodrusen (RPD), primary foveal atrophy, increased fundus autofluorescence (FAF), greyish FAF appearance and subfoveal choroidal thickness (SFCT). A comparison of features between the subgroups was conducted, and a qualitative description of the new phenotypes was proposed. The atrophy growth rate between phenotypes was then compared. Data were analysed from 77 eyes of 77 patients with GA. Cluster analysis identified three groups: phenotype 1 was characterised by high soft drusen load, foveal atrophy and slow growth; phenotype 3 showed high RPD load, extrafoveal and greyish FAF appearance and thin SFCT; the characteristics of phenotype 2 were midway between phenotypes 1 and 3. Phenotypes differed in all measured features (p≤0.013), with decreases in the presence of soft drusen, foveal atrophy and SFCT seen from phenotypes 1 to 3 and corresponding increases in high RPD load, high FAF and greyish FAF appearance. Atrophy growth rate differed between phenotypes 1, 2 and 3 (0.63, 1.91 and 1.73 mm²/year, respectively, p=0.0005). Cluster analysis identified three distinct phenotypes in GA. One of them showed a particularly slow growth pattern. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  7. Radiomics of CT Features May Be Nonreproducible and Redundant: Influence of CT Acquisition Parameters.

    PubMed

    Berenguer, Roberto; Pastor-Juan, María Del Rosario; Canales-Vázquez, Jesús; Castro-García, Miguel; Villas, María Victoria; Legorburo, Francisco Mansilla; Sabater, Sebastià

    2018-04-24

    Purpose To identify the reproducible and nonredundant radiomics features (RFs) for computed tomography (CT). Materials and Methods Two phantoms were used to test RF reproducibility by using test-retest analysis, by changing the CT acquisition parameters (hereafter, intra-CT analysis), and by comparing five different scanners with the same CT parameters (hereafter, inter-CT analysis). Reproducible RFs were selected by using the concordance correlation coefficient (as a measure of the agreement between variables) and the coefficient of variation (defined as the ratio of the standard deviation to the mean). Redundant features were grouped by using hierarchical cluster analysis. Results A total of 177 RFs including intensity, shape, and texture features were evaluated. The test-retest analysis showed that 91% (161 of 177) of the RFs were reproducible according to concordance correlation coefficient. Reproducibility of intra-CT RFs, based on coefficient of variation, ranged from 89.3% (151 of 177) to 43.1% (76 of 177) where the pitch factor and the reconstruction kernel were modified, respectively. Reproducibility of inter-CT RFs, based on coefficient of variation, also showed large material differences, from 85.3% (151 of 177; wood) to only 15.8% (28 of 177; polyurethane). Ten clusters were identified after the hierarchical cluster analysis and one RF per cluster was chosen as representative. Conclusion Many RFs were redundant and nonreproducible. If all the CT parameters are fixed except field of view, tube voltage, and milliamperage, then the information provided by the analyzed RFs can be summarized in only 10 RFs (each representing a cluster) because of redundancy. © RSNA, 2018 Online supplemental material is available for this article.
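
    The two reproducibility measures named in this abstract can be sketched as follows. The coefficient of variation follows the definition given in the text (standard deviation divided by mean); the concordance correlation coefficient uses Lin's standard formula. Whether the authors used sample or population variances is not stated, so those choices, the function names and the example values are assumptions.

```python
import math

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return math.sqrt(variance) / mean

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient (population moments)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)

# Hypothetical test and retest values of one feature (illustrative only).
test_scan = [1.0, 2.0, 3.0, 4.0]
retest_scan = [2.0, 3.0, 4.0, 5.0]
ccc = concordance_correlation(test_scan, retest_scan)
cv = coefficient_of_variation([2.0, 4.0, 4.0, 6.0])
```

    In the example the retest values are perfectly correlated with the test values but shifted by a constant, so the concordance coefficient falls below 1; this sensitivity to systematic offsets is what distinguishes it from an ordinary correlation coefficient.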

  8. InCHlib - interactive cluster heatmap for web applications.

    PubMed

    Skuta, Ctibor; Bartůněk, Petr; Svozil, Daniel

    2014-12-01

    Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of the hierarchical clustering is a tree structure called dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram which is displayed on the side of the heatmap. We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib is facilitated by the Python utility script inchlib_clust. The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources.
    Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.

  9. Phylogeny of kemenyan (Styrax sp.) from North Sumatra based on morphological characters

    NASA Astrophysics Data System (ADS)

    Susilowati, A.; Kholibrina, C. R.; Rachmat, H. H.; Munthe, M. A.

    2018-02-01

    Kemenyan is the most famous local tree species from North Sumatra. Kemenyan is known as a rosin producer that is very valuable for pharmaceuticals, cosmetics, food preservatives and varnish. Historically, only two species of kemenyan were recognized, kemenyan durame and kemenyan toba, but within its natural distribution we also found other species showing characteristics different from the previously known ones. The objectives of this research were (1) to determine the morphological diversity of kemenyan in North Sumatra and (2) to determine phylogenetic clustering based on the morphological characters. Data were collected by direct observation and morphological characterization of sample trees, selected by purposive sampling, at Pakpak Bharat, North Sumatra. Morphological characters were examined using descriptive analysis, phenotypic variability using standard deviation, and cluster analysis. The results showed differences among 4 species of kemenyan (batak, minyak, durame and toba) across 75 observed characters, including flower, fruit, leaf, stem, bark, crown type, wood and resin. Analysis of both quantitative and qualitative characters clustered kemenyan into two groups, in which kemenyan toba was separated from the others.

  10. Nutritional information and health warnings on wine labels: Exploring consumer interest and preferences.

    PubMed

    Annunziata, A; Pomarici, E; Vecchio, R; Mariani, A

    2016-11-01

    This paper aims to contribute to the current debate on the inclusion of nutritional information and health warnings on wine labels, exploring consumers' interest and preferences. The results of a survey conducted on a sample of Italian wine consumers (N = 300) show the strong interest of respondents in the inclusion of such information on the label. Conjoint analysis reveals that consumers assign greater utility to health warnings, followed by nutritional information. Cluster analysis shows the existence of three different consumer segments. The first cluster, which included mainly female consumers (over 55) and those with high wine involvement, revealed greater awareness of the links between wine and health and better knowledge of wine nutritional properties, preferring a more detailed nutritional label, such as a panel with GDA%. By contrast, the other two clusters, consisting of individuals who generally find it more difficult to understand nutritional labels, preferred the less detailed label of a glass showing calories. The second and largest cluster comprising mainly younger men (under 44), showed the highest interest in health warnings while the third cluster - with a relatively low level of education - preferred the specification of the number of glasses not to exceed. Our results support the idea that the policy maker should consider introducing a mandatory nutritional label in the easier-to-implement and not-too-costly form of a glass with calories, rotating health warnings and the maximum number of glasses not to exceed. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Use of multivariate statistics to identify unreliable data obtained using CASA.

    PubMed

    Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón

    2013-06-01

    In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were pictured for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatment or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one of a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.

  12. UV TO FAR-IR CATALOG OF A GALAXY SAMPLE IN NEARBY CLUSTERS: SPECTRAL ENERGY DISTRIBUTIONS AND ENVIRONMENTAL TRENDS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hernandez-Fernandez, Jonathan D.; Iglesias-Paramo, J.; Vilchez, J. M.

    2012-03-01

    In this paper, we present a sample of cluster galaxies devoted to studying the environmental influence on star formation activity. The galaxies in this sample inhabit clusters showing a rich variety in their characteristics and have been observed by the SDSS-DR6 down to $M_B \approx -18$, and by the Galaxy Evolution Explorer AIS throughout sky regions corresponding to several megaparsecs. We assign the broadband and emission-line fluxes from ultraviolet to far-infrared to each galaxy, performing an accurate spectral energy distribution for spectral fitting analysis. The clusters follow the general X-ray luminosity versus velocity dispersion trend of $L_X \propto \sigma_c^{4.4}$. The analysis of the distributions of galaxy density counting up to the 5th nearest neighbor, $\Sigma_5$, shows: (1) the virial regions and the cluster outskirts share a common range in the high density part of the distribution. This can be attributed to the presence of massive galaxy structures in the surroundings of virial regions. (2) The virial regions of massive clusters ($\sigma_c > 550$ km s$^{-1}$) present a $\Sigma_5$ distribution statistically distinguishable ($\approx 96\%$) from the corresponding distribution of low-mass clusters ($\sigma_c < 550$ km s$^{-1}$). Both massive and low-mass clusters follow a similar density-radius trend, but the low-mass clusters avoid the high density extreme. We illustrate, with ABELL 1185, the environmental trends of galaxy populations. Maps of sky projected galaxy density show how low-luminosity star-forming galaxies appear distributed along more spread structures than their giant counterparts, whereas low-luminosity passive galaxies avoid the low-density environment. Giant passive and star-forming galaxies share rather similar sky regions with passive galaxies exhibiting more concentrated distributions.

  13. Analysis of indoor air pollutants checklist using environmetric technique for health risk assessment of sick building complaint in nonindustrial workplace

    PubMed Central

    Syazwan, AI; Rafee, B Mohd; Juahir, Hafizan; Azman, AZF; Nizar, AM; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, AA; Yunos, MA Syafiq; Anita, AR; Hanafiah, J Muhamad; Shaharuddin, MS; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, MN Mohamad; Azizan, HS; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, FT

    2012-01-01

    Purpose: To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. Design: A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. Method: A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Result: Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significantly high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. Conclusion: This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures. PMID:23055779

  14. Analysis of indoor air pollutants checklist using environmetric technique for health risk assessment of sick building complaint in nonindustrial workplace.

    PubMed

    Syazwan, Ai; Rafee, B Mohd; Juahir, Hafizan; Azman, Azf; Nizar, Am; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, Aa; Yunos, Ma Syafiq; Anita, Ar; Hanafiah, J Muhamad; Shaharuddin, Ms; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, Mn Mohamad; Azizan, Hs; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, Ft

    2012-01-01

    To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significantly high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures.

  15. Pattern of clustering of menopausal problems: A study with a Bengali Hindu ethnic group.

    PubMed

    Dasgupta, Doyel; Pal, Baidyanath; Ray, Subha

    2016-01-01

    We attempted to find out how menopausal problems cluster with each other. The study was conducted among a group of women belonging to a Bengali-speaking Hindu ethnic group of West Bengal, a state located in Eastern India. We recruited 1,400 participants for the study. Information on sociodemographic aspects and menopausal problems was collected from these participants with the help of a pretested questionnaire. Results of cluster analysis showed that vasomotor, vaginal, and urinary problems cluster together, separately from physical and psychosomatic problems.

  16. Clustering Analysis of Antibiograms and Antibiogram Types of Streptococcus agalactiae Strains from Tilapia in China.

    PubMed

    Liu, Chan; Feng, Juan; Zhang, Defeng; Xie, Yundan; Li, Anxing; Wang, Jiangyong; Su, Youlu

    2018-05-11

    In view of the changing antibiotic-resistance profiles of Streptococcus agalactiae from tilapia in China, antimicrobial susceptibilities of 75 S. agalactiae strains were determined by the disc diffusion method, and cluster analyses of the antibiograms and antibiogram types were performed. All strains displayed multidrug resistance (MDR). The antimicrobial-resistance rates were highest (>90%) to aminoglycosides, sulfonamides, pipemidic acid, and norfloxacin, followed by penicillin, ampicillin, and ciprofloxacin (26.7-38.7%); those to furadantin, lincomycin, erythromycin, ofloxacin, tetracycline, and florfenicol were low (<10%), and no resistance to vancomycin, cefalexin, cefoxitin, amoxicillin, medemycin, doxitard, oxytetracycline, rifampin, chloramphenicol, or thiamphenicol was detected. Statistical analysis showed that the resistance rate to ciprofloxacin increased significantly in 2016 (p = 0.009), whereas that to trimethoprim/sulfamethoxazole decreased (p = 0.017). Cluster analyses identified that the strains had 23 antibiogram types (A-W) and clustered in five groups (Groups I-V). The strains with higher antimicrobial resistance mainly clustered in Groups I and II. Our results show that the antibiograms varied with time and by location and that antibiogram types are constantly updating and expanding. Effective measures must be taken to reduce the antimicrobial resistance and spread of MDR strains.

  17. Density-cluster NMA: A new protein decomposition technique for coarse-grained normal mode analysis.

    PubMed

    Demerdash, Omar N A; Mitchell, Julie C

    2012-07-01

    Normal mode analysis has emerged as a useful technique for investigating protein motions on long time scales. This is largely due to the advent of coarse-graining techniques, particularly Hooke's Law-based potentials and the rotational-translational blocking (RTB) method for reducing the size of the force-constant matrix, the Hessian. Here we present a new method for domain decomposition for use in RTB that is based on hierarchical clustering of atomic density gradients, which we call Density-Cluster RTB (DCRTB). The method reduces the number of degrees of freedom by 85-90% compared with the standard blocking approaches. We compared the normal modes from DCRTB against standard RTB using 1-4 residues in sequence in a single block, with good agreement between the two methods. We also show that Density-Cluster RTB and standard RTB perform well in capturing the experimentally determined direction of conformational change. Significantly, we report superior correlation of DCRTB with B-factors compared with 1-4 residue per block RTB. Finally, we show significant reduction in computational cost for Density-Cluster RTB that is nearly 100-fold for many examples. Copyright © 2012 Wiley Periodicals, Inc.

  18. Isomers and energy landscapes of micro-hydrated sulfite and chlorate clusters

    NASA Astrophysics Data System (ADS)

    Hey, John C.; Doyle, Emily J.; Chen, Yuting; Johnston, Roy L.

    2018-03-01

    We present putative global minima for the micro-hydrated sulfite SO₃²⁻(H₂O)N and chlorate ClO₃⁻(H₂O)N systems in the range 3≤N≤15 found using basin-hopping global structure optimization with an empirical potential. We present a structural analysis of the hydration of a large number of minimized structures for hydrated sulfite and chlorate clusters in the range 3≤N≤50. We show that sulfite is a significantly stronger net acceptor of hydrogen bonding within water clusters than chlorate, completely suppressing the appearance of hydroxyl groups pointing out from the cluster surface (dangling OH bonds), in low-energy clusters. We also present a qualitative analysis of a highly explored energy landscape in the region of the global minimum of the eight water hydrated sulfite and chlorate systems. This article is part of the theme issue 'Modern theoretical chemistry'.

  19. Isomers and energy landscapes of micro-hydrated sulfite and chlorate clusters.

    PubMed

    Hey, John C; Doyle, Emily J; Chen, Yuting; Johnston, Roy L

    2018-03-13

    We present putative global minima for the micro-hydrated sulfite SO₃²⁻(H₂O)N and chlorate ClO₃⁻(H₂O)N systems in the range 3≤N≤15 found using basin-hopping global structure optimization with an empirical potential. We present a structural analysis of the hydration of a large number of minimized structures for hydrated sulfite and chlorate clusters in the range 3≤N≤50. We show that sulfite is a significantly stronger net acceptor of hydrogen bonding within water clusters than chlorate, completely suppressing the appearance of hydroxyl groups pointing out from the cluster surface (dangling OH bonds), in low-energy clusters. We also present a qualitative analysis of a highly explored energy landscape in the region of the global minimum of the eight water hydrated sulfite and chlorate systems. This article is part of the theme issue 'Modern theoretical chemistry'. © 2018 The Authors.

  20. Self-Assembled Gold Nano-Ripple Formation by Gas Cluster Ion Beam Bombardment.

    PubMed

    Tilakaratne, Buddhi P; Chen, Quark Y; Chu, Wei-Kan

    2017-09-08

    In this study, we used a 30 keV argon cluster ion beam bombardment to investigate the dynamic processes during nano-ripple formation on gold surfaces. Atomic force microscope analysis shows that the gold surface has maximum roughness at an incident angle of 60° from the surface normal; moreover, at this angle, and for an applied fluence of 3 × 10¹⁶ clusters/cm², the aspect ratio of the nano-ripple pattern is in the range of ~50%. Rutherford backscattering spectrometry analysis reveals a formation of a surface gradient due to prolonged gas cluster ion bombardment, although the surface roughness remains consistent throughout the bombarded surface area. As a result, significant mass redistribution is triggered by gas cluster ion beam bombardment at room temperature. Where mass redistribution is responsible for nano-ripple formation, the surface erosion process refines the formed nano-ripple structures.

  1. Cluster-specific small airway modeling for imaging-based CFD analysis of pulmonary air flow and particle deposition in COPD smokers

    NASA Astrophysics Data System (ADS)

    Haghighi, Babak; Choi, Jiwoong; Choi, Sanghun; Hoffman, Eric A.; Lin, Ching-Long

    2017-11-01

    Accurate modeling of small airway diameters in patients with chronic obstructive pulmonary disease (COPD) is a crucial step toward patient-specific CFD simulations of regional airflow and particle transport. We proposed to use computed tomography (CT) imaging-based cluster membership to identify structural characteristics of airways in each cluster and use them to develop cluster-specific airway diameter models. We analyzed 284 COPD smokers with airflow limitation, and 69 healthy controls. We used multiscale imaging-based cluster analysis (MICA) to classify smokers into 4 clusters. With representative cluster patients and healthy controls, we performed multiple regressions to quantify variation of airway diameters by generation as well as by cluster. Clusters 2 and 4 showed a greater decrease in diameter with increasing generation than the other clusters. Cluster 4 had more rapid decreases of airway diameters in the upper lobes, while cluster 2 had them in the lower lobes. We then used these regression models to estimate airway diameters in CT unresolved regions to obtain pressure-volume hysteresis curves using a 1D resistance model. These 1D flow solutions can be used to provide the patient-specific boundary conditions for 3D CFD simulations in COPD patients. Support for this study was provided, in part, by NIH Grants U01-HL114494, R01-HL112986 and S10-RR022421.

  2. A comparison of IQ and memory cluster solutions in moderate and severe pediatric traumatic brain injury.

    PubMed

    Thaler, Nicholas S; Terranova, Jennifer; Turner, Alisa; Mayfield, Joan; Allen, Daniel N

    2015-01-01

    Recent studies have examined heterogeneous neuropsychological outcomes in childhood traumatic brain injury (TBI) using cluster analysis. These studies have identified homogeneous subgroups based on tests of IQ, memory, and other cognitive abilities that show some degree of association with specific cognitive, emotional, and behavioral outcomes, and have demonstrated that the clusters derived for children with TBI are different from those observed in normal populations. However, the extent to which these subgroups are stable across abilities has not been examined, and this has significant implications for the generalizability and clinical utility of TBI clusters. The current study addressed this by comparing IQ and memory profiles of 137 children who sustained moderate-to-severe TBI. Cluster analysis of IQ and memory scores indicated that a four-cluster solution was optimal for the IQ scores and a five-cluster solution was optimal for the memory scores. Three clusters on each battery differed primarily by level of performance, while the others had pattern variations. Cross-plotting the clusters across respective IQ and memory test scores indicated that clusters defined by level were generally stable, while clusters defined by pattern differed. Notably, children with slower processing speed exhibited low-average to below-average performance on memory indexes. These results provide some support for the stability of previously identified memory and IQ clusters and provide information about the relationship between IQ and memory in children with TBI.

  3. Mapping the Dark Matter Distribution of the Merging Galaxy Cluster Abell 115

    NASA Astrophysics Data System (ADS)

    Kim, Mincheol; Jee, Myungkook James; Forman, William; Golovich, Nathan; van Weeren, Reinout

    2018-01-01

    The colliding galaxy cluster Abell 115 shows a number of clear merging features including radio relics, double X-ray peaks, and offsets between the cluster member galaxies and the X-ray distributions. In order to constrain the merging scenario of this complex system, it is critical to know where the dark matter is. We present a high-fidelity weak-lensing analysis of the system using a state-of-the-art method that robustly models the detailed PSF variations. Our mass reconstruction reveals two distinct mass peaks. Through a careful bootstrapping analysis, we demonstrate that the positions of these two mass peaks are highly consistent with those of the cluster galaxies, although the comparison with the X-ray emission shows that the mass peaks lead the X-ray peaks. We obtain the first weak-lensing mass of each subcluster by simultaneously fitting two NFW profiles, as well as the total mass of the system. Interestingly, the total mass is a few factors lower than the published dynamical mass based on velocity dispersion. This large mass discrepancy may be attributed to a significant disruption of the cluster galaxy orbits due to the violent merger. Our preliminary analysis indicates that the two subclusters might have experienced a first off-axis collision a few Gyrs ago and might be now returning for a second collision.

  4. [Study of human immunodeficiency virus transmission chains in Andalusia: analysis from baseline antiretroviral resistance sequences].

    PubMed

    Pérez-Parra, Santiago; Chueca-Porcuna, Natalia; Álvarez-Estevez, Marta; Pasquau, Juan; Omar, Mohamed; Collado, Antonio; Vinuesa, David; Lozano, Ana Belen; García-García, Federico

    2015-11-01

    Protease and reverse transcriptase HIV-1 sequences provide useful information for patient clinical management, as well as information on resistance to antiretrovirals. The aim of this study is to evaluate transmission events, transmitted drug resistance, and to georeference subtypes among newly diagnosed patients referred to our center. A study was conducted on 693 patients diagnosed between 2005 and 2012 in Southern Spain. Protease and reverse transcriptase sequences were obtained for cART resistance analysis with the Trugene® HIV Genotyping Kit (Siemens, NAD). MEGA 5.2, Neighbor-Joining, ArcGIS and REGA were used for subsequent analysis. The results showed 298 patients clustered into 77 different transmission events. Most of the clusters were formed by pairs (n=49), of men having sex with men (n=26), Spanish (n=37), and below 45 years of age (73.5%). Urban areas from Granada, and the coastal areas of Almeria and Granada showed the greatest subtype heterogeneity. Five clusters were formed by more than 10 patients, and 15 clusters had transmitted drug resistance. The study data demonstrate how the phylogenetic characterization of transmission clusters is a powerful tool to monitor the spread of HIV, and may contribute to the design of appropriate preventive measures to minimize it. Copyright © 2015 Elsevier España, S.L.U. y Sociedad Española de Enfermedades Infecciosas y Microbiología Clínica. All rights reserved.

  5. Stressful jobs and non-stressful jobs: a cluster analysis of office jobs.

    PubMed

    Carayon, P

    1994-02-01

    The purpose of the study was to determine if office jobs could be characterized by a small number of combinations of stressors that could be related to job-title information and self-report of psychological strain. Two-hundred-and-sixty-two office workers from three public service organizations provided data on nine job stressors and seven indicators of psychological strain. Using cluster analysis on the nine stressors, office jobs were classified into three clusters. The first cluster included jobs with high skill utilization, task clarity, job control and social support and low future ambiguity, but also high on job demands such as quantitative work-load, attention and work pressure. The second cluster included jobs with high demands and future ambiguity and low skill utilization, task clarity, job control and social support. The third cluster was intermediary between the first two clusters. The three clusters were related to job-title information. The second cluster was the highest on a range of psychological strain indicators, while the other two clusters were high on certain strain indicators but low on others. The study showed that office jobs could be characterized by a small number of combinations of stressors that were related to job-title information and psychological strain.

  6. Metabolic Analysis of Various Date Palm Fruit (Phoenix dactylifera L.) Cultivars from Saudi Arabia to Assess Their Nutritional Quality.

    PubMed

    Hamad, Ismail; AbdElgawad, Hamada; Al Jaouni, Soad; Zinta, Gaurav; Asard, Han; Hassan, Sherif; Hegab, Momtaz; Hagagy, Nashwa; Selim, Samy

    2015-07-27

    Date palm is an important crop, especially in the hot-arid regions of the world. Date palm fruits have high nutritional and therapeutic value and possess significant antibacterial and antifungal properties. In this study, we performed bioactivity analyses and metabolic profiling of date fruits of 12 cultivars from Saudi Arabia to assess their nutritional value. Our results showed that the date extracts from different cultivars have different free radical scavenging and anti-lipid peroxidation activities. Moreover, the cultivars showed significant differences in their chemical composition, e.g., the phenolic content (10.4-22.1 mg/100 g DW), amino acids (37-108 μmol·g⁻¹ FW) and minerals (237-969 mg/100 g DW). Principal component analysis (PCA) showed a clear separation of the cultivars into four different groups. The first group consisted of the Sokary and Nabtit Ali cultivars; the second group of Khlas Al Kharj, Khla Al Qassim, Mabroom, and Khlas Al Ahsa; the third group of Khals Elshiokh, Nabot Saif, and Khodry; and the fourth group of Ajwa Al Madinah, Saffawy, and Rashodia. Hierarchical cluster analysis (HCA) revealed clustering of date cultivars into two groups. The first cluster consisted of the Sokary, Rashodia and Nabtit Ali cultivars, and the second cluster contained all the other tested cultivars. These results indicate that date fruits have high nutritive value, and different cultivars have different chemical composition.

  7. Minimal spanning tree algorithm for γ-ray source detection in sparse photon images: cluster parameters and selection strategies

    DOE PAGES

    Campana, R.; Bernieri, E.; Massaro, E.; ...

    2013-05-22

    The minimal spanning tree (MST) algorithm is a graph-theoretical cluster-finding method. We previously applied it to γ-ray bidimensional images, showing that it is quite sensitive in finding faint sources. Possible sources are associated with the regions where the photon arrival directions cluster. MST selects clusters starting from a particular “tree” connecting all the points of the image and performing a cut based on the angular distance between photons, keeping clusters with a number of events higher than a given threshold. In this paper, we show how a further filtering, based on some parameters linked to the cluster properties, can be applied to reduce spurious detections. We find that the most efficient parameter for this secondary selection is the magnitude M of a cluster, defined as the product of its number of events by its clustering degree. We test the sensitivity of the method by means of simulated and real Fermi-Large Area Telescope (LAT) fields. Our results show that √M is strongly correlated with other statistical significance parameters, derived from a wavelet based algorithm and maximum likelihood (ML) analysis, and that it can be used as a good estimator of statistical significance of MST detections. Finally, we apply the method to a 2-year LAT image at energies higher than 3 GeV, and we show the presence of new clusters, likely associated with BL Lac objects.
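
    The selection steps described in the abstract above (build the tree, cut long edges, keep sub-trees with enough events, rank by magnitude) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses flat Euclidean distances instead of angular separations on the sky, and it takes the clustering degree g to be the mean edge length of the whole tree divided by the mean edge length inside the cluster, a definition the abstract does not give. All function and parameter names are hypothetical.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph.

    Returns the tree as a list of (length, i, j) edges.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # shortest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def mst_clusters(points, cut_length, n_threshold):
    """Cut MST edges longer than cut_length; keep sub-trees with more
    than n_threshold events and attach a magnitude M = n * g to each."""
    if len(points) < 2 or n_threshold < 1:
        raise ValueError("need at least 2 points and n_threshold >= 1")
    edges = mst_edges(points)
    mean_all = sum(length for length, _, _ in edges) / len(edges)

    root = list(range(len(points)))   # union-find forest over the events

    def find(i):
        while root[i] != i:
            root[i] = root[root[i]]
            i = root[i]
        return i

    kept = [e for e in edges if e[0] <= cut_length]
    for _, i, j in kept:
        root[find(i)] = find(j)

    members, lengths = {}, {}
    for i in range(len(points)):
        members.setdefault(find(i), []).append(i)
    for length, i, _ in kept:
        lengths.setdefault(find(i), []).append(length)

    clusters = []
    for r, idx in members.items():
        if len(idx) > n_threshold:
            mean_cluster = sum(lengths[r]) / len(lengths[r])
            g = mean_all / mean_cluster   # assumed clustering degree
            clusters.append({"members": sorted(idx), "n": len(idx),
                             "g": g, "M": len(idx) * g})
    return clusters
```

    A secondary selection in the spirit of the abstract would then keep only the clusters whose M (or √M) exceeds a chosen value; the appropriate threshold depends on the field and is not implied by this sketch.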

  8. Elements concentration analysis in groundwater from the North Serra Geral aquifer in Santa Helena-Brazil using SR-TXRF spectrometer.

    PubMed

    Justen, Gisele C; Espinoza-Quiñones, Fernando R; Módenes, Aparecido Nivaldo; Bergamasco, Rosangela

    2012-01-01

    In this work the analysis of element concentrations in groundwater was performed using the synchrotron radiation total-reflection X-ray fluorescence (SR-TXRF) technique. A set of nine tube-wells with serious risk of contamination was chosen to monitor the mean concentration of elements in groundwater from the North Serra Geral aquifer in Santa Helena, Brazil, during 1 year. Element concentrations were determined applying a SR-TXRF methodology. The accuracy of the SR-TXRF technique was validated by analysis of a certified reference material. As the groundwater composition in the North Serra Geral aquifer showed heterogeneity in the spatial distribution of eight major elements, hierarchical clustering of the data was performed. Based on similarity in their compositions, two of the nine wells were grouped in a first cluster, while the other seven were grouped in a second cluster. Calcium was the major element in all wells, with higher Ca concentration in the second cluster than in the first cluster. However, concentrations of Ti, V, Cr in the first cluster are slightly higher than those in the second cluster. The findings of this study within a monitoring program of tube-wells could provide a useful assessment of controls over groundwater composition and support management at regional level.
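
    As an illustration of how wells can be grouped by composition, the sketch below runs a small agglomerative clustering on standardized concentrations. The abstract does not name the linkage rule or distance measure, so average linkage with Euclidean distance is an assumption here, and the concentration values are invented placeholders, not the study's data.

```python
import math

def zscore_columns(rows):
    """Standardize each column (element) to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # a constant column gets a divisor of 1.0 to avoid dividing by zero
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def agglomerate(rows, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage).

    Returns a list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]

    def linkage(a, b):
        # mean pairwise Euclidean distance between the members of a and b
        total = sum(math.dist(rows[i], rows[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # merge the closest pair of clusters
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return sorted(sorted(c) for c in clusters)

# Invented example: nine wells, two elements per well (arbitrary units).
wells = [[10, 0.50], [11, 0.60], [30, 0.20], [31, 0.25], [29, 0.22],
         [32, 0.20], [30, 0.21], [28, 0.23], [31, 0.19]]
groups = agglomerate(zscore_columns(wells), n_clusters=2)
```

    Standardizing first keeps an abundant element such as Ca from dominating the distances; whether the original analysis did so is not stated in the abstract.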

  9. Sun Protection Belief Clusters: Analysis of Amazon Mechanical Turk Data.

    PubMed

    Santiago-Rivas, Marimer; Schnur, Julie B; Jandorf, Lina

    2016-12-01

    This study aimed (i) to determine whether people could be differentiated on the basis of their sun protection belief profiles and individual characteristics and (ii) to explore the use of a crowdsourcing web service for the assessment of sun protection beliefs. A sample of 500 adults completed an online survey of sun protection belief items using Amazon Mechanical Turk. A two-phased cluster analysis (i.e., hierarchical and non-hierarchical K-means) was utilized to determine clusters of sun protection barriers and facilitators. Results yielded three distinct clusters of sun protection barriers and three distinct clusters of sun protection facilitators. Significant associations between gender, age, sun sensitivity, and cluster membership were identified. Results also showed an association between barrier and facilitator cluster membership. The results of this study provided a potential alternative approach to developing future sun protection promotion initiatives in the population. Findings add to our knowledge regarding individuals who support, oppose, or are ambivalent toward sun protection and inform intervention research by identifying distinct subtypes that may best benefit from (or have a higher need for) skin cancer prevention efforts.

  10. Ligand Effects in Aluminum Cluster based Energetic Materials

    DTIC Science & Technology

    2017-09-01

    was recently reported and the effect of their increased steric bulk is discussed here. Experimental results and density functional theory (DFT...analysis show that these clusters are enthalpically more stable than the Cp* variant, due primarily to non-covalent interactions (NCIs) across ligand...C5Me4iPr), two clusters similar to Al4Cp*4, was recently reported and the effect of their increased steric bulk is discussed here. Experimental

  11. Chaperone expression profiles correlate with distinct physiological states of Plasmodium falciparum in malaria patients

    PubMed Central

    2010-01-01

    Background: Molecular chaperones have been shown to be important in the growth of the malaria parasite Plasmodium falciparum and inhibition of chaperone function by pharmacological agents has been shown to abrogate parasite growth. A recent study has demonstrated that clinical isolates of the parasite have distinct physiological states, one of which resembles environmental stress response showing up-regulation of specific molecular chaperones. Methods: Chaperone networks operational in the distinct physiological clusters in clinical malaria parasites were constructed using Cytoscape by utilizing their clinical expression profiles. Results: Molecular chaperones show distinct profiles in the previously defined physiologically distinct states. Further, expression profiles of the chaperones from different cellular compartments correlate with specific patient clusters. While cluster 1 parasites, representing a starvation response, show up-regulation of organellar chaperones, cluster 2 parasites, which resemble active growth based on glycolysis, show up-regulation of cytoplasmic chaperones. Interestingly, cytoplasmic Hsp90 and its co-chaperones, previously implicated as drug targets in malaria, cluster in the same group. Detailed analysis of chaperone expression in the patient cluster 2 reveals up-regulation of the entire Hsp90-dependent pro-survival circuitries. In addition, cluster 2 also shows up-regulation of Plasmodium export element (PEXEL)-containing Hsp40s thought to have regulatory and host remodeling roles in the infected erythrocyte. Conclusion: In all, this study demonstrates an intimate involvement of parasite-encoded chaperones, PfHsp90 in particular, in defining pathogenesis of malaria. PMID:20719001

  12. Characterization of distinct Arctic aerosol accumulation modes and their sources

    NASA Astrophysics Data System (ADS)

    Lange, R.; Dall'Osto, M.; Skov, H.; Nøjgaard, J. K.; Nielsen, I. E.; Beddows, D. C. S.; Simo, R.; Harrison, R. M.; Massling, A.

    2018-06-01

    In this work we use cluster analysis of long term particle size distribution data to expand an array of different shorter term atmospheric measurements, thereby gaining insights into longer term patterns and properties of Arctic aerosol. Measurements of aerosol number size distributions (9-915 nm) were conducted at Villum Research Station (VRS), Station Nord in North Greenland over a 5 year period (2012-2016). Alongside this, measurements of aerosol composition, meteorological parameters, gaseous compounds and cloud condensation nuclei (CCN) activity were performed during various shorter periods. K-means clustering analysis of particle number size distributions on a daily basis identified several clusters. Clusters of accumulation mode aerosols (main size modes > 100 nm) accounted for 56% of the total aerosol during the sampling period (89-91% during February-April, 1-3% during June-August). By association to chemical composition, cloud condensation nuclei properties, and meteorological variables, three typical accumulation mode aerosol clusters were identified: Haze (32% of the time), Bimodal (14%) and Aged (6%). In brief: (1) Haze accumulation mode aerosol shows a single mode at 150 nm, peaking in February-April, with highest loadings of sulfate and black carbon concentrations. (2) Accumulation mode Bimodal aerosol shows two modes, at 38 nm and 150 nm, peaking in June-August, with the highest ratio of organics to sulfate concentrations. (3) Aged accumulation mode aerosol shows a single mode at 213 nm, peaking in September-October and is associated with cloudy and humid weather conditions during autumn. The three aerosol clusters were considered alongside CCN concentrations. We suggest that organic compounds, which are likely marine biogenic in nature, greatly influence the Bimodal cluster and contribute significantly to its CCN activity. This stresses the importance of better characterizing the marine ecosystem and the aerosol-mediated climate effects in the Arctic.
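
    The K-means step mentioned in the abstract above can be illustrated with a plain Lloyd's-algorithm sketch in which each row is one day's size distribution (particle counts per size bin). The random seeding, the Euclidean distance, and the three-bin toy data are assumptions made for this example; the abstract does not describe the authors' preprocessing or initialization.

```python
import math
import random

def kmeans(rows, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids).

    Initial centroids are k rows drawn at random (assumed seeding).
    """
    rng = random.Random(seed)
    centroids = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        # assignment step: nearest centroid for every row
        new_labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break   # assignments stopped changing
        labels = new_labels
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:   # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented daily distributions: counts in three size bins (small to large).
days = [[1, 1, 9], [1, 2, 8], [0, 1, 10], [9, 1, 1], [8, 2, 1], [10, 1, 0]]
labels, centroids = kmeans(days, k=2)

# Share of days assigned to each cluster, analogous to the
# "percent of the time" figures quoted in the abstract.
share = [labels.count(c) / len(labels) for c in range(2)]
```

    Each centroid is itself an average size distribution, so its dominant bins indicate the mode of that cluster, which is how cluster labels such as single-mode or bimodal can be read off.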

  13. Cluster size resolving analysis of CH3F-(ortho-H2)n in solid para-hydrogen using FTIR absorption spectroscopy at 3 μm region.

    PubMed

    Miyamoto, Yuki; Momose, Takamasa; Kanamori, Hideto

    2012-11-21

    Infrared absorption spectra of methyl fluoride with ortho-hydrogen (ortho-H₂) clusters in a solid para-hydrogen (para-H₂) crystal at 3.6 K were studied in the C-H stretching fundamental region (~3000 cm⁻¹) using an FTIR spectrometer. As shown previously, the ν₃ C-F stretching fundamental band of CH₃F-(ortho-H₂)ₙ (n = 0, 1, 2, ...) clusters at 1040 cm⁻¹ shows a series of n discrete absorption lines, which correspond to different-sized clusters. We observed three unresolved broad peaks in the C-H stretching region and applied this cluster model to them assuming the same intensity distribution function as the ν₃ band. A fitting analysis successfully gave us the linewidth and lineshift of the components in each vibrational band. It was found that the separately determined linewidth, matrix shift of the band origin, and cluster shift are dependent on the vibrational mode. From the transition intensities of the monomer component derived from the fitting analysis, we discuss the mixing ratio of the vibrational modes due to Fermi resonance.

  14. Space-time analysis of pneumonia hospitalisations in the Netherlands.

    PubMed

    Benincà, Elisa; van Boven, Michiel; Hagenaars, Thomas; van der Hoek, Wim

    2017-01-01

    Community acquired pneumonia is a major global public health problem. In the Netherlands there are 40,000-50,000 hospital admissions for pneumonia per year. In the large majority of these hospital admissions the etiologic agent is not determined and a real-time surveillance system is lacking. Localised and temporal increases in hospital admissions for pneumonia are therefore only detected retrospectively and the etiologic agents remain unknown. Here, we perform spatio-temporal analyses of pneumonia hospital admission data in the Netherlands. To this end, we scanned for spatial clusters on yearly and seasonal basis, and applied wavelet cluster analysis on the time series of five main regions. The pneumonia hospital admissions show strong clustering in space and time superimposed on a regular yearly cycle with high incidence in winter and low incidence in summer. Cluster analysis reveals a heterogeneous pattern, with most significant clusters occurring in the western, highly urbanised, and in the eastern, intensively farmed, part of the Netherlands. Quantitatively, the relative risk (RR) of the significant clusters for the age-standardised incidence varies from a minimum of 1.2 to a maximum of 2.2. We discuss possible underlying causes for the patterns observed, such as variations in air pollution.

  15. Atlas-Guided Cluster Analysis of Large Tractography Datasets

    PubMed Central

    Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

    2013-01-01

    Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292

  16. Novel approach to characterising individuals with low back-related leg pain: cluster identification with latent class analysis and 12-month follow-up.

    PubMed

    Stynes, Siobhán; Konstantinou, Kika; Ogollah, Reuben; Hay, Elaine M; Dunn, Kate M

    2018-04-01

    Traditionally, low back-related leg pain (LBLP) is diagnosed clinically as referred leg pain or sciatica (nerve root involvement). However, within the spectrum of LBLP, we hypothesised that there may be other unrecognised patient subgroups. This study aimed to identify clusters of patients with LBLP using latent class analysis and describe their clinical course. The study population was 609 LBLP primary care consulters. Variables from clinical assessment were included in the latent class analysis. Characteristics of the statistically identified clusters were compared, and their clinical course over 1 year was described. A 5-cluster solution was optimal. Cluster 1 (n = 104) had mild leg pain severity and was considered to represent a referred leg pain group with no clinical signs suggesting nerve root involvement (sciatica). Cluster 2 (n = 122), cluster 3 (n = 188), and cluster 4 (n = 69) had mild, moderate, and severe pain and disability, respectively, and response to clinical assessment items suggested categories of mild, moderate, and severe sciatica. Cluster 5 (n = 126) had high pain and disability, longer pain duration, and more comorbidities and was difficult to map to a clinical diagnosis. Most improvement for pain and disability was seen in the first 4 months for all clusters. At 12 months, the proportion of patients reporting recovery ranged from 27% for cluster 5 to 45% for cluster 2 (mild sciatica). This is the first study that empirically shows the variability in profile and clinical course of patients with LBLP including sciatica. More homogeneous groups were identified, which could be considered in future clinical and research settings.

  17. Fingerprint analysis of Hibiscus mutabilis L. leaves based on ultra performance liquid chromatography with photodiode array detector combined with similarity analysis and hierarchical clustering analysis methods

    PubMed Central

    Liang, Xianrui; Ma, Meiling; Su, Weike

    2013-01-01

    Background: A method for chemical fingerprint analysis of Hibiscus mutabilis L. leaves was developed based on ultra performance liquid chromatography with photodiode array detector (UPLC-PAD) combined with similarity analysis (SA) and hierarchical clustering analysis (HCA). Materials and Methods: 10 batches of Hibiscus mutabilis L. leaves samples were collected from different regions of China. UPLC-PAD was employed to collect chemical fingerprints of Hibiscus mutabilis L. leaves. Results: The relative standard deviations (RSDs) of the relative retention times (RRT) and relative peak areas (RPA) of 10 characteristic peaks (one of them was identified as rutin) in precision, repeatability and stability test were less than 3%, and the method of fingerprint analysis was validated to be suitable for the Hibiscus mutabilis L. leaves. Conclusions: The chromatographic fingerprints showed abundant diversity of chemical constituents qualitatively in the 10 batches of Hibiscus mutabilis L. leaves samples from different locations by similarity analysis on basis of calculating the correlation coefficients between each two fingerprints. Moreover, the HCA method clustered the samples into four classes, and the HCA dendrogram showed the close or distant relations among the 10 samples, which was consistent to the SA result to some extent. PMID:23930008

  18. Galaxy clusters in the SDSS Stripe 82 based on photometric redshifts

    DOE PAGES

    Durret, F.; Adami, C.; Bertin, E.; ...

    2015-06-10

    Based on a recent photometric redshift galaxy catalogue, we have searched for galaxy clusters in the Stripe 82 region of the Sloan Digital Sky Survey by applying the Adami & MAzure Cluster FInder (AMACFI). Extensive tests were made to fine-tune the AMACFI parameters and make the cluster detection as reliable as possible. The same method was applied to the Millennium simulation to estimate our detection efficiency and the approximate masses of the detected clusters. Considering all the cluster galaxies (i.e. within a 1 Mpc radius of the cluster to which they belong and with a photoz differing by less than 0.05 from that of the cluster), we stacked clusters in various redshift bins to derive colour-magnitude diagrams and galaxy luminosity functions (GLFs). For each galaxy with absolute magnitude brighter than -19.0 in the r band, we computed the disk and spheroid components by applying SExtractor, and by stacking clusters we determined how the disk-to-spheroid flux ratio varies with cluster redshift and mass. We also detected 3663 clusters in the redshift range 0.1513 and a few 10 14 solar masses. Furthermore, by stacking the cluster galaxies in various redshift bins, we find a clear red sequence in the (g'-r') versus r' colour-magnitude diagrams, and the GLFs are typical of clusters, though with a possible contamination from field galaxies. The morphological analysis of the cluster galaxies shows that the fraction of late-type to early-type galaxies shows an increase with redshift (particularly in high mass clusters) and a decrease with detection level, i.e. cluster mass. From the properties of the cluster galaxies, the majority of the candidate clusters detected here seem to be real clusters with typical cluster properties.

  19. Galaxy clusters in the SDSS Stripe 82 based on photometric redshifts

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Durret, F.; Adami, C.; Bertin, E.

    Based on a recent photometric redshift galaxy catalogue, we have searched for galaxy clusters in the Stripe 82 region of the Sloan Digital Sky Survey by applying the Adami & MAzure Cluster FInder (AMACFI). Extensive tests were made to fine-tune the AMACFI parameters and make the cluster detection as reliable as possible. The same method was applied to the Millennium simulation to estimate our detection efficiency and the approximate masses of the detected clusters. Considering all the cluster galaxies (i.e. within a 1 Mpc radius of the cluster to which they belong and with a photoz differing by less than 0.05 from that of the cluster), we stacked clusters in various redshift bins to derive colour-magnitude diagrams and galaxy luminosity functions (GLFs). For each galaxy with absolute magnitude brighter than -19.0 in the r band, we computed the disk and spheroid components by applying SExtractor, and by stacking clusters we determined how the disk-to-spheroid flux ratio varies with cluster redshift and mass. We also detected 3663 clusters in the redshift range 0.1513 and a few 10 14 solar masses. Furthermore, by stacking the cluster galaxies in various redshift bins, we find a clear red sequence in the (g'-r') versus r' colour-magnitude diagrams, and the GLFs are typical of clusters, though with a possible contamination from field galaxies. The morphological analysis of the cluster galaxies shows that the fraction of late-type to early-type galaxies shows an increase with redshift (particularly in high mass clusters) and a decrease with detection level, i.e. cluster mass. From the properties of the cluster galaxies, the majority of the candidate clusters detected here seem to be real clusters with typical cluster properties.

  20. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity

    PubMed Central

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F; Abbazia, Patrick; Ababio, Amma; Adam, Naazneen

    2015-01-01

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery. DOI: http://dx.doi.org/10.7554/eLife.06416.001 PMID:25919952

  1. Whole genome comparison of a large collection of mycobacteriophages reveals a continuum of phage genetic diversity.

    PubMed

    Pope, Welkin H; Bowman, Charles A; Russell, Daniel A; Jacobs-Sera, Deborah; Asai, David J; Cresawn, Steven G; Jacobs, William R; Hendrix, Roger W; Lawrence, Jeffrey G; Hatfull, Graham F

    2015-04-28

    The bacteriophage population is large, dynamic, ancient, and genetically diverse. Limited genomic information shows that phage genomes are mosaic, and the genetic architecture of phage populations remains ill-defined. To understand the population structure of phages infecting a single host strain, we isolated, sequenced, and compared 627 phages of Mycobacterium smegmatis. Their genetic diversity is considerable, and there are 28 distinct genomic types (clusters) with related nucleotide sequences. However, amino acid sequence comparisons show pervasive genomic mosaicism, and quantification of inter-cluster and intra-cluster relatedness reveals a continuum of genetic diversity, albeit with uneven representation of different phages. Furthermore, rarefaction analysis shows that the mycobacteriophage population is not closed, and there is a constant influx of genes from other sources. Phage isolation and analysis was performed by a large consortium of academic institutions, illustrating the substantial benefits of a disseminated, structured program involving large numbers of freshman undergraduates in scientific discovery.

  2. Rosacea assessment by erythema index and principal component analysis segmentation maps

    NASA Astrophysics Data System (ADS)

    Kuzmina, Ilona; Rubins, Uldis; Saknite, Inga; Spigulis, Janis

    2017-12-01

    RGB images of rosacea were analyzed using segmentation maps of principal component analysis (PCA) and erythema index (EI). Areas of segmented clusters were compared to Clinician's Erythema Assessment (CEA) values given by two dermatologists. The results show that visible blood vessels are segmented more precisely on maps of the erythema index and the third principal component (PC3). In many cases, the distributions of clusters on EI and PC3 maps are very similar. Mean cluster areas on these maps show a decrease in the area of blood vessels and erythema and an increase in lighter skin area after therapy for patients with CEA = 2 at the first visit and CEA = 1 at the second visit. This study shows that EI and PC3 maps are more useful than the maps of the first (PC1) and second (PC2) principal components for indicating vascular structures and erythema on the skin of rosacea patients and for therapy monitoring.

  3. A baseline record of trace elements concentration along the beach placer mining areas of Kanyakumari coast, South India.

    PubMed

    Simon Peter, T; Chandrasekar, N; John Wilson, J S; Selvakumar, S; Krishnakumar, S; Magesh, N S

    2017-06-15

    Trace element concentration in the beach placer mining areas of Kanyakumari coast, South India was assessed. Sewage and contaminated sediments from mining sites have contaminated the surface sediments. The enrichment factor indicates moderately severe enrichment for Pb, minor enrichment for Mn, Zn, Ni, Fe and no enrichment for Cr and Cu. The Igeo values for Pb fall in the range 3-4, indicating strong contamination due to high anthropogenic activity such as mining and terrestrial influences on the coastal regions. Correlation coefficients show that most of the elements are associated with each other except Ni and Pb. Factor analysis reveals that Mn, Zn, Fe, Cr, Pb and Cu have significant loadings, indicating that these elements are mainly derived from a similar origin. The cluster analysis clearly indicated that the mining areas fall in cluster 2 and the non-mining areas in cluster 1. Copyright © 2017 Elsevier Ltd. All rights reserved.
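    For readers unfamiliar with the two indices named in this abstract, the sketch below shows how they are commonly computed. It uses the widely cited definitions (Igeo = log2(C / (1.5 B)) and an enrichment factor normalized to a reference element); the abstract does not say which reference element or background values the authors used, so the function names and all numbers here are hypothetical.

```python
import math


def geoaccumulation_index(concentration, background):
    """Geoaccumulation index: Igeo = log2(C_n / (1.5 * B_n)).

    The factor 1.5 allows for natural variation in the background value.
    """
    return math.log2(concentration / (1.5 * background))


def enrichment_factor(metal, ref, metal_background, ref_background):
    """EF = (metal / ref) in the sample divided by (metal / ref) in the background."""
    return (metal / ref) / (metal_background / ref_background)


# Hypothetical concentrations chosen only for illustration (not from the study).
pb_igeo = geoaccumulation_index(concentration=240.0, background=20.0)
pb_ef = enrichment_factor(metal=240.0, ref=2.0,
                          metal_background=20.0, ref_background=4.0)
print(f"Igeo = {pb_igeo:.2f}, EF = {pb_ef:.1f}")
```

    With these made-up inputs the Pb index lands at the lower edge of the 3-4 band that the abstract describes as strong contamination.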

  4. Clustering, randomness and regularity in cloud fields. I - Theoretical considerations. II - Cumulus cloud fields

    NASA Technical Reports Server (NTRS)

    Weger, R. C.; Lee, J.; Zhu, Tianri; Welch, R. M.

    1992-01-01

    The current controversy over regularity vs. clustering in cloud fields is examined by means of analysis and simulation studies based upon nearest-neighbor cumulative distribution statistics. It is shown that the Poisson representation of random point processes is superior to pseudorandom-number-generated models and that pseudorandom-number-generated models bias the observed nearest-neighbor statistics towards regularity. Interpretation of these nearest-neighbor statistics is discussed for many cases of superpositions of clustering, randomness, and regularity. A detailed analysis is carried out of cumulus cloud field spatial distributions based upon Landsat, AVHRR, and Skylab data, showing that, when both large and small clouds are included in the cloud field distributions, the cloud field always has a strong clustering signal.
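    As a point of reference for the nearest-neighbor statistics mentioned in this abstract, the standard textbook result for a homogeneous planar Poisson point process is given below. It is not taken from the paper itself; the symbols (intensity λ, distance r, distribution G) are notation assumed here, and edge effects are ignored.

```latex
G_{\mathrm{Poisson}}(r) = 1 - \exp\!\left(-\lambda \pi r^{2}\right), \qquad r \ge 0 .
```

    An observed nearest-neighbor distribution lying above this curve at small r suggests clustering, while one lying below it suggests regularity.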

  5. A graph-Laplacian-based feature extraction algorithm for neural spike sorting.

    PubMed

    Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos

    2009-01-01

    Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.

  6. Detecting space-time disease clusters with arbitrary shapes and sizes using a co-clustering approach.

    PubMed

    Ullah, Sami; Daud, Hanita; Dass, Sarat C; Khan, Habib Nawaz; Khalil, Alamgir

    2017-11-06

    The ability to detect potential space-time clusters in spatio-temporal data on disease occurrences is necessary for conducting surveillance and implementing disease prevention policies. Most existing techniques use geometrically shaped (circular, elliptical or square) scanning windows to discover disease clusters. In certain situations, where the disease occurrences tend to cluster in very irregularly shaped areas, these algorithms are not feasible in practice for the detection of space-time clusters. To address this problem, a new algorithm is proposed, which uses a co-clustering strategy to detect prospective and retrospective space-time disease clusters with no restriction on shape and size. The proposed method detects space-time disease clusters by tracking the changes in space-time occurrence structure instead of an in-depth search over space. This method was used to detect potential clusters in the annual and monthly malaria data in Khyber Pakhtunkhwa Province, Pakistan from 2012 to 2016, visualising the results on a heat map. The results of the annual data analysis showed that the most likely hotspot emerged in three sub-regions in the years 2013-2014. The most likely hotspots in the monthly data appeared in the months of July to October in each year and showed a strong periodic trend.

  7. Psychological profiles derived by cluster analysis of Minnesota Multiphasic Personality Inventory and long term clinical outcome after coronary artery by pass grafting.

    PubMed

    Modica, Maddalena; Carabalona, Roberta; Spezzaferri, Rosa; Tavanelli, Monica; Torri, A; Ripamonti, Vittorino; Castiglioni, Paolo; De Maria, Renata; Ferratini, Maurizio

    2012-03-01

    To evaluate the psychological characteristics of coronary heart disease (CHD) patients after coronary artery bypass grafting (CABG) by cluster analysis of Minnesota Multiphasic Personality Inventory (MMPI-2) questionnaires and to assess the impact of the profiles obtained on long-term outcome. 229 CHD patients admitted to cardiac rehabilitation filled in self-administered MMPI-2 questionnaires early after CABG. We assessed the relation between MMPI-2 profiles derived by cluster analysis, clinical characteristics and outcome at 3-year follow-up. Among the 215 patients (76% men, median age 66 years) with valid criteria in control scales, we identified 3 clusters (G) with homogenous psychological characteristics: G1 patients (N = 75) presented somatoform complaints but overall minimal psychological distress. G2 patients (N=72) presented type D personality traits. G3 subjects (N=68) showed a trend to cynicism, mild increases in anger, social introversion and hostility. Clusters overlapped for clinical characteristics such as smoking (G1 21%, G2 24%, G3 24%, p ns), previous myocardial infarction (G1 43%, G2 47%, G3 49% p ns), LV ejection fraction (G1 60 [51-60]; G2 58 [49-60]; G3 60 [55-60], p ns), 3-vessel-disease prevalence (G1 69%, G2 65%, G3 71%, p ns). Three-year event rates were comparable (G1 15%; G2 18%; G3 15%) and Kaplan-Meier curves overlapped among clusters (p ns). After CABG, the interpretation of MMPI-2 by cluster analysis is useful for the psychological and personological diagnosis to direct psychological assistance. Conversely, results from cluster analysis of MMPI-2 do not seem helpful to the clinician to predict long term outcome.

  8. Joint fMRI analysis and subject clustering using sparse dictionary learning

    NASA Astrophysics Data System (ADS)

    Kim, Seung-Jun; Dontaraju, Krishna K.

    2017-08-01

    Multi-subject fMRI data analysis methods based on sparse dictionary learning are proposed. In addition to identifying the component spatial maps by exploiting the sparsity of the maps, clusters of the subjects are learned by postulating that the fMRI volumes admit a subspace clustering structure. Furthermore, in order to tune the associated hyper-parameters systematically, a cross-validation strategy is developed based on entry-wise sampling of the fMRI dataset. Efficient algorithms for solving the proposed constrained dictionary learning formulations are developed. Numerical tests performed on synthetic fMRI data show promising results and provide insights into the proposed technique.

  9. The formation and evolution of M33 as revealed by its star clusters

    NASA Astrophysics Data System (ADS)

    San Roman, Izaskun

    2012-03-01

    Numerical simulations based on the Lambda-Cold Dark Matter (Λ-CDM) model predict a scenario consistent with observational evidence in terms of the build-up of Milky Way-like halos. Under this scenario, large disk galaxies derive from the merger and accretion of many smaller subsystems. However, it is less clear how low-mass spiral galaxies fit into this picture. The best way to answer this question is to study the nearest example of a dwarf spiral galaxy, M33. We will use star clusters to understand the structure, kinematics and stellar populations of this galaxy. Star clusters provide a unique and powerful tool for studying the star formation histories of galaxies. In particular, the ages and metallicities of star clusters bear the imprint of the galaxy formation process. We have made use of the star clusters to uncover the formation and evolution of M33. In this dissertation, we have carried out a comprehensive study of the M33 star cluster system, including deep photometry as well as high signal-to-noise spectroscopy. In order to mitigate the significant incompleteness present in previous catalogs, we have conducted ground-based and space-based photometric surveys of M33 star clusters. Using archival images, we have analyzed 12 fields using the Advanced Camera for Surveys Wide Field Channel onboard the Hubble Space Telescope (ACS/HST) along the major axis of the galaxy. We present integrated photometry and color-magnitude diagrams for 161 star clusters in M33, of which 115 were previously uncataloged. This survey extends the depth of the existing M33 cluster catalogs by ~1 mag. We have expanded our search through a photometric survey in a 1° x 1° area centered on M33 using the MegaCam camera on the 3.6m Canada-France-Hawaii Telescope (CFHT). In this work we discuss the photometric properties of the sample, including color-color diagrams of 599 new candidate stellar clusters, and 204 confirmed clusters. Comparisons with models of simple stellar populations suggest a large range of ages, some as old as ~10 Gyr. In addition, we find in the color-color diagrams a significant population of very young clusters (< 10 Myr) possessing nebular emission. Analysis of the radial density distribution suggests that the cluster system of M33 has suffered from significant depletion, possibly due to interactions with M31. To further understand the properties of M33 star clusters, we have carried out a morphological study of 161 star clusters in M33 using ACS/HST images. We have obtained, for the first time, ellipticities, position angles, and surface brightness profiles of a statistically significant number of clusters. Ellipticities show that, on average, M33 clusters are more flattened than those of the Milky Way and M31, and more similar to clusters in the Small Magellanic Cloud. The ellipticities do not show any correlation with age or mass, suggesting that rotation is not the main cause of elongation in the M33 clusters. The position angles of the clusters show a bimodality with a strong peak perpendicular to the position angle of the galaxy. These results support the notion that tidal forces are the reason for the cluster flattening. We have fit analytical models to the surface brightness profiles, and derived structural parameters. The overall analysis shows several differences between the structural properties of the M33 cluster system and cluster systems in nearby galaxies. Finally, we have performed a spectroscopic study of star clusters in the above-mentioned catalog. We present high-precision velocity measurements of 45 star clusters, based on observations from the 10.4m Gran Telescopio Canarias (GTC) using OSIRIS and the 4.2m William Herschel Telescope (WHT) using WYFFOS. All the clusters have been previously confirmed using HST imaging, and ages and integrated photometry are known. The velocity of the clusters with respect to local disk motion increases with age for young and intermediate clusters. The mean dispersion velocity for the intermediate age clusters in our sample is significantly larger than in previous studies. Analysis of these velocities along the major axis of the galaxy shows no net rotation of the intermediate age subsample. The small number of old clusters in our sample does not allow for any conclusive evidence in that age division.

  10. Microforms in gravel bed rivers: Formation, disintegration, and effects on bedload transport

    USGS Publications Warehouse

    Strom, K.; Papanicolaou, A.N.; Evangelopoulos, N.; Odeh, M.

    2004-01-01

    This research aims to advance current knowledge on cluster formation and evolution by tackling some of the aspects associated with cluster microtopography and the effects of clusters on bedload transport. The specific objectives of the study are (1) to identify the bed shear stress range in which clusters form and disintegrate, (2) to quantitatively describe the spacing characteristics and orientation of clusters with respect to flow characteristics, (3) to quantify the effects clusters have on the mean bedload rate, and (4) to assess the effects of clusters on the pulsating nature of bedload. In order to meet the objectives of this study, two main experimental scenarios, namely, Test Series A and B (20 experiments overall) are considered in a laboratory flume under well-controlled conditions. Series A tests are performed to address objectives (1) and (2) while Series B is designed to meet objectives (3) and (4). Results show that cluster microforms develop in uniform sediment at 1.25 to 2 times the Shields parameter of an individual particle and start disintegrating at about 2.25 times the Shields parameter. It is found that during an unsteady flow event, effects of clusters on bedload transport rate can be classified in three different phases: a sink phase where clusters absorb incoming sediment, a neutral phase where clusters do not affect bedload, and a source phase where clusters release particles. Clusters also increase the magnitude of the fluctuations in bedload transport rate, showing that clusters amplify the unsteady nature of bedload transport. A fourth-order autoregressive, autoregressive integrated moving average model is employed to describe the time series of bedload and provide a predictive formula for predicting bedload at different periods. 
Finally, a change-point analysis enhanced with a binary segmentation procedure is performed to identify the abrupt changes in the bedload statistic characteristics due to the effects of clusters and detect the different phases in bedload time series using probability theory. The analysis verifies the experimental findings that three phases are detected in the bedload rate time series structure, namely, sink, neutral, and source. © ASCE / June 2004.

  11. Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering

    NASA Technical Reports Server (NTRS)

    Rizvi, Farheen

    2013-01-01

    The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show a trade-off between algorithm performance versus computational efficiency as the number of cluster centroids and algorithm iterations are increased.
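    The reduction described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the mission's implementation: it runs plain Lloyd's k-means on synthetic latitude/longitude pairs using Euclidean distance in degrees (ignoring longitude wrap-around), and `toy_elevation` is a hypothetical stand-in for the expensive terrain lookup. Only the counts (400 points, 60 centroids) come from the abstract.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: (p[0] - centroids[j][0]) ** 2
                             + (p[1] - centroids[j][1]) ** 2)


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on a list of (lat, lon) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [nearest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    # Final assignment so labels match the returned centroids.
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


def toy_elevation(lat, lon):
    """Hypothetical placeholder for an expensive terrain-model query."""
    return 100.0 * math.sin(math.radians(lat)) + 50.0 * math.cos(math.radians(lon))


# Synthetic footprints: 400 (lat, lon) pairs, reduced to 60 centroids.
rng = random.Random(1)
footprints = [(rng.uniform(-60, 60), rng.uniform(-180, 180)) for _ in range(400)]
centroids, labels = kmeans(footprints, k=60)

# Evaluate elevation once per centroid, then share it with the whole group.
centroid_elev = [toy_elevation(lat, lon) for lat, lon in centroids]
elevations = [centroid_elev[lab] for lab in labels]
print(len(centroid_elev), "elevation lookups for", len(elevations), "footprints")
```

    Raising `k` or `iterations` in this sketch trades extra computation for a closer match to per-point elevations, which is the kind of trade-off the abstract's sensitivity analysis refers to.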

  12. Investigation of defect clusters in ion-irradiated Ni and NiCo using diffuse X-ray scattering and electron microscopy

    DOE PAGES

    Olsen, Raina J.; Jin, Ke; Lu, Chenyang; ...

    2015-11-23

    The nature of defect clusters in Ni and Ni50Co50 (NiCo) irradiated at room temperature with 2–16 MeV Ni ions is studied using asymptotic diffuse X-ray scattering and transmission electron microscopy (TEM). Analysis of the scattering data provides separate size distributions for vacancy and interstitial type defect clusters, showing that both types of defect clusters have a smaller size and higher density in NiCo than in Ni. Diffuse scattering results show good quantitative agreement with TEM results for cluster sizes greater than 4 nm diameter, but find that the majority of vacancy clusters are under 2 nm in NiCo, which, if not detected, would lead to the conclusion that defect density was actually lower in the alloy. Interstitial dislocation loops and stacking fault tetrahedra (SFTs) are identified by TEM. Lastly, comparison of diffuse scattering lineshapes to those calculated for dislocation loops and SFTs indicates that most of the vacancy clusters are SFTs.

  13. Weighted graph cuts without eigenvectors: a multilevel approach.

    PubMed

    Dhillon, Inderjit S; Guan, Yuqiang; Kulis, Brian

    2007-11-01

    A variety of clustering algorithms have recently been proposed to handle data that is not linearly separable; spectral clustering and kernel k-means are two of the main methods. In this paper, we discuss an equivalence between the objective functions used in these seemingly different methods--in particular, a general weighted kernel k-means objective is mathematically equivalent to a weighted graph clustering objective. We exploit this equivalence to develop a fast, high-quality multilevel algorithm that directly optimizes various weighted graph clustering objectives, such as the popular ratio cut, normalized cut, and ratio association criteria. This eliminates the need for any eigenvector computation for graph clustering problems, which can be prohibitive for very large graphs. Previous multilevel graph partitioning methods, such as Metis, have suffered from the restriction of equal-sized clusters; our multilevel algorithm removes this restriction by using kernel k-means to optimize weighted graph cuts. Experimental results show that our multilevel algorithm outperforms a state-of-the-art spectral clustering algorithm in terms of speed, memory usage, and quality. We demonstrate that our algorithm is applicable to large-scale clustering tasks such as image segmentation, social network analysis and gene network analysis.

  14. Assembly and features of secondary metabolite biosynthetic gene clusters in Streptomyces ansochromogenes.

    PubMed

    Zhong, Xingyu; Tian, Yuqing; Niu, Guoqing; Tan, Huarong

    2013-07-01

    A draft genome sequence of Streptomyces ansochromogenes 7100 was generated using 454 sequencing technology. In combination with local BLAST searches and gap filling techniques, a comprehensive antiSMASH-based method was adopted to assemble the secondary metabolite biosynthetic gene clusters in the draft genome of S. ansochromogenes. A total of at least 35 putative gene clusters were identified and assembled. Transcriptional analysis showed that 20 of the 35 gene clusters were expressed in at least one of the three media tested, whereas the other 15 gene clusters were silent in all three media. This study provides a comprehensive method to identify and assemble secondary metabolite biosynthetic gene clusters in draft genomes of Streptomyces, and will significantly promote functional studies of these secondary metabolite biosynthetic gene clusters.

  15. Population clustering based on copy number variations detected from next generation sequencing data.

    PubMed

    Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2014-08-01

    Copy number variations (CNVs) can be used as significant bio-markers and next generation sequencing (NGS) provides a high resolution detection of these CNVs. But how to extract features from CNVs and further apply them to genomic studies such as population clustering has become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results on the second real data analysis show that the proposed method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
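
    The factorization step can be illustrated with a small sketch. It is not the authors' implementation: it assumes the standard multiplicative-update rules for the squared-error objective, a toy feature matrix with samples as rows and CNV features as columns, and a rank of 2. Under that orientation the rows of `H` play the role of the shared source patterns and `W` holds the per-sample weights; all names and values are illustrative.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - WH."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))

def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative m x n matrix V into W (m x rank) and H (rank x n)."""
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + eps for _ in range(n)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(n)] for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)] for i in range(m)]
    return W, H

# Toy feature matrix (assumed values): 4 samples x 4 CNV features, two groups.
V = [
    [1.0, 1.0, 0.0, 0.0],
    [2.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 3.0, 3.0],
]

W, H = nmf(V, rank=2)

# Assign each sample to the source pattern with the largest weight.
groups = [max(range(2), key=lambda a: row[a]) for row in W]
```

    Reading the group of each sample from its dominant weight is one simple way to turn the factorization into a clustering; the abstract does not state which assignment rule the authors use.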

  16. Identifying Patient Attitudinal Clusters Associated with Asthma Control: The European REALISE Survey.

    PubMed

    van der Molen, Thys; Fletcher, Monica; Price, David

    Asthma is a highly heterogeneous disease that can be classified into different clinical phenotypes, and treatment may be tailored accordingly. However, factors beyond purely clinical traits, such as patient attitudes and behaviors, can also have a marked impact on treatment outcomes. The objective of this study was to further analyze data from the REcognise Asthma and LInk to Symptoms and Experience (REALISE) Europe survey, to identify distinct patient groups sharing common attitudes toward asthma and its management. Factor analysis of respondent data (N = 7,930) from the REALISE Europe survey consolidated the 34 attitudinal variables provided by the study population into a set of 8 summary factors. Cluster analyses were used to identify patient clusters that showed similar attitudes and behaviors toward each of the 8 summary factors. Five distinct patient clusters were identified and named according to the key characteristics comprising that cluster: "Confident and self-managing," "Confident and accepting of their asthma," "Confident but dependent on others," "Concerned but confident in their health care professional (HCP)," and "Not confident in themselves or their HCP." Clusters showed clear variability in attributes such as degree of confidence in managing their asthma, use of reliever and preventer medication, and level of asthma control. The 5 patient clusters identified in this analysis displayed distinctly different personal attitudes that would require different approaches in the consultation room certainly for asthma but probably also for other chronic diseases. Copyright © 2018 The Authors. Published by Elsevier Inc. All rights reserved.

  17. Catchment classification by runoff behaviour with self-organizing maps (SOM)

    NASA Astrophysics Data System (ADS)

    Ley, R.; Casper, M. C.; Hellebrand, H.; Merz, R.

    2011-09-01

    Catchments show a wide range of response behaviour, even if they are adjacent. For many purposes it is necessary to characterise and classify them, e.g. for regionalisation, prediction in ungauged catchments, and model parameterisation. In this study, we investigate hydrological similarity of catchments with respect to their response behaviour. We analyse more than 8200 event runoff coefficients (ERCs) and flow duration curves of 53 gauged catchments in Rhineland-Palatinate, Germany, for the period from 1993 to 2008, covering a huge variability of weather and runoff conditions. The spatio-temporal variability of event runoff coefficients and flow duration curves is assumed to represent how different catchments "transform" rainfall into runoff. From the runoff coefficients and flow duration curves we derive 12 signature indices describing various aspects of catchment response behaviour to characterise each catchment. Hydrological similarity of catchments is defined by high similarities of their indices. We identify, analyse and describe hydrologically similar catchments by cluster analysis using Self-Organizing Maps (SOM). As a result of the cluster analysis we get five clusters of similarly behaving catchments where each cluster represents one differentiated class of catchments. As catchment response behaviour is supposed to be dependent on its physiographic and climatic characteristics, we compare groups of catchments clustered by response behaviour with clusters of catchments based on catchment properties. Results show an overlap of 67% between these two pools of clustered catchments which can be improved using the topologic correctness of SOMs.

  18. Catchment classification by runoff behaviour with self-organizing maps (SOM)

    NASA Astrophysics Data System (ADS)

    Ley, R.; Casper, M. C.; Hellebrand, H.; Merz, R.

    2011-03-01

    Catchments show a wide range of response behaviour, even if they are adjacent. For many purposes it is necessary to characterise and classify them, e.g. for regionalisation, prediction in ungauged catchments, and model parameterisation. In this study, we investigate hydrological similarity of catchments with respect to their response behaviour. We analyse more than 8200 event runoff coefficients (ERCs) and flow duration curves of 53 gauged catchments in Rhineland-Palatinate, Germany, for the period from 1993 to 2008, covering a huge variability of weather and runoff conditions. The spatio-temporal variability of event runoff coefficients and flow duration curves is assumed to represent how different catchments "transform" rainfall into runoff. From the runoff coefficients and flow duration curves we derive 12 signature indices describing various aspects of catchment response behaviour to characterise each catchment. Hydrological similarity of catchments is defined by high similarities of their indices. We identify, analyse and describe hydrologically similar catchments by cluster analysis using Self-Organizing Maps (SOM). As a result of the cluster analysis we get five clusters of similarly behaving catchments where each cluster represents one differentiated class of catchments. As catchment response behaviour is supposed to be dependent on its physiographic and climatic characteristics, we compare groups of catchments clustered by response behaviour with clusters of catchments based on catchment properties. Results show an overlap of 67% between these two pools of clustered catchments which can be improved using the topologic correctness of SOMs.

  19. Transcriptional analysis of exopolysaccharides biosynthesis gene clusters in Lactobacillus plantarum.

    PubMed

    Vastano, Valeria; Perrone, Filomena; Marasco, Rosangela; Sacco, Margherita; Muscariello, Lidia

    2016-04-01

    Exopolysaccharides (EPS) from lactic acid bacteria contribute to specific rheology and texture of fermented milk products and find applications also in non-dairy foods and in therapeutics. Recently, four clusters of genes (cps) associated with surface polysaccharide production have been identified in Lactobacillus plantarum WCFS1, a probiotic and food-associated lactobacillus. These clusters are involved in cell surface architecture and probably in release and/or exposure of immunomodulating bacterial molecules. Here we show a transcriptional analysis of these clusters. Indeed, RT-PCR experiments revealed that the cps loci are organized in five operons. Moreover, by reverse transcription-qPCR analysis performed on L. plantarum WCFS1 (wild type) and WCFS1-2 (ΔccpA), we demonstrated that expression of three cps clusters is under the control of the global regulator CcpA. These results, together with the identification of putative CcpA target sequences (catabolite responsive element CRE) in the regulatory region of four out of five transcriptional units, strongly suggest for the first time a role of the master regulator CcpA in EPS gene transcription among lactobacilli.

  20. Aftershock identification problem via the nearest-neighbor analysis for marked point processes

    NASA Astrophysics Data System (ADS)

    Gabrielov, A.; Zaliapin, I.; Wong, H.; Keilis-Borok, V.

    2007-12-01

    The centennial observations on the world seismicity have revealed a wide variety of clustering phenomena that unfold in the space-time-energy domain and provide the most reliable information about the earthquake dynamics. However, there is neither a unifying theory nor a convenient statistical apparatus that would naturally account for the different types of seismic clustering. In this talk we present a theoretical framework for nearest-neighbor analysis of marked processes and obtain new results on the hierarchical approach to studying seismic clustering introduced by Baiesi and Paczuski (2004). Recall that under this approach one defines an asymmetric distance D in the space-time-energy domain such that the nearest-neighbor spanning graph with respect to D becomes a time-oriented tree. We demonstrate how this approach can be used to detect earthquake clustering. We apply our analysis to the observed seismicity of California and synthetic catalogs from the ETAS model and show that the earthquake clustering part is statistically different from the homogeneous part. This finding may serve as a basis for an objective aftershock identification procedure.
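
    How an asymmetric distance yields a time-oriented tree can be shown with a short sketch. The abstract does not give the form of D, so the expression below (inter-event time multiplied by a power of the spatial separation and discounted by the magnitude of the earlier event) and the parameter values `d` and `b` are illustrative assumptions, as is the toy catalog.

```python
import math

def nn_distance(parent, child, d=1.6, b=1.0):
    """Asymmetric space-time-magnitude distance from an earlier event to a later one.

    Events are (time, x, y, magnitude) tuples. The functional form and the
    defaults for d and b are assumptions made for this illustration.
    """
    t_i, x_i, y_i, m_i = parent
    t_j, x_j, y_j, _ = child
    if t_i >= t_j:
        return math.inf  # only earlier events can be parents
    r = math.hypot(x_j - x_i, y_j - y_i)
    return (t_j - t_i) * (r ** d) * 10 ** (-b * m_i)

def nearest_neighbor_tree(events):
    """Return, for each event, (index of nearest earlier event, distance)."""
    parents = []
    for j, child in enumerate(events):
        best, best_dist = None, math.inf
        for i, candidate in enumerate(events):
            if i == j:
                continue
            dist = nn_distance(candidate, child)
            if dist < best_dist:
                best, best_dist = i, dist
        parents.append((best, best_dist))
    return parents

# Toy catalog (assumed values), ordered by time: (time, x, y, magnitude).
events = [
    (0.0, 0.0, 0.0, 6.0),
    (1.0, 0.1, 0.0, 3.0),
    (2.0, 0.1, 0.1, 2.5),
    (50.0, 5.0, 5.0, 4.0),
    (51.0, 5.1, 5.0, 2.0),
]

parents = nearest_neighbor_tree(events)
```

    Because every link points from a strictly earlier event to a later one, the resulting graph cannot contain cycles; the earliest event has no parent and acts as the root. Links with unusually small distances are the natural candidates for clustered (aftershock-like) events, whereas long links suggest background seismicity.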

  1. When the wind goes out of the sail - declining recovery expectations in the first weeks of back pain.

    PubMed

    Carstens, J K P; Shaw, W S; Boersma, K; Reme, S E; Pransky, G; Linton, S J

    2014-02-01

    Expectations for recovery are a known predictor for returning to work. Most studies seem to conclude that the higher the expectancy the better the outcome. However, the development of expectations over time is rarely researched and experimental studies show that realistic expectations rather than high expectancies are the most adaptive. This study aims to explore patterns of stability and change in expectations for recovery during the first weeks of a back-pain episode and how these patterns relate to other psychological variables and outcome. The study included 496 volunteer patients seeking treatment for work-related, acute back pain. The participants were measured with self-report scales of depression, fear of pain, life impact of pain, catastrophizing and expectations for recovery at two time points. A follow-up focusing on recovery and return to work was conducted 3 months later. A cluster analysis was conducted, categorizing the data on the trajectories of recovery expectations. Cluster analysis revealed four clusters regarding the development of expectations for recovery during a 2-week period after pain onset. Three out of four clusters showed stability in their expectations as well as corresponding levels of proximal psychological factors. The fourth cluster showed increases in distress and a decrease in expectations for recovery. This cluster also has poor odds ratios for returning to work and recovery. Decreases in expectancies for recovery seem as important as baseline values in terms of outcome, which has clinical and theoretical implications. © 2013 European Pain Federation - EFIC®

  2. Determination of Arctic sea ice variability modes on interannual timescales via nonhierarchical clustering

    NASA Astrophysics Data System (ADS)

    Fučkar, Neven-Stjepan; Guemas, Virginie; Massonnet, François; Doblas-Reyes, Francisco

    2015-04-01

    Over the modern observational era, the northern hemisphere sea ice concentration, age and thickness have experienced a sharp long-term decline superimposed with strong internal variability. Hence, there is a crucial need to identify robust patterns of Arctic sea ice variability on interannual timescales and disentangle them from the long-term trend in noisy datasets. The principal component analysis (PCA) is a versatile and broadly used method for the study of climate variability. However, the PCA has several limiting aspects because it assumes that all modes of variability have symmetry between positive and negative phases, and suppresses nonlinearities by using a linear covariance matrix. Clustering methods offer an alternative set of dimension reduction tools that are more robust and capable of taking into account possible nonlinear characteristics of a climate field. Cluster analysis aggregates data into groups or clusters based on their distance, to simultaneously minimize the distance between data points in a given cluster and maximize the distance between the centers of the clusters. We extract modes of Arctic interannual sea-ice variability with nonhierarchical K-means cluster analysis and investigate the mechanisms leading to these modes. Our focus is on the sea ice thickness (SIT) as the base variable for clustering because SIT holds most of the climate memory for variability and predictability on interannual timescales. We primarily use global reconstructions of sea ice fields with a state-of-the-art ocean-sea-ice model, but we also verify the robustness of determined clusters in other Arctic sea ice datasets. Applied cluster analysis over the 1958-2013 period shows that the optimal number of detrended SIT clusters is K=3. Determined SIT cluster patterns and their time series of occurrence are rather similar between different seasons and months. Two opposite thermodynamic modes are characterized with prevailing negative or positive SIT anomalies over the Arctic basin. The intermediate mode, with negative anomalies centered on the East Siberian shelf and positive anomalies along the North American side of the basin, has predominately dynamic characteristics. The associated sea ice concentration (SIC) clusters vary more between different seasons and months, but the SIC patterns are physically framed by the SIT cluster patterns.

  3. Structure and Stability of GeAu$$_{n}$$, n = 1-10 clusters: A Density Functional Study

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Priyanka; Dharamvir, Keya; Sharma, Hitesh

    2011-12-12

    The structures of germanium doped gold clusters GeAu$$_{n}$$ (n = 1-10) have been investigated using ab initio calculations based on density functional theory (DFT). We have obtained ground state geometries of GeAu$$_{n}$$ clusters and have compared them with silicon doped gold clusters and pure gold clusters. The ground state geometries of the GeAu$$_{n}$$ clusters show patterns similar to silicon doped gold clusters except for n = 5, 6 and 9. The introduction of a germanium atom increases the binding energy of gold clusters. The binding energy per atom of the germanium doped cluster is smaller than that of the corresponding silicon doped gold cluster. The HOMO-LUMO gap for Au$$_{n}$$Ge clusters has been found to vary between 0.46 eV and 2.09 eV. The Mulliken charge analysis indicates that a charge of the order of 0.1e always transfers from the germanium atom to the gold atom.

  4. [Study on HPLC fingerprint of Oldenlandia diffusa].

    PubMed

    Chen, Yan; Yao, Zhi-Hong; Dai, Yi; Cheng, Hong; Wen, Li-Rong; Zhou, Guang-Xiong; Yao, Xin-Sheng

    2012-06-01

    To establish the HPLC fingerprint chromatogram of Oldenlandia diffusa coupled with chemometrics means for the quality control of multi-batches of medicinal material. The separation was developed on a C18 column (4.6 mm x 250 mm, 5 μm) by gradient elution with acetonitrile-water (both containing 0.1‰ (V/V) acetic acid) as mobile phase at a flow rate of 0.8 mL/min, with the detection wavelength at 238 nm and the column temperature at 30 °C. The HPLC fingerprint chromatogram of Oldenlandia diffusa was set up and the main characteristic peaks were identified by comparison with chemical reference substances. The quality of 22 batches of medicinal material was evaluated by similarity assay as well as principal component analysis (PCA) and cluster analysis. The established HPLC fingerprint chromatogram of Oldenlandia diffusa was specific, precise, reproducible and stable. Eleven peaks were chemically identified. The similarity of the 17 batches of Oldenlandia diffusa was clearly higher than that of the 5 batches of adulterants. PCA showed that the 17 batches of Oldenlandia diffusa were in one domain and the 5 batches of adulterants were far apart from the domain. The cluster analysis of the 22 batches of medicinal material showed that the 17 batches of Oldenlandia diffusa were in one cluster while the 5 batches of adulterants were excluded. Further cluster analysis was carried out for the quality consistency of the 17 batches of Oldenlandia diffusa and accordingly they were divided into 4 clusters. With the combination of chemometrics means, the HPLC fingerprint chromatogram provides a method for evaluation of authenticity and quality control of Oldenlandia diffusa, which is favorable to improve overall quality control of Oldenlandia diffusa.

  5. Quantifying the impact of fixed effects modeling of clusters in multiple imputation for cluster randomized trials

    PubMed Central

    Andridge, Rebecca R.

    2011-01-01

    In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller ICCs lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random (MCAR), and cases in which data are missing at random (MAR) are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared. PMID:21259309

  6. Do beef risk perceptions or risk attitudes have a greater effect on the beef purchase decisions of Canadian consumers?

    PubMed

    Yang, Jun; Goddard, Ellen

    2011-01-01

    Cluster analysis is applied in this study to group Canadian households by two characteristics, their risk perceptions and risk attitudes toward beef. There are some similarities in demographic profiles, meat purchases, and bovine spongiform encephalopathy (BSE) media recall between the cluster that perceives beef to be the most risky and the cluster that has little willingness to accept the risks of eating beef. There are similarities between the medium risk perception cluster and the medium risk attitude cluster, as well as between the cluster that perceives beef to have little risk and the cluster that is most willing to accept the risks of eating beef. Regression analysis shows that risk attitudes have a larger impact on household-level beef purchasing decisions than do risk perceptions for all consumer clusters. This implies that it may be more effective to undertake policies that reduce the risks associated with eating beef, instead of enhancing risk communication to improve risk perceptions. Only for certain clusters with higher willingness to accept the risks of eating beef might enhancing risk communication increase beef consumption significantly. The different roles of risk perceptions and risk attitudes in beef consumption need to be recognized during the design of risk management policies.

  7. A New Variable Weighting and Selection Procedure for K-Means Cluster Analysis

    ERIC Educational Resources Information Center

    Steinley, Douglas; Brusco, Michael J.

    2008-01-01

    A variance-to-range ratio variable weighting procedure is proposed. We show how this weighting method is theoretically grounded in the inherent variability found in data exhibiting cluster structure. In addition, a variable selection procedure is proposed to operate in conjunction with the variable weighting technique. The performances of these…

  8. Order-Constrained Solutions in K-Means Clustering: Even Better than Being Globally Optimal

    ERIC Educational Resources Information Center

    Steinley, Douglas; Hubert, Lawrence

    2008-01-01

    This paper proposes an order-constrained K-means cluster analysis strategy, and implements that strategy through an auxiliary quadratic assignment optimization heuristic that identifies an initial object order. A subsequent dynamic programming recursion is applied to optimally subdivide the object set subject to the order constraint. We show that…

  9. Clustered Numerical Data Analysis Using Markov Lie Monoid Based Networks

    NASA Astrophysics Data System (ADS)

    Johnson, Joseph

    2016-03-01

    We have designed and built an optimal numerical standardization algorithm that links numerical values with their associated units, error level, and defining metadata, thus supporting automated data exchange and new levels of artificial intelligence (AI). The software manages all dimensional and error analysis and computational tracing. Tables of entities versus properties of these generalized numbers (called "metanumbers") support a transformation of each table into a network among the entities and another network among their properties where the network connection matrix is based upon a proximity metric between the two items. We previously proved that every network is isomorphic to the Lie algebra that generates continuous Markov transformations. We have also shown that the eigenvectors of these Markov matrices provide an agnostic clustering of the underlying patterns. We will present this methodology and show how our new work on conversion of scientific numerical data through this process can reveal underlying information clusters ordered by the eigenvalues. We will also show how the linking of clusters from different tables can be used to form a "supernet" of all numerical information supporting new initiatives in AI.

  10. Analysis of ligand-protein exchange by Clustering of Ligand Diffusion Coefficient Pairs (CoLD-CoP)

    NASA Astrophysics Data System (ADS)

    Snyder, David A.; Chantova, Mihaela; Chaudhry, Saadia

    2015-06-01

    NMR spectroscopy is a powerful tool in describing protein structures and protein activity for pharmaceutical and biochemical development. This study describes a method to determine weak binding ligands in biological systems by using hierarchical diffusion coefficient clustering of multidimensional data obtained with a 400 MHz Bruker NMR. Comparison of DOSY spectra of ligands of the chemical library in the presence and absence of target proteins shows changes in translational diffusion rates for small molecules upon interaction with macromolecules. For weak binders such as compounds found in fragment libraries, changes in diffusion rates upon macromolecular binding are on the order of the precision of DOSY diffusion measurements, and identifying such subtle shifts in diffusion requires careful statistical analysis. The "CoLD-CoP" (Clustering of Ligand Diffusion Coefficient Pairs) method presented here uses SAHN clustering to identify protein-binders in a chemical library or even a not fully characterized metabolite mixture. We will show how DOSY NMR and the "CoLD-CoP" method complement each other in identifying the most suitable candidates for lysozyme and wheat germ acid phosphatase.
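
    The clustering of diffusion coefficient pairs can be sketched as follows. SAHN (sequential, agglomerative, hierarchical, non-overlapping) clustering covers several linkage rules, and the abstract does not say which one is used, so the average linkage below is an assumption. The coefficient pairs are invented values in arbitrary units, each given as (without protein, with protein).

```python
import math

def average_linkage(points, n_clusters):
    """Agglomerative clustering with average linkage.

    Starts with one cluster per point and repeatedly merges the two clusters
    with the smallest mean pairwise distance until n_clusters remain.
    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def mean_distance(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        candidates = [
            (mean_distance(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        ]
        _, p, q = min(candidates)
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected
    return clusters

# Invented diffusion coefficient pairs: (without protein, with protein).
pairs = [
    (5.0, 5.0), (5.2, 5.1), (4.9, 4.95),  # diffusion essentially unchanged
    (5.0, 4.0), (5.1, 4.1), (4.8, 3.9),   # diffusion slowed with protein present
]

clusters = average_linkage(pairs, n_clusters=2)
```

    In this toy data the ligands whose diffusion slows in the presence of protein separate from those whose diffusion is unchanged. With real measurements the shifts are comparable to the measurement precision, which is why the abstract stresses careful statistical analysis.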

  11. Dengue Fever Occurrence and Vector Detection by Larval Survey, Ovitrap and MosquiTRAP: A Space-Time Clusters Analysis

    PubMed Central

    de Melo, Diogo Portella Ornelas; Scherrer, Luciano Rios; Eiras, Álvaro Eduardo

    2012-01-01

    The use of vector surveillance tools for preventing dengue disease requires fine assessment of risk, in order to improve vector control activities. Nevertheless, the thresholds between vector detection and dengue fever occurrence are currently not well established. In Belo Horizonte (Minas Gerais, Brazil), dengue has been endemic for several years. From January 2007 to June 2008, the dengue vector Aedes (Stegomyia) aegypti was monitored by ovitrap, the sticky-trap MosquiTRAP™ and larval surveys in a study area in Belo Horizonte. Using a space-time scan for cluster detection implemented in SaTScan software, the vector presence recorded by the different monitoring methods was evaluated. Clusters of vectors and dengue fever were detected. It was verified that ovitrap and MosquiTRAP vector detection methods predicted dengue occurrence better than larval survey, both spatially and temporally. MosquiTRAP and ovitrap presented similar results of space-time intersections to dengue fever clusters. Nevertheless, ovitrap clusters presented longer duration periods than MosquiTRAP ones, less accurately signaling the dengue risk areas, since the detection of vector clusters during most of the study period was not necessarily correlated to dengue fever occurrence. It was verified that ovitrap clusters occurred more than 200 days (values ranged from 97.0±35.35 to 283.0±168.4 days) before dengue fever clusters, whereas MosquiTRAP clusters preceded dengue fever clusters by approximately 80 days (values ranged from 65.5±58.7 to 94.0±14.3 days), the latter proving to be more temporally precise. Thus, in the present cluster analysis study MosquiTRAP presented superior results for signaling dengue transmission risks both geographically and temporally. Since early detection is crucial for planning and deploying effective prevention, MosquiTRAP proved to be a reliable tool and this method provides groundwork for the development of even more precise tools. PMID:22848729

  12. A new approach for evaluating flexible working hours.

    PubMed

    Giebel, Ole; Janssen, Daniela; Schomann, Carsten; Nachreiner, Friedhelm

    2004-01-01

    Recent studies on flexible working hours show that at least some of these working time arrangements seem to be associated with impairing effects on health and well-being. According to available evidence, variability of working hours seems to play an important role. The question, however, is how this variability can be assessed and used to explain or predict impairments. Based on earlier methods used to assess shift-work effects, a time series analysis approach was applied to the matter of flexible working hours. Data on working hours over 4 weeks from 137 respondents, derived from a survey on flexible work hours involving 15 companies of different production and service sectors in Germany, were converted to time series and analyzed by spectral analysis. A cluster analysis of the resulting power spectra yielded 5 clusters of flexible work hours. Analyzing these clusters for differences in reported impairments showed that workers who showed suppression of circadian and weekly rhythms experienced the severest impairments, especially in circadian-controlled functions like sleep and digestion. The results thus indicate that analyzing the periodicity of flexible working hours seems to be a promising approach for predicting impairments which should be investigated further in the future.

  13. High-Resolution CCD Spectra of Stars in Globular Clusters. IX. The "Young" Clusters Ruprecht 106 and PAL 12

    NASA Astrophysics Data System (ADS)

    Brown, Jeffrey A.; Wallerstein, George; Zucker, Daniel

    1997-07-01

    We have performed a spectroscopic abundance analysis of two stars each in the anomalously young globular clusters Rup 106 and Pal 12. We find [Fe/H] ≈ -1.45 for Rup 106 and -1.0 for Pal 12. The abundance ratios in both clusters are peculiar in comparison to other globulars: the alpha-elements are not enhanced over the solar ratio. We find that oxygen in Rup 106 is also relatively low, with [O/Fe] ≈ 0.0 to +0.1. The similarity of the ratio of the alpha-elements to iron to the solar ratio shows that species contributed by supernovae of type Ia have "caught up" with species produced by SN II's. The similar contributions of the alpha- and Fe-peak species to disk stars shows that age, not metallicity, is the determining factor in the ratio of SN II/SN Ia nucleosynthesis. Galactic enrichment models show that these abundance ratios can be understood as being the result of these two clusters coming from an environment with multiple discontinuous star formation events.

  14. Cluster analysis of quantitative parametric maps from DCE-MRI: application in evaluating heterogeneity of tumor response to antiangiogenic treatment.

    PubMed

    Longo, Dario Livio; Dastrù, Walter; Consolino, Lorena; Espak, Miklos; Arigoni, Maddalena; Cavallo, Federica; Aime, Silvio

    2015-07-01

    The objective of this study was to compare a clustering approach to conventional analysis methods for assessing changes in pharmacokinetic parameters obtained from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) during antiangiogenic treatment in a breast cancer model. BALB/c mice bearing established transplantable her2+ tumors were treated with a DNA-based antiangiogenic vaccine or with an empty plasmid (untreated group). DCE-MRI was carried out at 1T by administering a dose of 0.05 mmol/kg of Gadocoletic acid trisodium salt, a Gd-based blood pool contrast agent (CA). Changes in pharmacokinetic estimates (K(trans) and vp) over a nine-day interval were compared between treated and untreated groups in a voxel-by-voxel analysis. The tumor response to therapy was assessed by a clustering approach and compared with conventional summary statistics, with sub-region analysis and with histogram analysis. Both the K(trans) and vp estimates, following blood-pool CA injection, showed marked and spatially heterogeneous changes with antiangiogenic treatment. Averaged values for the whole tumor region, as well as those from the rim/core sub-region analysis, were unable to assess the antiangiogenic response. Histogram analysis resulted in significant changes only in the vp estimates (p<0.05). The proposed clustering approach depicted marked changes in both the K(trans) and vp estimates, with significant spatial heterogeneity in vp maps in response to treatment (p<0.05), provided that DCE-MRI data are properly clustered in three or four sub-regions. This study demonstrated the value of cluster analysis applied to pharmacokinetic DCE-MRI parametric maps for assessing tumor response to antiangiogenic therapy. Copyright © 2015 Elsevier Inc. All rights reserved.

  15. Analysis of radiation-induced small Cu particle cluster formation in aqueous CuCl2

    USGS Publications Warehouse

    Jayanetti, Sumedha; Mayanovic, Robert A.; Anderson, Alan J.; Bassett, William A.; Chou, I.-Ming

    2001-01-01

    Radiation-induced small Cu particle cluster formation in aqueous CuCl2 was analyzed. It was noticed that nearest neighbor distance increased with the increase in the time of irradiation. This showed that the clusters approached the lattice dimension of bulk copper. As the average cluster size approached its bulk dimensions, an increase in the nearest neighbor coordination number was found with the decrease in the surface to volume ratio. Radiolysis of water by the incident x-ray beam led to the reduction of copper ions in the solution to the metallic state.

  16. Spatial distribution and ecological risk assessment of heavy metal on surface sediment in west part of Java Sea

    NASA Astrophysics Data System (ADS)

    Effendi, Hefni; Wardiatno, Yusli; Kawaroe, Mujizat; Mursalin; Fauzia Lestari, Dea

    2017-01-01

    Surface sediments from the west part of the Java Sea were analysed to evaluate the spatial distribution and ecological risk potential of heavy metals (Hg, As, Cd, Cr, Cu, Pb, Zn and Ni). The samples were taken from surface sediment (<0.5 m) at water depths from 26 m to 80 m with an Ekman grab. The average material composition of the sediment samples was clay (9.86%), sand (8.57%) and mud sand (81.57%). The analysis showed that Pb (11.2%), Cd (49.7%), and Ni (59.5%) exceeded the Probable Effect Level (PEL). Based on ecological risk analysis, Cd ($E_r^i$: 300.64) and Cr ($E_r^i$: 0.02) were categorized as high risk and low risk, respectively. The ecological risk potential sequence in this study was Cd>Hg>Pb>Ni>Cu>As>Zn>Cr. Furthermore, the multivariate statistical analysis shows that the correlations among heavy metals (As/Ni, Cd/Ni, and Cu/Zn) and between heavy metals and the Risk Index (Cd/Ri and Ni/Ri) were positive at significance level p<0.05. The total variance explained by the factor analysis was 80.04%, distributed over 3 factors (eigenvalues >1). In the cluster analysis, Cd, Ni, Pb were identified as having a fairly high contamination level (cluster 1), Hg a moderate contamination level (cluster 2) and Cu, Zn, Cr a lower contamination level (cluster 3).

  17. Symptom clustering and quality of life in patients with ovarian cancer undergoing chemotherapy.

    PubMed

    Nho, Ju-Hee; Reul Kim, Sung; Nam, Joo-Hyun

    2017-10-01

    The symptom clusters in patients with ovarian cancer undergoing chemotherapy have not been well evaluated. We investigated the symptom clusters and effects of symptom clusters on the quality of life of patients with ovarian cancer. We recruited 210 ovarian cancer patients being treated with chemotherapy and used a descriptive cross-sectional study design to collect information on their symptoms. To determine inter-relationships among symptoms, a principal component analysis with varimax rotation was performed based on the patient's symptoms (fatigue, pain, sleep disturbance, chemotherapy-induced peripheral neuropathy, anxiety, depression, and sexual dysfunction). All patients had experienced at least two domains of concurrent symptoms, and there were two types of symptom clusters. The first symptom cluster consisted of anxiety, depression, fatigue, and sleep disturbance symptoms, while the second symptom cluster consisted of pain and chemotherapy-induced peripheral neuropathy symptoms. Our subgroup cluster analysis showed that ovarian cancer patients with higher-scoring symptoms had significantly poorer quality of life in both symptom cluster 1 and 2 subgroups, with subgroup-specific patterns. The symptom clusters were different depending on age, age at disease onset, disease duration, recurrence, and performance status of patients with ovarian cancer. In addition, ovarian cancer patients experienced different symptom clusters according to cancer stage. The current study demonstrated that there is a specific pattern of symptom clusters, and symptom clusters negatively influence the quality of life in patients with ovarian cancer. Identifying symptom clusters of ovarian cancer patients may have clinical implications in improving symptom management. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Using Cluster Bootstrapping to Analyze Nested Data With a Few Clusters.

    PubMed

    Huang, Francis L

    2018-04-01

    Cluster randomized trials involving participants nested within intact treatment and control groups are commonly performed in various educational, psychological, and biomedical studies. However, recruiting and retaining intact groups present various practical, financial, and logistical challenges to evaluators, and cluster randomized trials are often performed with a low number of clusters (~20 groups). Although multilevel models are often used to analyze nested data, researchers may be concerned about potentially biased results due to having only a few groups under study. Cluster bootstrapping has been suggested as an alternative procedure when analyzing clustered data, though it has seen very little use in educational and psychological studies. Using a Monte Carlo simulation that varied the number of clusters, average cluster size, and intraclass correlations, we compared standard errors using cluster bootstrapping with those derived using ordinary least squares regression and multilevel models. Results indicate that cluster bootstrapping, though more computationally demanding, can be used as an alternative procedure for the analysis of clustered data when treatment effects at the group level are of primary interest. Supplementary material showing how to perform cluster bootstrapped regressions using R is also provided.
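
    The idea of cluster bootstrapping can be sketched as follows. This is not the R code from the article's supplementary material: it is a simplified Python illustration in which whole clusters are resampled with replacement within each arm (an assumption) and the estimate is a plain difference in means rather than a regression coefficient. The function names and the synthetic data are hypothetical.

```python
import random
import statistics

def mean_difference(clusters):
    """Treated-minus-control difference in mean outcome, pooling individuals."""
    treated = [y for arm, ys in clusters if arm == 1 for y in ys]
    control = [y for arm, ys in clusters if arm == 0 for y in ys]
    return statistics.fmean(treated) - statistics.fmean(control)

def cluster_bootstrap_se(clusters, n_boot=2000, seed=0):
    """Standard error from resampling whole clusters, separately within each arm."""
    rng = random.Random(seed)
    treated = [c for c in clusters if c[0] == 1]
    control = [c for c in clusters if c[0] == 0]
    estimates = []
    for _ in range(n_boot):
        sample = (rng.choices(treated, k=len(treated))
                  + rng.choices(control, k=len(control)))
        estimates.append(mean_difference(sample))
    return statistics.stdev(estimates)

# Synthetic trial (an assumption): 20 clusters of 15 participants, a shared
# random effect per cluster, and a true treatment effect of 0.5.
rng = random.Random(42)
clusters = []
for c in range(20):
    arm = c % 2
    cluster_effect = rng.gauss(0, 1)
    outcomes = [0.5 * arm + cluster_effect + rng.gauss(0, 1) for _ in range(15)]
    clusters.append((arm, outcomes))

estimate = mean_difference(clusters)
se = cluster_bootstrap_se(clusters)
```

    Because entire clusters are drawn together, the within-cluster correlation is preserved in every bootstrap sample, which is what distinguishes this from resampling individual participants.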

  19. A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set

    PubMed Central

    Peng, Yi; Zhang, Yong; Kou, Gang; Shi, Yong

    2012-01-01

    Determining the number of clusters in a data set is an essential yet difficult step in cluster analysis. Since this task involves more than one criterion, it can be modeled as a multiple criteria decision making (MCDM) problem. This paper proposes a multiple criteria decision making (MCDM)-based approach to estimate the number of clusters for a given data set. In this approach, MCDM methods consider different numbers of clusters as alternatives and the outputs of any clustering algorithm on validity measures as criteria. The proposed method is examined by an experimental study using three MCDM methods, the well-known clustering algorithm–k-means, ten relative measures, and fifteen public-domain UCI machine learning data sets. The results show that MCDM methods work fairly well in estimating the number of clusters in the data and outperform the ten relative measures considered in the study. PMID:22870181

  20. Raman spectroscopy of normal oral buccal mucosa tissues: study on intact and incised biopsies

    NASA Astrophysics Data System (ADS)

    Deshmukh, Atul; Singh, S. P.; Chaturvedi, Pankaj; Krishna, C. Murali

    2011-12-01

    Oral squamous cell carcinoma is among the top 10 malignancies. Optical spectroscopy, including Raman, is being actively pursued as an alternative/adjunct for cancer diagnosis. Earlier studies have demonstrated the feasibility of classifying normal, premalignant, and malignant oral ex vivo tissues. Spectral features showed predominance of lipids and proteins in normal and cancer conditions, respectively, which were attributed to membrane lipids and surface proteins. In view of recent developments in deep tissue Raman spectroscopy, we have recorded Raman spectra from superior and inferior surfaces of 10 normal oral tissues on intact, as well as incised, biopsies after separation of epithelium from connective tissue. Spectral variations and similarities among different groups were explored by unsupervised (principal component analysis) and supervised (linear discriminant analysis, factorial discriminant analysis) methodologies. Clusters of spectra from superior and inferior surfaces of intact tissues show a high overlap, whereas spectra from separated epithelium and connective tissue sections yielded clear clusters, though they also overlap with clusters of intact tissues. Spectra of all four groups of normal tissues gave exclusive clusters when tested against malignant spectra. Thus, this study demonstrates that spectra recorded from the superior surface of an intact tissue may have contributions from deeper layers, but that this has no bearing on the classification of malignant tissues.

  1. Limits on turbulent propagation of energy in cool-core clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Bambic, C. J.; Pinto, C.; Fabian, A. C.; Sanders, J.; Reynolds, C. S.

    2018-07-01

    We place constraints on the propagation velocity of bulk turbulence within the intracluster medium of three clusters and an elliptical galaxy. Using Reflection Grating Spectrometer measurements of turbulent line broadening, we show that for these clusters, the 90 per cent upper limit on turbulent velocities when accounting for instrumental broadening is too low to propagate energy radially to the cooling radius of the clusters within the required cooling time. In this way, we extend previous Hitomi-based analysis on the Perseus cluster to more clusters, with the intention of applying these results to a future, more extensive catalogue. These results constrain models of turbulent heating in active galactic nucleus feedback by requiring a mechanism which can not only provide sufficient energy to offset radiative cooling but also resupply that energy rapidly enough to balance cooling at each cluster radius.

  2. Limits on turbulent propagation of energy in cool-core clusters of galaxies

    NASA Astrophysics Data System (ADS)

    Bambic, C. J.; Pinto, C.; Fabian, A. C.; Sanders, J.; Reynolds, C. S.

    2018-04-01

    We place constraints on the propagation velocity of bulk turbulence within the intracluster medium of three clusters and an elliptical galaxy. Using Reflection Grating Spectrometer measurements of turbulent line broadening, we show that for these clusters, the 90% upper limit on turbulent velocities when accounting for instrumental broadening is too low to propagate energy radially to the cooling radius of the clusters within the required cooling time. In this way, we extend previous Hitomi-based analysis on the Perseus cluster to more clusters, with the intention of applying these results to a future, more extensive catalog. These results constrain models of turbulent heating in AGN feedback by requiring a mechanism which can not only provide sufficient energy to offset radiative cooling, but resupply that energy rapidly enough to balance cooling at each cluster radius.

  3. The relationship between a low grain intake dietary pattern and impulsive behaviors in middle-aged Japanese people.

    PubMed

    Toyomaki, Atsuhito; Koga, Minori; Okada, Emiko; Nakai, Yukiei; Miyazaki, Akane; Tamakoshi, Akiko; Kiso, Yoshinobu; Kusumi, Ichiro

    2017-01-01

    Several studies indicate that dietary habits are associated with mental health. We are interested in identifying not a specific single nutrient/food group but the population preferring specific food combinations that can be related to mental health. Very few studies have examined relationships between dietary patterns and multifaceted mental states using cluster analysis. The purpose of this study was to investigate population-level dietary patterns associated with mental state using cluster analysis. We focused on depressive state, sleep quality, subjective well-being, and impulsive behaviors using rating scales. Two hundred and seventy-nine Japanese middle-aged people participated in the present study. Dietary pattern was estimated using a brief self-administered diet-history questionnaire (the BDHQ). We conducted K-means cluster analysis using thirteen BDHQ food groups: milk, meat, fish, egg, pulses, potatoes, green and yellow vegetables, other vegetables, mushrooms, seaweed, sweets, fruits, and grain. We identified three clusters characterized as "vegetable and fruit dominant," "grain dominant," and "low grain tendency" subgroups. The vegetable and fruit dominant group showed increases in several aspects of subjective well-being demonstrated by the SF-8. Differences in mean subject characteristics across clusters were tested using ANOVA. The low frequency intake of grain group showed higher impulsive behavior, demonstrated by BIS-11 deliberation and sum scores. The present study demonstrated that traditional Japanese dietary patterns, such as eating rice, can help with beneficial changes in mental health.
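
    The K-means step named in this abstract can be sketched in a few lines. This is a generic illustration of Lloyd's algorithm, not the study's analysis: the function `kmeans`, the three-variable intake profiles (grain, vegetables, fruit), and the choice of initial centers are all hypothetical.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and center updates."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous center if a cluster becomes empty
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical daily intake profiles: (grain, vegetables, fruit) servings.
profiles = [
    (5.0, 1.0, 0.5), (4.5, 1.2, 0.4),   # grain dominant
    (1.0, 4.0, 3.0), (1.2, 4.5, 2.8),   # vegetable and fruit dominant
    (0.5, 1.0, 0.5), (0.6, 1.1, 0.4),   # low grain tendency
]
labels, centers = kmeans(profiles, [profiles[0], profiles[2], profiles[4]])
```

    In practice the food-group variables would usually be standardized first, and the result depends on the initial centers, so several random starts are commonly compared.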

  4. The relationship between a low grain intake dietary pattern and impulsive behaviors in middle-aged Japanese people

    PubMed Central

    Toyomaki, Atsuhito; Koga, Minori; Okada, Emiko; Nakai, Yukiei; Miyazaki, Akane; Tamakoshi, Akiko; Kiso, Yoshinobu; Kusumi, Ichiro

    2017-01-01

    Several studies indicate that dietary habits are associated with mental health. We are interested in identifying not a specific single nutrient/food group but the population preferring specific food combinations that can be related to mental health. Very few studies have examined relationships between dietary patterns and multifaceted mental states using cluster analysis. The purpose of this study was to investigate population-level dietary patterns associated with mental state using cluster analysis. We focused on depressive state, sleep quality, subjective well-being, and impulsive behaviors using rating scales. Two hundred and seventy-nine Japanese middle-aged people participated in the present study. Dietary pattern was estimated using a brief self-administered diet-history questionnaire (the BDHQ). We conducted K-means cluster analysis using thirteen BDHQ food groups: milk, meat, fish, egg, pulses, potatoes, green and yellow vegetables, other vegetables, mushrooms, seaweed, sweets, fruits, and grain. We identified three clusters characterized as “vegetable and fruit dominant,” “grain dominant,” and “low grain tendency” subgroups. The vegetable and fruit dominant group showed increases in several aspects of subjective well-being demonstrated by the SF-8. Differences in mean subject characteristics across clusters were tested using ANOVA. The low frequency intake of grain group showed higher impulsive behavior, demonstrated by BIS-11 deliberation and sum scores. The present study demonstrated that traditional Japanese dietary patterns, such as eating rice, can help with beneficial changes in mental health. PMID:28704469

  5. Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

    PubMed Central

    Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

    2013-01-01

    Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674

  6. Competing Effects Between Screen Media Time and Physical Activity in Adolescent Girls: Clustering a Self-Organizing Maps Analysis.

    PubMed

    Valencia-Peris, Alexandra; Devís-Devís, José; García-Massó, Xavier; Lizandra, Jorge; Pérez-Gimeno, Esther; Peiró-Velert, Carmen

    2016-06-01

    Previous research shows contradictory findings on potential competing effects between sedentary screen media usage (SMU) and physical activity (PA). This study examined these effects on adolescent girls via self-organizing maps analysis focusing on 3 target profiles. A sample of 1,516 girls aged 12 to 18 years self-reported daily time engagement in PA (moderate and vigorous intensity) and in screen media activities (TV/video/DVD, computer, and videogames), separately and combined. Topological interrelationships from the 13 emerging maps indicated a moderate competing effect between physically active and sedentary SMU patterns. Higher SES and overweight status were linked to either active or inactive behaviors. Three target clusters were explored in more detail. Cluster 1, named temperate-media actives, showed capabilities of being active while engaging in a moderate level of SMU (TV/video/DVD mainly). In Cluster 2, named prudent-media inactives, and Cluster 3, named compulsive-media inactives, a competing effect between SMU and PA emerged, with sedentary SMU behaviors being responsible for a low involvement in active pursuits. SMU and PA emerge as both related and independent behaviors in girls, resulting in a moderate competing effect. Findings support the case for recommending the timing of PA and SMU for recreational purposes considering different profiles, sociodemographic factors and types of SMU.

  7. The history of introduction of the African baobab (Adansonia digitata, Malvaceae: Bombacoideae) in the Indian subcontinent

    PubMed Central

    Bell, Karen L.; Rangan, Haripriya; Kull, Christian A.; Murphy, Daniel J.

    2015-01-01

    To investigate the pathways of introduction of the African baobab, Adansonia digitata, to the Indian subcontinent, we examined 10 microsatellite loci in individuals from Africa, India, the Mascarenes and Malaysia, and matched this with historical evidence of human interactions between source and destination regions. Genetic analysis showed broad congruence of African clusters with biogeographic regions except along the Zambezi (Mozambique) and Kilwa (Tanzania), where populations included a mixture of individuals assigned to at least two different clusters. Individuals from West Africa, the Mascarenes, southeast India and Malaysia shared a cluster. Baobabs from western and central India clustered separately from Africa. Genetic diversity was lower in populations from the Indian subcontinent than in African populations, but the former contained private alleles. Phylogenetic analysis showed Indian populations were closest to those from the Mombasa-Dar es Salaam coast. The genetic results provide evidence of multiple introductions of African baobabs to the Indian subcontinent over a longer time period than previously assumed. Individuals belonging to different genetic clusters in Zambezi and Kilwa may reflect the history of trafficking captives from inland areas to supply the slave trade between the fifteenth and nineteenth centuries. Baobabs in the Mascarenes, southeast India and Malaysia indicate introduction from West Africa through eighteenth and nineteenth century European colonial networks. PMID:26473060

  8. The history of introduction of the African baobab (Adansonia digitata, Malvaceae: Bombacoideae) in the Indian subcontinent.

    PubMed

    Bell, Karen L; Rangan, Haripriya; Kull, Christian A; Murphy, Daniel J

    2015-09-01

    To investigate the pathways of introduction of the African baobab, Adansonia digitata, to the Indian subcontinent, we examined 10 microsatellite loci in individuals from Africa, India, the Mascarenes and Malaysia, and matched this with historical evidence of human interactions between source and destination regions. Genetic analysis showed broad congruence of African clusters with biogeographic regions except along the Zambezi (Mozambique) and Kilwa (Tanzania), where populations included a mixture of individuals assigned to at least two different clusters. Individuals from West Africa, the Mascarenes, southeast India and Malaysia shared a cluster. Baobabs from western and central India clustered separately from Africa. Genetic diversity was lower in populations from the Indian subcontinent than in African populations, but the former contained private alleles. Phylogenetic analysis showed Indian populations were closest to those from the Mombasa-Dar es Salaam coast. The genetic results provide evidence of multiple introductions of African baobabs to the Indian subcontinent over a longer time period than previously assumed. Individuals belonging to different genetic clusters in Zambezi and Kilwa may reflect the history of trafficking captives from inland areas to supply the slave trade between the fifteenth and nineteenth centuries. Baobabs in the Mascarenes, southeast India and Malaysia indicate introduction from West Africa through eighteenth and nineteenth century European colonial networks.

  9. Complex networks as a unified framework for descriptive analysis and predictive modeling in climate

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Steinhaeuser, Karsten J K; Chawla, Nitesh; Ganguly, Auroop R

    The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim towards characterizing observed phenomena as well as discovering new knowledge in the climate domain. Specifically, we posit that complex networks are well-suited for both descriptive analysis and predictive modeling tasks. We show that the structural properties of climate networks have useful interpretation within the domain. Further, we extract clusters from these networks and demonstrate their predictive power as climate indices. Our experimental results establish that the network clusters are statistically significantly better predictors than clusters derived using a more traditional clustering approach. Using complex networks as data representation thus enables the unique opportunity for descriptive and predictive modeling to inform each other.

  10. Diversity in phenotypic and nutritional traits in vegetable amaranth (Amaranthus tricolor), a nutritionally underutilised crop.

    PubMed

    Shukla, Sudhir; Bhargava, Atul; Chatterjee, Avijeet; Pandey, Avinash Chandra; Mishra, Brij K

    2010-01-15

    Assessment of genetic diversity in a crop-breeding programme helps in the identification of diverse parental combinations to create segregating progenies with maximum genetic variability and facilitates introgression of desirable genes from diverse germplasm into the available genetic base. In the present study, 39 strains of vegetable amaranth (Amaranthus tricolor) were evaluated for eight morphological and seven quality traits for two test seasons to study the extent of genetic divergence among the strains. Multivariate analysis showed that the first four principal components contributed 67.55% of the variability. Cluster analysis grouped the strains into six clusters that displayed a wide range of diversity for most of the traits. Cluster analysis has proved to be an effective method in grouping strains that may facilitate effective management and utilisation in crop-breeding programmes. The diverse strains falling in different clusters were identified, which can be utilised in different hybridisation programmes to develop high-foliage-yielding varieties rich in nutritional components. Copyright (c) 2009 Society of Chemical Industry.

  11. Genetic diversity analysis of Capparis spinosa L. populations by using ISSR markers.

    PubMed

    Liu, C; Xue, G P; Cheng, B; Wang, X; He, J; Liu, G H; Yang, W J

    2015-12-09

    Capparis spinosa L. is an important medicinal species in the Xinjiang Province of China. Ten natural populations of C. spinosa from 3 locations in North, Central, and South Xinjiang were studied using morphological traits and inter simple sequence repeat (ISSR) molecular markers to assess the genetic diversity and population structure. In this study, the 10 ISSR primers produced 313 amplified DNA fragments, with 52% of fragments being polymorphic. Unweighted pair-group method with arithmetic average (UPGMA) cluster analysis indicated that 10 C. spinosa populations were clustered into 3 geographically distinct groups. Nei's gene diversity and Shannon's information index of C. spinosa populations in different regions ranged over 0.1312-0.2001 and 0.1004-0.1875, respectively. The 362 markers were used to construct the dendrogram based on the UPGMA cluster analysis. The dendrogram indicated that 10 populations of C. spinosa were clustered into 3 geographically distinct groups. The results showed these genotypes have high genetic diversity, and can be used for an alternative breeding program.

  12. Input frequency and lexical variability in phonological development: a survival analysis of word-initial cluster production.

    PubMed

    Ota, Mitsuhiko; Green, Sam J

    2013-06-01

    Although it has often been hypothesized that children learn to produce new sound patterns first in frequently heard words, the available evidence in support of this claim is inconclusive. To re-examine this question, we conducted a survival analysis of word-initial consonant clusters produced by three children in the Providence Corpus (0;11-4;0). The analysis took account of several lexical factors in addition to lexical input frequency, including the age of first production, production frequency, neighborhood density and number of phonemes. The results showed that lexical input frequency was a significant predictor of the age at which the accuracy level of cluster production in each word first reached 80%. The magnitude of the frequency effect differed across cluster types. Our findings indicate that some of the between-word variance found in the development of sound production can indeed be attributed to the frequency of words in the child's ambient language.

  13. Statistical analysis of atom probe data: detecting the early stages of solute clustering and/or co-segregation.

    PubMed

    Hyde, J M; Cerezo, A; Williams, T J

    2009-04-01

    Statistical analysis of atom probe data has improved dramatically in the last decade and it is now possible to determine the size, the number density and the composition of individual clusters or precipitates such as those formed in reactor pressure vessel (RPV) steels during irradiation. However, the characterisation of the onset of clustering or co-segregation is more difficult and has traditionally focused on the use of composition frequency distributions (for detecting clustering) and contingency tables (for detecting co-segregation). In this work, the authors investigate the possibility of directly examining the neighbourhood of each individual solute atom as a means of identifying the onset of solute clustering and/or co-segregation. The methodology involves comparing the mean observed composition around a particular type of solute with that expected from the overall composition of the material. The methodology has been applied to atom probe data obtained from several irradiated RPV steels. The results show that the new approach is more sensitive to fine scale clustering and co-segregation than that achievable using composition frequency distribution and contingency table analyses.

  14. Electrical Load Profile Analysis Using Clustering Techniques

    NASA Astrophysics Data System (ADS)

    Damayanti, R.; Abdullah, A. G.; Purnama, W.; Nandiyanto, A. B. D.

    2017-03-01

    Data mining is a family of data processing techniques for extracting information from stored data. Electricity consumption is recorded daily by the electrical company, usually at intervals of 15 or 30 minutes. This paper uses clustering, a data mining technique, to analyse the electrical load profiles recorded during 2014. Three clustering methods were compared, namely K-Means (KM), Fuzzy C-Means (FCM), and K-Harmonic Means (KHM). The results show that KHM is the most appropriate method for classifying the electrical load profiles. The optimum number of clusters is determined using the Davies-Bouldin Index. Grouping the load profiles makes it possible to analyse demand variation and to estimate energy loss for each group of load profiles with a similar pattern. From the groups of load profiles, the cluster load factor and a range of cluster loss factors can be obtained, which help to find the range of coefficient values for estimating energy loss without performing load flow studies.
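
    The Davies-Bouldin Index mentioned in this abstract can be computed as in the sketch below. It uses the common textbook form (Euclidean distance, mean distance to the centroid as the scatter measure), which is an assumption about the variant the paper used. The function names and the toy two-feature load profiles are hypothetical. Lower values indicate more compact, better separated clusters.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]

def davies_bouldin(clusters):
    """Davies-Bouldin index for a partition given as a list of point lists."""
    centers = [centroid(c) for c in clusters]
    # Scatter: average distance from each member to its cluster centroid.
    scatter = [sum(math.dist(p, ctr) for p in c) / len(c)
               for c, ctr in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max((scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
                     for j in range(k) if j != i)
    return total / k

# Toy load profiles (an assumption): two features per profile.
low = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
high = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]

good_partition = davies_bouldin([low, high])
bad_partition = davies_bouldin([[low[0], high[0], low[1]],
                                [low[2], high[1], high[2]]])
```

    To choose the number of clusters, the index would be evaluated for each candidate number and the partition with the smallest value retained.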

  15. Percolation analyses of observed and simulated galaxy clustering

    NASA Astrophysics Data System (ADS)

    Bhavsar, S. P.; Barrow, J. D.

    1983-11-01

    A percolation cluster analysis is performed on equivalent regions of the CFA redshift survey of galaxies and the 4000 body simulations of gravitational clustering made by Aarseth, Gott and Turner (1979). The observed and simulated percolation properties are compared and, unlike correlation and multiplicity function analyses, favour high density (Omega = 1) models with n = - 1 initial data. The present results show that the three-dimensional data are consistent with the degree of filamentary structure present in isothermal models of galaxy formation at the level of percolation analysis. It is also found that the percolation structure of the CFA data is a function of depth. Percolation structure does not appear to be a sensitive probe of intrinsic filamentary structure.
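
    As a rough illustration of a percolation cluster analysis, the sketch below links every pair of points closer than a chosen linking length and reports the connected groups. The function names, the brute-force pair search, and the use of the largest-cluster fraction as the summary statistic are assumptions for illustration; the abstract does not specify these details.

```python
import math

def percolation_clusters(points, link_length):
    """Group points into clusters by linking every pair separated by at most
    `link_length` (friends-of-friends), using a union-find structure.
    Returns lists of point indices, largest cluster first."""
    parent = list(range(len(points)))

    def find(i):
        # Path-halving find.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= link_length:
                parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(points)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=len, reverse=True)

def largest_cluster_fraction(points, link_length):
    """Fraction of all points that belong to the largest linked cluster."""
    return len(percolation_clusters(points, link_length)[0]) / len(points)
```

    Tracking how the largest-cluster fraction grows as the linking length increases gives one way to compare the percolation behaviour of an observed catalogue with that of a simulation.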

  16. Passion and intrinsic motivation in digital gaming.

    PubMed

    Wang, Chee Keng John; Khoo, Angeline; Liu, Woon Chia; Divaharan, Shanti

    2008-02-01

    Digital gaming is fast becoming a favorite activity all over the world. Yet very few studies have examined the underlying motivational processes involved in digital gaming. One motivational force that receives little attention in psychology is passion, which could help us understand the motivation of gamers. The purpose of the present study was to identify subgroups of young people with distinctive passion profiles on self-determined regulations, flow dispositions, affect, and engagement time in gaming. One hundred fifty-five students from two secondary schools in Singapore participated in the survey. There were 134 males and 8 females (13 unspecified). The participants completed a questionnaire to measure harmonious passion (HP), obsessive passion (OP), perceived locus of causality, disposition flow, positive and negative affects, and engagement time in gaming. Cluster analysis found three clusters with distinct passion profiles. The first cluster had an average HP/OP profile, the second cluster had a low HP/OP profile, and the third cluster had a high HP/OP profile. The three clusters displayed different levels of cognitive, affective, and behavioral outcomes. Cluster analysis, as this study shows, is useful in identifying groups of gamers with different passion profiles. It has helped us gain a deeper understanding of motivation in digital gaming.

  17. Microcolumn Formation due to Induced-Charge Electroosmosis in a Floating Mode

    NASA Astrophysics Data System (ADS)

    Sugioka, Hideyuki; Dan, Hironobu; Hanazawa, Yuya

    2017-10-01

    Self-organization of particles is important since it may provide new functional materials. Previously, by using two-dimensional multiphysics simulations, we theoretically showed microcolumn formation due to induced-charge electroosmosis (ICEO). In this study, we experimentally demonstrate that gold leaves on a water surface move slowly and dynamically form a microcolumn due to a hydrodynamic interaction under an ac electric field. Further, by numerically analyzing video data, we show the time evolutions of the maximum cluster length and the maximum cluster area. In addition, by cluster analysis, we show the dependences of the average velocity on the applied voltage and frequency to clarify the phenomena. We believe that our findings make a new stage in the development of new functional materials on a water surface.

  18. Cluster analysis of fasciolosis in dairy cow herds in Munster province of Ireland and detection of major climatic and environmental predictors of the exposure risk.

    PubMed

    Selemetas, Nikolaos; Phelan, Paul; O'Kiely, Padraig; de Waal, Theo

    2015-03-19

    Fasciolosis caused by Fasciola hepatica is a widespread parasitic disease in cattle farms. The aim of this study was to detect clusters of fasciolosis in dairy cow herds in Munster Province, Ireland and to identify significant climatic and environmental predictors of the exposure risk. In total, 1,292 dairy herds across Munster were sampled in September 2012, each providing a single bulk tank milk (BTM) sample. The analysis of samples by an in-house antibody-detection enzyme-linked immunosorbent assay (ELISA) showed that 65% of the dairy herds (n = 842) had been exposed to F. hepatica. Using the Getis-Ord Gi* statistic, 16 high-risk and 24 low-risk (P < 0.01) clusters of fasciolosis were identified. The spatial distribution of high-risk clusters was more dispersed and mainly located in the northern and western regions of Munster compared to the low-risk clusters that were mostly concentrated in the southern and eastern regions. The most significant classes of variables that could reflect the difference between high-risk and low-risk clusters were the total number of wet-days and rain-days, rainfall, the normalized difference vegetation index (NDVI), temperature and soil type. There was a larger proportion of well-drained soils among the low-risk clusters, whereas poorly drained soils were more common among the high-risk clusters. These results stress the role of precipitation, grazing, temperature and drainage on the life cycle of F. hepatica in the temperate Irish climate. The findings of this study highlight the importance of cluster analysis for identifying significant differences in climatic and environmental variables between high-risk and low-risk clusters of fasciolosis in Irish dairy herds.
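
    For readers unfamiliar with the Getis-Ord Gi* statistic, here is a minimal sketch of the standard z-score form. The binary distance-band weights, the function name, and the input layout are assumptions for illustration; the abstract does not state which spatial weights the study used.

```python
import math

def getis_ord_gi_star(coords, values, band):
    """Getis-Ord Gi* z-score for every location.

    Binary weights are assumed: w_ij = 1 if location j lies within `band`
    of location i (including i itself, as Gi* requires), otherwise 0.
    Strongly positive scores suggest hot spots, strongly negative scores
    suggest cold spots."""
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    scores = []
    for i in range(n):
        w = [1.0 if math.dist(coords[i], coords[j]) <= band else 0.0
             for j in range(n)]
        sw = sum(w)
        sw2 = sum(x * x for x in w)
        numerator = sum(wj * v for wj, v in zip(w, values)) - mean * sw
        denominator = s * math.sqrt((n * sw2 - sw * sw) / (n - 1))
        # The denominator is zero when all values are equal or when every
        # location falls inside the band; report 0.0 in that degenerate case.
        scores.append(numerator / denominator if denominator > 0 else 0.0)
    return scores
```

    Each score would then be compared against a normal critical value (for example |z| > 2.58 for P < 0.01) to flag candidate high-risk or low-risk clusters.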

  19. Exploring the individual patterns of spiritual well-being in people newly diagnosed with advanced cancer: a cluster analysis.

    PubMed

    Bai, Mei; Dixon, Jane; Williams, Anna-Leila; Jeon, Sangchoon; Lazenby, Mark; McCorkle, Ruth

    2016-11-01

    Research shows that spiritual well-being correlates positively with quality of life (QOL) for people with cancer, whereas contradictory findings are frequently reported with respect to the differentiated associations between dimensions of spiritual well-being, namely peace, meaning and faith, and QOL. This study aimed to examine individual patterns of spiritual well-being among patients newly diagnosed with advanced cancer. Cluster analysis was based on the twelve items of the 12-item Functional Assessment of Chronic Illness Therapy-Spiritual Well-Being Scale at Time 1. A combination of hierarchical and k-means (non-hierarchical) clustering methods was employed to jointly determine the number of clusters. Self-rated health, depressive symptoms, peace, meaning and faith, and overall QOL were compared at Time 1 and Time 2. Hierarchical and k-means clustering methods both suggested four clusters. Comparison of the four clusters supported statistically significant and clinically meaningful differences in QOL outcomes among clusters while revealing contrasting relations of faith with QOL. Cluster 1, Cluster 3, and Cluster 4 represented high, medium, and low levels of overall QOL, respectively, with correspondingly high, medium, and low levels of peace, meaning, and faith. Cluster 2 was distinguished from other clusters by its medium levels of overall QOL, peace, and meaning and low level of faith. This study provides empirical support for individual difference in response to a newly diagnosed cancer and brings into focus conceptual and methodological challenges associated with the measure of spiritual well-being, which may partly contribute to the attenuated relation between faith and QOL.

  20. Genome-scale cluster analysis of replicated microarrays using shrinkage correlation coefficient.

    PubMed

    Yao, Jianchao; Chang, Chunqi; Salmi, Mari L; Hung, Yeung Sam; Loraine, Ann; Roux, Stanley J

    2008-06-18

    Currently, clustering with some form of correlation coefficient as the gene similarity metric has become a popular method for profiling genomic data. The Pearson correlation coefficient and the standard deviation (SD)-weighted correlation coefficient are the two most widely-used correlations as the similarity metrics in clustering microarray data. However, these two correlations are not optimal for analyzing replicated microarray data generated by most laboratories. An effective correlation coefficient is needed to provide statistically sufficient analysis of replicated microarray data. In this study, we describe a novel correlation coefficient, shrinkage correlation coefficient (SCC), that fully exploits the similarity between the replicated microarray experimental samples. The methodology considers both the number of replicates and the variance within each experimental group in clustering expression data, and provides a robust statistical estimation of the error of replicated microarray data. The value of SCC is revealed by its comparison with two other correlation coefficients that are currently the most widely-used (Pearson correlation coefficient and SD-weighted correlation coefficient) using statistical measures on both synthetic expression data as well as real gene expression data from Saccharomyces cerevisiae. Two leading clustering methods, hierarchical and k-means clustering were applied for the comparison. The comparison indicated that using SCC achieves better clustering performance. Applying SCC-based hierarchical clustering to the replicated microarray data obtained from germinating spores of the fern Ceratopteris richardii, we discovered two clusters of genes with shared expression patterns during spore germination. Functional analysis suggested that some of the genetic mechanisms that control germination in such diverse plant lineages as mosses and angiosperms are also conserved among ferns. 
    This study shows that SCC is an alternative to the Pearson correlation coefficient and the SD-weighted correlation coefficient, and is particularly useful for clustering replicated microarray data. This computational approach should be generally useful for proteomic data or other high-throughput analysis methodology.

  1. Detection of a pair of prominent X-ray cavities in Abell 3847

    NASA Astrophysics Data System (ADS)

    Vagshette, Nilkanth D.; Naik, Sachindra; Patil, Madhav. K.; Sonkamble, Satish S.

    2017-04-01

    We present the results obtained from a detailed analysis of a deep Chandra observation of the bright FRII radio galaxy 3C 444 in Abell 3847 cluster. A pair of huge X-ray cavities are detected along the north and south directions from the centre of 3C 444. X-ray and radio images of the cluster reveal peculiar positioning of the cavities and radio bubbles. The radio lobes and X-ray cavities are apparently not spatially coincident and exhibit offsets by ~61 and 77 kpc from each other along the north and south directions, respectively. Radial temperature and density profiles reveal the presence of a cool core in the cluster. Imaging and spectral studies showed the removal of substantial amount of matter from the core of the cluster by the radio jets. A detailed analysis of the temperature and density profiles showed the presence of a rarely detected elliptical shock in the cluster. Detection of inflating cavities at an average distance of ~55 kpc from the centre implies that the central engine feeds a remarkable amount of radio power (~6.3 × 10^44 erg s^-1) into the intra-cluster medium over ~10^8 yr, the estimated age of cavity. The cooling luminosity of the cluster was estimated to be ~8.30 × 10^43 erg s^-1, which confirms that the AGN power is sufficient to quench the cooling. Ratios of mass accretion rate to Eddington and Bondi rates were estimated to be ~0.08 and 3.5 × 10^4, respectively. This indicates that the black hole in the core of the cluster accretes matter through chaotic cold accretion.

  2. Dynamic competitive probabilistic principal components analysis.

    PubMed

    López-Rubio, Ezequiel; Ortiz-de-Lazcano-Lobato, Juan Miguel

    2009-04-01

    We present a new neural model which extends the classical competitive learning (CL) by performing a Probabilistic Principal Components Analysis (PPCA) at each neuron. The model also has the ability to learn the number of basis vectors required to represent the principal directions of each cluster, so it overcomes a drawback of most local PCA models, where the dimensionality of a cluster must be fixed a priori. Experimental results are presented to show the performance of the network with multispectral image data.

  3. Clustering cancer gene expression data by projective clustering ensemble

    PubMed Central

    Yu, Xianxue; Yu, Guoxian

    2017-01-01

    Gene expression data analysis has paramount implications for gene treatments, cancer diagnosis and other domains. Clustering is an important and promising tool for analyzing gene expression data. Gene expression data are often characterized by a large number of genes but limited samples, so various projective clustering techniques and ensemble techniques have been suggested to address these challenges. However, it is rather challenging to combine these two kinds of techniques so as to avoid the curse of dimensionality and boost the performance of gene expression data clustering. In this paper, we employ a projective clustering ensemble (PCE) to integrate the advantages of projective clustering and ensemble clustering, and to avoid the dilemma of combining multiple projective clusterings. Our experimental results on publicly available cancer gene expression data show that PCE can improve the quality of clustering gene expression data by at least 4.5% (on average) over other related techniques, including dimensionality reduction based single clustering and ensemble approaches. The empirical study demonstrates that, to further boost the performance of clustering cancer gene expression data, it is necessary and promising to combine projective clustering with ensemble clustering. PCE can serve as an effective alternative technique for clustering gene expression data. PMID:28234920

  4. Study on Adaptive Parameter Determination of Cluster Analysis in Urban Management Cases

    NASA Astrophysics Data System (ADS)

    Fu, J. Y.; Jing, C. F.; Du, M. Y.; Fu, Y. L.; Dai, P. P.

    2017-09-01

    Fine-grained management of cities is an important way to realize the smart city. Data mining that applies spatial clustering analysis to urban management cases can be used to evaluate the deployment of urban public facilities, support policy decisions, and provide technical support for the fine-grained management of the city. To address the problem that the density-based DBSCAN algorithm cannot determine its parameters adaptively, this paper proposes an optimized method for adaptive parameter determination based on spatial analysis. First, Ripley's K function is analysed for the data set to adaptively determine the global parameter Eps, which means setting the maximum aggregation scale as the range of data clustering. Then, the most frequent K value of every point object within the range of Eps is calculated using a K-D tree and set as the clustering density, which adaptively determines the global parameter MinPts. Finally, the R language was used to optimize the above process to accomplish precise clustering of typical urban management cases. The experimental results, based on typical urban management cases in the XiCheng District of Beijing, show that the new DBSCAN clustering algorithm presented in this paper takes full account of the spatial and statistical characteristics of data with obvious clustering features, and has better applicability and high quality. The results of the study are helpful not only for the formulation of urban management policies and the allocation of urban management supervisors in the XiCheng District of Beijing, but also for other cities and related fields.
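
    To make the roles of the two DBSCAN parameters concrete, the sketch below implements the basic algorithm with fixed, user-supplied values for Eps and MinPts. It does not reproduce the adaptive parameter determination proposed in the paper, and it uses a brute-force neighbour search instead of a K-D tree; the function name and label conventions are assumptions.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Basic DBSCAN. Returns one label per point: 0, 1, 2, ... for clusters
    and NOISE (-1) for points that belong to no cluster.

    eps: neighbourhood radius; min_pts: minimum number of points (the point
    itself included) within eps for a point to count as a core point."""
    labels = [None] * len(points)

    def neighbours(i):
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels
```

    The result is sensitive to both parameters: a larger Eps merges nearby groups, and a larger MinPts turns sparse groups into noise, which is why choosing the values from the data is attractive.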

  5. Transcriptome Analysis of Aspergillus flavus Reveals veA-Dependent Regulation of Secondary Metabolite Gene Clusters, Including the Novel Aflavarin Cluster

    PubMed Central

    Cary, J. W.; Han, Z.; Yin, Y.; Lohmar, J. M.; Shantappa, S.; Harris-Coward, P. Y.; Mack, B.; Ehrlich, K. C.; Wei, Q.; Arroyo-Manzanares, N.; Uka, V.; Vanhaecke, L.; Bhatnagar, D.; Yu, J.; Nierman, W. C.; Johns, M. A.; Sorensen, D.; Shen, H.; De Saeger, S.; Diana Di Mavungu, J.

    2015-01-01

    The global regulatory veA gene governs development and secondary metabolism in numerous fungal species, including Aspergillus flavus. This is especially relevant since A. flavus infects crops of agricultural importance worldwide, contaminating them with potent mycotoxins. The most well-known are aflatoxins, which are cytotoxic and carcinogenic polyketide compounds. The production of aflatoxins and the expression of genes implicated in the production of these mycotoxins are veA dependent. The genes responsible for the synthesis of aflatoxins are clustered, a signature common for genes involved in fungal secondary metabolism. Studies of the A. flavus genome revealed many gene clusters possibly connected to the synthesis of secondary metabolites. Many of these metabolites are still unknown, or the association between a known metabolite and a particular gene cluster has not yet been established. In the present transcriptome study, we show that veA is necessary for the expression of a large number of genes. Twenty-eight out of the predicted 56 secondary metabolite gene clusters include at least one gene that is differentially expressed depending on presence or absence of veA. One of the clusters under the influence of veA is cluster 39. The absence of veA results in a downregulation of the five genes found within this cluster. Interestingly, our results indicate that the cluster is expressed mainly in sclerotia. Chemical analysis of sclerotial extracts revealed that cluster 39 is responsible for the production of aflavarin. PMID:26209694

  6. Comparison of Salmonella enteritidis phage types isolated from layers and humans in Belgium in 2005.

    PubMed

    Welby, Sarah; Imberechts, Hein; Riocreux, Flavien; Bertrand, Sophie; Dierick, Katelijne; Wildemauwe, Christa; Hooyberghs, Jozef; Van der Stede, Yves

    2011-08-01

    The aim of this study was to investigate the available results for Belgium of the European Union coordinated monitoring program (2004/665 EC) on Salmonella in layers in 2005, as well as the results of the monthly outbreak reports of Salmonella Enteritidis in humans in 2005, to identify a possible statistically significant trend in both populations. Separate descriptive statistics and univariate analyses were carried out, and parametric and/or non-parametric hypothesis tests were conducted. A time cluster analysis was performed for all Salmonella Enteritidis phage types (PTs) isolated. The proportions of each Salmonella Enteritidis PT in layers and in humans were compared and the monthly distribution of the most common PT, isolated in both populations, was evaluated. The time cluster analysis revealed significant clusters during the months May and June for layers and May, July, August, and September for humans. PT21, the most frequently isolated PT in both populations in 2005, seemed to be responsible for these significant clusters. PT4 was the second most frequently isolated PT. No significant difference was found for the monthly trend evolution of both PTs in both populations based on parametric and non-parametric methods. A similar monthly trend of PT distribution in humans and layers during the year 2005 was observed. The time cluster analysis and the statistical significance testing confirmed these results. Moreover, the time cluster analysis showed significant clusters during the summer, slightly delayed in time (humans after layers). These results suggest a common link between the prevalence of Salmonella Enteritidis in layers and the occurrence of the pathogen in humans. Phage typing was confirmed to be a useful tool for identifying temporal trends.

  7. Nuclear fusion at heavy water clusters collision with deuterized targets

    NASA Astrophysics Data System (ADS)

    Bolotin, Yu. L.; Inopin, E. V.; Lyashko, Yu. V.; Slabospitskij, R. P.

    A review is presented of research carried out in different laboratories on the anomalous heavy particle yield in D-D fusion reactions induced by collisions of heavy water clusters with deuterized targets. Analysis of the data shows, on the one hand, the nontriviality of the experimental results and the inadequacy of their interpretation and, on the other hand, the promising prospects of such research.

  8. [Principal component analysis and cluster analysis of inorganic elements in sea cucumber Apostichopus japonicus].

    PubMed

    Liu, Xiao-Fang; Xue, Chang-Hu; Wang, Yu-Ming; Li, Zhao-Jie; Xue, Yong; Xu, Jie

    2011-11-01

    The present study investigates the feasibility of multi-element analysis for determining the geographical origin of the sea cucumber Apostichopus japonicus, and selects effective tracers for assessing that origin. The contents of elements such as Al, V, Cr, Mn, Fe, Co, Ni, Cu, Zn, As, Se, Mo, Cd, Hg and Pb in Apostichopus japonicus samples from seven places of geographical origin were determined by ICP-MS. The results were used to develop an element database. Cluster analysis (CA) and principal component analysis (PCA) were applied to differentiate the geographical origin of the samples. Three principal components, which accounted for over 89% of the total variance, were extracted from the standardized data. The results of Q-type cluster analysis showed that the 26 samples could be clustered reasonably into five groups, and the classification results were significantly associated with the marine distribution of the samples. CA and PCA were effective methods for element analysis of Apostichopus japonicus samples. The contents of the mineral elements in the samples were good chemical descriptors for differentiating their geographical origins.

  9. Microarray characterization of gene expression changes in blood during acute ethanol exposure

    PubMed Central

    2013-01-01

    Background: As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples from a five-time point study of subjects administered ethanol orally, followed by breathalyzer analysis, to monitor blood alcohol concentration (BAC) to discover significant gene expression changes in response to the ethanol exposure. Methods: Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data was analyzed in a pipeline fashion to summarize and normalize and the results evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm relative transcript levels across time to those detected in microarrays. Results: Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by qRT-PCR. The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters. These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway. Conclusions: The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance. PMID:23883607

  10. Mapping Informative Clusters in a Hierarchical Framework of fMRI Multivariate Analysis

    PubMed Central

    Xu, Rui; Zhen, Zonglei; Liu, Jia

    2010-01-01

    Pattern recognition methods, which are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states, have become increasingly popular in fMRI data analysis. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we propose a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach was more robust in functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped to a high degree for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081

  11. DAFi: A directed recursive data filtering and clustering approach for improving and interpreting data clustering identification of cell populations from polychromatic flow cytometry data.

    PubMed

    Lee, Alexandra J; Chang, Ivan; Burel, Julie G; Lindestam Arlehamn, Cecilia S; Mandava, Aishwarya; Weiskopf, Daniela; Peters, Bjoern; Sette, Alessandro; Scheuermann, Richard H; Qian, Yu

    2018-04-17

    Computational methods for identification of cell populations from polychromatic flow cytometry data are changing the paradigm of cytometry bioinformatics. Data clustering is the most common computational approach to unsupervised identification of cell populations from multidimensional cytometry data. However, interpretation of the identified data clusters is labor-intensive. Certain types of user-defined cell populations are also difficult to identify by fully automated data clustering analysis. Both are roadblocks before a cytometry lab can adopt the data clustering approach for cell population identification in routine use. We found that combining recursive data filtering and clustering with constraints converted from the user manual gating strategy can effectively address these two issues. We named this new approach DAFi: Directed Automated Filtering and Identification of cell populations. Design of DAFi preserves the data-driven characteristics of unsupervised clustering for identifying novel cell subsets, but also makes the results interpretable to experimental scientists through mapping and merging the multidimensional data clusters into the user-defined two-dimensional gating hierarchy. The recursive data filtering process in DAFi helped identify small data clusters which are otherwise difficult to resolve by a single run of the data clustering method due to the statistical interference of the irrelevant major clusters. Our experiment results showed that the proportions of the cell populations identified by DAFi, while being consistent with those by expert centralized manual gating, have smaller technical variances across samples than those from individual manual gating analysis and the nonrecursive data clustering analysis. Compared with manual gating segregation, DAFi-identified cell populations avoided the abrupt cut-offs on the boundaries. 
    DAFi has been implemented to be used with multiple data clustering methods including K-means, FLOCK, FlowSOM, and the ClusterR package. For cell population identification, DAFi supports multiple options including clustering, bisecting, slope-based gating, and reversed filtering to meet various autogating needs from different scientific use cases. © 2018 International Society for Advancement of Cytometry.

  12. Migratory connectivity and effects of winter temperatures on migratory behaviour of the European robin Erithacus rubecula: a continent-wide analysis.

    PubMed

    Ambrosini, Roberto; Cuervo, José Javier; du Feu, Chris; Fiedler, Wolfgang; Musitelli, Federica; Rubolini, Diego; Sicurella, Beatrice; Spina, Fernando; Saino, Nicola; Møller, Anders Pape

    2016-05-01

    Many partially migratory species show phenotypically divergent populations in terms of migratory behaviour, with climate hypothesized to be a major driver of such variability through its differential effects on sedentary and migratory individuals. Based on long-term (1947-2011) bird ringing data, we analysed phenotypic differentiation of migratory behaviour among populations of the European robin Erithacus rubecula across Europe. We showed that clusters of populations sharing breeding and wintering ranges varied from partial (British Isles and Western Europe, NW cluster) to completely migratory (Scandinavia and north-eastern Europe, NE cluster). Distance migrated by birds of the NE (but not of the NW) cluster decreased through time because of a north-eastwards shift in the wintering grounds. Moreover, when winter temperatures in the breeding areas were cold, individuals from the NE cluster also migrated longer distances, while those of the NW cluster moved over shorter distances. Climatic conditions may therefore affect migratory behaviour of robins, although large geographical variation in response to climate seems to exist. © 2016 The Authors. Journal of Animal Ecology © 2016 British Ecological Society.

  13. Orbits of Selected Globular Clusters in the Galactic Bulge

    NASA Astrophysics Data System (ADS)

    Pérez-Villegas, A.; Rossi, L.; Ortolani, S.; Casotto, S.; Barbuy, B.; Bica, E.

    2018-05-01

    We present orbit analysis for a sample of eight inner bulge globular clusters, together with one reference halo object. We used proper motion values derived from long time base CCD data. Orbits are integrated in both an axisymmetric model and a model including the Galactic bar potential. The inclusion of the bar proved to be essential for the description of the dynamical behaviour of the clusters. We use the Monte Carlo scheme to construct the initial conditions for each cluster, taking into account the uncertainties in the kinematical data and distances. The sample clusters typically show a maximum height from the Galactic plane below 1.5 kpc, and develop rather eccentric orbits. Seven of the bulge sample clusters share the orbital properties of the bar/bulge, having perigalactic and apogalactic distances, and maximum vertical excursion from the Galactic plane inside the bar region. NGC 6540 instead shows a completely different orbital behaviour, having a dynamical signature of the thick disc. Both prograde and prograde-retrograde orbits with respect to the direction of the Galactic rotation were revealed, which might characterise a chaotic behaviour.

  14. Complete Genome Sequence and Comparative Analysis of the Fish Pathogen Lactococcus garvieae

    PubMed Central

    Oshima, Kenshiro; Yoshizaki, Mariko; Kawanishi, Michiko; Nakaya, Kohei; Suzuki, Takehito; Miyauchi, Eiji; Ishii, Yasuo; Tanabe, Soichi; Murakami, Masaru; Hattori, Masahira

    2011-01-01

    Lactococcus garvieae causes fatal haemorrhagic septicaemia in fish such as yellowtail. The comparative analysis of genomes of a virulent strain Lg2 and a non-virulent strain ATCC 49156 of L. garvieae revealed that the two strains shared a high degree of sequence identity, but Lg2 had a 16.5-kb capsule gene cluster that is absent in ATCC 49156. The capsule gene cluster was composed of 15 genes, of which eight genes are highly conserved with those in exopolysaccharide biosynthesis gene cluster often found in Lactococcus lactis strains. Sequence analysis of the capsule gene cluster in the less virulent strain L. garvieae Lg2-S, Lg2-derived strain, showed that two conserved genes were disrupted by a single base pair deletion, respectively. These results strongly suggest that the capsule is crucial for virulence of Lg2. The capsule gene cluster of Lg2 may be a genomic island from several features such as the presence of insertion sequences flanked on both ends, different GC content from the chromosomal average, integration into the locus syntenic to other lactococcal genome sequences, and distribution in human gut microbiomes. The analysis also predicted other potential virulence factors such as haemolysin. The present study provides new insights into understanding of the virulence mechanisms of L. garvieae in fish. PMID:21829716

  15. Unifying principles in homodimeric type I photosynthetic reaction centers: properties of PscB and the FA, FB and FX iron-sulfur clusters in green sulfur bacteria.

    PubMed

    Jagannathan, Bharat; Golbeck, John H

    2008-12-01

    The photosynthetic reaction center from the green sulfur bacterium Chlorobium tepidum (CbRC) was solubilized from membranes using Triton X-100 and isolated by sucrose density ultra-centrifugation. The CbRC complexes were subsequently treated with 0.5 M NaCl and ultrafiltered over a 100 kDa cutoff membrane. The resulting CbRC cores did not exhibit the low-temperature EPR resonances from FA- and FB- and were unable to reduce NADP+. SDS-PAGE and mass spectrometric analysis showed that the PscB subunit, which harbors the FA and FB clusters, had become dissociated, and was now present in the filtrate. Attempts to rebind PscB onto CbRC cores were unsuccessful. Mössbauer spectroscopy showed that recombinant PscB contains a heterogeneous mixture of [4Fe-4S]2+,1+ and other types of Fe/S clusters tentatively identified as [2Fe-2S]2+,1+ clusters and rubredoxin-like Fe3+,2+ centers, and that the [4Fe-4S]2+,1+ clusters which were present were degraded at high ionic strength. Quantitative analysis confirmed that the amount of iron and sulfide in the recombinant protein was sub-stoichiometric. A heme-staining assay indicated that cytochrome c551 remained firmly attached to the CbRC cores. Low-temperature EPR spectroscopy of photoaccumulated CbRC complexes and CbRC cores showed resonances between g=5.4 and 4.4 assigned to a S=3/2 ground spin state [4Fe-4S]1+ cluster and at g=1.77 assigned to a S=1/2 ground spin state [4Fe-4S]1+ cluster, both from FX-. These results unify the properties of the acceptor side of the Type I homodimeric reaction centers found in green sulfur bacteria and heliobacteria: in both, the FA and FB iron-sulfur clusters are present on a salt-dissociable subunit, and FX is present as an interpolypeptide [4Fe-4S]2+,1+ cluster with a significant population in a S=3/2 ground spin state.

  16. [A spatial adaptive algorithm for endmember extraction on multispectral remote sensing image].

    PubMed

    Zhu, Chang-Ming; Luo, Jian-Cheng; Shen, Zhan-Feng; Li, Jun-Li; Hu, Xiao-Dong

    2011-10-01

    Because the convex cone analysis (CCA) method can extract only a limited number of endmembers from multispectral imagery, this paper proposes a new endmember extraction method for multispectral remote sensing images, using spatially adaptive spectral feature analysis based on spatial clustering and image slicing. First, to remove spatial and spectral redundancy, principal component analysis (PCA) was used to reduce the dimensionality of the multispectral data. Second, the iterative self-organizing data analysis technique algorithm (ISODATA) was used to cluster the image according to the spectral similarity of pixels. Then, through post-processing of the clusters and merging of small clusters, the whole image was divided into several blocks (tiles). Lastly, the number of endmembers was determined according to the landscape complexity of each image block and the features of the scatter diagram analysis, and the endmembers were extracted using the hourglass algorithm. An endmember extraction experiment on TM multispectral imagery showed that the method can effectively extract endmember spectra from multispectral imagery. Moreover, the method overcame the limit on the number of endmembers and improved the accuracy of endmember extraction. The method provides a new approach to endmember extraction from multispectral images.

  17. A high fat diet containing saturated but not unsaturated fatty acids enhances T cell receptor clustering on the nanoscale.

    PubMed

    Shaikh, Saame Raza; Boyle, Sarah; Edidin, Michael

    2015-09-01

    Cell culture studies show that the nanoscale lateral organization of surface receptors, their clustering or dispersion, can be altered by changing the lipid composition of the membrane bilayer. However, little is known about similar changes in vivo, which can be effected by changing dietary lipids. We describe the use of a newly developed method, k-space image correlation spectroscopy, kICS, for analysis of quantum dot fluorescence to show that a high fat diet can alter the nanometer-scale clustering of the murine T cell receptor, TCR, on the surface of naive CD4(+) T cells. We found that diets enriched primarily in saturated fatty acids increased TCR nanoscale clustering to a level usually seen only on activated cells. Diets enriched in monounsaturated or n-3 polyunsaturated fatty acids had no effect on TCR clustering. Also none of the high fat diets affected TCR clustering on the micrometer scale. Furthermore, the effect of the diets was similar in young and middle aged mice. Our data establish proof-of-principle that TCR nanoscale clustering is sensitive to the composition of dietary fat. Copyright © 2015 Elsevier Ltd. All rights reserved.

  18. Cosmological Constraints from Galaxy Cluster Velocity Statistics

    NASA Astrophysics Data System (ADS)

    Bhattacharya, Suman; Kosowsky, Arthur

    2007-04-01

    Future microwave sky surveys will have the sensitivity to detect the kinematic Sunyaev-Zeldovich signal from moving galaxy clusters, thus providing a direct measurement of their line-of-sight peculiar velocity. We show that cluster peculiar velocity statistics applied to foreseeable surveys will put significant constraints on fundamental cosmological parameters. We consider three statistical quantities that can be constructed from a cluster peculiar velocity catalog: the probability density function, the mean pairwise streaming velocity, and the pairwise velocity dispersion. These quantities are applied to an envisioned data set that measures line-of-sight cluster velocities with normal errors of 100 km s⁻¹ for all clusters with masses larger than 10¹⁴ M☉ over a sky area of up to 5000 deg². A simple Fisher matrix analysis of this survey shows that the normalization of the matter power spectrum and the dark energy equation of state can be constrained to better than 10%, and that the Hubble constant and the primordial power spectrum index can be constrained to a few percent, independent of any other cosmological observations. We also find that the current constraint on the power spectrum normalization can be improved by more than a factor of 2 using data from a 400 deg² survey and WMAP third-year priors. We also show how the constraints on cosmological parameters change if cluster velocities are measured with normal errors of 300 km s⁻¹.
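
    The Fisher matrix step mentioned in this abstract can be illustrated with a minimal sketch. It assumes independent Gaussian measurement errors and a toy linear observable O_k = p0 + p1 * x_k; the abstract's actual observables (velocity PDF, pairwise streaming velocity, velocity dispersion) and their covariances are not reproduced here, and the names `fisher_matrix`, `marginalized_errors_2x2`, `x_values` and `sigmas` are hypothetical.

```python
import math

def fisher_matrix(derivs, sigmas):
    """F_ij = sum_k (dO_k/dp_i)(dO_k/dp_j) / sigma_k^2, independent Gaussian errors."""
    n = len(derivs[0])
    return [[sum(d[i] * d[j] / s ** 2 for d, s in zip(derivs, sigmas))
             for j in range(n)] for i in range(n)]

def marginalized_errors_2x2(F):
    """Forecast 1-sigma errors sqrt((F^-1)_ii) for a two-parameter Fisher matrix."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    return math.sqrt(F[1][1] / det), math.sqrt(F[0][0] / det)

# Toy model (an assumption, not the survey in the abstract):
# O_k = p0 + p1 * x_k, so dO_k/dp0 = 1 and dO_k/dp1 = x_k.
x_values = [0.0, 1.0, 2.0, 3.0]
sigmas = [1.0] * len(x_values)
derivs = [(1.0, x) for x in x_values]

F = fisher_matrix(derivs, sigmas)
err_p0, err_p1 = marginalized_errors_2x2(F)
print(F, err_p0, err_p1)
```

    Larger measurement errors shrink every entry of F by 1/sigma^2, so the forecast parameter errors grow in proportion to sigma in this toy setting.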

  19. Nuclear Potential Clustering As a New Tool to Detect Patterns in High Dimensional Datasets

    NASA Astrophysics Data System (ADS)

    Tonkova, V.; Paulus, D.; Neeb, H.

    2013-02-01

    We present a new approach for the clustering of high dimensional data without prior assumptions about the structure of the underlying distribution. The proposed algorithm is based on a concept adapted from nuclear physics. To partition the data, we model the dynamic behaviour of nucleons interacting in an N-dimensional space. An adaptive nuclear potential, comprised of a short-range attractive (strong interaction) and a long-range repulsive term (Coulomb force) is assigned to each data point. By modelling the dynamics, nucleons that are densely distributed in space fuse to build nuclei (clusters) whereas single point clusters repel each other. The formation of clusters is completed when the system reaches the state of minimal potential energy. The data are then grouped according to the particles' final effective potential energy level. The performance of the algorithm is tested with several synthetic datasets showing that the proposed method can robustly identify clusters even when complex configurations are present. Furthermore, quantitative MRI data from 43 multiple sclerosis patients were analyzed, showing a reasonable splitting into subgroups according to the individual patients' disease grade. The good performance of the algorithm on such highly correlated non-spherical datasets, which are typical for MRI derived image features, shows that Nuclear Potential Clustering is a valuable tool for automated data analysis, not only in the MRI domain.

  20. Cause-specific mortality trends in The Netherlands, 1875-1992: a formal analysis of the epidemiologic transition.

    PubMed

    Wolleswinkel-van den Bosch, J H; Looman, C W; Van Poppel, F W; Mackenbach, J P

    1997-08-01

    The objective of this study is to produce a detailed yet robust description of the epidemiologic transition in The Netherlands. National mortality data on sex, age, cause of death and calendar year (1875-1992) were extracted from official publications. For the entire period, 27 causes of death could be distinguished, while 65 causes (nested within the 27) could be studied from 1901 onwards. Cluster analysis was used to determine groups of causes of death with similar trend curves over a period of time with respect to age- and sex-standardized mortality rates. With respect to the 27 causes, three important clusters were found: (1) infectious diseases which declined rapidly in the late 19th century (e.g. typhoid fever), (2) infectious diseases which showed a less precipitous decline (e.g. respiratory tuberculosis), and (3) non-infectious diseases which showed an increasing trend during most of the period 1875-1992 (e.g. cancer). The 65 causes provided more detail. Seven important clusters were found: four consisted mainly of infectious diseases, including a new cluster that declined rapidly after the Second World War (WW2) (e.g. acute bronchitis/influenza) and a new cluster showing an increasing trend in the 1920s and 1930s before declining in the years thereafter (e.g. appendicitis). Three clusters mainly contained non-infectious diseases, including a new one that declined from 1900 onwards (e.g. cancer of the stomach) and a new one that increased until WW2 but declined thereafter (e.g. chronic rheumatic heart disease). The results suggest that the conventional interpretation of the epidemiologic transition, which assumes a uniform decline of infectious diseases and a uniform increase of non-infectious diseases, needs to be modified.

  1. Sequential analysis of hydrochemical data for watershed characterization.

    PubMed

    Thyne, Geoffrey; Güler, Cüneyt; Poeter, Eileen

    2004-01-01

    A methodology for characterizing the hydrogeology of watersheds using hydrochemical data that combine statistical, geochemical, and spatial techniques is presented. Surface water and ground water base flow and spring runoff samples (180 total) from a single watershed are first classified using hierarchical cluster analysis. The statistical clusters are analyzed for spatial coherence confirming that the clusters have a geological basis corresponding to topographic flowpaths and showing that the fractured rock aquifer behaves as an equivalent porous medium on the watershed scale. Then principal component analysis (PCA) is used to determine the sources of variation between parameters. PCA analysis shows that the variations within the dataset are related to variations in calcium, magnesium, SO4, and HCO3, which are derived from natural weathering reactions, and pH, NO3, and chlorine, which indicate anthropogenic impact. PHREEQC modeling is used to quantitatively describe the natural hydrochemical evolution for the watershed and aid in discrimination of samples that have an anthropogenic component. Finally, the seasonal changes in the water chemistry of individual sites were analyzed to better characterize the spatial variability of vertical hydraulic conductivity. The integrated result provides a method to characterize the hydrogeology of the watershed that fully utilizes traditional data.

  2. Optimizing disinfection by-product monitoring points in a distribution system using cluster analysis.

    PubMed

    Delpla, Ianis; Florea, Mihai; Pelletier, Geneviève; Rodriguez, Manuel J

    2018-06-04

    Trihalomethanes (THMs) and haloacetic acids (HAAs) are the main groups detected in drinking water and are consequently strictly regulated. However, the increasing quantity of data for disinfection byproducts (DBPs) produced from research projects and regulatory programs remains largely unexploited, despite a great potential for its use in optimizing drinking water quality monitoring to meet specific objectives. In this work, we developed a procedure to optimize locations and periods for DBP monitoring based on a set of monitoring scenarios using the cluster analysis technique. The optimization procedure used a robust set of spatio-temporal monitoring results on DBPs (THMs and HAAs) generated from intensive sampling campaigns conducted in a residential sector of a water distribution system. Results show that cluster analysis allows for the classification of water quality in different groups of THMs and HAAs according to their similarities, and the identification of locations presenting water quality concerns. By using cluster analysis with different monitoring objectives, this work provides a set of monitoring solutions and a comparison between various monitoring scenarios for decision-making purposes. Finally, it was demonstrated that the data from intensive monitoring of free chlorine residual and water temperature as DBP proxy parameters, when processed using cluster analysis, could also help identify the optimal sampling points and periods for regulatory THM and HAA monitoring. Copyright © 2018 Elsevier Ltd. All rights reserved.

  3. Genetic diversity and variation of Chinese fir from Fujian province and Taiwan, China, based on ISSR markers

    PubMed Central

    Chen, Yu; Peng, Zhuqing; Wu, Chao; Ma, Zhihui; Ding, Guochang; Cao, Guangqiu; Ruan, Shaoning; Lin, Sizu

    2017-01-01

    Genetic diversity and variation among 11 populations of Chinese fir from Fujian province and Taiwan were assessed using inter-simple sequence repeat (ISSR) markers to reveal the evolutionary relationships across their distribution range. Analysis of genetic parameters of the different populations showed that populations in Fujian province exhibited a greater level of genetic diversity than did the populations in Taiwan. Compared to the Taiwan populations, significantly limited gene flow was observed among the Fujian populations. A UPGMA cluster analysis showed that most individuals of the Taiwan populations formed a single cluster, whereas each population from Fujian formed its own discrete cluster (6 in total). All populations were divided into 3 main groups; all 5 populations from Taiwan were gathered into a subgroup that, combined with 2 populations, Dehua and Liancheng, formed one of the 3 main groups, which indicated relatively stronger relatedness. This is supported by a genetic structure analysis. These results suggest different levels of genetic diversity and variation of Chinese fir between Fujian and Taiwan, and indicate different patterns of evolutionary process and local environmental adaptation. PMID:28406956

  4. Genetic diversity and variation of Chinese fir from Fujian province and Taiwan, China, based on ISSR markers.

    PubMed

    Chen, Yu; Peng, Zhuqing; Wu, Chao; Ma, Zhihui; Ding, Guochang; Cao, Guangqiu; Ruan, Shaoning; Lin, Sizu

    2017-01-01

    Genetic diversity and variation among 11 populations of Chinese fir from Fujian province and Taiwan were assessed using inter-simple sequence repeat (ISSR) markers to reveal the evolutionary relationships across their distribution range. Analysis of genetic parameters of the different populations showed that populations in Fujian province exhibited a greater level of genetic diversity than did the populations in Taiwan. Compared to the Taiwan populations, significantly limited gene flow was observed among the Fujian populations. A UPGMA cluster analysis showed that most individuals of the Taiwan populations formed a single cluster, whereas each population from Fujian formed its own discrete cluster (6 in total). All populations were divided into 3 main groups; all 5 populations from Taiwan were gathered into a subgroup that, combined with 2 populations, Dehua and Liancheng, formed one of the 3 main groups, which indicated relatively stronger relatedness. This is supported by a genetic structure analysis. These results suggest different levels of genetic diversity and variation of Chinese fir between Fujian and Taiwan, and indicate different patterns of evolutionary process and local environmental adaptation.
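
    The UPGMA clustering named in this abstract is average-linkage agglomeration on a pairwise distance matrix. The sketch below is a minimal illustration under stated assumptions: the four labels and all distances are invented toy values (they are not the ISSR genetic distances from the study), and the function name `upgma` is hypothetical.

```python
from itertools import combinations

def upgma(dist, labels):
    """Agglomerate with average linkage (UPGMA) and return the merge history."""
    # Each active cluster is a tuple of member indices into `dist`.
    clusters = [(i,) for i in range(len(labels))]
    merges = []

    def avg_dist(a, b):
        # UPGMA distance: unweighted mean over all cross-cluster pairs.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Merge the closest pair of clusters under the average-linkage distance.
        a, b = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        merges.append((sorted(labels[i] for i in a),
                       sorted(labels[i] for i in b),
                       avg_dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Invented toy distances between four hypothetical populations.
names = ["T1", "T2", "F1", "F2"]
D = [
    [0.00, 0.10, 0.40, 0.50],
    [0.10, 0.00, 0.45, 0.55],
    [0.40, 0.45, 0.00, 0.30],
    [0.50, 0.55, 0.30, 0.00],
]

history = upgma(D, names)
for left, right, height in history:
    print(left, right, round(height, 3))
```

    The merge heights define the dendrogram; cutting it at a chosen height yields the groups and subgroups that a study of this kind reports.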

  5. Examining the effectiveness of discriminant function analysis and cluster analysis in species identification of male field crickets based on their calling songs.

    PubMed

    Jaiswara, Ranjana; Nandi, Diptarup; Balakrishnan, Rohini

    2013-01-01

    Traditional taxonomy based on morphology has often failed in accurate species identification owing to the occurrence of cryptic species, which are reproductively isolated but morphologically identical. Molecular data have thus been used to complement morphology in species identification. The sexual advertisement calls in several groups of acoustically communicating animals are species-specific and can thus complement molecular data as non-invasive tools for identification. Several statistical tools and automated identifier algorithms have been used to investigate the efficiency of acoustic signals in species identification. Despite a plethora of such methods, there is a general lack of knowledge regarding the appropriate usage of these methods in specific taxa. In this study, we investigated the performance of two commonly used statistical methods, discriminant function analysis (DFA) and cluster analysis, in identification and classification based on acoustic signals of field cricket species belonging to the subfamily Gryllinae. Using a comparative approach we evaluated the optimal number of species and calling song characteristics for both the methods that lead to most accurate classification and identification. The accuracy of classification using DFA was high and was not affected by the number of taxa used. However, a constraint in using discriminant function analysis is the need for a priori classification of songs. Accuracy of classification using cluster analysis, which does not require a priori knowledge, was maximum for 6-7 taxa and decreased significantly when more than ten taxa were analysed together. We also investigated the efficacy of two novel derived acoustic features in improving the accuracy of identification. Our results show that DFA is a reliable statistical tool for species identification using acoustic signals. Our results also show that cluster analysis of acoustic signals in crickets works effectively for species classification and identification.

  6. Classification of attempted suicide by cluster analysis: A study of 888 suicide attempters presenting to the emergency department.

    PubMed

    Kim, Hyeyoung; Kim, Bora; Kim, Se Hyun; Park, C Hyung Keun; Kim, Eun Young; Ahn, Yong Min

    2018-08-01

    It is essential to understand the latent structure of the population of suicide attempters for effective suicide prevention. The aim of this study was to identify subgroups among Korean suicide attempters in terms of the details of the suicide attempt. A total of 888 people who attempted suicide and were subsequently treated in the emergency rooms of 17 medical centers between May and November of 2013 were included in the analysis. The variables assessed included demographic characteristics, clinical information, and details of the suicide attempt assessed by the Suicide Intent Scale (SIS) and Columbia-Suicide Severity Rating Scale (C-SSRS). Cluster analysis was performed using the Ward method. Of the participants, 85.4% (n = 758) fell into a cluster characterized by less planning, low lethality methods, and ambivalence towards death ("impulsive"). The other cluster (n = 130) involved a more severe and well-planned attempt, used highly lethal methods, and took more precautions to avoid being interrupted ("planned"). The first cluster was dominated by women, while the second cluster was associated more with men, older age, and physical illness. We only included participants who visited the emergency department after their suicide attempt and had no missing values for SIS or C-SSRS. Cluster analysis extracted two distinct subgroups of Korean suicide attempters showing different patterns of suicidal behaviors. Understanding that a significant portion of suicide attempts occur impulsively calls for new prevention strategies tailored to differing subgroup profiles. Copyright © 2018 Elsevier B.V. All rights reserved.
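
    The abstract states only that the Ward method was used. As a minimal sketch of that technique, the code below repeatedly merges the two clusters whose union gives the smallest increase in within-cluster sum of squares. The two-feature records are invented toy numbers, not study data, and the names `ward_clusters` and `records` are hypothetical.

```python
def ward_clusters(points, k):
    """Merge clusters greedily by Ward's criterion until k clusters remain."""
    # Each cluster keeps its member indices and its centroid.
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        (ma, ca), (mb, cb) = a, b
        sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(ma) * len(mb) / (len(ma) + len(mb)) * sq

    while len(clusters) > k:
        # Pick the pair of clusters with the smallest Ward merge cost.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        (ma, ca), (mb, cb) = clusters[i], clusters[j]
        n = len(ma) + len(mb)
        # The merged centroid is the size-weighted mean of the two centroids.
        centroid = [(len(ma) * x + len(mb) * y) / n for x, y in zip(ca, cb)]
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)]
        clusters.append((ma + mb, centroid))
    return sorted(sorted(members) for members, _ in clusters)

# Invented toy records with two numeric features each.
records = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 9), (9, 8)]
groups = ward_clusters(records, 2)
print(groups)
```

    With these toy records, stopping at two clusters separates the first four records from the last two. In practice the features would be standardized first, and the number of clusters would be chosen from the dendrogram rather than fixed in advance.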

  7. Two Ti13-oxo-clusters showing non-compact structures, film electrode preparation and photocurrent properties.

    PubMed

    Hou, Jin-Le; Luo, Wen; Wu, Yin-Yin; Su, Hu-Chao; Zhang, Guang-Lin; Zhu, Qin-Yu; Dai, Jie

    2015-12-14

    Two benzene dicarboxylate (BDC) and salicylate (SAL) substituted titanium-oxo-clusters, Ti13O10(o-BDC)4(SAL)4(O(i)Pr)16 (1) and Ti13O10(o-BDC)4(SAL-Cl)4(O(i)Pr)16 (2), are prepared by one step in situ solvothermal synthesis. Single crystal analysis shows that the two Ti13 clusters take a paddle arrangement with an S4 symmetry. The non-compact (non-sphere) structure is stabilized by the coordination of BDC and SAL. Film photoelectrodes are prepared by the wet coating process using the solution of the clusters and the photocurrent response properties of the electrodes are studied. It is found that the photocurrent density and photoresponsiveness of the electrodes are related to the number of coating layers and the annealing temperature. Using ligand coordinated titanium-oxo-clusters as the molecular precursors of TiO2 anatase films is found to be effective due to their high solubility, appropriate stability in solution and hence the easy controllability.

  8. Cluster analysis of cognitive performance in elderly and demented subjects.

    PubMed

    Giaquinto, S; Nolfe, G; Calvani, M

    1985-06-01

    48 elderly normals, 14 demented subjects and 76 young controls were tested for basic cognitive functions. All the tests were quantified and could therefore be subjected to statistical analysis. The results show a difference in the speed of information processing and in memory load between the young controls and elderly normals but the age groups differed in quantitative terms only. Cluster analysis showed that the elderly and the demented formed two distinctly separate groups at the qualitative level, the basic cognitive processes being damaged in the demented group. Age thus appears to be only a risk factor for dementia and not its cause. It is concluded that batteries based on precise and measurable tasks are the most appropriate not only for the study of dementia but for rehabilitation purposes too.

  9. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis.

    PubMed

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-09-27

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation of the relay battery, and give a general analytical model for WPSNs with cluster cooperation. Through the model, we derive a closed-form expression for the outage probability as the performance metric of this network. Finally, simulation results validate the motivation for this design and the correctness of the theoretical analysis, and show how the parameters affect system performance. Moreover, it is also shown that the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation.

  10. Cluster Cooperation in Wireless-Powered Sensor Networks: Modeling and Performance Analysis

    PubMed Central

    Zhang, Chao; Zhang, Pengcheng; Zhang, Weizhan

    2017-01-01

    A wireless-powered sensor network (WPSN) consisting of one hybrid access point (HAP), a near cluster and the corresponding far cluster is investigated in this paper. The sensors are wireless-powered and transmit information by consuming energy harvested from the signal emitted by the HAP. Sensors are able to harvest energy as well as store the harvested energy. We propose that if sensors in the near cluster do not have their own information to transmit, they can act as relays and help the sensors in the far cluster forward information to the HAP in an amplify-and-forward (AF) manner. We use a finite Markov chain to model the dynamic variation of the relay battery, and give a general analytical model for WPSNs with cluster cooperation. Through the model, we derive a closed-form expression for the outage probability as the performance metric of this network. Finally, simulation results validate the motivation for this design and the correctness of the theoretical analysis, and show how the parameters affect system performance. Moreover, it is also shown that the outage probability of sensors in the far cluster can be drastically reduced without sacrificing the performance of sensors in the near cluster if the transmit power of the HAP is fairly high. Furthermore, in terms of the outage performance of the far cluster, the proposed scheme significantly outperforms the direct transmission scheme without cooperation. PMID:28953231

  11. Density functional theory study of small X-doped Mg(n) (X = Fe, Co, Ni, n = 1-9) bimetallic clusters: equilibrium structures, stabilities, electronic and magnetic properties.

    PubMed

    Kong, Fanjie; Hu, Yanfei

    2014-03-01

    The geometries, stabilities, and electronic and magnetic properties of Mg(n) X (X = Fe, Co, Ni, n = 1-9) clusters were investigated systematically within the framework of the gradient-corrected density functional theory. The results show that the Mg(n)Fe, Mg(n)Co, and Mg(n)Ni clusters have similar geometric structures and that the X atom in Mg(n)X clusters prefers to be endohedrally doped. The average atomic binding energies, fragmentation energies, second-order differences in energy, and HOMO-LUMO gaps show that Mg₄X (X = Fe, Co, Ni) clusters possess relatively high stability. Natural population analysis was performed and the results showed that the 3s and 4s electrons always transfer to the 3d and 4p orbitals in the bonding atoms, and that electrons also transfer from the Mg atoms to the doped atoms (Fe, Co, Ni). In addition, the spin magnetic moments were analyzed and compared. Several clusters, such as Mg₁,₂,₃,₄,₅,₆,₈,₉Fe, Mg₁,₂,₄,₅,₆,₈,₉Co, and Mg₁,₂,₅,₆,₇,₉Ni, present high magnetic moments (4 μ(B), 3 μ(B), and 2 μ(B), respectively).

  12. A taxonomy of epithelial human cancer and their metastases

    PubMed Central

    2009-01-01

    Background: Microarray technology has made it possible to characterize many different cancer sites at the molecular level. This technology has the potential to individualize therapy and to discover new drug targets. However, due to technological differences and issues in standardized sample collection, no study has evaluated the molecular profile of epithelial human cancer in a large number of samples and tissues. Additionally, it has not yet been extensively investigated whether metastases resemble their tissue of origin or tissue of destination. Methods: We studied the expression profiles of a series of 1566 primary tumors and 178 metastases by unsupervised hierarchical clustering. The clustering profile was subsequently investigated and correlated with clinico-pathological data. Statistical enrichment of clinico-pathological annotations of groups of samples was investigated using the Fisher exact test. Gene set enrichment analysis (GSEA) and DAVID functional enrichment analysis were used to investigate the molecular pathways. Kaplan-Meier survival analysis and log-rank tests were used to investigate the prognostic significance of gene signatures. Results: Large clusters corresponding to breast, gastrointestinal, ovarian and kidney primary tissues emerged from the data. Chromophobe renal cell carcinoma clustered together with follicular differentiated thyroid carcinoma, which supports recent morphological descriptions of thyroid follicular carcinoma-like tumors in the kidney and suggests that they represent a subtype of chromophobe carcinoma. We also found an expression signature identifying primary tumors of squamous cell histology in multiple tissues. Next, a subset of ovarian tumors enriched with endometrioid histology clustered together with endometrium tumors, confirming that they share their etiopathogenesis, which strongly differs from that of serous ovarian tumors. In addition, the clustering of colon and breast tumors correlated with clinico-pathological characteristics. Moreover, a signature was developed based on our unsupervised clustering of breast tumors, and this was predictive for disease-specific survival in three independent studies. Next, the metastases from ovarian, breast, lung and vulva clustered with their tissue of origin, while metastases from colon showed a bimodal distribution: a significant part clustered with the tissue of origin while the remaining tumors clustered with the tissue of destination. Conclusion: Our molecular taxonomy of epithelial human cancer indicates surprising correlations over tissues. This may have a significant impact on the classification of many cancer sites and may guide pathologists, both in research and daily practice. Moreover, these results based on unsupervised analysis yielded a signature predictive of clinical outcome in breast cancer. Additionally, we hypothesize that metastases of gastrointestinal origin either remember their tissue of origin or adapt to the tissue of destination. More specifically, colon metastases in the liver show strong evidence for such a bimodal tissue-specific profile. PMID:20017941

  13. Identification and characterization of the ergochrome gene cluster in the plant pathogenic fungus Claviceps purpurea.

    PubMed

    Neubauer, Lisa; Dopstadt, Julian; Humpf, Hans-Ulrich; Tudzynski, Paul

    2016-01-01

    Claviceps purpurea is a phytopathogenic fungus infecting a broad range of grasses including economically important cereal crop plants. The infection cycle ends with the formation of the typical purple-black pigmented sclerotia containing the toxic ergot alkaloids. Besides these ergot alkaloids little is known about the secondary metabolism of the fungus. Red anthraquinone derivatives and yellow xanthone dimers (ergochromes) have been isolated from sclerotia and described as ergot pigments, but the corresponding gene cluster has remained unknown. Fungal pigments gain increasing interest for example as environmentally friendly alternatives to existing dyes. Furthermore, several pigments show biological activities and may have some pharmaceutical value. This study identified the gene cluster responsible for the synthesis of the ergot pigments. Overexpression of the cluster-specific transcription factor led to activation of the gene cluster and to the production of several known ergot pigments. Knock out of the cluster key enzyme, a nonreducing polyketide synthase, clearly showed that this cluster is responsible for the production of red anthraquinones as well as yellow ergochromes. Furthermore, a tentative biosynthetic pathway for the ergot pigments is proposed. By changing the culture conditions, pigment production was activated in axenic culture so that high concentration of phosphate and low concentration of sucrose induced pigment syntheses. This is the first functional analysis of a secondary metabolite gene cluster in the ergot fungus besides that for the classical ergot alkaloids. We demonstrated that this gene cluster is responsible for the typical purple-black color of the ergot sclerotia and showed that the red and yellow ergot pigments are products of the same biosynthetic pathway. 
Activation of the gene cluster in axenic culture opened up new possibilities for biotechnological applications such as dye production or the development of new pharmaceuticals.

  14. Clustering of energy balance-related behaviors and parental education in European children: the ENERGY-project.

    PubMed

    Fernández-Alvira, Juan M; De Bourdeaudhuij, Ilse; Singh, Amika S; Vik, Frøydis N; Manios, Yannis; Kovacs, Eva; Jan, Natasa; Brug, Johannes; Moreno, Luis A

    2013-01-15

    Recent research and literature reviews show that, among schoolchildren, some specific energy balance-related behaviors (EBRBs) are relevant for overweight and obesity prevention. It is also well known that the prevalence of overweight and obesity is considerably higher among schoolchildren from lower socio-economic backgrounds. This study examines whether sugared drinks intake, physical activity, screen time and usual sleep duration cluster in reliable and meaningful ways among European children, and whether the identified clusters could be characterized by parental education. The cross-sectional study comprised a total of 5284 children (46% male) from seven European countries participating in the ENERGY-project ("EuropeaN Energy balance Research to prevent excessive weight Gain among Youth"). Information on sugared drinks intake, physical activity, screen time and usual sleep duration was obtained using validated self-report questionnaires. Based on these behaviors, gender-specific cluster analysis was performed. Associations with parental education were identified using chi-square tests and odds ratios. Five meaningful and stable clusters were found for both genders. The cluster with high physical activity level showed the highest proportion of participants with highly educated parents, while clusters with high sugared drinks consumption, high screen time and low sleep duration were more prevalent in the group with lower educated parents. Odds ratios showed that children with lower educated parents were less likely to be allocated to the active cluster and more likely to be allocated to the low activity/sedentary pattern cluster. Children with lower educated parents seemed to be more likely to present unhealthier EBRB clustering, mainly characterized by their self-reported time spent on physical activity and screen viewing. 
Therefore, special focus should be given to lower educated parents and their children in order to develop effective primary prevention strategies.

  15. Health and disease phenotyping in old age using a cluster network analysis.

    PubMed

    Valenzuela, Jesus Felix; Monterola, Christopher; Tong, Victor Joo Chuan; Ng, Tze Pin; Larbi, Anis

    2017-11-15

    Human ageing is a complex trait that involves the synergistic action of numerous biological processes that interact to form a complex network. Here we performed a network analysis to examine the interrelationships between physiological and psychological functions, disease, disability, quality of life, lifestyle and behavioural risk factors for ageing in a cohort of 3,270 subjects aged ≥55 years. We considered associations between numerical and categorical descriptors using effect-size measures for each variable pair and identified clusters of variables from the resulting pairwise effect-size network and minimum spanning tree. We show, by way of a correspondence analysis between the two sets of clusters, that they correspond to coarse-grained and fine-grained structure of the network relationships. The clusters obtained from the minimum spanning tree mapped to various conceptual domains and corresponded to physiological and syndromic states. Hierarchical ordering of these clusters identified six common themes based on interactions with physiological systems and common underlying substrates of age-associated morbidity and disease chronicity, functional disability, and quality of life. These findings provide a starting point for in-depth analyses of ageing that incorporate immunologic, metabolomic and proteomic biomarkers, and ultimately offer low-level-based typologies of healthy and unhealthy ageing.

  16. First-principles study on stability, and growth strategies of small AlnZr (n=1-9) clusters

    NASA Astrophysics Data System (ADS)

    Li, Zhi; Zhou, Zhonghao; Wang, Hongbin; Li, Shengli; Zhao, Zhen

    2016-09-01

    The geometries, relative stability and growth strategies of the AlnZr (n=1-9) clusters are investigated with spin-polarized density functional theory using the BLYP functional. The results reveal that the AlnZr clusters are more likely to form densely packed structures than the AlN (N=1-10) clusters. The average binding energies of AlnZr are higher than those of AlN clusters. The AlnZr (n=3, 5, and 7) clusters are more stable than the others, as judged by the differences of the total binding energies. Mulliken population analysis for the AlnZr clusters shows that the electron adsorption ability of Zr is slightly lower than that of Al, except for the AlZr cluster. Local peaks of the HOMO-LUMO gap curve are found at n=3, 5, and 7. The reaction energies of AlnZr are higher, which means that AlnZr clusters react more readily with Al clusters. The Zr atom preferentially reacts with the Al2 cluster. Local peaks of the magnetic dipole moments are found at n=2, 5, and 8.

  17. Gas and galaxies in filaments between clusters of galaxies. The study of A399-A401

    NASA Astrophysics Data System (ADS)

    Bonjean, V.; Aghanim, N.; Salomé, P.; Douspis, M.; Beelen, A.

    2018-01-01

    We have performed a multi-wavelength analysis of two galaxy cluster systems selected with the thermal Sunyaev-Zel'dovich (tSZ) effect and composed of cluster pairs and an inter-cluster filament. We have focused on one pair of particular interest: A399-A401 at redshift z ~ 0.073, separated by 3 Mpc. We have also performed the first analysis of one lower-significance newly associated pair: A21-PSZ2 G114.09-34.34 at z ~ 0.094, separated by 4.2 Mpc. We have characterised the intra-cluster gas using the tSZ signal from Planck and, when possible, the galaxy optical and infrared (IR) properties based on two photometric redshift catalogues: 2MPZ and WISExSCOS. From the tSZ data, we measured the gas pressure in the clusters and in the inter-cluster filaments. In the case of A399-A401, the results are in perfect agreement with previous studies and, using the temperature measured from the X-rays, we further estimate the gas density in the filament and find n0 = (4.3 ± 0.7) × 10⁻⁴ cm⁻³. The optical and IR colour-colour and colour-magnitude analyses of the galaxies selected in the cluster system, together with their star formation rate, show no segregation between galaxy populations, both in the clusters and in the filament of A399-A401. Galaxies are all passive, early type, and red and dead. The gas and galaxy properties of this system suggest that the whole system formed at the same time and corresponds to a pre-merger, with a cosmic filament gas heated by the collapse. For the other cluster system, the tSZ analysis was performed and the pressure in the clusters and in the inter-cluster filament was constrained. However, the limited or nonexistent optical and IR data prevent us from concluding on the presence of an actual cosmic filament or from proposing a scenario.

  18. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    PubMed Central

    Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. 
We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters. PMID:27391786
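
    The two stand-alone metrics named in this abstract, modularity and conductance, can be illustrated on a toy graph. The sketch below is not the authors' code: the graph, the function names and the per-set definition of conductance are assumptions chosen for illustration, and published studies differ in how they aggregate conductance over clusters.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the community
    degree_sum = defaultdict(int)  # sum of node degrees per community
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


def conductance(edges, node_set):
    """Cut size divided by the smaller of the two volumes (degree sums)."""
    inside = set(node_set)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if node in inside:
                vol_in += 1
            else:
                vol_out += 1
        if (u in inside) != (v in inside):
            cut += 1
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


# Hypothetical toy graph: two triangles joined by a single bridge edge.
EDGES = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
MEMBERSHIP = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}

if __name__ == "__main__":
    print("modularity:", modularity(EDGES, MEMBERSHIP))
    print("conductance of A:", conductance(EDGES, [0, 1, 2]))
```

    Both functions need only the edge list, which is why such metrics are called stand-alone: unlike the adjusted Rand score or normalized mutual information, they require no ground-truth labelling.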

  19. Extracting Galaxy Cluster Gas Inhomogeneity from X-Ray Surface Brightness: A Statistical Approach and Application to Abell 3667

    NASA Astrophysics Data System (ADS)

    Kawahara, Hajime; Reese, Erik D.; Kitayama, Tetsu; Sasaki, Shin; Suto, Yasushi

    2008-11-01

    Our previous analysis indicates that small-scale fluctuations in the intracluster medium (ICM) from cosmological hydrodynamic simulations follow the lognormal probability density function. In order to test the lognormal nature of the ICM directly against X-ray observations of galaxy clusters, we develop a method of extracting statistical information about the three-dimensional properties of the fluctuations from the two-dimensional X-ray surface brightness. We first create a set of synthetic clusters with lognormal fluctuations around their mean profile given by spherical isothermal β-models, later considering polytropic temperature profiles as well. Performing mock observations of these synthetic clusters, we find that the resulting X-ray surface brightness fluctuations also follow the lognormal distribution fairly well. Systematic analysis of the synthetic clusters provides an empirical relation between the three-dimensional density fluctuations and the two-dimensional X-ray surface brightness. We analyze Chandra observations of the galaxy cluster Abell 3667, and find that its X-ray surface brightness fluctuations follow the lognormal distribution. While the lognormal model was originally motivated by cosmological hydrodynamic simulations, this is the first observational confirmation of the lognormal signature in a real cluster. Finally we check the synthetic cluster results against clusters from cosmological hydrodynamic simulations. As a result of the complex structure exhibited by simulated clusters, the empirical relation between the two- and three-dimensional fluctuation properties calibrated with synthetic clusters when applied to simulated clusters shows large scatter. Nevertheless we are able to reproduce the true value of the fluctuation amplitude of simulated clusters within a factor of 2 from their two-dimensional X-ray surface brightness alone. 
Our current methodology combined with existing observational data is useful in describing and inferring the statistical properties of the three-dimensional inhomogeneity in galaxy clusters.

  20. Genetic characterization of Uruguayan Pampa Rocha pigs with microsatellite markers

    PubMed Central

    Montenegro, M; Llambí, S; Castro, G; Barlocco, N; Vadell, A; Landi, V; Delgado, JV; Martínez, A

    2015-01-01

    In this study, we genetically characterized the Uruguayan pig breed Pampa Rocha. Genetic variability was assessed by analyzing a panel of 25 microsatellite markers from a sample of 39 individuals. Pampa Rocha pigs showed high genetic variability with observed and expected heterozygosities of 0.583 and 0.603, respectively. The mean number of alleles was 5.72. Twenty-four markers were polymorphic, with 95.8% of them in Hardy-Weinberg equilibrium. The level of endogamy was low (FIS = 0.0475). A factorial analysis of correspondence was used to assess the genetic differences between Pampa Rocha and other pig breeds; genetic distances were calculated, and a tree was constructed to reflect the distance matrix. Individuals were also allocated into clusters. This analysis showed that the Pampa Rocha breed was separated from the other breeds along the first and second axes. The neighbour-joining tree generated from the DA genetic distances showed clustering of Pampa Rocha with the Meishan breed. The allocation of individuals to clusters showed a clear separation of Pampa Rocha pigs. These results provide insights into the genetic variability of Pampa Rocha pigs and indicate that this breed is a well-defined genetic entity. PMID:25983624

  1. Coronal Mass Ejection Data Clustering and Visualization of Decision Trees

    NASA Astrophysics Data System (ADS)

    Ma, Ruizhe; Angryk, Rafal A.; Riley, Pete; Filali Boubrahimi, Soukaina

    2018-05-01

    Coronal mass ejections (CMEs) can be categorized as either “magnetic clouds” (MCs) or non-MCs. Features such as a large magnetic field, low plasma-beta, and low proton temperature suggest that a CME event is also an MC event; however, so far there is neither a definitive method nor an automatic process to distinguish the two. Human labeling is time-consuming, and results can fluctuate owing to the imprecise definition of such events. In this study, we approach the problem of MC and non-MC distinction from a time series data analysis perspective and show how clustering can shed some light on this problem. Although many algorithms exist for traditional data clustering in Euclidean space, they are not well suited for time series data. Problems such as inadequate distance measures, inaccurate cluster center description, and lack of intuitive cluster representations need to be addressed for effective time series clustering. Our data analysis in this work is twofold: clustering and visualization. For clustering, we compared the results from the popular hierarchical agglomerative clustering technique to a distance density clustering heuristic we developed previously for time series data clustering. In both cases, dynamic time warping is used as the similarity measure. For classification as well as visualization, we use decision trees to aggregate single-dimensional clustering results to form a multidimensional time series decision tree, with averaged time series to represent each decision. In this study, we achieved modest accuracy and, more importantly, an intuitive interpretation of how different parameters contribute to an MC event.
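
    Dynamic time warping, the similarity measure named in this abstract, can be sketched with the textbook dynamic-programming recurrence. This is a minimal illustration, not the authors' implementation: the function name, the absolute-difference local cost and the absence of a warping-window constraint are all assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    cost[i][j] holds the cheapest cumulative cost of aligning the first
    i points of `a` with the first j points of `b`.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] is matched to b[j-1] again
                cost[i][j - 1],      # b[j-1] is matched to a[i-1] again
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second series repeats one sample, which warping absorbs at no cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

    Because the alignment may stretch either series, two profiles with the same shape but different timing end up close together, which a point-by-point Euclidean distance would not capture.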

  2. Study on text mining algorithm for ultrasound examination of chronic liver diseases based on spectral clustering

    NASA Astrophysics Data System (ADS)

    Chang, Bingguo; Chen, Xiaofei

    2018-05-01

    Ultrasonography is an important examination for the diagnosis of chronic liver disease. The doctor reports the liver indicators and assesses the patient's condition according to the description in the ultrasound report. With the rapid increase in the volume of ultrasound report data, the workload of physicians who manually interpret ultrasound results increases significantly. In this paper, we use spectral clustering to perform cluster analysis of the descriptions in ultrasound reports and to generate the ultrasonic diagnosis automatically by machine learning. A total of 110 groups of ultrasound examination reports of chronic liver disease were selected as test samples in this experiment, and the spectral clustering results were validated and compared with those of the k-means clustering algorithm. The results show that the accuracy of spectral clustering is 92.73%, which is higher than that of the k-means clustering algorithm, and that the method provides a powerful aid to ultrasound diagnosis for patients with chronic liver disease.

  3. Chemical Fingerprint and Quantitative Analysis for the Quality Evaluation of Platycladi cacumen by Ultra-performance Liquid Chromatography Coupled with Hierarchical Cluster Analysis.

    PubMed

    Shan, Mingqiu; Li, Sam Fong Yau; Yu, Sheng; Qian, Yan; Guo, Shuchen; Zhang, Li; Ding, Anwei

    2018-01-01

    Platycladi cacumen (dried twigs and leaves of Platycladus orientalis (L.) Franco) is a frequently utilized Chinese medicinal herb. To evaluate the quality of the phytomedicine, an ultra-performance liquid chromatographic method with diode array detection was established for chemical fingerprinting and quantitative analysis. In this study, 27 batches of P. cacumen from different regions were collected for analysis. A chemical fingerprint with 20 common peaks was obtained using Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2004A). Among these 20 components, seven flavonoids (myricitrin, isoquercitrin, quercitrin, afzelin, cupressuflavone, amentoflavone and hinokiflavone) were identified and determined simultaneously. In the method validation, the seven analytes showed good regressions (R ≥ 0.9995) within linear ranges and good recoveries from 96.4% to 103.3%. Furthermore, with the contents of these seven flavonoids, hierarchical clustering analysis was applied to divide the 27 batches into five groups. The chemometric results showed that these groups were almost consistent with the geographical positions and climatic conditions of the production regions. Integrating fingerprint analysis, simultaneous determination and hierarchical clustering analysis, the established method is rapid, sensitive, accurate and readily applicable, and also provides a significant foundation for efficient quality control of P. cacumen. © The Author 2017. Published by Oxford University Press.

  4. The NGC 7742 star cluster luminosity function: a population analysis revisited

    NASA Astrophysics Data System (ADS)

    de Grijs, Richard; Ma, Chao

    2018-02-01

    We re-examine the properties of the star cluster population in the circumnuclear starburst ring in the face-on spiral galaxy NGC 7742, whose young cluster mass function has been reported to exhibit significant deviations from the canonical power law. We base our reassessment on the clusters’ luminosities (an observational quantity) rather than their masses (a derived quantity), and confirm conclusively that the galaxy’s starburst-ring clusters, and particularly the youngest subsample with log(t yr^-1) ≤ 7.2, show evidence of a turnover in the cluster luminosity function well above the 90% completeness limit adopted to ensure the reliability of our results. This confirmation emphasizes the unique conundrum posed by this unusual cluster population.

  5. Validation of hierarchical cluster analysis for identification of bacterial species using 42 bacterial isolates

    NASA Astrophysics Data System (ADS)

    Ghebremedhin, Meron; Yesupriya, Shubha; Luka, Janos; Crane, Nicole J.

    2015-03-01

    Recent studies have demonstrated the potential advantages of the use of Raman spectroscopy in the biomedical field due to its rapidity and noninvasive nature. In this study, Raman spectroscopy is applied as a method for differentiating between bacterial isolates by Gram status, genus and species. We created models for identifying 28 bacterial isolates using spectra collected with a 785 nm laser excitation Raman spectroscopic system. In order to investigate the groupings of these samples, partial least squares discriminant analysis (PLSDA) and hierarchical cluster analysis (HCA) were implemented. In addition, cluster analyses of the isolates were performed using various data types consisting of biochemical tests, gene sequence alignment, high resolution melt (HRM) analysis and antimicrobial susceptibility tests of minimum inhibitory concentration (MIC) and degree of antimicrobial resistance (SIR). In order to evaluate the ability of these models to correctly classify bacterial isolates using solely Raman spectroscopic data, a set of 14 validation samples was tested using the PLSDA models and subsequently the HCA models. External cluster evaluation criteria of purity and Rand index were calculated at different taxonomic levels to compare the performance of clustering using Raman spectra as well as the other datasets. Results showed that Raman spectra performed comparably to, and in some cases better than, the other data types, with Rand index and purity values up to 0.933 and 0.947, respectively. This study clearly demonstrates that the discrimination of bacterial species using Raman spectroscopic data and hierarchical cluster analysis is possible and has the potential to be a powerful point-of-care tool in clinical settings.
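
    The two external evaluation criteria reported in this abstract, purity and the Rand index, follow directly from their standard definitions. The sketch below uses hypothetical labels and function names; it is not the study's code or data.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_ids):
    """Fraction of samples that belong to the majority class of their cluster."""
    per_cluster = {}
    for label, cluster in zip(true_labels, cluster_ids):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in per_cluster.values())
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_ids):
    """Fraction of sample pairs on which the two labellings agree.

    A pair agrees when it is grouped together in both labellings or
    separated in both.
    """
    pairs = list(combinations(range(len(true_labels)), 2))
    agreeing = sum(
        (true_labels[i] == true_labels[j]) == (cluster_ids[i] == cluster_ids[j])
        for i, j in pairs
    )
    return agreeing / len(pairs)


if __name__ == "__main__":
    # Hypothetical example: six isolates, two genera, two clusters.
    genus = ["a", "a", "a", "b", "b", "b"]
    cluster = [0, 0, 1, 1, 1, 1]
    print("purity:", purity(genus, cluster))
    print("Rand index:", rand_index(genus, cluster))
```

    Purity rewards clusters dominated by one class but can be inflated by producing many small clusters, while the Rand index also penalises splitting a class, so reporting both gives a more balanced picture.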

  6. Application of 2D and 3D image technologies to characterise morphological attributes of grapevine clusters.

    PubMed

    Tello, Javier; Cubero, Sergio; Blasco, José; Tardaguila, Javier; Aleixos, Nuria; Ibáñez, Javier

    2016-10-01

    Grapevine cluster morphology influences the quality and commercial value of wine and table grapes. It is routinely evaluated by subjective and inaccurate methods that do not meet the requirements set by the food industry. Novel two-dimensional (2D) and three-dimensional (3D) machine vision technologies emerge as promising tools for its automatic and fast evaluation. The automatic evaluation of cluster length, width and elongation was successfully achieved by the analysis of 2D images, with significant and strong correlations with the manual methods (r = 0.959, 0.861 and 0.852, respectively). The classification of clusters according to their shape can be achieved by evaluating their conicity in different sections of the cluster. The geometric reconstruction of the morphological volume of the cluster from 2D features worked better than the direct 3D laser scanning system, showing a high correlation (r = 0.956) with the manual approach (water displacement method). In addition, we constructed and validated a simple linear regression model for cluster compactness estimation. It showed a high predictive capacity for both the training and validation subsets of clusters (R² = 84.5% and 71.1%, respectively). The methodologies proposed in this work provide continuous and accurate data for the fast and objective characterisation of cluster morphology. © 2016 Society of Chemical Industry.

  7. Rhizoma Dioscoreae extract protects against alveolar bone loss by regulating the cell cycle: A predictive study based on the protein‑protein interaction network.

    PubMed

    Zhang, Zhi-Guo; Song, Chang-Heng; Zhang, Fang-Zhen; Chen, Yan-Jing; Xiang, Li-Hua; Xiao, Gary Guishan; Ju, Da-Hong

    2016-06-01

    Rhizoma Dioscoreae extract (RDE) exhibits a protective effect on alveolar bone loss in ovariectomized (OVX) rats. The aim of this study was to predict the pathways or targets that are regulated by RDE, by re‑assessing our previously reported data and conducting a protein‑protein interaction (PPI) network analysis. In total, 383 differentially expressed genes (≥3‑fold) between alveolar bone samples from the RDE and OVX group rats were identified, and a PPI network was constructed based on these genes. Furthermore, four molecular clusters (A‑D) in the PPI network with the smallest P‑values were detected by the molecular complex detection (MCODE) algorithm. Using the Database for Annotation, Visualization and Integrated Discovery (DAVID) and Ingenuity Pathway Analysis (IPA) tools, two molecular clusters (A and B) were found to be enriched for biological processes in Gene Ontology (GO). Only cluster A was associated with biological pathways in the IPA database. GO and pathway analysis results showed that cluster A, associated with cell cycle regulation, was the most important molecular cluster in the PPI network. In addition, cyclin‑dependent kinase 1 (CDK1) may be a key molecule achieving the cell‑cycle‑regulatory function of cluster A. From the PPI network analysis, it was predicted that delayed cell cycle progression in excessive alveolar bone remodeling via downregulation of CDK1 may be another mechanism underlying the anti‑osteopenic effect of RDE on alveolar bone.

  8. Application of a XMM-Newton EPIC Monte Carlo to Analysis And Interpretation of Data for Abell 1689, RXJ0658-55 And the Centaurus Clusters of Galaxies

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andersson, Karl E.; /Stockholm U. /SLAC; Peterson, J.R.

    2007-04-17

    We propose a new Monte Carlo method to study extended X-ray sources with the European Photon Imaging Camera (EPIC) aboard XMM-Newton. The Smoothed Particle Inference (SPI) technique, described in a companion paper, is applied here to the EPIC data for the clusters of galaxies Abell 1689, Centaurus and RXJ 0658-55 (the "bullet cluster"). We aim to show the advantages of this method of simultaneous spectral-spatial modeling over traditional X-ray spectral analysis. In Abell 1689 we confirm our earlier findings about structure in temperature distribution and produce a high resolution temperature map. We also confirm our findings about velocity structure within the gas. In the bullet cluster, RXJ 0658-55, we produce the highest resolution temperature map ever to be published of this cluster, allowing us to trace what looks like the motion of the bullet in the cluster. We even detect a south to north temperature gradient within the bullet itself. In the Centaurus cluster we detect, by dividing up the luminosity of the cluster in bands of gas temperatures, a striking feature to the north-east of the cluster core. We hypothesize that this feature is caused by a subcluster left over from a substantial merger that slightly displaced the core. We conclude that our method is very powerful in determining the spatial distributions of plasma temperatures and very useful for systematic studies in cluster structure.

  9. Analysis of precipitation data in Bangladesh through hierarchical clustering and multidimensional scaling

    NASA Astrophysics Data System (ADS)

    Rahman, Md. Habibur; Matin, M. A.; Salma, Umma

    2017-12-01

    The precipitation patterns of seventeen locations in Bangladesh from 1961 to 2014 were studied using a cluster analysis and metric multidimensional scaling. In doing so, the current research applies four major hierarchical clustering methods to precipitation in conjunction with different dissimilarity measures and metric multidimensional scaling. A variety of clustering algorithms were used to provide multiple clustering dendrograms for a mixture of distance measures. The dendrogram of pre-monsoon rainfall for the seventeen locations formed five clusters. The pre-monsoon precipitation data for the areas of Srimangal and Sylhet were located in two clusters across the combination of five dissimilarity measures and four hierarchical clustering algorithms. The single linkage algorithm with Euclidean and Manhattan distances, the average linkage algorithm with the Minkowski distance, and Ward's linkage algorithm provided similar results with regard to monsoon precipitation. The results of the post-monsoon and winter precipitation data are shown in different types of dendrograms with disparate combinations of sub-clusters. The schematic geometrical representations of the precipitation data using metric multidimensional scaling showed that the post-monsoon rainfall of Cox's Bazar was located far from those of the other locations. The results of a box-and-whisker plot, different clustering techniques, and metric multidimensional scaling indicated that the precipitation behaviour of Srimangal and Sylhet during the pre-monsoon season, Cox's Bazar and Sylhet during the monsoon season, Maijdi Court and Cox's Bazar during the post-monsoon season, and Cox's Bazar and Khulna during the winter differed from those at other locations in Bangladesh.
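
    The combination of a dissimilarity measure with a linkage rule described in this abstract can be sketched in a few lines. The example below assumes the Minkowski family (p = 1 gives Manhattan, p = 2 gives Euclidean distance) and a naive single-linkage merge loop; the data points and function names are hypothetical and are not taken from the study.

```python
def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def single_linkage(points, k, p=2):
    """Naive agglomerative clustering with single linkage.

    Repeatedly merges the two clusters whose closest members are nearest,
    until k clusters remain. Returns clusters as sorted lists of indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None  # (distance, index of first cluster, index of second)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    minkowski(points[a], points[b], p)
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


if __name__ == "__main__":
    # Hypothetical two-feature summaries for five stations.
    stations = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
    print(single_linkage(stations, k=3, p=2))  # Euclidean
    print(single_linkage(stations, k=3, p=1))  # Manhattan
```

    Running the same linkage rule under several values of p is a simple way to see whether a grouping, such as an outlying station, is robust to the choice of distance.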

  10. Dark matter phenomenology of high-speed galaxy cluster collisions

    DOE PAGES

    Mishchenko, Yuriy; Ji, Chueng-Ryong

    2017-07-29

    Here, we perform a general computational analysis of possible post-collision mass distributions in high-speed galaxy cluster collisions in the presence of self-interacting dark matter. Using this analysis, we show that astrophysically weakly self-interacting dark matter can impart subtle yet measurable features in the mass distributions of colliding galaxy clusters even without significant disruptions to the dark matter halos of the colliding galaxy clusters themselves. The most profound such evidence is found to reside in the tails of dark matter halos’ distributions, in the space between the colliding galaxy clusters. Such features appear in our simulations as shells of scattered dark matter expanding in alignment with the outgoing original galaxy clusters, contributing significant densities to projected mass distributions at large distances from collision centers and large scattering angles of up to 90°. Our simulations indicate that as much as 20% of the collision’s total mass may be deposited into such structures without noticeable disruptions to the main galaxy clusters. Such structures at large scattering angles are forbidden in purely gravitational high-speed galaxy cluster collisions. Convincing identification of such structures in real colliding galaxy clusters would be a clear indication of the self-interacting nature of dark matter. Our findings may offer an explanation for the ring-like dark matter feature recently identified in the long-range reconstructions of the mass distribution of the colliding galaxy cluster CL0024+017.

  11. Dark matter phenomenology of high-speed galaxy cluster collisions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mishchenko, Yuriy; Ji, Chueng-Ryong

    Here, we perform a general computational analysis of possible post-collision mass distributions in high-speed galaxy cluster collisions in the presence of self-interacting dark matter. Using this analysis, we show that astrophysically weakly self-interacting dark matter can impart subtle yet measurable features in the mass distributions of colliding galaxy clusters even without significant disruptions to the dark matter halos of the colliding galaxy clusters themselves. Most profound such evidence is found to reside in the tails of dark matter halos’ distributions, in the space between the colliding galaxy clusters. Such features appear in our simulations as shells of scattered dark matter expanding in alignment with the outgoing original galaxy clusters, contributing significant densities to projected mass distributions at large distances from collision centers and large scattering angles of up to 90°. Our simulations indicate that as much as 20% of the total collision’s mass may be deposited into such structures without noticeable disruptions to the main galaxy clusters. Such structures at large scattering angles are forbidden in purely gravitational high-speed galaxy cluster collisions. Convincing identification of such structures in real colliding galaxy clusters would be a clear indication of the self-interacting nature of dark matter. Our findings may offer an explanation for the ring-like dark matter feature recently identified in the long-range reconstructions of the mass distribution of the colliding galaxy cluster CL0024+017.

  12. Density-based clustering: A 'landscape view' of multi-channel neural data for inference and dynamic complexity analysis.

    PubMed

    Baglietto, Gabriel; Gigante, Guido; Del Giudice, Paolo

    2017-01-01

    Two, partially interwoven, hot topics in the analysis and statistical modeling of neural data, are the development of efficient and informative representations of the time series derived from multiple neural recordings, and the extraction of information about the connectivity structure of the underlying neural network from the recorded neural activities. In the present paper we show that state-space clustering can provide an easy and effective option for reducing the dimensionality of multiple neural time series, that it can improve inference of synaptic couplings from neural activities, and that it can also allow the construction of a compact representation of the multi-dimensional dynamics, that easily lends itself to complexity measures. We apply a variant of the 'mean-shift' algorithm to perform state-space clustering, and validate it on an Hopfield network in the glassy phase, in which metastable states are largely uncorrelated from memories embedded in the synaptic matrix. In this context, we show that the neural states identified as clusters' centroids offer a parsimonious parametrization of the synaptic matrix, which allows a significant improvement in inferring the synaptic couplings from the neural activities. Moving to the more realistic case of a multi-modular spiking network, with spike-frequency adaptation inducing history-dependent effects, we propose a procedure inspired by Boltzmann learning, but extending its domain of application, to learn inter-module synaptic couplings so that the spiking network reproduces a prescribed pattern of spatial correlations; we then illustrate, in the spiking network, how clustering is effective in extracting relevant features of the network's state-space landscape. 
    Finally, we show that the knowledge of the cluster structure allows casting the multi-dimensional neural dynamics in the form of a symbolic dynamics of transitions between clusters; as an illustration of the potential of such reduction, we define and analyze a measure of complexity of the neural time series.
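
    The state-space clustering step mentioned in this abstract can be illustrated with a generic mean-shift sketch. This is not the authors' variant, which the abstract does not specify; it is a textbook flat-kernel mean-shift, and the function name, the bandwidth value, and the two-dimensional "states" below are assumptions made purely for illustration.

```python
import math

def mean_shift(points, bandwidth, iters=50):
    """Flat-kernel mean-shift: returns (centers, labels) for the given points."""
    modes = []
    for p in points:
        x = p
        for _ in range(iters):
            # Neighbours of the current estimate within the bandwidth.
            nbrs = [q for q in points if math.dist(x, q) <= bandwidth]
            # Shift the estimate to the mean of those neighbours.
            new_x = tuple(sum(c) / len(nbrs) for c in zip(*nbrs))
            if math.dist(new_x, x) < 1e-9:
                break  # the estimate has stopped moving
            x = new_x
        modes.append(x)

    # Merge modes that ended up closer than half a bandwidth apart.
    centers, labels = [], []
    for m in modes:
        for idx, c in enumerate(centers):
            if math.dist(m, c) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centers.append(m)
            labels.append(len(centers) - 1)
    return centers, labels

# Hypothetical 2-D "network states" forming two well-separated groups.
states = [(0.10, 0.10), (0.15, 0.05), (0.05, 0.12), (0.12, 0.08),
          (0.90, 0.90), (0.85, 0.95), (0.95, 0.88), (0.92, 0.93)]
centers, labels = mean_shift(states, bandwidth=0.3)
print(centers, labels)
```

    The sequence of labels is the kind of symbolic description the abstract refers to: each recorded state is replaced by the index of the cluster it falls into.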

  13. Nature of bonding and cooperativity in linear DMSO clusters: A DFT, AIM and NCI analysis.

    PubMed

    Venkataramanan, Natarajan Sathiyamoorthy; Suvitha, Ambigapathy

    2018-05-01

    This study aims to cast light on the nature of the interactions and cooperativity that exist in linear dimethyl sulfoxide (DMSO) clusters using dispersion-corrected density functional theory. In the linear clusters, DMSO molecules in the middle of the chain are bound more strongly than those at the terminals. The plots of total binding energy vs cluster size and of mean polarizability vs cluster size show excellent linearity, demonstrating the presence of a cooperativity effect. The computed incremental binding energy of the clusters remains nearly constant, implying that DMSO addition at the terminal site can continue to form an infinite chain. In the linear clusters, two σ-holes were found at the terminal DMSO molecules, and their value was found to increase with cluster size. The quantum theory of atoms in molecules topography shows the existence of hydrogen-bond and SO⋯S type interactions in the linear tetramer and larger clusters. In the dimer and trimer, SO⋯OS type interactions exist. In the 2D non-covalent interactions plot, additional peaks were observed in the regions that contribute to the stabilization of the clusters; these peaks split in the trimer and intensify in the larger clusters. In the trimer and larger clusters, in addition to the blue patches due to hydrogen bonds, light blue patches were seen between the hydrogen atoms of the methyl groups and the sulphur atom of the nearby DMSO molecule. Thus, in addition to the strong H-bonds, strong electrostatic interactions between the sulphur atom and methyl hydrogens exist in the linear clusters. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. Identification and analysis of ZFPM2 as a target gene of miR-17-92 cluster in chicken.

    PubMed

    Zhang, Xiao-fei; Song, He; Liu, Jing; Zhang, Wen-jian; Yan, Xiao-hong; Li, Hui; Wang, Ning

    2017-04-20

    The miR-17-92 cluster plays important roles in a variety of physiological and pathological processes in mammals. Previously, we showed that miR-17-92 cluster promotes chicken preadipocyte proliferation; however, the mechanism for its action is unknown. In order to explore the mechanism by which miR-17-92 cluster promotes chicken preadipocyte proliferation, CCK8 proliferation assay was performed to determine the effect of ZFPM2 knockdown on chicken preadipocyte proliferation. The results showed that ZFPM2 knockdown significantly promoted chicken preadipocyte proliferation (P<0.01). Consistent with the CCK8 results, the mRNA levels of cell proliferation marker genes, i.e., Cyclin D1, PCNA and Ki67, were markedly increased in the si-ZFPM2-transfected preadipocytes (P<0.01 or P<0.05). Bioinformatics analysis showed that there were two potential miRNA binding sites for the four individual members of miR-17-92 cluster in the ZFPM2 3'UTR, one for miR-17-5p and miR-20a and the other for miR-19a and miR-19b. To test whether ZFPM2 is a target for the miR-17-92 cluster, the ZFPM2 3'UTR reporter (psi-CHECK2-ZFPM2-3'UTR-WT) and its mutant reporter (psi-CHECK2-ZFPM2-3'UTR-MUT) were constructed. Reporter assays showed that overexpression of miR-17-92 cluster significantly inhibited the luciferase reporter activity of psi-CHECK2-ZFPM2-3'UTR-WT (P<0.01), as compared with control vector (empty pcDNA3.1). Transfection of miR-17-5p, miR-19a and miR-20a inhibitors increased the reporter activities of psi-CHECK2-ZFPM2-3'UTR-WT (P<0.01 or P<0.05). In contrast, transfection of miR-17-5p, miR-19a, and miR-20a inhibitors had no obvious effect on reporter activity of psi-CHECK2-ZFPM2-3'UTR-MUT. Further qRT-PCR analysis showed that miR-17-5p, miR-20a and miR-19a inhibitors significantly elevated the endogenous ZFPM2 mRNA expression (P<0.01 or P<0.05). 
    Cotransfection of either miR-17-5p or miR-19a inhibitor and siZFPM2 showed that both inhibitors tended to reduce only slightly the promoting effect of siZFPM2 on chicken preadipocyte proliferation. Taken together, these data demonstrated that ZFPM2 is a target of miR-17-5p, miR-20a, miR-19a, and miR-19b, and that miR-17-92 cluster promotes chicken preadipocyte proliferation at least in part by targeting ZFPM2 and inhibiting its expression.

  15. Cluster Analysis of Indonesian Province Based on Household Primary Cooking Fuel Using K-Means

    NASA Astrophysics Data System (ADS)

    Huda, S. N.

    2017-03-01

    Every household needs some installation for cooking. Kerosene, which is refined from petroleum, once dominated as the primary cooking fuel in Indonesia, although it is expensive and inefficient. Other households use LPG as their primary cooking fuel; however, LPG supply is also limited. In addition, the very diverse environments and cultures in Indonesia have led to a diversity of cooking installations, such as the wood-burning stove brazier. The government is also promoting alternative fuels, such as charcoal briquettes and fuel from biomass. The use of other fuels is part of an energy diversification that is expected to reduce community dependence on petroleum-based fuels. The variation in cooking fuels from one region to another reflects the distribution of basic fuel use by households. By knowing the characteristics of each province, the government can adopt policies appropriate to each province's character. A cluster analysis of all provinces in Indonesia based on the type of primary household cooking fuel is therefore valuable. Cluster analysis is done using the K-Means method with K ranging from 2 to 5. Cluster results are validated using the Silhouette Coefficient (SC). The results show that the highest SC is achieved at K = 2, with an SC value of 0.39135818388151. The two clusters divide the provinces of Indonesia into a cluster of more traditional provinces and a cluster of more modern provinces. The cluster results are then shown on a map using the Google Maps API.
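
    The procedure described in this abstract (run K-Means for K = 2 to 5 and keep the K with the highest Silhouette Coefficient) can be sketched as follows. This is a minimal illustration, not the author's code: the helper names are assumptions, and the two-feature "province" data (share of households using traditional fuel, share using LPG) are invented for the example and are not taken from the study.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center (Euclidean).
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

def silhouette(points, labels):
    """Mean silhouette coefficient over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical provinces: (traditional-fuel share, LPG share).
provinces = [(0.80, 0.10), (0.75, 0.15), (0.85, 0.05), (0.78, 0.12),
             (0.20, 0.70), (0.15, 0.80), (0.25, 0.72), (0.18, 0.75)]
scores = {k: silhouette(provinces, kmeans(provinces, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 3))
```

    On this invented data the two groups are far more cleanly separated than in the study, so the silhouette value is much higher than the reported 0.39; only the selection logic is meant to carry over.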

  16. An approach to functionally relevant clustering of the protein universe: Active site profile‐based clustering of protein structures and sequences

    PubMed Central

    Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.

    2017-01-01

    Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification—amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two‐Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure‐Function Linkage Database, SFLD) self‐identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self‐identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well‐curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP‐identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F‐measure and performance analysis on the enolase search results and comparison to GEMMA and SCI‐PHY demonstrate that TuLIP avoids the over‐division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422

  17. An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.

    PubMed

    Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S

    2017-04-01

    Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification-amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

  18. Open-Source Sequence Clustering Methods Improve the State Of the Art.

    PubMed

    Kopylova, Evguenia; Navas-Molina, Jose A; Mercier, Céline; Xu, Zhenjiang Zech; Mahé, Frédéric; He, Yan; Zhou, Hong-Wei; Rognes, Torbjørn; Caporaso, J Gregory; Knight, Rob

    2016-01-01

    Sequence clustering is a common early step in amplicon-based microbial community analysis, when raw sequencing reads are clustered into operational taxonomic units (OTUs) to reduce the run time of subsequent analysis steps. Here, we evaluated the performance of recently released state-of-the-art open-source clustering software products, namely, OTUCLUST, Swarm, SUMACLUST, and SortMeRNA, against current principal options (UCLUST and USEARCH) in QIIME, hierarchical clustering methods in mothur, and USEARCH's most recent clustering algorithm, UPARSE. All the latest open-source tools showed promising results, reporting up to 60% fewer spurious OTUs than UCLUST, indicating that the underlying clustering algorithm can vastly reduce the number of these derived OTUs. Furthermore, we observed that stringent quality filtering, such as is done in UPARSE, can cause a significant underestimation of species abundance and diversity, leading to incorrect biological results. Swarm, SUMACLUST, and SortMeRNA have been included in the QIIME 1.9.0 release. IMPORTANCE Massive collections of next-generation sequencing data call for fast, accurate, and easily accessible bioinformatics algorithms to perform sequence clustering. A comprehensive benchmark is presented, including open-source tools and the popular USEARCH suite. Simulated, mock, and environmental communities were used to analyze sensitivity, selectivity, species diversity (alpha and beta), and taxonomic composition. The results demonstrate that recent clustering algorithms can significantly improve accuracy and preserve estimated diversity without the application of aggressive filtering. Moreover, these tools are all open source, apply multiple levels of multithreading, and scale to the demands of modern next-generation sequencing data, which is essential for the analysis of massive multidisciplinary studies such as the Earth Microbiome Project (EMP) (J. A. Gilbert, J. K. Jansson, and R. Knight, BMC Biol 12:69, 2014, http://dx.doi.org/10.1186/s12915-014-0069-1).

  19. Clustering P-Wave Receiver Functions To Constrain Subsurface Seismic Structure

    NASA Astrophysics Data System (ADS)

    Chai, C.; Larmat, C. S.; Maceira, M.; Ammon, C. J.; He, R.; Zhang, H.

    2017-12-01

    The acquisition of high-quality data from permanent and temporary dense seismic networks provides the opportunity to apply statistical and machine learning techniques to a broad range of geophysical observations. Lekic and Romanowicz (2011) used clustering analysis on tomographic velocity models of the western United States to perform tectonic regionalization and the velocity-profile clusters agree well with known geomorphic provinces. A complementary and somewhat less restrictive approach is to apply cluster analysis directly to geophysical observations. In this presentation, we apply clustering analysis to teleseismic P-wave receiver functions (RFs) continuing efforts of Larmat et al. (2015) and Maceira et al. (2015). These earlier studies validated the approach with surface waves and stacked EARS RFs from the USArray stations. In this study, we experiment with both the K-means and hierarchical clustering algorithms. We also test different distance metrics defined in the vector space of RFs following Lekic and Romanowicz (2011). We cluster data from two distinct data sets. The first, corresponding to the western US, was by smoothing/interpolation of receiver-function wavefield (Chai et al. 2015). Spatial coherence and agreement with geologic region increase with this simpler, spatially smoothed set of observations. The second data set is composed of RFs for more than 800 stations of the China Digital Seismic Network (CSN). Preliminary results show a first order agreement between clusters and tectonic region and each region cluster includes a distinct Ps arrival, which probably reflects differences in crustal thickness. Regionalization remains an important step to characterize a model prior to application of full waveform and/or stochastic imaging techniques because of the computational expense of these types of studies. 
    Machine learning techniques can provide valuable information that can be used to design and characterize formal geophysical inversion, providing information on spatial variability in the subsurface geology.

  20. Spatio-temporal analysis of wildfire ignitions in the St. Johns River Water Management District, Florida

    Treesearch

    Marc G. Genton; David T. Butry; Marcia L. Gumpertz; Jeffrey P. Prestemon

    2006-01-01

    We analyse the spatio-temporal structure of wildfire ignitions in the St. Johns River Water Management District in north-eastern Florida. We show, using tools to analyse point patterns (e.g. the L-function), that wildfire events occur in clusters. Clustering of these events correlates with irregular distribution of fire ignitions, including lightning...
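
    The L-function named in this abstract is a rescaled form of Ripley's K-function, commonly defined as L(r) = sqrt(K(r) / π); under complete spatial randomness L(r) is approximately r, so values well above r suggest clustering at that scale. The sketch below uses the simplest estimator with no edge correction, which may differ from the estimator used in the study; the function name and both point patterns are assumptions made for illustration.

```python
import math

def l_function(points, r, area):
    """Naive estimate of Ripley's L at distance r (no edge correction)."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, lying within distance r.
    pairs = sum(1
                for i, p in enumerate(points)
                for j, q in enumerate(points)
                if i != j and math.dist(p, q) <= r)
    k_hat = area * pairs / (n * (n - 1))
    return math.sqrt(k_hat / math.pi)

# Hypothetical ignition locations in a unit square: two tight clumps.
offsets = [(dx, dy) for dx in (-0.02, 0.02)
           for dy in (-0.03, -0.01, 0.01, 0.03)]
clustered = [(cx + dx, cy + dy)
             for cx, cy in ((0.3, 0.3), (0.7, 0.7))
             for dx, dy in offsets]

# A regular 4 x 4 grid with the same number of points, for comparison.
regular = [(0.125 + 0.25 * i, 0.125 + 0.25 * j)
           for i in range(4) for j in range(4)]

r = 0.1
l_clustered = l_function(clustered, r, area=1.0)
l_regular = l_function(regular, r, area=1.0)
print(l_clustered - r, l_regular - r)  # positive suggests clustering, negative regularity
```

    For real ignition data, an edge-corrected estimator and a simulation envelope under randomness would normally be used before reading anything into L(r) - r.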

  1. Comment on "Steady-state properties of a totally asymmetric exclusion process with periodic structure"

    NASA Astrophysics Data System (ADS)

    Jiang, Rui; Hu, Mao-Bin; Wu, Qing-Song

    2008-07-01

    Lakatos et al. [Phys. Rev. E 71, 011103 (2005)] have studied a totally asymmetric exclusion process that contains periodically varying movement rates. They have presented a cluster mean-field theory for the problem. We show that their cluster mean-field theory leads to redundant equations. We present a mean-field analysis in which there is no redundant equation.

  2. Exploring the nature and synchronicity of early cluster formation in the Large Magellanic Cloud - III. Horizontal branch morphology

    NASA Astrophysics Data System (ADS)

    Wagner-Kaiser, R.; Mackey, Dougal; Sarajedini, Ata; Cohen, Roger E.; Geisler, Doug; Yang, Soung-Chul; Grocholski, Aaron J.; Cummings, Jeffrey D.

    2018-03-01

    We leverage new high-quality data from Hubble Space Telescope program GO-14164 to explore the variation in horizontal branch morphology among globular clusters in the Large Magellanic Cloud (LMC). Our new observations lead to photometry with a precision commensurate with that available for the Galactic globular cluster population. Our analysis indicates that, once metallicity is accounted for, clusters in the LMC largely share similar horizontal branch morphologies regardless of their location within the system. Furthermore, the LMC clusters possess, on average, slightly redder morphologies than most of the inner halo Galactic population; we find, instead, that their characteristics tend to be more similar to those exhibited by clusters in the outer Galactic halo. Our results are consistent with previous studies, showing a correlation between horizontal branch morphology and age.

  3. Prenatal Diagnosis and Molecular Analysis of a Large Novel Deletion (- -JS) Causing α0-Thalassemia.

    PubMed

    Cao, Jinru; He, Shuzhen; Pu, Yudong; Liu, Jingjing; Liu, Fuping; Feng, Jun

    α-Thalassemia (α-thal) is a very common single gene hereditary disease caused by large deletions or point mutations of the α-globin gene cluster in tropical and subtropical regions of the world. Here, we report for the first time, a novel large α-thal deletion in a Chinese family from Jiangsu Province, People's Republic of China (PRC), which removes almost the entire α2 and α1 genes from the α-globin gene cluster. Thus, it was named the Jiangsu deletion (- -JS) on the α-globin gene cluster causing α0-thal. Heterozygotes for this deletion showed an α-thal trait phenotype with reduced mean corpuscular volume (MCV) and mean corpuscular hemoglobin (Hb) (MCH) levels. The sequencing results showed that a 2538 bp deletion (NG_000006.1: g.35801_38338) existed in this novel genotype on the basis of -α4.2 (leftward), indicating a deletion of about 6.8 kb from the α-globin cluster. In addition, a 29 bp sequence was inserted into the deletion during the recombination events that led to this deletion. Through pedigree analysis, we knew that the proband inherited the novel allele from his mother.

  4. Analysis of ligand-protein exchange by Clustering of Ligand Diffusion Coefficient Pairs (CoLD-CoP).

    PubMed

    Snyder, David A; Chantova, Mihaela; Chaudhry, Saadia

    2015-06-01

    NMR spectroscopy is a powerful tool in describing protein structures and protein activity for pharmaceutical and biochemical development. This study describes a method to determine weak binding ligands in biological systems by using hierarchic diffusion coefficient clustering of multidimensional data obtained with a 400 MHz Bruker NMR. Comparison of DOSY spectrums of ligands of the chemical library in the presence and absence of target proteins show translational diffusion rates for small molecules upon interaction with macromolecules. For weak binders such as compounds found in fragment libraries, changes in diffusion rates upon macromolecular binding are on the order of the precision of DOSY diffusion measurements, and identifying such subtle shifts in diffusion requires careful statistical analysis. The "CoLD-CoP" (Clustering of Ligand Diffusion Coefficient Pairs) method presented here uses SAHN clustering to identify protein-binders in a chemical library or even a not fully characterized metabolite mixture. We will show how DOSY NMR and the "CoLD-CoP" method complement each other in identifying the most suitable candidates for lysozyme and wheat germ acid phosphatase. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. Variation of gunshot injury patterns in mortality associated with human rights abuses and armed conflict: an exploratory study.

    PubMed

    Baraybar, Jose Pablo

    2015-09-01

    The analysis of the distribution of gunshot injuries in a sample of 777 sets of human remains of proven human rights abuse from Somaliland, the Balkans and Peru is compared to frequencies of injuries sustained by combatants in contemporary conflicts reported in the literature. Principal Component Analysis (PCA) reduced the data to three components accounting for 82.94% of the variance. The first component with 38.31% of variance shows segments Arms and thorax/abdomen to be positively correlated (0.887 and 0.662, respectively); the segment head/neck is strongly correlated (0.951) to the second component while the segment thorax/abdomen shows a low, negative correlation (-0.388). Finally in the third component only the legs are strongly correlated (0.991). Data was further subjected to a K-means cluster analysis to determine the likely groupings combining the four types of injuries. Each of the three clusters reproduced similar patterns observed in the PCA: Cluster 1 shows the prevalence of injuries to the thorax/abdomen and extremities in addition to injuries to the head/neck; Cluster 2 shows injuries to the head/neck and Cluster 3 injuries to the thorax/abdomen and a lower representation of the arms and legs. Most of the cases (70.5%), irrespective of geography and type of site (attack or detention), were grouped into Cluster 2. Such comparison shows that in human rights abuse, irrespective of their geography, gunshot injuries tend to follow a pattern favouring the head/neck and thorax/abdomen areas over the extremities, the reverse pattern observed in contemporary combat operations. In those settings gunshot wound trauma is the second cause of mortality/morbidity (after fragmenting ammunition) and its distribution concentrates on the extremities, thorax/abdomen and head; following the pattern of protective armour when it is used. 
Considering that human rights abuses are often presented as encounters between two armed groups in the context of counter-insurgency operations, a careful analysis of gunshot injury patterns could serve as an indicator that in fact murder, rather than combat, took place and the intention was to kill rather than to maim or render people unfit for battle. To compare the variation of gunshot injury patterns between mortality associated with human rights abuses and armed conflict in selected samples from different countries. Literature review and case analysis. Original statistical analysis of gunshot injuries on human remains (n=777) recovered from mass or clandestine graves associated with human rights abuses in countries in Somaliland, the Balkans and Peru (1983-1995) and literature review of mortality caused by armed conflicts. Mechanism of gunshot injury and wound distribution pattern in geographically diverse samples of human rights abuse. Copyright © 2015 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.

  6. A Class of Manifold Regularized Multiplicative Update Algorithms for Image Clustering.

    PubMed

    Yang, Shangming; Yi, Zhang; He, Xiaofei; Li, Xuelong

    2015-12-01

    Multiplicative update algorithms are important tools for information retrieval, image processing, and pattern recognition. However, when the graph regularization is added to the cost function, different classes of sample data may be mapped to the same subspace, which leads to the increase of data clustering error rate. In this paper, an improved nonnegative matrix factorization (NMF) cost function is introduced. Based on the cost function, a class of novel graph regularized NMF algorithms is developed, which results in a class of extended multiplicative update algorithms with manifold structure regularization. Analysis shows that in the learning, the proposed algorithms can efficiently minimize the rank of the data representation matrix. Theoretical results presented in this paper are confirmed by simulations. For different initializations and data sets, variation curves of cost functions and decomposition data are presented to show the convergence features of the proposed update rules. Basis images, reconstructed images, and clustering results are utilized to present the efficiency of the new algorithms. Last, the clustering accuracies of different algorithms are also investigated, which shows that the proposed algorithms can achieve state-of-the-art performance in applications of image clustering.

  7. Predicting the points of interaction of small molecules in the NF-κB pathway

    PubMed Central

    2011-01-01

    Background: The similarity property principle has been used extensively in drug discovery to identify small compounds that interact with specific drug targets. Here we show it can be applied to identify the interactions of small molecules within the NF-κB signalling pathway. Results: Clusters that contain compounds with a predominant interaction within the pathway were created, which were then used to predict the interaction of compounds not included in the clustering analysis. Conclusions: The technique successfully predicted the points of interactions of compounds that are known to interact with the NF-κB pathway. The method was also shown to be successful when compounds for which the interaction points were unknown were included in the clustering analysis. PMID:21342508

  8. Tweets clustering using latent semantic analysis

    NASA Astrophysics Data System (ADS)

    Rasidi, Norsuhaili Mahamed; Bakar, Sakhinah Abu; Razak, Fatimah Abdul

    2017-04-01

    Social media are becoming overloaded with information due to the increasing number of information feeds. Unlike other social media, Twitter allows users to broadcast a short message called a "tweet". In this study, we extract tweets related to MH370 over a certain period of time. In this paper, we present an overview of our approach to tweet clustering to analyze users' responses to the tragedy of MH370. The tweets were clustered based on the frequency of terms obtained from the classification process. The method we used for the text classification is Latent Semantic Analysis. As a result, there are two types of tweets responding to the MH370 tragedy: emotional and non-emotional. We show some of our initial results to demonstrate the effectiveness of our approach.

  9. The X-ray cluster survey with eRosita: forecasts for cosmology, cluster physics and primordial non-Gaussianity

    NASA Astrophysics Data System (ADS)

    Pillepich, Annalisa; Porciani, Cristiano; Reiprich, Thomas H.

    2012-05-01

    Starting in late 2013, the eRosita telescope will survey the X-ray sky with unprecedented sensitivity. Assuming a detection limit of 50 photons in the (0.5-2.0) keV energy band with a typical exposure time of 1.6 ks, we predict that eRosita will detect ~9.3 × 10^4 clusters of galaxies more massive than 5 × 10^13 h^-1 M⊙, with the currently planned all-sky survey. Their median redshift will be z ≃ 0.35. We perform a Fisher-matrix analysis to forecast the constraining power of eRosita on the Λ cold dark matter (ΛCDM) cosmology and, simultaneously, on the X-ray scaling relations for galaxy clusters. Special attention is devoted to the possibility of detecting primordial non-Gaussianity. We consider two experimental probes: the number counts and the angular clustering of a photon-count limited sample of clusters. We discuss how the cluster sample should be split to optimize the analysis and we show that redshift information of the individual clusters is vital to break the strong degeneracies among the model parameters. For example, performing a 'tomographic' analysis based on photometric-redshift estimates and combining one- and two-point statistics will give marginal 1σ errors of Δσ8 ≃ 0.036 and ΔΩm ≃ 0.012 without priors, and improve the current estimates on the slope of the luminosity-mass relation by a factor of 3. Regarding primordial non-Gaussianity, eRosita clusters alone will give ΔfNL ≃ 9, 36 and 144 for the local, orthogonal and equilateral model, respectively. Measuring redshifts with spectroscopic accuracy would further tighten the constraints by nearly 40 per cent (barring fNL which displays smaller improvements). Finally, combining eRosita data with the analysis of temperature anisotropies in the cosmic microwave background by the Planck satellite should give sensational constraints on both the cosmology and the properties of the intracluster medium.

  10. Cross-correlating the γ-ray Sky with Catalogs of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro; Fornengo, Nicolao; Regis, Marco; Viel, Matteo; Xia, Jun-Qing

    2017-01-01

    We report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. We argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.

  11. Large-Scale Genomic Analysis of Codon Usage in Dengue Virus and Evaluation of Its Phylogenetic Dependence

    PubMed Central

    Lara-Ramírez, Edgar E.; Salazar, Ma Isabel; López-López, María de Jesús; Salas-Benito, Juan Santiago; Sánchez-Varela, Alejandro

    2014-01-01

    The increasing number of dengue virus (DENV) genome sequences available allows identifying the contributing factors to DENV evolution. In the present study, the codon usage in serotypes 1–4 (DENV1–4) has been explored for 3047 sequenced genomes using different statistical methods. The correlation analysis of total GC content (GC) with GC content at the three nucleotide positions of codons (GC1, GC2, and GC3) as well as the effective number of codons (ENC, ENCp) versus GC3 plots revealed mutational bias and purifying selection pressures as the major forces influencing the codon usage, but with distinct pressure on specific nucleotide positions in the codon. The correspondence analysis (CA) and clustering analysis on relative synonymous codon usage (RSCU) within each serotype showed similar clustering patterns to the phylogenetic analysis of nucleotide sequences for DENV1–4. These clustering patterns are strongly related to the virus geographic origin. The phylogenetic dependence analysis also suggests that stabilizing selection acts on the codon usage bias. Our large-scale analysis reveals new features of DENV genomic evolution. PMID:25136631
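
The relative synonymous codon usage (RSCU) values that feed the correspondence and clustering analyses can be sketched as follows. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The two-family codon table and the short sequence are illustrative assumptions, not data from the study; a real analysis would use the full genetic code.

```python
from collections import Counter

# Hypothetical two-family subset of the standard genetic code.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
}

def rscu(sequence, families=SYNONYMOUS):
    """Return RSCU per codon: observed count / mean count within its family."""
    usable = len(sequence) - len(sequence) % 3  # drop an incomplete trailing codon
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    values = {}
    for codon_list in families.values():
        total = sum(counts[c] for c in codon_list)
        if total == 0:
            continue  # amino acid absent: RSCU undefined, skip the family
        expected = total / len(codon_list)
        for c in codon_list:
            values[c] = counts[c] / expected
    return values

# Toy coding sequence: TTT TTT TTC AAA AAG
example = rscu("TTTTTTTTCAAAAAG")
print(example)
```

An RSCU above 1 marks a codon used more often than expected under uniform synonymous usage, and below 1 less often; vectors of these values per genome are what a clustering step would compare.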

  12. Choosing appropriate analysis methods for cluster randomised cross-over trials with a binary outcome.

    PubMed

    Morgan, Katy E; Forbes, Andrew B; Keogh, Ruth H; Jairath, Vipul; Kahan, Brennan C

    2017-01-30

    In cluster randomised cross-over (CRXO) trials, clusters receive multiple treatments in a randomised sequence over time. In such trials, there is usually correlation between patients in the same cluster. In addition, within a cluster, patients in the same period may be more similar to each other than to patients in other periods. We demonstrate that it is necessary to account for these correlations in the analysis to obtain correct Type I error rates. We then use simulation to compare different methods of analysing a binary outcome from a two-period CRXO design. Our simulations demonstrated that hierarchical models without random effects for period-within-cluster, which do not account for any extra within-period correlation, performed poorly with greatly inflated Type I errors in many scenarios. In scenarios where extra within-period correlation was present, a hierarchical model with random effects for cluster and period-within-cluster only had correct Type I errors when there were large numbers of clusters; with small numbers of clusters, the error rate was inflated. We also found that generalised estimating equations did not give correct error rates in any scenarios considered. An unweighted cluster-level summary regression performed best overall, maintaining an error rate close to 5% for all scenarios, although it lost power when extra within-period correlation was present, especially for small numbers of clusters. Results from our simulation study show that it is important to model both levels of clustering in CRXO trials, and that any extra within-period correlation should be accounted for. Copyright © 2016 John Wiley & Sons, Ltd.
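
The idea behind an unweighted cluster-level summary analysis can be sketched as below. This is a simplified illustration, not the authors' regression: it reduces each cluster to the difference in event proportions between its treatment and control periods, weights every cluster equally, and computes a one-sample t statistic on those differences, ignoring any period effect. The field names and the four clusters are hypothetical.

```python
import math
import statistics

def cluster_level_t(clusters):
    """Return (mean difference, t statistic, degrees of freedom).

    Each cluster contributes one number: proportion with the event in the
    treatment period minus proportion in the control period.
    """
    diffs = []
    for c in clusters:
        p_treat = c["treat_events"] / c["treat_n"]
        p_ctrl = c["ctrl_events"] / c["ctrl_n"]
        diffs.append(p_treat - p_ctrl)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / se, len(diffs) - 1

# Hypothetical two-period CRXO data, one record per cluster.
trial = [
    {"treat_events": 10, "treat_n": 100, "ctrl_events": 20, "ctrl_n": 100},
    {"treat_events": 12, "treat_n": 80, "ctrl_events": 16, "ctrl_n": 80},
    {"treat_events": 9, "treat_n": 60, "ctrl_events": 18, "ctrl_n": 60},
    {"treat_events": 30, "treat_n": 200, "ctrl_events": 50, "ctrl_n": 200},
]

mean_diff, t_stat, df = cluster_level_t(trial)
print(mean_diff, t_stat, df)
```

Because the unit of analysis is the cluster, the degrees of freedom depend on the number of clusters rather than the number of patients, which is consistent with the loss of power reported for small numbers of clusters.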

  13. Live-cell superresolution microscopy reveals the organization of RNA polymerase in the bacterial nucleoid

    PubMed Central

    Stracy, Mathew; Lesterlin, Christian; Garza de Leon, Federico; Uphoff, Stephan; Zawadzki, Pawel; Kapanidis, Achillefs N.

    2015-01-01

    Despite the fundamental importance of transcription, a comprehensive analysis of RNA polymerase (RNAP) behavior and its role in the nucleoid organization in vivo is lacking. Here, we used superresolution microscopy to study the localization and dynamics of the transcription machinery and DNA in live bacterial cells, at both the single-molecule and the population level. We used photoactivated single-molecule tracking to discriminate between mobile RNAPs and RNAPs specifically bound to DNA, either on promoters or transcribed genes. Mobile RNAPs can explore the whole nucleoid while searching for promoters, and spend 85% of their search time in nonspecific interactions with DNA. On the other hand, the distribution of specifically bound RNAPs shows that low levels of transcription can occur throughout the nucleoid. Further, clustering analysis and 3D structured illumination microscopy (SIM) show that dense clusters of transcribing RNAPs form almost exclusively at the nucleoid periphery. Treatment with rifampicin shows that active transcription is necessary for maintaining this spatial organization. In faster growth conditions, the fraction of transcribing RNAPs increases, as well as their clustering. Under these conditions, we observed dramatic phase separation between the densest clusters of RNAPs and the densest regions of the nucleoid. These findings show that transcription can cause spatial reorganization of the nucleoid, with movement of gene loci out of the bulk of DNA as levels of transcription increase. This work provides a global view of the organization of RNA polymerase and transcription in living cells. PMID:26224838

  14. Detecting hybridization between Iranian wild wolf (Canis lupus pallipes) and free-ranging domestic dog (Canis familiaris) by analysis of microsatellite markers.

    PubMed

    Khosravi, Rasoul; Rezaei, Hamid Reza; Kaboli, Mohammad

    2013-01-01

    The genetic threat due to hybridization with free-ranging dogs is one major concern in wolf conservation. The identification of hybrids and extent of hybridization is important in the conservation and management of wolf populations. Genetic variation was analyzed at 15 unlinked loci in 28 dogs, 28 wolves, four known hybrids, two black wolves, and one dog with abnormal traits in Iran. Pritchard's model, multivariate ordination by principal component analysis and neighbor joining clustering were used for population clustering and individual assignment. Analysis of genetic variation showed that genetic variability is high in both wolf and dog populations in Iran. Values of H(E) in dog and wolf samples ranged from 0.75-0.92 and 0.77-0.92, respectively. The results of AMOVA showed that the two groups of dog and wolf were significantly different (F(ST) = 0.05 and R(ST) = 0.36; P < 0.001). In each of the three methods, wolf and dog samples were separated into two distinct clusters. Two dark wolves were assigned to the wolf cluster. Also these models detected D32 (dog with abnormal traits) and some other samples, which were assigned to more than one cluster and could be a hybrid. This study is the beginning of a genetic study in wolf populations in Iran, and our results reveal that as in other countries, hybridization between wolves and dogs is sporadic in Iran and can be a threat to wolf populations if human perturbations increase.

  15. Transcriptomic analysis of neuregulin-1 regulated genes following ischemic stroke by computational identification of promoter binding sites: A role for the ETS-1 transcription factor.

    PubMed

    Surles-Zeigler, Monique C; Li, Yonggang; Distel, Timothy J; Omotayo, Hakeem; Ge, Shaokui; Ford, Byron D

    2018-01-01

    Ischemic stroke is a major cause of mortality in the United States. We previously showed that neuregulin-1 (NRG1) was neuroprotective in rat models of ischemic stroke. We used gene expression profiling to understand the early cellular and molecular mechanisms of NRG1's effects after the induction of ischemia. Ischemic stroke was induced by middle cerebral artery occlusion (MCAO). Rats were allocated to 3 groups: (1) control, (2) MCAO and (3) MCAO + NRG1. Cortical brain tissues were collected three hours following MCAO and NRG1 treatment and subjected to microarray analysis. Data and statistical analyses were performed using R/Bioconductor platform alongside Genesis, Ingenuity Pathway Analysis and Enrichr software packages. There were 2693 genes differentially regulated following ischemia and NRG1 treatment. These genes were organized by expression patterns into clusters using a K-means clustering algorithm. We further analyzed genes in clusters where ischemia altered gene expression, which was reversed by NRG1 (clusters 4 and 10). NRG1, IRS1, OPA3, and POU6F1 were central linking (node) genes in cluster 4. Conserved Transcription Factor Binding Site Finder (CONFAC) identified ETS-1 as a potential transcriptional regulator of NRG1 suppressed genes following ischemia. A transcription factor activity array showed that ETS-1 activity was increased 2-fold, 3 hours following ischemia and this activity was attenuated by NRG1. These findings reveal key early transcriptional mechanisms associated with neuroprotection by NRG1 in the ischemic penumbra.
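
Grouping genes by expression pattern with K-means can be sketched in a few lines. The profiles below are made-up values for three conditions (control, MCAO, MCAO + NRG1), and the starting centroids are chosen by hand for reproducibility; the study itself used dedicated software and its own settings.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm: assign to nearest centroid, then recompute means."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        new_centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
        if new_centroids == centroids:
            break  # assignments have stabilised
        centroids = new_centroids
    labels = [min(range(len(centroids)),
                  key=lambda i: math.dist(p, centroids[i])) for p in points]
    return labels, centroids

# Hypothetical expression profiles: (control, MCAO, MCAO + NRG1).
profiles = [
    (0.0, 2.0, 0.2),  # raised by ischemia, reversed by NRG1
    (0.1, 1.8, 0.0),
    (0.0, 2.2, 0.3),
    (0.0, 2.0, 1.9),  # raised by ischemia, not reversed
    (0.1, 2.1, 2.0),
]

labels, centroids = kmeans(profiles, [profiles[0], profiles[3]])
print(labels)
```

In this toy case the genes whose ischemia-induced change is reversed by NRG1 fall into one cluster and the remaining genes into the other, which mirrors the kind of pattern the abstract singles out for further analysis.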

  16. Galactic Doppelgängers: The Chemical Similarity Among Field Stars and Among Stars with a Common Birth Origin

    NASA Astrophysics Data System (ADS)

    Ness, M.; Rix, H.-W.; Hogg, David W.; Casey, A. R.; Holtzman, J.; Fouesneau, M.; Zasowski, G.; Geisler, D.; Shetrone, M.; Minniti, D.; Frinchaboy, Peter M.; Roman-Lopes, Alexandre

    2018-02-01

    We explore to what extent stars within Galactic disk open clusters resemble each other in the high-dimensional space of their photospheric element abundances and contrast this with pairs of field stars. Our analysis is based on abundances for 20 elements, homogeneously derived from APOGEE spectra (with carefully quantified uncertainties of typically 0.03 dex). We consider 90 red giant stars in seven open clusters and find that most stars within a cluster have abundances in most elements that are indistinguishable (in a χ²-sense) from those of the other members, as expected for stellar birth siblings. An analogous analysis among pairs of > 1000 field stars shows that highly significant abundance differences in the 20 dimensional space can be established for the vast majority of these pairs, and that the APOGEE-based abundance measurements have high discriminating power. However, pairs of field stars whose abundances are indistinguishable even at 0.03 dex precision exist: ∼0.3% of all field star pairs and ∼1.0% of field star pairs at the same (solar) metallicity [Fe/H] = 0 ± 0.02. Most of these pairs are presumably not birth siblings from the same cluster, but rather doppelgängers. Our analysis implies that “chemical tagging” in the strict sense, identifying birth siblings for typical disk stars through their abundance similarity alone, will not work with such data. However, our approach shows that abundances have extremely valuable information for probabilistic chemo-orbital modeling, and combined with velocities, we have identified new cluster members from the field.
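
One common way to ask whether two stars are indistinguishable "in a χ²-sense" is to sum the squared abundance differences, each scaled by the combined measurement variance. The sketch below assumes that form and uses three made-up element abundances instead of 20; it is an illustration, not the authors' exact statistic.

```python
def chi2_distance(a, b, sigma_a, sigma_b):
    """Sum over elements of (difference)^2 / (sigma_a^2 + sigma_b^2)."""
    return sum(
        (x - y) ** 2 / (sx ** 2 + sy ** 2)
        for x, y, sx, sy in zip(a, b, sigma_a, sigma_b)
    )

# Hypothetical abundances [X/H] in dex for three elements.
star_1 = [0.02, -0.05, 0.10]
star_2 = [0.05, -0.02, 0.07]   # differs by 0.03 dex in every element
star_3 = [0.30, -0.40, 0.45]   # clearly different composition
sigma = [0.03, 0.03, 0.03]     # typical uncertainty quoted in the abstract

close_pair = chi2_distance(star_1, star_2, sigma, sigma)
distant_pair = chi2_distance(star_1, star_3, sigma, sigma)
print(close_pair, distant_pair)
```

If two stars share the same true abundances and the errors are Gaussian, this sum is expected to be about the number of elements compared, so values near or below that count mark a pair as indistinguishable and much larger values mark a significant difference.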

  17. The disposition to understand for oneself at university: integrating learning processes with motivation and metacognition.

    PubMed

    Entwistle, Noel; McCune, Velda

    2013-06-01

    A re-analysis of several university-level interview studies has suggested that some students show evidence of a deep and stable approach to learning, along with other characteristics that support the approach. This combination, it was argued, could be seen to indicate a disposition to understand for oneself. To identify a group of students who showed high and consistent scores on deep approach, combined with equivalently high scores on effort and monitoring studying, and to explore these students' experiences of the teaching-learning environments they had experienced. Re-analysis of data from 1,896 students from 25 undergraduate courses taking four contrasting subject areas in eleven British universities. Inventories measuring approaches to studying were given at the beginning and the end of a semester, with the second inventory also exploring students' experiences of teaching. K-means cluster analysis was used to identify groups of students with differing patterns of response on the inventory scales, with a particular focus on students showing high, stable scores. One cluster clearly showed the characteristics expected of the disposition to understand and was also fairly stable over time. Other clusters also had deep approaches, but also showed either surface elements or lower scores on organized effort or monitoring their studying. Combining these findings with interview studies previously reported reinforces the idea of there being a disposition to understand for oneself that could be identified from an inventory scale or through further interviews. © 2013 The British Psychological Society.

  18. Sensory characteristics and consumer preference for chicken meat in Guinea.

    PubMed

    Sow, T M A; Grongnet, J F

    2010-10-01

    This study identified the sensory characteristics and consumer preference for chicken meat in Guinea. Five chicken samples [live village chicken, live broiler, live spent laying hen, ready-to-cook broiler, and ready-to-cook broiler (imported)] bought from different locations were assessed by 10 trained panelists using 19 sensory attributes. The ANOVA results showed that 3 chicken appearance attributes (brown, yellow, and white), 5 chicken odor attributes (oily, intense, medicine smell, roasted, and mouth persistent), 3 chicken flavor attributes (sweet, bitter, and astringent), and 8 chicken texture attributes (firm, tender, juicy, chew, smooth, springy, hard, and fibrous) were significantly discriminating between the chicken samples (P<0.05). Principal component analysis of the sensory data showed that the first 2 principal components explained 84% of the sensory data variance. The principal component analysis results showed that the live village chicken, the live spent laying hen, and the ready-to-cook broiler (imported) were very well represented and clearly distinguished from the live broiler and the ready-to-cook broiler. One hundred twenty consumers expressed their preferences for the chicken samples using a 5-point Likert scale. The hierarchical cluster analysis of the preference data identified 4 homogenous consumer clusters. The hierarchical cluster analysis results showed that the live village chicken was the most preferred chicken sample, whereas the ready-to-cook broiler was the least preferred one. The partial least squares regression (PLSR) type 1 showed that 72% of the sensory data for the first 2 principal components explained 83% of the chicken preference. The PLSR1 identified that the sensory characteristics juicy, oily, sweet, hard, mouth persistent, and yellow were the most relevant sensory drivers of the Guinean chicken preference. 
    The PLSR2 (with multiple responses) identified the relationship between the chicken samples, their sensory attributes, and the consumer clusters. Our results showed that there was not a chicken category that was exclusively preferred from the other chicken samples and therefore highlight the existence of place for development of all chicken categories in the local market.

  19. Hyperspectral remote sensing for advanced detection of early blight (Alternaria solani) disease in potato (Solanum tuberosum) plants

    NASA Astrophysics Data System (ADS)

    Atherton, Daniel

    Early detection of disease and insect infestation within crops and precise application of pesticides can help reduce potential production losses, reduce environmental risk, and reduce the cost of farming. The goal of this study was the advanced detection of early blight (Alternaria solani) in potato (Solanum tuberosum) plants using hyperspectral remote sensing data captured with a handheld spectroradiometer. Hyperspectral reflectance spectra were captured 10 times over five weeks from plants grown to the vegetative and tuber bulking growth stages. The spectra were analyzed using principal component analysis (PCA), spectral change (ratio) analysis, partial least squares (PLS), cluster analysis, and vegetative indices. PCA successfully distinguished more heavily diseased plants from healthy and minimally diseased plants using two principal components. Spectral change (ratio) analysis provided wavelengths (490-510, 640, 665-670, 690, 740-750, and 935 nm) most sensitive to early blight infection followed by ANOVA results indicating a highly significant difference (p < 0.0001) between disease rating group means. In the majority of the experiments, comparisons of diseased plants with healthy plants using Fisher's LSD revealed more heavily diseased plants were significantly different from healthy plants. PLS analysis demonstrated the feasibility of detecting early blight infected plants, finding four optimal factors for raw spectra with the predictor variation explained ranging from 93.4% to 94.6% and the response variation explained ranging from 42.7% to 64.7%. Cluster analysis successfully distinguished healthy plants from all diseased plants except for the most mildly diseased plants, showing clustering analysis was an effective method for detection of early blight. 
    Analysis of the reflectance spectra using the simple ratio (SR) and the normalized difference vegetative index (NDVI) was effective at differentiating all diseased plants from healthy plants, except for the most mildly diseased plants. Of the analysis methods attempted, cluster analysis and vegetative indices were the most promising. The results show the potential of hyperspectral remote sensing for the detection of early blight in potato plants.
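
The two vegetative indices named above have standard definitions: SR = NIR / Red and NDVI = (NIR − Red) / (NIR + Red), where NIR and Red are reflectances in a near-infrared and a red band. The sketch below applies them to made-up reflectance values; the band choice and the numbers are assumptions for illustration, not measurements from the study.

```python
def simple_ratio(nir, red):
    """Simple ratio (SR): near-infrared reflectance over red reflectance."""
    return nir / red

def ndvi(nir, red):
    """Normalized difference vegetation index, bounded between -1 and 1."""
    return (nir - red) / (nir + red)

# Hypothetical canopy reflectances (fractions between 0 and 1).
healthy = {"nir": 0.50, "red": 0.05}
diseased = {"nir": 0.35, "red": 0.10}

for name, plant in (("healthy", healthy), ("diseased", diseased)):
    print(name, simple_ratio(**plant), ndvi(**plant))
```

Both indices fall when red reflectance rises and near-infrared reflectance drops, which is the pattern typically associated with stressed foliage; the two are monotonically related through NDVI = (SR − 1) / (SR + 1).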

  20. Cluster Analysis of Vulnerable Groups in Acute Traumatic Brain Injury Rehabilitation.

    PubMed

    Kucukboyaci, N Erkut; Long, Coralynn; Smith, Michelle; Rath, Joseph F; Bushnik, Tamara

    2018-01-06

    To analyze the complex relation between various social indicators that contribute to socioeconomic status and health care barriers. Cluster analysis of historical patient data obtained from inpatient visits. Inpatient rehabilitation unit in a large urban university hospital. Adult patients (N=148) receiving acute inpatient care, predominantly for closed head injury. Not applicable. We examined the membership of patients with traumatic brain injury in various "vulnerable group" clusters (eg, homeless, unemployed, racial/ethnic minority) and characterized the rehabilitation outcomes of patients (eg, duration of stay, changes in FIM scores between admission to inpatient stay and discharge). The cluster analysis revealed 4 major clusters (ie, clusters A-D) separated by vulnerable group memberships, with distinct durations of stay and FIM gains during their stay. Cluster B, the largest cluster and also consisting of mostly racial/ethnic minorities, had the shortest duration of hospital stay and one of the lowest FIM improvements among the 4 clusters despite higher FIM scores at admission. In cluster C, also consisting of mostly ethnic minorities with multiple socioeconomic status vulnerabilities, patients were characterized by low cognitive FIM scores at admission and the longest duration of stay, and they showed good improvement in FIM scores. Application of clustering techniques to inpatient data identified distinct clusters of patients who may experience differences in their rehabilitation outcome due to their membership in various "at-risk" groups. The results identified patients (ie, cluster B, with minority patients; and cluster D, with elderly patients) who attain below-average gains in brain injury rehabilitation. 
    The results also suggested that systemic (eg, duration of stay) or clinical service improvements (eg, staff's language skills, ability to offer substance abuse therapy, provide appropriate referrals, liaise with intensive social work services, or plan subacute rehabilitation phase) could be beneficial for acute settings. Stronger recruitment, training, and retention initiatives for bilingual and multiethnic professionals may also be considered to optimize gains from acute inpatient rehabilitation after traumatic brain injury. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.

  1. Multiple co-clustering based on nonparametric mixture models with heterogeneous marginal distributions

    PubMed Central

    Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392

  2. Molecular characterization and population structure study of cambuci: strategy for conservation and genetic improvement.

    PubMed

    Santos, D N; Nunes, C F; Setotaw, T A; Pio, R; Pasqual, M; Cançado, G M A

    2016-12-19

    Cambuci (Campomanesia phaea) belongs to the Myrtaceae family and is native to the Atlantic Forest of Brazil. It has ecological and social appeal but is exposed to problems associated with environmental degradation and expansion of agricultural activities in the region. Comprehensive studies on this species are rare, making its conservation and genetic improvement difficult. Thus, it is important to develop research activities to understand the current situation of the species as well as to make recommendations for its conservation and use. This study was performed to characterize the cambuci accessions found in the germplasm bank of Coordenadoria de Assistência Técnica Integral using inter-simple sequence repeat markers, with the goal of understanding the plant's population structure. The results showed the existence of some level of genetic diversity among the cambuci accessions that could be exploited for the genetic improvement of the species. Principal coordinate analysis and discriminant analysis clustered the 80 accessions into three groups, whereas Bayesian model-based clustering analysis clustered them into two groups. The formation of two cluster groups and the high membership coefficients within the groups pointed out the importance of further collection to cover more areas and more genetic variability within the species. The study also showed the lack of conservation activities; therefore, more attention from the appropriate organizations is needed to plan and implement natural and ex situ conservation activities.

  3. Co-clustering directed graphs to discover asymmetries and directional communities

    PubMed Central

    Rohe, Karl; Qin, Tai; Yu, Bin

    2016-01-01

    In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim. To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction. PMID:27791058
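
Two of the steps named in this abstract, the regularized graph Laplacian and the projection of rows onto the sphere, can be sketched as below. The abstract gives no formulas, so the form used here, L[i][j] = A[i][j] / sqrt((out_i + τ)(in_j + τ)) with τ defaulting to the average degree, is an assumption, and the function names are hypothetical. The singular value decomposition that sits between the two steps and the final clustering of the projected rows are omitted.

```python
import math

def regularized_laplacian(adj, tau=None):
    """Scale each edge by regularized out-degree of its source and
    regularized in-degree of its target (assumed form, see lead-in)."""
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    in_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    if tau is None:
        tau = sum(out_deg) / n  # assumed default: average degree
    if tau <= 0:
        raise ValueError("tau must be positive to avoid division by zero")
    return [
        [adj[i][j] / math.sqrt((out_deg[i] + tau) * (in_deg[j] + tau))
         for j in range(n)]
        for i in range(n)
    ]

def project_rows_to_sphere(matrix):
    """Rescale every non-zero row to unit Euclidean length."""
    projected = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        projected.append([v / norm for v in row] if norm > 0 else list(row))
    return projected

# Toy directed graph: node 0 sends to 1 and 2, node 1 sends to 2.
adjacency = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

laplacian = regularized_laplacian(adjacency)
unit_rows = project_rows_to_sphere(laplacian)
print(laplacian)
print(unit_rows)
```

Adding τ keeps low-degree nodes from dominating the scaling in sparse graphs, and the row projection removes the effect of degree heterogeneity on row length, which matches the motivation the abstract gives for both steps.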

  4. Co-clustering directed graphs to discover asymmetries and directional communities.

    PubMed

    Rohe, Karl; Qin, Tai; Yu, Bin

    2016-10-21

    In directed graphs, relationships are asymmetric and these asymmetries contain essential structural information about the graph. Directed relationships lead to a new type of clustering that is not feasible in undirected graphs. We propose a spectral co-clustering algorithm called di-sim for asymmetry discovery and directional clustering. A Stochastic co-Blockmodel is introduced to show favorable properties of di-sim. To account for the sparse and highly heterogeneous nature of directed networks, di-sim uses the regularized graph Laplacian and projects the rows of the eigenvector matrix onto the sphere. A nodewise asymmetry score and di-sim are used to analyze the clustering asymmetries in the networks of Enron emails, political blogs, and the Caenorhabditis elegans chemical connectome. In each example, a subset of nodes have clustering asymmetries; these nodes send edges to one cluster, but receive edges from another cluster. Such nodes yield insightful information (e.g., communication bottlenecks) about directed networks, but are missed if the analysis ignores edge direction.

  5. Complex time series analysis of PM10 and PM2.5 for a coastal site using artificial neural network modelling and k-means clustering

    NASA Astrophysics Data System (ADS)

    Elangasinghe, M. A.; Singhal, N.; Dirks, K. N.; Salmond, J. A.; Samarasinghe, S.

    2014-09-01

    This paper uses artificial neural networks (ANN), combined with k-means clustering, to understand the complex time series of PM10 and PM2.5 concentrations at a coastal location of New Zealand based on data from a single site. Out of available meteorological parameters from the network (wind speed, wind direction, solar radiation, temperature, relative humidity), key factors governing the pattern of the time series concentrations were identified through input sensitivity analysis performed on the trained neural network model. The transport pathways of particulate matter under these key meteorological parameters were further analysed through bivariate concentration polar plots and k-means clustering techniques. The analysis shows that the external sources such as marine aerosols and local sources such as traffic and biomass burning contribute equally to the particulate matter concentrations at the study site. These results are in agreement with the results of receptor modelling by the Auckland Council based on Positive Matrix Factorization (PMF). Our findings also show that contrasting concentration-wind speed relationships exist between marine aerosols and local traffic sources resulting in very noisy and seemingly large random PM10 concentrations. The inclusion of cluster rankings as an input parameter to the ANN model showed a statistically significant (p < 0.005) improvement in the performance of the ANN time series model and also showed better performance in picking up high concentrations. For the presented case study, the correlation coefficient between observed and predicted concentrations improved from 0.77 to 0.79 for PM2.5 and from 0.63 to 0.69 for PM10 and reduced the root mean squared error (RMSE) from 5.00 to 4.74 for PM2.5 and from 6.77 to 6.34 for PM10. 
    The techniques presented here enable the user to obtain an understanding of potential sources and their transport characteristics prior to the implementation of costly chemical analysis techniques or advanced air dispersion models.

  6. Supervised group Lasso with applications to microarray data analysis

    PubMed Central

    Ma, Shuangge; Song, Xiao; Huang, Jian

    2007-01-01

    Background: A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results: We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion: We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436

  7. Blue Straggler Stars in the Globular Cluster M53

    NASA Astrophysics Data System (ADS)

    Rey, S. C.; Lee, Young-Wook; Chun, Mun-Suk; Byun, Yong-Ik

    The first large-format CCD color-magnitude diagram (CMD) in the B and V passbands is presented for the Galactic globular cluster M53 (NGC 5024). We have discovered more than 100 new blue straggler (BS) candidates in the field of M53. The analysis of bright BS stars (V < 19.0) clearly shows a bimodal radial distribution, with a high frequency in the inner and outer regions. The distribution is similar to that found in M3, a globular cluster with similar central density and concentration.

  8. Career paths in physicians' postgraduate training - an eight-year follow-up study.

    PubMed

    Buddeberg-Fischer, Barbara; Stamm, Martina; Klaghofer, Richard

    2010-10-06

    To date, there are hardly any studies on the choice of career path in medical school graduates. The present study aimed to investigate what career paths can be identified in the course of postgraduate training of physicians; what factors have an influence on the choice of a career path; and in what way the career paths are correlated with career-related factors as well as with work-life balance aspirations. The data reported originates from five questionnaire surveys of the prospective SwissMedCareer Study, beginning in 2001 (T1, last year of medical school). The study sample consisted of 358 physicians (197 females, 55%; 161 males, 45%) participating at each assessment from T2 (2003, first year of residency) to T5 (2009, seventh year of residency), answering the question: What career do you aspire to have? Furthermore, personal characteristics, chosen specialty, career motivation, mentoring experience, work-life balance as well as workload, career success and career satisfaction were assessed. Career paths were analysed with cluster analysis, and differences between clusters analysed with multivariate methods. The cluster analysis revealed four career clusters which discriminated distinctly between each other: (1) career in practice, (2) hospital career, (3) academic career, and (4) changing career goal. From T3 (third year of residency) to T5, respondents in Cluster 1-3 were rather stable in terms of their career path aspirations, while those assigned to Cluster 4 showed a high fluctuation in their career plans. Physicians in Cluster 1 showed high values in extraprofessional concerns and often consider part-time work. Cluster 2 and 3 were characterised by high instrumentality, intrinsic and extrinsic career motivation, career orientation and high career success. No cluster differences were seen in career satisfaction. In Cluster 1 and 4, females were overrepresented. 
Trainees should be supported to stay on the career path that best suits their personal and professional profile. Attention should be paid to the subgroup of physicians in Cluster 4 who switch from one career goal to another in the course of their postgraduate training.

  9. White Matter Tract Integrity in Alzheimer's Disease vs. Late Onset Bipolar Disorder and Its Correlation with Systemic Inflammation and Oxidative Stress Biomarkers.

    PubMed

    Besga, Ariadna; Chyzhyk, Darya; Gonzalez-Ortega, Itxaso; Echeveste, Jon; Graña-Lecuona, Marina; Graña, Manuel; Gonzalez-Pinto, Ana

    2017-01-01

    Background: Late Onset Bipolar Disorder (LOBD) is the development of Bipolar Disorder (BD) at an age above 50 years old. It is often difficult to differentiate from other aging dementias, such as Alzheimer's Disease (AD), because they share cognitive and behavioral impairment symptoms. Objectives: We look for WM tract voxel clusters showing significant differences when comparing AD vs. LOBD, and their correlations with systemic blood plasma biomarkers (inflammatory, neurotrophic factors, and oxidative stress). Materials: A sample of healthy controls (HC) (n = 19), AD patients (n = 35), and LOBD patients (n = 24) was recruited at the Alava University Hospital. Blood plasma samples were obtained at recruitment time and analyzed to extract the inflammatory, oxidative stress, and neurotrophic factors. Several modalities of MRI were acquired for each subject. Methods: Fractional anisotropy (FA) coefficients are obtained from diffusion weighted imaging (DWI). Tract based spatial statistics (TBSS) finds FA skeleton clusters of WM tract voxels showing significant differences for all possible contrasts between HC, AD, and LOBD. An ANOVA F-test over all contrasts is carried out. Results of the F-test are used to mask TBSS detected clusters for the AD > LOBD and LOBD > AD contrasts to select the image clusters used for correlation analysis. Finally, Pearson's correlation coefficients between FA values at cluster sites and systemic blood plasma biomarker values are computed. Results: The TBSS contrasts masked by the ANOVA F-test identified strongly significant clusters in the forceps minor, inferior longitudinal fasciculus, inferior fronto-occipital fasciculus, and cingulum gyrus. The correlation analysis of these tract clusters found strong negative correlation of AD with the nerve growth factor (NGF) and brain derived neurotrophic factor (BDNF) blood biomarkers. Negative correlation of AD and positive correlation of LOBD with inflammation biomarker IL6 was also found. 
Conclusion: The tract atlas localizations of the TBSS voxel clusters are consistent with greater behavioral impairment and mood disorders in LOBD than in AD. Correlation analysis confirms that neurotrophic factors (i.e., NGF, BDNF) play a major role in AD while being absent from LOBD pathophysiology. Also, correlation results of IL1 and IL6 suggest stronger inflammatory effects in LOBD than in AD.

  10. Track structure in radiation biology: theory and applications.

    PubMed

    Nikjoo, H; Uehara, S; Wilson, W E; Hoshi, M; Goodhead, D T

    1998-04-01

    A brief review is presented of the basic concepts in track structure and the relative merit of various theoretical approaches adopted in Monte-Carlo track-structure codes are examined. In the second part of the paper, a formal cluster analysis is introduced to calculate cluster-distance distributions. Total experimental ionization cross-sections were least-square fitted and compared with the calculation by various theoretical methods. Monte-Carlo track-structure code Kurbuc was used to examine and compare the spectrum of the secondary electrons generated by using functions given by Born-Bethe, Jain-Khare, Gryzinsky, Kim-Rudd, Mott and Vriens' theories. The cluster analysis in track structure was carried out using the k-means method and Hartigan algorithm. Data are presented on experimental and calculated total ionization cross-sections: inverse mean free path (IMFP) as a function of electron energy used in Monte-Carlo track-structure codes; the spectrum of secondary electrons generated by different functions for 500 eV primary electrons; cluster analysis for 4 MeV and 20 MeV alpha-particles in terms of the frequency of total cluster energy to the root-mean-square (rms) radius of the cluster and differential distance distributions for a pair of clusters; and finally relative frequency distribution for energy deposited in DNA, single-strand break and double-strand breaks for 10MeV/u protons, alpha-particles and carbon ions. There are a number of Monte-Carlo track-structure codes that have been developed independently and the bench-marking presented in this paper allows a better choice of the theoretical method adopted in a track-structure code to be made. A systematic bench-marking of cross-sections and spectra of the secondary electrons shows differences between the codes at atomic level, but such differences are not significant in biophysical modelling at the macromolecular level. 
Clustered-damage evaluation shows: that a substantial proportion of dose ( 30%) is deposited by low-energy electrons; the majority of DNA damage lesions are of simple type; the complexity of damage increases with increased LET, while the total yield of strand breaks remains constant; and at high LET values nearly 70% of all double-strand breaks are of complex type.

  11. Large Data at Small Universities: Astronomical processing using a computer classroom

    NASA Astrophysics Data System (ADS)

    Fuller, Nathaniel James; Clarkson, William I.; Fluharty, Bill; Belanger, Zach; Dage, Kristen

    2016-06-01

    The use of large computing clusters for astronomy research is becoming more commonplace as datasets expand, but access to these required resources is sometimes difficult for research groups working at smaller universities. As an alternative to purchasing processing time on an off-site computing cluster, or purchasing dedicated hardware, we show how one can easily build a crude on-site cluster by utilizing idle cycles on instructional computers in computer-lab classrooms. Since these computers are maintained as part of the educational mission of the university, the resource impact on the investigator is generally low. By using open source Python routines, it is possible to have a large number of desktop computers working together via a local network to sort through large data sets. By running traditional analysis routines in an "embarrassingly parallel" manner, gains in speed are accomplished without requiring the investigator to learn how to write routines using highly specialized methodology. We demonstrate this concept here applied to (1) photometry of large-format images and (2) statistical significance tests for X-ray lightcurve analysis. In these scenarios, we see a speed-up factor which scales almost linearly with the number of cores in the cluster. Additionally, we show that the usage of the cluster does not severely limit performance for a local user, and indeed the processing can be performed while the computers are in use for classroom purposes.
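
    A minimal Python sketch of the "embarrassingly parallel" pattern described in this abstract follows. It is an illustration under stated assumptions, not the authors' code: `measure_flux` is a hypothetical stand-in for a per-image photometry routine, and a local thread pool replaces the networked classroom machines so that the sketch runs anywhere. For CPU-bound work one would swap in a process pool or a network job scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def measure_flux(image):
    """Hypothetical per-image task: sum all pixel values of a 2-D list."""
    return sum(sum(row) for row in image)


def run_embarrassingly_parallel(func, items, workers=4):
    """Apply func to each item independently; results keep the input order.

    No task depends on another, so there is no communication between workers.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))


# Hypothetical data: eight 2x2 "images" whose pixels all equal the image index.
images = [[[i, i], [i, i]] for i in range(8)]

fluxes = run_embarrassingly_parallel(measure_flux, images)
print(fluxes)
```

    Because each image is processed in isolation, adding workers only changes how the list is divided up, which is why the speed-up reported above can scale almost linearly with the number of cores.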

  12. Structure-Based Phylogenetic Analysis of the Lipocalin Superfamily.

    PubMed

    Lakshmi, Balasubramanian; Mishra, Madhulika; Srinivasan, Narayanaswamy; Archunan, Govindaraju

    2015-01-01

    Lipocalins constitute a superfamily of extracellular proteins that are found in all three kingdoms of life. Although very divergent in their sequences and functions, they show remarkable similarity in 3-D structures. Lipocalins bind and transport small hydrophobic molecules. Earlier sequence-based phylogenetic studies of lipocalins highlighted that they have a long evolutionary history. However the molecular and structural basis of their functional diversity is not completely understood. The main objective of the present study is to understand functional diversity of the lipocalins using a structure-based phylogenetic approach. The present study with 39 protein domains from the lipocalin superfamily suggests that the clusters of lipocalins obtained by structure-based phylogeny correspond well with the functional diversity. The detailed analysis on each of the clusters and sub-clusters reveals that the 39 lipocalin domains cluster based on their mode of ligand binding though the clustering was performed on the basis of gross domain structure. The outliers in the phylogenetic tree are often from single member families. Also structure-based phylogenetic approach has provided pointers to assign putative function for the domains of unknown function in lipocalin family. The approach employed in the present study can be used in the future for the functional identification of new lipocalin proteins and may be extended to other protein families where members show poor sequence similarity but high structural similarity.

  13. Smoothing metallic glasses without introducing crystallization by gas cluster ion beam

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Shao, Lin; Chen, Di; Myers, Michael

    2013-03-11

    We show that 30 keV Ar cluster ion bombardment of Ni52.5Nb10Zr15Ti15Pt7.5 metallic glass (MG) can remove surface mountain-like features and reduce the root mean square surface roughness from 12 nm to 0.7 nm. X-ray diffraction analysis reveals no crystallization after cluster ion irradiation. Molecular dynamics simulations show that, although damage cascades lead to local melting, the subsequent quenching rate is a few orders of magnitude higher than the critical cooling rate for MG formation, thus the melted zone retains its amorphous nature down to room temperature. These findings can be applied to obtain ultra-smooth MGs without introducing crystallization.

  14. Diffuse Optical Intracluster Light as a Measure of Stellar Tidal Stripping: The Cluster CL0024+17 at z ~ 0.4 Observed at the Large Binocular Telescope

    NASA Astrophysics Data System (ADS)

    Giallongo, E.; Menci, N.; Grazian, A.; Gallozzi, S.; Castellano, M.; Fiore, F.; Fontana, A.; Pentericci, L.; Boutsia, K.; Paris, D.; Speziali, R.; Testa, V.

    2014-01-01

    We have evaluated the diffuse intracluster light (ICL) in the central core of the galaxy cluster CL0024+17 at z ~ 0.4 observed with the prime focus camera (Large Binocular Camera) at the Large Binocular Telescope. The measure required an accurate removal of the galaxies' light within ~200 kpc from the center. The residual background intensity has then been integrated in circular apertures to derive the average ICL intensity profile. The latter shows an approximate exponential decline as expected from theoretical cold dark matter models where the ICL is due to the integrated contribution of light from stars that are tidally stripped from the halo of their host galaxies due to encounters with other galaxies in the cluster cold dark matter (CDM) potential. The radial profile of the ICL over the galaxies intensity ratio (ICL fraction) is increasing with decreasing radius, but near the cluster center it starts to bend and then decreases where the overlap of the halos of the brightest cluster galaxies becomes dominant. Theoretical expectations in a simplified CDM scenario show that the ICL fraction profile can be estimated from the stripped over galaxy stellar mass ratio in the cluster. It is possible to show that the latter quantity is almost independent of the properties of the individual host galaxies but mainly depends on the average cluster properties. The predicted ICL fraction profile is thus very sensitive to the assumed CDM profile, total mass, and concentration parameter of the cluster. Adopting values very similar to those derived from the most recent lensing analysis in CL0024+17, we find a good agreement with the observed ICL fraction profile. The galaxy counts in the cluster core have then been compared with that derived from composite cluster samples in larger volumes, up to the clusters virial radius. 
The galaxy counts in the CL0024+17 core appear flatter and the amount of bending with respect to the average cluster galaxy counts imply a loss of total emissivity in broad agreement with the measured ICL fraction. The present analysis shows that the measure of the ICL fraction in clusters can quantitatively account for the stellar stripping activity in their cores and can be used to probe their CDM distribution and evolutionary status. Observations have been carried out using the Large Binocular Telescope at Mt. Graham, AZ. The LBT is an international collaboration among institutions in the United States, Italy, and Germany. LBT Corporation partners are the University of Arizona on behalf of the Arizona university system; Istituto Nazionale di Astrofisica, Italy; LBT Beteiligungsgesellschaft, Germany, representing the Max-Planck Society, the Astrophysical Institute Potsdam, and Heidelberg University; the Ohio State University; and The Research Corporation, on behalf of the University of Notre Dame, University of Minnesota, and University of Virginia.

  15. Clinical Study of the 3D-Master Color System among the Spanish Population.

    PubMed

    Gómez-Polo, Cristina; Gómez-Polo, Miguel; Martínez Vázquez de Parga, Juan Antonio; Celemín-Viñuela, Alicia

    2017-01-12

    To study whether the shades of the 3D-Master System were grouped and represented in the chromatic space according to the three-color coordinates of value, chroma, and hue. Maxillary central incisor color was measured on tooth surfaces through the Easyshade Compact spectrophotometer using 1361 participants aged between 16 and 89. The natural (not bleached teeth) color of the middle thirds was registered in the 3D-Master System nomenclature and in the CIELCh system. Principal component analysis and cluster analysis were applied. 75 colors of the 3D-Master System were found. The statistical analysis revealed the existence of 5 cluster groups. The centroid, the average of the 75 samples, in relation to lightness (L*) was 74.64, 22.87 for chroma (C*), and 88.85 for hue (h*). All of the clusters, except cluster 3, showed significant statistical differences with the centroid for the three-color coordinates (p <0.001). The results of this study indicated that 75 shades in the 3D-Master System were grouped into 5 clusters following coordinates L*, C*, and h* resulting from the dental spectrophotometer Vita Easyshade compact. The shades that composed each cluster did not belong to the same lightness color dimension groups. There was no special uniform chromatic distribution among the colors of the 3D-Master System. © 2017 by the American College of Prosthodontists.

  16. Genetic diversity and population structure analysis between Indian red jungle fowl and domestic chicken using microsatellite markers.

    PubMed

    Kumar, Vinay; Shukla, Sanjeev K; Mathew, Jose; Sharma, Deepak

    2015-01-01

    The present study was conducted to assess the genetic diversity, population structure, and relatedness in Indian red jungle fowl (RJF, Gallus gallus murgi) from northern India and three domestic chicken populations (Gallus gallus domesticus), maintained at the institute farms, namely White Leghorn (WL), Aseel (AS) and Red Cornish (RC), using 25 microsatellite markers. All the markers were polymorphic; the number of alleles at each locus ranged from five (MCW0111) to forty-three (LEI0212), with an average of 19 alleles per locus. Across all loci, the mean expected heterozygosity and polymorphic information content were 0.883 and 0.872, respectively. Population-specific alleles were found in each population. A UPGMA dendrogram based on shared allele distances clearly revealed two major clusters among the four populations; cluster I had genotypes from RJF and WL whereas cluster II had AS and RC genotypes. Furthermore, the estimation of population structure was performed to understand how genetic variation is partitioned within and among populations. The maximum ΔK value was observed for K = 4, with four identified clusters. Furthermore, factorial analysis clearly showed four clusters, each representing one of the four populations used in the study. These results clearly demonstrate the potential of microsatellite markers in elucidating the genetic diversity, relationships, and population structure in RJF and domestic chicken populations.

  17. the-wizz: clustering redshift estimation for everyone

    NASA Astrophysics Data System (ADS)

    Morrison, C. B.; Hildebrandt, H.; Schmidt, S. J.; Baldry, I. K.; Bilicki, M.; Choi, A.; Erben, T.; Schneider, P.

    2017-05-01

    We present the-wizz, an open source and user-friendly software for estimating the redshift distributions of photometric galaxies with unknown redshifts by spatially cross-correlating them against a reference sample with known redshifts. The main benefit of the-wizz is in separating the angular pair finding and correlation estimation from the computation of the output clustering redshifts allowing anyone to create a clustering redshift for their sample without the intervention of an 'expert'. It allows the end user of a given survey to select any subsample of photometric galaxies with unknown redshifts, match this sample's catalogue indices into a value-added data file and produce a clustering redshift estimation for this sample in a fraction of the time it would take to run all the angular correlations needed to produce a clustering redshift. We show results with this software using photometric data from the Kilo-Degree Survey (KiDS) and spectroscopic redshifts from the Galaxy and Mass Assembly survey and the Sloan Digital Sky Survey. The results we present for KiDS are consistent with the redshift distributions used in a recent cosmic shear analysis from the survey. We also present results using a hybrid machine learning-clustering redshift analysis that enables the estimation of clustering redshifts for individual galaxies. the-wizz can be downloaded at http://github.com/morriscb/The-wiZZ/.

  18. Spatio-Temporal Characteristics of Resident Trip Based on Poi and OD Data of Float CAR in Beijing

    NASA Astrophysics Data System (ADS)

    Mou, N.; Li, J.; Zhang, L.; Liu, W.; Xu, Y.

    2017-09-01

    Owing to the inherent regional distribution of urban functions, the daily activities of residents exhibit spatio-temporal patterns (periodic patterns, gathering patterns, etc.). To better understand the spatial and temporal characteristics of urban residents, this paper takes the taxi trajectory data of Beijing as sample data and studies the spatio-temporal characteristics of residents' activities on weekdays. First, because taxi trajectory data are distributed along the road network, the Voronoi diagram generated from the road nodes is taken as the research unit. The paper proposes a hybrid clustering method based on grid density, which is used to cluster taxi OD (origin and destination) data at different times. Then, combining these clusters with the POI data of Beijing, the study calculates the density of the POI data in the clustering results and analyzes the relationship between residents' activities in different periods and the functional types of the regions. The final results show that residents were mainly commuting on weekdays. The distribution of travel density showed a concentric-circle pattern, concentrated on residential areas and work areas. The results of cluster analysis and POI analysis show that residents' travel underwent a process of "spatial relative dispersion - spatial aggregation - spatial relative dispersion" over one day.

  19. SSR analysis of genetic diversity and structure of the germplasm of faba bean (Vicia faba L.).

    PubMed

    El-Esawi, Mohamed A

    Assessing the diversity and genetic structure of faba bean (Vicia faba L.) germplasm is essential to improve the quality and yield of this economically important crop. In this study, simple sequence repeats (SSRs) were utilized to evaluate the diversity and structure of 35 faba bean genotypes originating from three different geographical regions (Northern Africa, Eastern Africa, and Near East). All 15 SSR loci generated a total of 100 alleles. The allele number per locus varied from 4 to 11, with a mean of 6.67. The expected heterozygosity (He) of SSR loci ranged between 0.51 and 0.81, with a mean of 0.63. The PIC value also varied from 0.44 to 0.78, with an average of 0.58. The expected heterozygosity of 22 faba bean genotypes was higher than the observed one. Interestingly, AMOVA analysis showed that much of the variability resided within accessions (79.2%). A highly significant difference among regions was also evidenced, and represented 5.3% of the total variation. Moreover, cluster analysis divided the 35 faba bean genotypes into two main clusters. The first main cluster comprised all faba bean genotypes originating from the Near East region, whereas the second main cluster comprised all the genotypes originating from the Northern and Eastern Africa regions, indicating that the Northern and Eastern African faba bean genotypes were more closely related to each other than to the Near East genotypes. Structure analysis also revealed that the 35 faba bean genotypes might be assigned to two populations, in complete accordance with cluster analysis data. In conclusion, this study showed high levels of diversity in the analysed genotypes of faba bean, and could be utilized in future breeding programmes to develop new cultivars of high yield. Copyright © 2017 Académie des sciences. Published by Elsevier Masson SAS. All rights reserved.
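
    The two per-locus statistics quoted in this abstract can be sketched in Python. This assumes the commonly used definitions, expected heterozygosity He = 1 - sum(p_i^2) and polymorphic information content PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. The abstract does not state which estimators were used, and the allele frequencies below are hypothetical.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies at one locus."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC = He minus the cross term 2 * p_i^2 * p_j^2 over all allele pairs."""
    cross = sum(2.0 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return expected_heterozygosity(freqs) - cross


# Hypothetical locus with four alleles; the frequencies must sum to 1.
allele_freqs = [0.4, 0.3, 0.2, 0.1]

he = expected_heterozygosity(allele_freqs)
pic = polymorphic_information_content(allele_freqs)
print(f"He = {he:.4f}, PIC = {pic:.4f}")
```

    Under these definitions PIC is never larger than He for the same locus, which matches the pattern in the ranges reported above (He 0.51 to 0.81, PIC 0.44 to 0.78).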

  20. Associations of Streptococcus suis Serotype 2 Ribotype Profiles with Clinical Disease and Antimicrobial Resistance

    PubMed Central

    Rasmussen, S. R.; Aarestrup, F. M.; Jensen, N. E.; Jorsal, S. E.

    1999-01-01

    A total of 122 Streptococcus suis serotype 2 strains were characterized thoroughly by comparing clinical and pathological observations, ribotype profiles, and antimicrobial resistance. Twenty-one different ribotype profiles were found and compared by cluster analysis, resulting in the identification of three ribotype clusters. A total of 58% of all strains investigated were of two ribotypes belonging to different ribotype clusters. A remarkable relationship existed between the observed ribotype profiles and the clinical-pathological observations because strains of one of the two dominant ribotypes were almost exclusively isolated from pigs with meningitis, while strains of the other dominant ribotype were never associated with meningitis. This second ribotype was isolated only from pigs with pneumonia, endocarditis, pericarditis, or septicemia. Cluster analysis revealed that strains belonging to the same ribotype cluster as one of the dominant ribotypes came from pigs that showed clinical signs similar to those of pigs infected with strains with the respective dominant ribotype profiles. Furthermore, strains belonging to different ribotype clusters had totally different patterns of resistance to antibiotics because strains isolated from pigs with meningitis were resistant to sulfamethazoxazole and strains isolated from pigs with pneumonia, endocarditis, pericarditis, or septicemia were resistant to tetracycline. PMID:9889228

  1. Finding reproducible cluster partitions for the k-means algorithm

    PubMed Central

    2013-01-01

    K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset. PMID:23369085

  2. Finding reproducible cluster partitions for the k-means algorithm.

    PubMed

    Lisboa, Paulo J G; Etchells, Terence A; Jarman, Ian H; Chambers, Simon J

    2013-01-01

    K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset.
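
    The reproducibility question raised in this abstract can be explored with a short Python sketch: run k-means from several random initialisations, record the SSQ of each run, and measure how far each partition agrees with the lowest-SSQ one. The Rand index is used here as a simple agreement measure and is a stand-in for the stability index of the paper, whose details the abstract does not give. The data, seed and function names are hypothetical.

```python
import random
from itertools import combinations


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, rng, iters=100):
    """Lloyd's algorithm from a random initialisation; returns (labels, ssq)."""
    centres = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centres[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centre if a cluster becomes empty
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    ssq = sum(sq_dist(p, centres[l]) for p, l in zip(points, labels))
    return labels, ssq


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree (same/different)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Hypothetical synthetic data: three Gaussian blobs of 20 points each.
rng = random.Random(0)
blob_centres = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centres for _ in range(20)]

runs = [kmeans(points, 3, rng) for _ in range(10)]
best_labels, best_ssq = min(runs, key=lambda r: r[1])
agreement = [rand_index(best_labels, labels) for labels, _ in runs]

for (labels, ssq), r in zip(runs, agreement):
    print(f"SSQ = {ssq:8.2f}   Rand index vs best = {r:.2f}")
```

    Runs with similar SSQ but a Rand index well below 1 would be examples of the situation the abstract warns about, where close SSQ values hide markedly different partitions.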

  3. Formation and stability of dense arrays of Au nanoclusters on hexagonal boron nitride/Rh(111)

    NASA Astrophysics Data System (ADS)

    Patterson, Matthew C.; Habenicht, Bradley F.; Kurtz, Richard L.; Liu, Li; Xu, Ye; Sprunger, Phillip T.

    2014-05-01

    We have studied the nucleation and growth of Au clusters at submonolayer and greater coverages on the h-BN nanomesh grown on Rh(111) by means of scanning tunneling microscopy (STM), x-ray photoelectron spectroscopy (XPS), and density functional theory (DFT). STM reveals that submonolayer Au deposited at 115 K nucleates within the nanomesh pores and remains confined to the pores even after warming to room temperature. Whereas there is a propensity of monoatomic high islands at low temperature, upon annealing, bi- and multilayer Au clusters emerge. Deposition of higher coverages of Au similarly results in Au clusters primarily confined to the nanomesh pores at room temperature. XPS analysis of core-level electronic states in the deposited Au shows strong final-state effects induced by restricted particle size dominating for low Au coverage, with indications that larger Au clusters are negatively charged by interaction through the h-BN monolayer. DFT calculations suggest that the structure of the Au clusters transitions from monolayer to bilayer at a size between 30 and 37 atoms per cluster, in line with our experiment. Bader charge analysis supports the negative charge state of deposited Au.

  4. Exploring the musical taste of expert listeners: musicology students reveal tendency toward omnivorous taste

    PubMed Central

    Elvers, Paul; Omigie, Diana; Fuhrmann, Wolfgang; Fischinger, Timo

    2015-01-01

    Musicology students are engaged with music on an academic level and usually have an extensive musical background. They have a considerable knowledge of music history and theory and listening to music may be regarded as one of their primary occupations. Taken together, these factors qualify them as ≫expert listeners≪, who may be expected to exhibit a specific profile of musical taste: interest in a broad range of musical styles combined with a greater appreciation of ≫sophisticated≪ styles. The current study examined the musical taste of musicology students as compared to a control student group. Participants (n = 1003) completed an online survey regarding the frequency with which they listened to 22 musical styles. A factor analysis revealed six underlying dimensions of musical taste. A hierarchical cluster analysis then grouped all participants, regardless of their status, according to their similarity on these dimensions. The employed exploratory approach was expected to reveal potential differences between musicology students and controls. A three-cluster solution was obtained. Comparisons of the clusters in terms of musical taste revealed differences in the listening frequency and variety of appreciated music styles: the first cluster (51% musicology students/27% controls) showed the greatest musical engagement across all dimensions although with a tendency toward ≫sophisticated≪ musical styles. The second cluster (36% musicology students/46% controls) exhibited an interest in ≫conventional≪ music, while the third cluster (13% musicology students/27% controls) showed a strong liking of rock music. The results provide some support for the notion of specific tendencies in the musical taste of musicology students and the contribution of familiarity and knowledge toward musical omnivorousness. Further differences between the clusters in terms of social, personality, and sociodemographic factors are discussed. PMID:26347702

  5. Planck's view on the spectrum of the Sunyaev-Zeldovich effect

    NASA Astrophysics Data System (ADS)

    Erler, Jens; Basu, Kaustuv; Chluba, Jens; Bertoldi, Frank

    2018-05-01

    We present a detailed analysis of the stacked frequency spectrum of a large sample of galaxy clusters using Planck data, together with auxiliary data from the AKARI and IRAS missions. Our primary goal is to search for the imprint of relativistic corrections to the thermal Sunyaev-Zeldovich effect (tSZ) spectrum, which allow the temperature of the intracluster medium to be measured. We remove Galactic and extragalactic foregrounds with a matched filtering technique, which is validated using simulations with realistic mock data sets. The extracted spectra show the tSZ signal at high significance and reveal an additional far-infrared (FIR) excess, which we attribute to thermal emission from the galaxy clusters themselves. This excess FIR emission from clusters is accounted for in our spectral model. We are able to measure the tSZ relativistic corrections at 2.2σ by constraining the mean temperature of our cluster sample to 4.4^{+2.1}_{-2.0} keV. We repeat the same analysis on a subsample containing only the 100 hottest clusters, for which we measure the mean temperature to be 6.0^{+3.8}_{-2.9} keV, corresponding to 2.0σ. The temperature of the emitting dust grains in our FIR model is constrained to ≃20 K, consistent with previous studies. We control for systematic biases by fitting mock clusters, from which we also show that using the non-relativistic spectrum for SZ signal extraction will lead to a bias in the integrated Compton parameter Y, which can be up to 14% for the most massive clusters. We conclude by providing an outlook for the upcoming CCAT-prime telescope, which will improve upon Planck with lower noise and better spatial resolution.

  6. Exploring the musical taste of expert listeners: musicology students reveal tendency toward omnivorous taste.

    PubMed

    Elvers, Paul; Omigie, Diana; Fuhrmann, Wolfgang; Fischinger, Timo

    2015-01-01

    Musicology students are engaged with music on an academic level and usually have an extensive musical background. They have a considerable knowledge of music history and theory and listening to music may be regarded as one of their primary occupations. Taken together, these factors qualify them as ≫expert listeners≪, who may be expected to exhibit a specific profile of musical taste: interest in a broad range of musical styles combined with a greater appreciation of ≫sophisticated≪ styles. The current study examined the musical taste of musicology students as compared to a control student group. Participants (n = 1003) completed an online survey regarding the frequency with which they listened to 22 musical styles. A factor analysis revealed six underlying dimensions of musical taste. A hierarchical cluster analysis then grouped all participants, regardless of their status, according to their similarity on these dimensions. The employed exploratory approach was expected to reveal potential differences between musicology students and controls. A three-cluster solution was obtained. Comparisons of the clusters in terms of musical taste revealed differences in the listening frequency and variety of appreciated music styles: the first cluster (51% musicology students/27% controls) showed the greatest musical engagement across all dimensions although with a tendency toward ≫sophisticated≪ musical styles. The second cluster (36% musicology students/46% controls) exhibited an interest in ≫conventional≪ music, while the third cluster (13% musicology students/27% controls) showed a strong liking of rock music. The results provide some support for the notion of specific tendencies in the musical taste of musicology students and the contribution of familiarity and knowledge toward musical omnivorousness. Further differences between the clusters in terms of social, personality, and sociodemographic factors are discussed.

  7. A recurrence network approach for the analysis of skin blood flow dynamics in response to loading pressure.

    PubMed

    Liao, Fuyuan; Jan, Yih-Kuen

    2012-06-01

    This paper presents a recurrence network approach for the analysis of skin blood flow dynamics in response to loading pressure. Recurrence is a fundamental property of many dynamical systems, which can be explored in phase spaces constructed from observational time series. A visualization tool of recurrence analysis called the recurrence plot (RP) has proved highly effective at detecting transitions in the dynamics of a system. However, it was found that delay embedding can produce spurious structures in RPs. Network-based concepts have recently been applied to the analysis of nonlinear time series. We demonstrate that time series with different types of dynamics exhibit distinct global clustering coefficients and distributions of local clustering coefficients and that the global clustering coefficient is robust to the embedding parameters. We applied the approach to study the response of skin blood flow oscillations (BFO) to loading pressure. The results showed that global clustering coefficients of BFO significantly decreased in response to loading pressure (p<0.01). Moreover, surrogate tests indicated that such a decrease was associated with a loss of nonlinearity of BFO. Our results suggest that the recurrence network approach can practically quantify the nonlinear dynamics of BFO.
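
    The recurrence-network construction described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function names, the maximum-norm distance, the threshold value and the sine test signal are all assumptions chosen for the example, and the global clustering coefficient is taken here to be the mean of the local coefficients.

```python
import math


def delay_embed(series, dim, delay):
    """Time-delay embedding: map a scalar series to points in `dim` dimensions."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]


def recurrence_adjacency(points, eps):
    """Link two state vectors when they are closer than eps (maximum norm, assumed)."""
    adj = [set() for _ in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist = max(abs(a - b) for a, b in zip(points[i], points[j]))
            if dist < eps:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def local_clustering(adj):
    """Fraction of each node's neighbour pairs that are themselves linked."""
    coeffs = []
    for nbrs in adj:
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # undefined for degree < 2; set to 0 by convention
            continue
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return coeffs


def global_clustering(adj):
    """Mean of the local clustering coefficients (assumed definition)."""
    coeffs = local_clustering(adj)
    return sum(coeffs) / len(coeffs)


# Hypothetical test signal: a sampled sine wave standing in for a BFO recording.
series = [math.sin(0.3 * t) for t in range(200)]
points = delay_embed(series, dim=3, delay=5)
c_global = global_clustering(recurrence_adjacency(points, eps=0.2))
print(f"global clustering coefficient: {c_global:.3f}")
```

    Repeating the last three lines for several values of `dim` and `delay` is one simple way to probe the robustness to embedding parameters that the abstract reports.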

  8. Subgenotype analysis of Cryptosporidium isolates from humans, cattle, and zoo ruminants in Portugal.

    PubMed

    Alves, Margarida; Xiao, Lihua; Sulaiman, Irshad; Lal, Altaf A; Matos, Olga; Antunes, Francisco

    2003-06-01

    Cryptosporidium parvum and Cryptosporidium hominis isolates from human immunodeficiency virus-infected patients, cattle, and wild ruminants were characterized by PCR and DNA sequencing analysis of the 60-kDa glycoprotein gene. Seven alleles were identified, three corresponding to C. hominis and four corresponding to C. parvum. One new allele was found (IId), and one (IIb) had only been found in Portugal. Isolates from cattle and wild ruminants clustered in two alleles. In contrast, human isolates clustered in seven alleles, showing extensive allelic diversity.

  9. Using Cluster Analysis to Examine Husband-Wife Decision Making

    ERIC Educational Resources Information Center

    Bonds-Raacke, Jennifer M.

    2006-01-01

    Cluster analysis has a rich history in many disciplines and although cluster analysis has been used in clinical psychology to identify types of disorders, its use in other areas of psychology has been less popular. The purpose of the current experiments was to use cluster analysis to investigate husband-wife decision making. Cluster analysis was…

  10. Paternal age related schizophrenia (PARS): Latent subgroups detected by k-means clustering analysis.

    PubMed

    Lee, Hyejoo; Malaspina, Dolores; Ahn, Hongshik; Perrin, Mary; Opler, Mark G; Kleinhaus, Karine; Harlap, Susan; Goetz, Raymond; Antonius, Daniel

    2011-05-01

    Paternal age related schizophrenia (PARS) has been proposed as a subgroup of schizophrenia with distinct etiology, pathophysiology and symptoms. This study uses a k-means clustering analysis approach to generate hypotheses about differences between PARS and other cases of schizophrenia. We studied PARS (operationally defined as not having any family history of schizophrenia among first and second-degree relatives and fathers' age at birth ≥ 35 years) in a series of schizophrenia cases recruited from a research unit. Data were available on demographic variables, symptoms (Positive and Negative Syndrome Scale; PANSS), cognitive tests (Wechsler Adult Intelligence Scale-Revised; WAIS-R) and olfaction (University of Pennsylvania Smell Identification Test; UPSIT). We conducted a series of k-means clustering analyses to identify clusters of cases containing high concentrations of PARS. Two analyses generated clusters with high concentrations of PARS cases. The first analysis (N=136; PARS=34) revealed a cluster containing 83% PARS cases, in which the patients showed a significant discrepancy between verbal and performance intelligence. The mean paternal and maternal ages were 41 and 33, respectively. The second analysis (N=123; PARS=30) revealed a cluster containing 71% PARS cases, of which 93% were females; the mean age of onset of psychosis, at 17.2, was significantly early. These results strengthen the evidence that PARS cases differ from other patients with schizophrenia. Hypothesis-generating findings suggest that features of PARS may include a discrepancy between verbal and performance intelligence, and in females, an early age of onset. These findings provide a rationale for separating these phenotypes from others in future clinical, genetic and pathophysiologic studies of schizophrenia and in considering responses to treatment. Copyright © 2011 Elsevier B.V. All rights reserved.

  11. SEARCHING FOR BULK MOTIONS IN THE INTRACLUSTER MEDIUM OF MASSIVE, MERGING CLUSTERS WITH CHANDRA CCD DATA

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Liu, Ang; Yu, Heng; Tozzi, Paolo

    2016-04-10

    We search for bulk motions in the intracluster medium (ICM) of massive clusters showing evidence of an ongoing or recent major merger with spatially resolved spectroscopy in Chandra CCD data. We identify a sample of six merging clusters with >150 ks Chandra exposure in the redshift range 0.1 < z < 0.3. By performing X-ray spectral analysis of projected ICM regions selected according to their surface brightness, we obtain the projected redshift maps for all of these clusters. After performing a robust analysis of the statistical and systematic uncertainties in the measured X-ray redshift z_X, we check whether or not the global z_X distribution differs from that expected when the ICM is at rest. We find evidence of significant bulk motions at more than 3σ in A2142 and A115, and less than 2σ in A2034 and A520. Focusing on single regions, we identify significant localized velocity differences in all of the merger clusters. We also perform the same analysis on two relaxed clusters with no signatures of recent mergers, finding no signs of bulk motions, as expected. Our results indicate that deep Chandra CCD data enable us to identify the presence of bulk motions at the level of v_{BM} > 1000 km s^{-1} in the ICM of massive merging clusters at 0.1 < z < 0.3. Although the CCD spectral resolution is not sufficient for a detailed analysis of the ICM dynamics, Chandra CCD data constitute a key diagnostic tool complementing X-ray bolometers on board future X-ray missions.

  12. An Investigation of Document Partitions.

    ERIC Educational Resources Information Center

    Shaw, W. M., Jr.

    1986-01-01

    Empirical significance of document partitions is investigated as a function of index term-weight and similarity thresholds. Results show the same empirically preferred partitions can be detected by two independent strategies: an analysis of cluster-based retrieval analysis and an analysis of regularities in the underlying structure of the document…

  13. Hot spot analysis applied to identify ecosystem services potential in Lithuania

    NASA Astrophysics Data System (ADS)

    Pereira, Paulo; Depellegrin, Daniel; Misiune, Ieva

    2016-04-01

    Hot spot analysis is very useful for identifying areas with similar characteristics. This is important for sustainable use of the territory, since it identifies areas that need to be protected or restored. It is a great advantage for land use planning and management, since resources can be allocated, economic costs reduced and landscape interventions improved. Ecosystem services (ES) differ according to land use. Since the landscape is very heterogeneous, it is of major importance to understand the spatial pattern of ES and to locate the areas that provide more ES and those that provide fewer. The objective of this work is to use hot spot analysis to identify areas with the most valuable ES in Lithuania. The CORINE land cover (CLC) of 2006 was used as the main spatial information. This classification uses a grid of 100 m resolution, from which a total of 31 land use types were extracted. ES ranking was based on expert knowledge: experts were asked to evaluate the ES potential of each CLC class from 0 (no potential) to 5 (very high potential). Hot spots were evaluated using the Getis-Ord test, a cluster analysis tool available in the ArcGIS toolbox. This tool identifies areas with significantly low values and significantly high values at a p level of 0.05. In this work we used hot spot analysis to assess the distribution of providing, regulating, cultural and total (sum of the previous three) ES. The Z value calculated from the Getis-Ord test was used in the statistical analysis to assess the clusters of providing, regulating, cultural and total ES. A high Z value indicates that an ES has a high number of cluster areas with high ES potential. The results showed that the Z-score was significantly different among services (Kruskal-Wallis ANOVA = 834.607, p<0.001). The Z-score of providing services (0.096±2.239) was significantly higher than that of total (0.093±2.045), cultural (0.080±1.979) and regulating (0.076±1.961) services. These results suggest that providing services are more clustered than the others. The Z-scores of the ecosystem services were significantly correlated: regulating vs total (0.98, p<0.0001), regulating vs cultural (0.97, p<0.0001), cultural vs total (0.96, p<0.0001), providing vs total (0.69, p<0.0001), regulating vs providing (0.56, p<0.0001) and providing vs cultural (0.56, p<0.0001). According to these results, the distribution of ES potential showed a similar pattern across services, especially regulating, cultural and total. This is evidence that the areas with significantly high and low regulating and cultural ES clusters are similar. The spatial distribution of these clusters is very high, which may be attributed to the landscape diversity and fragmentation.
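
    For readers unfamiliar with the Z value used in this abstract, the sketch below computes the standard Getis-Ord Gi* z-score for each cell of a toy one-dimensional transect. It is an illustration only: the ES scores, the binary neighbourhood weights and the function name are assumptions, and it does not reproduce the study's ArcGIS workflow or data.

```python
import math


def getis_ord_gi_star(values, weights):
    """Gi* z-score per location.

    values  : list of n observations (here, hypothetical ES potential scores 0-5)
    weights : n x n spatial weights; weights[i][i] is included (the 'star' variant)
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum(v * v for v in values) / n - mean * mean)
    z_scores = []
    for i in range(n):
        w = weights[i]
        sum_w = sum(w)
        sum_w2 = sum(x * x for x in w)
        numerator = sum(wj * xj for wj, xj in zip(w, values)) - mean * sum_w
        denominator = s * math.sqrt((n * sum_w2 - sum_w ** 2) / (n - 1))
        z_scores.append(numerator / denominator if denominator else 0.0)
    return z_scores


# Hypothetical transect of eight cells with expert-style ES scores.
scores = [5, 5, 4, 1, 0, 1, 0, 0]
# Assumed neighbourhood: each cell, plus the cells immediately next to it.
w = [[1.0 if abs(i - j) <= 1 else 0.0 for j in range(len(scores))]
     for i in range(len(scores))]

for cell, z in enumerate(getis_ord_gi_star(scores, w)):
    label = "hot" if z > 1.96 else "cold" if z < -1.96 else "n.s."
    print(f"cell {cell}: z = {z:+.2f} ({label})")
```

    The ±1.96 cut-off corresponds to the two-sided p level of 0.05 mentioned in the abstract; real analyses would also need to choose the neighbourhood definition with care, since it strongly affects which cells are flagged.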

  14. Molybdenum cluster loaded PLGA nanoparticles: An innovative theranostic approach for the treatment of ovarian cancer.

    PubMed

    Brandhonneur, N; Hatahet, T; Amela-Cortes, M; Molard, Y; Cordier, S; Dollo, G

    2018-04-01

    We evaluate poly(d,l-lactide-co-glycolide) (PLGA) nanoparticles embedding an inorganic molybdenum octahedral cluster for photodynamic therapy (PDT) of cancer. The tetrabutylammonium salt of the Mo6Br14 cluster unit, (TBA)2Mo6Br14, presents promising photosensitization activity in the destruction of targeted cancer cells. Stable cluster-loaded nanoparticles (CNPs) were prepared by the solvent displacement method, showing spherical shapes, zeta potential values around -30 mV, polydispersity index lower than 0.2 and sizes around 100 nm. FT-IR and DSC analysis revealed the lack of strong chemical interaction between the cluster and the polymer within the nanoparticles. An in vitro release study showed that (TBA)2Mo6Br14 was totally dissolved in 20 min, while CNPs were able to control the release of the encapsulated cluster. In vitro cellular viability studies conducted on the A2780 ovarian cancer cell line treated for up to 72 h with cluster or CNPs did not show any sign of toxicity at concentrations up to 20 µg/ml. This concentration was selected for the photo-activation test on A2780 cells, in which CNPs were able to generate singlet oxygen, resulting in a decrease in cellular viability of up to 50% compared to non-activated conditions. This work presents (TBA)2Mo6Br14 as a novel photosensitizer for PDT and suggests PLGA nanoparticles as an efficient delivery system intended for tumor targeting. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. A highly efficient multi-core algorithm for clustering extremely large datasets

    PubMed Central

    2010-01-01

    Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared-memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network-based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922

  16. Application of Artificial Intelligence For Euler Solutions Clustering

    NASA Astrophysics Data System (ADS)

    Mikhailov, V.; Galdeano, A.; Diament, M.; Gvishiani, A.; Agayan, S.; Bogoutdinov, Sh.; Graeva, E.; Sailhac, P.

    Results of Euler deconvolution strongly depend on the selection of viable solutions. Synthetic calculations using multiple causative sources show that Euler solutions cluster in the vicinity of causative bodies even when they do not group densely about the perimeter of the bodies. We have developed a clustering technique to serve as a tool for selecting appropriate solutions. The method RODIN, employed in this study, is based on artificial intelligence and was originally designed for problems of classification of large data sets. It is based on a geometrical approach to study object concentration in a finite metric space of any dimension. The method uses a formal definition of cluster and includes free parameters that facilitate the search for clusters of given properties. Tests on synthetic and real data showed that the clustering technique outlines causative bodies more accurately than other methods of discriminating Euler solutions. In complicated field cases such as the magnetic field in the Gulf of Saint Malo region (Brittany, France), the method provides geologically insightful solutions. Applying the clustering method has two further advantages. First, clusters provide solutions associated with particular bodies or parts of bodies, permitting the analysis of different clusters of Euler solutions separately. This may allow computation of average parameters for individual causative bodies. Second, those measurements of the anomalous field that yield clusters also form dense clusters themselves. The application of the clustering technique thus outlines areas where the influence of different causative sources is more prominent. This allows one to focus on areas for reinterpretation, using different window sizes, structural indices and so on.

  17. Identification of PM10 air pollution origins at a rural background site

    NASA Astrophysics Data System (ADS)

    Reizer, Magdalena; Orza, José A. G.

    2018-01-01

    Trajectory cluster analysis and the concentration weighted trajectory (CWT) approach have been applied to investigate the origins of PM10 air pollution recorded at a rural background site in North-eastern Poland (Diabla Góra). Air mass back-trajectories used in this study have been computed with the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model for the 10-year period 2006-2015. A cluster analysis grouped back-trajectories into 7 clusters. Most of the trajectories correspond to fast and moderately moving westerly and northerly flows (45% and 25% of the cases, respectively). However, significantly higher PM10 concentrations were observed for slow moving easterly (11%) and southerly (20%) air masses. The CWT analysis shows that high PM10 levels are observed at the Diabla Góra site when air masses originate in and pass over the heavily industrialized areas in Central-Eastern Europe located to the south and south-east of the site.

  18. A Systems Biology Approach for Identifying Hepatotoxicant Groups Based on Similarity in Mechanisms of Action and Chemical Structure.

    PubMed

    Hebels, Dennie G A J; Rasche, Axel; Herwig, Ralf; van Westen, Gerard J P; Jennen, Danyel G J; Kleinjans, Jos C S

    2016-01-01

    When evaluating compound similarity, addressing multiple sources of information to reach conclusions about common pharmaceutical and/or toxicological mechanisms of action is a crucial strategy. In this chapter, we describe a systems biology approach that incorporates analyses of hepatotoxicant data for 33 compounds from three different sources: a chemical structure similarity analysis based on the 3D Tanimoto coefficient, a chemical structure-based protein target prediction analysis, and a cross-study/cross-platform meta-analysis of in vitro and in vivo human and rat transcriptomics data derived from public resources (i.e., the diXa data warehouse). Hierarchical clustering of the outcome scores of the separate analyses did not result in a satisfactory grouping of compounds considering their known toxic mechanism as described in literature. However, a combined analysis of multiple data types may hypothetically compensate for missing or unreliable information in any of the single data types. We therefore performed an integrated clustering analysis of all three data sets using the R-based tool iClusterPlus. This indeed improved the grouping results. The compound clusters that were formed by means of iClusterPlus represent groups that show similar gene expression while simultaneously integrating a similarity in structure and protein targets, which corresponds much better with the known mechanism of action of these toxicants. Using an integrative systems biology approach may thus overcome the limitations of the separate analyses when grouping liver toxicants sharing a similar mechanism of toxicity.

  19. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale.

    PubMed

    Emmons, Scott; Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is unknown. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large graphs with well-defined clusters.
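
    Two of the stand-alone quality metrics named in this abstract, modularity and conductance, are simple enough to compute by hand on a small graph. The sketch below uses the textbook definitions for an undirected, unweighted graph; the toy graph (two triangles joined by one edge) and the function names are assumptions made for illustration and are not taken from the study.

```python
def modularity(edges, communities):
    """Newman modularity: sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    label = {node: idx for idx, comm in enumerate(communities) for node in comm}
    intra = [0] * len(communities)   # edges with both ends inside the community
    degree = [0] * len(communities)  # total degree of the community's nodes
    for u, v in edges:
        degree[label[u]] += 1
        degree[label[v]] += 1
        if label[u] == label[v]:
            intra[label[u]] += 1
    return sum(l / m - (d / (2 * m)) ** 2 for l, d in zip(intra, degree))


def conductance(edges, subset):
    """Cut size divided by the smaller of the two sides' total degree."""
    subset = set(subset)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        for node in (u, v):
            if node in subset:
                vol_in += 1
            else:
                vol_out += 1
        if (u in subset) != (v in subset):
            cut += 1
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else 0.0


# Hypothetical toy graph: two triangles connected by the single edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = [{0, 1, 2}, {3, 4, 5}]

print("modularity :", round(modularity(toy_edges, toy_partition), 4))
print("conductance:", round(conductance(toy_edges, toy_partition[0]), 4))
```

    Higher modularity and lower conductance both indicate a better-separated community, which is why the two metrics can disagree with each other and with information recovery scores on less clear-cut graphs.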

  20. The Atacama Cosmology Telescope: Cosmology from Galaxy Clusters Detected Via the Sunyaev-Zel'dovich Effect

    NASA Technical Reports Server (NTRS)

    Sehgal, Neelima; Trac, Hy; Acquaviva, Viviana; Ade, Peter A. R.; Aguirre, Paula; Amiri, Mandana; Appel, John W.; Barrientos, L. Felipe; Battistelli, Elia S.; Bond, J. Richard; et al.

    2010-01-01

    We present constraints on cosmological parameters based on a sample of Sunyaev-Zel'dovich-selected galaxy clusters detected in a millimeter-wave survey by the Atacama Cosmology Telescope. The cluster sample used in this analysis consists of 9 optically-confirmed high-mass clusters comprising the high-significance end of the total cluster sample identified in 455 square degrees of sky surveyed during 2008 at 148 GHz. We focus on the most massive systems to reduce the degeneracy between unknown cluster astrophysics and cosmology derived from SZ surveys. We describe the scaling relation between cluster mass and SZ signal with a 4-parameter fit. Marginalizing over the values of the parameters in this fit with conservative priors gives σ8 = 0.851 ± 0.115 and w = -1.14 ± 0.35 for a spatially-flat wCDM cosmological model with WMAP 7-year priors on cosmological parameters. This gives a modest improvement in statistical uncertainty over WMAP 7-year constraints alone. Fixing the scaling relation between cluster mass and SZ signal to a fiducial relation obtained from numerical simulations and calibrated by X-ray observations, we find σ8 = 0.821 ± 0.044 and w = -1.05 ± 0.20. These results are consistent with constraints from WMAP 7 plus baryon acoustic oscillations plus type Ia supernovae, which give σ8 = 0.802 ± 0.038 and w = -0.98 ± 0.053. A stacking analysis of the clusters in this sample compared to clusters simulated assuming the fiducial model also shows good agreement. These results suggest that, given the sample of clusters used here, both the astrophysics of massive clusters and the cosmological parameters derived from them are broadly consistent with current models.

  1. Validating clustering of molecular dynamics simulations using polymer models.

    PubMed

    Phillips, Joshua L; Colvin, Michael E; Newsam, Shawn

    2011-11-14

    Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers.

  2. Validating clustering of molecular dynamics simulations using polymer models

    PubMed Central

    2011-01-01

    Background: Molecular dynamics (MD) simulation is a powerful technique for sampling the meta-stable and transitional conformations of proteins and other biomolecules. Computational data clustering has emerged as a useful, automated technique for extracting conformational states from MD simulation data. Despite extensive application, relatively little work has been done to determine if the clustering algorithms are actually extracting useful information. A primary goal of this paper therefore is to provide such an understanding through a detailed analysis of data clustering applied to a series of increasingly complex biopolymer models. Results: We develop a novel series of models using basic polymer theory that have intuitive, clearly-defined dynamics and exhibit the essential properties that we are seeking to identify in MD simulations of real biomolecules. We then apply spectral clustering, an algorithm particularly well-suited for clustering polymer structures, to our models and MD simulations of several intrinsically disordered proteins. Clustering results for the polymer models provide clear evidence that the meta-stable and transitional conformations are detected by the algorithm. The results for the polymer models also help guide the analysis of the disordered protein simulations by comparing and contrasting the statistical properties of the extracted clusters. Conclusions: We have developed a framework for validating the performance and utility of clustering algorithms for studying molecular biopolymer simulations that utilizes several analytic and dynamic polymer models which exhibit well-behaved dynamics including: meta-stable states, transition states, helical structures, and stochastic dynamics. We show that spectral clustering is robust to anomalies introduced by structural alignment and that different structural classes of intrinsically disordered proteins can be reliably discriminated from the clustering results. To our knowledge, our framework is the first to utilize model polymers to rigorously test the utility of clustering algorithms for studying biopolymers. PMID:22082218

  3. Analysis of the convective evaporation of nondilute clusters of drops

    NASA Technical Reports Server (NTRS)

    Bellan, J.; Harstad, K.

    1987-01-01

    The penetration distance of an outer flow into a drop cluster volume is the critical, evaporation mode-controlling parameter in the present model for nondilute drop clusters' convective evaporation. The model is found to perform well for such low penetration distances as those obtained for dense clusters in hot environments and low relative velocities between the outer gases and the cluster. For large penetration distances, however, the predictive power of the model deteriorates; in addition, the evaporation time is found to be a weak function of the initial relative velocity and a strong function of the initial drop temperature. The results generally show that the interior drop temperature was transient throughout the drop lifetime, although temperature nonuniformities persisted up to the first third of the total evaporation time at most.

  4. Water transport and clustering behavior in homopolymer and graft copolymer polylactide

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Du, An; Koo, Donghun; Theryo, Grayce

    2015-02-19

    Polylactide is a bio-based and biodegradable polymer well-known for its renewable origins. Water sorption and clustering behavior in both a homopolymer polylactide and a graft copolymer of polylactide was studied using the quartz crystal microbalance/heat conduction calorimetry (QCM/HCC) technique. The graft copolymer, poly(1,5-cyclooctadiene-co-5-norbornene-2-methanol-graft-D,L-lactide), contained polylactide chains (95 wt.%) grafted onto a hydrophobic rubbery backbone (5 wt.%). Clustering is an important phenomenon in the study of water transport properties in polymers since the presence of water clusters can affect the water diffusivity. The HCC method using the thermal power signals and Van't Hoff's law were both employed to estimate the water sorption enthalpy. Sorption enthalpy of water in both polymers was determined to be approximately -40 kJ/mol for all water activity levels. Zimm-Lundberg analysis showed that water clusters start to form at a water activity of 0.4. The engaged species induced clustering (ENSIC) model was used to curve fit sorption isotherms and showed that the affinity among water molecules is higher than that between water molecules and polymer chains. All the methods used indicate that clustering of water molecules exists in both polymers.

  5. Three estimates of the association between linear growth failure and cognitive ability.

    PubMed

    Cheung, Y B; Lam, K F

    2009-09-01

    To compare three estimators of association between growth stunting as measured by height-for-age Z-score and cognitive ability in children, and to examine the extent to which statistical adjustment for covariates is useful for removing confounding due to socio-economic status. Three estimators, namely random-effects, within- and between-cluster estimators, for panel data were used to estimate the association in a survey of 1105 pairs of siblings who were assessed for anthropometry and cognition. Furthermore, a 'combined' model was formulated to simultaneously provide the within- and between-cluster estimates. Random-effects and between-cluster estimators showed strong association between linear growth and cognitive ability, even after adjustment for a range of socio-economic variables. In contrast, the within-cluster estimator showed a much more modest association: For every increase of one Z-score in linear growth, cognitive ability increased by about 0.08 standard deviation (P < 0.001). The combined model verified that the between-cluster estimate was significantly larger than the within-cluster estimate (P = 0.004). Residual confounding by socio-economic situations may explain a substantial proportion of the observed association between linear growth and cognition in studies that attempt to control the confounding by means of multivariable regression analysis. The within-cluster estimator provides more convincing and modest results about the strength of association.
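
    The difference between the within- and between-cluster estimators described in this abstract can be illustrated with a minimal sketch. It assumes a single exposure and no covariate adjustment, which is simpler than the models in the study; the function names and the toy sibling data are hypothetical and chosen only so that the two slopes differ.

```python
from collections import defaultdict


def ols_slope(xs, ys):
    """Slope from a simple least-squares regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def within_between_slopes(records):
    """records: iterable of (cluster_id, x, y). Returns (within, between)."""
    groups = defaultdict(list)
    for cluster_id, x, y in records:
        groups[cluster_id].append((x, y))

    mean_x, mean_y = [], []   # one entry per cluster (between-cluster data)
    dev_x, dev_y = [], []     # one entry per individual (within-cluster data)
    for members in groups.values():
        mx = sum(x for x, _ in members) / len(members)
        my = sum(y for _, y in members) / len(members)
        mean_x.append(mx)
        mean_y.append(my)
        for x, y in members:
            # Deviations from the cluster mean remove anything shared
            # by all members of the cluster (e.g. family circumstances).
            dev_x.append(x - mx)
            dev_y.append(y - my)

    return ols_slope(dev_x, dev_y), ols_slope(mean_x, mean_y)


# Hypothetical sibling pairs: (family, height-for-age Z, cognitive score).
toy_records = [
    ("a", -1.5, -1.05), ("a", -0.5, -0.95),
    ("b", -0.5, -0.05), ("b", 0.5, 0.05),
    ("c", 0.5, 0.95), ("c", 1.5, 1.05),
]
within, between = within_between_slopes(toy_records)
print(within, between)
```

    In this toy data the families differ strongly from each other while siblings within a family differ little, so the between-cluster slope is much larger than the within-cluster slope, which is the pattern the abstract attributes to residual confounding.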

  6. A clustering approach to segmenting users of internet-based risk calculators.

    PubMed

    Harle, C A; Downs, J S; Padman, R

    2011-01-01

    Risk calculators are widely available Internet applications that deliver quantitative health risk estimates to consumers. Although these tools are known to have varying effects on risk perceptions, little is known about who will be more likely to accept objective risk estimates. To identify clusters of online health consumers that help explain variation in individual improvement in risk perceptions from web-based quantitative disease risk information. A secondary analysis was performed on data collected in a field experiment that measured people's pre-diabetes risk perceptions before and after visiting a realistic health promotion website that provided quantitative risk information. K-means clustering was performed on numerous candidate variable sets, and the different segmentations were evaluated based on between-cluster variation in risk perception improvement. Variation in responses to risk information was best explained by clustering on pre-intervention absolute pre-diabetes risk perceptions and an objective estimate of personal risk. Members of a high-risk overestimater cluster showed large improvements in their risk perceptions, but clusters of both moderate-risk and high-risk underestimaters were much more muted in improving their optimistically biased perceptions. Cluster analysis provided a unique approach for segmenting health consumers and predicting their acceptance of quantitative disease risk information. These clusters suggest that health consumers were very responsive to good news, but tended not to incorporate bad news into their self-perceptions much. These findings help to quantify variation among online health consumers and may inform the targeted marketing of and improvements to risk communication tools on the Internet.
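
    As a rough illustration of the procedure in this abstract (k-means on a candidate variable set, then scoring the segmentation by between-cluster variation in an outcome), here is a minimal sketch. It is not the authors' implementation: the seeding rule, the variable names, and the toy values are all assumptions made for the example.

```python
def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm. Assumption: the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(centroids[j])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def between_cluster_variance(labels, outcome):
    """Size-weighted variance of cluster means of `outcome` around the grand mean."""
    grand = sum(outcome) / len(outcome)
    total = 0.0
    for j in set(labels):
        values = [o for o, lab in zip(outcome, labels) if lab == j]
        total += len(values) * (sum(values) / len(values) - grand) ** 2
    return total / len(outcome)


# Hypothetical users: (perceived risk, objective risk), both scaled to 0..1.
users = [(0.1, 0.2), (0.9, 0.8), (0.15, 0.25), (0.85, 0.9), (0.2, 0.1), (0.95, 0.85)]
# Hypothetical improvement in risk perception for each user.
improvement = [0.0, 0.4, 0.1, 0.5, 0.0, 0.3]

labels, centroids = kmeans(users, k=2)
print(labels, between_cluster_variance(labels, improvement))
```

    Repeating the last two lines for each candidate variable set and keeping the set with the largest between-cluster variance mirrors, in simplified form, the comparison of segmentations described above.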

  7. Low Divergence of Clonorchis sinensis in China Based on Multilocus Analysis

    PubMed Central

    Sun, Jiufeng; Huang, Yan; Huang, Huaiqiu; Liang, Pei; Wang, Xiaoyun; Mao, Qiang; Men, Jingtao; Chen, Wenjun; Deng, Chuanhuan; Zhou, Chenhui; Lv, Xiaoli; Zhou, Juanjuan; Zhang, Fan; Li, Ran; Tian, Yanli; Lei, Huali; Liang, Chi; Hu, Xuchu; Xu, Jin; Li, Xuerong; Yu, Xinbing

    2013-01-01

    Clonorchis sinensis, an ancient parasite that infects a number of piscivorous mammals, attracts significant public health interest due to zoonotic exposure risks in Asia. The available studies are insufficient to reflect the prevalence, geographic distribution, and intraspecific genetic diversity of C. sinensis in endemic areas. Here, a multilocus analysis based on eight genes (ITS1, act, tub, ef-1a, cox1, cox3, nad4 and nad5 [4.986 kb]) was employed to explore the intra-species genetic construction of C. sinensis in China. Two hundred and fifty-six C. sinensis isolates were obtained from environmental reservoirs from 17 provinces of China. A total of 254 recognized Multilocus Types (MSTs) showed high diversity among these isolates using multilocus analysis. The comparison analysis of nuclear and mitochondrial phylogeny supports separate clusters in a nuclear dendrogram. Genetic differentiation analysis of three clusters (A, B, and C) showed low divergence within populations. Most isolates from clusters B and C are geographically limited to central China, while cluster A is extraordinarily genetically diverse. Further genetic analyses between different geographic distributions, water bodies and hosts support the low population divergence. The latter haplotype analyses were consistent with the phylogenetic and genetic differentiation results. A recombination network based on concatenated sequences showed a concentrated linkage recombination population in cox1, cox3, nad4 and nad5, with spatial structuring in ITS1. Coupled with the history record and archaeological evidence of C. sinensis infection in mummified desiccated feces, these data point to an ancient origin of C. sinensis in China. In conclusion, we present a likely phylogenetic structure of the C. sinensis population in mainland China, highlighting its possible tendency for biogeographic expansion. Meanwhile, ITS1 was found to be an effective marker for tracking C. sinensis infection worldwide. 
Thus, the present study improves our understanding of the global epidemiology and evolution of C. sinensis. PMID:23825605

  8. Analysis on the inbound tourist source market in Fujian Province

    NASA Astrophysics Data System (ADS)

    YU, Tong

    2017-06-01

    The paper analyzes the development and structure of inbound tourism in Fujian Province by Excel software and conducts the cluster analysis on the inbound tourism market by SPSS 23.0 software based on the inbound tourism data of Fujian Province from 2006 to 2015. The results show: the rapid development of inbound tourism in Fujian Province and the diversified inbound tourist source countries indicate the stability of inbound tourism market; the inbound tourist source market in Fujian Province can be divided into four categories according to the cluster analysis, and tourists from the United States, Japan, Malaysia, and Singapore are the key of inbound tourism in Fujian Province.

  9. Clustering of financial time series with application to index and enhanced index tracking portfolio

    NASA Astrophysics Data System (ADS)

    Dose, Christian; Cincotti, Silvano

    2005-09-01

    A stochastic-optimization technique based on time series cluster analysis is described for index tracking and enhanced index tracking problems. Our methodology solves the problem in two steps, i.e., by first selecting a subset of stocks and then setting the weight of each stock as a result of an optimization process (asset allocation). The present formulation takes into account constraints on the number of stocks and on the fraction of capital invested in each of them, whilst not including transaction costs. Computational results based on clustering selection are compared to those of random techniques and show the importance of clustering in noise reduction and robust forecasting applications, in particular for enhanced index tracking.

  10. Quantum phase transition between cluster and antiferromagnetic states

    NASA Astrophysics Data System (ADS)

    Son, W.; Amico, L.; Fazio, R.; Hamma, A.; Pascazio, S.; Vedral, V.

    2011-09-01

    We study a Hamiltonian system describing a three-spin-1/2 cluster-like interaction competing with an Ising-like exchange. We show that the ground state in the cluster phase possesses symmetry protected topological order. A continuous quantum phase transition occurs as a result of the competition between the cluster and Ising terms. At the critical point the Hamiltonian is self-dual. The geometric entanglement is also studied and used to investigate the quantum phase transition. Our findings in one dimension corroborate the analysis of the two-dimensional generalization of the system, indicating, at a mean-field level, the presence of a direct transition between an antiferromagnetic and a valence bond solid ground state.

  11. On the Surface Mapping using Individual Cluster Impacts

    PubMed Central

    Fernandez-Lima, F.A.; Eller, M.J.; DeBord, J.D.; Verkhoturov, S.V.; Della-Negra, S.; Schweikert, E.A.

    2011-01-01

    This paper describes the advantages of using single impacts of large cluster projectiles (e.g. C60 and Au400) for surface mapping and characterization. The analysis of co-emitted time-resolved photon spectra, electron distributions and characteristic secondary ions shows that they can be used as surface fingerprints for target composition, morphology and structure. Photon, electron and secondary ion emission increases with the projectile cluster size and energy. The observed, highly abundant secondary ion emission makes cluster projectiles good candidates for surface mapping of atomic and fragment ions (e.g., yield >1 per nominal mass) and molecular ions (e.g., few tens of percent in the 500 < m/z < 1500 range). PMID:22393269

  12. Security and Correctness Analysis on Privacy-Preserving k-Means Clustering Schemes

    NASA Astrophysics Data System (ADS)

    Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi

    Due to the fast development of the Internet and related IT technologies, it has become increasingly easy to access large amounts of data. k-means clustering is a powerful and frequently used technique in data mining. Many research papers about privacy-preserving k-means clustering have been published. In this paper, we analyze the existing privacy-preserving k-means clustering schemes based on cryptographic techniques. We show that those schemes cause privacy breaches and cannot output correct results due to faults in the protocol construction. Furthermore, we analyze our proposal as an option to mitigate such problems, but with an intermediate information breach during the computation.

  13. A pattern-mixture model approach for handling missing continuous outcome data in longitudinal cluster randomized trials.

    PubMed

    Fiero, Mallorie H; Hsu, Chiu-Hsieh; Bell, Melanie L

    2017-11-20

    We extend the pattern-mixture approach to handle missing continuous outcome data in longitudinal cluster randomized trials, which randomize groups of individuals to treatment arms, rather than the individuals themselves. Individuals who drop out at the same time point are grouped into the same dropout pattern. We approach extrapolation of the pattern-mixture model by applying multilevel multiple imputation, which imputes missing values while appropriately accounting for the hierarchical data structure found in cluster randomized trials. To assess parameters of interest under various missing data assumptions, imputed values are multiplied by a sensitivity parameter, k, which increases or decreases imputed values. Using simulated data, we show that estimates of parameters of interest can vary widely under differing missing data assumptions. We conduct a sensitivity analysis using real data from a cluster randomized trial by increasing k until the treatment effect inference changes. By performing a sensitivity analysis for missing data, researchers can assess whether certain missing data assumptions are reasonable for their cluster randomized trial. Copyright © 2017 John Wiley & Sons, Ltd.
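
    The sensitivity analysis described in this abstract (scale the imputed values by k, then increase k until the conclusion changes) can be sketched as follows. This is a deliberately simplified stand-in: it uses a single set of already imputed values and a plain difference in arm means, and it treats a sign change of the estimate as the "change in inference". The multilevel multiple imputation and model-based inference used in the study are not reproduced, and all names and numbers are hypothetical.

```python
def effect_under_k(arms, k):
    """Difference in arm means after multiplying every imputed value by k.

    arms maps an arm name to a list of (value, was_imputed) pairs.
    """
    means = {}
    for arm, values in arms.items():
        adjusted = [v * k if imputed else v for v, imputed in values]
        means[arm] = sum(adjusted) / len(adjusted)
    return means["treatment"] - means["control"]


def tipping_point(arms, start=1.0, step=0.25, max_steps=40):
    """Smallest k on the grid at which the estimated effect is no longer positive.

    Assumption: a non-positive estimate stands in for a changed inference.
    Returns None if no such k is found within the grid.
    """
    for i in range(max_steps):
        k = start + i * step
        if effect_under_k(arms, k) <= 0:
            return k
    return None


# Hypothetical trial data; True marks a value that was imputed.
toy_arms = {
    "treatment": [(10.0, False), (12.0, False), (11.0, False), (11.0, True)],
    "control": [(8.0, False), (9.0, False), (8.5, True), (8.5, True)],
}
print(effect_under_k(toy_arms, 1.0), tipping_point(toy_arms))
```

    Because the control arm has more imputed values in this toy data, inflating them by k erodes the apparent treatment effect; how large k must become before the conclusion flips indicates how strongly the result depends on the missing data assumption.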

  14. An Empirical Taxonomy of Hospital Governing Board Roles

    PubMed Central

    Lee, Shoou-Yih D; Alexander, Jeffrey A; Wang, Virginia; Margolin, Frances S; Combes, John R

    2008-01-01

    Objective To develop a taxonomy of governing board roles in U.S. hospitals. Data Sources 2005 AHA Hospital Governance Survey, 2004 AHA Annual Survey of Hospitals, and Area Resource File. Study Design A governing board taxonomy was developed using cluster analysis. Results were validated and reviewed by industry experts. Differences in hospital and environmental characteristics across clusters were examined. Data Extraction Methods One-thousand three-hundred thirty-four hospitals with complete information on the study variables were included in the analysis. Principal Findings Five distinct clusters of hospital governing boards were identified. Statistical tests showed that the five clusters had high internal reliability and high internal validity. Statistically significant differences in hospital and environmental conditions were found among clusters. Conclusions The developed taxonomy provides policy makers, health care executives, and researchers a useful way to describe and understand hospital governing board roles. The taxonomy may also facilitate valid and systematic assessment of governance performance. Further, the taxonomy could be used as a framework for governing boards themselves to identify areas for improvement and direction for change. PMID:18355260

  15. Structural parameters of young star clusters: fractal analysis

    NASA Astrophysics Data System (ADS)

    Hetem, A.

    2017-07-01

    A unified view of star formation in the Universe demands detailed and in-depth studies of young star clusters. This work is related to our previous study of fractal statistics estimated for a sample of young stellar clusters (Gregorio-Hetem et al. 2015, MNRAS 448, 2504). The structural properties can lead to significant conclusions about the early stages of cluster formation: 1) virial conditions can be used to distinguish warm collapsed; 2) bound or unbound behaviour can lead to conclusions about expansion; and 3) fractal statistics are correlated to the dynamical evolution and age. The technique of error-bar estimation most used in the literature is to adopt inferential methods (like bootstrap) to estimate deviation and variance, which are valid only for an artificially generated cluster. In this paper, we expanded the number of studied clusters, in order to enhance the investigation of the cluster properties and dynamic evolution. The structural parameters were compared with fractal statistics and reveal that the clusters' radial density profiles show a tendency for the mean separation of the stars to increase with the average surface density. The sample can be divided into two groups showing different dynamic behaviour, but they have the same dynamic evolution, since the entire sample was revealed to consist of expanding objects, for which the substructures do not seem to have been completely erased. These results are in agreement with the simulations adopting low surface densities and supervirial conditions.

  16. Biochemical characterization and phylogenetic analysis based on 16S rRNA sequences for V-factor dependent members of Pasteurellaceae derived from laboratory rats.

    PubMed

    Hayashimoto, Nobuhito; Ueno, Masami; Tkakura, Akira; Itoh, Toshio

    2007-06-01

    Phylogenetic analysis based on 16S rRNA sequences with sequence data of some bacterial species of Pasteurellaceae related to rodents deposited in GenBank was performed along with biochemical characterization for the 20 strains of V-factor dependent members of Pasteurellaceae derived from laboratory rats to obtain basic information and to investigate the taxonomic positions. The results of biochemical tests for all strains were identical except for three tests, the ornithine decarboxylase test, and fermentation tests of D(+) mannose and D(+) xylose. The biochemical properties of 8 of 20 strains that showed negative results for the fermentation test of D(+) xylose agreed with those of Haemophilus parainfluenzae complex. By phylogenetic analysis, the strains were divided into two clusters that agreed with the results of the fermentation test of xylose (group I: negative reaction for xylose, group II: positive reaction for xylose). The clusters were independent of other bacterial species of Pasteurellaceae tested. The sequences of the strains in group I showed 99.7-99.8% similarity and the strains in group II showed 99.3-99.7% similarity. None of the strains in group I had a close relation with Haemophilus parainfluenzae by phylogenetic analysis, although they showed the same biochemical properties. In conclusion, the strains had characteristic biochemical properties and formed two independent groups within the "rodent cluster" of Pasteurellaceae that differed in the results of the fermentation test of xylose. Therefore, they seemed to be hitherto undescribed taxa in Pasteurellaceae.

  17. A genetic graph-based approach for partitional clustering.

    PubMed

    Menéndez, Héctor D; Barrero, David F; Camacho, David

    2014-05-01

    Clustering is one of the most versatile tools for data analysis. In the recent years, clustering that seeks the continuity of data (in opposition to classical centroid-based approaches) has attracted an increasing research interest. It is a challenging problem with a remarkable practical interest. The most popular continuity clustering method is the spectral clustering (SC) algorithm, which is based on graph cut: It initially generates a similarity graph using a distance measure and then studies its graph spectrum to find the best cut. This approach is sensitive to the parameters of the metric, and a correct parameter choice is critical to the quality of the cluster. This work proposes a new algorithm, inspired by SC, that reduces the parameter dependency while maintaining the quality of the solution. The new algorithm, named genetic graph-based clustering (GGC), takes an evolutionary approach introducing a genetic algorithm (GA) to cluster the similarity graph. The experimental validation shows that GGC increases robustness of SC and has competitive performance in comparison with classical clustering methods, at least, in the synthetic and real dataset used in the experiments.

  18. A ground truth based comparative study on clustering of gene expression data.

    PubMed

    Zhu, Yitan; Wang, Zuyi; Miller, David J; Clarke, Robert; Xuan, Jianhua; Hoffman, Eric P; Wang, Yue

    2008-05-01

    Given the variety of available clustering methods for gene expression data analysis, it is important to develop an appropriate and rigorous validation scheme to assess the performance and limitations of the most widely used clustering algorithms. In this paper, we present a ground truth based comparative study on the functionality, accuracy, and stability of five data clustering methods, namely hierarchical clustering, K-means clustering, self-organizing maps, standard finite normal mixture fitting, and a caBIG toolkit (VIsual Statistical Data Analyzer--VISDA), tested on sample clustering of seven published microarray gene expression datasets and one synthetic dataset. We examined the performance of these algorithms in both data-sufficient and data-insufficient cases using quantitative performance measures, including cluster number detection accuracy and mean and standard deviation of partition accuracy. The experimental results showed that VISDA, an interactive coarse-to-fine maximum likelihood fitting algorithm, is a solid performer on most of the datasets, while K-means clustering and self-organizing maps optimized by the mean squared compactness criterion generally produce more stable solutions than the other methods.

  19. Mass spectrometric identification of intermediates in the O2-driven [4Fe-4S] to [2Fe-2S] cluster conversion in FNR

    PubMed Central

    Crack, Jason C.; Thomson, Andrew J.

    2017-01-01

    The iron-sulfur cluster containing protein Fumarate and Nitrate Reduction (FNR) is the master regulator for the switch between anaerobic and aerobic respiration in Escherichia coli and many other bacteria. The [4Fe-4S] cluster functions as the sensory module, undergoing reaction with O2 that leads to conversion to a [2Fe-2S] form with loss of high-affinity DNA binding. Here, we report studies of the FNR cluster conversion reaction using time-resolved electrospray ionization mass spectrometry. The data provide insight into the reaction, permitting the detection of cluster conversion intermediates and products, including a [3Fe-3S] cluster and persulfide-coordinated [2Fe-2S] clusters [[2Fe-2S](S)n, where n = 1 or 2]. Analysis of kinetic data revealed a branched mechanism in which cluster sulfide oxidation occurs in parallel with cluster conversion and not as a subsequent, secondary reaction to generate [2Fe-2S](S)n species. This methodology shows great potential for broad application to studies of protein cofactor–small molecule interactions. PMID:28373574

  20. Modeling the Movement of Homicide by Type to Inform Public Health Prevention Efforts.

    PubMed

    Zeoli, April M; Grady, Sue; Pizarro, Jesenia M; Melde, Chris

    2015-10-01

    We modeled the spatiotemporal movement of hotspot clusters of homicide by motive in Newark, New Jersey, to investigate whether different homicide types have different patterns of clustering and movement. We obtained homicide data from the Newark Police Department Homicide Unit's investigative files from 1997 through 2007 (n = 560). We geocoded the address at which each homicide victim was found and recorded the date of and the motive for the homicide. We used cluster detection software to model the spatiotemporal movement of statistically significant homicide clusters by motive, using census tract and month of occurrence as the spatial and temporal units of analysis. Gang-motivated homicides showed evidence of clustering and diffusion through Newark. Additionally, gang-motivated homicide clusters overlapped to a degree with revenge and drug-motivated homicide clusters. Escalating dispute and nonintimate familial homicides clustered; however, there was no evidence of diffusion. Intimate partner and robbery homicides did not cluster. By tracking how homicide types diffuse through communities and determining which places have ongoing or emerging homicide problems by type, we can better inform the deployment of prevention and intervention efforts.

  1. Spectroscopic determination of fundamental parameters of small angular diameter galactic open clusters

    NASA Astrophysics Data System (ADS)

    Ahumada, A. V.; Claria, J. J.; Bica, E.; Parisi, M. C.; Torres, M. C.; Pavani, D. B.

    We present integrated spectra obtained at CASLEO (Argentina) for 9 galactic open clusters of small angular diameter. Two of them (BH 55 and Rup 159) have not been the target of previous research. The flux-calibrated spectra cover the spectral range approx. 3600-6900 A. Using the equivalent widths (EWs) of the Balmer lines and comparing the cluster spectra with template spectra, we determined E(B-V) colour excesses and ages for the present cluster sample. The parameters obtained for 6 of the clusters show good agreement with previous determinations based mainly on photometric methods. This is not the case, however, for BH 90, a scarcely reddened cluster, for which Moffat and Vogt (1975, Astron. and Astroph. SS, 20, 125) derived E(B-V) = 0.51. We explain and justify the strong discrepancy found for this object. According to the present analysis, 3 clusters are very young (Bo 14, Tr 15 and Tr 27), 2 are moderately young (NGC 6268 and BH 205), 3 are Hyades-like clusters (Rup 164, BH 90 and BH 55) and only one is an intermediate-age cluster (Rup 159).

  2. Computer simulations of dendrimer-polyelectrolyte complexes.

    PubMed

    Pandav, Gunja; Ganesan, Venkat

    2014-08-28

    We carry out a systematic analysis of static properties of the clusters formed by complexation between charged dendrimers and linear polyelectrolyte (LPE) chains in a dilute solution under good solvent conditions. We use single chain in mean-field simulations and analyze the structure of the clusters through radial distribution functions of the dendrimer, cluster size, and charge distributions. The effects of LPE length, charge ratio between LPE and dendrimer, the influence of salt concentration, and the dendrimer generation number are examined. Systems with short LPEs showed a reduced propensity for aggregation with dendrimers, leading to formation of smaller clusters. In contrast, larger dendrimers and longer LPEs lead to larger clusters with significant bridging. Increasing salt concentration was seen to reduce aggregation between dendrimers as a result of screening of electrostatic interactions. Generally, maximum complexation was observed in systems with an equal amount of net dendrimer and LPE charges, whereas either excess LPE or dendrimer concentrations resulted in reduced clustering between dendrimers.

  3. Equilibrium geometries, electronic and magnetic properties of small AunNi- (n = 1-9) clusters

    NASA Astrophysics Data System (ADS)

    Tang, Cui-Ming; Chen, Xiao-Xu; Yang, Xiang-Dong

    2014-05-01

    Geometrical, electronic and magnetic properties of small AunNi- (n = 1-9) clusters have been investigated based on density functional theory (DFT) at PW91P86 level. An extensive structural search shows that the relatively stable structures of AunNi- (n = 1-9) clusters adopt 2D structure for n = 1-5, 7 and 3D structure for n = 6, 8-9. The substitution of a Ni atom for an Au atom in the Au-n+1 cluster obviously changes the structure of the host cluster. Moreover, an odd-even alternation phenomenon has been found for HOMO-LUMO energy gaps, indicating that the relatively stable structures of the AunNi- clusters with odd-numbered gold atoms have a higher relative stability. Finally, the natural population analysis (NPA) and the vertical detachment energies (VDE) are studied, respectively. To the best of our knowledge, the theoretical values of VDE are reported here for the first time.

  4. Dynamic multifactor clustering of financial networks

    NASA Astrophysics Data System (ADS)

    Ross, Gordon J.

    2014-02-01

    We investigate the tendency for financial instruments to form clusters when there are multiple factors influencing the correlation structure. Specifically, we consider a stock portfolio which contains companies from different industrial sectors, located in several different countries. Both sector membership and geography combine to create a complex clustering structure where companies seem to first be divided based on sector, with geographical subclusters emerging within each industrial sector. We argue that standard techniques for detecting overlapping clusters and communities are not able to capture this type of structure and show how robust regression techniques can instead be used to remove the influence of both sector and geography from the correlation matrix separately. Our analysis reveals that prior to the 2008 financial crisis, companies did not tend to form clusters based on geography. This changed immediately following the crisis, with geography becoming a more important determinant of clustering structure.

  5. Mixed Pattern Matching-Based Traffic Abnormal Behavior Recognition

    PubMed Central

    Cui, Zhiming; Zhao, Pengpeng

    2014-01-01

    A motion trajectory is an intuitive representation, in the time-space domain, of the micromotion behavior of a moving target. Trajectory analysis is an important approach to recognizing abnormal behaviors of moving targets. To address the complexity of vehicle trajectories, this paper first proposes a trajectory pattern learning method based on dynamic time warping (DTW) and spectral clustering. It introduces the DTW distance to measure the distances between vehicle trajectories and determines the number of clusters automatically by a spectral clustering algorithm based on the distance matrix. Then, it clusters sample data points into different clusters. After the spatial patterns and direction patterns are learned from the clusters, a recognition method for detecting vehicle abnormal behaviors based on mixed pattern matching is proposed. The experimental results show that the proposed technical scheme can recognize the main types of traffic abnormal behaviors effectively and has good robustness. The real-world application verified its feasibility and validity. PMID:24605045
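
    The DTW distance that this abstract uses to compare trajectories can be sketched with the textbook dynamic-programming recurrence. This is a generic version, not the paper's code: it assumes trajectories are lists of (x, y) points and uses Euclidean point distance, and the toy trajectories are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two point sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats a point
                cost[i][j - 1],      # b advances, a repeats a point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def dtw_matrix(trajectories):
    """Symmetric pairwise DTW distance matrix, the usual clustering input."""
    size = len(trajectories)
    matrix = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1, size):
            d = dtw_distance(trajectories[i], trajectories[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical trajectories: the second pauses at the start, the third is offset.
straight = [(0, 0), (1, 0), (2, 0)]
paused = [(0, 0), (0, 0), (1, 0), (2, 0)]
offset = [(0, 1), (1, 1), (2, 1)]
print(dtw_matrix([straight, paused, offset]))
```

    The pause in the second trajectory costs nothing under DTW because the warping path may repeat a point, whereas the spatial offset of the third trajectory accumulates along the whole path. A matrix like this is what a distance-based method such as spectral clustering would take as input.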

  6. Genetic diversity of Rhizobia isolates from Amazon soils using cowpea (Vigna unguiculata) as trap plant

    PubMed Central

    Silva, F.V.; Simões-Araújo, J.L.; Silva Júnior, J.P.; Xavier, G.R.; Rumjanek, N.G.

    2012-01-01

    The aim of this work was to characterize rhizobia isolated from the root nodules of cowpea (Vigna unguiculata) plants cultivated in Amazon soil samples by means of ARDRA (Amplified rDNA Restriction Analysis) and sequencing analysis, in order to determine their phylogenetic relationships. The 16S rRNA gene of rhizobia was amplified by PCR (polymerase chain reaction) using universal primers Y1 and Y3. The amplification products were analyzed by the restriction enzymes HinfI, MspI and DdeI and also sequenced with Y1, Y3 and six intermediate primers. The clustering analysis based on ARDRA profiles separated the Amazon isolates into three subgroups, which formed a group apart from the reference isolates of Bradyrhizobium japonicum and Bradyrhizobium elkanii. The clustering analysis of 16S rRNA gene sequences showed that the fast-growing isolates had similarity with Enterobacter, Rhizobium, Klebsiella and Bradyrhizobium and all the slow-growing isolates clustered close to Bradyrhizobium. PMID:24031880

  7. Pseudomonas aeruginosa in Dairy Goats: Genotypic and Phenotypic Comparison of Intramammary and Environmental Isolates

    PubMed Central

    Scaccabarozzi, Licia; Leoni, Livia; Ballarini, Annalisa; Barberio, Antonio; Locatelli, Clara; Casula, Antonio; Bronzo, Valerio; Pisoni, Giuliano; Jousson, Olivier; Morandi, Stefano; Rapetti, Luca; García-Fernández, Aurora; Moroni, Paolo

    2015-01-01

    Following the identification of a case of severe clinical mastitis in a Saanen dairy goat (goat A), an average of 26 lactating goats in the herd was monitored over a period of 11 months. Milk microbiological analysis revealed the presence of Pseudomonas aeruginosa in 7 of the goats. Among these 7 does, only goat A showed clinical signs of mastitis. The 7 P. aeruginosa isolates from the goat milk and 26 P. aeruginosa isolates from environmental samples were clustered by RAPD-PCR and PFGE analyses in 3 genotypes (G1, G2, G3) and 4 clusters (A, B, C, D), respectively. PFGE clusters A and B correlated with the G1 genotype and included the 7 milk isolates. Although it was not possible to identify the infection source, these results strongly suggest a spreading of the infection from goat A. Clusters C and D overlapped with genotypes G2 and G3, respectively, and included only environmental isolates. The outcome of the antimicrobial susceptibility test performed on the isolates revealed 2 main patterns of multiple resistance to beta-lactam antibiotics and macrolides. Virulence related phenotypes were analyzed, such as swarming and swimming motility, production of biofilm and production of secreted virulence factors. The isolates had distinct phenotypic profiles, corresponding to genotypes G1, G2 and G3. Overall, correlation analysis showed a strong correlation between sampling source, RAPD genotype, PFGE clusters, and phenotypic clusters. The comparison of the levels of virulence related phenotypes did not indicate a higher pathogenic potential in the milk isolates as compared to the environmental isolates. PMID:26606430

  8. Genetic Diversity and Differentiation of Colletotrichum spp. Isolates Associated with Leguminosae Using Multigene Loci, RAPD and ISSR

    PubMed Central

    Mahmodi, Farshid; Kadir, J. B.; Puteh, A.; Pourdad, S. S.; Nasehi, A.; Soleimani, N.

    2014-01-01

    Genetic diversity and differentiation of 50 Colletotrichum spp. isolates from legume crops were studied through multigene loci, RAPD and ISSR analysis. DNA sequence comparisons of six genes (ITS, ACT, Tub2, CHS-1, GAPDH, and HIS3) verified the species identity of C. truncatum, C. dematium and C. gloeosporioides and identified C. capsici as a synonym of C. truncatum. Based on the matrix distance analysis of multigene sequences, the Colletotrichum species showed diverse degrees of intra- and interspecific divergence (0.0 to 1.4% and 15.5 to 19.9%, respectively). A multilocus molecular phylogenetic analysis clustered Colletotrichum spp. isolates into 3 well-defined clades, representing three distinct species: C. truncatum, C. dematium and C. gloeosporioides. The ISSR and RAPD cluster analysis exhibited a high degree of variability among different isolates and permitted the grouping of isolates of Colletotrichum spp. into three distinct clusters. Distinct populations of Colletotrichum spp. isolates were genetically in accordance with host specificity and inconsistent with geographical origins. The large population of C. truncatum showed greater amounts of genetic diversity than smaller populations of C. dematium and C. gloeosporioides species. Results of ISSR and RAPD markers were congruent, but the effective marker ratio and the number of private alleles were greater in ISSR markers. PMID:25288981

  9. High diversity and rapid diversification in the head louse, Pediculus humanus (Pediculidae: Phthiraptera)

    PubMed Central

    Ashfaq, Muhammad; Prosser, Sean; Nasir, Saima; Masood, Mariyam; Ratnasingham, Sujeevan; Hebert, Paul D. N.

    2015-01-01

    The study analyzes sequence variation of two mitochondrial genes (COI, cytb) in Pediculus humanus from three countries (Egypt, Pakistan, South Africa) that have received little prior attention, and integrates these results with prior data. Analysis indicates a maximum K2P distance of 10.3% among 960 COI sequences and 13.8% among 479 cytb sequences. Three analytical methods (BIN, PTP, ABGD) reveal five concordant OTUs for COI and cytb. Neighbor-Joining analysis of the COI sequences confirms five clusters; three corresponding to previously recognized mitochondrial clades A, B, C and two new clades, “D” and “E”, showing 2.3% and 2.8% divergence from their nearest neighbors (NN). Cytb data corroborate five clusters showing that clades “D” and “E” are both 4.6% divergent from their respective NN clades. Phylogenetic analysis supports the monophyly of all clusters recovered by NJ analysis. Divergence time estimates suggest that the earliest split of P. humanus clades occurred slightly more than one million years ago (MYa) and the latest about 0.3 MYa. Sequence divergences in COI and cytb among the five clades of P. humanus are 10X those in their human host, a difference that likely reflects both rate acceleration and the acquisition of lice clades from several archaic hominid lineages. PMID:26373806
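
    The K2P (Kimura 2-parameter) distance quoted in this record can be illustrated with a minimal sketch. It assumes two pre-aligned DNA sequences of equal length and uses the standard formula d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions. The function name `k2p_distance` and the toy sequences are hypothetical, not taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1 * math.sqrt(term2))


# One transition (A -> G) in ten compared sites.
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
```

    The corrected distance is slightly larger than the raw proportion of differing sites, because the model allows for multiple substitutions at the same site.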

  10. Big Bangs in Galaxy Clusters: Using X-ray Temperature Maps to Trace Merger Histories in Clusters with Radio Halos/Relics

    NASA Astrophysics Data System (ADS)

    Burns, Jack O.; Datta, Abhirup; Hallman, Eric J.

    2016-06-01

    Galaxy clusters are assembled through large and small mergers which are the most energetic events ("bangs") since the Big Bang. Cluster mergers "stir" the intracluster medium (ICM) creating shocks and turbulence which are illuminated by ~Mpc-sized radio features called relics and halos. These shocks heat the ICM and are detected in x-rays via thermal emission. Disturbed morphologies in x-ray surface brightness and temperatures are direct evidence for cluster mergers. In the radio, relics (in the outskirts of the clusters) and halos (located near the cluster core) are also clear signposts of recent mergers. Our recent ENZO cosmological simulations suggest that around a merger event, radio emission peaks very sharply (and briefly) while the x-ray emission rises and decays slowly. Hence, galaxy clusters that show both luminous x-ray emission and radio relics/halos are good candidates for very recent mergers. We are in the early stages of analyzing a unique sample of 48 galaxy clusters with (i) known radio relics and/or halos and (ii) significant archival x-ray observations (>50 ksec) from Chandra and/or XMM. We have developed a new x-ray data analysis pipeline, implemented on parallel processor supercomputers, to create x-ray surface brightness, high fidelity temperature, and pressure maps of these clusters in order to study merging activity. The temperature maps are made using three different map-making techniques: Weighted Voronoi Tessellation, Adaptive Circular Binning, and Contour Binning. In this talk, we will show preliminary results for several clusters, including Abell 2744 and the Bullet cluster. This work is supported by NASA ADAP grant NNX15AE17G.

  11. Metallicity Gradients in the Intracluster Gas of Abell 496

    NASA Astrophysics Data System (ADS)

    Dupke, Renato A.; White, Raymond E., III

    2000-07-01

    Analysis of spatially resolved ASCA spectra of the intracluster gas in Abell 496 confirms there are mild metal abundance enhancements near the center, as previously found in a joint analysis of spectra from Ginga Large Area Counter and Einstein solid state spectrometer. Simultaneous analysis of spectra from all ASCA instruments (SIS+GIS) shows that the iron abundance is 0.36+/-0.03 solar 3'-12' from the center of the cluster and rises ~50% to 0.53+/-0.04 solar within the central 2'. The F-test shows that this abundance gradient is significant at the more than 99.99% level. Nickel and sulfur abundances are also centrally enhanced. We use a variety of elemental abundance ratios to assess the relative contribution of Type Ia supernovae (SNe Ia) and Type II supernovae (SNe II) to the metal enrichment of the intracluster gas. We find spatial gradients in several abundance ratios, indicating that the fraction of iron from SNe Ia increases toward the cluster center, with SNe Ia accounting for ~50% of the iron mass 3'-12' from the center and ~70% within 2'. The increased proportion of SN Ia ejecta at the center is such that the central iron abundance enhancement can be attributed wholly to SNe Ia; we find no significant gradient in SN II ejecta. These spatial gradients in the proportion of SN Ia/II ejecta imply that the dominant metal enrichment mechanism near the center is different than in the outer parts of the cluster. We show that the central abundance enhancement is unlikely to be due to ram pressure stripping of gas from cluster galaxies or to secularly accumulated stellar mass loss within the central cD. We suggest that the additional SN Ia ejecta near the center is the vestige of a secondary SN Ia-driven wind from the cD (following a more energetic protogalactic SN II-driven wind phase), which was partially smothered in the cD due to its location at the cluster center.

  12. Investigating the usefulness of a cluster-based trend analysis to detect visual field progression in patients with open-angle glaucoma.

    PubMed

    Aoki, Shuichiro; Murata, Hiroshi; Fujino, Yuri; Matsuura, Masato; Miki, Atsuya; Tanito, Masaki; Mizoue, Shiro; Mori, Kazuhiko; Suzuki, Katsuyoshi; Yamashita, Takehiro; Kashiwagi, Kenji; Hirasawa, Kazunori; Shoji, Nobuyuki; Asaoka, Ryo

    2017-12-01

    To investigate the usefulness of the Octopus (Haag-Streit) EyeSuite's cluster trend analysis in glaucoma. Ten visual fields (VFs) with the Humphrey Field Analyzer (Carl Zeiss Meditec), spanning 7.7 years on average were obtained from 728 eyes of 475 primary open angle glaucoma patients. Mean total deviation (mTD) trend analysis and EyeSuite's cluster trend analysis were performed on various series of VFs (from 1st to 10th: VF1-10 to 6th to 10th: VF6-10). The results of the cluster-based trend analysis, based on different lengths of VF series, were compared against mTD trend analysis. Cluster-based trend analysis and mTD trend analysis results were significantly associated in all clusters and with all lengths of VF series. Between 21.2% and 45.9% (depending on VF series length and location) of clusters were deemed to progress when the mTD trend analysis suggested no progression. On the other hand, 4.8% of eyes were observed to progress using the mTD trend analysis when cluster trend analysis suggested no progression in any two (or more) clusters. Whole field trend analysis can miss local VF progression. Cluster trend analysis appears as robust as mTD trend analysis and useful to assess both sectorial and whole field progression. Cluster-based trend analyses, in particular the definition of two or more progressing clusters, may help clinicians to detect glaucomatous progression in a timelier manner than using a whole field trend analysis, without significantly compromising specificity. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  13. The observed clustering of damaging extratropical cyclones in Europe

    NASA Astrophysics Data System (ADS)

    Cusack, Stephen

    2016-04-01

    The clustering of severe European windstorms on annual timescales has substantial impacts on the (re-)insurance industry. Our knowledge of the risk is limited by large uncertainties in estimates of clustering from typical historical storm data sets covering the past few decades. Eight storm data sets are gathered for analysis in this study in order to reduce these uncertainties. Six of the data sets contain more than 100 years of severe storm information to reduce sampling errors, and observational errors are reduced by the diversity of information sources and analysis methods between storm data sets. All storm severity measures used in this study reflect damage, to suit (re-)insurance applications. The shortest storm data set of 42 years provides indications of stronger clustering with severity, particularly for regions off the main storm track in central Europe and France. However, clustering estimates have very large sampling and observational errors, exemplified by large changes in estimates in central Europe upon removal of one stormy season, 1989/1990. The extended storm records place 1989/1990 into a much longer historical context to produce more robust estimates of clustering. All the extended storm data sets show increased clustering between more severe storms from return periods (RPs) of 0.5 years to the longest measured RPs of about 20 years. Further, they contain signs of stronger clustering off the main storm track, and weaker clustering for smaller-sized areas, though these signals are more uncertain as they are drawn from smaller data samples. These new ultra-long storm data sets provide new information on clustering to improve our management of this risk.

  14. Suzaku observations of low surface brightness cluster Abell 1631

    NASA Astrophysics Data System (ADS)

    Babazaki, Yasunori; Mitsuishi, Ikuyuki; Ota, Naomi; Sasaki, Shin; Böhringer, Hans; Chon, Gayoung; Pratt, Gabriel W.; Matsumoto, Hironori

    2018-04-01

    We present analysis results for a nearby galaxy cluster Abell 1631 at z = 0.046 using the X-ray observatory Suzaku. This cluster is categorized as a low X-ray surface brightness cluster. To study the dynamical state of the cluster, we conduct four-pointed Suzaku observations and investigate physical properties of the Mpc-scale hot gas associated with the A 1631 cluster for the first time. Unlike relaxed clusters, the X-ray image shows no strong peak at the center and an irregular morphology. We perform spectral analysis and investigate the radial profiles of the gas temperature, density, and entropy out to approximately 1.5 Mpc in the east, north, west, and south directions by combining with the XMM-Newton data archive. The measured gas density in the central region is relatively low (a few ×10^-4 cm^-3) at the given temperature (~2.9 keV) compared with X-ray-selected clusters. The entropy profile and value within the central region (r < 0.1 r200) are found to be flatter and higher (≳400 keV cm^2). The observed bolometric luminosity is approximately three times lower than that expected from the luminosity-temperature relation in previous studies of relaxed clusters. These features are also observed in another low surface brightness cluster, Abell 76. The spatial distributions of galaxies and the hot gas appear to be different. The X-ray luminosity is relatively lower than that expected from the velocity dispersion. A post-merger scenario may explain the observed results.

  15. The Outer Limits of Galaxy Clusters: Observations to the Virial Radius with Suzaku, XMM, and Chandra

    NASA Technical Reports Server (NTRS)

    Miller, Eric D.; Bautz, Marshall; George, Jithin; Mushotzky, Richard; Davis, David; Henry, J. Patrick

    2012-01-01

    The outskirts of galaxy clusters, near the virial radius, remain relatively unexplored territory and yet are vital to our understanding of cluster growth, structure, and mass. In this presentation, we show the first results from a program to constrain the state of the outer intra-cluster medium (ICM) in a large sample of galaxy clusters, exploiting the strengths of three complementary X-ray observatories: Suzaku (low, stable background), XMM-Newton (high sensitivity), and Chandra (good spatial resolution). By carefully combining observations from the cluster core to beyond r200, we are able to identify and reduce systematic uncertainties that would impede our spatial and spectral analysis using a single telescope. Our sample comprises nine clusters at z ≈ 0.1-0.2 fully covered in azimuth to beyond r200, and our analysis indicates that the ICM is not in hydrostatic equilibrium in the cluster outskirts, where we see clear azimuthal variations in temperature and surface brightness. In one of the clusters, we are able to measure the diffuse X-ray emission well beyond r200, and we find that the entropy profile and the gas fraction are consistent with expectations from theory and numerical simulations. These results stand in contrast to recent studies which point to gas clumping in the outskirts; the extent to which differences of cluster environment or instrumental effects factor in this difference remains unclear. From a broader perspective, this project will produce a sizeable fiducial data set for detailed comparison with high-resolution numerical simulations.

  16. Suzaku observations of low surface brightness cluster Abell 1631

    NASA Astrophysics Data System (ADS)

    Babazaki, Yasunori; Mitsuishi, Ikuyuki; Ota, Naomi; Sasaki, Shin; Böhringer, Hans; Chon, Gayoung; Pratt, Gabriel W.; Matsumoto, Hironori

    2018-06-01

    We present analysis results for a nearby galaxy cluster Abell 1631 at z = 0.046 using the X-ray observatory Suzaku. This cluster is categorized as a low X-ray surface brightness cluster. To study the dynamical state of the cluster, we conduct four-pointed Suzaku observations and investigate physical properties of the Mpc-scale hot gas associated with the A 1631 cluster for the first time. Unlike relaxed clusters, the X-ray image shows no strong peak at the center and an irregular morphology. We perform spectral analysis and investigate the radial profiles of the gas temperature, density, and entropy out to approximately 1.5 Mpc in the east, north, west, and south directions by combining with the XMM-Newton data archive. The measured gas density in the central region is relatively low (a few ×10^-4 cm^-3) at the given temperature (~2.9 keV) compared with X-ray-selected clusters. The entropy profile and value within the central region (r < 0.1 r200) are found to be flatter and higher (≳400 keV cm^2). The observed bolometric luminosity is approximately three times lower than that expected from the luminosity-temperature relation in previous studies of relaxed clusters. These features are also observed in another low surface brightness cluster, Abell 76. The spatial distributions of galaxies and the hot gas appear to be different. The X-ray luminosity is relatively lower than that expected from the velocity dispersion. A post-merger scenario may explain the observed results.

  17. IoT Big-Data Centred Knowledge Granule Analytic and Cluster Framework for BI Applications: A Case Base Analysis.

    PubMed

    Chang, Hsien-Tsung; Mishra, Nilamadhab; Lin, Chung-Chih

    2015-01-01

    The current rapid growth of Internet of Things (IoT) in various commercial and non-commercial sectors has led to the deposition of large-scale IoT data, of which the time-critical analytic and clustering of knowledge granules represent highly thought-provoking application possibilities. The objective of the present work is to inspect the structural analysis and clustering of complex knowledge granules in an IoT big-data environment. In this work, we propose a knowledge granule analytic and clustering (KGAC) framework that explores and assembles knowledge granules from IoT big-data arrays for a business intelligence (BI) application. Our work implements neuro-fuzzy analytic architecture rather than a standard fuzzified approach to discover the complex knowledge granules. Furthermore, we implement an enhanced knowledge granule clustering (e-KGC) mechanism that is more elastic than previous techniques when assembling the tactical and explicit complex knowledge granules from IoT big-data arrays. The analysis and discussion presented here show that the proposed framework and mechanism can be implemented to extract knowledge granules from an IoT big-data array in such a way as to present knowledge of strategic value to executives and enable knowledge users to perform further BI actions.

  18. Infrared spectroscopy reveals both qualitative and quantitative differences in equine subchondral bone during maturation

    NASA Astrophysics Data System (ADS)

    Kobrina, Yevgeniya; Isaksson, Hanna; Sinisaari, Miikka; Rieppo, Lassi; Brama, Pieter A.; van Weeren, René; Helminen, Heikki J.; Jurvelin, Jukka S.; Saarakkala, Simo

    2010-11-01

    The collagen phase in bone is known to undergo major changes during growth and maturation. The objective of this study is to clarify whether Fourier transform infrared (FTIR) microspectroscopy, coupled with cluster analysis, can detect quantitative and qualitative changes in the collagen matrix of subchondral bone in horses during maturation and growth. Equine subchondral bone samples (n = 29) from the proximal joint surface of the first phalanx are prepared from two sites subjected to different loading conditions. Three age groups are studied: newborn (0 days old), immature (5 to 11 months old), and adult (6 to 10 years old) horses. Spatial collagen content and collagen cross-link ratio are quantified from the spectra. Additionally, normalized second derivative spectra of samples are clustered using the k-means clustering algorithm. In quantitative analysis, collagen content in the subchondral bone increases rapidly between the newborn and immature horses. The collagen cross-link ratio increases significantly with age. In qualitative analysis, clustering is able to separate newborn and adult samples into two different groups. The immature samples display some nonhomogeneity. In conclusion, this is the first study showing that FTIR spectral imaging combined with clustering techniques can detect quantitative and qualitative changes in the collagen matrix of subchondral bone during growth and maturation.
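
    The clustering step described in this record (k-means on normalized second-derivative spectra) can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: the second derivative is approximated by a central finite difference, the normalization is assumed to be unit Euclidean length, the centroids are seeded by a simple farthest-point rule, and the toy spectra are invented.

```python
import math


def second_derivative(spectrum):
    """Central finite-difference approximation of the second derivative."""
    return [spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]


def normalize(vector):
    """Scale a vector to unit Euclidean length (assumed normalization)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, max_iter=100):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


# Invented toy spectra: two with a band near index 2, two near index 4.
spectra = [
    [0, 1, 4, 1, 0, 0, 0],
    [0, 1, 5, 1, 0, 0, 0],
    [0, 0, 0, 1, 4, 1, 0],
    [0, 0, 0, 1, 5, 1, 0],
]
features = [normalize(second_derivative(s)) for s in spectra]
labels = kmeans(features, k=2)
```

    Working on normalized second derivatives makes the grouping depend on band position and shape rather than on overall absorbance, which is why spectra that differ only in peak height end up in the same cluster here.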

  19. IoT Big-Data Centred Knowledge Granule Analytic and Cluster Framework for BI Applications: A Case Base Analysis

    PubMed Central

    Chang, Hsien-Tsung; Mishra, Nilamadhab; Lin, Chung-Chih

    2015-01-01

    The current rapid growth of Internet of Things (IoT) in various commercial and non-commercial sectors has led to the deposition of large-scale IoT data, of which the time-critical analytic and clustering of knowledge granules represent highly thought-provoking application possibilities. The objective of the present work is to inspect the structural analysis and clustering of complex knowledge granules in an IoT big-data environment. In this work, we propose a knowledge granule analytic and clustering (KGAC) framework that explores and assembles knowledge granules from IoT big-data arrays for a business intelligence (BI) application. Our work implements neuro-fuzzy analytic architecture rather than a standard fuzzified approach to discover the complex knowledge granules. Furthermore, we implement an enhanced knowledge granule clustering (e-KGC) mechanism that is more elastic than previous techniques when assembling the tactical and explicit complex knowledge granules from IoT big-data arrays. The analysis and discussion presented here show that the proposed framework and mechanism can be implemented to extract knowledge granules from an IoT big-data array in such a way as to present knowledge of strategic value to executives and enable knowledge users to perform further BI actions. PMID:26600156

  20. Multiscale Embedded Gene Co-expression Network Analysis

    PubMed Central

    Song, Won-Min; Zhang, Bin

    2015-01-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma. PMID:26618778

  1. Multiscale Embedded Gene Co-expression Network Analysis.

    PubMed

    Song, Won-Min; Zhang, Bin

    2015-11-01

    Gene co-expression network analysis has been shown effective in identifying functional co-expressed gene modules associated with complex human diseases. However, existing techniques to construct co-expression networks require some critical prior information such as predefined number of clusters, numerical thresholds for defining co-expression/interaction, or do not naturally reproduce the hallmarks of complex systems such as the scale-free degree distribution of small-worldness. Previously, a graph filtering technique called Planar Maximally Filtered Graph (PMFG) has been applied to many real-world data sets such as financial stock prices and gene expression to extract meaningful and relevant interactions. However, PMFG is not suitable for large-scale genomic data due to several drawbacks, such as the high computation complexity O(|V|^3), the presence of false-positives due to the maximal planarity constraint, and the inadequacy of the clustering framework. Here, we developed a new co-expression network analysis framework called Multiscale Embedded Gene Co-expression Network Analysis (MEGENA) by: i) introducing quality control of co-expression similarities, ii) parallelizing embedded network construction, and iii) developing a novel clustering technique to identify multi-scale clustering structures in Planar Filtered Networks (PFNs). We applied MEGENA to a series of simulated data and the gene expression data in breast carcinoma and lung adenocarcinoma from The Cancer Genome Atlas (TCGA). MEGENA showed improved performance over well-established clustering methods and co-expression network construction approaches. MEGENA revealed not only meaningful multi-scale organizations of co-expressed gene clusters but also novel targets in breast carcinoma and lung adenocarcinoma.

  2. Analysis of the nutritional status of algae by Fourier transform infrared chemical imaging

    NASA Astrophysics Data System (ADS)

    Hirschmugl, Carol J.; Bayarri, Zuheir-El; Bunta, Maria; Holt, Justin B.; Giordano, Mario

    2006-09-01

    A new non-destructive method to study the nutritional status of algal cells and their environments is demonstrated. This approach allows rapid examination of whole cells with little or no pre-treatment, providing a large amount of information on the biochemical composition of cells and growth medium. The method is based on the analysis of a collection of infrared (IR) spectra for individual cells; each spectrum describes the biochemical composition of a portion of a cell; a complete set of spectra is used to reconstruct an image of the entire cell. To obtain spatially resolved information, synchrotron radiation was used as a bright IR source. We tested this method on the green flagellate Euglena gracilis; a comparison was conducted between cells grown in nutrient replete conditions (Type 1) and cells allowed to deplete their medium (Type 2). Complete sets of spectra for individual cells of both types were analyzed with agglomerative hierarchical clustering, leading to distinct clusters representative of the two types of cells. The average spectra for the clusters confirmed the similarities between the clusters and the types of cells. The clustering analysis, therefore, allows the distinction of cells of the same species, but with different nutritional histories. In order to facilitate the application of the method and reduce manipulation (washing), we analyzed the cells in the presence of residual medium. The results obtained showed that even with residual medium the outcome of the clustering analysis is reliable. Our results demonstrate the applicability of FTIR microspectroscopy for ecological and ecophysiological studies.
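
    Agglomerative hierarchical clustering, as named in this record, can be illustrated with a small sketch. The abstract does not state the linkage criterion or distance measure, so average linkage with Euclidean distance is assumed here; the function name `agglomerative` and the two-dimensional toy points are hypothetical stand-ins for real spectra.

```python
import math


def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering.

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Invented toy data: two tight groups far apart.
points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
groups = agglomerative(points, n_clusters=2)
```

    Unlike k-means, this procedure builds the grouping bottom-up by repeated merging, so the number of clusters can be chosen after inspecting the merge sequence rather than fixed in advance.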

  3. Clustering of risk factors for cardiometabolic diseases in low-income, female adolescents.

    PubMed

    Melo, Elza M F S de; Azevedo, George D; Silva, João B da; Lemos, Telma M A M; Maranhão, Técia M O; Freitas, Ana K M S O; Spyrides, Maria H; Costa, Eduardo C

    2016-02-16

    To assess the prevalence and clustering patterns of cardiometabolic risk factors among low-income, female adolescents. Cross-sectional study involving 196 students of public schools (11-19 years old). The following risk factors were considered in the analysis: excess weight, central obesity, dyslipidemia, high blood pressure, and high fasting glucose. The ratio between observed and expected prevalence and its confidence interval were used to identify clustering of risk factors that exceeded expected prevalence in the population. The most prevalent risk factors were dyslipidemia (70.9%), and central obesity (39.8%), followed by excess weight (29.6%), and high blood pressure (12.8%). A total of 42.9% of adolescents had two or more risk factors, and 24% had three or more. Excess weight, central obesity, and dyslipidemia were common risk factors in the clustering patterns that showed higher-than-expected prevalence. Clustering of risk factors (≥ two factors) among the adolescents showed considerable prevalence, and there was a non-casual coexistence of excess weight, central obesity, and dyslipidemia (mainly low HDL-cholesterol).
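
    The observed-to-expected prevalence ratio used in this record can be sketched as follows. The abstract does not give the formula, so this assumes the common definition in which the expected prevalence of a risk-factor pattern is the product of the individual prevalences (or their complements) under independence. The confidence interval is omitted, and the function name and toy records are hypothetical.

```python
def observed_expected_ratio(records, pattern):
    """Ratio of observed to expected prevalence for one risk-factor pattern.

    records: list of tuples of 0/1 flags, one tuple per participant.
    pattern: tuple of 0/1 flags describing the combination of interest.
    """
    n = len(records)
    k = len(pattern)
    # Marginal prevalence of each risk factor.
    prevalence = [sum(r[i] for r in records) / n for i in range(k)]
    # Share of participants showing exactly this pattern.
    observed = sum(1 for r in records if tuple(r) == tuple(pattern)) / n
    # Expected share if the factors occurred independently (assumption).
    expected = 1.0
    for present, p in zip(pattern, prevalence):
        expected *= p if present else (1 - p)
    if expected == 0:
        raise ValueError("expected prevalence is zero for this pattern")
    return observed / expected


# Invented toy data: two risk factors in ten participants.
records = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0), (0, 1)]
ratio = observed_expected_ratio(records, (1, 1))
```

    A ratio above 1 indicates that the combination occurs more often than independence would predict, which is the sense in which risk factors are said to cluster.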

  4. GOClonto: an ontological clustering approach for conceptualizing PubMed abstracts.

    PubMed

    Zheng, Hai-Tao; Borchert, Charles; Kim, Hong-Gee

    2010-02-01

    Concurrent with progress in biomedical sciences, an overwhelming amount of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database collecting and managing biomedical literature. To help researchers easily understand collections of PubMed abstracts, numerous clustering methods have been proposed to group similar abstracts based on their shared features. However, most of these methods do not explore the semantic relationships among groupings of documents, which could help better illuminate the groupings of PubMed abstracts. To address this issue, we proposed an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and gene ontology (GO) to identify key gene-related concepts and their relationships as well as allocate PubMed abstracts based on these key gene-related concepts. Based on two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the clustering based TRS (tolerance rough set) algorithm. Moreover, the two ontologies generated by GOClonto show significant informative conceptual structures.

  5. Human avian influenza in Indonesia: are they really clustered?

    PubMed

    Eyanoer, Putri Chairani; Singhasivanon, Pratap; Kaewkungwal, Jaranit; Apisarnthanarak, Anucha

    2011-05-01

    Understanding the epidemiology of human H5N1 cases in Indonesia is important. The question of whether cases are clustered or not is unclear. An increase in clustered cases suggests greater transmissibility. In the present study, 107 confirmed and 302 suspected human H5N1 cases in Indonesia during 2005-2007 were analyzed for spatial and temporal distribution. Most confirmed cases (97.2%) occurred on two main islands (Java and Sumatera). There were no patterns of disease occurrence over time. There were also no correlations between occurrence patterns in humans and poultry. Statistical analysis showed confirmed cases were clustered within an area on Java island covered by 8 districts along the border of three neighboring provinces (Jakarta, West Java, and Banten). This study shows human H5N1 cases in Indonesia were clustered at two sites where there was a high rate of infection among poultry. These findings are important since they highlight areas of high risk for possible human H5N1 infection in Indonesia, thus, preventive measures may be taken.

  6. Validity analysis on merged and averaged data using within and between analysis: focus on effect of qualitative social capital on self-rated health.

    PubMed

    Shin, Sang Soo; Shin, Young-Jeon

    2016-01-01

    With an increasing number of studies highlighting regional social capital (SC) as a determinant of health, many studies are using multi-level analysis with merged and averaged scores of community residents' survey responses calculated from community SC data. Sufficient examination is required to validate if the merged and averaged data can represent the community. Therefore, this study analyzes the validity of the selected indicators and their applicability in multi-level analysis. Within and between analysis (WABA) was performed after creating community variables using merged and averaged data of community residents' responses from the 2013 Community Health Survey in Korea, using subjective self-rated health assessment as a dependent variable. Further analysis was performed following the model suggested by WABA result. Both E-test results (1) and WABA results (2) revealed that single-level analysis needs to be performed using qualitative SC variable with cluster mean centering. Through single-level multivariate regression analysis, qualitative SC with cluster mean centering showed positive effect on self-rated health (0.054, p<0.001), although there was no substantial difference in comparison to analysis using SC variables without cluster mean centering or multi-level analysis. As modification in qualitative SC was larger within the community than between communities, we validate that relational analysis of individual self-rated health can be performed within the group, using cluster mean centering. Other tests besides the WABA can be performed in the future to confirm the validity of using community variables and their applicability in multi-level analysis.
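
    Cluster mean centering, which this record applies to the social capital variable, can be sketched in a few lines: each individual score has the mean of its own community subtracted, so only within-community variation remains. The function name, community labels and scores below are hypothetical and are not taken from the survey.

```python
from collections import defaultdict


def cluster_mean_center(values, groups):
    """Subtract each group's mean from the values belonging to that group.

    Returns the centered values and a dict of group means.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, group in zip(values, groups):
        sums[group] += value
        counts[group] += 1
    means = {group: sums[group] / counts[group] for group in sums}
    centered = [value - means[group] for value, group in zip(values, groups)]
    return centered, means


# Invented toy data: social capital scores in two communities.
scores = [1.0, 3.0, 10.0, 14.0]
communities = ["a", "a", "b", "b"]
centered, means = cluster_mean_center(scores, communities)
```

    After centering, every community has a mean of zero, so a regression on the centered variable estimates the within-community association only.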

  7. Cluster Analysis of Velocity Field Derived from Dense GNSS Network of Japan

    NASA Astrophysics Data System (ADS)

    Takahashi, A.; Hashimoto, M.

    2015-12-01

    Dense GNSS networks have been widely used to observe crustal deformation. Simpson et al. (2012) and Savage and Simpson (2013) conducted cluster analyses of the GNSS velocity field in the San Francisco Bay Area and the Mojave Desert, respectively. They successfully found velocity discontinuities and showed an advantage of cluster analysis for classifying GNSS velocity fields. Because strike-slip events are dominant in the western United States, the geometry there is simple. However, the Japanese Islands are tectonically complicated due to subduction of oceanic plates, and there are many types of crustal deformation, such as slow slip events and large postseismic deformation. We propose a modified clustering method for the GNSS velocity field in Japan to separate time-variant and static crustal deformation. Our modification is to perform cluster analysis every several months or years and then qualify cluster member similarity. If a GNSS station moved differently from its neighboring GNSS stations, the station will not belong to the cluster that includes its surrounding stations. With this method, time-variant phenomena were distinguished. We applied our method to GNSS data of Japan from 1996 to 2015. The analyses led to the following conclusions. First, the cluster boundaries are consistent with known active faults. For example, the Arima-Takatsuki-Hanaore fault system and the Shimane-Tottori segment proposed by Nishimura (2015) are recognized without using prior information. Second, the detectability of time-variable phenomena is improved, such as a slow slip event in the northern part of the Hokkaido region detected by Ohzono et al. (2015). Last, postseismic deformation caused by large earthquakes can be classified. The result suggested velocity discontinuities in the postseismic deformation of the Tohoku-oki earthquake, which implies that postseismic deformation does not decay continuously in proportion to distance from the epicenter.

  8. Students' Perceptions of Motivational Climate and Enjoyment in Finnish Physical Education: A Latent Profile Analysis.

    PubMed

    Jaakkola, Timo; Wang, C K John; Soini, Markus; Liukkonen, Jarmo

    2015-09-01

    The purpose of this study was to identify student clusters with homogeneous profiles in perceptions of task- and ego-involving, autonomy, and social relatedness supporting motivational climate in school physical education (PE). Additionally, we investigated whether different motivational climate groups differed in their enjoyment in PE. Participants of the study were 2 594 girls and 1 803 boys, aged 14-15 years. Students responded to questionnaires assessing their perception of motivational climate and enjoyment in physical education. Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that students in clusters 4 and 5 perceived the highest level of enjoyment whereas students in cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of various motivational climates created differential levels of enjoyment in PE classes.

  9. Clustering ENTLN sferics to improve TGF temporal analysis

    NASA Astrophysics Data System (ADS)

    Pradhan, E.; Briggs, M. S.; Stanbro, M.; Cramer, E.; Heckman, S.; Roberts, O.

    2017-12-01

    Using TGFs detected with the Fermi Gamma-ray Burst Monitor (GBM) and simultaneous radio sferics detected by the Earth Networks Total Lightning Network (ENTLN), we establish a temporal correlation between them. The first step is to find ENTLN strokes that are closely associated with GBM TGFs. We then identify all the related strokes in the lightning flash to which the TGF-associated stroke belongs. After trying several algorithms, we found that the DBSCAN clustering algorithm was best for clustering related ENTLN strokes into flashes. The operation of DBSCAN was optimized using a single separation measure that combined time and distance separation. Previous analysis found that these strokes show three timescales with respect to the gamma-ray time. We will use the improved identification of flashes to investigate this.
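
    The idea of running DBSCAN with one separation measure that combines time and distance can be sketched as follows. This is an illustration only: the abstract does not give the form of the combined measure, so the scaled Euclidean combination, the scale values, `eps`, `min_pts`, and the stroke coordinates below are all assumptions.

```python
import math


def separation(a, b, km_scale=10.0, s_scale=0.5):
    """Combined separation of strokes a and b, each (t_seconds, x_km, y_km).

    Time and distance are divided by assumed scales and combined
    in quadrature; the scales are illustrative, not from the abstract.
    """
    dt = (a[0] - b[0]) / s_scale
    dr = math.hypot(a[1] - b[1], a[2] - b[2]) / km_scale
    return math.hypot(dt, dr)


def dbscan(points, eps, min_pts, dist):
    """Plain DBSCAN. Returns one label per point; -1 marks noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1          # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # j is a core point, keep expanding
    return labels


# Hypothetical strokes: two flashes and one isolated stroke.
strokes = [
    (0.00, 0.0, 0.0), (0.10, 1.0, 0.0), (0.25, 2.0, 1.0),
    (5.00, 50.0, 50.0), (5.05, 51.0, 50.0),
    (20.0, 200.0, 0.0),
]
flash_labels = dbscan(strokes, eps=1.0, min_pts=2, dist=separation)
```

    Folding time and distance into one measure leaves a single `eps` threshold to tune, which is presumably what makes the optimization described above tractable.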

  10. The effect of antioxidant concentration of N-isopropyl-N-phenyl-p-phenylenediamine, and 2,2,4-trimethyl-1,2-dihydroquinoline and mixing time of physical properties, thermal properties, mechanical properties and microstructure on natural rubber compound

    NASA Astrophysics Data System (ADS)

    Budiarto

    2017-03-01

    The influence of high concentrations of the antioxidants N-isopropyl-N-phenyl-p-phenylenediamine (IPPD) and 2,2,4-trimethyl-1,2-dihydroquinoline (TMQ) and of the mixing time on the vulcanization physical properties, thermal properties, mechanical properties and microstructure of a natural rubber compound has been studied. The purpose of this study is to compare the effect of the antioxidant types IPPD and TMQ and of the mixing time on the physical properties, mechanical properties, microstructure and elemental composition of the natural rubber compound. Vulcanization was carried out with antioxidant (IPPD and TMQ) concentrations of 2, 3, and 4 grams and mixing times of 20, 30, and 40 minutes. Characterization of the physical and mechanical properties of the natural rubber compound showed a maturity value of 0.499 Nm (TMQ) and 0.489 Nm (IPPD), a Mooney viscosity of 26.7 (TMQ) and 20.8 (IPPD), an elongation at break of 583.75% (IPPD) and 552.63% (TMQ), and a tensile strength of 28.108 MPa (TMQ) and 27.986 MPa (IPPD). Thermal analysis of the compound with antioxidant IPPD by DTA shows three endothermic peaks, at 405°C, 550°C and 660°C, and TGA shows a total sample reduction of 81.745%; for the compound with antioxidant TMQ, DTA likewise shows three endothermic peaks, at 397.21°C, 514.02°C, and 610.27°C, and TGA shows a total sample reduction of 82.356%. Functional group analysis of the rubber-antioxidant compound IPPD/TMQ with an FTIR spectrophotometer shows typical infrared absorption peaks at wave numbers (1/λ) of 833-895 cm⁻¹ for the CH group/bond, 1313 cm⁻¹ for the Si-O single bond, 1368 cm⁻¹ for the C-C single bond, 1507 cm⁻¹ for the C=C bond, 1665 cm⁻¹ for the C=O bond, 2128 cm⁻¹ for the C-N single bond, 3371 cm⁻¹ for the OH group, 3506 cm⁻¹ for the CH3 group, and 3585 cm⁻¹ for vibration of the NH group. Morphological observation with SEM shows an uneven surface (homogeneous) that is compatible at 2000 times magnification, and composition testing by EDX spectroscopy showed that the largest element in the rubber compound is carbon, along with Zn, S, Ca, Si, Mg, Al, and N. This shows that the natural rubber compound with antioxidant IPPD/TMQ meets the standard of "Mechanical Properties of Industrial Tyre Rubber Compounds".

  11. Type 2 diabetes mellitus: distribution of genetic markers in Kazakh population.

    PubMed

    Sikhayeva, Nurgul; Talzhanov, Yerkebulan; Iskakova, Aisha; Dzharmukhanov, Jarkyn; Nugmanova, Raushan; Zholdybaeva, Elena; Ramanculov, Erlan

    2018-01-01

    Ethnic differences exist in the frequencies of genetic variations that contribute to the risk of common disease. This study aimed to analyse the distribution of several genes, previously associated with susceptibility to type 2 diabetes and obesity-related phenotypes, in a Kazakh population. A total of 966 individuals belonging to the Kazakh ethnicity were recruited from an outpatient clinic. We genotyped 41 common single nucleotide polymorphisms (SNPs) previously associated with type 2 diabetes in other ethnic groups and 31 of these were in Hardy-Weinberg equilibrium. The obtained allele frequencies were further compared to publicly available data from other ethnic populations. Allele frequencies for other (compared) populations were pooled from the haplotype map (HapMap) database. Principal component analysis (PCA), cluster analysis, and multidimensional scaling (MDS) were used for the analysis of genetic relationship between the populations. Comparative analysis of allele frequencies of the studied SNPs showed significant differentiation among the studied populations. The Kazakh population was grouped with Asian populations according to the cluster analysis and with the Caucasian populations according to PCA. According to MDS, results of the current study show that the Kazakh population holds an intermediate position between Caucasian and Asian populations. A high percentage of population differentiation was observed between Kazakh and world populations. The Kazakh population was clustered with Caucasian populations, and this result may indicate a significant Caucasian component in the Kazakh gene pool.

  12. Common factor analysis versus principal component analysis: choice for symptom cluster research.

    PubMed

    Kim, Hee-Ju

    2008-03-01

    The purpose of this paper is to examine differences between two factor analytical methods and their relevance for symptom cluster research: common factor analysis (CFA) versus principal component analysis (PCA). Literature was critically reviewed to elucidate the differences between CFA and PCA. A secondary analysis (N = 84) was utilized to show the actual result differences from the two methods. CFA analyzes only the reliable common variance of data, while PCA analyzes all the variance of data. An underlying hypothetical process or construct is involved in CFA but not in PCA. PCA tends to increase factor loadings especially in a study with a small number of variables and/or low estimated communality. Thus, PCA is not appropriate for examining the structure of data. If the study purpose is to explain correlations among variables and to examine the structure of the data (this is usual for most cases in symptom cluster research), CFA provides a more accurate result. If the purpose of a study is to summarize data with a smaller number of variables, PCA is the choice. PCA can also be used as an initial step in CFA because it provides information regarding the maximum number and nature of factors. In using factor analysis for symptom cluster research, several issues need to be considered, including subjectivity of solution, sample size, symptom selection, and level of measure.

  13. Effect of Policy Analysis on Indonesia’s Maritime Cluster Development Using System Dynamics Modeling

    NASA Astrophysics Data System (ADS)

    Nursyamsi, A.; Moeis, A. O.; Komarudin

    2018-03-01

    As an archipelago with two thirds of its territory consisting of water, Indonesia should pay more attention to the development of its maritime industry. One catalyst for accelerating maritime industry growth is the development of a maritime cluster. The purpose of this research is to understand the effect that implementing a maritime cluster policy would have on Indonesia’s maritime economic growth, and the policy's role in enhancing maritime cluster performance and hence Indonesia’s maritime industry. The results of the constructed system dynamics simulation show that, with the maritime cluster in effect, the employment rate and the maritime economy grow much more, and exponentially so, than in the business-as-usual case. The result implies that the government should act fast to establish a legitimate maritime cluster organizing institution, so that a synergistic, sustainable, and positive maritime cluster environment can benefit the performance of Indonesia’s maritime industry.

  14. Assessment of the climatic potential for tourism in Iran through biometeorology clustering.

    PubMed

    Roshan, Gholamreza; Yousefi, Robabe; Błażejczyk, Krzysztof

    2018-04-01

    This study presents a spatiotemporal analysis of bioclimatic comfort conditions for Iran using mean daily meteorological data from 1995 to 2014, analyzed through the Physiological Equivalent Temperature (PET) and Universal Thermal Climate Index (UTCI) indices and bioclimatic clustering. The results of this study demonstrate that, due to the climate variability across Iran during the year, there is at any point in time a location with climatic conditions suitable for tourism. Mean values demonstrate maxima in bioclimatic comfort indices for the country in late winter and spring and minima for summer. Seven statistically significant clusters in bioclimatic indices were identified. Comparing these with clustering performed on PET and UTCI, the maximum overlaps between the two indices. The outputs of this research further showed that the most appropriate bioclimatic clustering for Iran includes seven clusters. Clustering locations according to climatic suitability for tourism provides a valuable contribution to tourism management in the country, particularly through marketing destinations to maximize tourist flow.

  15. Clustering multilayer omics data using MuNCut.

    PubMed

    Teran Hidalgo, Sebastian J; Ma, Shuangge

    2018-03-14

    Omics profiling is now a routine component of biomedical studies. In the analysis of omics data, clustering is an essential step and serves multiple purposes, including, for example, revealing the unknown functionalities of omics units and assisting dimension reduction in outcome model building. In the most recent omics studies, a prominent trend is to conduct multilayer profiling, which collects multiple types of genetic, genomic, epigenetic and other measurements on the same subjects. In the literature, clustering methods tailored to multilayer omics data are still limited. Directly applying the existing clustering methods to multilayer omics data, and clustering each layer first and then combining across layers, are both "suboptimal" in that they do not accommodate the interconnections within layers and across layers in an informative way. In this study, we develop the MuNCut (Multilayer NCut) clustering approach. It is tailored to multilayer omics data and sufficiently accounts for both across- and within-layer connections. It is based on the novel NCut technique and also takes advantage of regularized sparse estimation. It has an intuitive formulation and is computationally very feasible. To facilitate implementation, we develop the function muncut in the R package NcutYX. Under a wide spectrum of simulation settings, it outperforms competitors. The analysis of TCGA (The Cancer Genome Atlas) data on breast cancer and cervical cancer shows that MuNCut generates biologically meaningful results which differ from those using the alternatives. We propose a more effective clustering analysis of multiple omics data. It provides a new avenue for jointly analyzing genetic, genomic, epigenetic and other measurements.

  16. [Analysis of Time-to-onset of Interstitial Lung Disease after the Administration of Small Molecule Molecularly-targeted Drugs].

    PubMed

    Komada, Fusao

    2018-01-01

     The aim of this study was to investigate the time-to-onset of drug-induced interstitial lung disease (DILD) following the administration of small molecule molecularly-targeted drugs via the use of the spontaneous adverse reaction reporting system of the Japanese Adverse Drug Event Report database. DILD datasets for afatinib, alectinib, bortezomib, crizotinib, dasatinib, erlotinib, everolimus, gefitinib, imatinib, lapatinib, nilotinib, osimertinib, sorafenib, sunitinib, temsirolimus, and tofacitinib were used to calculate the median onset times of DILD and the Weibull distribution parameters, and to perform the hierarchical cluster analysis. The median onset times of DILD for afatinib, bortezomib, crizotinib, erlotinib, gefitinib, and nilotinib were within one month. The median onset times of DILD for dasatinib, everolimus, lapatinib, osimertinib, and temsirolimus ranged from 1 to 2 months. The median onset times of the DILD for alectinib, imatinib, and tofacitinib ranged from 2 to 3 months. The median onset times of the DILD for sunitinib and sorafenib ranged from 8 to 9 months. Weibull distributions for these drugs when using the cluster analysis showed that there were 4 clusters. Cluster 1 described a subgroup with early to later onset DILD and early failure type profiles or a random failure type profile. Cluster 2 exhibited early failure type profiles or a random failure type profile with early onset DILD. Cluster 3 exhibited a random failure type profile or wear out failure type profiles with later onset DILD. Cluster 4 exhibited an early failure type profile or a random failure type profile with the latest onset DILD.
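
    The failure-type vocabulary in the abstract above follows the usual reading of the Weibull shape parameter: a shape below 1 means a hazard that falls over time (early failure), a shape near 1 means a constant hazard (random failure), and a shape above 1 means a rising hazard (wear-out failure). The sketch below illustrates that reading and the closed-form Weibull median, scale × (ln 2)^(1/shape). The function names and numeric inputs are hypothetical, and classifying by whether a confidence interval for the shape excludes 1 is an assumed convention, not something the abstract states.

```python
import math


def weibull_median(scale, shape):
    """Median of a Weibull distribution: scale * (ln 2) ** (1 / shape)."""
    return scale * math.log(2.0) ** (1.0 / shape)


def failure_type(shape_ci_low, shape_ci_high):
    """Classify the hazard profile from a confidence interval for the shape.

    Assumed convention: the profile is 'random failure' unless the
    whole interval lies below or above 1.
    """
    if shape_ci_high < 1.0:
        return "early failure"      # hazard decreases with time
    if shape_ci_low > 1.0:
        return "wear-out failure"   # hazard increases with time
    return "random failure"         # hazard roughly constant


# Hypothetical fit: scale of 30 days with a shape of exactly 1.
median_days = weibull_median(30.0, 1.0)
profile = failure_type(0.85, 1.10)
```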

  17. CROSS-CORRELATING THE γ-RAY SKY WITH CATALOGS OF GALAXY CLUSTERS

    DOE PAGES

    Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro; ...

    2017-01-18

    In this article, we report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. Lastly, we argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.

  18. The XMM Cluster Survey: X-ray analysis methodology

    NASA Astrophysics Data System (ADS)

    Lloyd-Davies, E. J.; Romer, A. Kathy; Mehrtens, Nicola; Hosmer, Mark; Davidson, Michael; Sabirli, Kivanc; Mann, Robert G.; Hilton, Matt; Liddle, Andrew R.; Viana, Pedro T. P.; Campbell, Heather C.; Collins, Chris A.; Dubois, E. Naomi; Freeman, Peter; Harrison, Craig D.; Hoyle, Ben; Kay, Scott T.; Kuwertz, Emma; Miller, Christopher J.; Nichol, Robert C.; Sahlén, Martin; Stanford, S. A.; Stott, John P.

    2011-11-01

    The XMM Cluster Survey (XCS) is a serendipitous search for galaxy clusters using all publicly available data in the XMM-Newton Science Archive. Its main aims are to measure cosmological parameters and trace the evolution of X-ray scaling relations. In this paper we describe the data processing methodology applied to the 5776 XMM observations used to construct the current XCS source catalogue. A total of 3675 > 4σ cluster candidates with >50 background-subtracted X-ray counts are extracted from a total non-overlapping area suitable for cluster searching of 410 deg². Of these, 993 candidates are detected with >300 background-subtracted X-ray photon counts, and we demonstrate that robust temperature measurements can be obtained down to this count limit. We describe in detail the automated pipelines used to perform the spectral and surface brightness fitting for these candidates, as well as to estimate redshifts from the X-ray data alone. A total of 587 (122) X-ray temperatures to a typical accuracy of <40 (<10) per cent have been measured to date. We also present the methodology adopted for determining the selection function of the survey, and show that the extended source detection algorithm is robust to a range of cluster morphologies by inserting mock clusters derived from hydrodynamical simulations into real XMM images. These tests show that the simple isothermal β-profile is sufficient to capture the essential details of the cluster population detected in the archival XMM observations. The redshift follow-up of the XCS cluster sample is presented in a companion paper, together with a first data release of 503 optically confirmed clusters.

  19. The observed clustering of damaging extra-tropical cyclones in Europe

    NASA Astrophysics Data System (ADS)

    Cusack, S.

    2015-12-01

    The clustering of severe European windstorms on annual timescales has substantial impacts on the re/insurance industry. Management of the risk is impaired by large uncertainties in estimates of clustering from historical storm datasets typically covering the past few decades. The uncertainties are unusually large because clustering depends on the variance of storm counts. Eight storm datasets are gathered for analysis in this study in order to reduce these uncertainties. Six of the datasets contain more than 100 years of severe storm information to reduce sampling errors, and the diversity of information sources and analysis methods between datasets sample observational errors. All storm severity measures used in this study reflect damage, to suit re/insurance applications. It is found that the shortest storm dataset of 42 years in length provides estimates of clustering with very large sampling and observational errors. The dataset does provide some useful information: indications of stronger clustering for more severe storms, particularly for southern countries off the main storm track. However, substantially different results are produced by removal of one stormy season, 1989/1990, which illustrates the large uncertainties from a 42-year dataset. The extended storm records place 1989/1990 into a much longer historical context to produce more robust estimates of clustering. All the extended storm datasets show a greater degree of clustering with increasing storm severity and suggest clustering of severe storms is much more material than weaker storms. Further, they contain signs of stronger clustering in areas off the main storm track, and weaker clustering for smaller-sized areas, though these signals are smaller than uncertainties in actual values. Both the improvement of existing storm records and development of new historical storm datasets would help to improve management of this risk.
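
    The remark that clustering depends on the variance of storm counts can be illustrated with a dispersion statistic, variance divided by mean, minus 1. It is zero for Poisson-distributed counts, positive when storms cluster in some seasons, and negative when counts are more regular than Poisson. The abstract does not say which statistic the study uses, so treat this as an assumed, commonly used measure; the seasonal counts are invented for illustration.

```python
from statistics import mean, variance


def dispersion(counts):
    """Dispersion statistic: sample variance / mean - 1 (0 for Poisson)."""
    return variance(counts) / mean(counts) - 1.0


# Hypothetical counts of damaging storms in ten winter seasons.
clustered_seasons = [0, 1, 0, 5, 0, 0, 4, 0, 1, 1]
regular_seasons = [1, 1, 1, 1]

psi_clustered = dispersion(clustered_seasons)
psi_regular = dispersion(regular_seasons)
```

    Because the statistic rests on a variance estimate, a single extreme season in a short record can move it substantially, which is consistent with the sensitivity to the 1989/1990 season described above.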

  20. Diversity and evolution analysis of glycoprotein GP85 from avian leukosis virus subgroup J isolates from chickens of different genetic backgrounds during 1989-2016: Coexistence of five extremely different clusters.

    PubMed

    Wang, Peikun; Lin, Lulu; Li, Haijuan; Yang, Yongli; Huang, Teng; Wei, Ping

    2018-02-01

    ALV-J has caused the most serious losses to the poultry industry in China. The gp85-coding sequence of ALV-J is known to be prone to mutation, but any association between the gp85 gene and breed of chicken remains unclear. A comprehensive and systematic study of the evolutionary process of ALV-J in China is needed. In this study, we compared and analyzed gp85 gene sequences from 198 ALV-J isolates, originating from China, USA, UK and France during 1989-2016. These were sorted into five clusters. Cluster 1, 2, 3, 4 and 5 included isolates from chicken types of different genetic backgrounds, e.g. white-feather broiler, Guangxi indigenous chicken breeds, Yellow chickens and layer chickens respectively. A correlation comparison of amino acid sequence similarities in the gp85 protein among the five clusters showed significant differences (P < 0.01) with the exception being when the third and fifth cluster were compared (P > 0.05). Results of entropy analysis of the gp85 sequences revealed that cluster 3 had the largest variation and cluster 1 had the least variation. The N-glycosylation sites in the majority of isolates numbered 14, 16, 17, 16 and 16, respectively, with regards to clusters 1-5. In addition, 5 isolates from cluster 3 had one more glycosylation site than the other isolates from cluster 3. Our study provides evidence that there were five extremely different ALV-J clusters during 1989-2016 and that the gp85 genes isolated from indigenous chicken breed isolates had the largest variation.
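
    An entropy analysis of aligned sequences is commonly done by computing the Shannon entropy of each alignment column: 0 bits for a fully conserved position, and larger values as more residues share the position. The sketch below assumes that per-column approach, which the abstract does not spell out, and uses made-up three-residue sequences rather than real gp85 data.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    n = len(column)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def alignment_entropy(sequences):
    """Per-position entropy for equal-length, already aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical aligned fragments: positions 1-2 conserved, position 3 variable.
fragments = ["ACD", "ACE", "ACD", "ACE"]
entropies = alignment_entropy(fragments)
```

    Averaging the per-position values within each cluster would give one number per cluster, which is one way the clusters' overall variation could be compared.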

  1. Insight on AV-45 binding in white and grey matter from histogram analysis: a study on early Alzheimer's disease patients and healthy subjects

    PubMed Central

    Nemmi, Federico; Saint-Aubert, Laure; Adel, Djilali; Salabert, Anne-Sophie; Pariente, Jérémie; Barbeau, Emmanuel; Payoux, Pierre; Péran, Patrice

    2014-01-01

    Purpose: The AV-45 amyloid biomarker is known to show uptake in white matter in patients with Alzheimer’s disease (AD) but also in the healthy population. This binding, thought to be of a non-specific lipophilic nature, has not yet been investigated. The aim of this study was to determine the differential pattern of AV-45 binding in white matter in healthy and pathological populations. Methods: We recruited 24 patients presenting with AD at an early stage and 17 matched, healthy subjects. We used an optimized PET-MRI registration method and an approach based on the intensity histogram using several indexes. We compared the results of the intensity histogram analyses with a more canonical approach based on target-to-cerebellum Standard Uptake Value (SUVr) in white and grey matter using MANOVA and discriminant analyses. A cluster analysis on white and grey matter histograms was also performed. Results: White matter histogram analysis revealed significant differences between AD and healthy subjects, which were not revealed by SUVr analysis. However, white matter histograms were not decisive in discriminating the groups, and indexes based on grey matter only showed better discriminative power than SUVr. The cluster analysis divided our sample into two clusters, showing different uptakes in grey but also in white matter. Conclusion: These results demonstrate that AV-45 binding in white matter conveys subtle information not detectable using the SUVr approach. Although it is not better than standard SUVr at discriminating AD patients from healthy subjects, this information could reveal white matter modifications. PMID:24573658

  2. VEGF-Induced Expression of miR-17–92 Cluster in Endothelial Cells Is Mediated by ERK/ELK1 Activation and Regulates Angiogenesis

    PubMed Central

    Chamorro-Jorganes, Aránzazu; Lee, Monica Y.; Araldi, Elisa; Landskroner-Eiger, Shira; Fernández-Fuertes, Marta; Sahraei, Mahnaz; Quiles del Rey, Maria; van Solingen, Coen; Yu, Jun; Fernández-Hernando, Carlos; Sessa, William C.

    2016-01-01

    Rationale: Several lines of evidence indicate that the regulation of microRNA (miRNA) levels by different stimuli may contribute to the modulation of stimulus-induced responses. The miR-17–92 cluster has been linked to tumor development and angiogenesis, but its role in vascular endothelial growth factor–induced endothelial cell (EC) functions is unclear and its regulation is unknown. Objective: The purpose of this study was to elucidate the mechanism by which VEGF regulates the expression of miR-17–92 cluster in ECs and determine its contribution to the regulation of endothelial angiogenic functions, both in vitro and in vivo. This was done by analyzing the effect of postnatal inactivation of miR-17–92 cluster in the endothelium (miR-17–92 iEC-KO mice) on developmental retinal angiogenesis, VEGF-induced ear angiogenesis, and tumor angiogenesis. Methods and Results: Here, we show that Erk/Elk1 activation on VEGF stimulation of ECs is responsible for Elk-1-mediated transcription activation (chromatin immunoprecipitation analysis) of the miR-17–92 cluster. Furthermore, we demonstrate that VEGF-mediated upregulation of the miR-17–92 cluster in vitro is necessary for EC proliferation and angiogenic sprouting. Finally, we provide genetic evidence that miR-17–92 iEC-KO mice have blunted physiological retinal angiogenesis during development and diminished VEGF-induced ear angiogenesis and tumor angiogenesis. Computational analysis and rescue experiments show that PTEN (phosphatase and tensin homolog) is a target of the miR-17–92 cluster and is a crucial mediator of miR-17-92–induced EC proliferation. However, the angiogenic transcriptional program is reduced when miR-17–92 is inhibited. Conclusions: Taken together, our results indicate that VEGF-induced miR-17–92 cluster expression contributes to the angiogenic switch of ECs and participates in the regulation of angiogenesis. PMID:26472816

  3. Subtypes of female juvenile offenders: a cluster analysis of the Millon Adolescent Clinical Inventory.

    PubMed

    Stefurak, Tres; Calhoun, Georgia B

    2007-01-01

    The current study sought to explore subtypes of adolescents within a sample of female juvenile offenders. Using the Millon Adolescent Clinical Inventory with 101 female juvenile offenders, a two-step cluster analysis was performed, beginning with a Ward's method hierarchical cluster analysis followed by a K-Means iterative partitioning cluster analysis. The results suggest an optimal three-cluster solution, with cluster profiles leading to the following group labels: Externalizing Problems, Depressed/Interpersonally Ambivalent, and Anxious Prosocial. Analyses along the factors of age, race, offense typology and offense chronicity were conducted to further understand the nature of the clusters found. Only the effect for race was significant, with the Anxious Prosocial and Depressed/Interpersonally Ambivalent clusters appearing disproportionately comprised of African American girls. To establish external validity, clusters were compared across scales of the Behavioral Assessment System for Children - Self Report of Personality, and corroborative distinctions between clusters were found here.

  4. Mass profile and dynamical status of the z ~ 0.8 galaxy cluster LCDCS 0504

    NASA Astrophysics Data System (ADS)

    Guennou, L.; Biviano, A.; Adami, C.; Limousin, M.; Lima Neto, G. B.; Mamon, G. A.; Ulmer, M. P.; Gavazzi, R.; Cypriano, E. S.; Durret, F.; Clowe, D.; LeBrun, V.; Allam, S.; Basa, S.; Benoist, C.; Cappi, A.; Halliday, C.; Ilbert, O.; Johnston, D.; Jullo, E.; Just, D.; Kubo, J. M.; Márquez, I.; Marshall, P.; Martinet, N.; Maurogordato, S.; Mazure, A.; Murphy, K. J.; Plana, H.; Rostagni, F.; Russeil, D.; Schirmer, M.; Schrabback, T.; Slezak, E.; Tucker, D.; Zaritsky, D.; Ziegler, B.

    2014-06-01

    Context. Constraints on the mass distribution in high-redshift clusters of galaxies are currently not very strong. Aims: We aim to constrain the mass profile, M(r), and dynamical status of the z ~ 0.8 LCDCS 0504 cluster of galaxies that is characterized by prominent giant gravitational arcs near its center. Methods: Our analysis is based on deep X-ray, optical, and infrared imaging as well as optical spectroscopy, collected with various instruments, which we complemented with archival data. We modeled the mass distribution of the cluster with three different mass density profiles, whose parameters were constrained by the strong lensing features of the inner cluster region, by the X-ray emission from the intracluster medium, and by the kinematics of 71 cluster members. Results: We obtain consistent M(r) determinations from three methods based on kinematics (dispersion-kurtosis, caustics, and MAMPOSSt), out to the cluster virial radius, ≃1.3 Mpc and beyond. The mass profile inferred by the strong lensing analysis in the central cluster region is slightly higher than, but still consistent with, the kinematics estimate. On the other hand, the X-ray based M(r) is significantly lower than the kinematics and strong lensing estimates. Theoretical predictions from ΛCDM cosmology for the concentration-mass relation agree with our observational results, when taking into account the uncertainties in the observational and theoretical estimates. There appears to be a central deficit in the intracluster gas mass fraction compared with nearby clusters. Conclusions: Despite the relaxed appearance of this cluster, the determinations of its mass profile by different probes show substantial discrepancies, the origin of which remains to be determined. 
The extension of a dynamical analysis similar to that of other clusters of the DAFT/FADA survey with multiwavelength data of sufficient quality will allow shedding light on the possible systematics that affect the determination of mass profiles of high-z clusters, which is possibly related to our incomplete understanding of intracluster baryon physics. Table 2 is available in electronic form at http://www.aanda.org

  5. [Cluster analysis in biomedical researches].

    PubMed

    Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

    2013-01-01

    Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping separate observations according to their degree of similarity. The review provides definitions of the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.

  6. Information jet: Handling noisy big data from weakly disconnected network

    NASA Astrophysics Data System (ADS)

    Aurongzeb, Deeder

    Sudden aggregation (information jet) of large amounts of data is ubiquitous around connected social networks, driven by sudden interacting and non-interacting events, network security threat attacks, online sales channels, etc. Clustering of information jets based on time series analysis and graph theory is not new, but little work has been done to connect them with particle jet statistics. We show pre-clustering based on context can element soft network or network of information which is critical to minimize time to calculate results from noisy big data. We show the difference between stochastic gradient boosting and time series-graph clustering. For disconnected higher dimensional information jets, we use the Kallenberg representation theorem (Kallenberg, 2005, arXiv:1401.1137) to identify and eliminate jet similarities from dense or sparse graphs.

  7. On the Distribution of Orbital Poles of Milky Way Satellites

    NASA Astrophysics Data System (ADS)

    Palma, Christopher; Majewski, Steven R.; Johnston, Kathryn V.

    2002-01-01

    In numerous studies of the outer Galactic halo some evidence for accretion has been found. If the outer halo did form in part or wholly through merger events, we might expect to find coherent streams of stars and globular clusters following orbits similar to those of their parent objects, which are assumed to be present or former Milky Way dwarf satellite galaxies. We present a study of this phenomenon by assessing the likelihood of potential descendant "dynamical families" in the outer halo. We conduct two analyses: one that involves a statistical analysis of the spatial distribution of all known Galactic dwarf satellite galaxies (DSGs) and globular clusters, and a second, more specific analysis of those globular clusters and DSGs for which full phase space dynamical data exist. In both cases our methodology is appropriate only to members of descendant dynamical families that retain nearly aligned orbital poles today. Since the Sagittarius dwarf (Sgr) is considered a paradigm for the type of merger/tidal interaction event for which we are searching, we also undertake a case study of the Sgr system and identify several globular clusters that may be members of its extended dynamical family. In our first analysis, the distribution of possible orbital poles for the entire sample of outer (Rgc>8 kpc) halo globular clusters is tested for statistically significant associations among globular clusters and DSGs. Our methodology for identifying possible associations is similar to that used by Lynden-Bell & Lynden-Bell, but we put the associations on a more statistical foundation. Moreover, we study the degree of possible dynamical clustering among various interesting ensembles of globular clusters and satellite galaxies. Among the ensembles studied, we find the globular cluster subpopulation with the highest statistical likelihood of association with one or more of the Galactic DSGs to be the distant, outer halo (Rgc>25 kpc), second-parameter globular clusters. 
The results of our orbital pole analysis are supported by the great circle cell count methodology of Johnston, Hernquist, & Bolte. The space motions of the clusters Pal 4, NGC 6229, NGC 7006, and Pyxis are predicted to be among those most likely to show the clusters to be following stream orbits, since these clusters are responsible for the majority of the statistical significance of the association between outer halo, second-parameter globular clusters and the Milky Way DSGs. In our second analysis, we study the orbits of the 41 globular clusters and six Milky Way-bound DSGs having measured proper motions to look for objects with both coplanar orbits and similar angular momenta. Unfortunately, the majority of globular clusters with measured proper motions are inner halo clusters that are less likely to retain memory of their original orbit. Although four potential globular cluster/DSG associations are found, we believe three of these associations involving inner halo clusters to be coincidental. While the present sample of objects with complete dynamical data is small and does not include many of the globular clusters that are more likely to have been captured by the Milky Way, the methodology we adopt will become increasingly powerful as more proper motions are measured for distant Galactic satellites and globular clusters, and especially as results from the Space Interferometry Mission (SIM) become available.

  8. Prevalence and risk factors of seizure clusters in adult patients with epilepsy.

    PubMed

    Chen, Baibing; Choi, Hyunmi; Hirsch, Lawrence J; Katz, Austen; Legge, Alexander; Wong, Rebecca A; Jiang, Alfred; Kato, Kenneth; Buchsbaum, Richard; Detyniecki, Kamil

    2017-07-01

    In the current study, we explored the prevalence of physician-confirmed seizure clusters. We also investigated potential clinical factors associated with the occurrence of seizure clusters overall and by epilepsy type. We reviewed medical records of 4116 adult (≥16 years old) outpatients with epilepsy at our centers for documentation of seizure clusters. Variables including patient demographics, epilepsy details, medical and psychiatric history, AED history, and epilepsy risk factors were then tested against history of seizure clusters. Patients were then divided into focal epilepsy, idiopathic generalized epilepsy (IGE), or symptomatic generalized epilepsy (SGE), and the same analysis was run. Overall, seizure clusters were independently associated with earlier age of seizure onset, symptomatic generalized epilepsy (SGE), central nervous system (CNS) infection, cortical dysplasia, status epilepticus, absence of 1-year seizure freedom, and having failed 2 or more AEDs (P<0.0026). Patients with SGE (27.1%) were more likely to develop seizure clusters than patients with focal epilepsy (16.3%) and IGE (7.4%; all P<0.001). Analysis by epilepsy type showed that absence of 1-year seizure freedom since starting treatment at one of our centers was associated with seizure clustering in patients across all 3 epilepsy types. In patients with SGE, clusters were associated with perinatal/congenital brain injury. In patients with focal epilepsy, clusters were associated with younger age of seizure onset, complex partial seizures, cortical dysplasia, status epilepticus, CNS infection, and having failed 2 or more AEDs. In patients with IGE, clusters were associated with presence of an aura. Only 43.5% of patients with seizure clusters were prescribed rescue medications. Patients with intractable epilepsy are at a higher risk of developing seizure clusters. 
Factors such as having SGE, CNS infection, cortical dysplasia, status epilepticus or an early seizure onset, can also independently increase one's chance of having seizure clusters. Copyright © 2017. Published by Elsevier B.V.

  9. Addressing the complexity of water chemistry in environmental fate modeling for engineered nanoparticles.

    PubMed

    Sani-Kast, Nicole; Scheringer, Martin; Slomberg, Danielle; Labille, Jérôme; Praetorius, Antonia; Ollivier, Patrick; Hungerbühler, Konrad

    2015-12-01

    Engineered nanoparticle (ENP) fate models developed to date - aimed at predicting ENP concentration in the aqueous environment - have limited applicability because they employ constant environmental conditions along the modeled system or a highly specific environmental representation; both approaches do not show the effects of spatial and/or temporal variability. To address this conceptual gap, we developed a novel modeling strategy that: 1) incorporates spatial variability in environmental conditions in an existing ENP fate model; and 2) analyzes the effect of a wide range of randomly sampled environmental conditions (representing variations in water chemistry). This approach was employed to investigate the transport of nano-TiO2 in the Lower Rhône River (France) under numerous sets of environmental conditions. The predicted spatial concentration profiles of nano-TiO2 were then grouped according to their similarity by using cluster analysis. The analysis resulted in a small number of clusters representing groups of spatial concentration profiles. All clusters show nano-TiO2 accumulation in the sediment layer, supporting results from previous studies. Analysis of the characteristic features of each cluster demonstrated a strong association between the water conditions in regions close to the ENP emission source and the cluster membership of the corresponding spatial concentration profiles. In particular, water compositions favoring heteroaggregation between the ENPs and suspended particulate matter resulted in clusters of low variability. These conditions are, therefore, reliable predictors of the eventual fate of the modeled ENPs. The conclusions from this study are also valid for ENP fate in other large river systems. Our results, therefore, shift the focus of future modeling and experimental research of ENP environmental fate to the water characteristic in regions near the expected ENP emission sources. 
Under conditions favoring heteroaggregation in these regions, the fate of the ENPs can be readily predicted. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. Socioeconomic status (SES) and childhood acute myeloid leukemia (AML) mortality risk: Analysis of SEER data.

    PubMed

    Knoble, Naomi B; Alderfer, Melissa A; Hossain, Md Jobayer

    2016-10-01

    Socioeconomic status (SES) is a complex construct of multiple indicators, known to impact cancer outcomes, but has not been adequately examined among pediatric AML patients. This study aimed to identify the patterns of co-occurrence of multiple community-level SES indicators and to explore associations between various patterns of these indicators and pediatric AML mortality risk. A nationally representative US sample of 3651 pediatric AML patients, aged 0-19 years at diagnosis was drawn from 17 Surveillance, Epidemiology, and End Results (SEER) database registries created between 1973 and 2012. Factor analysis, cluster analysis, stratified univariable and multivariable Cox proportional hazards models were used. Four SES factors accounting for 87% of the variance in SES indicators were identified: F1) economic/educational disadvantage, less immigration; F2) immigration-related features (foreign-born, language-isolation, crowding), less mobility; F3) housing instability; and, F4) absence of moving. F1 and F3 showed elevated risk of mortality, adjusted hazards ratios (aHR) (95% CI): 1.07(1.02-1.12) and 1.05(1.00-1.10), respectively. Seven SES-defined cluster groups were identified. Cluster 1 (low economic/educational disadvantage, few immigration-related features, and residential-stability) showed the minimum risk of mortality. Compared to Cluster 1, Cluster 3 (high economic/educational disadvantage, high-mobility) and Cluster 6 (moderately-high economic/educational disadvantages, housing-instability and immigration-related features) exhibited substantially greater risk of mortality, aHR(95% CI)=1.19(1.0-1.4) and 1.23 (1.1-1.5), respectively. Factors of correlated SES-indicators and their pattern-based groups demonstrated differential risks in the pediatric AML mortality indicating the need of special public-health attention in areas with economic-educational disadvantages, housing-instability and immigration-related features. Copyright © 2016 Elsevier Ltd. 
All rights reserved.

  11. X-Ray Morphological Analysis of the Planck ESZ Clusters

    NASA Astrophysics Data System (ADS)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine; Ettori, Stefano; Andrade-Santos, Felipe; Arnaud, Monique; Démoclès, Jessica; Pratt, Gabriel W.; Randall, Scott; Kraft, Ralph

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev-Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev-Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  12. X-Ray Morphological Analysis of the Planck ESZ Clusters

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lovisari, Lorenzo; Forman, William R.; Jones, Christine

    2017-09-01

    X-ray observations show that galaxy clusters have a very large range of morphologies. The most disturbed systems, which are good to study how clusters form and grow and to test physical models, may potentially complicate cosmological studies because the cluster mass determination becomes more challenging. Thus, we need to understand the cluster properties of our samples to reduce possible biases. This is complicated by the fact that different experiments may detect different cluster populations. For example, Sunyaev–Zeldovich (SZ) selected cluster samples have been found to include a greater fraction of disturbed systems than X-ray selected samples. In this paper we determine eight morphological parameters for the Planck Early Sunyaev–Zeldovich (ESZ) objects observed with XMM-Newton. We found that two parameters, concentration and centroid shift, are the best to distinguish between relaxed and disturbed systems. For each parameter we provide the values that allow selecting the most relaxed or most disturbed objects from a sample. We found that there is no mass dependence on the cluster dynamical state. By comparing our results with what was obtained with REXCESS clusters, we also confirm that the ESZ clusters indeed tend to be more disturbed, as found by previous studies.

  13. Seismic facies analysis based on self-organizing map and empirical mode decomposition

    NASA Astrophysics Data System (ADS)

    Du, Hao-kun; Cao, Jun-xing; Xue, Ya-juan; Wang, Xing-jian

    2015-01-01

    Seismic facies analysis plays an important role in seismic interpretation and reservoir model building by offering an effective way to identify the changes in geofacies between wells. The selection of input seismic attributes and their time window has an obvious effect on the validity of classification and requires iterative experimentation and prior knowledge. In general, cluster analysis is sensitive to noise when the waveform serves as the input data, especially with a narrow window. To overcome this limitation, the Empirical Mode Decomposition (EMD) method is introduced into waveform classification based on SOM. We first de-noise the seismic data using EMD and then cluster the data using a 1D grid SOM. The main advantages of this method are resolution enhancement and noise reduction. 3D seismic data from the western Sichuan basin, China, are collected for validation. The application results show that seismic facies analysis can be improved and can better support interpretation. Its strong tolerance for noise makes the proposed method a better seismic facies analysis tool than the classical 1D grid SOM method, especially for waveform clustering with a narrow window.

  14. Particulate matter time-series and Köppen-Geiger climate classes in North America and Europe

    NASA Astrophysics Data System (ADS)

    Pražnikar, Jure

    2017-02-01

    Four years of time-series data on the particulate matter (PM) concentrations from 801 monitoring stations located in Europe and 234 stations in North America were analyzed. Using k-means clustering with distance correlation as a measure for similarity, 5 distinct PM clusters in Europe and 9 clusters across the United States of America (USA) were found. This study shows that meteorology has an important role in controlling PM concentrations, as comparison between Köppen-Geiger climate zones and identified PM clusters revealed very good spatial overlapping. Moreover, the Köppen-Geiger boundaries in Europe show a high similarity to the boundaries as defined by PM clusters. The western USA is much more diverse regarding climate zones; this characteristic was confirmed by cluster analysis, as 6 clusters were identified in the west, and only 3 were identified on the eastern side of the USA. The lowest similarity between PM time-series in Europe was observed between the Iberian Peninsula and the north Europe clusters. These two regions also show considerable differences, as the cold semi-arid climate has a long and hot summer period, while the cool continental climate has a short summertime and long and cold winters. Additionally, intra-continental examination of European clusters showed meteorologically driven phenomena in autumn 2011 encompassing a large European region from Bulgaria in the south, Germany in central Europe and Finland in the north with high PM concentrations in November and a decline in December 2011. Inter-continental comparison between Europe and the USA clusters revealed a remarkable difference between the PM time-series located in humid continental zone. It seems that because of higher shortwave downwelling radiation (≈210 W m⁻²) over the USA's continental zone, and consequently more intense production of secondary aerosols, a summer peak in PM concentration was observed. 
On the other hand, Europe's humid continental climate region experiences lower solar radiation (≈180 W m⁻²); consequently, the elevated summer-time PM concentrations were not detected.
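
    The similarity measure named in this abstract, distance correlation, can be computed for two equal-length series as in the sketch below. This is only an illustration of the standard sample statistic. The abstract does not say how the measure was fed into k-means, so the dissimilarity `1 - dCor` shown at the end is an assumption, and the series names and values are hypothetical.

```python
from math import sqrt


def _centered_distances(x):
    """Pairwise |x_i - x_j| matrix with row, column and grand means removed."""
    n = len(x)
    d = [[abs(a - b) for b in x] for a in x]
    row = [sum(r) / n for r in d]
    grand = sum(row) / n
    # d is symmetric, so its column means equal its row means.
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)]
            for i in range(n)]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length numeric series, in [0, 1]."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for r in a for v in r) / n ** 2
    dvar_y = sum(v * v for r in b for v in r) / n ** 2
    if dvar_x == 0 or dvar_y == 0:  # a constant series carries no dependence
        return 0.0
    return sqrt(max(dcov2, 0.0) / sqrt(dvar_x * dvar_y))


# Hypothetical PM series from two stations (illustrative values only).
pm_a = [1.0, 2.0, 4.0, 7.0]
pm_b = [2.0 * v + 1.0 for v in pm_a]  # an exact linear rescaling of pm_a
dcor = distance_correlation(pm_a, pm_b)
dissimilarity = 1.0 - dcor  # assumed conversion for use in a clustering step
```

    Unlike Pearson correlation, distance correlation is zero in the population only when the two variables are independent, so it can also pick up non-linear dependence between series.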

  15. Update on ONC's Substellar IMF: A Second Peak in the Brown Dwarf Regime

    NASA Astrophysics Data System (ADS)

    Drass, Holger; Bayo, A.; Chini, R.; Haas, M.

    2017-06-01

    The Orion Nebula Cluster (ONC) has become the prototype cluster for studying the Initial Mass Function (IMF). In a deep JHK survey of the ONC with HAWK-I we detected a large population of 900 brown dwarf and planetary-mass object candidates, presenting a pronounced second peak in the substellar IMF. One of the most obvious issues with this result is the verification of cluster membership. The analysis so far was mainly based on statistical considerations. In this presentation I will show the results from using different high-resolution extinction maps to determine the ONC membership.

  16. An application of bioassessment metrics and multivariate techniques to evaluate central Nebraska streams

    USGS Publications Warehouse

    Frenzel, S.A.

    1996-01-01

    Ninety-one stream sites in central Nebraska were classified into four clusters on the basis of a cluster analysis (TWINSPAN) of macroinvertebrate data. Rapid bioassessment protocol scores for macroinvertebrate species were significantly different among sites grouped by the first division into two clusters. This division may have distinguished sites on the basis of water-quality impairment. Individual metrics that differed between clusters of sites were the Hilsenhoff Biotic Index, the number of Ephemeroptera, Plecoptera, and Trichoptera (EPT) taxa, and the ratio of individuals in EPT to Chironomidae taxa. Canonical correspondence analysis of 57 of 91 sites showed that stream width, site altitude, latitude, soil permeability, water temperature, and mean annual precipitation were the most important environmental variables describing variance in the species-environment relation. Stream width and soil permeability reflected streamflow characteristics of a site, whereas site altitude and latitude were factors related to general climatic conditions. Mean annual precipitation related to both streamflow and climatic conditions.

  17. A population of gamma-ray emitting globular clusters seen with the Fermi Large Area Telescope

    DOE PAGES

    Abdo, A. A.

    2010-11-24

    Context. Globular clusters with their large populations of millisecond pulsars (MSPs) are believed to be potential emitters of high-energy gamma-ray emission. The observation of this emission provides a powerful tool to assess the millisecond pulsar population of a cluster, is essential for understanding the importance of binary systems for the evolution of globular clusters, and provides complementary insights into magnetospheric emission processes. Aims. Our goal is to constrain the millisecond pulsar populations in globular clusters from analysis of gamma-ray observations. Methods. We use 546 days of continuous sky-survey observations obtained with the Large Area Telescope aboard the Fermi Gamma-ray Space Telescope to study the gamma-ray emission towards 13 globular clusters. Results. Steady point-like high-energy gamma-ray emission has been significantly detected towards 8 globular clusters. Five of them (47 Tucanae, Omega Cen, NGC 6388, Terzan 5, and M 28) show hard spectral power indices (0.7 < Γ < 1.4) and clear evidence for an exponential cut-off in the range 1.0 - 2.6 GeV, which is the characteristic signature of magnetospheric emission from MSPs. Three of them (M 62, NGC 6440 and NGC 6652) also show hard spectral indices (1.0 < Γ < 1.7); however, the presence of an exponential cut-off cannot be unambiguously established. Three of them (Omega Cen, NGC 6388, NGC 6652) have no known radio or X-ray MSPs yet still exhibit MSP spectral properties. From the observed gamma-ray luminosities, we estimate the total number of MSPs that is expected to be present in these globular clusters. We show that our estimates of the MSP population correlate with the stellar encounter rate and we estimate 2600 - 4700 MSPs in Galactic globular clusters, commensurate with previous estimates. Conclusions. The observation of high-energy gamma-ray emission from globular clusters thus provides a reliable independent method to assess their millisecond pulsar populations.

  18. Determination and analysis of the complete genome sequence of Paralichthys olivaceus rhabdovirus (PORV).

    PubMed

    Zhu, Ruo-Lin; Zhang, Qi-Ya

    2014-04-01

    Paralichthys olivaceus rhabdovirus (PORV), which is associated with high mortality rates in flounder, was isolated in China in 2005. Here, we provide an annotated sequence record of PORV, the genome of which comprises 11,182 nucleotides and contains six genes in the order 3'-N-P-M-G-NV-L-5'. Phylogenetic analysis based on glycoprotein sequences of PORV and other rhabdoviruses showed that PORV clusters with viral haemorrhagic septicemia virus (VHSV), genus Novirhabdovirus, family Rhabdoviridae. Further phylogenetic analysis of the combined amino acid sequences of six proteins of PORV and VHSV strains showed that PORV clusters with Korean strains and is closely related to Asian strains, all of which were isolated from flounder. In a comparison in which the sequences of the six proteins were combined, PORV shared the highest identity (98.3 %) with VHSV strain KJ2008 from Korea.

  19. TWO-STAGE FRAGMENTATION FOR CLUSTER FORMATION: ANALYTICAL MODEL AND OBSERVATIONAL CONSIDERATIONS

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bailey, Nicole D.; Basu, Shantanu, E-mail: nwityk@uwo.ca, E-mail: basu@uwo.ca

    2012-12-10

    Linear analysis of the formation of protostellar cores in planar magnetic interstellar clouds shows that molecular clouds exhibit a preferred length scale for collapse that depends on the mass-to-flux ratio and neutral-ion collision time within the cloud. We extend this linear analysis to the context of clustered star formation. By combining the results of the linear analysis with a realistic ionization profile for the cloud, we find that a molecular cloud may evolve through two fragmentation events in the evolution toward the formation of stars. Our model suggests that the initial fragmentation into clumps occurs for a transcritical cloud on parsec scales while the second fragmentation can occur for transcritical and supercritical cores on subparsec scales. Comparison of our results with several star-forming regions (Perseus, Taurus, Pipe Nebula) shows support for a two-stage fragmentation model.

  20. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE PAGES

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.; ...

    2016-11-24

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  1. Computing and Applying Atomic Regulons to Understand Gene Expression and Regulation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Faria, José P.; Davis, James J.; Edirisinghe, Janaka N.

    Understanding gene function and regulation is essential for the interpretation, prediction, and ultimate design of cell responses to changes in the environment. A multitude of technologies, abstractions, and interpretive frameworks have emerged to answer the challenges presented by genome function and regulatory network inference. Here, we propose a new approach for producing biologically meaningful clusters of coexpressed genes, called Atomic Regulons (ARs), based on expression data, gene context, and functional relationships. We demonstrate this new approach by computing ARs for Escherichia coli, which we compare with the coexpressed gene clusters predicted by two prevalent existing methods: hierarchical clustering and k-means clustering. We test the consistency of ARs predicted by all methods against expected interactions predicted by the Context Likelihood of Relatedness (CLR) mutual information based method, finding that the ARs produced by our approach show better agreement with CLR interactions. We then apply our method to compute ARs for four other genomes: Shewanella oneidensis, Pseudomonas aeruginosa, Thermus thermophilus, and Staphylococcus aureus. We compare the AR clusters from all genomes to study the similarity of coexpression among a phylogenetically diverse set of species, identifying subsystems that show remarkable similarity over wide phylogenetic distances. We also study the sensitivity of our method for computing ARs to the expression data used in the computation, showing that our new approach requires less data than competing approaches to converge to a near final configuration of ARs. We go on to use our sensitivity analysis to identify the specific experiments that lead most rapidly to the final set of ARs for E. coli. As a result, this analysis produces insights into improving the design of gene expression experiments.

  2. THE VERY MASSIVE STAR CONTENT OF THE NUCLEAR STAR CLUSTERS IN NGC 5253

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Smith, L. J.; Crowther, P. A.; Calzetti, D.

    2016-05-20

    The blue compact dwarf galaxy NGC 5253 hosts a very young starburst containing twin nuclear star clusters, separated by a projected distance of 5 pc. One cluster (#5) coincides with the peak of the Hα emission and the other (#11) with a massive ultracompact H II region. A recent analysis of these clusters shows that they have a photometric age of 1 ± 1 Myr, in apparent contradiction with the age of 3–5 Myr inferred from the presence of Wolf-Rayet features in the cluster #5 spectrum. We examine Hubble Space Telescope ultraviolet and Very Large Telescope optical spectroscopy of #5 and show that the stellar features arise from very massive stars (VMSs), with masses greater than 100 M⊙, at an age of 1–2 Myr. We further show that the very high ionizing flux from the nuclear clusters can only be explained if VMSs are present. We investigate the origin of the observed nitrogen enrichment in the circumcluster ionized gas and find that the excess N can be produced by massive rotating stars within the first 1 Myr. We find similarities between the NGC 5253 cluster spectrum and those of metal-poor, high-redshift galaxies. We discuss the presence of VMSs in young, star-forming galaxies at high redshift; these should be detected in rest-frame UV spectra to be obtained with the James Webb Space Telescope. We emphasize that population synthesis models with upper mass cutoffs greater than 100 M⊙ are crucial for future studies of young massive star clusters at all redshifts.

  3. Impulsivity profiles in pathological slot machine gamblers.

    PubMed

    Aragay, Núria; Barrios, Maite; Ramirez-Gendrau, Isabel; Garcia-Caballero, Anna; Garrido, Gemma; Ramos-Grille, Irene; Galindo, Yésika; Martin-Dombrowski, Jonatan; Vallès, Vicenç

    2018-05-01

    In gambling disorder (GD), impulsivity has been related to severity, treatment outcome and a greater dropout rate. The aim of the study is to obtain an empirical classification of GD patients based on their impulsivity and compare the resulting groups in terms of sociodemographic, clinical and gambling behavior variables. 126 patients with slot machine GD attending the Pathological Gambling Unit between 2013 and 2016 were included. The UPPS-P Impulsive Behavior Scale was used to assess impulsivity, and the severity of past-year gambling behavior was established with the Screen for Gambling problems questionnaire (NODS). Depression and anxiety symptoms and executive function were also assessed. A two-step cluster analysis was carried out to determine impulsivity profiles. According to the UPPS-P data, two clusters were generated. Cluster 1 showed the highest scores on all the UPPS-P subscales, whereas patients from cluster 2 exhibited only high scores on two UPPS-P subscales: Negative Urgency and Lack of premeditation. Additionally, patients in cluster 1 were younger and showed significantly higher scores on the Beck Depression Inventory and on the State-Trait Anxiety Inventory questionnaires, worse emotional regulation and executive functioning, and reported more psychiatric comorbidity compared to patients in cluster 2. With regard to gambling behavior, cluster 1 patients had significantly higher NODS scores and a higher percentage presented active gambling behavior at treatment start than in cluster 2. We found two impulsivity subtypes of slot machine gamblers. Patients with high impulsivity showed more severe gambling behavior, more clinical psychopathology and worse emotional regulation and executive functioning than those with lower levels of impulsivity. These two different clinical profiles may require different therapeutic approaches. Copyright © 2018 Elsevier Inc. All rights reserved.

  4. Spatial Analysis of Great Lakes Regional Icing Cloud Liquid Water Content

    NASA Technical Reports Server (NTRS)

    Ryerson, Charles C.; Koenig, George G.; Melloh, Rae A.; Meese, Debra A.; Reehorst, Andrew L.; Miller, Dean R.

    2003-01-01

    Clustering of cloud microphysical conditions, such as liquid water content (LWC) and drop size, can affect the rate and shape of ice accretion and the airworthiness of aircraft. Clustering may also degrade the accuracy of cloud LWC measurements from radars and microwave radiometers being developed by the government for remotely mapping icing conditions ahead of aircraft in flight. This paper evaluates spatial clustering of LWC in icing clouds using measurements collected during NASA research flights in the Great Lakes region. We used graphical and analytical approaches to describe clustering. The analytical approach involves determining the average size of clusters and computing a clustering intensity parameter. We analyzed flight data composed of 1-s-frequency LWC measurements for 12 periods ranging from 17.4 minutes (73 km) to 45.3 minutes (190 km) in duration. Graphically some flight segments showed evidence of consistency with regard to clustering patterns. Cluster intensity varied from 0.06, indicating little clustering, to a high of 2.42. Cluster lengths ranged from 0.1 minutes (0.6 km) to 4.1 minutes (17.3 km). Additional analyses will allow us to determine if clustering climatologies can be developed to characterize cluster conditions by region, time period, or weather condition.

  5. Unraveling the efficiency of RAPD and SSR markers in diversity analysis and population structure estimation in common bean.

    PubMed

    Zargar, Sajad Majeed; Farhat, Sufia; Mahajan, Reetika; Bhakhri, Ayushi; Sharma, Arjun

    2016-01-01

    Increase in food production vis-à-vis quality of food is important to feed the growing human population to attain food as well as nutritional security. The availability of diverse germplasm of any crop is an important genetic resource to mine the genes that may assist in attaining food as well as nutritional security. Here we used 15 RAPD and 23 SSR markers to elucidate diversity among 51 common bean genotypes, mostly landraces collected from the Himalayan region of Jammu and Kashmir, India. We observed that both marker types are highly polymorphic. The discriminatory power of these markers was determined using various parameters such as percent polymorphism, PIC, resolving power and marker index. 15 RAPDs produced 171 polymorphic bands, while 23 SSRs produced 268 polymorphic bands. SSRs showed a higher PIC value (0.300) compared to RAPDs (0.243). Further, the resolving power of SSRs was 5.241 compared to 3.86 for RAPDs. However, RAPDs showed a higher marker index (2.69) compared to SSRs (1.279), which may be attributed to their higher multiplex ratio. The dendrograms generated with hierarchical UPGMA cluster analysis grouped genotypes into two main clusters with various degrees of sub-clustering within each cluster. Here we observed that both marker systems showed comparable accuracy in grouping genotypes of common bean according to their area of cultivation. The model-based STRUCTURE analysis using 15 RAPD and 23 SSR markers identified a population with 3 sub-populations, which corresponds to the distance-based groupings. A high level of genetic diversity was observed within the population. These findings have further implications in common bean breeding as well as conservation programs.

  6. Performance analysis of unsupervised optimal fuzzy clustering algorithm for MRI brain tumor segmentation.

    PubMed

    Blessy, S A Praylin Selva; Sulochana, C Helen

    2015-01-01

    Segmentation of brain tumor from Magnetic Resonance Imaging (MRI) becomes very complicated due to the structural complexities of human brain and the presence of intensity inhomogeneities. To propose a method that effectively segments brain tumor from MR images and to evaluate the performance of unsupervised optimal fuzzy clustering (UOFC) algorithm for segmentation of brain tumor from MR images. Segmentation is done by preprocessing the MR image to standardize intensity inhomogeneities followed by feature extraction, feature fusion and clustering. Different validation measures are used to evaluate the performance of the proposed method using different clustering algorithms. The proposed method using UOFC algorithm produces high sensitivity (96%) and low specificity (4%) compared to other clustering methods. Validation results clearly show that the proposed method with UOFC algorithm effectively segments brain tumor from MR images.

  7. Extreme Mergers from the Massive Cluster Survey

    NASA Astrophysics Data System (ADS)

    Morris, R.

    2010-09-01

    We will observe an extraordinary, high-redshift galaxy cluster from the Massive Cluster Survey. The target is a very rare, triple merger system, and likely lies at one of the deepest nodes of the cosmic web. The target shows multiple strong gravitational lensing arcs in the cluster core. This target only possesses a very short (10 ks) Chandra observation, and is unobserved by XMM-Newton. The X-ray data from this joint Chandra/HST proposal will be used to probe the mass distribution of hot, baryonic gas, and to reveal the details of the merger physics and the process of cluster assembly. We will also search for hints of X-ray emission from filaments between the merging clumps. Subaru and some Hubble Space Telescope imaging data are in hand; we will gather additional HST coverage for a lensing analysis.

  8. Network visualization of conformational sampling during molecular dynamics simulation.

    PubMed

    Ahlstrom, Logan S; Baker, Joseph Lee; Ehrlich, Kent; Campbell, Zachary T; Patel, Sunita; Vorontsov, Ivan I; Tama, Florence; Miyashita, Osamu

    2013-11-01

    Effective data reduction methods are necessary for uncovering the inherent conformational relationships present in large molecular dynamics (MD) trajectories. Clustering algorithms provide a means to interpret the conformational sampling of molecules during simulation by grouping trajectory snapshots into a few subgroups, or clusters, but the relationships between the individual clusters may not be readily understood. Here we show that network analysis can be used to visualize the dominant conformational states explored during simulation as well as the connectivity between them, providing a more coherent description of conformational space than traditional clustering techniques alone. We compare the results of network visualization against 11 clustering algorithms and principal component conformer plots. Several MD simulations of proteins undergoing different conformational changes demonstrate the effectiveness of networks in reaching functional conclusions. Copyright © 2013 Elsevier Inc. All rights reserved.

  9. Tropical cyclone risk analysis: a decisive role of its track

    NASA Astrophysics Data System (ADS)

    Chelsea Nam, C.; Park, Doo-Sun R.; Ho, Chang-Hoi

    2016-04-01

    The tracks of 85 tropical cyclones (TCs) that made landfall in South Korea during the period 1979-2010 are classified into four clusters by using a fuzzy c-means clustering method. The four clusters are characterized as 1) east-short, 2) east-long, 3) west-long, and 4) west-short, based on their moving routes around the Korean Peninsula. We conducted a risk comparison analysis for these four clusters regarding their hazards, exposure, and damages. Here, hazard parameters are calculated from two different sources independently, one from the best-track data (BT) and the other from the 60 weather stations over the country (WS). The results show distinct characteristics of the four clusters in terms of the hazard parameters and economic losses (EL), suggesting that there is a clear track dependency in the overall TC risk. It appears that whether an "effective collision" occurred outweighs the intensity of the TC per se. The EL ranking did not agree with the BT parameters (maximum wind speed, central pressure, or storm radius), but matched the WS parameters (especially daily accumulated rainfall and TC-influenced period). The west-approaching TCs (i.e. west-long and west-short clusters) generally recorded larger EL than the east-approaching TCs (i.e. east-short and east-long clusters), although the east-long cluster is the strongest from the BT point of view. This can be explained through the spatial distribution of the WS parameters and the corresponding regional EL maps. West-approaching TCs brought heavy rainfall to the southern regions, helped by the topographic effect along their tracks and by their extended stay over the Korean Peninsula during extratropical transition, neither of which applied to the east-approaching TCs. On the other hand, some regions had EL that was not directly proportional to the hazards, and this is partly attributed to spatial disparity in wealth and vulnerability.
Correlation analysis also revealed the importance of rainfall; daily accumulated rainfall is the most correlated with EL among all BT and WS hazard parameters for all clusters except the east-short. The least correlated hazard parameter is the storm radius, which showed significant correlations with EL for only the short clusters. In conclusion, this study suggests that TC track is essential in determining how a TC causes damage in South Korea. Thus, it is suggested that the damage warning and adaptation policy need to differ for different TC tracks, although South Korea is relatively small compared to average TC size.
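
The fuzzy c-means step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how tracks were encoded, so the 2-D toy points, the initial centres, the fuzzifier `m = 2.0`, and the function name `fuzzy_c_means` are all assumptions made for illustration. The sketch alternates the two standard updates (memberships from inverse distances, centres from membership-weighted means).

```python
import math

def fuzzy_c_means(points, init_centers, m=2.0, n_iter=50):
    """Minimal fuzzy c-means sketch; returns (centers, memberships)."""
    centers = [tuple(c) for c in init_centers]
    dims = len(points[0])
    exponent = -2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        # Membership of each point in each cluster, from inverse distances.
        memberships = []
        for p in points:
            inv = [max(math.dist(p, c), 1e-12) ** exponent for c in centers]
            total = sum(inv)
            memberships.append([v / total for v in inv])
        # Each centre becomes the mean of all points weighted by u**m.
        new_centers = []
        for i in range(len(centers)):
            weights = [u[i] ** m for u in memberships]
            wsum = sum(weights)
            new_centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dims)
            ))
        centers = new_centers
    return centers, memberships

# Toy 2-D points (hypothetical, not TC track data): two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, memberships = fuzzy_c_means(points, init_centers=[(0, 0), (10, 10)])

# A hard label is the cluster with the largest membership for each point.
hard_labels = [max(range(len(u)), key=u.__getitem__) for u in memberships]
print(hard_labels)
```

Unlike k-means, each point keeps a graded membership in every cluster, which is useful when an item such as a storm track sits between two route types.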

  10. Chemometrics-based Approach in Analysis of Arnicae flos

    PubMed Central

    Zheleva-Dimitrova, Dimitrina Zh.; Balabanova, Vessela; Gevrenova, Reneta; Doichinova, Irini; Vitkova, Antonina

    2015-01-01

    Introduction: Arnica montana flowers have a long history as herbal medicines for external use on injuries and rheumatic complaints. Objective: To investigate Arnicae flos of cultivated accessions from Bulgaria, Poland, Germany, Finland, and a pharmacy store for phenolic derivatives and sesquiterpene lactones (STLs). Materials and Methods: Samples of Arnica from nine origins were prepared by ultrasound-assisted extraction with 80% methanol for phenolic compounds analysis. Subsequent reverse-phase high-performance liquid chromatography (HPLC) separation of the analytes was performed using gradient elution and ultraviolet detection at 280 and 310 nm (phenolic acids), and 360 nm (flavonoids). Total STLs were determined in chloroform extracts by solid-phase extraction-HPLC at 225 nm. The HPLC-generated chromatographic data were analyzed using principal component analysis (PCA) and hierarchical clustering (HC). Results: The highest total amount of phenolic acids was found in the sample from the Botanical Garden at Joensuu University, Finland (2.36 mg/g dw). Astragalin, isoquercitrin, and isorhamnetin 3-glucoside were the main flavonol glycosides, being present at up to 3.37 mg/g (astragalin). Three well-defined clusters were distinguished by PCA and HC. Cluster C1 comprised the German and Finnish accessions, characterized by the highest content of flavonols. Cluster C2 included the Bulgarian and Polish samples, presenting a low content of flavonoids. Cluster C3 consisted of only one sample, from a pharmacy store. Conclusion: A validated HPLC method for simultaneous determination of phenolic acids, flavonoid glycosides, and aglycones in A. montana flowers was developed. The PCA loading plot showed that quercetin, kaempferol, and isorhamnetin can be used to distinguish different Arnica accessions.
SUMMARY A principal component analysis (PCA) on 13 phenolic compounds and the total amount of sesquiterpene lactones in the Arnicae flos collection tended to cluster the 9 studied accessions into three main groups. The profiles obtained demonstrated that the samples from Germany and Finland are characterized by greater amounts of phenolic derivatives than the Bulgarian and Polish ones. The PCA loading plot showed that quercetin, kaempferol and isorhamnetin can be used to distinguish different Arnica accessions. PMID:27013791

  11. Non-specific filtering of beta-distributed data.

    PubMed

    Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori; Groshen, Susan; Siegmund, Kimberly D

    2014-06-19

    Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis. In DNA methylation studies, DNA methylation is measured as a proportion, bounded between 0 and 1, with variance a function of the mean. Filtering on standard deviation biases the selection of probes to those with mean values near 0.5. We explore the effect this has on clustering, and develop alternate filter methods that utilize a variance stabilizing transformation for Beta distributed data and do not share this bias. We compared results for 11 different non-specific filters on eight Infinium HumanMethylation data sets, selected to span a variety of biological conditions. We found that for data sets having a small fraction of samples showing abnormal methylation of a subset of normally unmethylated CpGs, a characteristic of the CpG island methylator phenotype in cancer, a novel filter statistic that utilized a variance-stabilizing transformation for Beta distributed data outperformed the common filter of using standard deviation of the DNA methylation proportion, or its log-transformed M-value, in its ability to detect the cancer subtype in a cluster analysis. However, the standard deviation filter always performed among the best for distinguishing subgroups of normal tissue. The novel filter and standard deviation filter tended to favour features in different genome contexts; for the same data set, the novel filter always selected more features from CpG island promoters and the standard deviation filter always selected more features from non-CpG island intergenic regions. Interestingly, despite selecting largely non-overlapping sets of features, the two filters did find sample subsets that overlapped for some real data sets. 
We found two different filter statistics that tended to prioritize features with different characteristics, each performed well for identifying clusters of cancer and non-cancer tissue, and identifying a cancer CpG island hypermethylation phenotype. Since cluster analysis is for discovery, we would suggest trying both filters on any new data sets, evaluating the overlap of features selected and clusters discovered.
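
The bias described in this abstract (a standard deviation filter favouring probes with mean near 0.5) can be seen in a small simulation. This is only a sketch under stated assumptions: the abstract does not give the authors' filter statistic, so the textbook arcsine square-root transform for proportions stands in for "a variance-stabilizing transformation", and the precision value, sample size, and helper name `beta_values` are hypothetical.

```python
import math
import random
import statistics

def beta_values(mean, precision, n, rng):
    """Draw n Beta-distributed proportions with the given mean; precision = a + b."""
    a = mean * precision
    b = (1.0 - mean) * precision
    return [rng.betavariate(a, b) for _ in range(n)]

rng = random.Random(0)
precision = 50.0  # hypothetical; the same for every simulated probe

raw_sd, vst_sd = {}, {}
for mean in (0.1, 0.5, 0.9):
    values = beta_values(mean, precision, 5000, rng)
    # SD on the raw proportion scale depends on the mean ...
    raw_sd[mean] = statistics.pstdev(values)
    # ... while the SD after arcsin(sqrt(x)) is roughly constant.
    vst_sd[mean] = statistics.pstdev([math.asin(math.sqrt(v)) for v in values])
    print(f"mean={mean:.1f}  raw SD={raw_sd[mean]:.4f}  transformed SD={vst_sd[mean]:.4f}")
```

With equal underlying precision, the raw SD is largest at mean 0.5, so ranking probes by raw SD favours mid-range probes; on the transformed scale the three settings look about equally variable.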

  12. Cortical atrophy patterns in early Parkinson's disease patients using hierarchical cluster analysis.

    PubMed

    Uribe, Carme; Segura, Barbara; Baggio, Hugo Cesar; Abos, Alexandra; Garcia-Diaz, Anna Isabel; Campabadal, Anna; Marti, Maria Jose; Valldeoriola, Francesc; Compta, Yaroslau; Tolosa, Eduard; Junque, Carme

    2018-05-01

    Cortical brain atrophy detectable with MRI in non-demented advanced Parkinson's disease (PD) is well characterized, but its presence in early disease stages is still under debate. We aimed to investigate cortical atrophy patterns in a large sample of early untreated PD patients using a hypothesis-free data-driven approach. Seventy-seven de novo PD patients and 50 controls from the Parkinson's Progression Marker Initiative database with T1-weighted images in a 3-tesla Siemens scanner were included in this study. Mean cortical thickness was extracted from 360 cortical areas defined by the Human Connectome Project Multi-Modal Parcellation version 1.0, and a hierarchical cluster analysis was performed using Ward's linkage method. A general linear model with cortical thickness data was then used to compare clustering groups using FreeSurfer software. We identified two patterns of cortical atrophy. Compared with controls, patients grouped in pattern 1 (n = 33) were characterized by cortical thinning in bilateral orbitofrontal, anterior cingulate, and lateral and medial anterior temporal gyri. Patients in pattern 2 (n = 44) showed cortical thinning in bilateral occipital gyrus, cuneus, superior parietal gyrus, and left postcentral gyrus, and they showed neuropsychological impairment in memory and other cognitive domains. Even in the early stages of PD, there is evidence of cortical brain atrophy. Neuroimaging clustering analysis is able to detect two subgroups of cortical thinning, one with mainly anterior atrophy, and the other with posterior predominance and worse cognitive performance. Copyright © 2018 Elsevier Ltd. All rights reserved.

  13. [A study on genotype of 271 Mycobacterium tuberculosis isolates in 6 prefectures in Yunnan Province].

    PubMed

    Chen, L Y; Yang, X; Ru, H H; Yang, H J; Yan, S Q; Ma, L; Chen, J O; Yang, R; Xu, L

    2018-01-06

    Objective: To understand the characteristics of genotypes of Mycobacterium tuberculosis isolates in Yunnan Province, and provide molecular epidemiological evidence for prevention and control of tuberculosis in Yunnan Province. Methods: Mycobacterium tuberculosis (MTB) isolates were collected from 6 prefectures of Yunnan Province in 2014, and their genotypes were obtained using spoligotyping and multiple-locus variable-number tandem-repeat analysis (MLVA). The results of spoligotyping were entered into the SITVITWEB database to obtain the Spoligotyping International Type (SIT) patterns and the sublineages of MTB isolates. The genotyping patterns were clustered with BioNumerics (version 5.0). Results: A total of 271 MTB isolates representing patients were collected from six prefectures in Yunnan Province. Of these patients, 196 (72.3%) were male. The mean age of the patients was (41.9±15.1) years. The largest number of MTB isolates came from Puer, a total of 94 isolates (34.69%). Spoligotyping analysis revealed that 151 (55.72%) MTB isolates belonged to the Beijing genotype, while the other 120 (44.28%) were of non-Beijing genotypes; the 40 genotypes consisted of 24 unique genotypes and 16 clusters. The 271 isolates were differentiated into 30 clusters (2 to 17 isolates per cluster) and 177 unique genotypes, showing a clustering rate of 23.62%. Beijing genotype strains showed a higher clustering rate than non-Beijing genotype strains (29.14% vs 16.67%). The HGI of 12-locus VNTR in total MTB strains, Beijing genotype strains and non-Beijing genotype strains was 0.993, 0.982 and 0.995 respectively. Conclusion: The Beijing genotype was the predominant genotype in Yunnan Province, and the Mycobacterium tuberculosis isolates showed high genetic diversity.
The genotyping data reflect potential recent ongoing transmission in some areas, which highlights the urgent need for early diagnosis and treatment of infectious TB cases, to cut off transmission and avoid a large TB outbreak.
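
The reported clustering rate of 23.62% can be reproduced from the counts in this abstract. The abstract does not state the formula, so the sketch below assumes the commonly used "n - 1" definition, (clustered isolates - number of clusters) / total isolates; the function name `clustering_rate` is hypothetical.

```python
def clustering_rate(total_isolates: int, unique_genotypes: int, clusters: int) -> float:
    """'n - 1' clustering rate: (clustered isolates - clusters) / total isolates."""
    clustered_isolates = total_isolates - unique_genotypes
    return (clustered_isolates - clusters) / total_isolates

# Counts from the abstract: 271 isolates, 177 unique genotypes, 30 clusters.
rate = clustering_rate(total_isolates=271, unique_genotypes=177, clusters=30)
print(f"{rate:.2%}")  # 23.62%
```

Here 271 - 177 = 94 isolates fall into clusters, and (94 - 30) / 271 = 64 / 271, which matches the published figure and suggests the "n - 1" definition was the one used.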

  14. Molecular reclassification of Crohn's disease: a cautionary note on population stratification.

    PubMed

    Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

    2013-01-01

    Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn's disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn's disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals.

  15. Molecular Reclassification of Crohn’s Disease: A Cautionary Note on Population Stratification

    PubMed Central

    Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M.; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

    2013-01-01

    Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals. PMID:24147066

  16. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro

    In this article, we report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. Lastly, we argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Branchini, Enzo; Camera, Stefano; Cuoco, Alessandro

    We report the detection of a cross-correlation signal between Fermi Large Area Telescope diffuse γ-ray maps and catalogs of clusters. In our analysis, we considered three different catalogs: WHL12, redMaPPer, and PlanckSZ. They all show a positive correlation with different amplitudes, related to the average mass of the objects in each catalog, which also sets the catalog bias. The signal detection is confirmed by the results of a stacking analysis. The cross-correlation signal extends to rather large angular scales, around 1°, that correspond, at the typical redshift of the clusters in these catalogs, to a few to tens of megaparsecs, i.e., the typical scale-length of the large-scale structures in the universe. Most likely this signal is contributed by the cumulative emission from active galactic nuclei (AGNs) associated with the filamentary structures that converge toward the high peaks of the matter density field in which galaxy clusters reside. In addition, our analysis reveals the presence of a second component, more compact in size and compatible with a point-like emission from within individual clusters. At present, we cannot distinguish between the two most likely interpretations for such a signal, i.e., whether it is produced by AGNs inside clusters or if it is a diffuse γ-ray emission from the intracluster medium. We argue that this latter, intriguing, hypothesis might be tested by applying this technique to a low-redshift large-mass cluster sample.

  18. The symbiovar trifolii of Rhizobium bangladeshense and Rhizobium aegyptiacum sp. nov. nodulate Trifolium alexandrinum in Egypt.

    PubMed

    Shamseldin, Abdelaal; Carro, Lorena; Peix, Alvaro; Velázquez, Encarna; Moawad, Hassan; Sadowsky, Michael J

    2016-06-01

    In the present work we analyzed the taxonomic status of several Rhizobium strains isolated from Trifolium alexandrinum L. nodules in Egypt. The 16S rRNA genes of these strains were identical to those of Rhizobium bangladeshense BLR175(T) and Rhizobium binae BLR195(T). However, the analyses of recA and atpD genes split the strains into two clusters. Cluster II strains are identified as R. bangladeshense with >98% similarity values in both genes. The cluster I strains are phylogenetically related to Rhizobium etli CFN42(T) and R. bangladeshense BLR175(T), but with less than 94% similarity values in recA and atpD genes. DNA-DNA hybridization analysis showed 42% and 48% average relatedness between the strain 1010(T) from cluster I with respect to R. bangladeshense BLR175(T) and R. etli CFN42(T), respectively. Phenotypic characteristics of cluster I strains also differed from those of their closest related Rhizobium species. Analysis of the nodC gene showed that the strains belong to two groups within the symbiovar trifolii which was identified in Egypt linked to the species R. bangladeshense. Based on the genotypic and phenotypic characteristics, the group I strains belong to a new species for which the name Rhizobium aegyptiacum sp. nov. (sv. trifolii) is proposed, with strain 1010(T) being designated as the type strain (= USDA 7124(T)=LMG 29296(T)=CECT 9098(T)). Copyright © 2016 Elsevier GmbH. All rights reserved.

  19. Identification of spatiotemporal nutrient patterns in a coastal bay via an integrated k-means clustering and gravity model.

    PubMed

    Chang, Ni-Bin; Wimberly, Brent; Xuan, Zhemin

    2012-03-01

    This study presents an integrated k-means clustering and gravity model (IKCGM) for investigating the spatiotemporal patterns of nutrient and associated dissolved oxygen levels in Tampa Bay, Florida. By using a k-means clustering analysis to first partition the nutrient data into a user-specified number of subsets, it is possible to discover the spatiotemporal patterns of nutrient distribution in the bay and capture the inherent linkages of hydrodynamic and biogeochemical features. Such patterns may then be combined with a gravity model to link the nutrient source contribution from each coastal watershed to the generated clusters in the bay to aid in the source proportion analysis for environmental management. The clustering analysis was carried out based on 1 year (2008) water quality data composed of 55 sample stations throughout Tampa Bay collected by the Environmental Protection Commission of Hillsborough County. In addition, hydrological and river water quality data of the same year were acquired from the United States Geological Survey's National Water Information System to support the gravity modeling analysis. The results show that the k-means model with 8 clusters is the optimal choice, in which cluster 2 at Lower Tampa Bay had the minimum values of total nitrogen (TN) concentrations, chlorophyll a (Chl-a) concentrations, and ocean color values in every season as well as the minimum concentration of total phosphorus (TP) in three consecutive seasons in 2008. The datasets indicate that Lower Tampa Bay is an area with limited nutrient input throughout the year. Cluster 5, located in Middle Tampa Bay, displayed elevated TN concentrations, ocean color values, and Chl-a concentrations, suggesting that high values of colored dissolved organic matter are linked with some nutrient sources. The data presented by the gravity modeling analysis indicate that the Alafia River Basin is the major contributor of nutrients in terms of both TP and TN values in all seasons. With this new integration, improvements for environmental monitoring and assessment were achieved to advance our understanding of sea-land interactions and nutrient cycling in a critical coastal bay, the Gulf of Mexico. This journal is © The Royal Society of Chemistry 2012
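
    The two-stage idea in the abstract above (partition station measurements with k-means, then weight each watershed's contribution to a cluster with a gravity-type term) can be illustrated with a minimal sketch. This is not the authors' IKCGM implementation: the abstract gives neither the feature set nor the gravity formula, so the function names, the inverse-power form load / distance**exponent, the exponent of 2, and all sample numbers are assumptions made only for illustration.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd's k-means on tuples of floats.

    `centroids` are the caller-supplied starting centers (hypothetical
    choice; the abstract does not say how the clusters were initialised).
    Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's center in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


def gravity_shares(loads, distances, exponent=2.0):
    """Relative contribution of each source to one cluster.

    Assumed gravity form: weight = load / distance**exponent,
    normalised so the shares sum to 1.
    """
    weights = [load / dist ** exponent for load, dist in zip(loads, distances)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical (TN, Chl-a) readings for six stations; not data from the study.
stations = [(0.3, 2.0), (0.4, 2.5), (0.35, 2.2), (1.2, 9.0), (1.1, 8.5), (1.3, 9.5)]
labels, centers = kmeans(stations, [stations[0], stations[3]])

# Hypothetical nutrient loads and distances for two watersheds and one cluster.
shares = gravity_shares(loads=[100.0, 50.0], distances=[10.0, 5.0])
```

    In practice the features would be standardised before clustering, since k-means distances are otherwise dominated by the variable with the largest numeric range, and the number of clusters would be chosen by a validity criterion rather than fixed in advance.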

  20. Conformational Clusters of Phosphorylated Tyrosine.

    PubMed

    Abdelrasoul, Maha; Ponniah, Komala; Mao, Alice; Warden, Meghan S; Elhefnawy, Wessam; Li, Yaohang; Pascal, Steven M

    2017-12-06

    Tyrosine phosphorylation plays an important role in many cellular and intercellular processes including signal transduction, subcellular localization, and regulation of enzymatic activity. In 1999, Blom et al., using the limited number of protein data bank (PDB) structures available at that time, reported that the side chain structures of phosphorylated tyrosine (pY) are partitioned into two conserved conformational clusters (Blom, N.; Gammeltoft, S.; Brunak, S. J. Mol. Biol. 1999, 294, 1351-1362). We have used the spectral clustering algorithm to cluster the growing number of protein structures with pY sites, and have found that the pY residues cluster into three distinct side chain conformations. Two of these pY conformational clusters associate strongly with a narrow range of tyrosine backbone conformation. The novel cluster also correlates highly with the identity of the n + 1 residue, and is strongly associated with a sequential pYpY conformation which places two adjacent pY side chains in a specific relative orientation. Further analysis shows that the three pY clusters are associated with distinct distributions of cognate protein kinases.
