groups cluster analysis: Topics by Science.gov

Sample records for groups cluster analysis

Cluster and principal component analysis based on SSR markers of Amomum tsao-ko in Jinping County of Yunnan Province

NASA Astrophysics Data System (ADS)

Ma, Mengli; Lei, En; Meng, Hengling; Wang, Tiantao; Xie, Linyan; Shen, Dong; Xianwang, Zhou; Lu, Bingyue

2017-08-01

Amomum tsao-ko is a commercial plant that is used for various purposes in the medicinal and food industries. For the present investigation, 44 germplasm samples were collected from Jinping County of Yunnan Province. Cluster analysis and 2-dimensional principal component analysis (PCA) were used to represent the genetic relations among Amomum tsao-ko samples using simple sequence repeat (SSR) markers. Cluster analysis clearly distinguished the sample groups. Two major clusters were formed: the first (Cluster I) consisted of 34 individuals and the second (Cluster II) consisted of 10 individuals; Cluster I, as the main group, contained multiple sub-clusters. PCA also showed 2 groups: PCA Group 1 included 29 individuals and PCA Group 2 included 12 individuals, consistent with the results of cluster analysis. The purpose of the present investigation was to provide information on the genetic relationships of Amomum tsao-ko germplasm resources in the main producing areas, and to provide a theoretical basis for the protection and utilization of Amomum tsao-ko resources.
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials

PubMed Central

Diaz-Ordaz, Karla; Bartlett, Jonathan W

2016-01-01

Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for small number of clusters in each intervention group. PMID:27177885
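The unadjusted cluster-level analysis with complete records described in the abstract above can be illustrated roughly as follows. This is a minimal sketch, not the authors' code: the function name `cluster_level_effect`, the record layout, and the toy data are all assumptions made for illustration. Each cluster is first reduced to the mean of its observed outcomes, and the intervention effect is then estimated as the difference between the arm-level means of those cluster means.

```python
from statistics import mean


def cluster_level_effect(records):
    """Unadjusted cluster-level estimate of the intervention effect.

    records: iterable of (arm, cluster_id, outcome), where outcome is
    None when it is missing (hypothetical layout for this sketch).
    """
    by_cluster = {}
    for arm, cluster_id, outcome in records:
        if outcome is None:
            continue  # complete records analysis: drop missing outcomes
        by_cluster.setdefault((arm, cluster_id), []).append(outcome)

    # Step 1: summarise every cluster by the mean of its observed outcomes.
    cluster_means = {}
    for (arm, _), values in by_cluster.items():
        cluster_means.setdefault(arm, []).append(mean(values))

    # Step 2: compare the arms using the cluster means as the data points.
    return mean(cluster_means["intervention"]) - mean(cluster_means["control"])


# Toy data: two clusters per arm, one missing outcome in cluster B.
records = [
    ("intervention", "A", 5.0), ("intervention", "A", 7.0),
    ("intervention", "B", 6.0), ("intervention", "B", None),
    ("control", "C", 4.0), ("control", "C", 4.0),
    ("control", "D", 2.0), ("control", "D", 4.0),
]
effect = cluster_level_effect(records)
```

Because the sketch simply drops missing outcomes, it inherits the limitation the abstract reports: the estimate is unbiased only when the missingness mechanism is the same in both arms and the baseline covariate does not interact with the intervention.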
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.

PubMed

Hossain, Anower; Diaz-Ordaz, Karla; Bartlett, Jonathan W

2017-06-01

Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for small number of clusters in each intervention group.
Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-Means Cluster Analysis

ERIC Educational Resources Information Center

de Craen, Saskia; Commandeur, Jacques J. F.; Frank, Laurence E.; Heiser, Willem J.

2006-01-01

K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these…
Allergen Sensitization Pattern by Sex: A Cluster Analysis in Korea.

PubMed

Ohn, Jungyoon; Paik, Seung Hwan; Doh, Eun Jin; Park, Hyun-Sun; Yoon, Hyun-Sun; Cho, Soyun

2017-12-01

Allergens tend to sensitize simultaneously. The etiology of this phenomenon has been suggested to be allergen cross-reactivity or concurrent exposure. However, little is known about specific allergen sensitization patterns. This study aimed to investigate allergen sensitization characteristics according to gender. The multiple allergen simultaneous test (MAST) is widely used as a screening tool for detecting allergen sensitization in dermatologic clinics. We retrospectively reviewed the medical records of patients with MAST results between 2008 and 2014 in our Department of Dermatology. A cluster analysis was performed to elucidate the allergen-specific immunoglobulin (Ig)E cluster pattern. The results of MAST (39 allergen-specific IgEs) from 4,360 cases were analyzed. By cluster analysis, the 39 items were grouped into 8 clusters. Each cluster had characteristic features. Compared with the female group, the male group tended to be sensitized more frequently to all tested allergens, except for the fungus allergen cluster. The cluster and comparative analysis results demonstrate that allergen sensitization is clustered, manifesting allergen similarity or co-exposure. Only the fungus cluster allergens tend to sensitize the female group more frequently than the male group.
Effects of Group Size and Lack of Sphericity on the Recovery of Clusters in K-means Cluster Analysis.

PubMed

Craen, Saskia de; Commandeur, Jacques J F; Frank, Laurence E; Heiser, Willem J

2006-06-01

K-means cluster analysis is known for its tendency to produce spherical and equally sized clusters. To assess the magnitude of these effects, a simulation study was conducted, in which populations were created with varying departures from sphericity and group sizes. An analysis of the recovery of clusters in the samples taken from these populations showed a significant effect of lack of sphericity and group size. This effect was, however, not as large as expected, with still a recovery index of more than 0.5 in the "worst case scenario." An interaction effect between the two data aspects was also found. The decreasing trend in the recovery of clusters for increasing departures from sphericity is different for equal and unequal group sizes.
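For readers unfamiliar with the procedure studied above, the following is a minimal sketch of the standard k-means iteration (Lloyd's algorithm) in plain Python. It is not the simulation code used in the study; the function names, the fixed starting centroids, and the toy data are assumptions chosen for illustration. Each point is assigned to its nearest centroid by squared Euclidean distance, which is the source of the preference for compact, roughly spherical clusters that the abstract mentions.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied initial centroids (list of tuples)."""
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Two well-separated toy groups of equal size.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, [(0, 0), (10, 10)])
```

On data this clean the two groups are recovered exactly. The study above concerns what happens when the true groups are elongated or unequal in size, cases in which the nearest-centroid rule tends to misassign points.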
Elucidation of the Pattern of the Onset of Male Lower Urinary Tract Symptoms Using Cluster Analysis: Efficacy of Tamsulosin in Each Symptom Group.

PubMed

Aikawa, Ken; Kataoka, Masao; Ogawa, Soichiro; Akaihata, Hidenori; Sato, Yuichi; Yabe, Michihiro; Hata, Junya; Koguchi, Tomoyuki; Kojima, Yoshiyuki; Shiragasawa, Chihaya; Kobayashi, Toshimitsu; Yamaguchi, Osamu

2015-08-01

To present a new grouping of male patients with lower urinary tract symptoms (LUTS) based on symptom patterns and clarify whether the therapeutic effect of α1-blocker differs among the groups. We performed secondary analysis of anonymous data from 4815 patients enrolled in a postmarketing surveillance study of tamsulosin in Japan. Data on 7 International Prostate Symptom Score (IPSS) items at the initial visit were used in the cluster analysis. IPSS and quality of life (QOL) scores before and after tamsulosin treatment for 12 weeks were assessed in each cluster. Partial correlation coefficients were also obtained for IPSS and QOL scores based on changes before and after treatment. Five symptom groups were identified by cluster analysis of IPSS. On their symptom profile, each cluster was labeled as minimal type (cluster 1), multiple severe type (cluster 2), weak stream type (cluster 3), storage type (cluster 4), and voiding type (cluster 5). Prevalence and the mean symptom score were significantly improved in almost all symptoms in all clusters by tamsulosin treatment. Nocturia and weak stream had the strongest effect on QOL in clusters 1, 2, and 4 and clusters 3 and 5, respectively. The study clarified that 5 characteristic symptom patterns exist by cluster analysis of IPSS in male patients with LUTS. Tamsulosin improved various symptoms and QOL in each symptom group. The study reports many male patients with LUTS being satisfied with monotherapy using tamsulosin and suggests the usefulness of α1-blockers as a drug of first choice. Copyright © 2015 Elsevier Inc. All rights reserved.
Batch Computed Tomography Analysis of Projectiles

DTIC Science & Technology

2016-05-01

error calculation. Projectiles are then grouped together according to the similarity of their components. Also discussed is graphical-cluster analysis...ballistic, armor, grouping, clustering...Fig. 10 Graphical structure of 15 clusters of the jacket/core radii profiles with plots of the profiles contained within each cluster. The size of
A hierarchical cluster analysis of normal-tension glaucoma using spectral-domain optical coherence tomography parameters.

PubMed

Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

2015-01-01

Normal-tension glaucoma (NTG) is a heterogenous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogenous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and were comprised of patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group comprised of younger and more myopic patients may show unique features unlike the other 2 groups.
Suicide in the oldest old: an observational study and cluster analysis.

PubMed

Sinyor, Mark; Tan, Lynnette Pei Lin; Schaffer, Ayal; Gallagher, Damien; Shulman, Kenneth

2016-01-01

The older population are at a high risk for suicide. This study sought to learn more about the characteristics of suicide in the oldest-old and to use a cluster analysis to determine if oldest-old suicide victims assort into clinically meaningful subgroups. Data were collected from a coroner's chart review of suicide victims in Toronto from 1998 to 2011. We compared two age groups (65-79 year olds, n = 335, and 80+ year olds, n = 191) and then conducted a hierarchical agglomerative cluster analysis using Ward's method to identify distinct clusters in the 80+ group. The younger and older age groups differed according to marital status, living circumstances and pattern of stressors. The cluster analysis identified three distinct clusters in the 80+ group. Cluster 1 was the largest (n = 124) and included people who were either married or widowed who had significantly more depression and somewhat more medical health stressors. In contrast, cluster 2 (n = 50) comprised people who were almost all single and living alone with significantly less identified depression and slightly fewer medical health stressors. All members of cluster 3 (n = 17) lived in a retirement residence or nursing home, and this group had the highest rates of depression, dementia, other mental illness and past suicide attempts. This is the first study to use the cluster analysis technique to identify meaningful subgroups among suicide victims in the oldest-old. The results reveal different patterns of suicide in the older population that may be relevant for clinical care. Copyright © 2015 John Wiley & Sons, Ltd.
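The hierarchical agglomerative clustering with Ward's method named in the abstract above can be sketched as follows. This is a deliberately naive illustration in plain Python, not the software used in the study; the function name `ward_clusters` and the toy points are assumptions. At every step it merges the two clusters whose union increases the total within-cluster sum of squares the least, where the increase for clusters A and B is |A||B| / (|A| + |B|) times the squared distance between their centroids.

```python
def ward_clusters(points, n_clusters):
    """Agglomerate points (tuples) with Ward's criterion until n_clusters remain.

    Returns a list of clusters, each a list of indices into points.
    """
    clusters = [[i] for i in range(len(points))]  # start from singletons

    def centroid(members):
        coords = [points[i] for i in members]
        return [sum(axis) / len(coords) for axis in zip(*coords)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
groups = ward_clusters(points, 3)
```

Recomputing every pairwise cost at each step keeps the sketch short but makes it cubic in the number of points; practical implementations update the costs incrementally instead.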
Autoantibodies in pediatric systemic lupus erythematosus: ethnic grouping, cluster analysis, and clinical correlations.

PubMed

Jurencák, Roman; Fritzler, Marvin; Tyrrell, Pascal; Hiraki, Linda; Benseler, Susanne; Silverman, Earl

2009-02-01

(1) To evaluate the spectrum of serum autoantibodies in pediatric-onset systemic lupus erythematosus (pSLE) with a focus on ethnic differences; (2) using cluster analysis, to identify patients with similar autoantibody patterns and to determine their clinical associations. A single-center cohort study of all patients with newly diagnosed pSLE seen over an 8-year period was performed. Ethnicity, clinical, and serological data were prospectively collected from 156/169 patients (92%). The frequencies of 10 selected autoantibodies among ethnic groups were compared. Cluster analysis identified groups of patients with similar autoantibody profiles. Associations of these groups with clinical and laboratory features of pSLE were examined. Among our 5 ethnic groups, there were differences only in the prevalence of anti-U1RNP and anti-Sm antibodies, which occurred more frequently in non-Caucasian patients (p < 0.0001, p < 0.01, respectively). Cluster analysis revealed 3 autoantibody clusters. Cluster 1 consisted of anti-dsDNA antibodies. Cluster 2 consisted of anti-dsDNA, antichromatin, antiribosomal P, anti-U1RNP, anti-Sm, anti-Ro and anti-La autoantibody. Cluster 3 consisted of anti-dsDNA, anti-RNP, and anti-Sm autoantibody. The highest proportion of Caucasians was in cluster 1 (p < 0.05), which was characterized by a mild disease with infrequent major organ involvement compared to cluster 2, which had the highest frequency of nephritis, renal failure, serositis, and hemolytic anemia, or cluster 3, which was characterized by frequent neuropsychiatric disease and nephritis. We observed ethnic differences in autoantibody profiles in pSLE. Autoantibodies tended to cluster together and these clusters were associated with different clinical courses.
A Systematic Approach for Determining Vertical Pile Depth of Embedment in Cohensionless Soils to Withstand Lateral Barge Train Impact Loads

DTIC Science & Technology

2017-01-30

dynamic structural time- history response analysis of flexible approach walls founded on clustered pile groups using Impact_Deck. In Preparation, ERDC...research (Ebeling et al. 2012) has developed simplified analysis procedures for flexible approach wall systems founded on clustered groups of vertical...history response analysis of flexible approach walls founded on clustered pile groups using Impact_Deck. In Preparation, ERDC/ITL TR-16-X. Vicksburg, MS
An Approach to Cluster EU Member States into Groups According to Pathways of Salmonella in the Farm-to-Consumption Chain for Pork Products.

PubMed

Vigre, Håkan; Domingues, Ana Rita Coutinho Calado; Pedersen, Ulrik Bo; Hald, Tine

2016-03-01

The cluster analysis was carried out as part of a project to develop a generic structured quantitative microbiological risk assessment (QMRA) model of human salmonellosis due to pork consumption in EU member states (MSs); the objective of the cluster analysis was to group the EU MSs according to the relative contribution of different pathways of Salmonella in the farm-to-consumption chain of pork products. In the development of the model, a case study MS was selected from each cluster so that the model would represent different aspects of pig production, pork production, and consumption of pork products across EU states. The objective of the cluster analysis was to aggregate MSs into groups of countries with similar importance of different pathways of Salmonella in the farm-to-consumption chain using available and, where possible, universal register data related to pork production and consumption in each country. Based on MS-specific information about the distribution of (i) small and large farms, (ii) small and large slaughterhouses, (iii) amount of pork meat consumed, and (iv) amount of sausages consumed, we used nonhierarchical and hierarchical cluster analysis to group the MSs. The cluster solutions were validated internally using statistical measures and externally by comparing the clustered MSs with an estimated human incidence of salmonellosis due to pork products in the MSs. Finally, each cluster was characterized qualitatively using the centroids of the clusters. © 2016 Society for Risk Analysis.
Clustering analysis for muon tomography data elaboration in the Muon Portal project

NASA Astrophysics Data System (ADS)

Bandieramonte, M.; Antonuccio-Delogu, V.; Becciani, U.; Costa, A.; La Rocca, P.; Massimino, P.; Petta, C.; Pistagna, C.; Riggi, F.; Riggi, S.; Sciacca, E.; Vitello, F.

2015-05-01

Clustering analysis is a multivariate data analysis technique that gathers statistical data units into groups so as to minimize the logical distance within each group and maximize the distance between different groups. In these proceedings, the authors present a novel approach to muon tomography data analysis based on clustering algorithms. As a case study we present the Muon Portal project, which aims to build and operate a dedicated particle detector for the inspection of harbor containers to hinder the smuggling of nuclear materials. Clustering techniques, working directly on scattering points, help to detect the presence of suspicious items inside the container, acting, as will be shown, as a filter for a preliminary analysis of the data.
[Cluster analysis in biomedical researches].

PubMed

Akopov, A S; Moskovtsev, A A; Dolenko, S A; Savina, G D

2013-01-01

Cluster analysis is one of the most popular methods for the analysis of multi-parameter data. Cluster analysis reveals the internal structure of the data by grouping separate observations according to their degree of similarity. The review provides definitions of the basic concepts of cluster analysis and discusses the most popular clustering algorithms: k-means, hierarchical algorithms, and Kohonen network algorithms. Examples of the use of these algorithms in biomedical research are given.
Differences Between Ward's and UPGMA Methods of Cluster Analysis: Implications for School Psychology.

ERIC Educational Resources Information Center

Hale, Robert L.; Dougherty, Donna

1988-01-01

Compared the efficacy of two methods of cluster analysis, the unweighted pair-groups method using arithmetic averages (UPGMA) and Ward's method, for students grouped on intelligence, achievement, and social adjustment by both clustering methods. Found UPGMA more efficacious based on output, on cophenetic correlation coefficients generated by each…
Cluster Analysis to Identify Possible Subgroups in Tinnitus Patients.

PubMed

van den Berge, Minke J C; Free, Rolien H; Arnold, Rosemarie; de Kleine, Emile; Hofman, Rutger; van Dijk, J Marc C; van Dijk, Pim

2017-01-01

In tinnitus treatment, there is a tendency to shift from a "one size fits all" to a more individual, patient-tailored approach. Insight into the heterogeneity of the tinnitus spectrum might improve the management of tinnitus patients in terms of choice of treatment and identification of patients with severe mental distress. The goal of this study was to identify subgroups in a large group of tinnitus patients. Data were collected from patients with severe tinnitus complaints visiting our tertiary referral tinnitus care group at the University Medical Center Groningen. Patient-reported and physician-reported variables were collected during their visit to our clinic. Cluster analyses were used to characterize subgroups. For the selection of the right variables to enter in the cluster analysis, two approaches were used: (1) variable reduction with principal component analysis and (2) variable selection based on expert opinion. Various variables of 1,783 tinnitus patients were included in the analyses. Cluster analysis (1) included 976 patients and resulted in a four-cluster solution. The effect of external influences was the most discriminative between the groups, or clusters, of patients. The "silhouette measure" of the cluster outcome was low (0.2), indicating a "no substantial" cluster structure. Cluster analysis (2) included 761 patients and resulted in a three-cluster solution, comparable to the first analysis. Again, a "no substantial" cluster structure was found (0.2). Two cluster analyses on a large database of tinnitus patients revealed that clusters of patients are mostly formed by a different response of external influences on their disease. However, both cluster outcomes based on this dataset showed poor stability, suggesting that our tinnitus population comprises a continuum rather than a number of clearly defined subgroups.
Accounting for One-Group Clustering in Effect-Size Estimation

ERIC Educational Resources Information Center

Citkowicz, Martyna; Hedges, Larry V.

2013-01-01

In some instances, intentionally or not, study designs are such that there is clustering in one group but not in the other. This paper describes methods for computing effect size estimates and their variances when there is clustering in only one group and the analysis has not taken that clustering into account. The authors provide the effect size…
An enhanced cluster analysis program with bootstrap significance testing for ecological community analysis

USGS Publications Warehouse

McKenna, J.E.

2003-01-01

The biosphere is filled with complex living patterns and important questions about biodiversity and community and ecosystem ecology are concerned with structure and function of multispecies systems that are responsible for those patterns. Cluster analysis identifies discrete groups within multivariate data and is an effective method of coping with these complexities, but often suffers from subjective identification of groups. The bootstrap testing method greatly improves objective significance determination for cluster analysis. The BOOTCLUS program makes cluster analysis that reliably identifies real patterns within a data set more accessible and easier to use than previously available programs. A variety of analysis options and rapid re-analysis provide a means to quickly evaluate several aspects of a data set. Interpretation is influenced by sampling design and a priori designation of samples into replicate groups, and ultimately relies on the researcher's knowledge of the organisms and their environment. However, the BOOTCLUS program provides reliable, objectively determined groupings of multivariate data.
Scoring clustering solutions by their biological relevance.

PubMed

Gat-Viks, I; Sharan, R; Shamir, R

2003-12-12

A central step in the analysis of gene expression data is the identification of groups of genes that exhibit similar expression patterns. Clustering gene expression data into homogeneous groups was shown to be instrumental in functional annotation, tissue classification, regulatory motif identification, and other applications. Although there is a rich literature on clustering algorithms for gene expression analysis, very few works addressed the systematic comparison and evaluation of clustering results. Typically, different clustering algorithms yield different clustering solutions on the same data, and there is no agreed upon guideline for choosing among them. We developed a novel statistically based method for assessing a clustering solution according to prior biological knowledge. Our method can be used to compare different clustering solutions or to optimize the parameters of a clustering algorithm. The method is based on projecting vectors of biological attributes of the clustered elements onto the real line, such that the ratio of between-groups and within-group variance estimators is maximized. The projected data are then scored using a non-parametric analysis of variance test, and the score's confidence is evaluated. We validate our approach using simulated data and show that our scoring method outperforms several extant methods, including the separation to homogeneity ratio and the silhouette measure. We apply our method to evaluate results of several clustering methods on yeast cell-cycle gene expression data. The software is available from the authors upon request.

Investigating Subtypes of Child Development: A Comparison of Cluster Analysis and Latent Class Cluster Analysis in Typology Creation

ERIC Educational Resources Information Center

DiStefano, Christine; Kamphaus, R. W.

2006-01-01

Two classification methods, latent class cluster analysis and cluster analysis, are used to identify groups of child behavioral adjustment underlying a sample of elementary school children aged 6 to 11 years. Behavioral rating information across 14 subscales was obtained from classroom teachers and used as input for analyses. Both the procedures…
Gathering Real World Evidence with Cluster Analysis for Clinical Decision Support.

PubMed

Xia, Eryu; Liu, Haifeng; Li, Jing; Mei, Jing; Li, Xuejun; Xu, Enliang; Li, Xiang; Hu, Gang; Xie, Guotong; Xu, Meilin

2017-01-01

Clinical decision support systems are information technology systems that assist clinical decision-making tasks and have been shown to enhance clinical performance. Cluster analysis, which groups similar patients together, aims to separate patient cases into phenotypically heterogeneous groups and to define therapeutically homogeneous patient subclasses. Useful as it is, the application of cluster analysis in clinical decision support systems is less often reported. Here, we describe the usage of cluster analysis in clinical decision support systems, by first dividing patient cases into similar groups and then providing diagnosis or treatment suggestions based on the group profiles. This integration provides data for clinical decisions and compiles a wide range of clinical practices to inform the performance of individual clinicians. We also include an example usage of the system under the scenario of blood lipid management in type 2 diabetes. These efforts represent a step toward promoting patient-centered care and enabling precision medicine.
Psychosocial Costs of Racism to Whites: Exploring Patterns through Cluster Analysis

ERIC Educational Resources Information Center

Spanierman, Lisa B.; Poteat, V. Paul; Beer, Amanda M.; Armstrong, Patrick Ian

2006-01-01

Participants (230 White college students) completed the Psychosocial Costs of Racism to Whites (PCRW) Scale. Using cluster analysis, we identified 5 distinct cluster groups on the basis of PCRW subscale scores: the unempathic and unaware cluster contained the lowest empathy scores; the insensitive and afraid cluster consisted of low empathy and…
A Preliminary Study of the Effects of Within-Group Covariance Structure on Recovery in Cluster Analysis. Research Report RR-94-46.

ERIC Educational Resources Information Center

Donoghue, John R.

Monte Carlo studies investigated effects of within-group covariance structure on subgroup recovery by several widely used hierarchical clustering methods. In Study 1, subgroup size, within-group correlation, within-group variance, and distance between subgroup centroids were manipulated. All clustering methods were strongly affected by…
[Cluster analysis applicability to fitness evaluation of cosmonauts on long-term missions of the International space station].

PubMed

Egorov, A D; Stepantsov, V I; Nosovskiĭ, A M; Shipov, A A

2009-01-01

Cluster analysis was applied to evaluate locomotion training (running and running intermingled with walking) of 13 cosmonauts on long-term ISS missions by the parameters of duration (min), distance (m) and intensity (km/h). Based on the results of analyses, the cosmonauts were distributed into three steady groups of 2, 5 and 6 persons. Distance and speed showed a statistical rise (p < 0.03) from group 1 to group 3. Duration of physical locomotion training was not statistically different in the groups (p = 0.125). Therefore, cluster analysis is an adequate method of evaluating fitness of cosmonauts on long-term missions.
Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of LANDSAT digital data

NASA Technical Reports Server (NTRS)

Ballew, G.

1977-01-01

The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.
Customized recommendations for production management clusters of North American automatic milking systems.

PubMed

Tremblay, Marlène; Hess, Justin P; Christenson, Brock M; McIntyre, Kolby K; Smink, Ben; van der Kamp, Arjen J; de Jong, Lisanne G; Döpfer, Dörte

2016-07-01

Automatic milking systems (AMS) are implemented in a variety of situations and environments. Consequently, there is a need to characterize individual farming practices and regional challenges to streamline management advice and objectives for producers. Benchmarking is often used in the dairy industry to compare farms by computing percentile ranks of the production values of groups of farms. Grouping for conventional benchmarking is commonly limited to the use of a few factors such as farms' geographic region or breed of cattle. We hypothesized that herds' production data and management information could be clustered in a meaningful way using cluster analysis and that this clustering approach would yield better peer groups of farms than benchmarking methods based on criteria such as country, region, breed, or breed and region. By applying mixed latent-class model-based cluster analysis to 529 North American AMS dairy farms with respect to 18 significant risk factors, 6 clusters were identified. Each cluster (i.e., peer group) represented unique management styles, challenges, and production patterns. When compared with peer groups based on criteria similar to the conventional benchmarking standards, the 6 clusters better predicted milk produced (kilograms) per robot per day. Each cluster represented a unique management and production pattern that requires specialized advice. For example, cluster 1 farms were those that recently installed AMS robots, whereas cluster 3 farms (the most northern farms) fed high amounts of concentrates through the robot to compensate for low-energy feed in the bunk. In addition to general recommendations for farms within a cluster, individual farms can generate their own specific goals by comparing themselves to farms within their cluster. This is very comparable to benchmarking but adds the specific characteristics of the peer group, resulting in better farm management advice. 
The improvement that cluster analysis allows for is characterized by the multivariable approach and the fact that comparisons between production units can be accomplished within a cluster and between clusters as a choice. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
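The benchmarking idea in this abstract, ranking a farm against its cluster peers rather than against all farms, can be illustrated with a minimal sketch. The farm records, the cluster labels, and the `percentile_rank` helper are all invented for illustration; the mid-rank treatment of ties is one common convention, not necessarily the one the authors used.

```python
from collections import defaultdict


def percentile_rank(value, peers):
    """Percentile rank of value among peers, counting ties as half."""
    below = sum(v < value for v in peers)
    ties = sum(v == value for v in peers)
    return 100.0 * (below + 0.5 * ties) / len(peers)


# (farm id, cluster label, milk in kg per robot per day): invented values.
farms = [
    ("A", 1, 1500.0), ("B", 1, 1600.0), ("C", 1, 1700.0),
    ("D", 3, 1900.0), ("E", 3, 2000.0), ("F", 3, 2100.0),
]

# Group production values by cluster so each farm can be ranked among peers.
by_cluster = defaultdict(list)
for _, cluster, milk in farms:
    by_cluster[cluster].append(milk)

all_milk = [milk for _, _, milk in farms]
overall = percentile_rank(1700.0, all_milk)      # farm C against every farm
within = percentile_rank(1700.0, by_cluster[1])  # farm C against cluster 1 only
print(round(overall, 1), round(within, 1))
```

In this toy data farm C looks below average overall but sits at the top of its own peer group, which is the kind of contrast cluster-based benchmarking is meant to expose.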
Clustering and group selection of multiple criteria alternatives with application to space-based networks.

PubMed

Malakooti, Behnam; Yang, Ziyong

2004-02-01

In many real-world problems, the ranges of consequences of different alternatives differ considerably. In addition, selection of a group of alternatives (instead of only one best alternative) is sometimes necessary. Traditional decision-making approaches treat the set of alternatives with the same method of analysis and selection. In this paper, we propose clustering alternatives into different groups so that different methods of analysis, selection, and implementation can be applied to each group. As an example, consider the selection of a group of functions (or tasks) to be processed by a group of processors. The set of tasks can be grouped according to their similar criteria, and hence each cluster of tasks can be processed by one processor. The selection of the best alternative for each clustered group can be performed using existing methods; however, the process of selecting groups is different from the process of selecting alternatives within a group. We develop theories and procedures for clustering discrete multiple criteria alternatives. We also demonstrate how the set of alternatives is clustered into mutually exclusive groups based on 1) similar features among alternatives; 2) ideal (or most representative) alternatives given by the decision maker; and 3) other preferential information of the decision maker. The clustering of multiple criteria alternatives also has the following advantages. 1) It decreases the set of alternatives to be considered by the decision maker (for example, different decision makers are assigned to different groups of alternatives). 2) It decreases the number of criteria. 3) It may provide a different approach for analyzing multiple decision makers problems. Each decision maker may cluster alternatives differently, and hence, clustering of alternatives may provide a basis for negotiation.
The developed approach is applicable for solving a class of telecommunication networks problems where a set of objects (such as routers, processors, or intelligent autonomous vehicles) are to be clustered into similar groups. Objects are clustered based on several criteria and the decision maker's preferences.
Using cluster analysis to identify phenotypes and validation of mortality in men with COPD.

PubMed

Chen, Chiung-Zuei; Wang, Liang-Yi; Ou, Chih-Ying; Lee, Cheng-Hung; Lin, Chien-Chung; Hsiue, Tzuen-Ren

2014-12-01

Cluster analysis has been proposed to examine phenotypic heterogeneity in chronic obstructive pulmonary disease (COPD). The aim of this study was to use cluster analysis to define COPD phenotypes and validate them by assessing their relationship with mortality. Male subjects with COPD were recruited to identify and validate COPD phenotypes. Seven variables were assessed for their relevance to COPD: age, FEV(1) % predicted, BMI, history of severe exacerbations, mMRC, SpO(2), and Charlson index. COPD groups were identified by cluster analysis and validated prospectively against mortality during a 4-year follow-up. Analysis of 332 COPD subjects identified five clusters, labeled cluster A to cluster E. Assessment of the predictive validity of these clusters showed that cluster E patients had higher all-cause mortality (HR 18.3, p < 0.0001) and respiratory-cause mortality (HR 21.5, p < 0.0001) than those in the other four groups. Cluster E patients also had higher all-cause mortality (HR 14.3, p = 0.0002) and respiratory-cause mortality (HR 10.1, p = 0.0013) than patients in cluster D alone. Severe airflow limitation, many symptoms, and a history of frequent severe exacerbations together defined a novel and distinct clinical phenotype predicting mortality in men with COPD.
Conveyor Performance based on Motor DC 12 Volt Eg-530ad-2f using K-Means Clustering

NASA Astrophysics Data System (ADS)

Arifin, Zaenal; Artini, Sri DP; Much Ibnu Subroto, Imam

2017-04-01

Producing goods in industry requires controlled tools that improve production. Separation has become part of the production process and is carried out according to set criteria to obtain an optimum result. Knowing the performance characteristics of a controlled tool used in the separation process makes it possible to obtain optimum results. Clustering analysis is a popular method for partitioning data into smaller segments. It divides a group of objects into k groups whose members have homogeneous or similar values, with similarity defined by certain criteria. The work in this paper uses the K-Means method to cluster loading data on the performance of a conveyor driven by a 12-volt DC motor (eg-530-2f). This technique gives a complete clustering of the data for a prototype conveyor, driven by a DC motor, that separates goods by height. The parameters involved are voltage, current, and travel time. These parameters give two clusters: an optimal cluster with center 10.50 volts, 0.3 ampere, and 10.58 seconds, and a non-optimal cluster with center 10.88 volts, 0.28 ampere, and 40.43 seconds.
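To make the K-Means step concrete, here is a minimal sketch of Lloyd's algorithm applied to three-parameter readings (voltage, current, travel time). The readings are invented for illustration and are not the paper's measurements; the `kmeans` function and its parameters are likewise assumptions. The features are left unscaled, so travel time dominates the Euclidean distance; in practice one would probably standardize each parameter first.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on tuples of floats; returns (centers, labels)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct observations
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center (Euclidean).
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:  # assignments are stable, so centers are too
            break
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centers, labels


# Invented (voltage V, current A, travel time s) readings, for illustration only.
readings = [
    (10.4, 0.31, 10.2), (10.6, 0.29, 10.9), (10.5, 0.30, 10.6),
    (10.9, 0.28, 40.1), (10.8, 0.27, 40.9), (10.9, 0.29, 40.3),
]
centers, labels = kmeans(readings, k=2)
print(labels)
print(centers)
```

Each returned center is a (voltage, current, time) triple, which is the same form in which the abstract reports its two cluster centers.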
The contribution of psychological factors to recovery after mild traumatic brain injury: is cluster analysis a useful approach?

PubMed

Snell, Deborah L; Surgenor, Lois J; Hay-Smith, E Jean C; Williman, Jonathan; Siegert, Richard J

2015-01-01

Outcomes after mild traumatic brain injury (MTBI) vary, with slow or incomplete recovery for a significant minority. This study examines whether groups of cases with shared psychological factors but with different injury outcomes could be identified using cluster analysis. This is a prospective observational study following 147 adults presenting to a hospital-based emergency department or concussion services in Christchurch, New Zealand. This study examined associations between baseline demographic, clinical, psychological variables (distress, injury beliefs and symptom burden) and outcome 6 months later. A two-step approach to cluster analysis was applied (Ward's method to identify clusters, K-means to refine results). Three meaningful clusters emerged (high-adapters, medium-adapters, low-adapters). Baseline cluster-group membership was significantly associated with outcomes over time. High-adapters appeared recovered by 6-weeks and medium-adapters revealed improvements by 6-months. The low-adapters continued to endorse many symptoms, negative recovery expectations and distress, being significantly at risk for poor outcome more than 6-months after injury (OR (good outcome) = 0.12; CI = 0.03-0.53; p < 0.01). Cluster analysis supported the notion that groups could be identified early post-injury based on psychological factors, with group membership associated with differing outcomes over time. Implications for clinical care providers regarding therapy targets and cases that may benefit from different intensities of intervention are discussed.
Searching for a Gulf War syndrome using cluster analysis.

PubMed

Everitt, B; Ismail, K; David, A S; Wessely, S

2002-11-01

Gulf veterans report medically unexplained symptoms more frequently than non-Gulf veterans do. We examined whether Gulf and non-Gulf veterans could be distinguished by their patterns of symptom reporting. A k-means cluster analysis was applied to 500 randomly sampled veterans from each of three United Kingdom military cohorts: those deployed to the Gulf conflict between 1990 and 1991; those deployed to the Bosnia peacekeeping mission between 1992 and 1997; and military personnel who were in active service but not deployed to the Gulf (Era). Sociodemographic, health variables and scores for ten symptom groups were calculated. The gap statistic indicated the five-group solution as one that provided a particularly informative description of the structure in the data. Cluster 1 consisted of low scores for all symptom groups. Cluster 2 had veterans with highest symptom scores for musculoskeletal symptoms and high scores for psychiatric symptoms. Cluster 3 had high scores for psychiatric symptoms and marginally elevated scores for the remaining nine symptom groups. Cluster 4 had elevated scores for musculoskeletal symptoms only, and cluster 5 was distinguishable from the other clusters in having high scores in all symptom groups, especially psychiatric and musculoskeletal. The findings do not support the existence of a unique syndrome affecting a subgroup of Gulf veterans but emphasize the excess of non-specific self-reported ill health in this group.
Phylogenomic and MALDI-TOF MS Analysis of Streptococcus sinensis HKU4T Reveals a Distinct Phylogenetic Clade in the Genus Streptococcus

PubMed Central

Tse, Herman; Chen, Jonathan H.K.; Tang, Ying; Lau, Susanna K.P.; Woo, Patrick C.Y.

2014-01-01

Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remains inconclusive when single genes are used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the “sanguinis group.” As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the “mitis group.” On the basis of the findings, we propose a novel group, named “sinensis group,” to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. PMID:25331233
Phylogenomic and MALDI-TOF MS analysis of Streptococcus sinensis HKU4T reveals a distinct phylogenetic clade in the genus Streptococcus.

PubMed

Teng, Jade L L; Huang, Yi; Tse, Herman; Chen, Jonathan H K; Tang, Ying; Lau, Susanna K P; Woo, Patrick C Y

2014-10-20

Streptococcus sinensis is a recently discovered human pathogen isolated from blood cultures of patients with infective endocarditis. Its phylogenetic position, as well as those of its closely related species, remains inconclusive when single genes are used for phylogenetic analysis. For example, S. sinensis branched out from members of the anginosus, mitis, and sanguinis groups in the 16S ribosomal RNA gene phylogenetic tree, but it was clustered with members of the anginosus and sanguinis groups when groEL gene sequences were used for analysis. In this study, we sequenced the draft genome of S. sinensis and used a polyphasic approach, including concatenated genes, whole genomes, and matrix-assisted laser desorption ionization-time of flight mass spectrometry to analyze the phylogeny of S. sinensis. The size of the S. sinensis draft genome is 2.06 Mb, with GC content of 42.2%. Phylogenetic analysis using 50 concatenated genes or whole genomes revealed that S. sinensis formed a distinct cluster with Streptococcus oligofermentans and Streptococcus cristatus, and these three streptococci were clustered with the "sanguinis group." As for phylogenetic analysis using hierarchical cluster analysis of the mass spectra of streptococci, S. sinensis also formed a distinct cluster with S. oligofermentans and S. cristatus, but these three streptococci were clustered with the "mitis group." On the basis of the findings, we propose a novel group, named "sinensis group," to include S. sinensis, S. oligofermentans, and S. cristatus, in the Streptococcus genus. Our study also illustrates the power of phylogenomic analyses for resolving ambiguities in bacterial taxonomy. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Hierarchical cluster analysis of progression patterns in open-angle glaucoma patients with medical treatment.

PubMed

Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun

2014-04-29

To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. The study included ninety-five eyes of 95 OAG patients who received medical treatment and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Hierarchical cluster analysis produced two clusters. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used significantly more antiglaucoma eye drops than cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, as evidenced by the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was higher than in the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.
Finding Groups Using Model-based Cluster Analysis: Heterogeneous Emotional Self-regulatory Processes and Heavy Alcohol Use Risk

PubMed Central

Mun, Eun-Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

2010-01-01

Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of non-nested models using the Bayesian Information Criterion (BIC) to compare multiple models and identify the optimum number of clusters. The current study clustered 36 young men and women based on their baseline heart rate (HR) and HR variability (HRV), chronic alcohol use, and reasons for drinking. Two cluster groups were identified and labeled High Alcohol Risk and Normative groups. Compared to the Normative group, individuals in the High Alcohol Risk group had higher levels of alcohol use and more strongly endorsed disinhibition and suppression reasons for use. The High Alcohol Risk group showed significant HRV changes in response to positive and negative emotional and appetitive picture cues, compared to neutral cues. In contrast, the Normative group showed a significant HRV change only to negative cues. Findings suggest that the individuals with autonomic self-regulatory difficulties may be more susceptible to heavy alcohol use and use alcohol for emotional regulation. PMID:18331138
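The BIC-based choice of the number of clusters described above can be sketched as follows. The parameter count assumes a Gaussian mixture with unconstrained (full) covariance matrices, and the log-likelihood values are hypothetical placeholders rather than results from the study; only the sample size of 36 comes from the abstract. The convention used here is BIC = -2 log L + p ln n, where lower is better; some software (for example the mclust package in R) reports the negated quantity, where higher is better.

```python
import math


def n_free_params(k, d):
    """Free parameters of a k-component, d-dimensional full-covariance mixture."""
    weights = k - 1               # mixing proportions sum to one
    means = k * d                 # one mean vector per component
    covs = k * d * (d + 1) // 2   # one symmetric covariance matrix per component
    return weights + means + covs


def bic(log_lik, n_params, n_obs):
    """Bayesian Information Criterion; lower values indicate a better trade-off."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Hypothetical maximized log-likelihoods for 1, 2 and 3 components.
log_liks = {1: -120.0, 2: -100.0, 3: -95.0}
n_obs, d = 36, 2  # 36 participants as in the abstract; d = 2 is illustrative

scores = {k: bic(ll, n_free_params(k, d), n_obs) for k, ll in log_liks.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

With these placeholder values the three-component model fits slightly better than the two-component one, but the penalty for its extra parameters outweighs the gain, so the two-component model is selected.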
A Preliminary Comparison of the Effectiveness of Cluster Analysis Weighting Procedures for Within-Group Covariance Structure.

ERIC Educational Resources Information Center

Donoghue, John R.

A Monte Carlo study compared the usefulness of six variable weighting methods for cluster analysis. Data were 100 bivariate observations from 2 subgroups, generated according to a finite normal mixture model. Subgroup size, within-group correlation, within-group variance, and distance between subgroup centroids were manipulated. Of the clustering…
Using Cluster Bootstrapping to Analyze Nested Data With a Few Clusters.

PubMed

Huang, Francis L

2018-04-01

Cluster randomized trials involving participants nested within intact treatment and control groups are commonly performed in various educational, psychological, and biomedical studies. However, recruiting and retaining intact groups present various practical, financial, and logistical challenges to evaluators, and cluster randomized trials are often performed with a low number of clusters (~20 groups). Although multilevel models are often used to analyze nested data, researchers may be concerned about potentially biased results due to having only a few groups under study. Cluster bootstrapping has been suggested as an alternative procedure when analyzing clustered data, though it has seen very little use in educational and psychological studies. Using a Monte Carlo simulation that varied the number of clusters, average cluster size, and intraclass correlations, we compared standard errors using cluster bootstrapping with those derived using ordinary least squares regression and multilevel models. Results indicate that cluster bootstrapping, though more computationally demanding, can be used as an alternative procedure for the analysis of clustered data when treatment effects at the group level are of primary interest. Supplementary material showing how to perform cluster bootstrapped regressions using R is also provided.
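The core of cluster bootstrapping is that whole clusters, not individual observations, are resampled with replacement. The sketch below illustrates this in Python rather than the R code the paper supplies, and it simplifies in two ways that are assumptions made here: the treatment effect is a difference in cluster-level means instead of a regression coefficient, and clusters are resampled separately within each arm. The outcome values are invented.

```python
import random
import statistics


def arm_effect(treated, control):
    """Difference between arms in the mean of cluster-level means."""
    t = statistics.mean([statistics.mean(y) for y in treated])
    c = statistics.mean([statistics.mean(y) for y in control])
    return t - c


def cluster_bootstrap_se(treated, control, n_boot=2000, seed=1):
    """Bootstrap standard error of arm_effect, resampling whole clusters."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Draw clusters with replacement; members of a cluster stay together.
        boot_t = rng.choices(treated, k=len(treated))
        boot_c = rng.choices(control, k=len(control))
        estimates.append(arm_effect(boot_t, boot_c))
    return statistics.stdev(estimates)


# Each inner list holds the outcomes of one cluster (invented values).
treated = [[12, 14], [15, 15], [13, 17]]
control = [[10, 12], [11, 11], [9, 13]]

effect = arm_effect(treated, control)
se = cluster_bootstrap_se(treated, control)
print(round(effect, 3), round(se, 3))
```

Because each resample keeps a cluster's members together, the within-cluster correlation is preserved in every bootstrap replicate, which is what distinguishes this from an ordinary observation-level bootstrap.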
Classification of patients based on their evaluation of hospital outcomes: cluster analysis following a national survey in Norway

PubMed Central

2013-01-01

Background: A general trend towards positive patient-reported evaluations of hospitals could be taken as a sign that most patients form a homogeneous, reasonably pleased group, and consequently that there is little need for quality improvement. The objective of this study was to explore this assumption by identifying and statistically validating clusters of patients based on their evaluation of outcomes related to overall satisfaction, malpractice and benefit of treatment. Methods: Data were collected using a national patient-experience survey of 61 hospitals in the 4 health regions in Norway during spring 2011. Postal questionnaires were mailed to 23,420 patients after their discharge from hospital. Cluster analysis was performed to identify response clusters of patients, based on their responses to single items about overall patient satisfaction, benefit of treatment and perception of malpractice. Results: Cluster analysis identified six response groups, including one cluster with systematically poorer evaluation across outcomes (18.5% of patients) and one small outlier group (5.3%) with very poor scores across all outcomes. One-way ANOVA with post-hoc tests showed that most differences between the six response groups on the three outcome items were significant. The response groups were significantly associated with nine patient-experience indicators (p < 0.001), and all groups were significantly different from each of the other groups on a majority of the patient-experience indicators. Clusters were significantly associated with age, education, self-perceived health, gender, and the extent to which patients wrote open comments in the questionnaire. Conclusions: The study identified five response clusters with distinct patient-reported outcome scores, in addition to a heterogeneous outlier group with very poor scores across all outcomes.
The outlier group and the cluster with systematically poorer evaluation across outcomes comprised almost one-quarter of all patients, clearly demonstrating the need to tailor quality initiatives and improve patient-perceived quality in hospitals. More research on patient clustering in patient evaluation is needed, as well as standardization of methodology to increase comparability across studies. PMID:23433450
Alteration mapping at Goldfield, Nevada, by cluster and discriminant analysis of Landsat digital data. [mapping of hydrothermally altered volcanic rocks]

NASA Technical Reports Server (NTRS)

Ballew, G.

1977-01-01

The ability of Landsat multispectral digital data to differentiate among 62 combinations of rock and alteration types at the Goldfield mining district of Western Nevada was investigated by using statistical techniques of cluster and discriminant analysis. Multivariate discriminant analysis was not effective in classifying each of the 62 groups, with classification results essentially the same whether data of four channels alone or combined with six ratios of channels were used. Bivariate plots of group means revealed a cluster of three groups including mill tailings, basalt and all other rock and alteration types. Automatic hierarchical clustering based on the fourth dimensional Mahalanobis distance between group means of 30 groups having five or more samples was performed using Johnson's HICLUS program. The results of the cluster analysis revealed hierarchies of mill tailings vs. natural materials, basalt vs. non-basalt, highly reflectant rocks vs. other rocks and exclusively unaltered rocks vs. predominantly altered rocks. The hierarchies were used to determine the order in which sets of multiple discriminant analyses were to be performed and the resulting discriminant functions were used to produce a map of geology and alteration which has an overall accuracy of 70 percent for discriminating exclusively altered rocks from predominantly altered rocks.

Classification of Forefoot Plantar Pressure Distribution in Persons with Diabetes: A Novel Perspective for the Mechanical Management of Diabetic Foot?

PubMed Central

Deschamps, Kevin; Matricali, Giovanni Arnoldo; Roosen, Philip; Desloovere, Kaat; Bruyninckx, Herman; Spaepen, Pieter; Nobels, Frank; Tits, Jos; Flour, Mieke; Staes, Filip

2013-01-01

Background: The aim of this study was to identify groups of subjects with similar patterns of forefoot loading and verify if specific groups of patients with diabetes could be isolated from non-diabetics. Methodology/Principal Findings: Ninety-seven patients with diabetes and 33 control participants between 45 and 70 years were prospectively recruited in two Belgian Diabetic Foot Clinics. Barefoot plantar pressure measurements were recorded and subsequently analysed using a semi-automatic total mapping technique. K-means cluster analysis was applied to relative regional impulses of six forefoot segments in order to pursue a classification for the control group separately, the diabetic group separately and both groups together. Cluster analysis led to identification of three distinct groups when considering only the control group. For the diabetic group, and the computation considering both groups together, four distinct groups were isolated. Compared to the cluster analysis of the control group, an additional forefoot loading pattern was identified. This group comprised diabetic feet only. The relevance of the reported clusters was supported by ANOVA statistics indicating significant differences between different regions of interest and different clusters. Conclusions/Significance: A new era seems to be emerging in diabetic foot medicine, one that embraces the classification of diabetic patients according to their biomechanical profile. Classification of the plantar pressure distribution has the potential to provide a means to determine mechanical interventions for the prevention and/or treatment of the diabetic foot. PMID:24278219
Identification of five chronic obstructive pulmonary disease subgroups with different prognoses in the ECLIPSE cohort using cluster analysis.

PubMed

Rennard, Stephen I; Locantore, Nicholas; Delafont, Bruno; Tal-Singer, Ruth; Silverman, Edwin K; Vestbo, Jørgen; Miller, Bruce E; Bakke, Per; Celli, Bartolomé; Calverley, Peter M A; Coxson, Harvey; Crim, Courtney; Edwards, Lisa D; Lomas, David A; MacNee, William; Wouters, Emiel F M; Yates, Julie C; Coca, Ignacio; Agustí, Alvar

2015-03-01

Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease that likely includes clinically relevant subgroups. To identify subgroups of COPD in ECLIPSE (Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints) subjects using cluster analysis and to assess clinically meaningful outcomes of the clusters during 3 years of longitudinal follow-up. Factor analysis was used to reduce 41 variables determined at recruitment in 2,164 patients with COPD to 13 main factors, and the variables with the highest loading were used for cluster analysis. Clusters were evaluated for their relationship with clinically meaningful outcomes during 3 years of follow-up. The relationships among clinical parameters were evaluated within clusters. Five subgroups were distinguished using cross-sectional clinical features. These groups differed regarding outcomes. Cluster A included patients with milder disease and had fewer deaths and hospitalizations. Cluster B had less systemic inflammation at baseline but had notable changes in health status and emphysema extent. Cluster C had many comorbidities, evidence of systemic inflammation, and the highest mortality. Cluster D had low FEV1, severe emphysema, and the highest exacerbation and COPD hospitalization rate. Cluster E was intermediate for most variables and may represent a mixed group that includes further clusters. The relationships among clinical variables within clusters differed from that in the entire COPD population. Cluster analysis using baseline data in ECLIPSE identified five COPD subgroups that differ in outcomes and inflammatory biomarkers and show different relationships between clinical parameters, suggesting the clusters represent clinically and biologically different subtypes of COPD.
A Survey of Popular R Packages for Cluster Analysis

ERIC Educational Resources Information Center

Flynt, Abby; Dean, Nema

2016-01-01

Cluster analysis is a set of statistical methods for discovering new group/class structure when exploring data sets. This article reviews the following popular libraries/commands in the R software language for applying different types of cluster analysis: from the stats library, the kmeans, and hclust functions; the mclust library; the poLCA…
Using Cluster Analysis for Data Mining in Educational Technology Research

ERIC Educational Resources Information Center

Antonenko, Pavlo D.; Toy, Serkan; Niederhauser, Dale S.

2012-01-01

Cluster analysis is a group of statistical methods that has great potential for analyzing the vast amounts of web server-log data to understand student learning from hyperlinked information resources. In this methodological paper we provide an introduction to cluster analysis for educational technology researchers and illustrate its use through…
Analysis of the mutations induced by conazole fungicides in vivo.

PubMed

Ross, Jeffrey A; Leavitt, Sharon A

2010-05-01

The mouse liver tumorigenic conazole fungicides triadimefon and propiconazole have previously been shown to be in vivo mouse liver mutagens in the Big Blue transgenic mutation assay when administered in feed at tumorigenic doses, whereas the non-tumorigenic conazole myclobutanil was not mutagenic. DNA sequencing of the mutants recovered from each treatment group as well as from animals receiving control diet was conducted to gain additional insight into the mode of action by which tumorigenic conazoles induce mutations. Relative dinucleotide mutabilities (RDMs) were calculated for each possible dinucleotide in each treatment group and then examined by multivariate statistical analysis techniques. Unsupervised hierarchical clustering analysis of RDM values segregated two independent control groups together, along with the non-tumorigen myclobutanil. The two tumorigenic conazoles clustered together in a distinct grouping. Partitioning around medoids of RDM values into two clusters also grouped triadimefon and propiconazole together in one cluster and the two control groups and myclobutanil together in a second cluster. Principal component analysis of these results identifies two components that account for 88.3% of the variability in the points. Taken together, these results are consistent with the hypothesis that propiconazole- and triadimefon-induced mutations do not represent clonal expansion of background mutations and support the hypothesis that they arise from the accumulation of reactive electrophilic metabolic intermediates within the liver in vivo.
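Clustering around medoids picks actual observations as cluster representatives and minimizes the total distance from each observation to its nearest medoid. With only a handful of profiles, that objective can be minimized by exhaustive search, which is what the sketch below does; this is a brute-force stand-in for the swap-based PAM algorithm, not PAM itself. The two-dimensional profiles are invented and unlabeled, and `k_medoids_exhaustive` is a hypothetical helper name.

```python
import math
from itertools import combinations


def k_medoids_exhaustive(points, k):
    """Try every set of k medoids and keep the one with the lowest total distance."""
    best_cost, best_medoids = None, None
    for medoids in combinations(range(len(points)), k):
        # Cost: each point contributes its distance to the nearest medoid.
        cost = sum(min(math.dist(p, points[m]) for m in medoids) for p in points)
        if best_cost is None or cost < best_cost:
            best_cost, best_medoids = cost, medoids
    # Label each point with the index of its nearest chosen medoid.
    labels = [
        min(best_medoids, key=lambda m: math.dist(p, points[m])) for p in points
    ]
    return best_medoids, labels


# Five invented two-dimensional profiles, for illustration only.
profiles = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.18), (0.90, 0.80), (0.85, 0.90)]
medoids, labels = k_medoids_exhaustive(profiles, k=2)
print(medoids, labels)
```

Exhaustive search is only practical for very small data sets such as this one; for larger data a swap-based or alternating heuristic would be needed.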
Analysis of Tropical Cyclone Tracks in the North Indian Ocean

NASA Astrophysics Data System (ADS)

Patwardhan, A.; Paliwal, M.; Mohapatra, M.

2011-12-01

Cyclones are regarded as one of the most dangerous meteorological phenomena of the tropical region. The probability of landfall of a tropical cyclone depends on its movement (trajectory). Analysis of trajectories of tropical cyclones could be useful for identifying potentially predictable characteristics. There is a long history of analysis of tropical cyclone tracks. A common approach is to use different clustering techniques to group the cyclone tracks on the basis of certain characteristics. Various clustering methods have been used to study tropical cyclones in different ocean basins, such as the western North Pacific Ocean (Elsner and Liu, 2003; Camargo et al., 2007) and the North Atlantic Ocean (Elsner, 2003; Gaffney et al., 2007; Nakamura et al., 2009). In this study, tropical cyclone tracks in the North Indian Ocean basin for the period 1961-2010 have been analyzed and grouped into clusters based on their spatial characteristics. A tropical cyclone trajectory is approximated as an open curve and described by its first two moments. The resulting clusters have different centroid locations and also differently shaped variance ellipses. These track characteristics are then used in the standard clustering algorithms, which allows the whole track shape, length, and location to be incorporated into the clustering methodology. The resulting clusters have different genesis locations and trajectory shapes. We have also examined characteristics such as life span, maximum sustained wind speed, landfall, and seasonality, many of which are significantly different across the identified clusters. The clustering approach groups cyclones with higher maximum wind speed and longest life span into one cluster. Another cluster includes short-duration cyclonic events that are mostly deep depressions and significant for rainfall over Eastern and Central India. The clustering approach is likely to prove useful for analysis of events of significance with regard to impacts.
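Describing a track by its first two moments amounts to reducing it to a centroid plus a variance-covariance summary, which gives every track the same fixed-length feature vector regardless of how many positions were recorded. The sketch below shows one way to compute such features, assuming unweighted (longitude, latitude) fixes; the abstract does not say whether or how the points are weighted, and the `track_moments` name and the example track are invented.

```python
def track_moments(track):
    """Return (mean_x, mean_y, var_x, var_y, cov_xy) for a list of (x, y) fixes."""
    n = len(track)
    # First moment: the centroid of the track.
    mean_x = sum(x for x, _ in track) / n
    mean_y = sum(y for _, y in track) / n
    # Second central moments: spread along each axis and their covariance.
    var_x = sum((x - mean_x) ** 2 for x, _ in track) / n
    var_y = sum((y - mean_y) ** 2 for _, y in track) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in track) / n
    return (mean_x, mean_y, var_x, var_y, cov_xy)


# An invented track of (longitude, latitude) fixes, for illustration only.
track = [(88.0, 12.0), (87.0, 14.0), (86.0, 16.0), (85.0, 18.0)]
features = track_moments(track)
print(features)
```

The five numbers per track can then be passed to any standard clustering routine, so that track location, length, and orientation all influence the grouping.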
Investigating Faculty Familiarity with Assessment Terminology by Applying Cluster Analysis to Interpret Survey Data

ERIC Educational Resources Information Center

Raker, Jeffrey R.; Holme, Thomas A.

2014-01-01

A cluster analysis was conducted with a set of survey data on chemistry faculty familiarity with 13 assessment terms. Cluster groupings suggest a high, middle, and low overall familiarity with the terminology and an independent high and low familiarity with terms related to fundamental statistics. The six resultant clusters were found to be…
Multilevel Analysis Methods for Partially Nested Cluster Randomized Trials

ERIC Educational Resources Information Center

Sanders, Elizabeth A.

2011-01-01

This paper explores multilevel modeling approaches for 2-group randomized experiments in which a treatment condition involving clusters of individuals is compared to a control condition involving only ungrouped individuals, otherwise known as partially nested cluster randomized designs (PNCRTs). Strategies for comparing groups from a PNCRT in the…
Extended phenotype and clinical subgroups in unilateral Meniere disease: A cross-sectional study with cluster analysis.

PubMed

Frejo, L; Martin-Sanz, E; Teggi, R; Trinidad, G; Soto-Varela, A; Santos-Perez, S; Manrique, R; Perez, N; Aran, I; Almeida-Branco, M S; Batuecas-Caletrio, A; Fraile, J; Espinosa-Sanchez, J M; Perez-Guillen, V; Perez-Garrigues, H; Oliva-Dominguez, M; Aleman, O; Benitez, J; Perez, P; Lopez-Escamez, J A

2017-12-01

To define clinical subgroups by cluster analysis in patients with unilateral Meniere disease (MD) and to compare them with the clinical subgroups found in bilateral MD. A cross-sectional study with a two-step cluster analysis. A tertiary referral multicenter study. Nine hundred and eighty-eight adult patients with unilateral MD. best predictors to define clinical subgroups with potential different aetiologies. We established five clusters in unilateral MD. Group 1 is the most frequently found, includes 53% of patients, and it is defined as the sporadic, classic MD without migraine and without autoimmune disorder (AD). Group 2 is found in 8% of patients, and it is defined by hearing loss, which antedates the vertigo episodes by months or years (delayed MD), without migraine or AD in most of cases. Group 3 involves 13% of patients, and it is considered familial MD, while group 4, which includes 15% of patients, is linked to the presence of migraine in all cases. Group 5 is found in 11% of patients and is defined by a comorbid AD. We found significant differences in the distribution of AD in clusters 3, 4 and 5 between patients with uni- and bilateral MD. Cluster analysis defines clinical subgroups in MD, and it extends the phenotype beyond audiovestibular symptoms. This classification will help to improve the phenotyping in MD and facilitate the selection of patients for randomised clinical trials. © 2017 John Wiley & Sons Ltd.
Improving estimation of kinetic parameters in dynamic force spectroscopy using cluster analysis

NASA Astrophysics Data System (ADS)

Yen, Chi-Fu; Sivasankar, Sanjeevi

2018-03-01

Dynamic Force Spectroscopy (DFS) is a widely used technique to characterize the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and substrate, are ruptured at different stress rates by varying the speed at which the AFM-tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (koff) and the width of the potential energy barrier (xβ) are extracted. However, due to large uncertainties in determining mean forces and loading rates of the groups, errors in the estimated koff and xβ can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of koff and xβ, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, particularly K-means clustering, is very effective in improving the accuracy of parameter estimation, particularly when the number of unbinding events are limited and not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
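The fitting step described in this abstract can be sketched under one assumption: the abstract does not name the "established models", so the widely used Bell-Evans relation, F = (kBT/xβ) ln(r xβ / (koff kBT)), is used here as a stand-in. Strictly, that relation describes the most probable rupture force, whereas the abstract fits group mean forces. The function name `fit_bell_evans`, the units (pN, nm, pN/s), and the thermal energy value for roughly 298 K are illustrative choices.

```python
import math

KBT = 4.114  # thermal energy in pN*nm at roughly 298 K (assumed temperature)


def fit_bell_evans(loading_rates, forces, kbt=KBT):
    """Least-squares fit of force against ln(loading rate).

    Bell-Evans gives F = a*ln(r) + b with a = kbt/x_beta and
    b = a*ln(1/(k_off*a)), so x_beta = kbt/a and k_off = exp(-b/a)/a.
    Returns (k_off, x_beta).
    """
    xs = [math.log(r) for r in loading_rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(forces) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, forces))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    x_beta = kbt / slope
    k_off = math.exp(-intercept / slope) / slope
    return k_off, x_beta


# Synthetic, noise-free group means generated from known parameters.
rates = [100.0, 1000.0, 10000.0]
forces = [(KBT / 0.5) * math.log(r * 0.5 / (1.0 * KBT)) for r in rates]
print(fit_bell_evans(rates, forces))
```

In this picture, the grouping of rupture events (by pulling speed or by cluster analysis) determines the (loading rate, force) pairs that enter the fit, which is why the grouping quality affects the estimated koff and xβ.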
The impact of catchment source group classification on the accuracy of sediment fingerprinting outputs.

PubMed

Pulley, Simon; Foster, Ian; Collins, Adrian L

2017-06-01

The objective classification of sediment source groups is at present an under-investigated aspect of source tracing studies, which has the potential to statistically improve discrimination between sediment sources and reduce uncertainty. This paper investigates this potential using three different source group classification schemes. The first classification scheme was simple surface and subsurface groupings (Scheme 1). The tracer signatures were then used in a two-step cluster analysis to identify the sediment source groupings naturally defined by the tracer signatures (Scheme 2). The cluster source groups were then modified by splitting each one into a surface and subsurface component to suit catchment management goals (Scheme 3). The schemes were tested using artificial mixtures of sediment source samples. Controlled corruptions were made to some of the mixtures to mimic the potential causes of tracer non-conservatism present when using tracers in natural fluvial environments. It was determined how accurately the known proportions of sediment sources in the mixtures were identified after unmixing modelling using the three classification schemes. The cluster analysis derived source groups (2) significantly increased tracer variability ratios (inter-/intra-source group variability) (up to 2122%, median 194%) compared to the surface and subsurface groupings (1). As a result, the composition of the artificial mixtures was identified an average of 9.8% more accurately on the 0-100% contribution scale. It was found that the cluster groups could be reclassified into a surface and subsurface component (3) with no significant increase in composite uncertainty (a 0.1% increase over Scheme 2). The far smaller effects of simulated tracer non-conservatism for the cluster analysis based schemes (2 and 3) were primarily attributed to the increased inter-group variability producing a far larger sediment source signal than the non-conservatism noise (1). Modified cluster analysis based classification methods have the potential to reduce composite uncertainty significantly in future source tracing studies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Subtypes of female juvenile offenders: a cluster analysis of the Millon Adolescent Clinical Inventory.

PubMed

Stefurak, Tres; Calhoun, Georgia B

2007-01-01

The current study sought to explore subtypes of adolescents within a sample of female juvenile offenders. Using the Millon Adolescent Clinical Inventory with 101 female juvenile offenders, a two-step cluster analysis was performed beginning with a Ward's method hierarchical cluster analysis followed by a K-Means iterative partitioning cluster analysis. The results suggest an optimal three-cluster solution, with cluster profiles leading to the following group labels: Externalizing Problems, Depressed/Interpersonally Ambivalent, and Anxious Prosocial. Analyses along the factors of age, race, offense typology and offense chronicity were conducted to further understand the nature of the clusters found. Only the effect for race was significant, with the Anxious Prosocial and Depressed/Interpersonally Ambivalent clusters appearing disproportionately comprised of African American girls. To establish external validity, clusters were compared across scales of the Behavioral Assessment System for Children - Self Report of Personality, and corroborative distinctions between clusters were found here.
A measure for objects clustering in principal component analysis biplot: A case study in inter-city buses maintenance cost data

NASA Astrophysics Data System (ADS)

Ginanjar, Irlandia; Pasaribu, Udjianna S.; Indratno, Sapto W.

2017-03-01

This article presents the application of the principal component analysis (PCA) biplot for the needs of data mining. This article aims to simplify and objectify the methods for objects clustering in PCA biplot. The novelty of this paper is to get a measure that can be used to objectify the objects clustering in PCA biplot. Orthonormal eigenvectors are the coefficients of a principal component model and represent an association between principal components and initial variables. The existence of the association is a valid ground for objects clustering based on principal axes value; thus, if m principal axes are used in the PCA, then the objects can be classified into 2m clusters. The inter-city buses are clustered based on maintenance costs data by using two principal axes PCA biplot. The buses are clustered into four groups. The first group is the buses with high maintenance costs, especially for lube, and brake canvass. The second group is the buses with high maintenance costs, especially for tire, and filter. The third group is the buses with low maintenance costs, especially for lube, and brake canvass. The fourth group is buses with low maintenance costs, especially for tire, and filter.
The hierarchical cluster analysis of oral health attitudes and behaviour using the Hiroshima University--Dental Behavioural Inventory (HU-DBI) among final year dental students in 17 countries.

PubMed

Komabayashi, Takashi; Kawamura, Makoto; Kim, Kang-Ju; Wright, Fredrick A C; Declerck, Dominique; Goiâs, Maria do Carmo Matias Freire; Hu, De-Yu; Honkala, Eino; Lévy, Gérard; Kalwitzki, Matthias; Polychronopoulou, Argy; Yip, Kevin Hak-Kong; Eli, Ilana; Kinirons, Martin J; Petti, Stefano; Srisilapanan, Patcharawan; Kwan, Stella Y L; Centore, Linda S

2006-10-01

To explore and describe international oral health attitudes/ behaviours among final year dental students. Validated translated versions of the Hiroshima University-Dental Behavioural Inventory (HU-DBI) questionnaire were administered to 1,096 final-year dental students in 17 countries. Hierarchical cluster analysis was conducted within the data to detect patterns and groupings. The overall response rate was 72%. The cluster analysis identified two main groups among the countries. Group 1 consisted of twelve countries: one Oceanic (Australia), one Middle-Eastern (Israel), seven European (Northern Ireland, England, Finland, Greece, Germany, Italy, and France) and three Asian (Korea, Thailand and Malaysia) countries. Group 2 consisted of five countries: one South American (Brazil), one European (Belgium) and three Asian (China, Indonesia and Japan) countries. The percentages of 'agree' responses in three HU-DBI questionnaire items were significantly higher in Group 2 than in Group 1. They include: "I worry about the colour of my teeth."; "I have noticed some white sticky deposits on my teeth."; and "I am bothered by the colour of my gums." Grouping the countries into international clusters yielded useful information for dentistry and dental education.
The Organization of Children's Same-Sex Peer Relationships.

ERIC Educational Resources Information Center

Benenson, Joyce; Apostoleris, Nicholas; Parnass, Jodi

1998-01-01

Uses a sociometric analysis to explore the differential organization of boys' and girls' peer groups. Finds that boys structure their peer groups by creating a large central cluster composed of smaller integrated clusters, whereas girls form small clusters unrelated to one another. Nevertheless, girls are aware of and sensitive to the status of…
CLUSFAVOR 5.0: hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles

PubMed Central

Peterson, Leif E

2002-01-01

CLUSFAVOR (CLUSter and Factor Analysis with Varimax Orthogonal Rotation) 5.0 is a Windows-based computer program for hierarchical cluster and principal-component analysis of microarray-based transcriptional profiles. CLUSFAVOR 5.0 standardizes input data; sorts data according to gene-specific coefficient of variation, standard deviation, average and total expression, and Shannon entropy; performs hierarchical cluster analysis using nearest-neighbor, unweighted pair-group method using arithmetic averages (UPGMA), or furthest-neighbor joining methods, and Euclidean, correlation, or jack-knife distances; and performs principal-component analysis. PMID:12184816
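Two of the gene-level sorting statistics named in this abstract can be sketched as follows. The abstract does not say how CLUSFAVOR computes them, so this sketch assumes the sample standard deviation for the coefficient of variation and, for Shannon entropy, nonnegative expression values normalized to proportions. Function names and the toy profiles are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean must be nonzero)."""
    return statistics.stdev(values) / statistics.mean(values)


def shannon_entropy(values):
    """Entropy in bits of a nonnegative profile normalized to sum to 1.

    Zero entries contribute nothing, following the convention 0*log(0) = 0.
    """
    total = sum(values)
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


# Toy profiles (made up): sort genes from most to least variable.
profiles = {"geneA": [2.0, 4.0, 6.0], "geneB": [5.0, 5.1, 4.9]}
ranked = sorted(profiles, key=lambda g: coefficient_of_variation(profiles[g]),
                reverse=True)
print(ranked)
```

A flat profile has high entropy and low coefficient of variation, so either statistic can be used to push uninformative genes to the end of the list before clustering.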
A Systems Biology Approach for Identifying Hepatotoxicant Groups Based on Similarity in Mechanisms of Action and Chemical Structure.

PubMed

Hebels, Dennie G A J; Rasche, Axel; Herwig, Ralf; van Westen, Gerard J P; Jennen, Danyel G J; Kleinjans, Jos C S

2016-01-01

When evaluating compound similarity, addressing multiple sources of information to reach conclusions about common pharmaceutical and/or toxicological mechanisms of action is a crucial strategy. In this chapter, we describe a systems biology approach that incorporates analyses of hepatotoxicant data for 33 compounds from three different sources: a chemical structure similarity analysis based on the 3D Tanimoto coefficient, a chemical structure-based protein target prediction analysis, and a cross-study/cross-platform meta-analysis of in vitro and in vivo human and rat transcriptomics data derived from public resources (i.e., the diXa data warehouse). Hierarchical clustering of the outcome scores of the separate analyses did not result in a satisfactory grouping of compounds considering their known toxic mechanism as described in literature. However, a combined analysis of multiple data types may hypothetically compensate for missing or unreliable information in any of the single data types. We therefore performed an integrated clustering analysis of all three data sets using the R-based tool iClusterPlus. This indeed improved the grouping results. The compound clusters that were formed by means of iClusterPlus represent groups that show similar gene expression while simultaneously integrating a similarity in structure and protein targets, which corresponds much better with the known mechanism of action of these toxicants. Using an integrative systems biology approach may thus overcome the limitations of the separate analyses when grouping liver toxicants sharing a similar mechanism of toxicity.
Fatality rate of pedestrians and fatal crash involvement rate of drivers in pedestrian crashes: a case study of Iran.

PubMed

Kashani, Ali Tavakoli; Besharati, Mohammad Mehdi

2017-06-01

The aim of this study was to uncover patterns of pedestrian crashes. In the first stage, 34,178 pedestrian-involved crashes that occurred in Iran during a four-year period were grouped into homogeneous clusters using a clustering analysis. Next, some in-cluster and inter-cluster crash patterns were analysed. The clustering analysis yielded six pedestrian crash groups. Car/van/pickup crashes on rural roads as well as heavy vehicle crashes were found to be less frequent but more likely to be fatal compared to other crash clusters. In addition, after controlling for crash frequency in each cluster, it was found that the fatality rate of each pedestrian age group as well as the fatal crash involvement rate of each driver age group varies across the six clusters. The results of the present study have some policy implications, including promoting pedestrian safety training sessions for heavy vehicle drivers, imposing limitations on elderly heavy vehicle drivers, and reinforcing penalties for drivers under 19 and motorcyclists. In addition, road safety campaigns in rural areas may be promoted to inform people about the higher fatality rate of pedestrians on rural roads. The crash patterns uncovered in this study might also be useful for prioritizing future pedestrian safety research areas.
Cluster analysis identifies three urodynamic patterns in patients with orthotopic neobladder reconstruction.

PubMed

Kim, Kwang Hyun; Yoon, Hyun Suk; Song, Wan; Choo, Hee Jung; Yoon, Hana; Chung, Woo Sik; Sim, Bong Suk; Lee, Dong Hyeon

2017-01-01

To classify patients with orthotopic neobladder based on urodynamic parameters using cluster analysis and to characterize the voiding function of each group. From January 2012 to November 2015, 142 patients with bladder cancer underwent radical cystectomy and Studer neobladder reconstruction at our institute. Of the 142 patients, 103 with complete urodynamic data and information on urinary functional outcomes were included in this study. K-means clustering was performed with urodynamic parameters which included maximal cystometric capacity, residual volume, maximal flow rate, compliance, and detrusor pressure at maximum flow rate. Three groups emerged by cluster analysis. Urodynamic parameters and urinary function outcomes were compared between three groups. Group 1 (n = 44) had ideal urodynamic parameters with a mean maximal bladder capacity of 513.3 ml and mean residual urine volume of 33.1 ml. Group 2 (n = 42) was characterized by small bladder capacity with low compliance. Patients in group 2 had higher rates of daytime incontinence and nighttime incontinence than patients in group 1. Group 3 (n = 17) was characterized by large residual urine volume with high compliance. When we examined gender differences in urodynamics and functional outcomes, residual urine volume and the rate of daytime incontinence were only marginally significant. However, females were significantly more likely to belong to group 2 or 3 (P = 0.003). In multivariate analysis to identify factors associated with group 1 which has the most ideal urodynamic pattern, age (OR 0.95, P = 0.017) and male gender (OR 7.57, P = 0.003) were identified as significant factors. While patients with ileal neobladder present with various voiding symptoms, three urodynamic patterns were identified by cluster analysis. Approximately half of patients had ideal urodynamic parameters. The other two groups were characterized by large residual urine and small capacity bladder with low compliance. Young age and male gender appear to have a favorable impact on urodynamic and voiding outcomes in patients undergoing orthotopic neobladder reconstruction.
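K-means on clinical parameters measured in different units, as in this abstract, can be sketched as below. The abstract does not say whether the variables were standardized, so the z-scoring step is an assumption, as are the function names, the caller-supplied starting centroids, and the toy numbers (which are made up and are not patient data).

```python
import statistics


def zscore_columns(rows):
    """Standardize each column to mean 0 and unit (population) SD.

    Assumes no column is constant, otherwise the SD would be zero.
    """
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def kmeans(points, centroids, iterations=100):
    """Plain Lloyd's algorithm starting from caller-supplied centroids."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up rows of (capacity in ml, residual volume in ml).
rows = [[500, 30], [520, 35], [250, 40], [240, 45], [600, 300], [620, 320]]
z = zscore_columns(rows)
labels, _ = kmeans(z, centroids=[z[0], z[2], z[4]])
print(labels)
```

Without standardization, the variable with the largest numeric range would dominate the Euclidean distances, which is the reason for the extra step in this sketch.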
Cluster analysis of the hot subdwarfs in the PG survey

NASA Technical Reports Server (NTRS)

Thejll, Peter; Charache, Darryl; Shipman, Harry L.

1989-01-01

Application of cluster analysis to the hot subdwarfs in the Palomar Green (PG) survey of faint blue high-Galactic-latitude objects is assessed, with emphasis on data noise and the number of clusters to subdivide the data into. The data used in the study are presented, and cluster analysis, using the CLUSTAN program, is applied to it. Distances are calculated using the Euclidean formula, and clustering is done by Ward's method. The results are discussed, and five groups representing natural divisions of the subdwarfs in the PG survey are presented.
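Ward's method with Euclidean distances, as named in this abstract, can be sketched in a few lines. This is a simple illustration and not the CLUSTAN implementation: at each step it merges the pair of clusters whose union gives the smallest increase in within-cluster sum of squares, n_a n_b / (n_a + n_b) times the squared distance between centroids. The function name and toy points are hypothetical.

```python
def ward_clusters(points, n_clusters):
    """Agglomerate `points` into `n_clusters` groups using Ward's criterion.

    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                na, nb = len(ma), len(mb)
                d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = na * nb / (na + nb) * d2  # increase in within-cluster SS
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        # The merged centroid is the size-weighted mean of the two centroids.
        merged = (ma + mb, [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)])
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)
    return [sorted(members) for members, _ in clusters]


# Toy two-dimensional points (made up).
pts = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(sorted(ward_clusters(pts, 3)))
```

The exhaustive pair search is cubic in the number of objects, which is acceptable for a sketch but not for large catalogues.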

Outcome-Driven Cluster Analysis with Application to Microarray Data.

PubMed

Hsu, Jessie J; Finkelstein, Dianne M; Schoenfeld, David A

2015-01-01

One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that are informed by the recovery outcome.
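The random effects model described in this abstract can be written out as a sketch. The notation is assumed rather than taken from the paper: i indexes observations (patients), j indexes genes, c(j) is the cluster to which gene j is assigned, u_{ik} is the random effect for observation i and cluster k, ε_{ij} is the independent error term, and β_k are the outcome coefficients. The abstract does not state distributional assumptions or whether the outcome equation carries its own error term, so none are shown.

```latex
x_{ij} = u_{i,c(j)} + \varepsilon_{ij},
\qquad
y_i = \sum_{k=1}^{K} \beta_k \, u_{ik}
```

Under this reading, genes in the same cluster share the same u_{i,c(j)} and are therefore correlated across observations, and they influence the outcome y_i only through that shared effect.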
Cluster analysis of the national weight control registry to identify distinct subgroups maintaining successful weight loss.

PubMed

Ogden, Lorraine G; Stroebele, Nanette; Wyatt, Holly R; Catenacci, Victoria A; Peters, John C; Stuht, Jennifer; Wing, Rena R; Hill, James O

2012-10-01

The National Weight Control Registry (NWCR) is the largest ongoing study of individuals successful at maintaining weight loss; the registry enrolls individuals maintaining a weight loss of at least 13.6 kg (30 lb) for a minimum of 1 year. The current report uses multivariate latent class cluster analysis to identify unique clusters of individuals within the NWCR that have distinct experiences, strategies, and attitudes with respect to weight loss and weight loss maintenance. The cluster analysis considers weight and health history, weight control behaviors and strategies, effort and satisfaction with maintaining weight, and psychological and demographic characteristics. The analysis includes 2,228 participants enrolled between 1998 and 2002. Cluster 1 (50.5%) represents a weight-stable, healthy, exercise conscious group who are very satisfied with their current weight. Cluster 2 (26.9%) has continuously struggled with weight since childhood; they rely on the greatest number of resources and strategies to lose and maintain weight, and report higher levels of stress and depression. Cluster 3 (12.7%) represents a group successful at weight reduction on the first attempt; they were least likely to be overweight as children, are maintaining the longest duration of weight loss, and report the least difficulty maintaining weight. Cluster 4 (9.9%) represents a group less likely to use exercise to control weight; they tend to be older, eat fewer meals, and report more health problems. Further exploration of the unique characteristics of these clusters could be useful for tailoring future weight loss and weight maintenance programs to the specific characteristics of an individual.
Cluster Analysis in Nursing Research: An Introduction, Historical Perspective, and Future Directions.

PubMed

Dunn, Heather; Quinn, Laurie; Corbridge, Susan J; Eldeirawi, Kamal; Kapella, Mary; Collins, Eileen G

2017-05-01

The use of cluster analysis in the nursing literature is limited to the creation of classifications of homogeneous groups and the discovery of new relationships. As such, it is important to provide clarity regarding its use and potential. The purpose of this article is to provide an introduction to distance-based, partitioning-based, and model-based cluster analysis methods commonly utilized in the nursing literature, provide a brief historical overview on the use of cluster analysis in nursing literature, and provide suggestions for future research. An electronic search included three bibliographic databases, PubMed, CINAHL and Web of Science. Key terms were cluster analysis and nursing. The use of cluster analysis in the nursing literature is increasing and expanding. The increased use of cluster analysis in the nursing literature is positioning this statistical method to result in insights that have the potential to change clinical practice.
Pattern of clustering of menopausal problems: A study with a Bengali Hindu ethnic group.

PubMed

Dasgupta, Doyel; Pal, Baidyanath; Ray, Subha

2016-01-01

We attempted to find out how menopausal problems cluster with each other. The study was conducted among a group of women belonging to a Bengali-speaking Hindu ethnic group of West Bengal, a state located in Eastern India. We recruited 1,400 participants for the study. Information on sociodemographic aspects and menopausal problems was collected from these participants with the help of a pretested questionnaire. Results of cluster analysis showed that vasomotor, vaginal, and urinary problems cluster together, separately from physical and psychosomatic problems.
Dimensional assessment of personality pathology in patients with eating disorders.

PubMed

Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L

1999-02-22

This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.
Cluster Analysis of Junior High School Students' Cognitive Structures

ERIC Educational Resources Information Center

Dan, Youngjun; Geng, Leisha; Li, Meng

2017-01-01

This study aimed to explore students' cognitive patterns based on their knowledge and levels. Participants were seventh graders from a junior high school in China. Three relatively distinct groups were specified by Cluster Analysis: high knowledge and low ability, low knowledge and low ability, and high knowledge and high ability. The group of low…
Cluster Analysis of Assessment in Anatomy and Physiology for Health Science Undergraduates

ERIC Educational Resources Information Center

Brown, Stephen; White, Sue; Power, Nicola

2016-01-01

Academic content common to health science programs is often taught to a mixed group of students; however, content assessment may be consistent for each discipline. This study used a retrospective cluster analysis on such a group, first to identify high and low achieving students, and second, to determine the distribution of students within…
Sensory Clusters of Adults with and without Autism Spectrum Conditions

ERIC Educational Resources Information Center

Elwin, Marie; Schröder, Agneta; Ek, Lena; Wallsten, Tuula; Kjellin, Lars

2017-01-01

We identified clusters of atypical sensory functioning adults with ASC by hierarchical cluster analysis. A new scale for commonly self-reported sensory reactivity was used as a measure. In a low frequency group (n = 37), all subscale scores were relatively low, in particular atypical sensory/motor reactivity. In the intermediate group (n = 17)…
A time-series approach for clustering farms based on slaughterhouse health aberration data.

PubMed

Hulsegge, B; de Greef, K H

2018-05-01

A large amount of data is collected routinely in meat inspection in pig slaughterhouses. A time series clustering approach is presented and applied that groups farms based on similar statistical characteristics of meat inspection data over time. A three-step characteristic-based clustering approach was used, based on the idea that the data contain more information than the incidence figures alone. A stratified subset containing 511,645 pigs was derived as a study set from 3.5 years of meat inspection data. The monthly averages of incidence of pleuritis and of pneumonia of 44 Dutch farms (delivering 5149 batches to 2 pig slaughterhouses) were subjected to 1) derivation of farm-level data characteristics, 2) factor analysis, and 3) clustering into groups of farms. The characteristic-based clustering was able to cluster farms for both lung aberrations. Three groups of data characteristics were informative, describing incidence, time pattern and degree of autocorrelation. The consistency of clustering similar farms was confirmed by repetition of the analysis in a larger dataset. The robustness of the clustering was tested on a substantially extended dataset. This confirmed the earlier results: three data distribution aspects make up the majority of the distinction between groups of farms, and in these groups (clusters) the majority of the farms were allocated comparably to the earlier allocation (75% and 62% for pleuritis and pneumonia, respectively). The difference between pleuritis and pneumonia in their seasonal dependency was confirmed, supporting the biological relevance of the clustering. Comparison of the identified clusters of statistically comparable farms can be used to detect farm-level risk factors causing the health aberrations beyond comparison on disease incidence and trend alone. Copyright © 2018 Elsevier B.V. All rights reserved.
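The first step of the characteristic-based approach, deriving per-farm features from a monthly series, might look like the sketch below. The abstract names three informative groups of characteristics (incidence, time pattern, autocorrelation) but not the exact statistics, so the mean, linear trend slope, and lag-1 autocorrelation used here are stand-ins, and the function name and sample series are hypothetical.

```python
import statistics


def series_characteristics(values):
    """Summarize a monthly series by level, linear trend and lag-1 autocorrelation.

    Assumes at least two observations and a non-constant series.
    """
    n = len(values)
    mean = statistics.mean(values)
    # Ordinary least-squares slope of the values against the time index 0..n-1.
    t_mean = (n - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - mean) for t, v in enumerate(values)) / sxx
    # Lag-1 autocorrelation: covariance of neighbouring months over total variation.
    denom = sum((v - mean) ** 2 for v in values)
    lag1 = sum((values[t] - mean) * (values[t + 1] - mean)
               for t in range(n - 1)) / denom
    return {"mean": mean, "slope": slope, "lag1_autocorr": lag1}


# Made-up monthly incidence values for one farm.
print(series_characteristics([1.0, 2.0, 3.0, 4.0, 5.0]))
```

Once every farm is reduced to such a feature vector, farms with series of different lengths or start dates can be compared and clustered on equal terms.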
Finding Groups Using Model-Based Cluster Analysis: Heterogeneous Emotional Self-Regulatory Processes and Heavy Alcohol Use Risk

ERIC Educational Resources Information Center

Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.

2008-01-01

Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
Identifying At-Risk Students in General Chemistry via Cluster Analysis of Affective Characteristics

ERIC Educational Resources Information Center

Chan, Julia Y. K.; Bauer, Christopher F.

2014-01-01

The purpose of this study is to identify academically at-risk students in first-semester general chemistry using affective characteristics via cluster analysis. Through the clustering of six preselected affective variables, three distinct affective groups were identified: low (at-risk), medium, and high. Students in the low affective group…
A formal concept analysis approach to consensus clustering of multi-experiment expression data

PubMed Central

2014-01-01

Background Presently, with the increasing number and complexity of available gene expression datasets, the combination of data from multiple microarray studies addressing a similar biological question is gaining importance. The analysis and integration of multiple datasets are expected to yield more reliable and robust results since they are based on a larger number of samples and the effects of the individual study-specific biases are diminished. This is supported by recent studies suggesting that important biological signals are often preserved or enhanced by multiple experiments. An approach to combining data from different experiments is the aggregation of their clusterings into a consensus or representative clustering solution which increases the confidence in the common features of all the datasets and reveals the important differences among them. Results We propose a novel generic consensus clustering technique that applies Formal Concept Analysis (FCA) approach for the consolidation and analysis of clustering solutions derived from several microarray datasets. These datasets are initially divided into groups of related experiments with respect to a predefined criterion. Subsequently, a consensus clustering algorithm is applied to each group resulting in a clustering solution per group. These solutions are pooled together and further analysed by employing FCA which allows extracting valuable insights from the data and generating a gene partition over all the experiments. In order to validate the FCA-enhanced approach two consensus clustering algorithms are adapted to incorporate the FCA analysis. Their performance is evaluated on gene expression data from multi-experiment study examining the global cell-cycle control of fission yeast. 
The FCA results derived from both methods demonstrate that, although both algorithms optimize different clustering characteristics, FCA is able to overcome and diminish these differences and preserve some relevant biological signals. Conclusions The proposed FCA-enhanced consensus clustering technique is a general approach to the combination of clustering algorithms with FCA for deriving clustering solutions from multiple gene expression matrices. The experimental results presented herein demonstrate that it is a robust data integration technique able to produce good quality clustering solution that is representative for the whole set of expression matrices. PMID:24885407
Using cluster analysis for medical resource decision making.

PubMed

Dilts, D; Khamalah, J; Plotkin, A

1995-01-01

Escalating costs of health care delivery have in the recent past often made the health care industry investigate, adapt, and apply those management techniques relating to budgeting, resource control, and forecasting that have long been used in the manufacturing sector. A strategy that has contributed much in this direction is the definition and classification of a hospital's output into "products" or groups of patients that impose similar resource or cost demands on the hospital. Existing classification schemes have frequently employed cluster analysis in generating these groupings. Unfortunately, the myriad articles and books on clustering and classification contain few formalized selection methodologies for choosing a technique for solving a particular problem, hence they often leave the novice investigator at a loss. This paper reviews the literature on clustering, particularly as it has been applied in the medical resource-utilization domain, addresses the critical choices facing an investigator in the medical field using cluster analysis, and offers suggestions (using the example of clustering low-vision patients) for how such choices can be made.
Cluster analysis of Southeastern U.S. climate stations

NASA Astrophysics Data System (ADS)

Stooksbury, D. E.; Michaels, P. J.

1991-09-01

A two-step cluster analysis of 449 Southeastern climate stations is used to objectively determine general climate clusters (groups of climate stations) for eight southeastern states. The purpose is objectively to define regions of climatic homogeneity that should perform more robustly in subsequent climatic impact models. This type of analysis has been successfully used in many related climate research problems including the determination of corn/climate districts in Iowa (Ortiz-Valdez, 1985) and the classification of synoptic climate types (Davis, 1988). These general climate clusters may be more appropriate for climate research than the standard climate divisions (CD) groupings of climate stations, which are modifications of the agro-economic United States Department of Agriculture crop reporting districts. Unlike the CD's, these objectively determined climate clusters are not restricted by state borders and thus have reduced multicollinearity which makes them more appropriate for the study of the impact of climate and climatic change.
Monitoring Wetland Hydro-dynamics in the Prairie Pothole Region Using Landsat Time Series

NASA Astrophysics Data System (ADS)

Zhou, Q.; Rover, J.; Gallant, A.

2017-12-01

Wetlands provide a variety of ecosystem functions and are spatially and temporally dynamic. We mapped the dynamics of wetlands in the North Dakota Prairie Pothole Region using all available clear observations of Landsat sensor data from 1985 to 2014. We used a cluster analysis to group pixels exhibiting similar long-term spectral trends over seven Landsat bands, then applied the tasseled-cap transformation to evaluate the temporal characteristics of brightness, greenness, and wetness for each cluster. We tested relations between these three indices and hydrologic conditions, as represented by the Palmer Hydrological Drought Index (PHDI), using the cross-correlation analysis for each cluster performed over an eight-year moving window for the 30 years covered by the study. This temporal window size coincided with the timing of a major shift from a prolonged drought that occurred within the first eight years of the study period to wetter conditions that prevailed throughout the remaining years. The 20 clusters we produced represented a gradient from locations that continuously held water throughout the study period to locations that, at most, held water only for short periods in some years. The spatial distribution of the cluster groups reflected patterns of regional geologic and geomorphologic features. Comparisons of the PHDI to tasseled-cap wetness were the most straightforward to interpret among the results from the three indices. Wetness for most cluster groups had high positive correlations with PHDI during drought years, with the correlations reduced as the landscape entered a lengthy, wetter period; however, wetness generally remained highly and positively correlated with PHDI across all years for four cluster groups where the area exhibited two or more multi-year dry-wet cycles. These same four groups also had strong, generally negative correlations with tasseled-cap brightness.
For other cluster groups, brightness often was strongly negatively correlated with the PHDI during the drought years, with the relation weakening for subsequent years of adequate or high moisture. Relations between tasseled-cap greenness and PHDI were highly variable among and within cluster groups. Results from this analysis support ongoing efforts to develop new products that characterize wetland dynamics.
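
The moving-window correlation described in this abstract can be illustrated with a minimal sketch. It assumes one value per year for each series and computes only the lag-zero Pearson correlation in each eight-year window; the abstract does not say which lags or software were used, and the function names and the synthetic data below are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def rolling_correlation(index_series, phdi_series, window=8):
    """Lag-zero correlation in every window of `window` consecutive years."""
    if len(index_series) != len(phdi_series):
        raise ValueError("series must have the same length")
    n_windows = len(index_series) - window + 1
    return [
        pearson(index_series[i:i + window], phdi_series[i:i + window])
        for i in range(n_windows)
    ]

# Synthetic, hypothetical data: 30 annual PHDI values and a wetness index
# that is an exact linear function of PHDI (so every window correlates at 1).
phdi = [math.sin(0.4 * t) for t in range(30)]
wetness = [2.0 * v + 1.0 for v in phdi]
correlations = rolling_correlation(wetness, phdi, window=8)
```

With 30 annual values and an eight-year window, the sketch yields 23 correlation values, one per window position.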
The Cluster Sensitivity Index: A Basic Measure of Classification Robustness

ERIC Educational Resources Information Center

Hom, Willard C.

2010-01-01

Analysts of institutional performance have occasionally used a peer grouping approach in which they compared institutions only to other institutions with similar characteristics. Because analysts historically have used cluster analysis to define peer groups (i.e., the group of comparable institutions), the author proposes and demonstrates with…
Supervised group Lasso with applications to microarray data analysis

PubMed Central

Ma, Shuangge; Song, Xiao; Huang, Jian

2007-01-01

Background A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take into consideration the cluster structure. Results We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data. Conclusion We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods. PMID:17316436
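
The K-means step mentioned in this abstract can be sketched as plain Lloyd iterations. This is only an illustration under stated assumptions: the initial centroids are supplied by the caller so the run is deterministic, the two-dimensional "expression profiles" are made up, and the Gap statistic and the Lasso steps of the paper are not shown.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids

# Hypothetical two-dimensional profiles for six genes.
profiles = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.8), (5.0, 5.2), (5.2, 4.8), (4.8, 5.0)]
labels, centroids = kmeans(profiles, initial_centroids=[(0.0, 0.0), (6.0, 6.0)])
```

In practice the initial centroids are usually chosen at random and the run repeated several times; fixed starting values are used here only to keep the example reproducible.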
The detection methods of dynamic objects

NASA Astrophysics Data System (ADS)

Knyazev, N. L.; Denisova, L. A.

2018-01-01

The article deals with the application of cluster analysis methods to the task of aircraft detection, based on distributing selected navigation parameters into groups (clusters). A modified cluster analysis method is proposed that searches for and detects objects, iteratively combines them into clusters, and then counts the clusters in order to increase the accuracy of aircraft detection. The operation of the method and the features of its implementation are considered. In conclusion, the efficiency of the proposed method for accurate cluster analysis in finding targets is shown.
Subgroups of advanced cancer patients clustered by their symptom profiles: quality-of-life outcomes.

PubMed

Husain, Amna; Myers, Jeff; Selby, Debbie; Thomson, Barbara; Chow, Edward

2011-11-01

Symptom cluster analysis is a new frontier of research in symptom management. This study clustered patients by their symptom profiles to identify subgroups that may be at higher risk for poor quality of life (QOL) and that may, therefore, benefit most from targeted interventions. Longitudinal study of metastatic cancer patients using the Edmonton Symptom Assessment Scale (ESAS). We generated two-, three-, and four-cluster subgroups and examined the relationship of cluster membership with patient outcomes. To address the problem of missing longitudinal data, we developed a novel outcome variable (QualTime) that measures both QOL and time in study. Two hundred and twenty-one patients with a mean Palliative Performance Scale (PPS) of 59.1 were enrolled. The three-cluster model was chosen for further analysis. The low-burden subgroup had all low severity symptom scores. The intermediate subgroup separates from the low-burden group on the "debility" profile of fatigue, drowsiness, appetite, and well-being. The high-burden group separates from the intermediate-burden group on pain, depression, and anxiety. At baseline, PPS (p=0.0003) and cluster membership (p<0.0001) contributed significantly to global QOL. In univariate analysis, cluster membership was related to the longitudinal outcome, QualTime. In a multivariate model, the relationship of PPS to QualTime was still significant (p=0.0002), but subgroup membership was no longer significant (p=0.1009). PPS is a stronger predictor of the longitudinal variable than cluster subgroups; however, cluster subgroups provide a target for clinical interventions that may improve QOL.
Clinical Study of the 3D-Master Color System among the Spanish Population.

PubMed

Gómez-Polo, Cristina; Gómez-Polo, Miguel; Martínez Vázquez de Parga, Juan Antonio; Celemín-Viñuela, Alicia

2017-01-12

To study whether the shades of the 3D-Master System were grouped and represented in the chromatic space according to the three-color coordinates of value, chroma, and hue. Maxillary central incisor color was measured on tooth surfaces through the Easyshade Compact spectrophotometer using 1361 participants aged between 16 and 89. The natural (not bleached teeth) color of the middle thirds was registered in the 3D-Master System nomenclature and in the CIELCh system. Principal component analysis and cluster analysis were applied. 75 colors of the 3D-Master System were found. The statistical analysis revealed the existence of 5 cluster groups. The centroid, the average of the 75 samples, in relation to lightness (L*) was 74.64, 22.87 for chroma (C*), and 88.85 for hue (h*). All of the clusters, except cluster 3, showed significant statistical differences with the centroid for the three-color coordinates (p <0.001). The results of this study indicated that 75 shades in the 3D-Master System were grouped into 5 clusters following coordinates L*, C*, and h* resulting from the dental spectrophotometer Vita Easyshade compact. The shades that composed each cluster did not belong to the same lightness color dimension groups. There was no special uniform chromatic distribution among the colors of the 3D-Master System. © 2017 by the American College of Prosthodontists.

Chill, Be Cool Man: African American Men, Identity, Coping, and Aggressive Ideation

PubMed Central

Thomas, Alvin; Hammond, Wizdom Powell; Kohn-Wood, Laura P.

2016-01-01

Aggression is an important correlate of violence, depression, coping, and suicide among emerging young African American males. Yet most researchers treat aggression deterministically, fail to address cultural factors, or consider the potential for individual characteristics to exert an intersectional influence on this psychosocial outcome. Addressing this gap, we consider the moderating effect of coping on the relationship between masculine and racial identity and aggressive ideation among African American males (N = 128) drawn from 2 large Midwestern universities. Using the phenomenological variant of ecological systems theory and person-centered methodology as a guide, hierarchical cluster analysis grouped participants into profile groups based on their responses to both a measure of racial identity and a measure of masculine identity. Results from the cluster analysis revealed 3 distinct identity clusters: Identity Ambivalent, Identity Appraising, and Identity Consolidated. Although these cluster groups did not differ with regard to coping, significant differences were observed between cluster groups in relation to aggressive ideation. Further, a full model with identity profile clusters, coping, and aggressive ideation indicates that cluster membership significantly moderates the relationship between coping and aggressive ideation. The implications of these data for intersecting identities of African American men, and the association of identity and outcomes related to risk for mental health and violence, are discussed. PMID:25090145
Chill, be cool man: African American men, identity, coping, and aggressive ideation.

PubMed

Thomas, Alvin; Hammond, Wizdom Powell; Kohn-Wood, Laura P

2015-07-01

Aggression is an important correlate of violence, depression, coping, and suicide among emerging young African American males. Yet most researchers treat aggression deterministically, fail to address cultural factors, or consider the potential for individual characteristics to exert an intersectional influence on this psychosocial outcome. Addressing this gap, we consider the moderating effect of coping on the relationship between masculine and racial identity and aggressive ideation among African American males (N = 128) drawn from 2 large Midwestern universities. Using the phenomenological variant of ecological systems theory and person-centered methodology as a guide, hierarchical cluster analysis grouped participants into profile groups based on their responses to both a measure of racial identity and a measure of masculine identity. Results from the cluster analysis revealed 3 distinct identity clusters: Identity Ambivalent, Identity Appraising, and Identity Consolidated. Although these cluster groups did not differ with regard to coping, significant differences were observed between cluster groups in relation to aggressive ideation. Further, a full model with identity profile clusters, coping, and aggressive ideation indicates that cluster membership significantly moderates the relationship between coping and aggressive ideation. The implications of these data for intersecting identities of African American men, and the association of identity and outcomes related to risk for mental health and violence, are discussed. (c) 2015 APA, all rights reserved.
Cluster Analysis of Vulnerable Groups in Acute Traumatic Brain Injury Rehabilitation.

PubMed

Kucukboyaci, N Erkut; Long, Coralynn; Smith, Michelle; Rath, Joseph F; Bushnik, Tamara

2018-01-06

To analyze the complex relation between various social indicators that contribute to socioeconomic status and health care barriers. Cluster analysis of historical patient data obtained from inpatient visits. Inpatient rehabilitation unit in a large urban university hospital. Adult patients (N=148) receiving acute inpatient care, predominantly for closed head injury. Not applicable. We examined the membership of patients with traumatic brain injury in various "vulnerable group" clusters (eg, homeless, unemployed, racial/ethnic minority) and characterized the rehabilitation outcomes of patients (eg, duration of stay, changes in FIM scores between admission to inpatient stay and discharge). The cluster analysis revealed 4 major clusters (ie, clusters A-D) separated by vulnerable group memberships, with distinct durations of stay and FIM gains during their stay. Cluster B, the largest cluster and also consisting of mostly racial/ethnic minorities, had the shortest duration of hospital stay and one of the lowest FIM improvements among the 4 clusters despite higher FIM scores at admission. In cluster C, also consisting of mostly ethnic minorities with multiple socioeconomic status vulnerabilities, patients were characterized by low cognitive FIM scores at admission and the longest duration of stay, and they showed good improvement in FIM scores. Application of clustering techniques to inpatient data identified distinct clusters of patients who may experience differences in their rehabilitation outcome due to their membership in various "at-risk" groups. The results identified patients (ie, cluster B, with minority patients; and cluster D, with elderly patients) who attain below-average gains in brain injury rehabilitation. 
The results also suggested that systemic (eg, duration of stay) or clinical service improvements (eg, staff's language skills, ability to offer substance abuse therapy, provide appropriate referrals, liaise with intensive social work services, or plan subacute rehabilitation phase) could be beneficial for acute settings. Stronger recruitment, training, and retention initiatives for bilingual and multiethnic professionals may also be considered to optimize gains from acute inpatient rehabilitation after traumatic brain injury. Copyright © 2017 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
Quantifying the impact of fixed effects modeling of clusters in multiple imputation for cluster randomized trials

PubMed Central

Andridge, Rebecca. R.

2011-01-01

In cluster randomized trials (CRTs), identifiable clusters rather than individuals are randomized to study groups. Resulting data often consist of a small number of clusters with correlated observations within a treatment group. Missing data often present a problem in the analysis of such trials, and multiple imputation (MI) has been used to create complete data sets, enabling subsequent analysis with well-established analysis methods for CRTs. We discuss strategies for accounting for clustering when multiply imputing a missing continuous outcome, focusing on estimation of the variance of group means as used in an adjusted t-test or ANOVA. These analysis procedures are congenial to (can be derived from) a mixed effects imputation model; however, this imputation procedure is not yet available in commercial statistical software. An alternative approach that is readily available and has been used in recent studies is to include fixed effects for cluster, but the impact of using this convenient method has not been studied. We show that under this imputation model the MI variance estimator is positively biased and that smaller ICCs lead to larger overestimation of the MI variance. Analytical expressions for the bias of the variance estimator are derived in the case of data missing completely at random (MCAR), and cases in which data are missing at random (MAR) are illustrated through simulation. Finally, various imputation methods are applied to data from the Detroit Middle School Asthma Project, a recent school-based CRT, and differences in inference are compared. PMID:21259309
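
For orientation, the usual way of combining results across multiply imputed data sets is Rubin's rules, and the sketch below assumes that this is the "MI variance estimator" the abstract refers to (the abstract does not spell it out). The function name and the numbers are hypothetical.

```python
from statistics import mean, variance

def pool_rubin(estimates, within_variances):
    """Combine m completed-data results with Rubin's rules.

    estimates        -- point estimate from each imputed data set
    within_variances -- squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations of matching length")
    pooled = mean(estimates)
    within = mean(within_variances)        # average within-imputation variance
    between = variance(estimates)          # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical results from m = 3 imputed data sets.
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

Here the between-imputation variance is 1.0, so the total variance is 0.5 + (1 + 1/3) × 1.0. How the imputation model handles clustering changes the inputs to this formula, which is where the bias discussed in the abstract would enter.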
Toward An Understanding of Cluster Evolution: A Deep X-Ray Selected Cluster Catalog from ROSAT

NASA Technical Reports Server (NTRS)

Jones, Christine; Oliversen, Ronald (Technical Monitor)

2002-01-01

In the past year, we have focussed on studying individual clusters found in this sample with Chandra, as well as using Chandra to measure the luminosity-temperature relation for a sample of distant clusters identified through the ROSAT study, and finally we are continuing our study of fossil groups. For the luminosity-temperature study, we compared a sample of nearby clusters with a sample of distant clusters and, for the first time, measured a significant change in the relation as a function of redshift (Vikhlinin et al. in final preparation for submission to ApJ). We also used our ROSAT analysis to select and propose for Chandra observations of individual clusters. We are now analyzing the Chandra observations of the distant cluster A520, which appears to have undergone a recent merger. Finally, we have completed the analysis of the fossil groups identified in ROSAT observations. In the past few months, we have derived X-ray fluxes and luminosities as well as X-ray extents for an initial sample of 89 objects. Based on the X-ray extents and the lack of bright galaxies, we have identified 16 fossil groups. We are comparing their X-ray and optical properties with those of optically rich groups. A paper is being readied for submission (Jones, Forman, and Vikhlinin in preparation).
Dataset of Fourier transform-infrared coupled with chemometric analysis used to distinguish accessions of Garcinia mangostana L. in Peninsular Malaysia.

PubMed

Samsir, Sri A'jilah; Bunawan, Hamidun; Yen, Choong Chee; Noor, Normah Mohd

2016-09-01

In this dataset, we distinguish 15 accessions of Garcinia mangostana from Peninsular Malaysia using Fourier transform-infrared spectroscopy coupled with chemometric analysis. We found that the position and intensity of characteristic peaks at 3600-3100 cm⁻¹ in IR spectra allowed discrimination of G. mangostana from different locations. Further principal component analysis (PCA) of all the accessions suggests that two main clusters were formed: samples from Johor, Melaka, and Negeri Sembilan (South) were clustered together in one group while samples from Perak, Kedah, Penang, Selangor, Kelantan, and Terengganu (North and East Coast) were in another clustered group.
Clustering analysis of proteins from microbial genomes at multiple levels of resolution.

PubMed

Zaslavsky, Leonid; Ciufo, Stacy; Fedorov, Boris; Tatusova, Tatiana

2016-08-31

Microbial genomes at the National Center for Biotechnology Information (NCBI) represent a large collection of more than 35,000 assemblies. There are several complexities associated with the data: a great variation in sampling density since human pathogens are densely sampled while other bacteria are less represented; different protein families occur in annotations with different frequencies; and the quality of genome annotation varies greatly. In order to extract useful information from these sophisticated data, the analysis needs to be performed at multiple levels of phylogenomic resolution and protein similarity, with an adequate sampling strategy. Protein clustering is used to construct meaningful and stable groups of similar proteins to be used for analysis and functional annotation. Our approach is to create protein clusters at three levels. First, tight clusters in groups of closely-related genomes (species-level clades) are constructed using a combined approach that takes into account both sequence similarity and genome context. Second, clustroids of conservative in-clade clusters are organized into seed global clusters. Finally, global protein clusters are built around the seed clusters. We propose filtering strategies that allow limiting the protein set included in global clustering. The in-clade clustering procedure, subsequent selection of clustroids and organization into seed global clusters provide a robust representation and high rate of compression. Seed protein clusters are further extended by adding related proteins. Extended seed clusters include a significant part of the data and represent all major known cell machinery. The remaining part, coming from either non-conservative (unique) or rapidly evolving proteins, from rare genomes, or resulting from low-quality annotation, does not group together well. Processing these proteins requires significant computational resources and results in a large number of questionable clusters.
The developed filtering strategies make it possible to identify and exclude such peripheral proteins, limiting the protein dataset used in global clustering. Overall, the proposed methodology allows the relevant data at different levels of detail to be obtained and data redundancy to be eliminated while keeping biologically interesting variations.
Potential Environmental Justice (EJ) areas in Region 2 based on 2000 Census [EPA.EJAREAS_2000

EPA Pesticide Factsheets

Potential Environmental Justice (EJ) areas in Region 2. This dataset was derived from 2000 census data and based on the criteria set forth in the Region 2 Interim Environmental Justice Policy. The two criteria for Region 2's EJ demographic analysis are percent poverty and percent minority. The percent minority and percent poverty numbers for each blockgroup are compared to the benchmark value for the state. Census blockgroups with percent poverty or percent minority higher than the state threshold are considered potential EJ areas. The cutoffs for each state were derived by using a statistical method, cluster analysis. Cluster analysis was chosen as the most objective way of evaluating the demographic data and determining cutoff values for minority and low income. With cluster analysis, data are divided into two distinct groups (e.g., minority and non-minority, and low income and non-low income). Cluster analysis examines natural breaks of the data. Separate analyses were conducted for minority and low income, respectively, for each State. All census block groups within a State were ranked in descending order according to the demographic factor under evaluation. This resulted in a ranking for percent minority by block group and a separate ranking for percent low income by block group. An iterative process was employed where the data were (1) split into two groups; (2) the means for each of the two groups were calculated; (3) the difference between the
Hierarchical clustering of HPV genotype patterns in the ASCUS-LSIL triage study

PubMed Central

Wentzensen, Nicolas; Wilson, Lauren E.; Wheeler, Cosette M.; Carreon, Joseph D.; Gravitt, Patti E.; Schiffman, Mark; Castle, Philip E.

2010-01-01

Anogenital cancers are associated with about 13 carcinogenic HPV types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common which complicate the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the ASCUS-LSIL triage study using unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered using complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: Cluster 1 included all CIN3 histology with abnormal cytology; Cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion (HSIL) cytology; Cluster 3 included older women with normal or low grade histology/cytology and low viral load; Cluster 4 included younger women with low grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: Group 1 included only HPV16; Group 2 included nine carcinogenic types plus non-carcinogenic HPV53 and HPV66; and Group 3 included non-carcinogenic types plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and HSIL. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease using unsupervised hierarchical clustering can address complex genotype distributions on a population level. PMID:20959485
Genetic diversity analysis of Capparis spinosa L. populations by using ISSR markers.

PubMed

Liu, C; Xue, G P; Cheng, B; Wang, X; He, J; Liu, G H; Yang, W J

2015-12-09

Capparis spinosa L. is an important medicinal species in the Xinjiang Province of China. Ten natural populations of C. spinosa from 3 locations in North, Central, and South Xinjiang were studied using morphological trait inter simple sequence repeat (ISSR) molecular markers to assess the genetic diversity and population structure. In this study, the 10 ISSR primers produced 313 amplified DNA fragments, with 52% of fragments being polymorphic. Unweighted pair-group method with arithmetic average (UPGMA) cluster analysis indicated that 10 C. spinosa populations were clustered into 3 geographically distinct groups. Nei's gene diversity and Shannon's information index for C. spinosa populations in different regions had ranges of 0.1312-0.2001 and 0.1004-0.1875, respectively. The 362 markers were used to construct the dendrogram based on the UPGMA cluster analysis. The dendrogram indicated that 10 populations of C. spinosa were clustered into 3 geographically distinct groups. The results showed that these genotypes have high genetic diversity and can be used for an alternative breeding program.
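
As an illustration of the UPGMA clustering named in this abstract, the sketch below merges the closest pair of clusters repeatedly and updates distances as size-weighted averages. It assumes a precomputed symmetric distance matrix; the four population labels and their distances are hypothetical and are not taken from the study.

```python
def upgma(dist, labels):
    """Return the merge sequence [(cluster_a, cluster_b, distance), ...]."""
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (name, size)
    # Pairwise distances keyed by (smaller id, larger id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        (a, b), d_ab = min(d.items(), key=lambda item: item[1])
        name_a, size_a = clusters.pop(a)
        name_b, size_b = clusters.pop(b)
        merges.append((name_a, name_b, d_ab))
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances (the UPGMA rule).
        for c in clusters:
            d_ac = d.pop((min(a, c), max(a, c)))
            d_bc = d.pop((min(b, c), max(b, c)))
            d[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        del d[(a, b)]
        clusters[next_id] = (f"({name_a},{name_b})", size_a + size_b)
        next_id += 1
    return merges

# Hypothetical distances between four populations.
names = ["P1", "P2", "P3", "P4"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 8],
    [10, 10, 8, 0],
]
merges = upgma(matrix, names)
```

Each entry of `merges` records which two clusters were joined and at what distance; a dendrogram is conventionally drawn with each join at half that distance.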
Identification of symptom and functional domains that fibromyalgia patients would like to see improved: a cluster analysis.

PubMed

Bennett, Robert M; Russell, Jon; Cappelleri, Joseph C; Bushmakin, Andrew G; Zlateva, Gergana; Sadosky, Alesia

2010-06-28

The purpose of this study was to determine whether some of the clinical features of fibromyalgia (FM) that patients would like to see improved aggregate into definable clusters. Seven hundred and eighty-eight patients with clinically confirmed FM and baseline pain ≥40 mm on a 100 mm visual analogue scale ranked 5 FM clinical features that the subjects would most like to see improved after treatment (one for each priority quintile) from a list of 20 developed during focus groups. For each subject, clinical features were transformed into vectors with rankings assigned values 1-5 (lowest to highest ranking). Logistic analysis was used to create a distance matrix and hierarchical cluster analysis was applied to identify cluster structure. The frequency of cluster selection was determined, and cluster importance was ranked using cluster scores derived from rankings of the clinical features. Multidimensional scaling was used to visualize and conceptualize cluster relationships. Six clinical feature clusters were identified and named based on their key characteristics. In order of selection frequency, the clusters were Pain (90%; 4 clinical features), Fatigue (89%; 4 clinical features), Domestic (42%; 4 clinical features), Impairment (29%; 3 functions), Affective (21%; 3 clinical features), and Social (9%; 2 functional). The "Pain Cluster" was ranked of greatest importance by 54% of subjects, followed by Fatigue, which was given the highest ranking by 28% of subjects. Multidimensional scaling mapped these clusters to two dimensions: Status (bounded by Physical and Emotional domains), and Setting (bounded by Individual and Group interactions). Common clinical features of FM could be grouped into 6 clusters (Pain, Fatigue, Domestic, Impairment, Affective, and Social) based on patient perception of relevance to treatment.
Furthermore, these 6 clusters could be charted in the 2 dimensions of Status and Setting, thus providing a unique perspective for interpretation of FM symptomatology.
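The encoding step described in this abstract (each subject's five ranked features turned into a vector with values 1-5) can be illustrated with a minimal Python sketch. The feature names, the helper `ranking_vector`, and the use of 0 for unranked features are assumptions made for this example; the abstract does not specify them, and the study's logistic-analysis distance matrix is not reproduced.

```python
# Hypothetical feature list; the study used 20 features from focus groups.
FEATURES = ["pain", "fatigue", "sleep", "stiffness"]

def ranking_vector(ranked, features=FEATURES):
    """Encode one subject's choices as a vector over all features.

    `ranked` lists the chosen features from lowest to highest priority.
    Unranked features get 0 (an assumption, not stated in the abstract).
    """
    vec = [0] * len(features)
    for value, name in enumerate(ranked, start=1):
        vec[features.index(name)] = value
    return vec

# Two hypothetical subjects, one row per subject.
matrix = [
    ranking_vector(["sleep", "pain"]),
    ranking_vector(["fatigue", "pain"]),
]
# Transposing gives one profile per clinical feature, which is the
# natural input when the features themselves are to be clustered.
feature_profiles = list(zip(*matrix))
```

Any distance between the rows of `feature_profiles` could then feed a hierarchical clustering routine.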
Electrical Load Profile Analysis Using Clustering Techniques

NASA Astrophysics Data System (ADS)

Damayanti, R.; Abdullah, A. G.; Purnama, W.; Nandiyanto, A. B. D.

2017-03-01

Data mining is a data processing technique for extracting information from stored data. Electricity consumption is recorded daily by the electric company, usually at intervals of 15 or 30 minutes. This paper uses clustering, a data mining technique, to analyse electrical load profiles recorded during 2014. Three clustering methods were compared: K-Means (KM), Fuzzy C-Means (FCM), and K-Harmonic Means (KHM). The results show that KHM is the most appropriate method for classifying the electrical load profiles. The optimum number of clusters is determined using the Davies-Bouldin Index. Grouping the load profiles makes it possible to analyse demand variation and to estimate energy loss for each group of load profiles with a similar pattern. From each group, the cluster load factor and a range for the cluster loss factor can be obtained, which helps to find the range of coefficient values for estimating energy loss without performing load flow studies.
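To make the cluster-number selection concrete, the following is a minimal pure-Python sketch of the Davies-Bouldin index, where lower values indicate more compact, better separated clusters. The toy "profiles" and both candidate labelings are invented for illustration and are not the paper's data; real daily profiles would have 96 or 48 readings at 15- or 30-minute intervals.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def davies_bouldin(points, labels):
    """Davies-Bouldin index for a labeling with at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}
    # Scatter: mean distance of a cluster's points to its centroid.
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in ps) / len(ps)
        for lab, ps in clusters.items()
    }
    labs = list(clusters)
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in labs if j != i
        )
        for i in labs
    ]
    return sum(worst) / len(labs)

# Hypothetical two-number load summaries for six customers.
profiles = [(1.0, 1.2), (1.1, 1.0), (5.0, 5.1), (5.2, 4.9), (9.0, 9.1), (9.1, 8.9)]
db_two = davies_bouldin(profiles, [0, 0, 0, 0, 1, 1])
db_three = davies_bouldin(profiles, [0, 0, 1, 1, 2, 2])
best_k = 3 if db_three < db_two else 2
```

In practice the labelings would come from running KM, FCM, or KHM for each candidate number of clusters and keeping the one with the smallest index.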
Links between patterns of racial socialization and discrimination experiences and psychological adjustment: a cluster analysis.

PubMed

Ajayi, Alex A; Syed, Moin

2014-10-01

This study used a person-oriented analytic approach to identify meaningful patterns of barriers-focused racial socialization and perceived racial discrimination experiences in a sample of 295 late adolescents. Using cluster analysis, three distinct groups were identified: Low Barrier Socialization-Low Discrimination, High Barrier Socialization-Low Discrimination, and High Barrier Socialization-High Discrimination clusters. These groups were substantively unique in terms of the frequency of racial socialization messages about bias preparation and out-group mistrust its members received and their actual perceived discrimination experiences. Further, individuals in the High Barrier Socialization-High Discrimination cluster reported significantly higher depressive symptoms than those in the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. However, no differences in adjustment were observed between the Low Barrier Socialization-Low Discrimination and High Barrier Socialization-Low Discrimination clusters. Overall, the findings highlight important individual differences in how young people of color experience their race and how these differences have significant implications on psychological adjustment. Copyright © 2014 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Comprehensive identification and clustering of CLV3/ESR-related (CLE) genes in plants finds groups with potentially shared function.

PubMed

Goad, David M; Zhu, Chuanmei; Kellogg, Elizabeth A

2017-10-01

CLV3/ESR (CLE) proteins are important signaling peptides in plants. The short CLE peptide (12-13 amino acids) is cleaved from a larger pre-propeptide and functions as an extracellular ligand. The CLE family is large and has resisted attempts at classification because the CLE domain is too short for reliable phylogenetic analysis and the pre-propeptide is too variable. We used a model-based search for CLE domains from 57 plant genomes and used the entire pre-propeptide for comprehensive clustering analysis. In total, 1628 CLE genes were identified in land plants, with none recognizable from green algae. These CLEs form 12 groups within which CLE domains are largely conserved and pre-propeptides can be aligned. Most clusters contain sequences from monocots, eudicots and Amborella trichopoda, with sequences from Picea abies, Selaginella moellendorffii and Physcomitrella patens scattered in some clusters. We easily identified previously known clusters involved in vascular differentiation and nodulation. In addition, we found a number of discrete groups whose function remains poorly characterized. Available data indicate that CLE proteins within a cluster are likely to share function, whereas those from different clusters play at least partially different roles. Our analysis provides a foundation for future evolutionary and functional studies. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
An assessment of fatigue in patients with postural orthostatic tachycardia syndrome.

PubMed

Wise, Shelby; Ross, Amanda; Brown, Abigail; Evans, Meredyth; Jason, Leonard

2017-05-01

Individuals with postural orthostatic tachycardia syndrome share many symptoms with those who have chronic fatigue syndrome; one of which is severe fatigue. Previous literature found that those with chronic fatigue syndrome experience many forms of fatigue. The goal of this study was to investigate whether individuals with postural orthostatic tachycardia syndrome also experience multidimensional fatigue and whether these individuals can be clustered into subgroups based on the types of fatigue they endorse. A convenience sample of 138 participants (aged 14-29) with postural orthostatic tachycardia syndrome completed questionnaires that assessed fatigue, brain fog symptom severity, activities that improve brain fog, and brain fog-related disability. An exploratory factor analysis was conducted on the Fatigue Types Questionnaire, and a three-factor solution was produced. Factor scores were then used to cluster the patients into groups using a TwoStep cluster analysis. This resulted in two clusters, a high severity group and a low severity group. The clusters were then compared on a number of items related to symptom expression. Individuals within the more severe cluster had significantly more brain fog at the beginning and end of the survey when compared to cluster two. Those in the more severe cluster also described more activity impairment as well as more frequent, more severe, and more debilitation from postural orthostatic tachycardia syndrome and brain fog. The findings of the factor analysis suggest that patients with postural orthostatic tachycardia syndrome experience fatigue as a multidimensional construct and they also can be subgrouped based on symptom severity.
Hybrid Tracking Algorithm Improvements and Cluster Analysis Methods.

DTIC Science & Technology

1982-02-26

UPGMA), and Ward’s method. Ling’s papers describe a (k,r) clustering method. Each of these methods has individual characteristics which make them...Reference 7), UPGMA is probably the most frequently used clustering strategy. UPGMA tries to group new points into an existing cluster by using an
Regional health care planning: a methodology to cluster facilities using community utilization patterns

PubMed Central

2013-01-01

Background Community-based health care planning and regulation necessitates grouping facilities and areal units into regions of similar health care use. Limited research has explored the methodologies used in creating these regions. We offer a new methodology that clusters facilities based on similarities in patient utilization patterns and geographic location. Our case study focused on Hospital Groups in Michigan, the allocation units used for predicting future inpatient hospital bed demand in the state’s Bed Need Methodology. The scientific, practical, and political concerns that were considered throughout the formulation and development of the methodology are detailed. Methods The clustering methodology employs a 2-step K-means + Ward’s clustering algorithm to group hospitals. The final number of clusters is selected using a heuristic that integrates both a statistical-based measure of cluster fit and characteristics of the resulting Hospital Groups. Results Using recent hospital utilization data, the clustering methodology identified 33 Hospital Groups in Michigan. Conclusions Despite being developed within the politically charged climate of Certificate of Need regulation, we have provided an objective, replicable, and sustainable methodology to create Hospital Groups. Because the methodology is built upon theoretically sound principles of clustering analysis and health care service utilization, it is highly transferable across applications and suitable for grouping facilities or areal units. PMID:23964905
Regional health care planning: a methodology to cluster facilities using community utilization patterns.

PubMed

Delamater, Paul L; Shortridge, Ashton M; Messina, Joseph P

2013-08-22

Community-based health care planning and regulation necessitates grouping facilities and areal units into regions of similar health care use. Limited research has explored the methodologies used in creating these regions. We offer a new methodology that clusters facilities based on similarities in patient utilization patterns and geographic location. Our case study focused on Hospital Groups in Michigan, the allocation units used for predicting future inpatient hospital bed demand in the state's Bed Need Methodology. The scientific, practical, and political concerns that were considered throughout the formulation and development of the methodology are detailed. The clustering methodology employs a 2-step K-means + Ward's clustering algorithm to group hospitals. The final number of clusters is selected using a heuristic that integrates both a statistical-based measure of cluster fit and characteristics of the resulting Hospital Groups. Using recent hospital utilization data, the clustering methodology identified 33 Hospital Groups in Michigan. Despite being developed within the politically charged climate of Certificate of Need regulation, we have provided an objective, replicable, and sustainable methodology to create Hospital Groups. Because the methodology is built upon theoretically sound principles of clustering analysis and health care service utilization, it is highly transferable across applications and suitable for grouping facilities or areal units.
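The abstract names a 2-step K-means + Ward's algorithm without giving details. As a rough sketch of how a second, Ward's-method step might merge first-step clusters, the following pure-Python code agglomerates (centroid, size) pairs using Ward's merge cost. The function names, toy centroids, and sizes are hypothetical, and this is not the authors' implementation.

```python
def ward_cost(ca, na, cb, nb):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * sq

def ward_merge(centroids, sizes, n_groups):
    """Merge first-step clusters until only n_groups remain.

    Returns, for each final group, the sorted indices of the
    first-step clusters it contains.
    """
    groups = [(list(c), n, [i]) for i, (c, n) in enumerate(zip(centroids, sizes))]
    while len(groups) > n_groups:
        pairs = ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups)))
        i, j = min(
            pairs,
            key=lambda ij: ward_cost(groups[ij[0]][0], groups[ij[0]][1],
                                     groups[ij[1]][0], groups[ij[1]][1]),
        )
        (ca, na, ma), (cb, nb, mb) = groups[i], groups[j]
        # Size-weighted centroid of the merged cluster.
        merged_centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (merged_centroid, na + nb, ma + mb)
        groups = [g for k, g in enumerate(groups) if k not in (i, j)] + [merged]
    return [sorted(g[2]) for g in groups]

# Hypothetical output of a first K-means pass: four centroids and their sizes.
step_one_centroids = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
step_one_sizes = [5, 3, 4, 6]
hospital_groups = ward_merge(step_one_centroids, step_one_sizes, n_groups=2)
```

Running K-means first keeps the number of items handed to the quadratic-cost agglomerative step small, which is one common motivation for such two-step designs.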
Environmental Gradient Analysis, Ordination, and Classification in Environmental Impact Assessments.

DTIC Science & Technology

1987-09-01

agglomerative clustering algorithms for mainframe computers: (1) the unweighted pair-group method that uses arithmetic averages (UPGMA), (2) the...hierarchical agglomerative unweighted pair-group method using arithmetic averages (UPGMA), which is also called average linkage clustering. This method was...dendrograms produced by weighted clustering (93). Sneath and Sokal (94), Romesburg (84), and Seber (90) also strongly recommend the UPGMA. A dendrogram
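For readers unfamiliar with UPGMA (average linkage), a minimal pure-Python sketch follows. It repeatedly merges the two closest clusters and updates distances as size-weighted averages, so that every original object counts equally. The function name and the toy distance matrix are illustrative assumptions and are not taken from the report.

```python
def upgma(dist):
    """Average-linkage clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, height) tuples.
    """
    n = len(dist)
    clusters = {i: (i,) for i in range(n)}
    # Pairwise distances keyed by (smaller_id, larger_id).
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        merges.append((clusters[a], clusters[b], height))
        na, nb = len(clusters[a]), len(clusters[b])
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted mean keeps every original object equally weighted.
            d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges

# Hypothetical distances among four sampling sites.
toy = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
history = upgma(toy)
heights = [h for _, _, h in history]
```

The merge heights are what a dendrogram plots on its vertical axis.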
A Morphometric Analysis of Hylarana signata Group (Previously Known as Rana signata and Rana picturata) of Malaysia

NASA Astrophysics Data System (ADS)

Zainudin, Ramlah; Sazali, Siti Nurlydia

A study of morphometric variation in the Malaysian Hylarana signata group was conducted to reveal the morphological relationships within the species group. Twenty-seven morphological characters from 18 individuals of H. signata and H. picturata were measured and recorded. The numerical data were analysed using Discriminant Function Analysis in SPSS version 16.0 and UPGMA Cluster Analysis in Minitab version 14.0. The results show complex clustering among the examined species, which might be due to ancient polymorphism of the lineages or to cryptic species within the group. Hence, further study should include more representatives in order to fully elucidate the morphological relationships of the H. signata group.

A Cluster Analysis of Bronchial Asthma Patients with Depressive Symptoms.

PubMed

Seino, Yo; Hasegawa, Takashi; Koya, Toshiyuki; Sakagami, Takuro; Mashima, Ichiro; Shimizu, Natsue; Muramatsu, Yoshiyuki; Muramatsu, Kumiko; Suzuki, Eiichi; Kikuchi, Toshiaki

2018-03-09

Objective Whether or not depression affects the control or severity of asthma is unclear. We performed a cluster analysis of asthma patients with depressive symptoms to clarify their characteristics. Methods and subjects Multiple medical institutions in Niigata Prefecture, Japan, were surveyed in 2014. We recorded the age, disease duration, body mass index (BMI), medications, and surveyed asthma control status and severity, as well as depressive symptoms and adherence to treatment using questionnaires. A hierarchical cluster analysis was performed on the group of patients assessed as having depression. Results Of 2,273 patients, 128 were assessed as being positive for depressive symptoms (DS[+]). Thirty-three were excluded because of missing data, and the remaining 95 DS[+] patients were classified into 3 clusters (A, B, and C). The patients in cluster A (n=19) were elderly, had severe, poorly controlled asthma, and demonstrated possible adherence barriers; those in cluster B (n=26) were elderly with a low BMI and had no significant adherence barriers but had severe, poorly controlled asthma; and those in cluster C (n=50) were younger, with a high BMI, no significant adherence barriers, well-controlled asthma, and few were severely affected. The scores for depressive symptoms were not significantly different between clusters. Conclusion About half of the patients in the DS[+] group had severe, poorly controlled asthma, and these clusters were able to be distinguished by their ASK-12 score, which reflects adherence barriers. The control status and severity of asthma may also be related to the age, disease duration, and BMI in the DS[+] group.
Onto-clust--a methodology for combining clustering analysis and ontological methods for identifying groups of comorbidities for developmental disorders.

PubMed

Peleg, Mor; Asbeh, Nuaman; Kuflik, Tsvi; Schertz, Mitchell

2009-02-01

Children with developmental disorders usually exhibit multiple developmental problems (comorbidities). Hence, such diagnosis needs to revolve around developmental disorder groups. Our objective is to systematically identify developmental disorder groups and represent them in an ontology. We developed a methodology that combines two methods: (1) a literature-based ontology that we created, which represents developmental disorders and potential developmental disorder groups, and (2) clustering for detecting comorbid developmental disorders in patient data. The ontology is used to interpret and improve clustering results, and the clustering results are used to validate the ontology and suggest directions for its development. We evaluated our methodology by applying it to data of 1175 patients from a child development clinic. We demonstrated that the ontology improves clustering results, bringing them closer to an expert-generated gold standard. We have shown that our methodology successfully combines an ontology with a clustering method to support systematic identification and representation of developmental disorder groups.
Effect of functionalization of boron nitride flakes by main group metal clusters on their optoelectronic properties

NASA Astrophysics Data System (ADS)

Chakraborty, Debdutta; Chattaraj, Pratim Kumar

2017-10-01

The possibility of functionalizing boron nitride flakes (BNFs) with some selected main group metal clusters, viz. OLi4, NLi5, CLi6, BLi7 and Al12Be, has been analyzed with the aid of density functional theory (DFT) based computations. Thermochemical as well as energetic considerations suggest that all the metal clusters interact with the BNF moiety in a favorable fashion. As a result of functionalization, the static (first) hyperpolarizability (β) values of the metal cluster supported BNF moieties increase quite significantly as compared to that in the case of pristine BNF. Time dependent DFT analysis reveals that the metal clusters can lower the transition energies associated with the dominant electronic transitions quite significantly thereby enabling the metal cluster supported BNF moieties to exhibit significant non-linear optical activity. Moreover, the studied systems demonstrate broad band absorption capability spanning the UV-visible as well as infra-red domains. Energy decomposition analysis reveals that the electrostatic interactions principally stabilize the metal cluster supported BNF moieties.
Effect of functionalization of boron nitride flakes by main group metal clusters on their optoelectronic properties.

PubMed

Chakraborty, Debdutta; Chattaraj, Pratim Kumar

2017-10-25

The possibility of functionalizing boron nitride flakes (BNFs) with some selected main group metal clusters, viz. OLi4, NLi5, CLi6, BLi7 and Al12Be, has been analyzed with the aid of density functional theory (DFT) based computations. Thermochemical as well as energetic considerations suggest that all the metal clusters interact with the BNF moiety in a favorable fashion. As a result of functionalization, the static (first) hyperpolarizability (β) values of the metal cluster supported BNF moieties increase quite significantly as compared to that in the case of pristine BNF. Time dependent DFT analysis reveals that the metal clusters can lower the transition energies associated with the dominant electronic transitions quite significantly thereby enabling the metal cluster supported BNF moieties to exhibit significant non-linear optical activity. Moreover, the studied systems demonstrate broad band absorption capability spanning the UV-visible as well as infra-red domains. Energy decomposition analysis reveals that the electrostatic interactions principally stabilize the metal cluster supported BNF moieties.
Kinematic gait patterns in healthy runners: A hierarchical cluster analysis.

PubMed

Phinyomark, Angkoon; Osis, Sean; Hettinga, Blayne A; Ferber, Reed

2015-11-05

Previous studies have demonstrated distinct clusters of gait patterns in both healthy and pathological groups, suggesting that different movement strategies may be represented. However, these studies have used discrete time point variables and usually focused on only one specific joint and plane of motion. Therefore, the first purpose of this study was to determine if running gait patterns for healthy subjects could be classified into homogeneous subgroups using three-dimensional kinematic data from the ankle, knee, and hip joints. The second purpose was to identify differences in joint kinematics between these groups. The third purpose was to investigate the practical implications of clustering healthy subjects by comparing these kinematics with runners experiencing patellofemoral pain (PFP). A principal component analysis (PCA) was used to reduce the dimensionality of the entire gait waveform data and then a hierarchical cluster analysis (HCA) determined group sets of similar gait patterns and homogeneous clusters. The results show two distinct running gait patterns were found with the main between-group differences occurring in frontal and sagittal plane knee angles (P<0.001), independent of age, height, weight, and running speed. When these two groups were compared to PFP runners, one cluster exhibited greater while the other exhibited reduced peak knee abduction angles (P<0.05). The variability observed in running patterns across this sample could be the result of different gait strategies. These results suggest care must be taken when selecting samples of subjects in order to investigate the pathomechanics of injured runners. Copyright © 2015 Elsevier Ltd. All rights reserved.
Different disease subtypes with distinct clinical expression in familial Mediterranean fever: results of a cluster analysis.

PubMed

Akar, Servet; Solmaz, Dilek; Kasifoglu, Timucin; Bilge, Sule Yasar; Sari, Ismail; Gumus, Zeynep Zehra; Tunca, Mehmet

2016-02-01

The aim of this study was to evaluate whether there are clinical subgroups that may have different prognoses among FMF patients. The cumulative clinical features of a large group of FMF patients [1168 patients, 593 (50.8%) male, mean age 35.3 years (s.d. 12.4)] were studied. To analyse our data and identify groups of FMF patients with similar clinical characteristics, a two-step cluster analysis using log-likelihood distance measures was performed. For clustering the FMF patients, we evaluated the following variables: gender, current age, age at symptom onset, age at diagnosis, presence of major clinical features, variables related with therapy and family history for FMF, renal failure and carriage of M694V. Three distinct groups of FMF patients were identified. Cluster 1 was characterized by a high prevalence of arthritis, pleuritis, erysipelas-like erythema (ELE) and febrile myalgia. The dosage of colchicine and the frequency of amyloidosis were lower in cluster 1. Patients in cluster 2 had an earlier age of disease onset and diagnosis. M694V carriage and amyloidosis prevalence were the highest in cluster 2. This group of patients was using the highest dose of colchicine. Patients in cluster 3 had the lowest prevalence of arthritis, ELE and febrile myalgia. The frequencies of M694V carriage and amyloidosis were lower in cluster 3 than the overall FMF patients. Non-response to colchicine was also slightly lower in cluster 3. Patients with FMF can be clustered into distinct patterns of clinical and genetic manifestations and these patterns may have different prognostic significance. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Another collision for the Coma cluster

NASA Technical Reports Server (NTRS)

Vikhlinin, A.; Forman, W.; Jones, C.

1996-01-01

The wavelet transform analysis of the Rosat position sensitive proportional counter (PSPC) images of the Coma cluster is presented. The analysis shows, on small scales, a substructure dominated by two extended sources surrounding the two bright galaxies NGC 4874 and NGC 4889. On scales of about 2 arcmin to 3 arcmin, the analysis reveals a tail of X-ray emission originating near the cluster center, curving to the south and east for approximately 25 arcmin and ending near the galaxy NGC 4911. The results are interpreted in terms of a merger of a group, having a core mass of approximately 10^13 solar masses, with the main body of the Coma cluster.
Clusters of midlife women by physical activity and their racial/ethnic differences.

PubMed

Im, Eun-Ok; Ko, Young; Chee, Eunice; Chee, Wonshik; Mao, Jun James

2017-04-01

The purpose of this study was to identify clusters of midlife women by physical activity and to determine racial/ethnic differences in physical activities in each cluster. This was a secondary analysis of the data from 542 women (157 non-Hispanic [NH] Whites, 127 Hispanics, 135 NH African Americans, and 123 NH Asians) in a larger Internet study on midlife women's attitudes toward physical activity. The instruments included the Barriers to Health Activities Scale, the Physical Activity Assessment Inventory, the Questions on Attitudes toward Physical Activity, Subjective Norm, Perceived Behavioral Control, and Behavioral Intention, and the Kaiser Physical Activity Survey. The data were analyzed using hierarchical cluster analyses, analysis of variance, and multinomial logistic analyses. A three-cluster solution was adopted: cluster 1 (high active living and sports/exercise activity group; 48%), cluster 2 (high household/caregiving and occupational activity group; 27%), and cluster 3 (low active living and sports/exercise activity group; 26%). There were significant racial/ethnic differences in occupational activities of clusters 1 and 3 (all P < 0.01). Compared with cluster 1, cluster 2 tended to have lower family income, less access to health care, higher unemployment, higher perceived barriers scores, and lower social influences scores (all P < 0.01). Compared with cluster 1, cluster 3 tended to have greater obesity, less access to health care, higher perceived barriers scores, more negative attitudes toward physical activity, and lower self-efficacy scores (all P < 0.01). Midlife women's unique patterns of physical activity and their associated factors need to be considered in future intervention development.
Text grouping in patent analysis using adaptive K-means clustering algorithm

NASA Astrophysics Data System (ADS)

Shanie, Tiara; Suprijadi, Jadi; Zulhanif

2017-03-01

Patents are one form of intellectual property. Analyzing patents is a prerequisite for understanding how technology is developing in each country and worldwide. This study uses patent documents about green tea retrieved from the Espacenet server. Patent documents related to tea technology are widely dispersed, which makes information retrieval (IR) difficult for users. It is therefore necessary to categorize the documents into specific groups according to the related terms they contain. This study applies statistical text mining to the title text of green tea patents in two phases: data preparation and data analysis. The data preparation phase uses text mining methods, and the data analysis phase uses statistics. The statistical analysis uses a cluster analysis algorithm, the Adaptive K-Means Clustering Algorithm. Based on the maximum silhouette value, the analysis generated 87 clusters associated with fifteen terms, which can be used to support information retrieval.
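The silhouette criterion mentioned in this abstract can be sketched in pure Python as follows. The term-count vectors and candidate labelings are invented for illustration; the adaptive K-means step itself is not reproduced, and the helper name `mean_silhouette` is an assumption of this example.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette width for a labeling with at least two clusters."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            # Convention: a singleton cluster contributes a score of 0.
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical term-count vectors for six patent titles.
titles = [(3, 0), (4, 1), (0, 3), (1, 4), (8, 8), (9, 7)]
candidates = {
    2: [0, 0, 0, 0, 1, 1],
    3: [0, 0, 1, 1, 2, 2],
}
scores = {k: mean_silhouette(titles, labs) for k, labs in candidates.items()}
best_k = max(scores, key=scores.get)
```

The number of clusters with the largest mean silhouette is kept, which is how a figure such as the 87 clusters reported above would be selected.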
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.

PubMed

Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V

2007-01-01

The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of the complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
Clustering "N" Objects into "K" Groups under Optimal Scaling of Variables.

ERIC Educational Resources Information Center

van Buuren, Stef; Heiser, Willem J.

1989-01-01

A method based on homogeneity analysis (multiple correspondence analysis or multiple scaling) is proposed to reduce many categorical variables to one variable with "k" categories. The method is a generalization of the sum of squared distances cluster analysis problem to the case of mixed measurement level variables. (SLD)
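For reference, the standard sum of squared distances clustering problem that this abstract says is generalized can be written as below. The symbols are chosen here for illustration (n objects x_i, k clusters C_j with means μ_j); the paper's own notation and its optimal scaling extension to mixed measurement levels are not reproduced.

```latex
\min_{C_1,\dots,C_k} \; \sum_{j=1}^{k} \sum_{i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j = \frac{1}{\lvert C_j \rvert} \sum_{i \in C_j} x_i
```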
Identification of different nutritional status groups in institutionalized elderly people by cluster analysis.

PubMed

López-Contreras, María José; López, Maria Ángeles; Canteras, Manuel; Candela, María Emilia; Zamora, Salvador; Pérez-Llamas, Francisca

2014-03-01

To apply a cluster analysis to groups of individuals of similar characteristics in an attempt to identify undernutrition or the risk of undernutrition in this population. A cross-sectional study. Seven public nursing homes in the province of Murcia, on the Mediterranean coast of Spain. 205 subjects aged 65 and older (131 women and 74 men). Dietary intake (energy and nutrients), anthropometric (body mass index, skinfold thickness, mid-arm muscle circumference, mid-arm muscle area, corrected arm muscle area, waist to hip ratio) and biochemical and haematological (serum albumin, transferrin, total cholesterol, total lymphocyte count). Variables were analyzed by cluster analysis. The results of the cluster analysis, including intake, anthropometric and analytical data showed that, of the 205 elderly subjects, 66 (32.2%) were overweight/obese, 72 (35.1%) had an adequate nutritional status and 67 (32.7%) were undernourished or at risk of undernutrition. The undernourished or at risk of undernutrition group showed the lowest values for dietary intake and the anthropometric and analytical parameters measured. Our study shows that cluster analysis is a useful statistical method for assessing the nutritional status of institutionalized elderly populations. In contrast, use of the specific reference values frequently described in the literature might fail to detect real cases of undernourishment or those at risk of undernutrition. Copyright AULA MEDICA EDICIONES 2014. Published by AULA MEDICA. All rights reserved.
An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins.

PubMed

Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S

2017-02-01

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences.
An Atlas of Peroxiredoxins Created Using an Active Site Profile-Based Approach to Functionally Relevant Clustering of Proteins

PubMed Central

Babbitt, Patricia C.; Ferrin, Thomas E.

2017-01-01

Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially; MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method’s novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
Clustering of Multivariate Geostatistical Data

NASA Astrophysics Data System (ADS)

Fouedjio, Francky

2017-04-01

Multivariate data indexed by geographical coordinates have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations belonging to the same cluster have a certain degree of homogeneity while data locations in different clusters are as different as possible. However, groups of data locations created through classical clustering techniques turn out to show poor spatial contiguity, a feature obviously inconvenient for many geoscience applications. In this work, we develop a clustering method that overcomes this problem by accounting for the spatial dependence structure of the data, thus reinforcing the spatial contiguity of the resulting clusters. The capability of the proposed clustering method to provide spatially contiguous and meaningful clusters of data locations is assessed using both synthetic and real datasets. Keywords: clustering, geostatistics, spatial contiguity, spatial dependence.
Clustering performances in the NBA according to players' anthropometric attributes and playing experience.

PubMed

Zhang, Shaoliang; Lorenzo, Alberto; Gómez, Miguel-Angel; Mateus, Nuno; Gonçalves, Bruno; Sampaio, Jaime

2018-04-20

The aim of this study was: (i) to group basketball players into similar clusters based on a combination of anthropometric characteristics and playing experience; and (ii) to explore the distribution of players (including starters and non-starters) from different levels of teams within the obtained clusters. The game-related statistics from 699 regular season balanced games were analyzed using a two-step cluster model and a discriminant analysis. The clustering process allowed identifying five different player profiles: Top height and weight (HW) with low experience, TopHW-LowE; Middle HW with middle experience, MiddleHW-MiddleE; Middle HW with top experience, MiddleHW-TopE; Low HW with low experience, LowHW-LowE; Low HW with middle experience, LowHW-MiddleE. Discriminant analysis showed that the TopHW-LowE group was highlighted by two-point field goals made and missed, offensive and defensive rebounds, blocks, and personal fouls; whereas the LowHW-LowE group made the fewest passes and touches. The players from weaker teams were mostly distributed in the LowHW-LowE group, whereas players from stronger teams were mainly grouped in the LowHW-MiddleE group; and players that participated in the finals were allocated to the MiddleHW-MiddleE group. These results provide alternative references for basketball staff concerning the process of evaluating performance.
Topic modeling for cluster analysis of large biological and medical datasets

PubMed Central

2014-01-01

Background The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. Results In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Conclusion Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. 
Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets. PMID:25350106
Topic modeling for cluster analysis of large biological and medical datasets.

PubMed

Zhao, Weizhong; Zou, Wen; Chen, James J

2014-01-01

The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three various methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. 
Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets.
Characterizing the course of back pain after osteoporotic vertebral fracture: a hierarchical cluster analysis of a prospective cohort study.

PubMed

Toyoda, Hiromitsu; Takahashi, Shinji; Hoshino, Masatoshi; Takayama, Kazushi; Iseki, Kazumichi; Sasaoka, Ryuichi; Tsujio, Tadao; Yasuda, Hiroyuki; Sasaki, Takeharu; Kanematsu, Fumiaki; Kono, Hiroshi; Nakamura, Hiroaki

2017-09-23

This study demonstrated four distinct patterns in the course of back pain after osteoporotic vertebral fracture (OVF). Greater angular instability in the first 6 months after the baseline was one factor affecting back pain after OVF. Understanding the natural course of symptomatic acute OVF is important in deciding the optimal treatment strategy. We used latent class analysis to classify the course of back pain after OVF and identify the risk factors associated with persistent pain. This multicenter cohort study included 218 consecutive patients with ≤ 2-week-old OVFs who were enrolled at 11 institutions. Dynamic x-rays and back pain assessment with a visual analog scale (VAS) were obtained at enrollment and at 1-, 3-, and 6-month follow-ups. The VAS scores were used to characterize patient groups, using hierarchical cluster analysis. VAS for 128 patients was used for hierarchical cluster analysis. Analysis yielded four clusters representing different patterns of back pain progression. Cluster 1 patients (50.8%) had stable, mild pain. Cluster 2 patients (21.1%) started with moderate pain and progressed quickly to very low pain. Patients in cluster 3 (10.9%) had moderate pain that initially improved but worsened after 3 months. Cluster 4 patients (17.2%) had persistent severe pain. Patients in cluster 4 showed significantly higher baseline pain intensity, a higher degree of angular instability, and a higher number of previous OVFs, and tended to lack regular exercise. In contrast, patients in cluster 2 had significantly lower baseline VAS and less angular instability. We identified four distinct groups of OVF patients with different patterns of back pain progression. Understanding the course of back pain after OVF may help in its management and contribute to future treatment trials.
Sensory Clusters of Toddlers with Autism Spectrum Disorders: Differences in Affective Symptoms

ERIC Educational Resources Information Center

Ben-Sasson, A.; Cermak, S. A.; Orsmond, G. I.; Tager-Flusberg, H.; Kadlec, M. B.; Carter, A. S.

2008-01-01

Background: Individuals with autism spectrum disorders (ASDs) show variability in their sensory behaviors. In this study we identified clusters of toddlers with ASDs who shared sensory profiles and examined differences in affective symptoms across these clusters. Method: Using cluster analysis 170 toddlers with ASDs were grouped based on parent…

Comparative Genomic Hybridization Analysis of Two Predominant Nordic Group I (Proteolytic) Clostridium botulinum Type B Clusters

PubMed Central

Lindström, Miia; Hinderink, Katja; Somervuo, Panu; Kiviniemi, Katri; Nevas, Mari; Chen, Ying; Auvinen, Petri; Carter, Andrew T.; Mason, David R.; Peck, Michael W.; Korkeala, Hannu

2009-01-01

Comparative genomic hybridization analysis of 32 Nordic group I Clostridium botulinum type B strains isolated from various sources revealed two homogeneous clusters, clusters BI and BII. The type B strains differed from reference strain ATCC 3502 by 413 coding sequence (CDS) probes, sharing 88% of all the ATCC 3502 genes represented on the microarray. The two Nordic type B clusters differed from each other by their response to 145 CDS probes related mainly to transport and binding, adaptive mechanisms, fatty acid biosynthesis, the cell membranes, bacteriophages, and transposon-related elements. The most prominent differences between the two clusters were related to resistance to toxic compounds frequently found in the environment, such as arsenic and cadmium, reflecting different adaptive responses in the evolution of the two clusters. Other relatively variable CDS groups were related to surface structures and the gram-positive cell wall, suggesting that the two clusters possess different antigenic properties. All the type B strains carried CDSs putatively related to capsule formation, which may play a role in adaptation to different environmental and clinical niches. Sequencing showed that representative strains of the two type B clusters both carried subtype B2 neurotoxin genes. As many of the type B strains studied have been isolated from foods or associated with botulism, it is expected that the two group I C. botulinum type B clusters present a public health hazard in Nordic countries. Knowing the genetic and physiological markers of these clusters will assist in targeting control measures against these pathogens. PMID:19270141
Motivational and emotional profiles in university undergraduates: a self-determination theory perspective.

PubMed

González, Antonio; Paoloni, Verónica; Donolo, Danilo; Rinaudo, Cristina

2012-11-01

Previous research has focused on specific forms of self-determined motivation or discrete class-related emotions, but few studies have simultaneously examined both constructs. The aim of this study on 472 undergraduates was twofold: to perform cluster analysis to identify homogeneous groups of motivation in the sample; and to determine the profile of each cluster for emotions and academic achievement. Cluster analysis configured four groups in terms of motivation: controlled, autonomous, both high, and both low. Each cluster revealed a distinct emotional profile, autonomous motivation being the most adaptable with high scores for academic achievement and pleasant emotions and low values for unpleasant emotions. The results are discussed in the light of their implications for academic adjustment.
Grouping of Bulgarian wines according to grape variety by using statistical methods

NASA Astrophysics Data System (ADS)

Milev, M.; Nikolova, Kr.; Ivanova, Ir.; Minkova, St.; Evtimov, T.; Krustev, St.

2017-12-01

68 different types of Bulgarian wines were studied in accordance with 9 optical parameters as follows: color parameters in XYZ and CIE Lab color systems, lightness, hue angle, chroma, fluorescence intensity and emission wavelength. The main objective of this research is to use hierarchical cluster analysis to evaluate the similarity and the distance between the examined types of Bulgarian wines and to group them based on physical parameters. We have found that wines are grouped in clusters on the basis of the degree of identity between them. There are two main clusters, each with two subclusters. The first one contains white wines and Sira, the second contains red wines and rosé. The results from cluster analysis are presented graphically by a dendrogram. The other statistical technique used is factor analysis performed by the Method of Principal Components (PCA). The aim is to reduce the large number of variables to a few factors by grouping the correlated variables into one factor and assigning the uncorrelated variables to different factors. Moreover, the factor analysis made it possible to determine the parameters with the greatest influence on the distribution of samples in different clusters. In our study, after rotation of the factors with the Varimax method, the parameters were combined into two factors, which explain about 80% of the total variation. The first one explains 61.49% and correlates with color characteristics; the second explains 18.34% of the variation and correlates with the parameters connected with fluorescence spectroscopy.
Characterizing cognitive heterogeneity on the schizophrenia-bipolar disorder spectrum.

PubMed

Van Rheenen, T E; Lewandowski, K E; Tan, E J; Ospina, L H; Ongur, D; Neill, E; Gurvich, C; Pantelis, C; Malhotra, A K; Rossell, S L; Burdick, K E

2017-07-01

Current group-average analysis suggests quantitative but not qualitative cognitive differences between schizophrenia (SZ) and bipolar disorder (BD). There is increasing recognition that cognitive within-group heterogeneity exists in both disorders, but it remains unclear as to whether between-group comparisons of performance in cognitive subgroups emerging from within each of these nosological categories uphold group-average findings. We addressed this by identifying cognitive subgroups in large samples of SZ and BD patients independently, and comparing their cognitive profiles. The utility of a cross-diagnostic clustering approach to understanding cognitive heterogeneity in these patients was also explored. Hierarchical clustering analyses were conducted using cognitive data from 1541 participants (SZ n = 564, BD n = 402, healthy control n = 575). Three qualitatively and quantitatively similar clusters emerged within each clinical group: a severely impaired cluster, a mild-moderately impaired cluster and a relatively intact cognitive cluster. A cross-diagnostic clustering solution also resulted in three subgroups and was superior in reducing cognitive heterogeneity compared with disorder clustering independently. Quantitative SZ-BD cognitive differences commonly seen using group averages did not hold when cognitive heterogeneity was factored into our sample. Members of each corresponding subgroup, irrespective of diagnosis, might be manifesting the outcome of differences in shared cognitive risk factors.
Determining the trophic guilds of fishes and macroinvertebrates in a seagrass food web

USGS Publications Warehouse

Luczkovich, J.J.; Ward, G.P.; Johnson, J.C.; Christian, R.R.; Baird, D.; Neckles, H.; Rizzo, W.M.

2002-01-01

We established trophic guilds of macroinvertebrate and fish taxa using correspondence analysis and a hierarchical clustering strategy for a seagrass food web in winter in the northeastern Gulf of Mexico. To create the diet matrix, we characterized the trophic linkages of macroinvertebrate and fish taxa present in Halodule wrightii seagrass habitat areas within the St. Marks National Wildlife Refuge (Florida) using binary data, combining dietary links obtained from relevant literature for macroinvertebrates with stomach analysis of common fishes collected during January and February of 1994. Hierarchical average-linkage cluster analysis of the 73 taxa of fishes and macroinvertebrates in the diet matrix yielded 14 clusters with diet similarity ≥ 0.60. We then used correspondence analysis with three factors to jointly plot the coordinates of the consumers (identified by cluster membership) and of the 33 food sources. Correspondence analysis served as a visualization tool for assigning each taxon to one of eight trophic guilds: herbivores, detritivores, suspension feeders, omnivores, molluscivores, meiobenthos consumers, macrobenthos consumers, and piscivores. These trophic groups, cross-classified with major taxonomic groups, were further used to develop consumer compartments in a network analysis model of carbon flow in this seagrass ecosystem. The method presented here should greatly improve the development of future network models of food webs by providing an objective procedure for aggregating trophic groups.
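As a rough illustration of what a diet similarity on binary data can look like, the sketch below computes the Jaccard coefficient between two consumers' rows of a diet matrix. The abstract does not say which similarity measure was used, so Jaccard is an assumption here, and the two taxa and their diet vectors are hypothetical.

```python
def jaccard_similarity(diet_a, diet_b):
    """Shared food sources divided by food sources used by either consumer."""
    shared = sum(1 for x, y in zip(diet_a, diet_b) if x and y)
    either = sum(1 for x, y in zip(diet_a, diet_b) if x or y)
    # Two consumers with no recorded food sources are treated as dissimilar.
    return shared / either if either else 0.0


# Hypothetical binary diet rows: 1 = food source eaten, 0 = not eaten.
taxon_a = [1, 1, 0, 1, 0]
taxon_b = [1, 1, 0, 0, 0]

similarity = jaccard_similarity(taxon_a, taxon_b)
print(round(similarity, 3))
```

A pair like this one would fall on the "similar" side of a 0.60 cutoff; an average-linkage procedure would then apply the same idea to groups of taxa rather than single pairs.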
Review of Recent Methodological Developments in Group-Randomized Trials: Part 2-Analysis.

PubMed

Turner, Elizabeth L; Prague, Melanie; Gallis, John A; Li, Fan; Murray, David M

2017-07-01

In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have updated that review with developments in analysis of the past 13 years, with a companion article to focus on developments in design. We discuss developments in the topics of the earlier review (e.g., methods for parallel-arm GRTs, individually randomized group-treatment trials, and missing data) and in new topics, including methods to account for multiple-level clustering and alternative estimation methods (e.g., augmented generalized estimating equations, targeted maximum likelihood, and quadratic inference functions). In addition, we describe developments in analysis of alternative group designs (including stepped-wedge GRTs, network-randomized trials, and pseudocluster randomized trials), which require clustering to be accounted for in their design and analysis.
Profiling physical activity motivation based on self-determination theory: a cluster analysis approach.

PubMed

Friederichs, Stijn Ah; Bolman, Catherine; Oenema, Anke; Lechner, Lilian

2015-01-01

In order to promote physical activity uptake and maintenance in individuals who do not comply with physical activity guidelines, it is important to increase our understanding of physical activity motivation among this group. The present study aimed to examine motivational profiles in a large sample of adults who do not comply with physical activity guidelines. The sample for this study consisted of 2473 individuals (31.4% male; age 44.6 ± 12.9). In order to generate motivational profiles based on motivational regulation, a cluster analysis was conducted. One-way analyses of variance were then used to compare the clusters in terms of demographics, physical activity level, motivation to be active and subjective experience while being active. Three motivational clusters were derived based on motivational regulation scores: a low motivation cluster, a controlled motivation cluster and an autonomous motivation cluster. These clusters differed significantly from each other with respect to physical activity behavior, motivation to be active and subjective experience while being active. Overall, the autonomous motivation cluster displayed more favorable characteristics compared to the other two clusters. The results of this study provide additional support for the importance of autonomous motivation in the context of physical activity behavior. The three derived clusters may be relevant in the context of physical activity interventions as individuals within the different clusters might benefit most from different intervention approaches. In addition, this study shows that cluster analysis is a useful method for differentiating between motivational profiles in large groups of individuals who do not comply with physical activity guidelines.
Deconstructing Bipolar Disorder and Schizophrenia: A cross-diagnostic cluster analysis of cognitive phenotypes.

PubMed

Lee, Junghee; Rizzo, Shemra; Altshuler, Lori; Glahn, David C; Miklowitz, David J; Sugar, Catherine A; Wynn, Jonathan K; Green, Michael F

2017-02-01

Bipolar disorder (BD) and schizophrenia (SZ) show substantial overlap. It has been suggested that a subgroup of patients might contribute to these overlapping features. This study employed a cross-diagnostic cluster analysis to identify subgroups of individuals with shared cognitive phenotypes. 143 participants (68 BD patients, 39 SZ patients and 36 healthy controls) completed a battery of EEG and performance assessments on perception, nonsocial cognition and social cognition. A K-means cluster analysis was conducted with all participants across diagnostic groups. Clinical symptoms, functional capacity, and functional outcome were assessed in patients. A two-cluster solution across 3 groups was the most stable. One cluster including 44 BD patients, 31 controls and 5 SZ patients showed better cognition (High cluster) than the other cluster with 24 BD patients, 35 SZ patients and 5 controls (Low cluster). BD patients in the High cluster performed better than BD patients in the Low cluster across cognitive domains. Within each cluster, participants with different clinical diagnoses showed different profiles across cognitive domains. All patients were in the chronic phase and out of mood episode at the time of assessment, and most of the assessments were behavioral measures. This study identified two clusters with shared cognitive phenotype profiles that were not proxies for clinical diagnoses. The finding of better social cognitive performance of BD patients than SZ patients in the Low cluster suggests that relatively preserved social cognition may be important to identify disease process distinct to each disorder. Copyright © 2016 Elsevier B.V. All rights reserved.
A Taxonomic Approach to the Gestalt Theory of Perls

ERIC Educational Resources Information Center

Raming, Henry E.; Frey, David H.

1974-01-01

This study applied content analysis and cluster analysis to the ideas of Fritz Perls to develop a taxonomy of Gestalt processes and goals. Summaries of the typal groups or clusters were written and the implications of taxonomic research in counseling discussed. (Author)
Improving Cluster Analysis with Automatic Variable Selection Based on Trees

DTIC Science & Technology

2014-12-01

regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method (UPGMA). This method measures dissimilarities between all objects in two clusters and takes the average value
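The UPGMA rule quoted in the excerpt above (average the dissimilarities between all objects in two clusters) can be sketched in a few lines. This is a minimal illustration rather than the report's implementation: Euclidean distance is assumed as the dissimilarity, and the function name and sample points are hypothetical.

```python
from itertools import product
from math import dist


def upgma_distance(cluster_a, cluster_b):
    """Mean of the pairwise distances between every object in cluster_a
    and every object in cluster_b (unweighted pair-group average)."""
    pairs = list(product(cluster_a, cluster_b))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


# Two hypothetical clusters of 2-D points.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 4.0)]

average_linkage = upgma_distance(cluster_a, cluster_b)
print(round(average_linkage, 4))
```

In an agglomerative procedure, this value would be computed for every pair of current clusters and the pair with the smallest value merged at each step.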
Comprehensive cluster analysis with Transitivity Clustering.

PubMed

Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan

2011-03-01

Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes, for instance. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
Bias and inference from misspecified mixed-effect models in stepped wedge trial analysis.

PubMed

Thompson, Jennifer A; Fielding, Katherine L; Davey, Calum; Aiken, Alexander M; Hargreaves, James R; Hayes, Richard J

2017-10-15

Many stepped wedge trials (SWTs) are analysed by using a mixed-effect model with a random intercept and fixed effects for the intervention and time periods (referred to here as the standard model). However, it is not known whether this model is robust to misspecification. We simulated SWTs with three groups of clusters and two time periods; one group received the intervention during the first period and two groups in the second period. We simulated period and intervention effects that were either common-to-all or varied-between clusters. Data were analysed with the standard model or with additional random effects for period effect or intervention effect. In a second simulation study, we explored the weight given to within-cluster comparisons by simulating a larger intervention effect in the group of the trial that experienced both the control and intervention conditions and applying the three analysis models described previously. Across 500 simulations, we computed bias and confidence interval coverage of the estimated intervention effect. We found up to 50% bias in intervention effect estimates when period or intervention effects varied between clusters and were treated as fixed effects in the analysis. All misspecified models showed undercoverage of 95% confidence intervals, particularly the standard model. A large weight was given to within-cluster comparisons in the standard model. In the SWTs simulated here, mixed-effect models were highly sensitive to departures from the model assumptions, which can be explained by the high dependence on within-cluster comparisons. Trialists should consider including a random effect for time period in their SWT analysis model. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Bias and inference from misspecified mixed‐effect models in stepped wedge trial analysis

PubMed Central

Fielding, Katherine L.; Davey, Calum; Aiken, Alexander M.; Hargreaves, James R.; Hayes, Richard J.

2017-01-01

Many stepped wedge trials (SWTs) are analysed by using a mixed‐effect model with a random intercept and fixed effects for the intervention and time periods (referred to here as the standard model). However, it is not known whether this model is robust to misspecification. We simulated SWTs with three groups of clusters and two time periods; one group received the intervention during the first period and two groups in the second period. We simulated period and intervention effects that were either common‐to‐all or varied‐between clusters. Data were analysed with the standard model or with additional random effects for period effect or intervention effect. In a second simulation study, we explored the weight given to within‐cluster comparisons by simulating a larger intervention effect in the group of the trial that experienced both the control and intervention conditions and applying the three analysis models described previously. Across 500 simulations, we computed bias and confidence interval coverage of the estimated intervention effect. We found up to 50% bias in intervention effect estimates when period or intervention effects varied between clusters and were treated as fixed effects in the analysis. All misspecified models showed undercoverage of 95% confidence intervals, particularly the standard model. A large weight was given to within‐cluster comparisons in the standard model. In the SWTs simulated here, mixed‐effect models were highly sensitive to departures from the model assumptions, which can be explained by the high dependence on within‐cluster comparisons. Trialists should consider including a random effect for time period in their SWT analysis model. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd. PMID:28556355
A New Classification of Diabetic Gait Pattern Based on Cluster Analysis of Biomechanical Data

PubMed Central

Sawacha, Zimi; Guarneri, Gabriella; Avogaro, Angelo; Cobelli, Claudio

2010-01-01

Background The diabetic foot, one of the most serious complications of diabetes mellitus and a major risk factor for plantar ulceration, is determined mainly by peripheral neuropathy. Neuropathic patients exhibit decreased stability while standing as well as during dynamic conditions. A new methodology for diabetic gait pattern classification based on cluster analysis has been proposed that aims to identify groups of subjects with similar patterns of gait and verify if three-dimensional gait data are able to distinguish diabetic gait patterns from one of the control subjects. Method The gait of 20 nondiabetic individuals and 46 diabetes patients with and without peripheral neuropathy was analyzed [mean age 59.0 (2.9) and 61.1(4.4) years, mean body mass index (BMI) 24.0 (2.8), and 26.3 (2.0)]. K-means cluster analysis was applied to classify the subjects' gait patterns through the analysis of their ground reaction forces, joints and segments (trunk, hip, knee, ankle) angles, and moments. Results Cluster analysis classification led to definition of four well-separated clusters: one aggregating just neuropathic subjects, one aggregating both neuropathics and non-neuropathics, one including only diabetes patients, and one including either controls or diabetic and neuropathic subjects. Conclusions Cluster analysis was useful in grouping subjects with similar gait patterns and provided evidence that there were subgroups that might otherwise not be observed if a group ensemble was presented for any specific variable. In particular, we observed the presence of neuropathic subjects with a gait similar to the controls and diabetes patients with a long disease duration with a gait as altered as the neuropathic one. PMID:20920432
DOE Office of Scientific and Technical Information (OSTI.GOV)

Ogden, K; O’Dwyer, R; Bradford, T

Purpose: To reduce differences in features calculated from MRI brain scans acquired at different field strengths with or without Gadolinium contrast. Methods: Brain scans were processed for 111 epilepsy patients to extract hippocampus and thalamus features. Scans were acquired on 1.5 T scanners with Gadolinium contrast (group A), 1.5T scanners without Gd (group B), and 3.0 T scanners without Gd (group C). A total of 72 features were extracted. Features were extracted from original scans and from scans where the image pixel values were rescaled to the mean of the hippocampi and thalami values. For each data set, cluster analysis was performed on the raw feature set and for feature sets with normalization (conversion to Z scores). Two methods of normalization were used: The first was over all values of a given feature, and the second by normalizing within the patient group membership. The clustering software was configured to produce 3 clusters. Group fractions in each cluster were calculated. Results: For features calculated from both the non-rescaled and rescaled data, cluster membership was identical for both the non-normalized and normalized data sets. Cluster 1 was comprised entirely of Group A data, Cluster 2 contained data from all three groups, and Cluster 3 contained data from only groups 1 and 2. For the categorically normalized data sets there was a more uniform distribution of group data in the three Clusters. A less pronounced effect was seen in the rescaled image data features. Conclusion: Image Rescaling and feature renormalization can have a significant effect on the results of clustering analysis. These effects are also likely to influence the results of supervised machine learning algorithms. It may be possible to partly remove the influence of scanner field strength and the presence of Gadolinium based contrast in feature extraction for radiomics applications.
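The two Z-score normalizations described in the Methods (over all values of a feature, and within each acquisition group) can be contrasted with a small sketch. The feature values, group labels, and function names below are hypothetical, and the population standard deviation is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z scores over all values of one feature (assumes non-constant values)."""
    centre, spread = mean(values), pstdev(values)
    return [(v - centre) / spread for v in values]


def zscores_within_groups(values, groups):
    """Z scores computed separately inside each group label."""
    result = [0.0] * len(values)
    for label in set(groups):
        members = [i for i, g in enumerate(groups) if g == label]
        for i, z in zip(members, zscores([values[i] for i in members])):
            result[i] = z
    return result


# One hypothetical feature with a strong offset between acquisition groups.
feature = [10.0, 12.0, 20.0, 22.0, 30.0, 32.0]
acquisition_group = ["A", "A", "B", "B", "C", "C"]

overall = zscores(feature)
within = zscores_within_groups(feature, acquisition_group)
print(within)
```

With overall normalization the group offset survives in the Z scores, so a clustering step can still separate patients by scanner group. Within-group normalization removes that offset, which is in line with the more uniform group distribution the abstract reports for the categorically normalized data.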
Sulfur in Cometary Dust

NASA Technical Reports Server (NTRS)

Fomenkova, M. N.

1997-01-01

The computer-intensive project consisted of the analysis and synthesis of existing data on composition of comet Halley dust particles. The main objective was to obtain a complete inventory of sulfur containing compounds in the comet Halley dust by building upon the existing classification of organic and inorganic compounds and applying a variety of statistical techniques for cluster and cross-correlational analyses. A student hired for this project wrote and tested the software to perform cluster analysis. The following tasks were carried out: (1) selecting the data from existing database for the proposed project; (2) finding access to a standard library of statistical routines for cluster analysis; (3) reformatting the data as necessary for input into the library routines; (4) performing cluster analysis and constructing hierarchical cluster trees using three methods to define the proximity of clusters; (5) presenting the output results in different formats to facilitate the interpretation of the obtained cluster trees; (6) selecting groups of data points common for all three trees as stable clusters. We have also considered the chemistry of sulfur in inorganic compounds.
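Tasks (4) and (6) above can be sketched as follows: build a hierarchy with three different definitions of cluster proximity, cut each tree into the same number of clusters, and keep the groups that appear in all three. The abstract does not name the three methods; single, complete and average linkage are assumed here, and the points are hypothetical.

```python
from itertools import combinations
from math import dist

def agglomerate(points, k, linkage):
    """Merge the two closest clusters until k remain; return a set of frozensets of indices."""
    clusters = [frozenset([i]) for i in range(len(points))]

    def proximity(a, b):
        d = [dist(points[i], points[j]) for i in a for j in b]
        if linkage == "single":
            return min(d)
        if linkage == "complete":
            return max(d)
        return sum(d) / len(d)  # "average" linkage

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: proximity(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return set(clusters)

# Hypothetical data: three well-separated groups of points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (20, 0), (21, 0)]
trees = [agglomerate(points, 3, m) for m in ("single", "complete", "average")]

# A cluster is "stable" if every linkage method produces exactly the same group.
stable = set.intersection(*trees)
```

On data this cleanly separated all three methods agree, so every cluster is stable; on noisier data the intersection would keep only the groups that do not depend on the choice of proximity definition.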
Effective implementation of hierarchical clustering

NASA Astrophysics Data System (ADS)

Verma, Mudita; Vijayarajan, V.; Sivashanmugam, G.; Bessie Amali, D. Geraldine

2017-11-01

Hierarchical clustering is a widely used form of cluster analysis that builds a hierarchy of clusters. Deciding which cluster should be split requires evaluating a large number of observations. Here, a data set of US-based personalities is considered for clustering. Applying hierarchical clustering to the data set yields three clusters: politicians, sportspersons and musicians. In classification, the training set is the main factor that decides which category is assigned to each collected observation, and the category of the training observations must be known. Recognition is formulated as a classification problem. Classification is the main instance of supervised learning, whereas clustering is an instance of an unsupervised procedure. Clustering consists of grouping data that have similar properties, which are either their own or inherited from other sources.
Multiple goals, motivation and academic learning.

PubMed

Valle, Antonio; Cabanach, Ramón G; Núnez, José C; González-Pienda, Julio; Rodríguez, Susana; Piñeiro, Isabel

2003-03-01

The type of academic goals pursued by students is one of the most important variables in motivational research in educational contexts. Although motivational theory and research have emphasised the somewhat exclusive nature of two types of goal orientation (learning goals versus performance goals), some studies (Meece, 1994; Seifert, 1995, 1996) have shown that the two kinds of goals are relatively complementary and that it is possible for students to have multiple goals simultaneously, which guarantees some flexibility to adapt more efficaciously to various contexts and learning situations. The principal aim of this study is to determine the academic goals pursued by university students and to analyse the differences in several very significant variables related to motivation and academic learning. Participants were 609 university students (74% women and 26% men) who filled in several questionnaires about the variables under study. We used cluster analysis ('quick cluster analysis' method) to establish the different groups or clusters of individuals as a function of the three types of goals (learning goals, performance goals, and social reinforcement goals). By means of MANOVA, we determined whether the groups or clusters identified were significantly different in the variables that are relevant to motivation and academic learning. Lastly, we performed ANOVA on the variables that revealed significant effects in the previous analysis. Using cluster analysis, three groups of students with different motivational orientations were identified: a group with predominance of performance goals (Group PG: n = 230), a group with predominance of multiple goals (Group MG: n = 238), and a group with predominance of learning goals (Group LG: n = 141). 
Groups MG and LG attributed their success more to ability, they had higher perceived ability, they took task characteristics into account when planning which strategies to use in the learning process, they showed higher persistence, and used more deep learning strategies than did the students with predominance of performance goals (Group PG). On the other hand, Groups MG and PG took the evaluation criteria more into account when deciding which strategies to use in order to learn, and they attributed their failures more to luck than did Group LG. Students from Group MG attributed their success more to effort than did the other two groups and they attained higher achievement than Group PG. Group LG tended to attribute their failures more to lack of effort than did the other two groups.
Using sperm morphometry and multivariate analysis to differentiate species of gray Mazama

PubMed Central

Duarte, José Maurício Barbanti

2016-01-01

There is genetic evidence that the two species of Brazilian gray Mazama, Mazama gouazoubira and Mazama nemorivaga, belong to different genera. This study identified significant differences that separated them into distinct groups, based on characteristics of the spermatozoa and ejaculate of both species. The characteristics that most clearly differentiated between the species were ejaculate colour, white for M. gouazoubira and reddish for M. nemorivaga, and sperm head dimensions. Multivariate analysis of sperm head dimension and format data accurately discriminated three groups for the species, with a total misclassification percentage of 0.71. The individual analysis, by animal, and the multivariate analysis also correctly discriminated all five animals (total misclassification percentage of 13.95%), and the canonical plot showed three different clusters: Cluster 1, including individuals of M. nemorivaga; Cluster 2, including two individuals of M. gouazoubira; and Cluster 3, including a single individual of M. gouazoubira. The results obtained in this work corroborate the hypothesis of the formation of new genera and species for gray Mazama. Moreover, the easily applied method described herein can be used as an auxiliary tool to identify sibling species of other taxonomic groups. PMID:28018612
Down-Regulation of Olfactory Receptors in Response to Traumatic Brain Injury Promotes Risk for Alzheimers Disease

DTIC Science & Technology

2015-12-01

group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on...log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity metric. For...differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as

Investigating the health profile of patients with end-stage renal failure receiving peritoneal dialysis: a cluster analysis.

PubMed

Chan, M F; Wong, Frances K Y; Chow, Susan K Y

2010-03-01

To determine whether patients with end-stage renal failure can be differentiated into several subtypes based on five main variables. There is a lack of interventional research linked to clinical outcomes among patients with end-stage renal failure in Hong Kong, and no clear evidence of differences in their clinical/health outcomes and characteristics. A cross-sectional survey. Data were collected using a structured questionnaire. One hundred and fifty-three patients with end-stage renal failure were recruited during 2007 at three renal centres in Hong Kong. Five main variables were employed: predisposing characteristics, enabling resources, quality of life, symptom control and self-care adherence. A cluster analysis yielded two clusters. Each cluster represented a different profile of patients with end-stage renal failure. Cluster A consisted of 49.7% (n = 76) and Cluster B consisted of 50.3% (n = 77) of the patients. Cluster A patients, more of whom were women, were older, less educated, had higher quality of life scores, a better adherence rate and more had received nursing care support than patients in Cluster B. We have identified two groupings of patients with end-stage renal failure, each with a distinct health profile. Nursing support services may have an effect on patient health outcomes, but only for patients whose profile is similar to that of Cluster A and not for patients in Cluster B. A clear profile may help health care professionals devise appropriate strategies to target a specific group of patients to improve patient outcomes. The identification of risk for future health-care use could enable better targeting of interventional strategies in these groups. The results of this study might provide health care professionals with a model to design specific interventions to improve quality of life for each profile group.
Metabolic Analysis of Various Date Palm Fruit (Phoenix dactylifera L.) Cultivars from Saudi Arabia to Assess Their Nutritional Quality.

PubMed

Hamad, Ismail; AbdElgawad, Hamada; Al Jaouni, Soad; Zinta, Gaurav; Asard, Han; Hassan, Sherif; Hegab, Momtaz; Hagagy, Nashwa; Selim, Samy

2015-07-27

Date palm is an important crop, especially in the hot-arid regions of the world. Date palm fruits have high nutritional and therapeutic value and possess significant antibacterial and antifungal properties. In this study, we performed bioactivity analyses and metabolic profiling of date fruits of 12 cultivars from Saudi Arabia to assess their nutritional value. Our results showed that the date extracts from different cultivars have different free radical scavenging and anti-lipid peroxidation activities. Moreover, the cultivars showed significant differences in their chemical composition, e.g., the phenolic content (10.4-22.1 mg/100 g DW), amino acids (37-108 μmol·g⁻¹ FW) and minerals (237-969 mg/100 g DW). Principal component analysis (PCA) showed a clear separation of the cultivars into four different groups. The first group consisted of the Sokary and Nabtit Ali cultivars; the second group of Khlas Al Kharj, Khla Al Qassim, Mabroom and Khlas Al Ahsa; the third group of Khals Elshiokh, Nabot Saif and Khodry; and the fourth group of the Ajwa Al Madinah, Saffawy and Rashodia cultivars. Hierarchical cluster analysis (HCA) revealed clustering of date cultivars into two groups. The first cluster consisted of the Sokary, Rashodia and Nabtit Ali cultivars, and the second cluster contained all the other tested cultivars. These results indicate that date fruits have high nutritive value, and that different cultivars have different chemical compositions.
Atlas-guided cluster analysis of large tractography datasets.

PubMed

Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

2013-01-01

Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment.
Data Mining of University Philanthropic Giving: Cluster-Discriminant Analysis and Pareto Effects

ERIC Educational Resources Information Center

Le Blanc, Louis A.; Rucks, Conway T.

2009-01-01

A large sample of 33,000 university alumni records was cluster-analyzed to generate six groups relatively unique in their respective attribute values. The attributes used to cluster the former students included average gift to the university's foundation and to the alumni association for the same institution. Cluster detection is useful in this…
Phenotype in combination with genotype improves outcome prediction in acute myeloid leukemia: a report from Children’s Oncology Group protocol AAML0531

PubMed Central

Voigt, Andrew P.; Brodersen, Lisa Eidenschink; Alonzo, Todd A.; Gerbing, Robert B.; Menssen, Andrew J.; Wilson, Elisabeth R.; Kahwash, Samir; Raimondi, Susana C.; Hirsch, Betsy A.; Gamis, Alan S.; Meshinchi, Soheil; Wells, Denise A.; Loken, Michael R.

2017-01-01

Diagnostic biomarkers can be used to determine relapse risk in acute myeloid leukemia, and certain genetic aberrancies have prognostic relevance. A diagnostic immunophenotypic expression profile, which quantifies the amounts of distinct gene products, not just their presence or absence, was established in order to improve outcome prediction for patients with acute myeloid leukemia. The immunophenotypic expression profile, which defines each patient’s leukemia as a location in 15-dimensional space, was generated for 769 patients enrolled in the Children’s Oncology Group AAML0531 protocol. Unsupervised hierarchical clustering grouped patients with similar immunophenotypic expression profiles into eleven patient cohorts, demonstrating high associations among phenotype, genotype, morphology, and outcome. Of 95 patients with inv(16), 79% segregated in Cluster A. Of 109 patients with t(8;21), 92% segregated in Clusters A and B. Of 152 patients with 11q23 alterations, 78% segregated in Clusters D, E, F, G, or H. For both inv(16) and 11q23 abnormalities, differential phenotypic expression identified patient groups with different survival characteristics (P<0.05). Clinical outcome analysis revealed that Cluster B (predominantly t(8;21)) was associated with favorable outcome (P<0.001) and Clusters E, G, H, and K were associated with adverse outcomes (P<0.05). Multivariable regression analysis revealed that Clusters E, G, H, and K were independently associated with worse survival (P range <0.001 to 0.008). The Children’s Oncology Group AAML0531 trial: clinicaltrials.gov Identifier: 00372593. PMID:28883080
Variability in body size and shape of UK offshore workers: A cluster analysis approach.

PubMed

Stewart, Arthur; Ledingham, Robert; Williams, Hector

2017-01-01

Male UK offshore workers have enlarged dimensions compared with UK norms, and knowledge of the specific sizes and shapes typifying their physiques will assist a range of functions related to health and ergonomics. A representative sample of the UK offshore workforce (n = 588) underwent 3D photonic scanning, from which 19 extracted dimensional measures were used in k-means cluster analysis to characterise physique groups. Of the 11 resulting clusters, four somatotype groups were expressed: one cluster was muscular and lean, four had greater muscularity than adiposity, three had equal adiposity and muscularity and three had greater adiposity than muscularity. Some clusters appeared constitutionally similar to others, differing only in absolute size. These cluster centroids represent an evidence-base for future designs in apparel and other applications where body size and proportions affect functional performance. They also constitute phenotypic evidence providing insight into the 'offshore culture' which may underpin the enlarged dimensions of offshore workers. Copyright © 2016 Elsevier Ltd. All rights reserved.
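For readers unfamiliar with k-means, the following is a minimal sketch of Lloyd's algorithm, the usual iteration behind k-means cluster analysis. It is not the software used in the study: the two-dimensional measurements, the choice of the first k points as starting centroids and the function name are all assumptions made for illustration.

```python
from math import dist

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic start (the first k points as centroids)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Hypothetical (stature cm, waist girth cm) pairs for six people.
points = [(170, 80), (172, 82), (171, 81), (190, 110), (188, 108), (189, 109)]
labels, centroids = kmeans(points, 2)
```

The returned centroids play the role the abstract assigns to cluster centroids: a representative size and shape for each physique group. Because k-means uses raw distances, clusters can differ only in absolute size, as the abstract notes.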
Identifying Two Groups of Entitled Individuals: Cluster Analysis Reveals Emotional Stability and Self-Esteem Distinction.

PubMed

Crowe, Michael L; LoPilato, Alexander C; Campbell, W Keith; Miller, Joshua D

2016-12-01

The present study hypothesized that there exist two distinct groups of entitled individuals: grandiose-entitled, and vulnerable-entitled. Self-report scores of entitlement were collected for 916 individuals using an online platform. Model-based cluster analyses were conducted on the individuals with scores one standard deviation above mean (n = 159) using the five-factor model dimensions as clustering variables. The results support the existence of two groups of entitled individuals categorized as emotionally stable and emotionally vulnerable. The emotionally stable cluster reported emotional stability, high self-esteem, more positive affect, and antisocial behavior. The emotionally vulnerable cluster reported low self-esteem and high levels of neuroticism, disinhibition, conventionality, psychopathy, negative affect, childhood abuse, intrusive parenting, and attachment difficulties. Compared to the control group, both clusters reported being more antagonistic, extraverted, Machiavellian, and narcissistic. These results suggest important differences are missed when simply examining the linear relationships between entitlement and various aspects of its nomological network.
Analysis of genetic association using hierarchical clustering and cluster validation indices.

PubMed

Pagnuco, Inti A; Pastore, Juan I; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L

2017-10-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, based on some criteria of similarity. This task is usually performed by clustering algorithms, where the genes are clustered into meaningful groups based on their expression values in a set of experiments. In this work, we propose a method to find sets of co-expressed genes, based on cluster validation indices as a measure of similarity for individual gene groups, and a combination of variants of hierarchical clustering to generate the candidate groups. We evaluated its ability to retrieve significant sets on simulated correlated and real genomics data, where the performance is measured based on its detection ability of co-regulated sets against a full search. Additionally, we analyzed the quality of the best ranked groups using an online bioinformatics tool that provides network information for the selected genes. Copyright © 2017 Elsevier Inc. All rights reserved.
Identification of "binge-prone" women: an experimentally and psychometrically validated cluster analysis in a college population.

PubMed

Beebe, D W; Holmbeck, G N; Albright, J S; Noga, K; DeCastro, B

1995-01-01

This study investigated the escape model of binge eating through a cluster analysis using standardized measures. A sample of 126 undergraduate women underwent a manipulation of their level of cognition and were asked to "taste-test" several flavors of ice cream. Questionnaire data from these women were entered into a cluster analysis. Two groups emerged: women in the "binge-prone" group were significantly more depressed, had lower self-esteem, had more chaotic and extreme eating patterns, and were more self-conscious than those in the control group. In validation work, binge-prone women were shown to report elevated levels of bulimic symptomatology and, when in the presence of a food they enjoyed, to respond to increases in level of cognition by eating more. These results were consistent with some, but not all, of the components of the escape model.
Nursing home care quality: a cluster analysis.

PubMed

Grøndahl, Vigdis Abrahamsen; Fagerli, Liv Berit

2017-02-13

Purpose: The purpose of this paper is to explore potential differences in how nursing home residents rate care quality and to explore cluster characteristics. Design/methodology/approach: A cross-sectional design was used, with one questionnaire including questions from quality from patients' perspective and Big Five personality traits, together with questions related to socio-demographic aspects and health condition. Residents (n=103) from four Norwegian nursing homes participated (74.1 per cent response rate). Hierarchical cluster analysis identified clusters with respect to care quality perceptions. χ² tests and one-way between-groups ANOVA were performed to characterise the clusters (p<0.05). Findings: Two clusters were identified; Cluster 1 residents (28.2 per cent) had the best care quality perceptions and Cluster 2 (67.0 per cent) had the worst perceptions. The clusters were statistically significant and characterised by personal-related conditions: gender, psychological well-being, preferences, admission, satisfaction with staying in the nursing home, emotional stability and agreeableness, and by external objective care conditions: healthcare personnel and registered nurses. Research limitations/implications: Residents assessed as having no cognitive impairments were included, thus excluding the largest group. By choosing questionnaire design and structured interviews, the number able to participate may increase. Practical implications: Findings may provide healthcare personnel and managers with increased knowledge on which to develop strategies to improve specific care quality perceptions. Originality/value: Cluster analysis can be an effective tool for differentiating between nursing homes residents' care quality perceptions.
Cluster analysis differentiates high and low community functioning in schizophrenia: Subgroups differ on working memory but not other neurocognitive domains.

PubMed

Alden, Eva C; Cobia, Derin J; Reilly, James L; Smith, Matthew J

2015-10-01

Schizophrenia is characterized by impairment in multiple aspects of community functioning. Available literature suggests that community functioning may be enhanced through cognitive remediation, however, evidence is limited regarding whether specific neurocognitive domains may be treatment targets. We characterized schizophrenia subjects based on their level of community functioning through cluster analysis in an effort to identify whether specific neurocognitive domains were associated with variation in functioning. Schizophrenia (SCZ, n=60) and control (CON, n=45) subjects completed a functional capacity task, social competence role-play, functional attainment interview, and a neuropsychological battery. Multiple cluster analytic techniques were used on the measures of functioning in the schizophrenia subjects to generate functionally-defined subgroups. MANOVA evaluated between-group differences in neurocognition. The cluster analysis revealed two distinct groups, consisting of 36 SCZ characterized by high levels of community functioning (HF-SCZ) and 24 SCZ with low levels of community functioning (LF-SCZ). There was a main group effect for neurocognitive performance (p<0.001) with CON outperforming both SCZ groups in all neurocognitive domains. Post-hoc tests revealed that HF-SCZ had higher verbal working memory compared to LF-SCZ (p≤0.05, Cohen's d=0.78) but the two groups did not differ in remaining domains. The cluster analysis classified schizophrenia subjects in HF-SCZ and LF-SCZ using a multidimensional assessment of community functioning. Moreover, HF-SCZ demonstrated rather preserved verbal working memory relative to LF-SCZ. The results suggest that verbal working memory may play a critical role in community functioning, and is a potential cognitive treatment target for schizophrenia subjects. Copyright © 2015 Elsevier B.V. All rights reserved.
Population clustering based on copy number variations detected from next generation sequencing data.

PubMed

Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

2014-08-01

Copy number variations (CNVs) can be used as significant bio-markers and next generation sequencing (NGS) provides a high resolution detection of these CNVs. However, how to extract features from CNVs and apply them to genomic studies such as population clustering has become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results on the second real data analysis show that the proposed method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
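The decomposition step described above can be sketched with the classical multiplicative-update rules for NMF under a squared-error objective. This is an illustration of the general technique, not the authors' implementation: the tiny feature matrix, the rank, the seeded random initialization and all function names are assumptions.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frob_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor V (samples x features) into weights W (samples x k) and sources H (k x features)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    start = frob_error(v, w, h)
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)] for a in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)] for i in range(n)]
    return w, h, start, frob_error(v, w, h)

# Hypothetical feature matrix: rows are samples, columns are CNV features.
v = [[2, 3, 0, 0], [4, 6, 0, 0], [0, 0, 1, 2], [0, 0, 3, 6]]
w, h, start, end = nmf(v, k=2)

# Assign each sample to the source component with the largest weight.
groups = [max(range(2), key=lambda a: row[a]) for row in w]
```

Here the rows of `h` stand in for the source matrix of shared CNV patterns and `w` for the per-sample weight matrix; reading off the dominant weight per sample is one simple way to turn the factorization into a grouping.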
A framework to spatially cluster air pollution monitoring sites in US based on the PM2.5 composition

PubMed Central

Austin, Elena; Coull, Brent A.; Zanobetti, Antonella; Koutrakis, Petros

2013-01-01

Background: Heterogeneity in the response to PM2.5 is hypothesized to be related to differences in particle composition across monitoring sites which reflect differences in source types as well as climatic and topographic conditions impacting different geographic locations. Identifying spatial patterns in particle composition is a multivariate problem that requires novel methodologies. Objectives: Use cluster analysis methods to identify spatial patterns in PM2.5 composition. Verify that the resulting clusters are distinct and informative. Methods: 109 monitoring sites with 75% reported speciation data during the period 2003–2008 were selected. These sites were categorized based on their average PM2.5 composition over the study period using k-means cluster analysis. The obtained clusters were validated and characterized based on their physico-chemical characteristics, geographic locations, emissions profiles, population density and proximity to major emission sources. Results: Overall 31 clusters were identified. These include 21 clusters with 2 or more sites which were further grouped into 4 main types using hierarchical clustering. The resulting groupings are chemically meaningful and represent broad differences in emissions. The remaining clusters, encompassing single sites, were characterized based on their particle composition and geographic location. Conclusions: The framework presented here provides a novel tool which can be used to identify and further classify sites based on their PM2.5 composition. The solution presented is fairly robust and yielded groupings that were meaningful in the context of air-pollution research. PMID:23850585
[Achene morphology cluster analysis of Taraxacum F. H. Wigg. from northeast China and molecule systematics evidence determined by SRAP].

PubMed

Li, Hai-juan; Zhao, Xin; Jia, Qing-fei; Li, Tian-lai; Ning, Wei

2012-08-01

The morphological and micro-morphological characteristics of the achenes of six species of the genus Taraxacum from northeastern China were observed, and SRAP cluster analysis was performed, to provide evidence for their classification. The achenes were observed by microscope and EPMA. Cluster analysis was based on the size, shape, cone proportion, color and surface sculpture of the achenes. Differences in achene morphology among Taraxacum species were obvious, particularly in spinule distribution and size, achene color and achene size. According to the achene morphology clustering, T. antungense Kitag. and T. urbanum Kitag. should be combined into a single species. The clustering results from achene morphology and from SRAP molecular markers were consistent with the division of groups in "the Chinese flora". The achene morphological characteristics of Taraxacum are stable and conservative, so species delimitation and analysis of relationships among species can be based on differences in combinations of achene characteristics. Both the achene morphology cluster analysis and the SRAP molecular systematics support the dandelion classification of "the Chinese flora".
Exploratory Item Classification Via Spectral Graph Clustering

PubMed Central

Chen, Yunxiao; Li, Xiaoou; Liu, Jingchen; Xu, Gongjun; Ying, Zhiliang

2017-01-01

Large-scale assessments are supported by a large item pool. An important task in test development is to assign items into scales that measure different characteristics of individuals, and a popular approach is cluster analysis of items. Classical methods in cluster analysis, such as the hierarchical clustering, K-means method, and latent-class analysis, often induce a high computational overhead and have difficulty handling missing data, especially in the presence of high-dimensional responses. In this article, the authors propose a spectral clustering algorithm for exploratory item cluster analysis. The method is computationally efficient, effective for data with missing or incomplete responses, easy to implement, and often outperforms traditional clustering algorithms in the context of high dimensionality. The spectral clustering algorithm is based on graph theory, a branch of mathematics that studies the properties of graphs. The algorithm first constructs a graph of items, characterizing the similarity structure among items. It then extracts item clusters based on the graphical structure, grouping similar items together. The proposed method is evaluated through simulations and an application to the revised Eysenck Personality Questionnaire. PMID:29033476
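The graph-based idea in this abstract (build a similarity graph over items, then read clusters off its spectrum) can be illustrated with the simplest spectral method: a two-way split by the sign of the Fiedler vector of the graph Laplacian L = D - A. This is a generic sketch and not the authors' algorithm; the adjacency matrix is hypothetical, and shifted power iteration is used only to keep the example free of external libraries.

```python
def fiedler_partition(adj, iters=200):
    """Split a connected graph in two using the sign of the Fiedler vector of L = D - A."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    shift = 2 * max(deg)  # upper bound on the largest eigenvalue of L
    # Deterministic start vector that is orthogonal to the constant vector.
    v = [i - (n - 1) / 2 for i in range(n)]
    for _ in range(iters):
        # Multiply by (shift*I - L) so the smallest non-trivial eigenvector of L dominates.
        w = [(shift - deg[i]) * v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        m = sum(w) / n
        w = [x - m for x in w]  # remove the constant (eigenvalue 0) component
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [0 if x < 0 else 1 for x in v]

# Hypothetical item-similarity graph: two triangles of similar items joined by one weak link.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = fiedler_partition(adj)
```

Methods for more than two clusters typically embed the items with several leading eigenvectors and then cluster the embedding; the sketch above shows only the core step of cutting the graph where it is most weakly connected.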
Elements concentration analysis in groundwater from the North Serra Geral aquifer in Santa Helena-Brazil using SR-TXRF spectrometer.

PubMed

Justen, Gisele C; Espinoza-Quiñones, Fernando R; Módenes, Aparecido Nivaldo; Bergamasco, Rosangela

2012-01-01

In this work, element concentrations in groundwater were analysed using the synchrotron radiation total-reflection X-ray fluorescence (SR-TXRF) technique. A set of nine tube-wells with serious risk of contamination was chosen to monitor the mean concentration of elements in groundwater from the North Serra Geral aquifer in Santa Helena, Brazil, during 1 year. Element concentrations were determined applying a SR-TXRF methodology. The accuracy of the SR-TXRF technique was validated by analysis of a certified reference material. As the groundwater composition in the North Serra Geral aquifer showed heterogeneity in the spatial distribution of eight major elements, hierarchical clustering of the data was performed. Based on the similarity of their compositions, two of the nine wells were grouped in a first cluster, while the other seven were grouped in a second cluster. Calcium was the major element in all wells, with higher Ca concentration in the second cluster than in the first cluster. However, concentrations of Ti, V and Cr in the first cluster are slightly higher than those in the second cluster. The findings of this study within a monitoring program of tube-wells could provide a useful assessment of controls over groundwater composition and support management at regional level.
Cluster analysis of molecular simulation trajectories for systems where both conformation and orientation of the sampled states are important.

PubMed

Abramyan, Tigran M; Snyder, James A; Thyparambil, Aby A; Stuart, Steven J; Latour, Robert A

2016-08-05

Clustering methods have been widely used to group together similar conformational states from molecular simulations of biomolecules in solution. For applications such as the interaction of a protein with a surface, the orientation of the protein relative to the surface is also an important clustering parameter because of its potential effect on adsorbed-state bioactivity. This study presents cluster analysis methods that are specifically designed for systems where both molecular orientation and conformation are important, and the methods are demonstrated using test cases of adsorbed proteins for validation. Additionally, because cluster analysis can be a very subjective process, an objective procedure for identifying both the optimal number of clusters and the best clustering algorithm to be applied to analyze a given dataset is presented. The method is demonstrated for several agglomerative hierarchical clustering algorithms used in conjunction with three cluster validation techniques. © 2016 Wiley Periodicals, Inc.
Country clustering applied to the water & sanitation sector: a new tool with potential applications in research & policy

PubMed Central

Onda, Kyle; Crocker, Jonny; Kayser, Georgia Lyn; Bartram, Jamie

2013-01-01

The fields of global health and international development commonly cluster countries by geography and income to target resources and describe progress. For any given sector of interest, a range of relevant indicators can serve as a more appropriate basis for classification. We create a new typology of country clusters specific to the water and sanitation (WatSan) sector based on similarities across multiple WatSan-related indicators. After a literature review and consultation with experts in the WatSan sector, nine indicators were selected. Indicator selection was based on relevance to and suggested influence on national water and sanitation service delivery, and to maximize data availability across as many countries as possible. A hierarchical clustering method and a gap statistic analysis were used to group countries into a natural number of relevant clusters. Two stages of clustering resulted in five clusters, representing 156 countries or 6.75 billion people. The five clusters were not well explained by income or geography, and were distinct from existing country clusters used in international development. Analysis of these five clusters revealed that they were more compact and well separated than United Nations and World Bank country clusters. This analysis and resulting country typology suggest that previous geography- or income-based country groupings can be improved upon for applications in the WatSan sector by utilizing globally available WatSan-related indicators. Potential applications include guiding and discussing research, informing policy, improving resource targeting, describing sector progress, and identifying critical knowledge gaps in the WatSan sector. PMID:24054545
Down-Regulation of Olfactory Receptors in Response to Traumatic Brain Injury Promotes Risk for Alzheimer’s Disease

DTIC Science & Technology

2013-10-01

correct group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on...centering of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity met...A) The 108 differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with
Cluster Analysis in Sociometric Research: A Pattern-Oriented Approach to Identifying Temporally Stable Peer Status Groups of Girls

ERIC Educational Resources Information Center

Zettergren, Peter

2007-01-01

A modern clustering technique was applied to age-10 and age-13 sociometric data with the purpose of identifying longitudinally stable peer status clusters. The study included 445 girls from a Swedish longitudinal study. The identified temporally stable clusters of rejected, popular, and average girls were essentially larger than corresponding…

Accounting for multiple births in randomised trials: a systematic review.

PubMed

Yelland, Lisa Nicole; Sullivan, Thomas Richard; Makrides, Maria

2015-03-01

Multiple births are an important subgroup to consider in trials aimed at reducing preterm birth or its consequences. Including multiples results in a unique mixture of independent and clustered data, which has implications for the design, analysis and reporting of the trial. We aimed to determine how multiple births were taken into account in the design and analysis of recent trials involving preterm infants, and whether key information relevant to multiple births was reported. We conducted a systematic review of multicentre randomised trials involving preterm infants published between 2008 and 2013. Information relevant to multiple births was extracted. Of the 56 trials included in the review, 6 (11%) excluded multiples and 24 (43%) failed to indicate whether multiples were included. Among the 26 trials that reported multiples were included, only one (4%) accounted for clustering in the sample size calculations and eight (31%) took the clustering into account in the analysis of the primary outcome. Of the 20 trials that randomised infants, 12 (60%) failed to report how infants from the same birth were randomised. Information on multiple births is often poorly reported in trials involving preterm infants, and clustering due to multiple births is rarely taken into account. Since ignoring clustering could result in inappropriate recommendations for clinical practice, clustering should be taken into account in the design and analysis of future neonatal and perinatal trials including infants from a multiple birth. Published by the BMJ Publishing Group Limited.
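One common way to allow for clustering in a sample size calculation is the standard design effect, 1 + (m - 1) x ICC, where m is the average cluster size and ICC is the intracluster correlation. The review does not say which formula the included trials used, so the sketch below is only a generic illustration; the function names and the input values (200 infants, 1.5 infants per birth, ICC of 0.5) are hypothetical.

```python
import math

def design_effect(mean_cluster_size, icc):
    """Standard design effect 1 + (m - 1) * ICC for clusters of average size m."""
    return 1.0 + (mean_cluster_size - 1.0) * icc

def inflated_sample_size(n_independent, mean_cluster_size, icc):
    """Sample size after inflating an independent-data calculation for clustering."""
    return math.ceil(n_independent * design_effect(mean_cluster_size, icc))

# Hypothetical inputs: 200 infants needed if all outcomes were independent,
# 1.5 infants per birth on average, ICC of 0.5 between infants from one birth.
deff = design_effect(1.5, 0.5)
n_required = inflated_sample_size(200, 1.5, 0.5)
```

With these made-up inputs the design effect is 1.25, so the independent-data sample size of 200 becomes 250.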
Using Cluster Analysis and ICP-MS to Identify Groups of Ecstasy Tablets in Sao Paulo State, Brazil.

PubMed

Maione, Camila; de Oliveira Souza, Vanessa Cristina; Togni, Loraine Rezende; da Costa, José Luiz; Campiglia, Andres Dobal; Barbosa, Fernando; Barbosa, Rommel Melgaço

2017-11-01

The variations found in the elemental composition of ecstasy samples result in spectral profiles with useful information for data analysis, and cluster analysis of these profiles can help uncover different categories of the drug. We provide a cluster analysis of ecstasy tablets based on their elemental composition. Twenty-five elements were determined by ICP-MS in tablets seized by Sao Paulo's State Police, Brazil. We employ the K-means clustering algorithm along with a C4.5 decision tree to help us interpret the clustering results. We found that two clusters best described the data, which may reflect the approximate number of sources of the drug supplying the cities where the seizures occurred. The C4.5 model was capable of differentiating the ecstasy samples from the two clusters with high prediction accuracy using leave-one-out cross-validation. The model used only Nd, Ni, and Pb concentration values in the classification of the samples. © 2017 American Academy of Forensic Sciences.
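For readers unfamiliar with K-means, the following minimal sketch shows the assign-then-update loop on made-up two-element concentration profiles. It is not the authors' pipeline (which also involved a C4.5 decision tree): the data values are invented, and the deterministic farthest-point initialisation is an assumption made here so the result is reproducible.

```python
import math

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    # Assumed initialisation: start from the first point, then repeatedly add
    # the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Invented concentrations of two elements (arbitrary units) for six tablets.
tablets = [(1.0, 2.0), (1.2, 1.8), (0.8, 2.1), (8.0, 9.0), (8.3, 9.2), (7.9, 8.7)]
labels, centroids = kmeans(tablets, k=2)
```

In practice the number of clusters would be chosen with a validity index rather than fixed in advance, and the features would usually be standardized first.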
Impact of a Participatory Intervention with Women’s Groups on Psychological Distress among Mothers in Rural Bangladesh: Secondary Analysis of a Cluster-Randomised Controlled Trial

PubMed Central

Clarke, Kelly; Azad, Kishwar; Kuddus, Abdul; Shaha, Sanjit; Nahar, Tasmin; Aumon, Bedowra Haq; Hossen, Mohammed Munir; Beard, James; Costello, Anthony; Houweling, Tanja A. J.; Prost, Audrey; Fottrell, Edward

2014-01-01

Background Perinatal common mental disorders (PCMDs) are a major cause of disability among women and disproportionately affect lower income countries. Interventions to address PCMDs are urgently needed in these settings, and group-based and peer-led approaches are potential strategies to increase access to mental health interventions. Participatory women’s health groups led by local women previously reduced postpartum psychological distress in eastern India. We assessed the effect of a similar intervention on postpartum psychological distress in rural Bangladesh. Method We conducted a secondary analysis of data from a cluster-randomised controlled trial with 18 clusters and an estimated population of 532,996. Nine clusters received an intervention comprising monthly meetings during which women’s groups worked through a participatory learning and action cycle to develop strategies for improving women’s and children’s health. There was one group for every 309 individuals in the population, 810 groups in total. Mothers in nine control clusters had access to usual perinatal care. Postpartum psychological distress was measured with the 20-item Self Reporting Questionnaire (SRQ-20) between six and 52 weeks after delivery, during the months of January to April, in 2010 and 2011. Results We analysed outcomes for 6275 mothers. Although the cluster mean SRQ-20 score was lower in the intervention arm (mean 5.2, standard deviation 1.8) compared to control (5.3, 1.2), the difference was not significant (β 1.44, 95% CI 0.28, 3.08). Conclusions Despite promising results in India, participatory women’s groups focused on women’s and children’s health had no significant effect on postpartum psychological distress in rural Bangladesh. PMID:25329470
Identifying Peer Institutions Using Cluster Analysis

ERIC Educational Resources Information Center

Boronico, Jess; Choksi, Shail S.

2012-01-01

The New York Institute of Technology's (NYIT) School of Management (SOM) wishes to develop a list of peer institutions for the purpose of benchmarking and monitoring/improving performance against other business schools. The procedure utilizes relevant criteria for the purpose of establishing this peer group by way of a cluster analysis. The…
Determining the trophic guilds of fishes and macroinvertebrates in a seagrass food web

USGS Publications Warehouse

Luczkovich, J.J.; Ward, G.P.; Johnson, J.C.; Christian, R.R.; Baird, D.; Neckles, H.; Rizzo, W.M.

2002-01-01

We established trophic guilds of macroinvertebrate and fish taxa using correspondence analysis and a hierarchical clustering strategy for a seagrass food web in winter in the northeastern Gulf of Mexico. To create the diet matrix, we characterized the trophic linkages of macroinvertebrate and fish taxa present in Halodule wrightii seagrass habitat areas within the St. Marks National Wildlife Refuge (Florida) using binary data, combining dietary links obtained from relevant literature for macroinvertebrates with stomach analysis of common fishes collected during January and February of 1994. Hierarchical average-linkage cluster analysis of the 73 taxa of fishes and macroinvertebrates in the diet matrix yielded 14 clusters with diet similarity greater than or equal to 0.60. We then used correspondence analysis with three factors to jointly plot the coordinates of the consumers (identified by cluster membership) and of the 33 food sources. Correspondence analysis served as a visualization tool for assigning each taxon to one of eight trophic guilds: herbivores, detritivores, suspension feeders, omnivores, molluscivores, meiobenthos consumers, macrobenthos consumers, and piscivores. These trophic groups, cross-classified with major taxonomic groups, were further used to develop consumer compartments in a network analysis model of carbon flow in this seagrass ecosystem. The method presented here should greatly improve the development of future network models of food webs by providing an objective procedure for aggregating trophic groups.
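The clustering step in this abstract (average-linkage grouping of binary diet data, stopped at a similarity of 0.60) can be sketched as follows. The abstract does not name the similarity coefficient, so the Jaccard index used here is an assumption, and the taxon names and diets are invented for illustration.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two sets of food sources (assumed coefficient)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def average_linkage(diets, threshold=0.60):
    """Merge the most similar pair of clusters until no pair reaches threshold.
    Similarity between clusters = mean pairwise similarity of their members."""
    clusters = [[name] for name in diets]

    def avg_sim(c1, c2):
        total = sum(jaccard(diets[x], diets[y]) for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_sim(clusters[p[0]], clusters[p[1]]))
        if avg_sim(clusters[i], clusters[j]) < threshold:
            break  # remaining clusters are too dissimilar to merge
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

# Invented binary diet data: each consumer maps to the set of foods it eats.
diets = {
    "grazer_a": {"seagrass", "epiphytes", "detritus"},
    "grazer_b": {"seagrass", "epiphytes"},
    "predator_a": {"shrimp", "small_fish", "crabs"},
    "predator_b": {"shrimp", "small_fish", "crabs", "polychaetes"},
    "filter_feeder": {"phytoplankton"},
}
guild_clusters = average_linkage(diets)
```

Here the two grazers (similarity 2/3) and the two predators (similarity 3/4) merge, while the filter feeder stays on its own because it shares no food source with any other taxon.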
Groundwater quality assessment of urban Bengaluru using multivariate statistical techniques

NASA Astrophysics Data System (ADS)

Gulgundi, Mohammad Shahid; Shetty, Amba

2018-03-01

Groundwater quality deterioration due to anthropogenic activities has become a subject of prime concern. The objective of the study was to assess the spatial and temporal variations in groundwater quality and to identify the sources in the western half of Bengaluru city using multivariate statistical techniques. A water quality index rating was calculated for the pre- and post-monsoon seasons to quantify overall water quality for human consumption. The post-monsoon samples show signs of poorer quality for drinking purposes compared to pre-monsoon samples. Cluster analysis (CA), principal component analysis (PCA) and discriminant analysis (DA) were applied to the groundwater quality data measured on 14 parameters from 67 sites distributed across the city. Hierarchical CA grouped the 67 sampling stations into two groups, cluster 1 having high pollution and cluster 2 having lesser pollution. DA was applied to delineate the most meaningful parameters accounting for temporal and spatial variations in groundwater quality of the study area. Temporal DA identified pH as the most important parameter, which discriminates between water quality in the pre-monsoon and post-monsoon seasons and accounts for 72% seasonal assignation of cases. Spatial DA identified Mg, Cl and NO3 as the three most important parameters discriminating between the two clusters and accounting for 89% spatial assignation of cases. PCA was applied to the dataset obtained from the two clusters and yielded three factors in each cluster, explaining 85.4% and 84% of the total variance, respectively. Varifactors obtained from PCA showed that groundwater quality variation is mainly explained by dissolution of minerals from rock-water interactions in the aquifer, the effect of anthropogenic activities and ion exchange processes in water.
Ecological tolerances of Miocene larger benthic foraminifera from Indonesia

NASA Astrophysics Data System (ADS)

Novak, Vibor; Renema, Willem

2018-01-01

To provide a comprehensive palaeoenvironmental reconstruction based on larger benthic foraminifera (LBF), a quantitative analysis of their assemblage composition is needed. Besides microfacies analysis which includes environmental preferences of foraminiferal taxa, statistical analyses should also be employed. Therefore, detrended correspondence analysis and cluster analysis were performed on relative abundance data of identified LBF assemblages deposited in mixed carbonate-siliciclastic (MCS) systems and blue-water (BW) settings. Studied MCS system localities include ten sections from the central part of the Kutai Basin in East Kalimantan, ranging from late Burdigalian to Serravallian age. The BW samples were collected from eleven sections of the Bulu Formation on Central Java, dated as Serravallian. Results from detrended correspondence analysis reveal significant differences between these two environmental settings. Cluster analysis produced five clusters of samples: clusters 1 and 2 comprise dominantly MCS samples, clusters 3 and 4 are dominated by BW samples, and cluster 5 shows a mixed composition with both MCS and BW samples. The results of cluster analysis were then subjected to indicator species analysis, which distinguished three groups among LBF taxa: typical assemblage indicators, regularly occurring taxa and rare taxa. By interpreting the results of detrended correspondence analysis, cluster analysis and indicator species analysis, along with environmental preferences of identified LBF taxa, a palaeoenvironmental model is proposed for the distribution of LBF in Miocene MCS systems and adjacent BW settings of Indonesia.
Creating peer groups for assessing and comparing nursing home performance.

PubMed

Byrne, Margaret M; Daw, Christina; Pietz, Ken; Reis, Brian; Petersen, Laura A

2013-11-01

Publicly reported performance data for hospitals and nursing homes are becoming ubiquitous. For such comparisons to be fair, facilities must be compared with their peers. To adapt a previously published methodology for developing hospital peer groupings so that it is applicable to nursing homes and to explore the characteristics of "nearest-neighbor" peer groupings. Analysis of Department of Veterans Affairs administrative databases and nursing home facility characteristics. The nearest-neighbor methodology for developing peer groupings involves calculating the Euclidean distance between facilities based on facility characteristics. We describe our steps in selection of facility characteristics, describe the characteristics of nearest-neighbor peer groups, and compare them with peer groups derived through classical cluster analysis. The facility characteristics most pertinent to nursing home groupings were found to be different from those that were most relevant for hospitals. Unlike classical cluster groups, nearest neighbor groups are not mutually exclusive, and the nearest-neighbor methodology resulted in nursing home peer groupings that were substantially less diffuse than nursing home peer groups created using traditional cluster analysis. It is essential that healthcare policy makers and administrators have a means of fairly grouping facilities for the purposes of quality, cost, or efficiency comparisons. In this research, we show that a previously published methodology can be successfully applied to a nursing home setting. The same approach could be applied in other clinical settings such as primary care.
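The nearest-neighbor idea in this abstract (Euclidean distance between facilities on their characteristics, with each facility receiving its own peer group) can be sketched as below. The z-score standardization, the choice of k, the function name, and the facility data are all assumptions for illustration; the abstract does not specify them.

```python
import math
import statistics

def nearest_neighbor_peers(facilities, k=2):
    """Return, for each facility, its k closest facilities by Euclidean
    distance on z-scored characteristics (standardization is assumed here)."""
    names = list(facilities)
    columns = list(zip(*facilities.values()))
    means = [statistics.mean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    z = {name: [(v - m) / s for v, m, s in zip(facilities[name], means, sds)]
         for name in names}
    peers = {}
    for name in names:
        others = sorted((o for o in names if o != name),
                        key=lambda o: math.dist(z[name], z[o]))
        peers[name] = others[:k]
    return peers

# Invented facilities described by (number of beds, percent long-stay residents).
facilities = {
    "A": (50, 20),
    "B": (60, 25),
    "C": (75, 30),
    "D": (150, 70),
    "E": (165, 75),
}
peers = nearest_neighbor_peers(facilities, k=2)
```

In this toy example facility C appears in the peer group of D, but D does not appear in the peer group of C, which illustrates why nearest-neighbor groups are not mutually exclusive the way classical clusters are.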
Galaxies in X-ray Selected Clusters and Groups in Dark Energy Survey Data: Stellar Mass Growth of Bright Central Galaxies Since z~1.2

DOE PAGES

Zhang, Y.; Miller, C.; McKay, T.; ...

2016-01-10

Using the science verification data of the Dark Energy Survey for a new sample of 106 X-ray selected clusters and groups, we study the stellar mass growth of bright central galaxies (BCGs) since redshift z ~ 1.2. Compared with the expectation in a semi-analytical model applied to the Millennium Simulation, the observed BCGs become under-massive/under-luminous with decreasing redshift. We incorporate the uncertainties associated with cluster mass, redshift, and BCG stellar mass measurements into analysis of a redshift-dependent BCG-cluster mass relation.
Cluster Analysis of Time-Dependent Crystallographic Data: Direct Identification of Time-Independent Structural Intermediates

PubMed Central

Kostov, Konstantin S.; Moffat, Keith

2011-01-01

The initial output of a time-resolved macromolecular crystallography experiment is a time-dependent series of difference electron density maps that displays the time-dependent changes in underlying structure as a reaction progresses. The goal is to interpret such data in terms of a small number of crystallographically refinable, time-independent structures, each associated with a reaction intermediate; to establish the pathways and rate coefficients by which these intermediates interconvert; and thereby to elucidate a chemical kinetic mechanism. One strategy toward achieving this goal is to use cluster analysis, a statistical method that groups objects based on their similarity. If the difference electron density at a particular voxel in the time-dependent difference electron density (TDED) maps is sensitive to the presence of one and only one intermediate, then its temporal evolution will exactly parallel the concentration profile of that intermediate with time. The rationale is therefore to cluster voxels with respect to the shapes of their TDEDs, so that each group or cluster of voxels corresponds to one structural intermediate. Clusters of voxels whose TDEDs reflect the presence of two or more specific intermediates can also be identified. From such groupings one can then infer the number of intermediates, obtain their time-independent difference density characteristics, and refine the structure of each intermediate. We review the principles of cluster analysis and clustering algorithms in a crystallographic context, and describe the application of the method to simulated and experimental time-resolved crystallographic data for the photocycle of photoactive yellow protein. PMID:21244840
Characterization and virulence clustering analysis of extraintestinal pathogenic Escherichia coli isolated from swine in China.

PubMed

Zhu, Yinchu; Dong, Wenyang; Ma, Jiale; Yuan, Lvfeng; Hejair, Hassan M A; Pan, Zihao; Liu, Guangjin; Yao, Huochun

2017-04-08

Swine extraintestinal pathogenic Escherichia coli (ExPEC) is an important pathogen that leads to economic and welfare costs in the swine industry worldwide, and is occurring with increasing frequency in China. So far, various virulence factors have been recognized in ExPEC. Here, we investigated the virulence genotypes and clonal structure of collected strains to improve the knowledge of phylogenetic traits of porcine ExPECs in China. We isolated 64 Chinese porcine ExPEC strains from 2013 to 2014 in China. By multiplex PCR, the distribution of isolates belonging to phylogenetic groups B1, B2, A and D was 9.4%, 10.9%, 57.8% and 21.9%, respectively. Nineteen virulence-related genes were detected by PCR assay; ompA, fimH, vat, traT and iutA were highly prevalent. Virulence-related genes were remarkably more prevalent in group B2 than in groups A, B1 and D; notably, usp, cnf1, hlyD, papA and ibeA were only found in group B2 strains. Genotyping analysis was performed and four clusters of strains (named I to IV) were identified. Cluster IV contained all isolates from group B2 and Cluster IV isolates had the strongest pathogenicity in a mouse infection model. As phylogenetic group B2 and D ExPEC isolates are generally considered virulent, multilocus sequence typing (MLST) analysis was performed for these isolates to further investigate genetic relationships. Two novel sequence types, ST5170 and ST5171, were discovered. Among the nine clonal complexes identified among our group B2 and D isolates, CC12 and CC95 have been indicated to have high zoonotic pathogenicity. The distinction between group B2 and non-B2 isolates in virulence and genotype accorded with MLST analysis. This study reveals significant genetic diversity among ExPEC isolates and helps us to better understand their pathogenesis. Importantly, our data suggest group B2 (Cluster IV) strains have the highest risk of causing animal disease and illustrate the correlation between genotype and virulence.
Atlas-Guided Cluster Analysis of Large Tractography Datasets

PubMed Central

Ros, Christian; Güllmar, Daniel; Stenzel, Martin; Mentzel, Hans-Joachim; Reichenbach, Jürgen Rainer

2013-01-01

Diffusion Tensor Imaging (DTI) and fiber tractography are important tools to map the cerebral white matter microstructure in vivo and to model the underlying axonal pathways in the brain with three-dimensional fiber tracts. As the fast and consistent extraction of anatomically correct fiber bundles for multiple datasets is still challenging, we present a novel atlas-guided clustering framework for exploratory data analysis of large tractography datasets. The framework uses a hierarchical cluster analysis approach that exploits the inherent redundancy in large datasets to time-efficiently group fiber tracts. Structural information of a white matter atlas can be incorporated into the clustering to achieve an anatomically correct and reproducible grouping of fiber tracts. This approach facilitates not only the identification of the bundles corresponding to the classes of the atlas; it also enables the extraction of bundles that are not present in the atlas. The new technique was applied to cluster datasets of 46 healthy subjects. Prospects of automatic and anatomically correct as well as reproducible clustering are explored. Reconstructed clusters were well separated and showed good correspondence to anatomical bundles. Using the atlas-guided cluster approach, we observed consistent results across subjects with high reproducibility. In order to investigate the outlier elimination performance of the clustering algorithm, scenarios with varying amounts of noise were simulated and clustered with three different outlier elimination strategies. By exploiting the multithreading capabilities of modern multiprocessor systems in combination with novel algorithms, our toolkit clusters large datasets in a couple of minutes. Experiments were conducted to investigate the achievable speedup and to demonstrate the high performance of the clustering framework in a multiprocessing environment. PMID:24386292
Novel approach to characterising individuals with low back-related leg pain: cluster identification with latent class analysis and 12-month follow-up.

PubMed

Stynes, Siobhán; Konstantinou, Kika; Ogollah, Reuben; Hay, Elaine M; Dunn, Kate M

2018-04-01

Traditionally, low back-related leg pain (LBLP) is diagnosed clinically as referred leg pain or sciatica (nerve root involvement). However, within the spectrum of LBLP, we hypothesised that there may be other unrecognised patient subgroups. This study aimed to identify clusters of patients with LBLP using latent class analysis and describe their clinical course. The study population was 609 LBLP primary care consulters. Variables from clinical assessment were included in the latent class analysis. Characteristics of the statistically identified clusters were compared, and their clinical course over 1 year was described. A 5-cluster solution was optimal. Cluster 1 (n = 104) had mild leg pain severity and was considered to represent a referred leg pain group with no clinical signs suggesting nerve root involvement (sciatica). Cluster 2 (n = 122), cluster 3 (n = 188), and cluster 4 (n = 69) had mild, moderate, and severe pain and disability, respectively, and response to clinical assessment items suggested categories of mild, moderate, and severe sciatica. Cluster 5 (n = 126) had high pain and disability, longer pain duration, and more comorbidities and was difficult to map to a clinical diagnosis. Most improvement for pain and disability was seen in the first 4 months for all clusters. At 12 months, the proportion of patients reporting recovery ranged from 27% for cluster 5 to 45% for cluster 2 (mild sciatica). This is the first study that empirically shows the variability in profile and clinical course of patients with LBLP including sciatica. More homogeneous groups were identified, which could be considered in future clinical and research settings.
Clinical Phenotype of Diabetic Peripheral Neuropathy and Relation to Symptom Patterns: Cluster and Factor Analysis in Patients with Type 2 Diabetes in Korea.

PubMed

Won, Jong Chul; Im, Yong-Jin; Lee, Ji-Hyun; Kim, Chong Hwa; Kwon, Hyuk Sang; Cha, Bong-Yun; Park, Tae Sun

2017-01-01

Diabetic peripheral neuropathy (DPN) is the most common complication of diabetes. However, patients usually suffer not only from diverse sensory deficits but also from neuropathy-related discomforts. The aim of this study is to identify distinct groups of patients with DPN with respect to its clinical impacts on symptom patterns and comorbidities. A hierarchical cluster analysis and factor analysis were performed to identify relevant subgroups of patients with DPN (n = 1338) and symptom patterns. Patients with DPN were divided into three clusters: asymptomatic (cluster 1, n = 448, 33.5%), moderate symptoms with disturbed sleep (cluster 2, n = 562, 42.0%), and severe symptoms with decreased quality of life (cluster 3, n = 328, 24.5%). Patients in cluster 3, compared with clusters 1 and 2, were characterized by higher levels of HbA1c and more severe pain and physical impairments. Patients in cluster 2 had moderate pain levels but disturbed sleep patterns comparable to those in cluster 3. The frequency of symptoms on each item of MNSI by "painful" symptom pattern showed a similar distribution pattern with increasing intensities along the three clusters. Cluster and factor analysis endorsed the use of comprehensive and symptomatic subgrouping to individualize the evaluation of patients with DPN.
Who are the obese? A cluster analysis exploring subgroups of the obese.

PubMed

Green, M A; Strong, M; Razak, F; Subramanian, S V; Relton, C; Bissell, P

2016-06-01

Body mass index (BMI) can be used to group individuals in terms of their height and weight as obese. However, such a distinction fails to account for the variation within this group across other factors such as health, demographic and behavioural characteristics. The study aims to examine the existence of subgroups of obese individuals. Data were taken from the Yorkshire Health Study (2010-12) including information on demographic, health and behavioural characteristics. Individuals with a BMI of ≥30 were included. A two-step cluster analysis was used to define groups of individuals who shared common characteristics. The cluster analysis found six distinct groups of individuals whose BMI was ≥30. These subgroups were heavy drinking males; young healthy females; the affluent and healthy elderly; the physically sick but happy elderly; the unhappy and anxious middle aged; and a cluster with the poorest health. It is important to account for the heterogeneity within individuals who are obese. Interventions introduced by clinicians and policymakers should not target obese individuals as a whole but tailor strategies depending upon the subgroups that individuals belong to. © The Author 2015. Published by Oxford University Press on behalf of Faculty of Public Health.
Undergraduate ALFALFA Team: Analysis of Spatially-Resolved Star-Formation in Nearby Galaxy Groups and Clusters

NASA Astrophysics Data System (ADS)

Finn, Rose; Collova, Natasha; Spicer, Sandy; Whalen, Kelly; Koopmann, Rebecca A.; Durbala, Adriana; Haynes, Martha P.; Undergraduate ALFALFA Team

2017-01-01

As part of the Undergraduate ALFALFA Team, we are conducting a survey of the gas and star-formation properties of galaxies in 36 groups and clusters in the local universe. The galaxies in our sample span a large range of galactic environments, from the centers of galaxy groups and clusters to the surrounding infall regions. One goal of the project is to map the spatial distribution of star-formation; the relative extent of the star-forming and stellar disks provides important information about the internal and external processes that deplete gas and thus drive galaxy evolution. We obtained wide-field H-alpha observations with the WIYN 0.9m telescope at Kitt Peak National Observatory for galaxies in the vicinity of the MKW11 and NRGb004 galaxy groups and the Abell 1367 cluster. We present a preliminary analysis of the relative size of the star-forming and stellar disks as a function of galaxy morphology and local galaxy density, and we calculate gas depletion times using star-formation rates and HI gas mass. We will combine these results with those from other UAT members to determine if and how environmentally-driven gas depletion varies with the mass and X-ray properties of the host group or cluster. This work has been supported by NSF grants AST-0847430, AST-1211005 and AST-1637339.
The relationship between a low grain intake dietary pattern and impulsive behaviors in middle-aged Japanese people.

PubMed

Toyomaki, Atsuhito; Koga, Minori; Okada, Emiko; Nakai, Yukiei; Miyazaki, Akane; Tamakoshi, Akiko; Kiso, Yoshinobu; Kusumi, Ichiro

2017-01-01

Several studies indicate that dietary habits are associated with mental health. We are interested in identifying not a specific single nutrient/food group but the population preferring specific food combinations that can be related to mental health. Very few studies have examined relationships between dietary patterns and multifaceted mental states using cluster analysis. The purpose of this study was to investigate population-level dietary patterns associated with mental state using cluster analysis. We focused on depressive state, sleep quality, subjective well-being, and impulsive behaviors using rating scales. Two hundred and seventy-nine Japanese middle-aged people participated in the present study. Dietary pattern was estimated using a brief self-administered diet-history questionnaire (the BDHQ). We conducted K-means cluster analysis using thirteen BDHQ food groups: milk, meat, fish, egg, pulses, potatoes, green and yellow vegetables, other vegetables, mushrooms, seaweed, sweets, fruits, and grain. We identified three clusters characterized as "vegetable and fruit dominant," "grain dominant," and "low grain tendency" subgroups. The vegetable and fruit dominant group showed increases in several aspects of subjective well-being demonstrated by the SF-8. Differences in mean subject characteristics across clusters were tested using ANOVA. The low frequency intake of grain group showed higher impulsive behavior, demonstrated by BIS-11 deliberation and sum scores. The present study demonstrated that traditional Japanese dietary patterns, such as eating rice, can help with beneficial changes in mental health.
The relationship between a low grain intake dietary pattern and impulsive behaviors in middle-aged Japanese people

PubMed Central

Toyomaki, Atsuhito; Koga, Minori; Okada, Emiko; Nakai, Yukiei; Miyazaki, Akane; Tamakoshi, Akiko; Kiso, Yoshinobu; Kusumi, Ichiro

2017-01-01

Several studies indicate that dietary habits are associated with mental health. We are interested in identifying not a specific single nutrient/food group but the population preferring specific food combinations that can be related to mental health. Very few studies have examined relationships between dietary patterns and multifaceted mental states using cluster analysis. The purpose of this study was to investigate population-level dietary patterns associated with mental state using cluster analysis. We focused on depressive state, sleep quality, subjective well-being, and impulsive behaviors using rating scales. Two hundred and seventy-nine Japanese middle-aged people participated in the present study. Dietary pattern was estimated using a brief self-administered diet-history questionnaire (the BDHQ). We conducted K-means cluster analysis using thirteen BDHQ food groups: milk, meat, fish, egg, pulses, potatoes, green and yellow vegetables, other vegetables, mushrooms, seaweed, sweets, fruits, and grain. We identified three clusters characterized as “vegetable and fruit dominant,” “grain dominant,” and “low grain tendency” subgroups. The vegetable and fruit dominant group showed increases in several aspects of subjective well-being demonstrated by the SF-8. Differences in mean subject characteristics across clusters were tested using ANOVA. The low frequency intake of grain group showed higher impulsive behavior, demonstrated by BIS-11 deliberation and sum scores. The present study demonstrated that traditional Japanese dietary patterns, such as eating rice, can help with beneficial changes in mental health. PMID:28704469
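The K-means step described in this abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two-dimensional intake vectors are invented (the study used thirteen BDHQ food groups), the function name `kmeans` is hypothetical, and the abstract does not say which software or initialization the authors used.

```python
import math

def kmeans(points, k, iterations=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids (an arbitrary choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids

# Hypothetical (grain, vegetable) intake scores for six participants.
points = [(5.0, 1.0), (1.0, 5.0), (5.2, 0.8), (0.9, 5.1), (4.8, 1.2), (1.1, 4.9)]
labels, centroids = kmeans(points, 2)
print(labels)
```

Once cluster labels are available, group differences in an outcome score could be tested with a one-way ANOVA, as the abstract reports.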
Genetic variation in resistance to blast (Pyricularia oryzae Cavara) in rice (Oryza sativa L.) germplasms of Bangladesh

PubMed Central

Khan, Mohammad Ashik Iqbal; Latif, Mohammad Abdul; Khalequzzaman, Mohammad; Tomita, Asami; Ali, Mohammad Ansar; Fukuta, Yoshimichi

2017-01-01

Genetic variation in blast resistance was clarified in 334 Bangladesh rice accessions from 4 major ecotypes (Aus, Aman, Boro and Jhum). Cluster analysis of polymorphism data of 74 SSR markers separated these accessions into cluster I (corresponding to the Japonica Group) and cluster II (corresponding to the Indica Group). Cluster II accessions were represented with high frequency in all ecotypes. Cluster II was further subdivided into subclusters IIa and IIb. Subcluster IIa accessions were represented with high frequency in only Aus and Jhum ecotypes. Cluster I accessions were more frequent in the Aman ecotype than in other ecotypes. Distinct variations in resistance were found, and accessions were classified into 4 groups (A1, A2, B1 and B2) based on their reactions to standard differential blast isolates. The most susceptible group was A2 (which included susceptible variety Lijiangxintuanheigu, most of the differential varieties, and a few Bangladesh accessions), followed in order by A1, B2 and B1 (the most resistant). Accessions from 4 ecotypes fell with different frequencies into each of these resistance groups. These results demonstrated that Japonica Group accessions were found mainly in Aman, and Indica Group accessions were distributed across all ecotypes. Susceptible accessions were limited in Aus and Aman. PMID:29398943
Worldwide Topology of the Scientific Subject Profile: A Macro Approach in the Country Level

PubMed Central

Moya-Anegón, Félix; Herrero-Solana, Víctor

2013-01-01

Background: Models for the production of knowledge and systems of innovation and science are key elements for characterizing a country in view of its scientific thematic profile. With regard to scientific output and publication in journals of international visibility, the countries of the world may be classified into three main groups according to their thematic bias. Methodology/Principal Findings: This paper aims to classify the countries of the world in several broad groups, described in terms of behavioural models that attempt to sum up the characteristics of their systems of knowledge and innovation. We perceive three clusters in our analysis: 1) the biomedical cluster, 2) the basic science & engineering cluster, and 3) the agricultural cluster. The countries are conceptually associated with the clusters via Principal Component Analysis (PCA), and a Multidimensional Scaling (MDS) map with all the countries is presented. Conclusions/Significance: As we have seen, insofar as scientific output and publication in journals of international visibility is concerned, the countries of the world may be classified into three main groups according to their thematic profile. These groups can be described in terms of behavioral models that attempt to sum up the characteristics of their systems of knowledge and innovation. PMID:24349467

Species-richness of the Anopheles annulipes Complex (Diptera: Culicidae) Revealed by Tree and Model-Based Allozyme Clustering Analyses

DTIC Science & Technology

2007-01-01

including tree-based methods such as the unweighted pair group method of analysis (UPGMA) and Neighbour-joining (NJ) (Saitou & Nei, 1987). By...based Bayesian approach and the tree-based UPGMA and NJ clustering methods. The results obtained suggest that far more species occur in the An...unlikely that groups that differ by more than these levels are conspecific. Genetic distances were clustered using the UPGMA and NJ algorithms in MEGA
Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.

PubMed

Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio

Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Quantum chemical calculations in the structural analysis of phloretin

NASA Astrophysics Data System (ADS)

Gómez-Zavaglia, Andrea

2009-07-01

In this work, a conformational search on the molecule of phloretin [2',4',6'-Trihydroxy-3-(4-hydroxyphenyl)-propiophenone] has been performed. The molecule of phloretin has eight dihedral angles, four of them taking part in the carbon backbone and the other four related to the orientation of the hydroxyl groups. A systematic search involving a random variation of the dihedral angles has been used to generate input structures for the quantum chemical calculations. Calculations at the DFT(B3LYP)/6-311++G(d,p) level of theory permitted the identification of 58 local minima belonging to the C1 symmetry point group. The molecular structures of the conformers have been analyzed using hierarchical cluster analysis. This method allowed us to group conformers according to their similarities, and thus to correlate the conformers' stability with structural parameters. The dendrogram obtained from the hierarchical cluster analysis depicted two main clusters. Cluster I included all the conformers with relative energies lower than 25 kJ mol⁻¹ and cluster II, the remaining conformers. The possibility of forming intramolecular hydrogen bonds proved to be the main factor contributing to stability. Accordingly, all conformers depicting intramolecular H-bonds belong to cluster I. These conformations are clearly favored when the carbon backbone is as planar as possible. The values of the νC=O and νOH vibrational modes were compared among all the conformers of phloretin. The redshifts associated with intramolecular H-bonds were correlated with the H-bond distances and energies.
OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes

PubMed Central

Li, Li; Stoeckert, Christian J.; Roos, David S.

2003-01-01

The identification of orthologous groups is useful for genome annotation, studies on gene/protein evolution, comparative genomics, and the identification of taxonomically restricted sequences. Methods successfully exploited for prokaryotic genome analysis have proved difficult to apply to eukaryotes, however, as larger genomes may contain multiple paralogous genes, and sequence information is often incomplete. OrthoMCL provides a scalable method for constructing orthologous groups across multiple eukaryotic taxa, using a Markov Cluster algorithm to group (putative) orthologs and paralogs. This method performs similarly to the INPARANOID algorithm when applied to two genomes, but can be extended to cluster orthologs from multiple species. OrthoMCL clusters are coherent with groups identified by EGO, but improved recognition of “recent” paralogs permits overlapping EGO groups representing the same gene to be merged. Comparison with previously assigned EC annotations suggests a high degree of reliability, implying utility for automated eukaryotic genome annotation. OrthoMCL has been applied to the proteome data set from seven publicly available genomes (human, fly, worm, yeast, Arabidopsis, the malaria parasite Plasmodium falciparum, and Escherichia coli). A Web interface allows queries based on individual genes or user-defined phylogenetic patterns (http://www.cbil.upenn.edu/gene-family). Analysis of clusters incorporating P. falciparum genes identifies numerous enzymes that were incompletely annotated in first-pass annotation of the parasite genome. PMID:12952885
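To make the Markov Cluster idea mentioned above more concrete, here is a toy Python sketch of the generic expansion/inflation iteration on a small similarity graph. It is not OrthoMCL itself: the graph, the edge weights, the inflation value of 2.0, and the function names `mcl` and `normalize_columns` are all assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Rescale every column so that it sums to 1 (a column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Toy Markov Cluster iteration on a dense adjacency matrix (list of lists)."""
    n = len(adjacency)
    # Add self-loops, then turn edge weights into transition probabilities.
    m = normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    )
    for _ in range(max_iter):
        # Expansion: square the matrix (two-step random walks).
        expanded = [
            [sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        # Inflation: elementwise power, then renormalize the columns.
        inflated = normalize_columns([[v ** inflation for v in row] for row in expanded])
        change = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Each row that retains mass lists the members of one cluster.
    found = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    return sorted(sorted(c) for c in found if c)

# Hypothetical graph: genes 0-2 are mutually similar, as are genes 3-4.
adjacency = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
clusters = mcl(adjacency)
print(clusters)
```

In this sketch, larger inflation values tend to produce more, smaller clusters, which is the usual tuning knob for this family of algorithms.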
Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data.

PubMed

Borri, Marco; Schmidt, Maria A; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M; Partridge, Mike; Bhide, Shreerang A; Nutting, Christopher M; Harrington, Kevin J; Newbold, Katie L; Leach, Martin O

2015-01-01

To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes.
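The abstract does not name the cluster validation index that selected k = 4. As a stand-in, the following Python sketch computes the mean silhouette width, one common way to compare candidate partitions of the same data. The voxel values, the label vectors, and the function name `mean_silhouette` are hypothetical.

```python
import math

def mean_silhouette(points, labels):
    """Average silhouette width; needs at least two clusters. Higher is better."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical two-parameter voxel measurements forming two obvious groups.
voxels = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = mean_silhouette(voxels, [0, 0, 0, 1, 1, 1])
poor = mean_silhouette(voxels, [0, 1, 0, 1, 0, 1])
print(good, poor)
```

Running the same scoring over partitions obtained with k = 2, 3 and 4 and keeping the best-scoring k mirrors the selection step described above, under the assumption that a silhouette-like index is acceptable.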
Typology of schizotypy in non-clinical young adults: Psychopathological and personality disorder traits correlates.

PubMed

Raynal, Patrick; Goutaudier, Nelly; Nidetch, Victoria; Chabrol, Henri

2016-12-30

Few typological studies address schizotypy in young adults. Schizotypal traits were assessed in 466 college students using the Schizotypal Personality Questionnaire-Brief (SPQ-B). Other measures evaluated personality traits previously associated with schizotypy (borderline, obsessional, and autistic traits), psychopathological symptoms (suicidal ideations, depressive and obsessive-compulsive symptoms) and psychosocial functioning. A factor analysis was first performed on SPQ-B results, leading to four factors: negative schizotypy, positive schizotypy, social anxiety, and reference ideas. Based on these factors, a cluster analysis was conducted, which yielded four clearly distinct groups characterized by "Low" (non schizotypy), "High schizotypy" (mixed positive and negative), "Positive schizotypy", and "Social impairment". Regarding personality disorder traits and psychopathological symptoms, the "High schizotypy" cluster scored higher than the "Positive" and the "Social impairment" groups, which scored higher than the "Low" cluster. The "Positive" group had higher levels of interpersonal relationships than the "High" and the "Social impairment" clusters, suggesting that positive schizotypy was associated with benefits such as perceived social relationships. Nevertheless, the "Positive" cluster was also linked to high levels of personality disorder traits and psychopathological symptoms, and to low academic achievement, at levels similar to those observed in the "Social impairment" cluster, confirming an unhealthy side to positive schizotypy. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The application of cluster analysis in the intercomparison of loop structures in RNA.

PubMed

Huang, Hung-Chung; Nagaswamy, Uma; Fox, George E

2005-04-01

We have developed a computational approach for the comparison and classification of RNA loop structures. Hairpin or interior loops identified in atomic resolution RNA structures were intercompared by conformational matching. The root-mean-square deviation (RMSD) values between all pairs of RNA fragments of interest, even if from different molecules, are calculated. Subsequently, cluster analysis is performed on the resulting matrix of RMSD distances using the unweighted pair group method with arithmetic mean (UPGMA). The cluster analysis objectively reveals groups of folds that resemble one another. To demonstrate the utility of the approach, a comprehensive analysis of all the terminal hairpin tetraloops that have been observed in 15 RNA structures that have been determined by X-ray crystallography was undertaken. The method found major clusters corresponding to the well-known GNRA and UNCG types. In addition, two tetraloops with the unusual primary sequence UMAC (M is A or C) were successfully assigned to the GNRA cluster. Larger loop structures were also examined and the clustering results confirmed the occurrence of variations of the GNRA and UNCG tetraloops in these loops and provided a systematic means for locating them. Nineteen examples of larger loops that closely resemble either the GNRA or UNCG tetraloop were found in the large ribosomal RNAs. When the clustering approach was extended to include all structures in the SCOR database, novel relationships were detected including one between the ANYA motif and a less common folding of the GAAA tetraloop sequence.
The application of cluster analysis in the intercomparison of loop structures in RNA

PubMed Central

HUANG, HUNG-CHUNG; NAGASWAMY, UMA; FOX, GEORGE E.

2005-01-01

We have developed a computational approach for the comparison and classification of RNA loop structures. Hairpin or interior loops identified in atomic resolution RNA structures were intercompared by conformational matching. The root-mean-square deviation (RMSD) values between all pairs of RNA fragments of interest, even if from different molecules, are calculated. Subsequently, cluster analysis is performed on the resulting matrix of RMSD distances using the unweighted pair group method with arithmetic mean (UPGMA). The cluster analysis objectively reveals groups of folds that resemble one another. To demonstrate the utility of the approach, a comprehensive analysis of all the terminal hairpin tetraloops that have been observed in 15 RNA structures that have been determined by X-ray crystallography was undertaken. The method found major clusters corresponding to the well-known GNRA and UNCG types. In addition, two tetraloops with the unusual primary sequence UMAC (M is A or C) were successfully assigned to the GNRA cluster. Larger loop structures were also examined and the clustering results confirmed the occurrence of variations of the GNRA and UNCG tetraloops in these loops and provided a systematic means for locating them. Nineteen examples of larger loops that closely resemble either the GNRA or UNCG tetraloop were found in the large ribosomal RNAs. When the clustering approach was extended to include all structures in the SCOR database, novel relationships were detected including one between the ANYA motif and a less common folding of the GAAA tetraloop sequence. PMID:15769871
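The clustering step described above, UPGMA applied to a matrix of pairwise RMSD values, can be sketched in a few lines of Python. The RMSD numbers and loop names below are invented for illustration, and `upgma` is a hypothetical helper rather than the authors' implementation.

```python
def upgma(dist, names):
    """Average-linkage (UPGMA) agglomeration; returns merges as (members_a, members_b, distance)."""
    # Each active cluster is a tuple of original item indices.
    clusters = [(i,) for i in range(len(names))]
    merges = []

    def avg(a, b):
        # UPGMA distance: mean of all pairwise distances between the two clusters.
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(
            ((x, y) for idx, x in enumerate(clusters) for y in clusters[idx + 1:]),
            key=lambda pair: avg(*pair),
        )
        merges.append(([names[i] for i in a], [names[i] for i in b], avg(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical pairwise RMSD values (in angstroms) between four loop fragments.
names = ["loop1", "loop2", "loop3", "loop4"]
rmsd = [
    [0.0, 0.5, 2.0, 2.2],
    [0.5, 0.0, 2.1, 2.3],
    [2.0, 2.1, 0.0, 0.6],
    [2.2, 2.3, 0.6, 0.0],
]
merges = upgma(rmsd, names)
for left, right, d in merges:
    print(left, right, round(d, 2))
```

Cutting the resulting merge sequence at a chosen distance yields groups of structurally similar fragments, which is how families of folds would emerge from such an analysis.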
Teaching Gene Technology in an Outreach Lab: Students' Assigned Cognitive Load Clusters and the Clusters' Relationships to Learner Characteristics, Laboratory Variables, and Cognitive Achievement

ERIC Educational Resources Information Center

Scharfenberg, Franz-Josef; Bogner, Franz X.

2013-01-01

This study classified students into different cognitive load (CL) groups by means of cluster analysis based on their experienced CL in a gene technology outreach lab which has instructionally been designed with regard to CL theory. The relationships of the identified student CL clusters to learner characteristics, laboratory variables, and…
Profile Analysis of the Woodcock-Johnson III Tests of Cognitive Abilities with Gifted Students.

ERIC Educational Resources Information Center

Rizza, Mary G.; McIntosh, David E.; McCunn, Alice

2001-01-01

The Cattell-Horn-Carroll (CHC) factor clusters of the Woodcock-Johnson III Tests of Cognitive Abilities (WJ III COG) were studied with a group of gifted and nongifted individuals. Results found both groups display similar patterns of performance across the CHC factor clusters. Discusses clinical and educational considerations when using the WJ III…
Diversity in phenotypic and nutritional traits in vegetable amaranth (Amaranthus tricolor), a nutritionally underutilised crop.

PubMed

Shukla, Sudhir; Bhargava, Atul; Chatterjee, Avijeet; Pandey, Avinash Chandra; Mishra, Brij K

2010-01-15

Assessment of genetic diversity in a crop-breeding programme helps in the identification of diverse parental combinations to create segregating progenies with maximum genetic variability and facilitates introgression of desirable genes from diverse germplasm into the available genetic base. In the present study, 39 strains of vegetable amaranth (Amaranthus tricolor) were evaluated for eight morphological and seven quality traits for two test seasons to study the extent of genetic divergence among the strains. Multivariate analysis showed that the first four principal components contributed 67.55% of the variability. Cluster analysis grouped the strains into six clusters that displayed a wide range of diversity for most of the traits. Cluster analysis has proved to be an effective method in grouping strains that may facilitate effective management and utilisation in crop-breeding programmes. The diverse strains falling in different clusters were identified, which can be utilised in different hybridisation programmes to develop high-foliage-yielding varieties rich in nutritional components. Copyright (c) 2009 Society of Chemical Industry.
Discrimination of multilocus sequence typing-based Campylobacter jejuni subgroups by MALDI-TOF mass spectrometry.

PubMed

Zautner, Andreas Erich; Masanta, Wycliffe Omurwa; Tareen, Abdul Malik; Weig, Michael; Lugert, Raimond; Groß, Uwe; Bader, Oliver

2013-11-07

Campylobacter jejuni, the most common bacterial pathogen causing gastroenteritis, shows a wide genetic diversity. Previously, we demonstrated by the combination of multilocus sequence typing (MLST)-based UPGMA-clustering and analysis of 16 genetic markers that twelve different C. jejuni subgroups can be distinguished. Among these are two prominent subgroups. The first subgroup contains the majority of hyperinvasive strains and is characterized by a dimeric form of the chemotaxis-receptor Tlp7(m+c). The second has an extended amino acid metabolism and is characterized by the presence of a periplasmic asparaginase (ansB) and gamma-glutamyl-transpeptidase (ggt). Phyloproteomic principal component analysis (PCA) hierarchical clustering of MALDI-TOF based intact cell mass spectrometry (ICMS) spectra was able to group particular C. jejuni subgroups of phylogenetically related isolates in distinct clusters. In particular, the aforementioned Tlp7(m+c)(+) and ansB+/ggt+ subgroups could be discriminated by PCA. Overlay of ICMS spectra of all isolates led to the identification of characteristic biomarker ions for these specific C. jejuni subgroups. Thus, mass peak shifts can be used to identify the C. jejuni subgroup with an extended amino acid metabolism. Although the PCA hierarchical clustering of ICMS spectra groups the tested isolates in a different order compared to MLST-based UPGMA-clustering, the isolates of the indicator-groups form predominantly coherent clusters. These clusters reflect phenotypic aspects better than phylogenetic clustering, indicating that the genes corresponding to the biomarker ions are phylogenetically coupled to the tested marker genes. Thus, PCA clustering could be an additional tool for analyzing the relatedness of bacterial isolates.
Chemotaxonomy of heterocystous cyanobacteria using FAME profiling as species markers.

PubMed

Shukla, Ekta; Singh, Satya Shila; Singh, Prashant; Mishra, Arun Kumar

2012-07-01

The fatty acid methyl ester (FAME) analysis of the 12 heterocystous cyanobacterial strains showed different fatty acid profiles based on the presence/absence and the percentage of 13 different types of fatty acids. The major fatty acids, viz. palmitic acid (16:0), hexadecadienoic acid (16:2), stearic acid (18:0), oleic acid (18:1), linoleic acid (18:2), and linolenic acid (18:3), were present among all the strains except Cylindrospermum musicola, where oleic acid (18:1) was absent. All the strains showed high levels of polyunsaturated fatty acids (PUFAs; 41-68.35%) followed by saturated fatty acids (SAFAs; 1.82-40.66%) and monounsaturated fatty acids (0.85-24.98%). The highest percentage of PUFAs and of the essential fatty acid linolenic acid (18:3) was found in Scytonema bohnerii, which can be used as a fatty acid supplement for medical and biotechnological purposes. The cluster analysis based on FAME profiling suggests the presence of two distinct clusters with Euclidean distance ranging from 0 to 25. S. bohnerii of cluster I was distantly related to the other strains of cluster II. The genotypes of cluster II were further divided into two subclusters, i.e., IIa with C. musicola showing great divergence from the other genotypes of IIb, which was further subdivided into two groups. Subsubcluster IIb(1) was represented by a single genotype, Anabaena sp., whereas subsubcluster IIb(2) comprised two groups, i.e., one group of three highly similar genotypes that was distantly related to the other group of six closely related genotypes. To test the validity of the fatty acid profiles as a marker, a cluster analysis was also performed on the basis of morphological attributes. Our results suggest that FAME profiling might be used as a species marker in polyphasic taxonomy and in the study of phylogenetic relationships.
Multivariate Cluster Analysis.

ERIC Educational Resources Information Center

McRae, Douglas J.

Procedures for grouping students into homogeneous subsets have long interested educational researchers. The research reported in this paper is an investigation of a set of objective grouping procedures based on multivariate analysis considerations. Four multivariate functions that might serve as criteria for adequate grouping are given and…
Language Learner Motivational Types: A Cluster Analysis Study

ERIC Educational Resources Information Center

Papi, Mostafa; Teimouri, Yasser

2014-01-01

The study aimed to identify different second language (L2) learner motivational types drawing on the framework of the L2 motivational self system. A total of 1,278 secondary school students learning English in Iran completed a questionnaire survey. Cluster analysis yielded five different groups based on the strength of different variables within…
Power Analysis for Cross Level Mediation in CRTs

ERIC Educational Resources Information Center

Kelcey, Ben

2014-01-01

A common design in education research for interventions operating at a group or cluster level is a cluster randomized trial (CRT) (Bloom, 2005). In CRTs, intact clusters (e.g., schools) are assigned to treatment conditions rather than individuals (e.g., students) and are frequently an effective way to study interventions because they permit…
Country clustering applied to the water and sanitation sector: a new tool with potential applications in research and policy.

PubMed

Onda, Kyle; Crocker, Jonny; Kayser, Georgia Lyn; Bartram, Jamie

2014-03-01

The fields of global health and international development commonly cluster countries by geography and income to target resources and describe progress. For any given sector of interest, a range of relevant indicators can serve as a more appropriate basis for classification. We create a new typology of country clusters specific to the water and sanitation (WatSan) sector based on similarities across multiple WatSan-related indicators. After a literature review and consultation with experts in the WatSan sector, nine indicators were selected. Indicator selection was based on relevance to and suggested influence on national water and sanitation service delivery, and to maximize data availability across as many countries as possible. A hierarchical clustering method and a gap statistic analysis were used to group countries into a natural number of relevant clusters. Two stages of clustering resulted in five clusters, representing 156 countries or 6.75 billion people. The five clusters were not well explained by income or geography, and were distinct from existing country clusters used in international development. Analysis of these five clusters revealed that they were more compact and well separated than United Nations and World Bank country clusters. This analysis and resulting country typology suggest that previous geography- or income-based country groupings can be improved upon for applications in the WatSan sector by utilizing globally available WatSan-related indicators. Potential applications include guiding and discussing research, informing policy, improving resource targeting, describing sector progress, and identifying critical knowledge gaps in the WatSan sector. Copyright © 2013 Elsevier GmbH. All rights reserved.
Lifestyle and accidents among young drivers.

PubMed

Gregersen, N P; Berg, H Y

1994-06-01

This study covers the lifestyle component of the problems related to young drivers' accident risk. The purpose of the study is to measure the relationship between lifestyle and accident risk, and to identify specific high-risk and low-risk groups. Lifestyle is measured through a questionnaire, where 20-year-olds describe themselves and how often they deal with a large number of different activities, like sports, music, movies, reading, cars and driving, political engagement, etc. They also report their involvement in traffic accidents. With a principal component analysis followed by a cluster analysis, lifestyle profiles are defined. These profiles are finally correlated to accidents, which makes it possible to define high-risk and low-risk groups. The cluster analysis defined 15 clusters including four high-risk groups with an average overrisk of 150% and two low-risk groups with an average underrisk of 75%. The results are discussed from two perspectives. The first is the importance of theoretical understanding of the contribution of lifestyle factors to young drivers' high accident risk. The second is how the findings could be used in practical road safety measures, like education, campaigns, etc.
Use of cluster analysis and preference mapping to evaluate consumer acceptability of choice and select bovine M. longissimus lumborum steaks cooked to various end-point temperatures.

PubMed

Schmidt, T B; Schilling, M W; Behrends, J M; Battula, V; Jackson, V; Sekhon, R K; Lawrence, T E

2010-01-01

Consumer research was conducted to evaluate the acceptability of choice and select steaks from the Longissimus lumborum that were cooked to varying degrees of doneness using demographic information, cluster analysis and descriptive analysis. On average, using data from approximately 155 panelists, no differences (P>0.05) existed in consumer acceptability among select and choice steaks, and all treatment means ranged between like slightly and like moderately (6-7) on the hedonic scale. Individual consumers were highly variable in their perception of acceptability and consumers were grouped into clusters (eight for select and seven for choice) based on their preference and liking of steaks. The largest consumer groups liked steaks from all treatments, but other groups preferred (P<0.05) steaks that were cooked to various end-point temperatures. Results revealed that consumers could be grouped together according to preference, liking and descriptive sensory attributes, (juiciness, tenderness, bloody, metallic, and roasted) to further understand consumer perception of steaks that were cooked to different end-point temperatures.
Breast cancer and symptom clusters during radiotherapy.

PubMed

Matthews, Ellyn E; Schmiege, Sarah J; Cook, Paul F; Sousa, Karen H

2012-01-01

Symptom cluster assessment shifts the clinical focus from a specific symptom to the patient's experience as a whole. Few studies have examined breast cancer symptom clusters during treatment, and fewer studies have addressed symptom clusters during radiation therapy (RT). The theoretical underpinning of this study is the Symptoms Experience Model. Research is needed to identify antecedents and consequences of cancer-related symptom clusters. The present study was intended to determine the clustering of symptoms during RT in women with breast cancer and significant correlations among the symptoms, individual characteristics, and mood. This was a secondary data analysis from a descriptive correlational study of 93 women at weeks 3 to 7 of RT from centers in the mid-Atlantic region of the United States. The Symptom Distress Scale, the subscales of the Positive and Negative Affect Scale, the Life Orientation Test, and the Self-transcendence Scale were completed. Confirmatory factor analysis revealed symptoms grouped into 3 distinct clusters: pain-insomnia-fatigue, cognitive disturbance-outlook, and gastrointestinal. The pain-insomnia-fatigue and cognitive disturbance-outlook clusters were associated with individual characteristics, optimism, self-transcendence, and positive and negative mood. The gastrointestinal cluster correlated significantly only with positive mood. This study provides insight into symptoms that group together and the relationship of symptom clusters to antecedents and mood. These findings underscore the need to define and standardize the measurement of symptom clusters and understand variability in concurrent symptoms. Attention to symptom clusters shifts the clinical focus from a specific symptom to the patient's experience as a whole and helps identify the most effective interventions.

Clustering Analysis of Antibiograms and Antibiogram Types of Streptococcus agalactiae Strains from Tilapia in China.

PubMed

Liu, Chan; Feng, Juan; Zhang, Defeng; Xie, Yundan; Li, Anxing; Wang, Jiangyong; Su, Youlu

2018-05-11

In view of the changing antibiotic-resistance profiles of Streptococcus agalactiae from tilapia in China, antimicrobial susceptibilities of 75 S. agalactiae strains were determined by the disc diffusion method, and cluster analyses of the antibiograms and antibiogram types were performed. All strains displayed multidrug resistance (MDR). The antimicrobial-resistance rates were highest (>90%) to aminoglycosides, sulfonamides, pipemidic acid, and norfloxacin, followed by penicillin, ampicillin, and ciprofloxacin (26.7-38.7%); those to furadantin, lincomycin, erythromycin, ofloxacin, tetracycline, and florfenicol were low (<10%), and no resistance to vancomycin, cefalexin, cefoxitin, amoxicillin, medemycin, doxitard, oxytetracycline, rifampin, chloramphenicol, or thiamphenicol was detected. Statistical analysis showed that the resistance rate to ciprofloxacin increased significantly in 2016 (p = 0.009), whereas that to trimethoprim/sulfamethoxazole decreased (p = 0.017). Cluster analyses identified that the strains had 23 antibiogram types (A-W) and clustered in five groups (Groups I-V). The strains with higher antimicrobial resistance mainly clustered in Groups I and II. Our results show that the antibiograms varied with time and by location and that antibiogram types are constantly updating and expanding. Effective measures must be taken to reduce the antimicrobial resistance and spread of MDR strains.
Assessment of hybridization among wild and cultivated Vigna unguiculata subspecies revealed by arbitrarily primed polymerase chain reaction analysis

PubMed Central

Vijaykumar, Archana; Saini, Ajay; Jawali, Narendra

2012-01-01

Background and aims Intra-species hybridization and incompletely homogenized ribosomal RNA repeat units have earlier been reported in 21 accessions of Vigna unguiculata from six subspecies using internal transcribed spacer (ITS) and 5S intergenic spacer (IGS) analyses. However, the relationships among these accessions were not clear from these analyses. We therefore assessed intra-species hybridization in the same set of accessions. Methodology Arbitrarily primed polymerase chain reaction (AP-PCR) analysis was carried out using 12 primers. The PCR products were resolved on agarose gels and the DNA fragments were scored manually. Genetic relationships were inferred by TREECON software using unweighted paired group method with arithmetic averages (UPGMA) cluster analysis evaluated by bootstrapping and compared with previous analyses based on ITS and 5S IGS. Principal results A total of 202 (86 %) fragments were found to be polymorphic and used for generating a genetic distance matrix. Twenty-one V. unguiculata accessions were grouped into three main clusters. The cultivated subspecies (var. unguiculata) and most of its wild progenitors (var. spontanea) were placed in cluster I along with ssp. pubescens and ssp. stenophylla. Whereas var. spontanea were grouped with ssp. alba and ssp. tenuis accessions in cluster II, ssp. alba and ssp. baoulensis were included in cluster III. Close affinities of ssp. unguiculata, ssp. alba and ssp. tenuis suggested inter-subspecies hybridization. Conclusions Multi-locus AP-PCR analysis reveals that intra-species hybridization is prevalent among V. unguiculata subspecies and suggests that grouping of accessions from two different subspecies is not solely due to the similarity in the ITS and 5S IGS regions but also due to other regions of the genome. PMID:22619698
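
The UPGMA step described in the abstract above can be illustrated with a minimal sketch. It is not the TREECON workflow the authors used: the accession names and band scores below are hypothetical, and the Jaccard distance is an assumed choice for presence/absence data, since the abstract does not state which distance was computed.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - (bands shared by both profiles / bands present in either)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 1.0 - both / either if either else 0.0

def upgma(profiles):
    """Average-linkage (UPGMA) clustering; returns the merge history."""
    names = list(profiles)
    d = {(a, b): jaccard_distance(profiles[a], profiles[b])
         for a in names for b in names}
    clusters = [(n,) for n in names]
    merges = []

    def avg(c1, c2):
        # UPGMA: unweighted mean over all original pairwise distances.
        return sum(d[a, b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg(*p))
        merges.append((c1, c2, avg(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges

# Hypothetical 0/1 scores for six AP-PCR fragments in four accessions.
profiles = {
    "acc1": [1, 1, 0, 1, 0, 1],
    "acc2": [1, 1, 0, 1, 1, 1],
    "acc3": [0, 0, 1, 0, 1, 0],
    "acc4": [0, 1, 1, 0, 1, 0],
}
merges = upgma(profiles)
for left, right, height in merges:
    print(left, "+", right, "at distance", round(height, 3))
```

Each tuple in the merge history corresponds to one node of the dendrogram. Bootstrapping, as mentioned in the abstract, would repeat this on resampled fragment columns and count how often each grouping recurs.
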
Cluster analysis of quantitative parametric maps from DCE-MRI: application in evaluating heterogeneity of tumor response to antiangiogenic treatment.

PubMed

Longo, Dario Livio; Dastrù, Walter; Consolino, Lorena; Espak, Miklos; Arigoni, Maddalena; Cavallo, Federica; Aime, Silvio

2015-07-01

The objective of this study was to compare a clustering approach to conventional analysis methods for assessing changes in pharmacokinetic parameters obtained from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) during antiangiogenic treatment in a breast cancer model. BALB/c mice bearing established transplantable her2+ tumors were treated with a DNA-based antiangiogenic vaccine or with an empty plasmid (untreated group). DCE-MRI was carried out by administering a dose of 0.05 mmol/kg of Gadocoletic acid trisodium salt, a Gd-based blood pool contrast agent (CA) at 1T. Changes in pharmacokinetic estimates (K(trans) and vp) in a nine-day interval were compared between treated and untreated groups on a voxel-by-voxel analysis. The tumor response to therapy was assessed by a clustering approach and compared with conventional summary statistics, with sub-regions analysis and with histogram analysis. Both the K(trans) and vp estimates, following blood-pool CA injection, showed marked and spatial heterogeneous changes with antiangiogenic treatment. Averaged values for the whole tumor region, as well as from the rim/core sub-regions analysis were unable to assess the antiangiogenic response. Histogram analysis resulted in significant changes only in the vp estimates (p<0.05). The proposed clustering approach depicted marked changes in both the K(trans) and vp estimates, with significant spatial heterogeneity in vp maps in response to treatment (p<0.05), provided that DCE-MRI data are properly clustered in three or four sub-regions. This study demonstrated the value of cluster analysis applied to pharmacokinetic DCE-MRI parametric maps for assessing tumor response to antiangiogenic therapy. Copyright © 2015 Elsevier Inc. All rights reserved.
Strategic groups, performance, and strategic response in the nursing home industry.

PubMed

Zinn, J S; Aaronson, W E; Rosko, M D

1994-06-01

This study examines the effect of strategic group membership on nursing home performance and strategic behavior. Data from the 1987 Medicare and Medicaid Automated Certification Survey were combined with data from the 1987 and 1989 Pennsylvania Long Term Care Facility Questionnaire. The sample consisted of 383 Pennsylvania nursing homes. Cluster analysis was used to place the 383 nursing homes into strategic groups on the basis of variables measuring scope and resource deployment. Performance was measured by indicators of the quality of nursing home care (rates of pressure ulcers, catheterization, and restraint usage) and efficiency in services provision. Changes in Medicare participation after passage of the 1988 Medicare Catastrophic Coverage Act (MCCA) measured strategic behavior. MANOVA and Tukey HSD post hoc means tests determined if significant differences were associated with strategic group membership. Cluster analysis produced an optimal seven-group solution. Differences in group means were significant for the clustering, performance, and conduct variables (p < .0001). Strategic groups characterized by facilities providing a continuum of care services had the best patient care outcomes. The most efficient groups were characterized by facilities with high Medicare census. While all strategic groups increased Medicare census following passage of the MCCA, those dominated by for-profits had the greatest increases. Our analysis demonstrates that strategic orientation influences nursing home response to regulatory initiatives, a factor that should be recognized in policy formation directed at nursing home reform.
Proposed shade guide for human facial skin and lip: a pilot study.

PubMed

Wee, Alvin G; Beatty, Mark W; Gozalo-Diaz, David J; Kim-Pusateri, Seungyee; Marx, David B

2013-08-01

Currently, no commercially available facial shade guide exists in the United States for the fabrication of facial prostheses. The purpose of this study was to measure facial skin and lip color in a human population sample stratified by age, gender, and race. Clustering analysis was used to determine optimal color coordinates for a proposed facial shade guide. Participants (n=119) were recruited from 4 racial/ethnic groups, 5 age groups, and both genders. Reflectance measurements of participants' noses and lower lips were made by using a spectroradiometer and xenon arc lamp with a 45/0 optical configuration. Repeated measures ANOVA (α=.05), to identify skin and lip color differences resulting from race, age, gender, and location, and a hierarchical clustering analysis, to identify clusters of skin colors, were used. Significant contributors to L*a*b* facial color were race and facial location (P<.01). All factors affected b* (P<.05). Age affected only b* (P<.001), while gender affected only L* (P<.05) and b* (P<.05). Analyses identified 5 clusters of skin color. The study showed that skin color differences due to age and gender occurred primarily along the yellow-blue axis. A significant lightness difference between gender groups was also found. Clustering analysis identified 5 distinct skin shade tabs. Copyright © 2013 The Editorial Council of the Journal of Prosthetic Dentistry. Published by Mosby, Inc. All rights reserved.
Cluster Analysis of Clinical Data Identifies Fibromyalgia Subgroups

PubMed Central

Docampo, Elisa; Collado, Antonio; Escaramís, Geòrgia; Carbonell, Jordi; Rivera, Javier; Vidal, Javier; Alegre, José

2013-01-01

Introduction Fibromyalgia (FM) is mainly characterized by widespread pain and multiple accompanying symptoms, which hinder FM assessment and management. In order to reduce FM heterogeneity we classified clinical data into simplified dimensions that were used to define FM subgroups. Material and Methods 48 variables were evaluated in 1,446 Spanish FM cases fulfilling 1990 ACR FM criteria. A partitioning analysis was performed to find groups of variables similar to each other. Similarities between variables were identified and the variables were grouped into dimensions. This was performed in a subset of 559 patients, and cross-validated in the remaining 887 patients. For each sample and dimension, a composite index was obtained based on the weights of the variables included in the dimension. Finally, a clustering procedure was applied to the indexes, resulting in FM subgroups. Results Variables clustered into three independent dimensions: “symptomatology”, “comorbidities” and “clinical scales”. Only the two first dimensions were considered for the construction of FM subgroups. Resulting scores classified FM samples into three subgroups: low symptomatology and comorbidities (Cluster 1), high symptomatology and comorbidities (Cluster 2), and high symptomatology but low comorbidities (Cluster 3), showing differences in measures of disease severity. Conclusions We have identified three subgroups of FM samples in a large cohort of FM by clustering clinical data. Our analysis stresses the importance of family and personal history of FM comorbidities. Also, the resulting patient clusters could indicate different forms of the disease, relevant to future research, and might have an impact on clinical assessment. PMID:24098674
Study on force mechanism for therapeutic effect of pushing manipulation with one-finger meditation based on similarity analysis of force and waveform.

PubMed

Fang, Lei; Fang, Min; Guo, Min-Min

2016-12-27

To reveal the force mechanism underlying the therapeutic effect of pushing manipulation with one-finger meditation. A total of 15 participants were recruited in this study and assigned to an expert group, a skilled group and a novice group, with 5 participants in each group. Mechanical signals were collected from a biomechanical testing platform, and these data were further examined via similarity analysis and cluster analysis. Comparing the force waveforms of manipulation revealed that the manipulation forces were similar between the expert group and the skilled group (P>0.05). The mean value of vertical force was 9.8 N, and the 95% CI ranged from 6.37 to 14.70 N, but there were significant differences compared with the novice group (P<0.05). The overall similarity coefficient cluster analysis showed that two kinds of manipulation force curves existed between the expert group and the skilled group. Pushing manipulation with one-finger meditation is a kind of light stimulation manipulation on the acupoint, and force characteristics of double waveforms continuously alternated during manual operation.
Long-term analysis of health status and preventive behavior in music students across an entire university program.

PubMed

Spahn, Claudia; Nusseck, Manfred; Zander, Mark

2014-03-01

The aim of this investigation was to analyze longitudinal data concerning physical and psychological health, playing-related problems, and preventive behavior among music students across their complete 4- to 5-year study period. In a longitudinal, observational study, we followed students during their university training and measured their psychological and physical health status and preventive behavior using standardized questionnaires at four different times. The data were in accordance with previous findings. They demonstrated three groups of health characteristics observed in beginners of music study: healthy students (cluster 1), students with preclinical symptoms (cluster 2), and students who are clinically symptomatic (cluster 3). In total, 64% of all students remained in the same cluster group during their whole university training. About 10% of the students showed considerable health problems and belonged to the third cluster group. The three clusters of health characteristics found in this longitudinal study with music students necessitate that prevention programs for musicians must be adapted to the target audience.
[Typologies of Madrid's citizens (Spain) at the end-of-life: cluster analysis].

PubMed

Ortiz-Gonçalves, Belén; Perea-Pérez, Bernardo; Labajo González, Elena; Albarrán Juan, Elena; Santiago-Sáez, Andrés

2018-03-06

To establish typologies within Madrid's citizens (Spain) with regard to end-of-life by cluster analysis. The SPAD 8 programme was implemented in a sample from a health care centre in the autonomous region of Madrid (Spain). A multiple correspondence analysis technique was used, followed by a cluster analysis to create a dendrogram. A cross-sectional study was made beforehand with the results of the questionnaire. Five clusters stand out. Cluster 1: a group who preferred not to answer numerous questions (5%). Cluster 2: in favour of receiving palliative care and euthanasia (40%). Cluster 3: would oppose assisted suicide and would not ask for spiritual assistance (15%). Cluster 4: would like to receive palliative care and assisted suicide (16%). Cluster 5: would oppose assisted suicide and would ask for spiritual assistance (24%). The following four clusters stood out. Clusters 2 and 4 would like to receive palliative care, euthanasia (2) and assisted suicide (4). Clusters 4 and 5 regularly practiced their faith and their family members did not receive palliative care. Clusters 3 and 5 would be opposed to euthanasia and assisted suicide in particular. Clusters 2, 4 and 5 had not completed an advance directive document (2, 4 and 5). Clusters 2 and 3 seldom practiced their faith. This study could be taken into consideration to improve the quality of end-of-life care choices. Copyright © 2017 SESPAS. Publicado por Elsevier España, S.L.U. All rights reserved.
Analysis of genetic association in Listeria and Diabetes using Hierarchical Clustering and Silhouette Index

NASA Astrophysics Data System (ADS)

Pagnuco, Inti A.; Pastore, Juan I.; Abras, Guillermo; Brun, Marcel; Ballarin, Virginia L.

2016-04-01

It is usually assumed that co-expressed genes suggest co-regulation in the underlying regulatory network. Determining sets of co-expressed genes is an important task, where significant groups of genes are defined based on some criteria. This task is usually performed by clustering algorithms, where the whole family of genes, or a subset of them, is clustered into meaningful groups based on their expression values in a set of experiments. In this work we used a methodology based on the Silhouette index as a measure of cluster quality for individual gene groups, and a combination of several variants of hierarchical clustering to generate the candidate groups, to obtain sets of co-expressed genes for two real data examples. We analyzed the quality of the best ranked groups, obtained by the algorithm, using an online bioinformatics tool that provides network information for the selected genes. Moreover, to verify the performance of the algorithm, considering the fact that it does not find all possible subsets, we compared its results against a full search, to determine the amount of good co-regulated sets not detected.
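
As a rough illustration of using the Silhouette index to rank individual groups, the sketch below computes the mean silhouette width of each cluster. The expression values and labels are hypothetical, Euclidean distance is an assumed choice, and the candidate groups are given by hand rather than generated by hierarchical clustering as in the paper.

```python
from math import dist

def silhouette_by_cluster(points, labels):
    """Mean silhouette width per cluster; needs at least two clusters."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    def mean_dist(i, members):
        others = [j for j in members if j != i]
        return sum(dist(points[i], points[j]) for j in others) / len(others)

    scores = {}
    for lab, members in clusters.items():
        widths = []
        for i in members:
            if len(members) == 1:
                widths.append(0.0)  # usual convention for singletons
                continue
            a = mean_dist(i, members)          # cohesion within own cluster
            b = min(mean_dist(i, m)            # separation from nearest other
                    for other, m in clusters.items() if other != lab)
            widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
        scores[lab] = sum(widths) / len(widths)
    return scores

# Hypothetical expression profiles (one tuple per gene, three experiments).
points = [(1.0, 1.1, 0.9), (1.1, 1.0, 1.0), (0.9, 1.0, 1.1),
          (5.0, 5.2, 4.9), (5.1, 5.0, 5.0), (1.0, 5.0, 3.0)]
labels = [0, 0, 0, 1, 1, 1]
scores = silhouette_by_cluster(points, labels)
for lab in sorted(scores, key=scores.get, reverse=True):
    print("group", lab, "mean silhouette", round(scores[lab], 3))
```

Group 1 contains a gene whose profile sits between the two groups, so it ranks below the tight group 0. Ranking groups this way is what allows the best candidates to be picked from several clusterings.
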
Spatial pattern recognition of seismic events in South West Colombia

NASA Astrophysics Data System (ADS)

Benítez, Hernán D.; Flórez, Juan F.; Duque, Diana P.; Benavides, Alberto; Lucía Baquero, Olga; Quintero, Jiber

2013-09-01

Recognition of seismogenic zones in geographical regions supports seismic hazard studies. This recognition is usually based on visual, qualitative and subjective analysis of data. Spatial pattern recognition provides a well founded means to obtain relevant information from large amounts of data. The purpose of this work is to identify and classify spatial patterns in instrumental data of the South West Colombian seismic database. In this research, clustering tendency analysis validates whether seismic database possesses a clustering structure. A non-supervised fuzzy clustering algorithm creates groups of seismic events. Given the sensitivity of fuzzy clustering algorithms to centroid initial positions, we proposed a methodology to initialize centroids that generates stable partitions with respect to centroid initialization. As a result of this work, a public software tool provides the user with the routines developed for clustering methodology. The analysis of the seismogenic zones obtained reveals meaningful spatial patterns in South-West Colombia. The clustering analysis provides a quantitative location and dispersion of seismogenic zones that facilitates seismological interpretations of seismic activities in South West Colombia.
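
The sensitivity to initial centroid positions mentioned above can be made concrete with a small fuzzy c-means sketch. The abstract does not describe the authors' initialization method, so the deterministic farthest-point seeding used here is only an assumed stand-in, and the epicentre coordinates are hypothetical.

```python
from math import dist

def farthest_point_init(points, c):
    """Deterministic seeding: first point, then repeatedly the point
    farthest from all centroids chosen so far (an assumed scheme)."""
    centroids = [points[0]]
    while len(centroids) < c:
        centroids.append(
            max(points, key=lambda p: min(dist(p, q) for q in centroids)))
    return centroids

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6):
    """Standard fuzzy c-means with fuzzifier m; returns (centroids, u)."""
    centroids = farthest_point_init(points, c)
    dims = len(points[0])
    u = []
    for _ in range(iters):
        # Membership u[i][k] of point i in cluster k; each row sums to 1.
        u = []
        for p in points:
            d = [max(dist(p, q), 1e-12) for q in centroids]
            u.append([1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0))
                                for j in range(c)) for k in range(c)])
        # New centroids are membership-weighted means of the points.
        new = []
        for k in range(c):
            w = [row[k] ** m for row in u]
            total = sum(w)
            new.append(tuple(sum(wi * p[dim] for wi, p in zip(w, points)) / total
                             for dim in range(dims)))
        shift = max(dist(a, b) for a, b in zip(centroids, new))
        centroids = new
        if shift < tol:
            break
    return centroids, u

# Hypothetical epicentres as (longitude, latitude) pairs.
events = [(-77.0, 1.0), (-77.1, 1.2), (-76.9, 1.1),
          (-75.0, 3.0), (-75.2, 3.1), (-74.9, 2.9)]
centroids, memberships = fuzzy_c_means(events, c=2)
for event, row in zip(events, memberships):
    print(event, [round(x, 3) for x in row])
```

Because the seeding contains no random draw, repeated runs give the same partition, which is the kind of stability with respect to initialization that the abstract refers to.
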
Using cluster analysis to organize and explore regional GPS velocities

USGS Publications Warehouse

Simpson, Robert W.; Thatcher, Wayne; Savage, James C.

2012-01-01

Cluster analysis offers a simple visual exploratory tool for the initial investigation of regional Global Positioning System (GPS) velocity observations, which are providing increasingly precise mappings of actively deforming continental lithosphere. The deformation fields from dense regional GPS networks can often be concisely described in terms of relatively coherent blocks bounded by active faults, although the choice of blocks, their number and size, can be subjective and is often guided by the distribution of known faults. To illustrate our method, we apply cluster analysis to GPS velocities from the San Francisco Bay Region, California, to search for spatially coherent patterns of deformation, including evidence of block-like behavior. The clustering process identifies four robust groupings of velocities that we identify with four crustal blocks. Although the analysis uses no prior geologic information other than the GPS velocities, the cluster/block boundaries track three major faults, both locked and creeping.
Predicting healthcare outcomes in prematurely born infants using cluster analysis.

PubMed

MacBean, Victoria; Lunt, Alan; Drysdale, Simon B; Yarzi, Muska N; Rafferty, Gerrard F; Greenough, Anne

2018-05-23

Prematurely born infants are at high risk of respiratory morbidity following neonatal unit discharge, though prediction of outcomes is challenging. We have tested the hypothesis that cluster analysis would identify discrete groups of prematurely born infants with differing respiratory outcomes during infancy. A total of 168 infants (median (IQR) gestational age 33 (31-34) weeks) were recruited in the neonatal period from consecutive births in a tertiary neonatal unit. The baseline characteristics of the infants were used to classify them into hierarchical agglomerative clusters. Rates of viral lower respiratory tract infections (LRTIs) were recorded for 151 infants in the first year after birth. Infants could be classified according to birth weight (BW) and duration of neonatal invasive mechanical ventilation (MV) into three clusters. Cluster one (MV ≤5 days) had few LRTIs. Clusters two and three (both MV ≥6 days, but BW ≥ or <882 g, respectively) had significantly higher LRTI rates. Cluster two had a higher proportion of infants experiencing respiratory syncytial virus LRTIs (P = 0.01) and cluster three a higher proportion of rhinovirus LRTIs (P < 0.001). CONCLUSIONS: Readily available clinical data allowed classification of prematurely born infants into one of three distinct groups with differing subsequent respiratory morbidity in infancy. © 2018 Wiley Periodicals, Inc.
Comparison of a non-stationary voxelation-corrected cluster-size test with TFCE for group-Level MRI inference.

PubMed

Li, Huanjie; Nickerson, Lisa D; Nichols, Thomas E; Gao, Jia-Hong

2017-03-01

Two powerful methods for statistical inference on MRI brain images have been proposed recently, a non-stationary voxelation-corrected cluster-size test (CST) based on random field theory and threshold-free cluster enhancement (TFCE) based on calculating the level of local support for a cluster, then using permutation testing for inference. Unlike other statistical approaches, these two methods do not rest on the assumptions of a uniform and high degree of spatial smoothness of the statistic image. Thus, they are strongly recommended for group-level fMRI analysis compared to other statistical methods. In this work, the non-stationary voxelation-corrected CST and TFCE methods for group-level analysis were evaluated for both stationary and non-stationary images under varying smoothness levels, degrees of freedom and signal to noise ratios. Our results suggest that, both methods provide adequate control for the number of voxel-wise statistical tests being performed during inference on fMRI data and they are both superior to current CSTs implemented in popular MRI data analysis software packages. However, TFCE is more sensitive and stable for group-level analysis of VBM data. Thus, the voxelation-corrected CST approach may confer some advantages by being computationally less demanding for fMRI data analysis than TFCE with permutation testing and by also being applicable for single-subject fMRI analyses, while the TFCE approach is advantageous for VBM data. Hum Brain Mapp 38:1269-1280, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
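
For orientation, the "level of local support" that TFCE computes is usually written as follows. The notation is the commonly used formulation rather than anything stated in the abstract itself, and the exponent values are the customary defaults, given here as an assumption about typical usage.

```latex
% TFCE score at voxel p with statistic value h_p:
%   e(h) = spatial extent of the connected cluster that contains p
%          when the statistic image is thresholded at height h
%   h_0  = lowest threshold considered (typically 0)
%   E, H = extent and height exponents (commonly E = 0.5, H = 2)
\[
  \mathrm{TFCE}(p) \;=\; \int_{h_0}^{h_p} e(h)^{E}\, h^{H}\, \mathrm{d}h
\]
```

The integral replaces the single cluster-forming threshold needed by a cluster-size test, which is why inference on the resulting scores is usually done by permutation.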
A cluster merging method for time series microarray with production values.

PubMed

Chira, Camelia; Sedano, Javier; Camara, Monica; Prieto, Carlos; Villar, Jose R; Corchado, Emilio

2014-09-01

A challenging task in time-course microarray data analysis is to cluster genes meaningfully combining the information provided by multiple replicates covering the same key time points. This paper proposes a novel cluster merging method to accomplish this goal obtaining groups with highly correlated genes. The main idea behind the proposed method is to generate a clustering starting from groups created based on individual temporal series (representing different biological replicates measured in the same time points) and merging them by taking into account the frequency by which two genes are assembled together in each clustering. The gene groups at the level of individual time series are generated using several shape-based clustering methods. This study is focused on a real-world time series microarray task with the aim to find co-expressed genes related to the production and growth of a certain bacteria. The shape-based clustering methods used at the level of individual time series rely on identifying similar gene expression patterns over time which, in some models, are further matched to the pattern of production/growth. The proposed cluster merging method is able to produce meaningful gene groups which can be naturally ranked by the level of agreement on the clustering among individual time series. The list of clusters and genes is further sorted based on the information correlation coefficient and new problem-specific relevant measures. Computational experiments and results of the cluster merging method are analyzed from a biological perspective and further compared with the clustering generated based on the mean value of time series and the same shape-based algorithm.
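
The idea of merging clusterings by how often two genes are grouped together can be sketched with a co-association count. This is a simplified stand-in rather than the paper's method: gene names and labels are hypothetical, and linking every pair whose agreement reaches a threshold (connected components) is an assumed merging rule.

```python
from itertools import combinations

def co_association(clusterings):
    """Fraction of clusterings in which each gene pair shares a cluster."""
    counts = {}
    for labels in clusterings:  # one {gene: cluster label} dict per replicate
        for g, h in combinations(sorted(labels), 2):
            if labels[g] == labels[h]:
                counts[g, h] = counts.get((g, h), 0) + 1
    return {pair: n / len(clusterings) for pair, n in counts.items()}

def merge_clusters(clusterings, min_agreement=1.0):
    """Group genes linked by pairs with agreement >= min_agreement.
    Assumes every clustering covers the same set of genes."""
    genes = sorted(clusterings[0])
    parent = {g: g for g in genes}

    def find(g):
        # Union-find root lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for (g, h), freq in co_association(clusterings).items():
        if freq >= min_agreement:
            parent[find(g)] = find(h)
    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())

# Hypothetical clusterings from three biological replicates.
replicates = [
    {"geneA": 0, "geneB": 0, "geneC": 1, "geneD": 1},
    {"geneA": 1, "geneB": 1, "geneC": 0, "geneD": 0},
    {"geneA": 0, "geneB": 0, "geneC": 0, "geneD": 1},
]
strict = merge_clusters(replicates, min_agreement=1.0)
relaxed = merge_clusters(replicates, min_agreement=0.6)
print("full agreement:", strict)
print("agreement >= 0.6:", relaxed)
```

Cluster labels need not match across replicates, since only co-membership is counted. The agreement level at which a group forms gives a natural ranking of the merged groups, in the spirit of the ranking described in the abstract.
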
Water quality assessment with hierarchical cluster analysis based on Mahalanobis distance.

PubMed

Du, Xiangjun; Shao, Fengjing; Wu, Shunyao; Zhang, Hanlin; Xu, Si

2017-07-01

Water quality assessment is crucial for assessment of marine eutrophication, prediction of harmful algal blooms, and environment protection. Previous studies have developed many numeric modeling methods and data driven approaches for water quality assessment. The cluster analysis, an approach widely used for grouping data, has also been employed. However, there are complex correlations between water quality variables, which play important roles in water quality assessment but have always been overlooked. In this paper, we analyze correlations between water quality variables and propose an alternative method for water quality assessment with hierarchical cluster analysis based on Mahalanobis distance. Further, we cluster water quality data collected from coastal water of Bohai Sea and North Yellow Sea of China, and apply clustering results to evaluate its water quality. To evaluate the validity, we also cluster the water quality data with cluster analysis based on Euclidean distance, which is widely adopted by previous studies. The results show that our method is more suitable for water quality assessment with many correlated water quality variables. To our knowledge, it is the first attempt to apply Mahalanobis distance for coastal water quality assessment.
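
Why correlated variables matter for the choice of distance can be seen in a small two-variable sketch. The sample values are hypothetical, and the code is restricted to two variables so that the covariance matrix can be inverted by hand; it shows only the distance, not the full hierarchical clustering used in the paper.

```python
from math import dist, sqrt

def covariance_2d(samples):
    """Sample covariance terms (sxx, sxy, syy) of two variables."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return sxx, sxy, syy

def mahalanobis_2d(p, q, cov):
    """sqrt(d^T S^-1 d), using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - q[0], p[1] - q[1]
    return sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

# Hypothetical measurements of two strongly correlated variables.
samples = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]
cov = covariance_2d(samples)

along = ((2.0, 2.0), (3.0, 3.0))    # pair separated along the correlation
against = ((2.0, 3.0), (3.0, 2.0))  # pair separated against it
d_along = mahalanobis_2d(*along, cov)
d_against = mahalanobis_2d(*against, cov)
print("Euclidean:", round(dist(*along), 3), round(dist(*against), 3))
print("Mahalanobis:", round(d_along, 3), round(d_against, 3))
```

Both pairs are equally far apart in Euclidean terms, but the Mahalanobis distance treats the second pair as far more dissimilar because it runs against the observed correlation. Feeding such distances into a hierarchical clustering is the general idea the abstract describes.
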
Characterizing Heterogeneity within Head and Neck Lesions Using Cluster Analysis of Multi-Parametric MRI Data

PubMed Central

Borri, Marco; Schmidt, Maria A.; Powell, Ceri; Koh, Dow-Mu; Riddell, Angela M.; Partridge, Mike; Bhide, Shreerang A.; Nutting, Christopher M.; Harrington, Kevin J.; Newbold, Katie L.; Leach, Martin O.

2015-01-01

Purpose To describe a methodology, based on cluster analysis, to partition multi-parametric functional imaging data into groups (or clusters) of similar functional characteristics, with the aim of characterizing functional heterogeneity within head and neck tumour volumes. To evaluate the performance of the proposed approach on a set of longitudinal MRI data, analysing the evolution of the obtained sub-sets with treatment. Material and Methods The cluster analysis workflow was applied to a combination of dynamic contrast-enhanced and diffusion-weighted imaging MRI data from a cohort of squamous cell carcinoma of the head and neck patients. Cumulative distributions of voxels, containing pre and post-treatment data and including both primary tumours and lymph nodes, were partitioned into k clusters (k = 2, 3 or 4). Principal component analysis and cluster validation were employed to investigate data composition and to independently determine the optimal number of clusters. The evolution of the resulting sub-regions with induction chemotherapy treatment was assessed relative to the number of clusters. Results The clustering algorithm was able to separate clusters which significantly reduced in voxel number following induction chemotherapy from clusters with a non-significant reduction. Partitioning with the optimal number of clusters (k = 4), determined with cluster validation, produced the best separation between reducing and non-reducing clusters. Conclusion The proposed methodology was able to identify tumour sub-regions with distinct functional properties, independently separating clusters which were affected differently by treatment. This work demonstrates that unsupervised cluster analysis, with no prior knowledge of the data, can be employed to provide a multi-parametric characterization of functional heterogeneity within tumour volumes. PMID:26398888
Reproducibility of Cognitive Profiles in Psychosis Using Cluster Analysis.

PubMed

Lewandowski, Kathryn E; Baker, Justin T; McCarthy, Julie M; Norris, Lesley A; Öngür, Dost

2018-04-01

Cognitive dysfunction is a core symptom dimension that cuts across the psychoses. Recent findings support classification of patients along the cognitive dimension using cluster analysis; however, data-derived groupings may be highly determined by sampling characteristics and the measures used to derive the clusters, and so their interpretability must be established. We examined cognitive clusters in a cross-diagnostic sample of patients with psychosis and associations with clinical and functional outcomes. We then compared our findings to a previous report of cognitive clusters in a separate sample using a different cognitive battery. Participants with affective or non-affective psychosis (n=120) and healthy controls (n=31) were administered the MATRICS Consensus Cognitive Battery, and clinical and community functioning assessments. Cluster analyses were performed on cognitive variables, and clusters were compared on demographic, cognitive, and clinical measures. Results were compared to findings from our previous report. A four-cluster solution provided a good fit to the data; profiles included a neuropsychologically normal cluster, a globally impaired cluster, and two clusters of mixed profiles. Cognitive burden was associated with symptom severity and poorer community functioning. The patterns of cognitive performance by cluster were highly consistent with our previous findings. We found evidence of four cognitive subgroups of patients with psychosis, with cognitive profiles that map closely to those produced in our previous work. Clusters were associated with clinical and community variables and a measure of premorbid functioning, suggesting that they reflect meaningful groupings: replicable, and related to clinical presentation and functional outcomes. (JINS, 2018, 24, 382-390).
Classification of municipal occupations.

PubMed

Ilmarinen, J; Suurnäkki, T; Nygård, C H; Landau, K

1991-01-01

Eighty-eight job titles were analyzed with the "ergonomic job analysis procedure" [Arbeitswissenschaftliche Erhebungsverfahren zur Tätigkeits-analyse abbreviated (AET) in German]. The objective was to classify the wide range of municipal jobs into homogeneous groups according to job demand and to provide better possibilities to study the relationships between work and health among the aging municipal working population. Altogether 216 items were classified. First, a hierarchical cluster analysis was made, and a dendrogram of the analyzed job titles was drawn. Second, a profile analysis was done in which the single items were grouped into 39 sum items, and a graphic profile was drawn. Finally, the stress factors were listed and drawn in ranking order. The cluster analysis formed 13 groups. Groups exposed to the highest stress factor level were kitchen supervisors, dentists, and physicians. More than 10 stress factors (greater than 50% of the maximum) were found in nursing, administration, installation, transport, and technical supervision.
Concept mapping and network analysis: an analytic approach to measure ties among constructs.

PubMed

Goldman, Alyssa W; Kane, Mary

2014-12-01

Group concept mapping is a mixed-methods approach that helps a group visually represent its ideas on a topic of interest through a series of related maps. The maps and additional graphics are useful for planning, evaluation and theory development. Group concept maps are typically described, interpreted and utilized through points, clusters and distances, and the implications of these features in understanding how constructs relate to one another. This paper focuses on the application of network analysis to group concept mapping to quantify the strength and directionality of relationships among clusters. The authors outline the steps of this analysis, and illustrate its practical use through an organizational strategic planning example. Additional benefits of this analysis to evaluation projects are also discussed, supporting the overall utility of this supplemental technique to the standard concept mapping methodology. Copyright © 2014 Elsevier Ltd. All rights reserved.

Molecular Eigensolution Symmetry Analysis and Fine Structure

PubMed Central

Harter, William G.; Mitchell, Justin C.

2013-01-01

Spectra of high-symmetry molecules contain fine and superfine level cluster structure related to J-tunneling between hills and valleys on rovibronic energy surfaces (RES). Such graphic visualizations help disentangle multi-level dynamics, selection rules, and state mixing effects including widespread violation of nuclear spin symmetry species. A review of RES analysis compares it to that of potential energy surfaces (PES) used in Born–Oppenheimer approximations. Both take advantage of adiabatic coupling in order to visualize Hamiltonian eigensolutions. RES of symmetric and D2 asymmetric top rank-2-tensor Hamiltonians are compared with Oh spherical top rank-4-tensor fine-structure clusters of 6-fold and 8-fold tunneling multiplets. Then extreme 12-fold and 24-fold multiplets are analyzed by RES plots of higher rank tensor Hamiltonians. Such extreme clustering is rare in fundamental bands but prevalent in hot bands, and analysis of its superfine structure requires more efficient labeling and a more powerful group theory. This is introduced using elementary examples involving two groups of order-6 (C6 and D3~C3v), then applied to families of Oh clusters in SF6 spectra and to extreme clusters. PMID:23344041
First CCD UBVI photometric analysis of six open cluster candidates

NASA Astrophysics Data System (ADS)

Piatti, A. E.; Clariá, J. J.; Ahumada, A. V.

2011-04-01

We have obtained CCD UBVI_KC photometry down to V ~ 22 for the open cluster candidates Haffner 3, Haffner 5, NGC 2368, Haffner 25, Hogg 3 and Hogg 4 and their surrounding fields. None of these objects have been photometrically studied so far. Our analysis shows that these stellar groups are not genuine open clusters since no clear main sequences or other meaningful features can be seen in their colour-magnitude and colour-colour diagrams. We checked for possible differential reddening across the studied fields that could be hiding the characteristics of real open clusters. However, the dust in the directions to these objects appears to be uniformly distributed. Moreover, star counts carried out within and outside the open cluster candidate fields do not support the hypothesis that these objects are real open clusters or even open cluster remnants.
Cluster analysis as a prediction tool for pregnancy outcomes.

PubMed

Banjari, Ines; Kenjerić, Daniela; Šolić, Krešimir; Mandić, Milena L

2015-03-01

Considering the specific physiological changes during gestation and thinking of pregnancy as a "critical window", classification of pregnant women at early pregnancy can be considered crucial. The paper demonstrates the use of cluster analysis, a method based on an approach from intelligent data mining. Cluster analysis is a statistical method which makes it possible to group individuals based on sets of identifying variables. The method was chosen in order to determine the possibility of classifying pregnant women at early pregnancy and to analyze unknown correlations between different variables so that certain outcomes could be predicted. A total of 222 pregnant women from two general obstetric offices were recruited. The main focus was on characteristics of these pregnant women: their age, pre-pregnancy body mass index (BMI) and haemoglobin value. Cluster analysis achieved a 94.1% classification accuracy rate with three branches or groups of pregnant women showing statistically significant correlations with pregnancy outcomes. The results show that pregnant women of both older age and higher pre-pregnancy BMI have a significantly higher incidence of delivering a baby of higher birth weight, but they gain significantly less weight during pregnancy. Their babies are also longer, and these women have a significantly higher probability of complications during pregnancy (gestosis) and a higher probability of induced or caesarean delivery. We can conclude that the cluster analysis method can appropriately classify pregnant women at early pregnancy to predict certain outcomes.
Optimizing disinfection by-product monitoring points in a distribution system using cluster analysis.

PubMed

Delpla, Ianis; Florea, Mihai; Pelletier, Geneviève; Rodriguez, Manuel J

2018-06-04

Trihalomethanes (THMs) and Haloacetic Acids (HAAs) are the main groups detected in drinking water and are consequently strictly regulated. However, the increasing quantity of data for disinfection byproducts (DBPs) produced from research projects and regulatory programs remains largely unexploited, despite a great potential for its use in optimizing drinking water quality monitoring to meet specific objectives. In this work, we developed a procedure to optimize locations and periods for DBPs monitoring based on a set of monitoring scenarios using the cluster analysis technique. The optimization procedure used a robust set of spatio-temporal monitoring results on DBPs (THMs and HAAs) generated from intensive sampling campaigns conducted in a residential sector of a water distribution system. Results show that cluster analysis allows for the classification of water quality in different groups of THMs and HAAs according to their similarities, and the identification of locations presenting water quality concerns. By using cluster analysis with different monitoring objectives, this work provides a set of monitoring solutions and a comparison between various monitoring scenarios for decision-making purposes. Finally, it was demonstrated that the data from intensive monitoring of free chlorine residual and water temperature as DBP proxy parameters, when processed using cluster analysis, could also help identify the optimal sampling points and periods for regulatory THMs and HAAs monitoring. Copyright © 2018 Elsevier Ltd. All rights reserved.
An exploratory analysis for Lean and Six Sigma implementation in hospitals: Together is better?

PubMed

Lee, Jung Young; McFadden, Kathleen L; Gowen, Charles R

Despite the increasing interest in Lean and Six Sigma implementations in hospitals, there has been little empirical evidence that goes beyond descriptive case studies to address the current status and the effectiveness of the implementations. The aim of this study was to explore existing patterns of Lean and Six Sigma implementation in U.S. hospitals and compare the performance of the different patterns. We collected data from 215 U.S. hospitals via a survey that includes measurement items developed from related literature. Using the cross-sectional data, we conducted a cluster analysis, followed by t tests, chi-square tests, and regression analyses for cluster verification. The cluster analysis identifies two clusters, a Moderate Six Sigma group and a Lean Six Sigma group. Results show that the Lean Six Sigma group outperforms the Moderate Six Sigma group across many performance dimensions: responsiveness capability, patient safety, and possibly cost saving. In addition, the Lean Six Sigma group tends to be composed of larger, private teaching hospitals located in more urban areas, and they employ more resources for quality improvement. Our research contributes to the quality management literature by supporting the possible complementary relationship between Lean and Six Sigma in hospitals. Our study encourages practitioners and managers to pay more attention to Lean implementation. Although Lean seems to be conducted in a limited fashion in many hospitals, it should be expanded and combined with Six Sigma for better results.
Preliminary Cluster Analysis For Several Representatives Of Genus Kerivoula (Chiroptera: Vespertilionidae) in Borneo

NASA Astrophysics Data System (ADS)

Hasan, Noor Haliza; Abdullah, M. T.

2008-01-01

The aim of the study is to use cluster analysis on morphometric parameters within the genus Kerivoula to produce a dendrogram and to determine the suitability of this method to describe the relationship among species within this genus. A total of 15 adult male individuals from the genus Kerivoula, taken from sampling trips around Borneo and from specimens kept at the zoological museum of Universiti Malaysia Sarawak, were examined. A total of 27 characters using dental, skull and external body measurements were recorded. Clustering analysis illustrated the grouping and morphometric relationships between the species of this genus. It clearly separated the species from one another despite the overlapping measurements of some species within the genus. Cluster analysis provides an alternative approach to make a preliminary identification of a species.
An Empirical Taxonomy of Youths' Fears: Cluster Analysis of the American Fear Survey Schedule

ERIC Educational Resources Information Center

Burnham, Joy J.; Schaefer, Barbara A.; Giesen, Judy

2006-01-01

Fear profiles among children and adolescents were explored using the Fear Survey Schedule for Children-American version (FSSC-AM; J.J. Burnham, 1995, 2005). Eight cluster profiles were identified via multistage Euclidean grouping and supported by homogeneity coefficients and replication. Four clusters reflected overall level of fears (i.e., very…
Analysis of indoor air pollutants checklist using environmetric technique for health risk assessment of sick building complaint in nonindustrial workplace

PubMed Central

Syazwan, AI; Rafee, B Mohd; Juahir, Hafizan; Azman, AZF; Nizar, AM; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, AA; Yunos, MA Syafiq; Anita, AR; Hanafiah, J Muhamad; Shaharuddin, MS; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, MN Mohamad; Azizan, HS; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, FT

2012-01-01

Purpose: To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. Design: A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. Method: A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Result: Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significant high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. Conclusion: This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures. PMID:23055779
Analysis of indoor air pollutants checklist using environmetric technique for health risk assessment of sick building complaint in nonindustrial workplace.

PubMed

Syazwan, Ai; Rafee, B Mohd; Juahir, Hafizan; Azman, Azf; Nizar, Am; Izwyn, Z; Syahidatussyakirah, K; Muhaimin, Aa; Yunos, Ma Syafiq; Anita, Ar; Hanafiah, J Muhamad; Shaharuddin, Ms; Ibthisham, A Mohd; Hasmadi, I Mohd; Azhar, Mn Mohamad; Azizan, Hs; Zulfadhli, I; Othman, J; Rozalini, M; Kamarul, Ft

2012-01-01

To analyze and characterize a multidisciplinary, integrated indoor air quality checklist for evaluating the health risk of building occupants in a nonindustrial workplace setting. A cross-sectional study based on a participatory occupational health program conducted by the National Institute of Occupational Safety and Health (Malaysia) and Universiti Putra Malaysia. A modified version of the indoor environmental checklist published by the Department of Occupational Health and Safety, based on the literature and discussion with occupational health and safety professionals, was used in the evaluation process. Summated scores were given according to the cluster analysis and principal component analysis in the characterization of risk. Environmetric techniques were used to classify the risk of variables in the checklist. Identification of the possible source of item pollutants was also evaluated from a semiquantitative approach. Hierarchical agglomerative cluster analysis resulted in the grouping of factorial components into three clusters (high complaint, moderate-high complaint, moderate complaint), which were further analyzed by discriminant analysis. From this, 15 major variables that influence indoor air quality were determined. Principal component analysis of each cluster revealed that the main factors influencing the high complaint group were fungal-related problems, chemical indoor dispersion, detergent, renovation, thermal comfort, and location of fresh air intake. The moderate-high complaint group showed significant high loading on ventilation, air filters, and smoking-related activities. The moderate complaint group showed high loading on dampness, odor, and thermal comfort. This semiquantitative assessment, which graded risk from low to high based on the intensity of the problem, shows promising and reliable results. It should be used as an important tool in the preliminary assessment of indoor air quality and as a categorizing method for further IAQ investigations and complaints procedures.
Molecular characterization and population structure study of cambuci: strategy for conservation and genetic improvement.

PubMed

Santos, D N; Nunes, C F; Setotaw, T A; Pio, R; Pasqual, M; Cançado, G M A

2016-12-19

Cambuci (Campomanesia phaea) belongs to the Myrtaceae family and is native to the Atlantic Forest of Brazil. It has ecological and social appeal but is exposed to problems associated with environmental degradation and expansion of agricultural activities in the region. Comprehensive studies on this species are rare, making its conservation and genetic improvement difficult. Thus, it is important to develop research activities to understand the current situation of the species as well as to make recommendations for its conservation and use. This study was performed to characterize the cambuci accessions found in the germplasm bank of Coordenadoria de Assistência Técnica Integral using inter-simple sequence repeat markers, with the goal of understanding the plant's population structure. The results showed the existence of some level of genetic diversity among the cambuci accessions that could be exploited for the genetic improvement of the species. Principal coordinate analysis and discriminant analysis clustered the 80 accessions into three groups, whereas Bayesian model-based clustering analysis clustered them into two groups. The formation of two cluster groups and the high membership coefficients within the groups pointed out the importance of further collection to cover more areas and more genetic variability within the species. The study also showed the lack of conservation activities; therefore, more attention from the appropriate organizations is needed to plan and implement natural and ex situ conservation activities.
Sub-grouping patients with non-specific low back pain based on cluster analysis of discriminatory clinical items.

PubMed

Billis, Evdokia; McCarthy, Christopher J; Roberts, Chris; Gliatis, John; Papandreou, Maria; Gioftsos, George; Oldham, Jacqueline A

2013-02-01

To identify potential subgroups amongst patients with non-specific low back pain based on a consensus list of potentially discriminatory examination items. Exploratory study. A convenience sample of 106 patients with non-specific low back pain (43 males, 63 females, mean age 36 years, standard deviation 15.9 years) and 7 physiotherapists. Based on 3 focus groups and a two-round Delphi involving 23 health professionals and a random stratified sample of 150 physiotherapists, respectively, a comprehensive examination list comprising the most "discriminatory" items was compiled. Following reliability analysis, the most reliable clinical items were assessed with a sample of patients with non-specific low back pain. K-means cluster analysis was conducted for 2-, 3- and 4-cluster options to explore for meaningful homogenous subgroups. The most clinically meaningful cluster was a two-subgroup option, comprising a small group (n = 24) with more severe clinical presentation (i.e. more widespread pain, functional and sleeping problems, other symptoms, increased investigations undertaken, more severe clinical signs, etc.) and a larger less dysfunctional group (n = 80). A number of potentially discriminatory clinical items were identified by health professionals and sub-classified, based on a sample of patients with non-specific low back pain, into two subgroups. However, further work is needed to validate this classification process.
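The comparison of 2-, 3- and 4-cluster k-means options described in this record can be illustrated with a minimal sketch. The data, the `kmeans` helper, and the farthest-point initialisation below are all assumptions made for illustration; they are not the study's clinical items or software, and the study selected its solution on clinical meaningfulness rather than on the within-cluster sum of squares printed here.

```python
import math

def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (labels, centroids, within-cluster SS)."""
    # Deterministic farthest-point initialisation: an assumption made so the
    # sketch is reproducible. Real analyses usually try several random starts.
    centroids = [points[0]]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(far)
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    wss = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, wss

# Hypothetical two-item scores for eight patients (not data from the study).
scores = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]

for k in (2, 3, 4):
    labels, _, wss = kmeans(scores, k)
    sizes = [labels.count(j) for j in range(k)]
    print(f"k={k}: cluster sizes={sizes}, within-cluster SS={wss:.2f}")
```

The within-cluster sum of squares generally shrinks as k grows, so it cannot pick the number of clusters by itself; interpretability of the resulting subgroups, as in the record above, is one common deciding criterion.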
Pathological and non-pathological variants of restrictive eating behaviors in middle childhood: A latent class analysis.

PubMed

Schmidt, Ricarda; Vogel, Mandy; Hiemisch, Andreas; Kiess, Wieland; Hilbert, Anja

2018-08-01

Although restrictive eating behaviors are very common during early childhood, their precise nature and clinical correlates remain unclear. In particular, there is little evidence on restrictive eating behaviors in older children and their associations with children's shape concern. The present population-based study sought to delineate subgroups of restrictive eating patterns in N = 799 children aged 7-14 years. Using Latent Class Analysis, children were classified based on six restrictive eating behaviors (for example, picky eating, food neophobia, and eating-related anxiety) and shape concern, separately in three age groups. For cluster validation, sociodemographic and objective anthropometric data, parental feeding practices, and general and eating disorder psychopathology were used. The results showed a 3-cluster solution across all age groups: an asymptomatic class (Cluster 1), a class with restrictive eating behaviors without shape concern (Cluster 2), and a class showing restrictive eating behaviors with prominent shape concern (Cluster 3). The clusters differed in all variables used for validation. In particular, the proportion of children with symptoms of avoidant/restrictive food intake disorder was greater in Cluster 2 than in Clusters 1 and 3. The study underlined the importance of considering shape concern to distinguish between different phenotypes of children's restrictive eating patterns. Longitudinal data are needed to evaluate the clusters' predictive effects on children's growth and development of clinical eating disorders. Copyright © 2018 Elsevier Ltd. All rights reserved.
Fitness as a determinant of arterial stiffness in healthy adult men: a cross-sectional study.

PubMed

Chung, Jinwook; Kim, Milyang; Jin, Youngsoo; Kim, Yonghwan; Hong, Jeeyoung

2018-01-01

Fitness is known to influence arterial stiffness. This study aimed to assess differences in cardiorespiratory endurance, muscular strength, and flexibility according to arterial stiffness, based on sex and age. We enrolled 1590 healthy adults (men: 1242, women: 348) who were free of metabolic syndrome. We measured cardiorespiratory endurance in an exercise stress test on a treadmill, muscular strength by a grip test, and flexibility by upper body forward-bends from a standing position. The brachial-ankle pulse wave velocity test was performed to measure arterial stiffness before the fitness test. Cluster analysis was performed to divide the patients into groups with low (Cluster 1) and high (Cluster 2) arterial stiffness. According to the k-cluster analysis results, Cluster 1 included 624 men and 180 women, and Cluster 2 included 618 men and 168 women. Men in the middle-aged group with low arterial stiffness demonstrated higher cardiorespiratory endurance, muscular strength, and flexibility than those with high arterial stiffness. Similarly, among men in the old-aged group, the cardiorespiratory endurance and muscular strength, but not flexibility, differed significantly according to arterial stiffness. Women in both clusters showed similar cardiorespiratory endurance, muscular strength, and flexibility regardless of their arterial stiffness. Among healthy adults, arterial stiffness was inversely associated with fitness in men but not in women. Therefore, fitness seems to be a determinant for arterial stiffness in men. Additionally, regular exercise should be recommended for middle-aged men to prevent arterial stiffness.
Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms

PubMed Central

Esplin, M Sean; Manuck, Tracy A.; Varner, Michael W.; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M.; Ilekis, John

2015-01-01

Objective: We sought to employ an innovative tool based on common biological pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB), in order to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors responsible for SPTB. Study Design: A secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks gestation. Each woman was assessed for the presence of underlying SPTB etiologies. A hierarchical cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis using VEGAS software. Results: 1028 women with SPTB were assigned phenotypes. Hierarchical clustering of the phenotypes revealed five major clusters. Cluster 1 (N=445) was characterized by maternal stress, cluster 2 (N=294) by premature membrane rupture, cluster 3 (N=120) by familial factors, and cluster 4 (N=63) by maternal comorbidities. Cluster 5 (N=106) was multifactorial, characterized by infection (INF), decidual hemorrhage (DH) and placental dysfunction (PD). These three phenotypes were highly correlated by Chi-square analysis [PD and DH (p<2.2e-6); PD and INF (p=6.2e-10); INF and DH (p=0.0036)]. Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. Conclusion: We identified 5 major clusters of SPTB based on a phenotype tool and hierarchical clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors underlying SPTB. PMID:26070700
The effect of cognitive appraisal for stressors on the oral health-related QOL of dry mouth patients.

PubMed

Matsuoka, Hirofumi; Chiba, Itsuo; Sakano, Yuji; Saito, Ichiro; Abiko, Yoshihiro

2014-01-01

Dry mouth is a very common symptom, and psychological factors have an influence on this symptom. Although the influence of emotional factors related to patients with oral dryness has been examined in previous studies, the cognitive factors have not been examined thus far. The purpose of this study was to examine the influence of cognitive factors on patients with oral dryness. The participants were 106 patients complaining of oral dryness. They were required to complete a questionnaire measuring subjective oral dryness, oral-related QOL, cognition for stressors, and mood state. Correlational analyses revealed that OHIP-14 is significantly related to oral dryness, appraisal for effect, appraisal for threat, and commitment. These correlations were maintained even after controlling for the influence of depression and anxiety. Using oral dryness, appraisal for effect, appraisal for threat, and commitment, cluster analysis was done and three clusters (cluster-1, severe oral dryness; cluster-2, positive cognitive style; cluster-3, negative cognitive style) were extracted. The results of ANOVA showed that the group with severe oral dryness (cluster-1) had a significantly higher score on OHIP-14 than the other two groups. There was no significant difference between the groups with positive (cluster-2) and negative (cluster-3) cognitive style. Although the group of patients with positive cognitive style complained of more severe oral dryness than the group with negative cognitive style, no significant difference was observed between these two groups in OHIP-14. These results indicate that cognitive factors would be a useful therapeutic target for the improvement of the oral-related QOL of patients with oral dryness.
Resemblance profiles as clustering decision criteria: Estimating statistical power, error, and correspondence for a hypothesis test for multivariate structure.

PubMed

Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F

2017-04-01

Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
On Identifying Clusters Within the C-type Asteroids of the Sloan Digital Sky Survey

NASA Astrophysics Data System (ADS)

Poole, Renae; Ziffer, J.; Harvell, T.

2012-10-01

We applied AutoClass, a data mining technique based upon Bayesian Classification, to C-group asteroid colors in the Sloan Digital Sky Survey (SDSS). Previous taxonomic studies relied mostly on Principal Component Analysis (PCA) to differentiate asteroids within the C-group (e.g. B, G, F, Ch, Cg and Cb). AutoClass's advantage is that it calculates the most probable classification for us, removing the human factor from this part of the analysis. In our results, AutoClass divided the C-groups into two large classes and six smaller classes. The two large classes (n=4974 and 2033, respectively) display distinct regions with some overlap in color-vs-color plots. Each cluster's average spectrum is compared to 'typical' spectra of the C-group subtypes as defined by Tholen (1989) and each cluster's members are evaluated for consistency with previous taxonomies. Of the 117 asteroids classified as B-type in previous taxonomies, only 12 were found with SDSS colors that matched our criteria of having less than 0.1 magnitude error in u and 0.05 magnitude error in g, r, i, and z colors. Although this is a relatively small group, 11 of the 12 B-types were placed by AutoClass in the same cluster. By determining the C-group sub-classifications in the large SDSS database, this research furthers our understanding of the stratigraphy and composition of the main-belt.
Quantification and statistical significance analysis of group separation in NMR-based metabonomics studies

PubMed Central

Goodpaster, Aaron M.; Kennedy, Michael A.

2015-01-01

Currently, no standard metrics are used to quantify cluster separation in PCA or PLS-DA scores plots for metabonomics studies or to determine if cluster separation is statistically significant. Lack of such measures makes it virtually impossible to compare independent or inter-laboratory studies and can lead to confusion in the metabonomics literature when authors putatively identify metabolites distinguishing classes of samples based on visual and qualitative inspection of scores plots that exhibit marginal separation. While previous papers have addressed quantification of cluster separation in PCA scores plots, none have advocated routine use of a quantitative measure of separation that is supported by a standard and rigorous assessment of whether or not the cluster separation is statistically significant. Here quantification and statistical significance of separation of group centroids in PCA and PLS-DA scores plots are considered. The Mahalanobis distance is used to quantify the distance between group centroids, and the two-sample Hotelling's T² statistic is computed for the data, related to an F-statistic, and then an F-test is applied to determine if the cluster separation is statistically significant. We demonstrate the value of this approach using four datasets containing various degrees of separation, ranging from groups that had no apparent visual cluster separation to groups that had no visual cluster overlap. Widespread adoption of such concrete metrics to quantify and evaluate the statistical significance of PCA and PLS-DA cluster separation would help standardize reporting of metabonomics data. PMID:26246647
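For orientation, the chain from Mahalanobis distance to F-test named in this record can be written out using the textbook two-sample formulas. The notation is an assumption introduced here, not taken from the paper: two groups of sizes n_1 and n_2 in a p-dimensional scores space, with centroids \bar{x}_1, \bar{x}_2 and sample covariance matrices S_1, S_2. The paper's exact formulation may differ in detail.

```latex
% Pooled covariance of the two groups
S_p = \frac{(n_1 - 1)\,S_1 + (n_2 - 1)\,S_2}{n_1 + n_2 - 2}

% Squared Mahalanobis distance between the group centroids
D_M^2 = (\bar{x}_1 - \bar{x}_2)^{\mathsf{T}} \, S_p^{-1} \, (\bar{x}_1 - \bar{x}_2)

% Two-sample Hotelling's T^2 statistic
T^2 = \frac{n_1 n_2}{n_1 + n_2} \, D_M^2

% Conversion to an F-statistic under the null hypothesis of equal centroids
F = \frac{n_1 + n_2 - p - 1}{p\,(n_1 + n_2 - 2)} \, T^2
    \;\sim\; F_{\,p,\; n_1 + n_2 - p - 1}
```

Separation is judged statistically significant when F exceeds the critical value of the F distribution with p and n_1 + n_2 - p - 1 degrees of freedom at the chosen significance level.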
The WAGGS project - I. The WiFeS Atlas of Galactic Globular cluster Spectra

NASA Astrophysics Data System (ADS)

Usher, Christopher; Pastorello, Nicola; Bellstedt, Sabine; Alabi, Adebusola; Cerulo, Pierluigi; Chevalier, Leonie; Fraser-McKelvie, Amelia; Penny, Samantha; Foster, Caroline; McDermid, Richard M.; Schiavon, Ricardo P.; Villaume, Alexa

2017-07-01

We present the WiFeS Atlas of Galactic Globular cluster Spectra, a library of integrated spectra of Milky Way and Local Group globular clusters. We used the WiFeS integral field spectrograph on the Australian National University 2.3 m telescope to observe the central regions of 64 Milky Way globular clusters and 22 globular clusters hosted by the Milky Way's low-mass satellite galaxies. The spectra have wider wavelength coverage (3300-9050 Å) and higher spectral resolution (R = 6800) than existing spectral libraries of Milky Way globular clusters. By including Large and Small Magellanic Cloud star clusters, we extend the coverage of parameter space of existing libraries towards young and intermediate ages. While testing stellar population synthesis models and analysis techniques is the main aim of this library, the observations may also further our understanding of the stellar populations of Local Group globular clusters and make possible the direct comparison of extragalactic globular cluster integrated light observations with well-understood globular clusters in the Milky Way. The integrated spectra are publicly available via the project website.
WISC-R Types of Learning Disabilities: A Profile Analysis with Cross-Validation.

ERIC Educational Resources Information Center

Holcomb, William R.; And Others

1987-01-01

Profiles (Wechsler Intelligence Scale for Children - Revised) of 119 children in five learning disability programs were placed in six homogeneous groups using cluster analysis. One group showed superior intelligence quotient (IQ) with motor coordination deficits and severe emotional problems, while three groups represented children with low IQs…

Students' Perceptions of Motivational Climate and Enjoyment in Finnish Physical Education: A Latent Profile Analysis.

PubMed

Jaakkola, Timo; Wang, C K John; Soini, Markus; Liukkonen, Jarmo

2015-09-01

The purpose of this study was to identify student clusters with homogeneous profiles in perceptions of task- and ego-involving, autonomy, and social relatedness supporting motivational climate in school physical education (PE). Additionally, we investigated whether different motivational climate groups differed in their enjoyment in PE. Participants of the study were 2 594 girls and 1 803 boys, aged 14-15 years. Students responded to questionnaires assessing their perception of motivational climate and enjoyment in physical education. Latent profile analyses produced a five-cluster solution labeled 1) 'low autonomy, relatedness, task, and moderate ego climate' group, 2) 'low autonomy, relatedness, and high task and ego climate' group, 3) 'moderate autonomy, relatedness, task and ego climate' group, 4) 'high autonomy, relatedness, task, and moderate ego climate' group, and 5) 'high relatedness and task but moderate autonomy and ego climate' group. Analyses of variance showed that students in clusters 4 and 5 perceived the highest level of enjoyment whereas students in cluster 1 experienced the lowest level of enjoyment. The results showed that the students' perceptions of various motivational climates created differential levels of enjoyment in PE classes.
Novel clustering of items from the Autism Diagnostic Interview-Revised to define phenotypes within autism spectrum disorders

PubMed Central

Hu, Valerie W.; Steinberg, Mara E.

2009-01-01

Heterogeneity in phenotypic presentation of ASD has been cited as one explanation for the difficulty in pinpointing specific genes involved in autism. Recent studies have attempted to reduce the “noise” in genetic and other biological data by reducing the phenotypic heterogeneity of the sample population. The current study employs multiple clustering algorithms on 123 item scores from the Autism Diagnostic Interview-Revised (ADI-R) diagnostic instrument of nearly 2000 autistic individuals to identify subgroups of autistic probands with clinically relevant behavioral phenotypes in order to isolate more homogeneous groups of subjects for gene expression analyses. Our combined cluster analyses suggest optimal division of the autistic probands into 4 phenotypic clusters based on similarity of symptom severity across the 123 selected item scores. One cluster is characterized by severe language deficits, while another exhibits milder symptoms across the domains. A third group possesses a higher frequency of savant skills while the fourth group exhibited intermediate severity across all domains. Grouping autistic individuals by multivariate cluster analysis of ADI-R scores reveals meaningful phenotypes of subgroups within the autistic spectrum which we show, in a related (accompanying) study, to be associated with distinct gene expression profiles. PMID:19455643
Performance Analysis of Entropy Methods on K Means in Clustering Process

NASA Astrophysics Data System (ADS)

Dicky Syahputra Lubis, Mhd.; Mawengkang, Herman; Suwilo, Saib

2017-12-01

K-Means is a non-hierarchical data clustering method that attempts to partition existing data into one or more clusters (groups). The method partitions the data so that items with the same characteristics are grouped into the same cluster and items with different characteristics are placed in other clusters. The purpose of this clustering is to minimize the objective function set in the clustering process, which generally attempts to minimize variation within a cluster and maximize variation between clusters. However, the main disadvantage of this method is that the number k is often not known beforehand. Furthermore, randomly chosen starting points may lead to two points that lie close to each other being selected as two centroids. Therefore, the entropy method is used to determine the starting points in K-Means; this method can be used to determine weights and to make a decision from a set of alternatives. Entropy is able to investigate the harmony in discrimination among a multitude of data sets. Under the entropy criterion, the attribute with the highest variation in values receives the highest weight. The entropy method can thus help K-Means determine the starting points, which are usually chosen at random. As a result, clustering with entropy-assisted K-Means converges in fewer iterations than standard K-Means. Using the postoperative patient dataset from the UCI Machine Learning Repository, with only 12 records as a worked example, the entropy method obtained the desired end result in only 2 iterations.
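The abstract does not spell out how the entropy weights are turned into starting points, so the following is only a rough sketch of the general idea. It computes attribute weights with the standard entropy weight method and then, as a purely hypothetical seeding rule introduced here, ranks records by their weighted score and takes k evenly spaced records as initial centroids before running ordinary K-Means iterations. The function names and the toy data are assumptions, and the input is assumed to contain non-negative attribute values.

```python
import numpy as np


def entropy_weights(X):
    """Entropy weight method: attributes with more variation get larger weights.

    Assumes non-negative values and a positive sum in every column.
    """
    X = np.asarray(X, dtype=float)
    if np.any(X < 0) or np.any(X.sum(axis=0) <= 0):
        raise ValueError("expects non-negative data with positive column sums")
    n, m = X.shape
    P = X / X.sum(axis=0)                      # column-wise proportions
    logP = np.log(P, where=P > 0, out=np.zeros_like(P))  # 0*log(0) treated as 0
    entropy = -(P * logP).sum(axis=0) / np.log(n)
    diversity = 1.0 - entropy
    total = diversity.sum()
    if total == 0:                             # every attribute is uniform
        return np.full(m, 1.0 / m)
    return diversity / total


def entropy_seeded_kmeans(X, k, max_iter=100):
    """K-Means whose starting centroids come from an entropy-weighted ranking."""
    X = np.asarray(X, dtype=float)
    score = X @ entropy_weights(X)
    order = np.argsort(score)
    # Hypothetical seeding rule: k records evenly spaced along the ranking
    picks = order[np.linspace(0, len(X) - 1, k).astype(int)]
    centroids = X[picks].copy()
    for iteration in range(1, max_iter + 1):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break                              # assignments have stabilized
        centroids = new_centroids
    return labels, centroids, iteration


# Toy data with two obvious groups (illustrative only)
data = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
                 [9.0, 9.0], [9.2, 8.8], [8.9, 9.1]])
weights = entropy_weights(data)
labels, centroids, n_iter = entropy_seeded_kmeans(data, k=2)
print(weights, labels, n_iter)
```

Because the seeds are chosen deterministically, repeated runs give the same result, which is one practical difference from random initialization.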
Classification of different degrees of adiposity in sedentary rats.

PubMed

Leopoldo, A S; Lima-Leopoldo, A P; Nascimento, A F; Luvizotto, R A M; Sugizaki, M M; Campos, D H S; da Silva, D C T; Padovani, C R; Cicogna, A C

2016-01-01

In experimental studies, several parameters, such as body weight, body mass index, adiposity index, and dual-energy X-ray absorptiometry, have commonly been used to demonstrate increased adiposity and investigate the mechanisms underlying obesity and sedentary lifestyles. However, these investigations have not classified the degree of adiposity nor defined adiposity categories for rats, such as normal, overweight, and obese. The aim of the study was to characterize the degree of adiposity in rats fed a high-fat diet using cluster analysis and to create adiposity intervals in an experimental model of obesity. Thirty-day-old male Wistar rats were fed a normal (n=41) or a high-fat (n=43) diet for 15 weeks. Obesity was defined based on the adiposity index; and the degree of adiposity was evaluated using cluster analysis. Cluster analysis allowed the rats to be classified into two groups (overweight and obese). The obese group displayed significantly higher total body fat and a higher adiposity index compared with those of the overweight group. No differences in systolic blood pressure or nonesterified fatty acid, glucose, total cholesterol, or triglyceride levels were observed between the obese and overweight groups. The adiposity index of the obese group was positively correlated with final body weight, total body fat, and leptin levels. Despite the classification of sedentary rats into overweight and obese groups, it was not possible to identify differences in the comorbidities between the two groups.
Trajectories of acute low back pain: a latent class growth analysis.

PubMed

Downie, Aron S; Hancock, Mark J; Rzewuska, Magdalena; Williams, Christopher M; Lin, Chung-Wei Christine; Maher, Christopher G

2016-01-01

Characterising the clinical course of back pain by mean pain scores over time may not adequately reflect the complexity of the clinical course of acute low back pain. We analysed pain scores over 12 weeks for 1585 patients with acute low back pain presenting to primary care to identify distinct pain trajectory groups and baseline patient characteristics associated with membership of each cluster. This was a secondary analysis of the PACE trial that evaluated paracetamol for acute low back pain. Latent class growth analysis determined a 5 cluster model, which comprised 567 (35.8%) patients who recovered by week 2 (cluster 1, rapid pain recovery); 543 (34.3%) patients who recovered by week 12 (cluster 2, pain recovery by week 12); 222 (14.0%) patients whose pain reduced but did not recover (cluster 3, incomplete pain recovery); 167 (10.5%) patients whose pain initially decreased but then increased by week 12 (cluster 4, fluctuating pain); and 86 (5.4%) patients who experienced high-level pain for the whole 12 weeks (cluster 5, persistent high pain). Patients with longer pain duration were more likely to experience delayed recovery or nonrecovery. Belief in greater risk of persistence was associated with nonrecovery, but not delayed recovery. Higher pain intensity, longer duration, and workers' compensation were associated with persistent high pain, whereas older age and increased number of episodes were associated with fluctuating pain. Identification of discrete pain trajectory groups offers the potential to better manage acute low back pain.
KinFin: Software for Taxon-Aware Analysis of Clustered Protein Sequences.

PubMed

Laetsch, Dominik R; Blaxter, Mark L

2017-10-05

The field of comparative genomics is concerned with the study of similarities and differences between the information encoded in the genomes of organisms. A common approach is to define gene families by clustering protein sequences based on sequence similarity, and analyze protein cluster presence and absence in different species groups as a guide to biology. Due to the high dimensionality of these data, downstream analysis of protein clusters inferred from large numbers of species, or species with many genes, is nontrivial, and few solutions exist for transparent, reproducible, and customizable analyses. We present KinFin, a streamlined software solution capable of integrating data from common file formats and delivering aggregative annotation of protein clusters. KinFin delivers analyses based on systematic taxonomy of the species analyzed, or on user-defined groupings of taxa, for example, sets based on attributes such as life history traits, organismal phenotypes, or competing phylogenetic hypotheses. Results are reported through graphical and detailed text output files. We illustrate the utility of the KinFin pipeline by addressing questions regarding the biology of filarial nematodes, which include parasites of veterinary and medical importance. We resolve the phylogenetic relationships between the species and explore functional annotation of proteins in clusters in key lineages and between custom taxon sets, identifying gene families of interest. KinFin can easily be integrated into existing comparative genomic workflows, and promotes transparent and reproducible analysis of clustered protein data. Copyright © 2017 Laetsch and Blaxter.
[The relationship of empathic-affective responses toward others' positive affect with prosocial behaviors and aggressive behaviors].

PubMed

Sakurai, Shigeo; Hayama, Daichi; Suzuki, Takashi; Kurazumi, Tomoe; Hagiwara, Toshihiko; Suzuki, Miyuki; Ohuchi, Akiko; Chizuko, Oikawa

2011-06-01

The purposes of this study were to develop and validate the Empathic-Affective Response Scale, and to examine the relationship of empathic-affective responses with prosocial behaviors and aggressive behaviors. Undergraduate students (N = 443) participated in a questionnaire study. The results of factor analysis indicated that empathic-affective responses involved three factors: (a) sharing and good feeling toward others' positive affect, (b) sharing of negative affect and (c) sympathy toward others' negative affect. Correlations with other empathy-related scales and internal consistency suggested that this scale has satisfactory validity and reliability. Cluster analysis revealed that participants were clustered into four groups: high-empathic group, low-empathic group, insufficient positive affective response group and insufficient negative affective response group. Additional analysis showed that the frequency of prosocial behaviors in the high-empathic group was the highest of all groups. On the other hand, the frequency of aggressive behaviors in both the insufficient positive affective response group and the low-empathic group was higher than in the other groups. The results indicated that empathic-affective responses toward positive affect are also very important to predict prosocial behaviors and aggressive behaviors.
Community resource centres to improve the health of women and children in informal settlements in Mumbai: a cluster-randomised, controlled trial.

PubMed

More, Neena Shah; Das, Sushmita; Bapat, Ujwala; Alcock, Glyn; Manjrekar, Shreya; Kamble, Vikas; Sawant, Rijuta; Shende, Sushma; Daruwalla, Nayreen; Pantvaidya, Shanti; Osrin, David

2017-03-01

Around 105 million people in India will be living in informal settlements by 2017. We investigated the effects of local resource centres delivering integrated activities to improve women's and children's health in urban informal settlements. In a cluster-randomised controlled trial in 40 clusters, each containing around 600 households, 20 were randomly allocated to have a resource centre (intervention group) and 20 no centre (control group). Community organisers in the intervention centres addressed maternal and neonatal health, child health and nutrition, reproductive health, and prevention of violence against women and children through home visits, group meetings, day care, community events, service provision, and liaison. The primary endpoints were met need for family planning in women aged 15-49 years, proportion of children aged 12-23 months fully immunised, and proportion of children younger than 5 years with anthropometric wasting. Census interviews with women aged 15-49 years were done before and 2 years after the intervention was implemented. The primary intention-to-treat analysis compared cluster allocation groups after the intervention. We also analysed the per-protocol population (all women with data from both censuses) and assessed cluster-level changes. This study is registered with ISRCTN, number ISRCTN56183183, and Clinical Trials Registry of India, number CTRI/2012/09/003004. 12 614 households were allocated to the intervention and 12 239 to control. Postintervention data were available for 8271 women and 5371 children younger than 5 years in the intervention group, and 7965 women and 5180 children in the control group. Met need for family planning was greater in the intervention clusters than in the control clusters (odds ratio [OR] 1·31, 95% CI 1·11-1·53). 
The proportions of fully immunised children were similar in the intervention and control groups in the intention-to-treat analysis (OR 1·30, 95% CI 0·84-2·01), but were greater in the intervention group when assessed per protocol (1·73, 1·05-2·86). Childhood wasting did not differ between groups (OR 0·92, 95% CI 0·75-1·12), although improvement was seen at the cluster level in the intervention group (p=0·020). This community resource model seems feasible and replicable and may be protocolised for expansion. Wellcome Trust, CRY, Cipla. Copyright © 2017 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY license. All rights reserved.
Strategic groups, performance, and strategic response in the nursing home industry.

PubMed Central

Zinn, J S; Aaronson, W E; Rosko, M D

1994-01-01

OBJECTIVE. This study examines the effect of strategic group membership on nursing home performance and strategic behavior. DATA SOURCES AND STUDY SETTING. Data from the 1987 Medicare and Medicaid Automated Certification Survey were combined with data from the 1987 and 1989 Pennsylvania Long Term Care Facility Questionnaire. The sample consisted of 383 Pennsylvania nursing homes. STUDY DESIGN. Cluster analysis was used to place the 383 nursing homes into strategic groups on the basis of variables measuring scope and resource deployment. Performance was measured by indicators of the quality of nursing home care (rates of pressure ulcers, catheterization, and restraint usage) and efficiency in services provision. Changes in Medicare participation after passage of the 1988 Medicare Catastrophic Coverage Act (MCCA) measured strategic behavior. MANOVA and Tukey HSD post hoc means tests determined if significant differences were associated with strategic group membership. FINDINGS. Cluster analysis produced an optimal seven-group solution. Differences in group means were significant for the clustering, performance, and conduct variables (p < .0001). Strategic groups characterized by facilities providing a continuum of care services had the best patient care outcomes. The most efficient groups were characterized by facilities with high Medicare census. While all strategic groups increased Medicare census following passage of the MCCA, those dominated by for-profits had the greatest increases. CONCLUSIONS. Our analysis demonstrates that strategic orientation influences nursing home response to regulatory initiatives, a factor that should be recognized in policy formation directed at nursing home reform. PMID:8005789
Factors that cause genotype by environment interaction and use of a multiple-trait herd-cluster model for milk yield of Holstein cattle from Brazil and Colombia.

PubMed

Cerón-Muñoz, M F; Tonhati, H; Costa, C N; Rojas-Sarmiento, D; Echeverri Echeverri, D M

2004-08-01

Descriptive herd variables (DVHE) were used to explain genotype by environment interactions (G x E) for milk yield (MY) in Brazilian and Colombian production environments and to develop a herd-cluster model to estimate covariance components and genetic parameters for each herd environment group. Data consisted of 180,522 lactation records of 94,558 Holstein cows from 937 Brazilian and 400 Colombian herds. Herds in both countries were jointly grouped in thirds according to 8 DVHE: production level, phenotypic variability, age at first calving, calving interval, percentage of imported semen, lactation length, and herd size. For each DVHE, REML bivariate animal model analyses were used to estimate genetic correlations for MY between upper and lower thirds of the data. Based on estimates of genetic correlations, weights were assigned to each DVHE to group herds in a cluster analysis using the FASTCLUS procedure in SAS. Three clusters were defined, and genetic and residual variance components were heterogeneous among herd clusters. Estimates of heritability in clusters 1 and 3 were 0.28 and 0.29, respectively, but the estimate was larger (0.39) in Cluster 2. The genetic correlations of MY from different clusters ranged from 0.89 to 0.97. The herd-cluster model based on DVHE properly takes into account G x E by grouping similar environments accordingly and seems to be an alternative to simply considering country borders to distinguish between environments.
An approach to functionally relevant clustering of the protein universe: Active site profile‐based clustering of protein structures and sequences

PubMed Central

Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.

2017-01-01

Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification (amino acids that distinguish details between functional families within a superfamily). Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two‐Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure‐Function Linkage Database, SFLD) self‐identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self‐identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well‐curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP‐identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F‐measure and performance analysis on the enolase search results and comparison to GEMMA and SCI‐PHY demonstrate that TuLIP avoids the over‐division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422
An approach to functionally relevant clustering of the protein universe: Active site profile-based clustering of protein structures and sequences.

PubMed

Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S

2017-04-01

Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow mechanistic determinant identification-amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Cluster Subcutaneous Allergen Specific Immunotherapy for the Treatment of Allergic Rhinitis: A Systematic Review and Meta-Analysis

PubMed Central

Sun, Yueqi; Luo, Xi; Li, Huabin

2014-01-01

Background: Although allergen specific immunotherapy (SIT) represents the only immune-modifying and curative option available for patients with allergic rhinitis (AR), the optimal schedule for specific subcutaneous immunotherapy (SCIT) is still unknown. The objective of this study is to systematically assess the efficacy and safety of cluster SCIT for patients with AR. Methods: By searching PubMed, EMBASE and the Cochrane clinical trials database from 1980 through May 10th, 2013, we collected and analyzed the randomized controlled trials (RCTs) of cluster SCIT to assess its efficacy and safety. Results: Eight trials involving 567 participants were included in this systematic review. Our meta-analysis showed that cluster SCIT has a similar effect in reducing both rhinitis symptoms and the requirement for anti-allergic medication compared with conventional SCIT, but when comparing cluster SCIT with placebo, no statistically significant differences were found in reduction of symptom scores or medication scores. Some caution is required in this interpretation as there was significant heterogeneity between studies. Data relating to the Rhinoconjunctivitis Quality of Life Questionnaire (RQLQ) in 3 included studies were analyzed, which consistently point to the efficacy of cluster SCIT in improving quality of life compared to placebo. To assess the safety of cluster SCIT, meta-analysis showed that no differences existed in the incidence of either local or systemic adverse reactions between the cluster group and control group. Conclusion: Based on the current limited evidence, we still could not conclude affirmatively that cluster SCIT was a safe and efficacious option for the treatment of AR patients. Further large-scale, well-designed RCTs on this topic are still needed. PMID:24489740
Pattern Activity Clustering and Evaluation (PACE)

NASA Astrophysics Data System (ADS)

Blasch, Erik; Banas, Christopher; Paul, Michael; Bussjager, Becky; Seetharaman, Guna

2012-06-01

With the vast amount of network information available on activities of people (i.e. motions, transportation routes, and site visits) there is a need to explore the salient properties of data that detect and discriminate the behavior of individuals. Recent machine learning approaches include methods of data mining, statistical analysis, clustering, and estimation that support activity-based intelligence. We seek to explore contemporary methods in activity analysis using machine learning techniques that discover and characterize behaviors that enable grouping, anomaly detection, and adversarial intent prediction. To evaluate these methods, we describe the mathematics and potential information theory metrics to characterize behavior. A scenario is presented to demonstrate the concept and metrics that could be useful for layered sensing behavior pattern learning and analysis. We leverage work on group tracking, learning and clustering approaches; as well as utilize information theoretical metrics for classification, behavioral and event pattern recognition, and activity and entity analysis. The performance evaluation of activity analysis supports high-level information fusion of user alerts, data queries and sensor management for data extraction, relations discovery, and situation analysis of existing data.
[Visual field progression in glaucoma: cluster analysis].

PubMed

Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M

2012-11-01

Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, sLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening of the analysis of pointwise linear regression. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91 years) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and sLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87 dB); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis. 
However, for best results, it is preferable to compare the analyses of several tests in combination with morphologic exam. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
A pattern-mixture model approach for handling missing continuous outcome data in longitudinal cluster randomized trials.

PubMed

Fiero, Mallorie H; Hsu, Chiu-Hsieh; Bell, Melanie L

2017-11-20

We extend the pattern-mixture approach to handle missing continuous outcome data in longitudinal cluster randomized trials, which randomize groups of individuals to treatment arms, rather than the individuals themselves. Individuals who drop out at the same time point are grouped into the same dropout pattern. We approach extrapolation of the pattern-mixture model by applying multilevel multiple imputation, which imputes missing values while appropriately accounting for the hierarchical data structure found in cluster randomized trials. To assess parameters of interest under various missing data assumptions, imputed values are multiplied by a sensitivity parameter, k, which increases or decreases imputed values. Using simulated data, we show that estimates of parameters of interest can vary widely under differing missing data assumptions. We conduct a sensitivity analysis using real data from a cluster randomized trial by increasing k until the treatment effect inference changes. By performing a sensitivity analysis for missing data, researchers can assess whether certain missing data assumptions are reasonable for their cluster randomized trial. Copyright © 2017 John Wiley & Sons, Ltd.
Termination of seizure clusters is related to the duration of focal seizures.

PubMed

Ferastraoaru, Victor; Schulze-Bonhage, Andreas; Lipton, Richard B; Dümpelmann, Matthias; Legatt, Alan D; Blumberg, Julie; Haut, Sheryl R

2016-06-01

Clustered seizures are characterized by shorter than usual interseizure intervals and pose increased morbidity risk. This study examines the characteristics of seizures that cluster, with special attention to the final seizure in a cluster. This is a retrospective analysis of long-term inpatient monitoring data from the EPILEPSIAE project. Patients underwent presurgical evaluation from 2002 to 2009. Seizure clusters were defined by the occurrence of at least two consecutive seizures with interseizure intervals of <4 h. Other definitions of seizure clustering were examined in a sensitivity analysis. Seizures were classified into three contextually defined groups: isolated seizures (not meeting clustering criteria), terminal seizure (last seizure in a cluster), and intracluster seizures (any other seizures within a cluster). Seizure characteristics were compared among the three groups in terms of duration, type (focal seizures remaining restricted to one hemisphere vs. evolving bilaterally), seizure origin, and localization concordance among pairs of consecutive seizures. Among 92 subjects, 77 (83%) had at least one seizure cluster. The intracluster seizures were significantly shorter than the last seizure in a cluster (p = 0.011), whereas the last seizure in a cluster resembled the isolated seizures in terms of duration. Although focal only (unilateral) seizures were shorter than seizures that evolved bilaterally, there was no correlation between the seizure type and the seizure position in relation to a cluster (p = 0.762). Frontal and temporal lobe seizures were more likely to cluster compared with other localizations (p = 0.009). Seizure pairs that are part of a cluster were more likely to have a concordant origin than were isolated seizures. Results were similar for the 2 h definition of clustering, but not for the 8 h definition of clustering. We demonstrated that intracluster seizures are short relative to isolated seizures and terminal seizures. 
Frontal and temporal lobe seizures are more likely to cluster. Wiley Periodicals, Inc. © 2016 International League Against Epilepsy.
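The three-group classification described in this abstract (isolated, intracluster, terminal) can be illustrated with a minimal sketch. It assumes seizure onsets are given as hours since the start of monitoring; the function name `classify_seizures` and the example onset times are hypothetical and are not taken from the study.

```python
def classify_seizures(onsets_h, max_gap_h=4.0):
    """Label each seizure as 'isolated', 'intracluster', or 'terminal'.

    onsets_h: seizure onset times in hours (hypothetical input format).
    max_gap_h: interseizure interval below which consecutive seizures
               belong to the same cluster (<4 h in the abstract).
    Labels are returned in chronological order.
    """
    onsets = sorted(onsets_h)
    n = len(onsets)
    labels = []
    for i in range(n):
        close_prev = i > 0 and onsets[i] - onsets[i - 1] < max_gap_h
        close_next = i < n - 1 and onsets[i + 1] - onsets[i] < max_gap_h
        if close_next:
            # Another seizure follows within the window: not the last one.
            labels.append("intracluster")
        elif close_prev:
            # Preceded but not followed within the window: last in cluster.
            labels.append("terminal")
        else:
            labels.append("isolated")
    return labels


# Hypothetical onset times (hours) for one patient.
example = classify_seizures([0, 1, 3, 10, 20, 22])
```

Changing `max_gap_h` to 2 or 8 would correspond to the alternative definitions mentioned for the sensitivity analysis.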
Transmission clustering among newly diagnosed HIV patients in Chicago, 2008 to 2011: using phylogenetics to expand knowledge of regional HIV transmission patterns

PubMed Central

Lubelchek, Ronald J.; Hoehnen, Sarah C.; Hotton, Anna L.; Kincaid, Stacey L.; Barker, David E.; French, Audrey L.

2014-01-01

Introduction HIV transmission cluster analyses can inform HIV prevention efforts. We describe the first such assessment for transmission clustering among HIV patients in Chicago. Methods We performed transmission cluster analyses using HIV pol sequences from newly diagnosed patients presenting to Chicago’s largest HIV clinic between 2008 and 2011. We compared sequences via progressive pairwise alignment, using neighbor joining to construct an un-rooted phylogenetic tree. We defined clusters as >2 sequences among which each sequence had at least one partner within a genetic distance of ≤ 1.5%. We used multivariable regression to examine factors associated with clustering and used geospatial analysis to assess geographic proximity of phylogenetically clustered patients. Results We compared sequences from 920 patients; median age 35 years; 75% male; 67% Black, 23% Hispanic; 8% had a Rapid Plasma Reagin (RPR) titer ≥ 1:16 concurrent with their HIV diagnosis. We had HIV transmission risk data for 54%; 43% identified as men who have sex with men (MSM). Phylogenetic analysis demonstrated 123 patients (13%) grouped into 26 clusters, the largest having 20 members. In multivariable regression, age < 25, Black race, MSM status, male gender, higher HIV viral load, and RPR ≥ 1:16 associated with clustering. We did not observe geographic grouping of genetically clustered patients. Discussion Our results demonstrate high rates of HIV transmission clustering, without local geographic foci, among young Black MSM in Chicago. Applied prospectively, phylogenetic analyses could guide prevention efforts and help break the cycle of transmission. PMID:25321182
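The cluster definition in this abstract (each sequence has at least one partner within a genetic distance of ≤ 1.5%) can be read as taking connected components of a thresholded distance graph. The sketch below shows that reading only; it is an assumption about how the rule might be applied, not the authors' pipeline. The function name, the toy distance matrix, and the `min_size=3` reading of ">2 sequences" are all hypothetical.

```python
def transmission_clusters(dist, threshold=0.015, min_size=3):
    """Group sequences into clusters by linking pairs within `threshold`.

    dist: symmetric matrix (list of lists) of pairwise genetic distances.
    Returns a list of clusters, each a sorted list of sequence indices.
    """
    n = len(dist)
    seen = set()
    clusters = []
    for start in range(n):
        if start in seen:
            continue
        # Depth-first search over links with distance <= threshold.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            i = stack.pop()
            component.append(i)
            for j in range(n):
                if j != i and j not in seen and dist[i][j] <= threshold:
                    seen.add(j)
                    stack.append(j)
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters


# Toy distances: sequences 0-1-2 form a chain, 3-4 form a pair.
toy = [
    [0.000, 0.010, 0.030, 0.050, 0.050],
    [0.010, 0.000, 0.012, 0.050, 0.050],
    [0.030, 0.012, 0.000, 0.050, 0.050],
    [0.050, 0.050, 0.050, 0.000, 0.005],
    [0.050, 0.050, 0.050, 0.005, 0.000],
]
found = transmission_clusters(toy)
```

Note that sequences 0 and 2 are farther apart than the threshold but still share a cluster through sequence 1, which is the chaining behaviour implied by requiring only one close partner per sequence.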
Do healthy and unhealthy behaviours cluster in New Zealand?

PubMed

Tobias, Martin; Jackson, Gary; Yeh, Li-Chia; Huang, Ken

2007-04-01

To describe the co-occurrence and clustering of healthy and unhealthy behaviours in New Zealand. Data were sourced from the 2002/03 New Zealand Health Survey. Behaviours selected for analysis were tobacco use, quantity and pattern of alcohol consumption, level of physical activity, and intake of fruit and vegetables. Clustering was defined as co-prevalence of behaviours greater than that expected based on the laws of probability. Co-occurrence was examined using multiple logistic regression modelling, while clustering was examined in a stratified analysis using age and (where appropriate) ethnic standardisation for confounding control. Approximately 29% of adults enjoyed a healthy lifestyle characterised by non-use of tobacco, non- or safe use of alcohol, sufficient physical activity and adequate fruit and vegetable intake. This is only slightly greater than the prevalence expected if all four behaviours were independently distributed through the population i.e. little clustering of healthy behaviours was found. By contrast, 1.5% of adults exhibited all four unhealthy behaviours and 13% exhibited any combination of three of the four unhealthy behaviours. Unhealthy behaviours were more clustered than healthy behaviours, yet Maori exhibited less clustering of unhealthy behaviours than other ethnic groups and no deprivation gradient was seen in clustering. The relative lack of clustering of healthy behaviours supports single issue universal health promotion strategies at the population level. Our results also support targeted interventions at the clinical level for the 15% with 'unhealthy lifestyles'. Our finding of only limited clustering of unhealthy behaviours among Maori and no deprivation gradient suggests that clustering does not contribute to the greater burden of disease experienced by these groups.
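The definition of clustering used here, co-prevalence greater than expected under independence, amounts to comparing an observed joint prevalence with the product of the marginal prevalences. A minimal sketch follows; the marginal prevalences in the example are invented for illustration and are not the survey's figures.

```python
import math


def clustering_ratio(observed_joint, marginals):
    """Return (observed/expected ratio, expected joint prevalence).

    observed_joint: observed proportion with all behaviours together.
    marginals: proportion exhibiting each behaviour individually.
    Under independence the expected joint prevalence is the product
    of the marginals; a ratio above 1 suggests clustering.
    """
    expected = math.prod(marginals)
    return observed_joint / expected, expected


# Hypothetical marginals for four unhealthy behaviours.
ratio, expected = clustering_ratio(0.015, [0.2, 0.2, 0.5, 0.5])
```

A ratio near 1 would correspond to the "little clustering" finding reported for healthy behaviours, while a ratio well above 1 would indicate clustering.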
Investigating the long-term course of schizophrenia by sequence analysis.

PubMed

An der Heiden, Wolfram; Häfner, Heinz

2015-08-30

In the present study we set out to explore the long-term clinical course of schizophrenia in a holistic manner by adopting sequence analysis. Our aim was to identify course types of illness by means of cluster analysis. The study was based on course and outcome data for 107 patients followed up over 134 months after first admission in the ABC Schizophrenia Study. Focusing on the main syndromes (positive, negative, depressive and unspecific symptoms) and their combinations we looked for similarities in individual illness courses using the 'optimal matching' method. A cluster analysis performed on the resulting similarity matrix yielded two main groups (an 'improving' and a 'chronic' group), which comprised a total of six different types of illness course. The course types differed in both quantitative (frequency of syndromes and syndrome combinations) and qualitative terms (clinical presentation, sequence of syndromes). Cluster membership was only rarely, but clearly associated with sociodemographic characteristics, treatment data and other illness variables. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
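Optimal matching, as generally understood in sequence analysis, scores the dissimilarity of two state sequences as the minimum total cost of the insertions, deletions, and substitutions needed to turn one into the other. The sketch below assumes constant costs (indel 1, substitution 2) and single-letter state codes; both are illustrative choices, not the settings used in the study.

```python
def om_distance(a, b, indel=1.0, sub=2.0):
    """Optimal-matching (edit) distance between two state sequences.

    a, b: sequences of states, e.g. strings with one letter per period.
    indel: cost of inserting or deleting one state (assumed value).
    sub: cost of substituting one state for another (assumed value).
    """
    # prev[j] holds the cost of aligning the processed prefix of a with b[:j].
    prev = [j * indel for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * indel]
        for j, y in enumerate(b, 1):
            cur.append(min(
                prev[j] + indel,                        # delete x
                cur[j - 1] + indel,                     # insert y
                prev[j - 1] + (0.0 if x == y else sub)  # match or substitute
            ))
        prev = cur
    return prev[-1]


# Hypothetical courses: P = positive, N = negative syndrome per period.
d = om_distance("PPN", "PNN")
```

Computing this distance for every pair of patients yields the kind of dissimilarity matrix on which a cluster analysis can then be run.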

Do targets of workplace bullying portray a general victim personality profile?

PubMed

Glasø, Lars; Matthiesen, Stig Berge; Nielsen, Morten Birkeland; Einarsen, Ståle

2007-08-01

The aim of this study is to examine differences in personality between a group of bullied victims and a non-bullied group. The 144 participants, comprising 72 victims and a matched contrast group of 72 respondents, completed Goldberg's (1999) International Personality Item Pool (IPIP). Significant differences emerged between victims and non-victims on four out of five personality dimensions. Victims tended to be more neurotic and less agreeable, conscientious and extravert than non-victims. However, a cluster analysis revealed that the victim sample can be divided into two personality groups. One cluster, which comprised 64% of the victim sample, does not differ from non-victims as far as personality is concerned. Hence, the results indicate that there is no such thing as a general victim personality profile. However, a small cluster of victims tended to be less extrovert, less agreeable, less conscientious, and less open to experience but more emotionally unstable than victims in the major cluster and the control group. Further, both clusters of victims scored higher than non-victims on emotional instability, indicating that personality should not be neglected as a factor in understanding the bullying phenomenon.
Principal components derived from CSF inflammatory profiles predict outcome in survivors after severe traumatic brain injury.

PubMed

Kumar, Raj G; Rubin, Jonathan E; Berger, Rachel P; Kochanek, Patrick M; Wagner, Amy K

2016-03-01

Studies have characterized absolute levels of multiple inflammatory markers as significant risk factors for poor outcomes after traumatic brain injury (TBI). However, inflammatory marker concentrations are highly inter-related, and production of one may result in the production or regulation of another. Therefore, a more comprehensive characterization of the inflammatory response post-TBI should consider relative levels of markers in the inflammatory pathway. We used principal component analysis (PCA) as a dimension-reduction technique to characterize the sets of markers that contribute independently to variability in cerebrospinal (CSF) inflammatory profiles after TBI. Using PCA results, we defined groups (or clusters) of individuals (n=111) with similar patterns of acute CSF inflammation that were then evaluated in the context of outcome and other relevant CSF and serum biomarkers collected days 0-3 and 4-5 post-injury. We identified four significant principal components (PC1-PC4) for CSF inflammation from days 0-3, and PC1 accounted for the greatest (31%) percentage of variance. PC1 was characterized by relatively higher CSF sICAM-1, sFAS, IL-10, IL-6, sVCAM-1, IL-5, and IL-8 levels. Cluster analysis then defined two distinct clusters, such that individuals in cluster 1 had highly positive PC1 scores and relatively higher levels of CSF cortisol, progesterone, estradiol, testosterone, brain derived neurotrophic factor (BDNF), and S100b; this group also had higher serum cortisol and lower serum BDNF. Multinomial logistic regression analyses showed that individuals in cluster 1 had a 10.9 times increased likelihood of GOS scores of 2/3 vs. 4/5 at 6 months compared to cluster 2, after controlling for covariates. Cluster group did not discriminate between mortality compared to GOS scores of 4/5 after controlling for age and other covariates. Cluster groupings also did not discriminate mortality or 12 month outcomes in multivariate models. 
PCA and cluster analysis establish that a subset of CSF inflammatory markers measured in days 0-3 post-TBI may distinguish individuals with poor 6-month outcome, and future studies should prospectively validate these findings. PCA of inflammatory mediators after TBI could aid in prognostication and in identifying patient subgroups for therapeutic interventions. Copyright © 2015 Elsevier Inc. All rights reserved.
Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome.

PubMed

Lalonde, Michel; Wells, R Glenn; Birnie, David; Ruddy, Terrence D; Wassenaar, Richard

2014-07-01

Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. 
Cluster analysis results were similar to SPECT RNA phase analysis (ROC AUC = 0.78, p = 0.73 vs cluster AUC; sensitivity/specificity = 59%/89%) and PET scar size analysis (ROC AUC = 0.73, p = 1.0 vs cluster AUC; sensitivity/specificity = 76%/67%). A SPECT RNA cluster analysis algorithm was developed for the prediction of CRT outcome. Cluster analysis results produced results equivalent to those obtained from Fourier and scar analysis.
Development and optimization of SPECT gated blood pool cluster analysis for the prediction of CRT outcome

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lalonde, Michel, E-mail: mlalonde15@rogers.com; Wassenaar, Richard; Wells, R. Glenn

2014-07-15

Purpose: Phase analysis of single photon emission computed tomography (SPECT) radionuclide angiography (RNA) has been investigated for its potential to predict the outcome of cardiac resynchronization therapy (CRT). However, phase analysis may be limited in its potential at predicting CRT outcome as valuable information may be lost by assuming that time-activity curves (TAC) follow a simple sinusoidal shape. A new method, cluster analysis, is proposed which directly evaluates the TACs and may lead to a better understanding of dyssynchrony patterns and CRT outcome. Cluster analysis algorithms were developed and optimized to maximize their ability to predict CRT response. Methods: About 49 patients (N = 27 ischemic etiology) received a SPECT RNA scan as well as positron emission tomography (PET) perfusion and viability scans prior to undergoing CRT. A semiautomated algorithm sampled the left ventricle wall to produce 568 TACs from SPECT RNA data. The TACs were then subjected to two different cluster analysis techniques, K-means, and normal average, where several input metrics were also varied to determine the optimal settings for the prediction of CRT outcome. Each TAC was assigned to a cluster group based on the comparison criteria and global and segmental cluster size and scores were used as measures of dyssynchrony and used to predict response to CRT. A repeated random twofold cross-validation technique was used to train and validate the cluster algorithm. Receiver operating characteristic (ROC) analysis was used to calculate the area under the curve (AUC) and compare results to those obtained for SPECT RNA phase analysis and PET scar size analysis methods. Results: Using the normal average cluster analysis approach, the septal wall produced statistically significant results for predicting CRT results in the ischemic population (ROC AUC = 0.73;p < 0.05 vs. equal chance ROC AUC = 0.50) with an optimal operating point of 71% sensitivity and 60% specificity. Cluster analysis results were similar to SPECT RNA phase analysis (ROC AUC = 0.78, p = 0.73 vs cluster AUC; sensitivity/specificity = 59%/89%) and PET scar size analysis (ROC AUC = 0.73, p = 1.0 vs cluster AUC; sensitivity/specificity = 76%/67%). Conclusions: A SPECT RNA cluster analysis algorithm was developed for the prediction of CRT outcome. Cluster analysis results produced results equivalent to those obtained from Fourier and scar analysis.
Exploring relationships between Dairy Herd Improvement monitors of performance and the Transition Cow Index in Wisconsin dairy herds.

PubMed

Schultz, K K; Bennett, T B; Nordlund, K V; Döpfer, D; Cook, N B

2016-09-01

Transition cow management has been tracked via the Transition Cow Index (TCI; AgSource Cooperative Services, Verona, WI) since 2006. Transition Cow Index was developed to measure the difference between actual and predicted milk yield at first test day to evaluate the relative success of the transition period program. This project aimed to assess TCI in relation to all commonly used Dairy Herd Improvement (DHI) metrics available through AgSource Cooperative Services. Regression analysis was used to isolate variables that were relevant to TCI, and then principal components analysis and network analysis were used to determine the relative strength and relatedness among variables. Finally, cluster analysis was used to segregate herds based on similarity of relevant variables. The DHI data were obtained from 2,131 Wisconsin dairy herds with test-day mean ≥30 cows, which were tested ≥10 times throughout the 2014 calendar year. The original list of 940 DHI variables was reduced through expert-driven selection and regression analysis to 23 variables. The K-means cluster analysis produced 5 distinct clusters. Descriptive statistics were calculated for the 23 variables per cluster grouping. Using principal components analysis, cluster analysis, and network analysis, 4 parameters were isolated as most relevant to TCI; these were energy-corrected milk, 3 measures of intramammary infection (dry cow cure rate, linear somatic cell count score in primiparous cows, and new infection rate), peak ratio, and days in milk at peak milk production. These variables together with cow and newborn calf survival measures form a group of metrics that can be used to assist in the evaluation of overall transition period performance. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
The Clusters - Collaborative Models of Sustainable Regional Development

NASA Astrophysics Data System (ADS)

Mănescu, Gabriel; Kifor, Claudiu

2014-12-01

The clusters are the subject of actions and of whole series of documents issued by national and international organizations, and, based on experience, many authorities promote the idea that because of the clusters, competitiveness increases, the workforce specializes, regional businesses and economies grow. The present paper is meant to be an insight into the initiatives of forming clusters in Romania. Starting from a comprehensive analysis of the development potential offered by each region of economic development, we present the main types of clusters grouped according to fields of activity and their overall objectives
Cluster analysis of spontaneous preterm birth phenotypes identifies potential associations among preterm birth mechanisms.

PubMed

Esplin, M Sean; Manuck, Tracy A; Varner, Michael W; Christensen, Bryce; Biggio, Joseph; Bukowski, Radek; Parry, Samuel; Zhang, Heping; Huang, Hao; Andrews, William; Saade, George; Sadovsky, Yoel; Reddy, Uma M; Ilekis, John

2015-09-01

We sought to use an innovative tool that is based on common biologic pathways to identify specific phenotypes among women with spontaneous preterm birth (SPTB) to enhance investigators' ability to identify and to highlight common mechanisms and underlying genetic factors that are responsible for SPTB. We performed a secondary analysis of a prospective case-control multicenter study of SPTB. All cases delivered a preterm singleton at SPTB ≤34.0 weeks' gestation. Each woman was assessed for the presence of underlying SPTB causes. A hierarchic cluster analysis was used to identify groups of women with homogeneous phenotypic profiles. One of the phenotypic clusters was selected for candidate gene association analysis with the use of VEGAS software. One thousand twenty-eight women with SPTB were assigned phenotypes. Hierarchic clustering of the phenotypes revealed 5 major clusters. Cluster 1 (n = 445) was characterized by maternal stress; cluster 2 (n = 294) was characterized by premature membrane rupture; cluster 3 (n = 120) was characterized by familial factors, and cluster 4 (n = 63) was characterized by maternal comorbidities. Cluster 5 (n = 106) was multifactorial and characterized by infection (INF), decidual hemorrhage (DH), and placental dysfunction (PD). These 3 phenotypes were correlated highly by χ² analysis (PD and DH, P < 2.2e-6; PD and INF, P = 6.2e-10; INF and DH, P = .0036). Gene-based testing identified the INS (insulin) gene as significantly associated with cluster 3 of SPTB. We identified 5 major clusters of SPTB based on a phenotype tool and hierarchic clustering. There was significant correlation between several of the phenotypes. The INS gene was associated with familial factors that were underlying SPTB. Copyright © 2015 Elsevier Inc. All rights reserved.
Line-of-sight structure toward strong lensing galaxy clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bayliss, Matthew B.; Johnson, Traci; Sharon, Keren

2014-03-01

We present an analysis of the line-of-sight structure toward a sample of 10 strong lensing cluster cores. Structure is traced by groups that are identified spectroscopically in the redshift range, 0.1 ≤ z ≤ 0.9, and we measure the projected angular and comoving separations between each group and the primary strong lensing clusters in each corresponding line of sight. From these data we measure the distribution of projected angular separations between the primary strong lensing clusters and uncorrelated large-scale structure as traced by groups. We then compare the observed distribution of angular separations for our strong lensing selected lines of sight against the distribution of groups that is predicted for clusters lying along random lines of sight. There is clear evidence for an excess of structure along the line of sight at small angular separations (θ ≤ 6') along the strong lensing selected lines of sight, indicating that uncorrelated structure is a significant systematic that contributes to producing galaxy clusters with large cross sections for strong lensing. The prevalence of line-of-sight structure is one of several biases in strong lensing clusters that can potentially be folded into cosmological measurements using galaxy cluster samples. These results also have implications for current and future studies, such as the Hubble Space Telescope Frontier Fields, that make use of massive galaxy cluster lenses as precision cosmological telescopes; it is essential that the contribution of line-of-sight structure be carefully accounted for in the strong lens modeling of the cluster lenses.
Systematization of actinides using cluster analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Kopyrin, A.A.; Terent`eva, T.N.; Khramov, N.N.

1994-11-01

A representation of the actinides in multidimensional property space is proposed for systematization of these elements using cluster analysis. Literature data for their atomic properties are used. Owing to the wide variation of published ionization potentials, medians are used to estimate them. Vertical dendrograms are used for classification on the basis of distances between the actinides in atomic-property space. The properties of actinium and lawrencium are furthest removed from the main group. Thorium and mendelevium exhibit individualized properties. A cluster based on the einsteinium-fermium pair is joined by californium.
Review of Recent Methodological Developments in Group-Randomized Trials: Part 1—Design

PubMed Central

Li, Fan; Gallis, John A.; Prague, Melanie; Murray, David M.

2017-01-01

In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis. PMID:28426295
Review of Recent Methodological Developments in Group-Randomized Trials: Part 1-Design.

PubMed

Turner, Elizabeth L; Li, Fan; Gallis, John A; Prague, Melanie; Murray, David M

2017-06-01

In 2004, Murray et al. reviewed methodological developments in the design and analysis of group-randomized trials (GRTs). We have highlighted the developments of the past 13 years in design with a companion article to focus on developments in analysis. As a pair, these articles update the 2004 review. We have discussed developments in the topics of the earlier review (e.g., clustering, matching, and individually randomized group-treatment trials) and in new topics, including constrained randomization and a range of randomized designs that are alternatives to the standard parallel-arm GRT. These include the stepped-wedge GRT, the pseudocluster randomized trial, and the network-randomized GRT, which, like the parallel-arm GRT, require clustering to be accounted for in both their design and analysis.
Cluster analysis of cognitive performance in elderly and demented subjects.

PubMed

Giaquinto, S; Nolfe, G; Calvani, M

1985-06-01

48 elderly normals, 14 demented subjects and 76 young controls were tested for basic cognitive functions. All the tests were quantified and could therefore be subjected to statistical analysis. The results show a difference in the speed of information processing and in memory load between the young controls and elderly normals but the age groups differed in quantitative terms only. Cluster analysis showed that the elderly and the demented formed two distinctly separate groups at the qualitative level, the basic cognitive processes being damaged in the demented group. Age thus appears to be only a risk factor for dementia and not its cause. It is concluded that batteries based on precise and measurable tasks are the most appropriate not only for the study of dementia but for rehabilitation purposes too.
Genetic Characterization of Turkish Snake Melon (Cucumis melo L. subsp. melo flexuosus Group) Accessions Revealed by SSR Markers.

PubMed

Solmaz, Ilknur; Kacar, Yildiz Aka; Simsek, Ozhan; Sari, Nebahat

2016-08-01

Snake melon is an important cucurbit crop especially in the Southeastern and the Mediterranean region of Turkey. It is consumed as fresh or pickled. The production is mainly done with the local landraces in the country. Turkey is one of the secondary diversification centers of melon and possesses valuable genetic resources which have different morphological characteristics in case of snake melon. Genetic diversity of snake melon genotypes collected from different regions of Turkey and reference genotypes obtained from World Melon Gene Bank in Avignon-France was examined using 13 simple sequence repeat (SSR) markers. A total of 69 alleles were detected, with an average of 5.31 alleles per locus. The polymorphism information content of SSR markers ranged from 0.19 to 0.57 (average 0.38). Based on cluster analysis, two major groups were defined. The first major group included only one accession (61), while the rest of all accessions grouped in the second major group and separated into different sub-clusters. Based on SSR markers, cluster analysis indicated that considerably high genetic variability exists among the examined accessions; however, Turkish snake melon accessions were grouped together with the reference snake melon accessions.
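The abstract reports polymorphism information content (PIC) values per SSR marker without stating the formula. A commonly used definition (Botstein et al., 1980) is PIC = 1 − Σ p_i² − Σ_{i<j} 2 p_i² p_j², where p_i is the frequency of allele i at the locus. The sketch below assumes that definition; the allele frequencies in the example are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    freqs: allele frequencies at the locus (should sum to 1).
    Uses the Botstein-style formula, which is an assumption here;
    the abstract does not say which PIC definition was applied.
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical locus with two equally frequent alleles.
value = pic([0.5, 0.5])
```

Under this formula a two-allele locus cannot exceed 0.375, so values such as the reported maximum of 0.57 require loci with more than two alleles.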
HIV Transmission Networks in the San Diego–Tijuana Border Region

PubMed Central

Mehta, Sanjay R.; Wertheim, Joel O.; Brouwer, Kimberly C.; Wagner, Karla D.; Chaillon, Antoine; Strathdee, Steffanie; Patterson, Thomas L.; Rangel, Maria G.; Vargas, Mlenka; Murrell, Ben; Garfein, Richard; Little, Susan J.; Smith, Davey M.

2015-01-01

Background HIV sequence data can be used to reconstruct local transmission networks. Along international borders, like the San Diego–Tijuana region, understanding the dynamics of HIV transmission across reported risks, racial/ethnic groups, and geography can help direct effective prevention efforts on both sides of the border. Methods We gathered sociodemographic, geographic, clinical, and viral sequence data from HIV infected individuals participating in ten studies in the San Diego–Tijuana border region. Phylogenetic and network analysis was performed to infer putative relationships between HIV sequences. Correlates of identified clusters were evaluated and spatiotemporal relationships were explored using Bayesian phylogeographic analysis. Findings After quality filtering, 843 HIV sequences with associated demographic data and 263 background sequences from the region were analyzed, and 138 clusters were inferred (2–23 individuals). Overall, the rate of clustering did not differ by ethnicity, residence, or sex, but bisexuals were less likely to cluster than heterosexuals or men who have sex with men (p = 0.043), and individuals identifying as white (p ≤ 0.01) were more likely to cluster than other races. Clustering individuals were also 3.5 years younger than non-clustering individuals (p < 0.001). Although the sampled San Diego and Tijuana epidemics were phylogenetically compartmentalized, five clusters contained individuals residing on both sides of the border. Interpretation This study sampled ~ 7% of HIV infected individuals in the border region, and although the sampled networks on each side of the border were largely separate, there was evidence of persistent bidirectional cross-border transmissions that linked risk groups, thus highlighting the importance of the border region as a “melting pot” of risk groups. Funding NIH, VA, and Pendleton Foundation. PMID:26629540
HIV Transmission Networks in the San Diego-Tijuana Border Region.

PubMed

Mehta, Sanjay R; Wertheim, Joel O; Brouwer, Kimberly C; Wagner, Karla D; Chaillon, Antoine; Strathdee, Steffanie; Patterson, Thomas L; Rangel, Maria G; Vargas, Mlenka; Murrell, Ben; Garfein, Richard; Little, Susan J; Smith, Davey M

2015-10-01

HIV sequence data can be used to reconstruct local transmission networks. Along international borders, like the San Diego-Tijuana region, understanding the dynamics of HIV transmission across reported risks, racial/ethnic groups, and geography can help direct effective prevention efforts on both sides of the border. We gathered sociodemographic, geographic, clinical, and viral sequence data from HIV infected individuals participating in ten studies in the San Diego-Tijuana border region. Phylogenetic and network analysis was performed to infer putative relationships between HIV sequences. Correlates of identified clusters were evaluated and spatiotemporal relationships were explored using Bayesian phylogeographic analysis. After quality filtering, 843 HIV sequences with associated demographic data and 263 background sequences from the region were analyzed, and 138 clusters were inferred (2-23 individuals). Overall, the rate of clustering did not differ by ethnicity, residence, or sex, but bisexuals were less likely to cluster than heterosexuals or men who have sex with men (p = 0.043), and individuals identifying as white (p ≤ 0.01) were more likely to cluster than other races. Clustering individuals were also 3.5 years younger than non-clustering individuals (p < 0.001). Although the sampled San Diego and Tijuana epidemics were phylogenetically compartmentalized, five clusters contained individuals residing on both sides of the border. This study sampled ~ 7% of HIV infected individuals in the border region, and although the sampled networks on each side of the border were largely separate, there was evidence of persistent bidirectional cross-border transmissions that linked risk groups, thus highlighting the importance of the border region as a "melting pot" of risk groups. NIH, VA, and Pendleton Foundation.
Characterization of anticancer agents by their growth inhibitory activity and relationships to mechanism of action and structure.

PubMed

Keskin, O; Bahar, I; Jernigan, R L; Beutler, J A; Shoemaker, R H; Sausville, E A; Covell, D G

2000-04-01

An analysis of the growth inhibitory potency of 122 anticancer agents available from the National Cancer Institute anticancer drug screen is presented. Methods of singular value decomposition (SVD) were applied to determine the matrix of distances between all compounds. These SVD-derived dissimilarity distances were used to cluster compounds that exhibit similar tumor growth inhibitory activity patterns against 60 human cancer cell lines. Cluster analysis divides the 122 standard agents into 25 statistically distinct groups. The first eight groups include structurally diverse compounds with reactive functionalities that act as DNA-damaging agents while the remaining 17 groups include compounds that inhibit nucleic acid biosynthesis and mitosis. Examination of the average activity patterns across the 60 tumor cell lines reveals unique 'fingerprints' associated with each group. A diverse set of structural features are observed for compounds within these groups, with frequent occurrences of strong within-group structural similarities. Clustering of cell types by their response to the 122 anticancer agents divides the 60 cell types into 21 groups. The strongest within-panel groupings were found for the renal, leukemia and ovarian cell panels. These results contribute to the basis for comparisons between log(GI(50)) screening patterns of the 122 anticancer agents and additional tested compounds.
The Peculiarities in O-Type Galaxy Clusters

NASA Astrophysics Data System (ADS)

Panko, E. A.; Emelyanov, S. I.

We present the results of an analysis of the 2D distribution of galaxies in galaxy cluster fields. The Catalogue of Galaxy Clusters and Groups PF (Panko & Flin) was used as the input observational data set. We selected open rich PF galaxy clusters containing 100 or more galaxies for our study. According to the Panko classification scheme, open galaxy clusters (O-type) have no concentration toward the cluster center. The data set contains both pure O-type clusters and O-type clusters with overdense belts, namely the OL and OF types. According to the ideas of Rood & Sastry and Struble & Rood, open galaxy clusters are the beginning stage of cluster evolution. In the O-type clusters we found several types of statistically significant regular peculiarities, such as two crossed belts or a curved strip. We suppose that the features found are connected with the evolution of galaxy clusters and the distribution of DM inside the clusters.
Dietary patterns among a national random sample of British adults

PubMed Central

Pryer, J; Nichols, R; Elliott, P; Thakrar, B; Brunner, E; Marmot, M

2001-01-01

STUDY OBJECTIVES: To identify groups within the UK male and female population who report similar patterns of diet. DESIGN: National representative dietary survey, using seven day weighed dietary records, of men and women aged 16-64 years living in private households in Great Britain in 1986-7. Cluster analysis was used to aggregate participants into diet groups. SETTING: Great Britain. PARTICIPANTS: 1087 men and 1110 women. RESULTS: 93% of men and 86% of women fell into one of four distinct diet groups. Among men the most prevalent diet group was "beer and convenience food" (34% of the male population); second was "traditional British diet" (18%); third was "healthier but sweet diet" (17.5%) and fourth was "healthier diet" (17%). Among women, the most prevalent diet group was "traditional British diet" (32%); second was "healthy cosmopolitan diet" (25%); third was a "convenience food diet" (21%); and fourth was "healthier but sweet diet" (15%). There were important differences in nutrient profile, sociodemographic and behavioural characteristics between diet groups. CONCLUSIONS: Cluster analysis identified four diet groups among men and four among women, which differed not only in terms of reported dietary intakes, but also with respect to nutrient, social and behavioural profiles. The groups identified could provide a useful basis for development, monitoring and targeting of public health nutrition policy in the UK. Keywords: diet; cluster analysis; sociodemographic variables PMID:11112948
The effect of cognitive appraisal for stressors on the oral health-related QOL of dry mouth patients

PubMed Central

2014-01-01

Background: Dry mouth is a very common symptom, and psychological factors influence it. Although the influence of emotional factors on patients with oral dryness has been examined in previous studies, cognitive factors have not been examined thus far. Objective: The purpose of this study was to examine the influence of cognitive factors on patients with oral dryness. Methods: The participants were 106 patients complaining of oral dryness. They were required to complete a questionnaire measuring subjective oral dryness, oral-related QOL, cognition for stressors, and mood state. Results: Correlational analyses revealed that OHIP-14 is significantly related to oral dryness, appraisal for effect, appraisal for threat, and commitment. These correlations were maintained even after controlling for the influence of depression and anxiety. Cluster analysis was performed using oral dryness, appraisal for effect, appraisal for threat, and commitment, and three clusters were extracted (cluster-1, severe oral dryness; cluster-2, positive cognitive style; cluster-3, negative cognitive style). The results of ANOVA showed that the group with severe oral dryness (cluster-1) had a significantly higher score on OHIP-14 than the other two groups. There was no significant difference between the groups with positive (cluster-2) and negative (cluster-3) cognitive style. Conclusion: Although the group of patients with positive cognitive style complained of more severe oral dryness than the group with negative cognitive style, no significant difference was observed between these two groups in OHIP-14. These results indicate that cognitive factors would be a useful therapeutic target for the improvement of the oral-related QOL of patients with oral dryness. PMID:26019720
Phylogenetic continuum indicates "galaxies" in the protein universe: preliminary results on the natural group structures of proteins.

PubMed

Ladunga, I

1992-04-01

The markedly nonuniform, even systematic distribution of sequences in the protein "universe" has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two chi-square-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.

Infrared spectroscopy reveals both qualitative and quantitative differences in equine subchondral bone during maturation

NASA Astrophysics Data System (ADS)

Kobrina, Yevgeniya; Isaksson, Hanna; Sinisaari, Miikka; Rieppo, Lassi; Brama, Pieter A.; van Weeren, René; Helminen, Heikki J.; Jurvelin, Jukka S.; Saarakkala, Simo

2010-11-01

The collagen phase in bone is known to undergo major changes during growth and maturation. The objective of this study is to clarify whether Fourier transform infrared (FTIR) microspectroscopy, coupled with cluster analysis, can detect quantitative and qualitative changes in the collagen matrix of subchondral bone in horses during maturation and growth. Equine subchondral bone samples (n = 29) from the proximal joint surface of the first phalanx are prepared from two sites subjected to different loading conditions. Three age groups are studied: newborn (0 days old), immature (5 to 11 months old), and adult (6 to 10 years old) horses. Spatial collagen content and collagen cross-link ratio are quantified from the spectra. Additionally, normalized second derivative spectra of samples are clustered using the k-means clustering algorithm. In quantitative analysis, collagen content in the subchondral bone increases rapidly between the newborn and immature horses. The collagen cross-link ratio increases significantly with age. In qualitative analysis, clustering is able to separate newborn and adult samples into two different groups. The immature samples display some nonhomogeneity. In conclusion, this is the first study showing that FTIR spectral imaging combined with clustering techniques can detect quantitative and qualitative changes in the collagen matrix of subchondral bone during growth and maturation.
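As a rough illustration of the clustering step described in the abstract above, the following Python sketch computes finite-difference second derivatives of toy spectra, normalizes them, and groups them with a basic k-means loop. The Gaussian toy spectra, the helper names, the unit-length normalization and the choice of initial centroids are all assumptions made for illustration; the abstract does not describe the study's actual preprocessing or software.

```python
import math

def second_derivative(spectrum):
    """Central finite-difference second derivative (unit point spacing assumed)."""
    return [spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
            for i in range(1, len(spectrum) - 1)]

def normalize(vec):
    """Scale a vector to unit Euclidean length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, init_centroids, max_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centroids)),
                          key=lambda k: sq_dist(p, centroids[k]))
                      for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous centroid if a cluster empties
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def toy_spectrum(center, width=2.0, n_points=21):
    """Hypothetical single-band spectrum: one Gaussian peak at `center`."""
    return [math.exp(-((i - center) / width) ** 2) for i in range(n_points)]

# Two hypothetical sample groups whose absorption band sits at different positions.
spectra = [toy_spectrum(c) for c in (6.0, 6.5, 7.0, 13.0, 13.5, 14.0)]
features = [normalize(second_derivative(s)) for s in spectra]
labels, _ = kmeans(features, init_centroids=[features[0], features[-1]])
print(labels)
```

Seeding the centroids with two of the samples keeps the sketch deterministic; a real analysis would more likely use several random restarts.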
Designing and evaluating health systems level hypertension control interventions for African-Americans: lessons from a pooled analysis of three cluster randomized trials.

PubMed

Pavlik, Valory N; Chan, Wenyaw; Hyman, David J; Feldman, Penny; Ogedegbe, Gbenga; Schwartz, Joseph E; McDonald, Margaret; Einhorn, Paula; Tobin, Jonathan N

2015-01-01

African-Americans (AAs) have a high prevalence of hypertension, and their blood pressure (BP) control on treatment still lags behind that of other groups. In 2004, NHLBI funded five projects that aimed to evaluate clinically feasible interventions to effect changes in medical care delivery leading to an increased proportion of AA patients with controlled BP. Three of the groups performed a pooled analysis of trial results to determine: 1) the magnitude of the combined intervention effect; and 2) how the pooled results could inform the methodology for future health-system level BP interventions. Using a cluster randomized design, the trials enrolled AAs with uncontrolled hypertension to test interventions targeting a combination of patient and clinician behaviors. The 12-month Systolic BP (SBP) and Diastolic BP (DBP) effects of intervention or control cluster assignment were assessed using mixed effects longitudinal regression modeling. 2,015 patients representing 352 clusters participated across the three trials. Pooled BP slopes followed a quadratic pattern, with an initial decline followed by a rise toward baseline, and did not differ significantly between intervention and control clusters (SBP linear coefficient = -2.60 ± 0.21 mmHg per month, p<0.001; quadratic coefficient = 0.167 ± 0.02 mmHg/month, p<0.001; group x linear time coefficient = 0.145 ± 0.293, p=0.622; group x quadratic time coefficient = -0.017 ± 0.026, p=0.525). Results were similar for DBP. The individual sites did not have significant intervention effects when analyzed separately. Investigators planning behavioral trials to improve BP control in health systems serving AAs should plan for small effect sizes and employ a "run-in" period in which BP can be expected to improve in both experimental and control clusters.
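To make the reported quadratic pattern concrete, the short Python sketch below evaluates the pooled SBP trajectory implied by the linear and quadratic coefficients quoted in the abstract. It assumes that time is measured in months from baseline, that the quadratic coefficient applies to months squared, and that the intercept and the non-significant group terms can be ignored; the function names are hypothetical and this is arithmetic on the published point estimates, not a re-analysis.

```python
def bp_change(month, linear=-2.60, quadratic=0.167):
    """Model-implied SBP change from baseline (mmHg) after `month` months."""
    return linear * month + quadratic * month ** 2

def nadir_month(linear=-2.60, quadratic=0.167):
    """Month at which the quadratic trajectory reaches its minimum."""
    return -linear / (2.0 * quadratic)

t_min = nadir_month()
print(round(t_min, 1))             # month of the lowest model-implied SBP
print(round(bp_change(t_min), 1))  # change from baseline at that month
print(round(bp_change(12), 1))     # change from baseline at 12 months
```

Under these assumptions the curve bottoms out between months 7 and 8 and has partly returned toward baseline by month 12, which matches the "initial decline, followed by a rise toward baseline" wording.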
Ram pressure stripping of hot coronal gas from group and cluster galaxies and the detectability of surviving X-ray coronae

NASA Astrophysics Data System (ADS)

Vijayaraghavan, Rukmani; Ricker, Paul M.

2015-05-01

Ram pressure stripping can remove hot and cold gas from galaxies in the intracluster medium, as shown by observations of X-ray and H I galaxy wakes in nearby clusters of galaxies. However, ram pressure stripping, including pre-processing in group environments, does not remove all the hot coronal gas from cluster galaxies. Recent high-resolution Chandra observations have shown that ~1-4 kpc extended, hot galactic coronae are ubiquitous in group and cluster galaxies. To better understand this result, we simulate ram pressure stripping of a cosmologically motivated population of galaxies in isolated group and cluster environments. The galaxies and the host group and cluster are composed of collisionless dark matter and hot gas initially in hydrostatic equilibrium with the galaxy and host potentials. We show that the rate at which gas is lost depends on the galactic and host halo mass. Using synthetic X-ray observations, we evaluate the detectability of stripped galactic coronae in real observations by stacking images on the known galaxy centres. We find that coronal emission should be detected within ~10 arcsec, or ~5 kpc up to ~2.3 Gyr in the lowest (0.1-1.2 keV) energy band. Thus, the presence of observed coronae in cluster galaxies significantly smaller than the hot X-ray haloes of field galaxies indicates that at least some gas removal occurs within cluster environments for recently accreted galaxies. Finally, we evaluate the possibility that existing and future X-ray cluster catalogues can be used in combination with optical galaxy positions to detect galactic coronal emission via stacking analysis. We briefly discuss the effects of additional physical processes on coronal survival, and will address them in detail in future papers in this series.
Autism spectrum disorder in Down syndrome: cluster analysis of Aberrant Behaviour Checklist data supports diagnosis.

PubMed

Ji, N Y; Capone, G T; Kaufmann, W E

2011-11-01

The diagnostic validity of autism spectrum disorder (ASD) based on Diagnostic and Statistical Manual of Mental Disorders (DSM) has been challenged in Down syndrome (DS), because of the high prevalence of cognitive impairments in this population. Therefore, we attempted to validate DSM-based diagnoses via an unbiased categorisation of participants with a DSM-independent behavioural instrument. Based on scores on the Aberrant Behaviour Checklist - Community, we performed sequential factor (four DS-relevant factors: Autism-Like Behaviour, Disruptive Behaviour, Hyperactivity, Self-Injury) and cluster analyses on a 293-participant paediatric DS clinic cohort. The four resulting clusters were compared with DSM-delineated groups: DS + ASD, DS + None (no DSM diagnosis), DS + DBD (disruptive behaviour disorder) and DS + SMD (stereotypic movement disorder), the latter two as comparison groups. Two clusters were identified with DS + ASD: Cluster 1 (35.1%) with higher disruptive behaviour and Cluster 4 (48.2%) with more severe autistic behaviour and higher percentage of late onset ASD. The majority of participants in DS + None (71.9%) and DS + DBD (87.5%) were classified into Cluster 2 and 3, respectively, while participants in DS + SMD were relatively evenly distributed throughout the four clusters. Our unbiased, DSM-independent analyses, using a rating scale specifically designed for individuals with severe intellectual disability, demonstrated that DSM-based criteria of ASD are applicable to DS individuals despite their cognitive impairments. Two DS + ASD clusters were identified and supported the existence of at least two subtypes of ASD in DS, which deserve further characterisation. Despite the prominence of stereotypic behaviour in DS, the SMD diagnosis was not identified by cluster analysis, suggesting that high-level stereotypy is distributed throughout DS. 
Further supporting DSM diagnoses, typically behaving DS participants were easily distinguished as a group from those with maladaptive behaviours. © 2011 The Authors. Journal of Intellectual Disability Research © 2011 Blackwell Publishing Ltd.
Clusters of Midlife Women by Physical Activity and Their Racial/Ethnic Differences

PubMed Central

Im, Eun-Ok; Ko, Young; Chee, Eunice; Chee, Wonshik; Mao, Jun James

2016-01-01

Objective The purpose of this study was to identify clusters of midlife women by physical activity and to determine racial/ethnic differences in physical activities in each cluster. Methods This was a secondary analysis of the data from 542 women (157 Non-Hispanic [NH] Whites, 127 Hispanics, 135 NH African Americans, and 123 NH Asians) in a larger Internet study on midlife women’s attitudes toward physical activity. The instruments included the Barriers to Health Activities Scale, the Physical Activity Assessment Inventory, the Questions on Attitudes toward Physical Activity, Subjective Norm, Perceived Behavioral Control, and Behavioral Intention, and the Kaiser Physical Activity Survey. The data were analyzed using hierarchical cluster analyses, ANOVA, and multinomial logistic analyses. Results A three-cluster solution was adopted: Cluster 1 (high active living and sports/exercise activity group; 48%), Cluster 2 (high household/caregiving and occupational activity group; 27%), and Cluster 3 (low active living and sports/exercise activity group; 26%). There were significant racial/ethnic differences in occupational activities of Clusters 1 and 3 (all p<.01). Compared with Cluster 1, Cluster 2 tended to have lower family income, less access to health care, higher unemployment, higher perceived barriers scores, and lower social influences scores (all p<.01). Compared with Cluster 1, Cluster 3 tended to have greater obesity, less access to health care, higher perceived barriers scores, more negative attitudes toward physical activity, and lower self-efficacy scores (all p<.01). Conclusions Midlife women’s unique patterns of physical activity and their associated factors need to be considered in future intervention development. PMID:27846052
Classification of Support Needs for Elderly Outpatients with Diabetes Who Live Alone.

PubMed

Miyawaki, Yoshiko; Shimizu, Yasuko; Seto, Natsuko

2016-02-01

To investigate the support needs of elderly patients with diabetes and to classify elderly patients with diabetes living alone on the basis of support needs. Support needs were derived from a literature review of relevant journals and interviews of outpatients as well as expert nurses in the field of diabetes to prepare a 45-item questionnaire. Each item was analyzed on a 4-point Likert scale. The study included 634 elderly patients with diabetes who were recruited from 3 hospitals in Japan. Exploratory factor analysis was performed to determine the underlying structure of support needs, followed by hierarchical cluster analysis to clarify the characteristics of patients living alone (n=104) who had common support needs. Exploratory factor analysis suggested a 5-factor solution with 23 items: (1) hope for class and gatherings, (2) hope for personal advice including emergency response, (3) supportlessness and hopelessness, (4) barriers to food preparation, (5) hope of safe medical therapy. The hierarchical cluster analysis of subjects yielded 7 clusters, including a no special-support needs group, a collective support group, a self-care support group, a personal-support focus group, a life-support group, a food-preparation support group and a healthcare-environment support group. The support needs of elderly patients with diabetes who live alone can be divided into 2 categories: life and self-care support. Implementation of these categories in outpatient-management programs in which contact time with patients is limited is important in the overall management of elderly patients with diabetes who are living alone. Copyright © 2015 Canadian Diabetes Association. Published by Elsevier Inc. All rights reserved.
Whether the Autism Spectrum Quotient Consists of Two Different Subgroups? Cluster Analysis of the Autism Spectrum Quotient in General Population

ERIC Educational Resources Information Center

Kitazoe, Noriko; Fujita, Naofumi; Izumoto, Yuji; Terada, Shin-ichi; Hatakenaka, Yuhei

2017-01-01

The purpose of this study was to investigate whether the individuals in the general population with high scores on the Autism Spectrum Quotient constituted a single homogeneous group or not. A cohort of university students (n = 4901) was investigated by cluster analysis based on the original five subscales of the Autism Spectrum Quotient. Based on…
Exploring Different Patterns of Love Attitudes among Chinese College Students.

PubMed

Zeng, Xianglong; Pan, Yiqin; Zhou, Han; Yu, Shi; Liu, Xiangping

2016-01-01

Individual differences in love attitudes and the relationship between love attitudes and other variables in Asian culture lack in-depth exploration. This study conducted cluster analysis with data regarding love attitudes obtained from 389 college students in mainland China. The result of cluster analysis based on love-attitude scales distinguished four types of students: game players, rational lovers, emotional lovers, and absence lovers. These four groups of students showed significant differences in sexual attitudes and personality traits of deliberation and dutifulness but not self-discipline. The study's implications for future studies on love attitudes in certain cultural groups were also discussed.
A Cross-Cultural Comparison of Symptom Reporting and Symptom Clusters in Heart Failure.

PubMed

Park, Jumin; Johantgen, Mary E

2017-07-01

An understanding of symptoms in heart failure (HF) among different cultural groups has become increasingly important. The purpose of this study was to compare symptom reporting and symptom clusters in HF patients between a Western (the United States) and an Eastern Asian sample (China and Taiwan). A secondary analysis of a cross-sectional observational study was conducted. The data were obtained from a matched HF patient sample from the United States and China/Taiwan (N = 240 in each). Eight selected items related to HF symptoms from the Minnesota Living with Heart Failure Questionnaire were analyzed. Compared with the U.S. sample, HF patients from China/Taiwan reported a lower level of symptom distress. Using a latent class approach, analysis of the two regional groups did not yield the same number of clusters: four classes in the United States and three classes in China/Taiwan. The study demonstrated that symptom reporting and identification of symptom clusters might be influenced by cultural factors.
Inflammatory endotypes of chronic rhinosinusitis based on cluster analysis of biomarkers.

PubMed

Tomassen, Peter; Vandeplas, Griet; Van Zele, Thibaut; Cardell, Lars-Olaf; Arebro, Julia; Olze, Heidi; Förster-Ruhrmann, Ulrike; Kowalski, Marek L; Olszewska-Ziąber, Agnieszka; Holtappels, Gabriele; De Ruyck, Natalie; Wang, Xiangdong; Van Drunen, Cornelis; Mullol, Joaquim; Hellings, Peter; Hox, Valerie; Toskala, Elina; Scadding, Glenis; Lund, Valerie; Zhang, Luo; Fokkens, Wytske; Bachert, Claus

2016-05-01

Current phenotyping of chronic rhinosinusitis (CRS) into chronic rhinosinusitis with nasal polyps (CRSwNP) and chronic rhinosinusitis without nasal polyps (CRSsNP) might not adequately reflect the pathophysiologic diversity within patients with CRS. We sought to identify inflammatory endotypes of CRS. Therefore we aimed to cluster patients with CRS based solely on immune markers in a phenotype-free approach. Secondarily, we aimed to match clusters to phenotypes. In this multicenter case-control study patients with CRS and control subjects underwent surgery, and tissue was analyzed for IL-5, IFN-γ, IL-17A, TNF-α, IL-22, IL-1β, IL-6, IL-8, eosinophilic cationic protein, myeloperoxidase, TGF-β1, IgE, Staphylococcus aureus enterotoxin-specific IgE, and albumin. We used partition-based clustering. Clustering of 173 cases resulted in 10 clusters, of which 4 had low or undetectable IL-5, eosinophilic cationic protein, IgE, and albumin concentrations and 6 had high concentrations of those markers. Within the group of IL-5-negative clusters, 3 clusters clinically resembled a predominant chronic rhinosinusitis without nasal polyps (CRSsNP) phenotype without increased asthma prevalence, and 1 cluster had a TH17 profile and had mixed CRSsNP/CRSwNP. The IL-5-positive clusters were divided into a group with moderate IL-5 concentrations, a mixed CRSsNP/CRSwNP and increased asthma phenotype, and a group with high IL-5 levels, an almost exclusive nasal polyp phenotype with strongly increased asthma prevalence. In the latter group, 2 clusters demonstrated the highest concentrations of IgE and asthma prevalence, with all samples expressing Staphylococcus aureus enterotoxin-specific IgE. Distinct CRS clusters with diverse inflammatory mechanisms largely correlated with phenotypes and further differentiated them and provided a more accurate description of the inflammatory mechanisms involved than phenotype information only.
Copyright © 2016 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Functional Status, Quality of Life, and Costs Associated With Fibromyalgia Subgroups: A Latent Profile Analysis.

PubMed

Luciano, Juan V; Forero, Carlos G; Cerdà-Lafont, Marta; Peñarrubia-María, María Teresa; Fernández-Vergel, Rita; Cuesta-Vargas, Antonio I; Ruíz, José M; Rozadilla-Sacanell, Antoni; Sirvent-Alierta, Elena; Santo-Panero, Pilar; García-Campayo, Javier; Serrano-Blanco, Antoni; Pérez-Aranda, Adrián; Rubio-Valera, María

2016-10-01

Although fibromyalgia syndrome (FM) is considered a heterogeneous condition, there is no generally accepted subgroup typology. We used hierarchical cluster analysis and latent profile analysis to replicate Giesecke's classification in Spanish FM patients. The second aim was to examine whether the subgroups differed in sociodemographic characteristics, functional status, quality of life, and in direct and indirect costs. A total of 160 FM patients completed the following measures for cluster derivation: the Center for Epidemiological Studies-Depression Scale, the Trait Anxiety Inventory, the Pain Catastrophizing Scale, and the Control over Pain subscale. Pain threshold was measured with a sphygmomanometer. In addition, the Fibromyalgia Impact Questionnaire-Revised, the EuroQoL-5D-3L, and the Client Service Receipt Inventory were administered for cluster validation. Two distinct clusters were identified using hierarchical cluster analysis ("hypersensitive" group, 69.8% and "functional" group, 30.2%). In contrast, the latent profile analysis goodness-of-fit indices supported the existence of 3 FM patient profiles: (1) a "functional" profile (28.1%) defined as moderate tenderness, distress, and pain catastrophizing; (2) a "dysfunctional" profile (45.6%) defined by elevated tenderness, distress, and pain catastrophizing; and (3) a "highly dysfunctional and distressed" profile (26.3%) characterized by elevated tenderness and extremely high distress and catastrophizing. We did not find significant differences in sociodemographic characteristics between the 2 clusters or among the 3 profiles. The functional profile was associated with less impairment, greater quality of life, and lower health care costs. We identified 3 distinct profiles which accounted for the heterogeneity of FM patients. Our findings might help to design tailored interventions for FM patients.
Infalling groups and galaxy transformations in the cluster A2142

NASA Astrophysics Data System (ADS)

Einasto, Maret; Deshev, Boris; Lietzen, Heidi; Kipper, Rain; Tempel, Elmo; Park, Changbom; Gramann, Mirt; Heinämäki, Pekka; Saar, Enn; Einasto, Jaan

2018-03-01

Context. Superclusters of galaxies provide dynamical environments for the study of the formation and evolution of structures in the cosmic web from galaxies, to the richest galaxy clusters, and superclusters themselves. Aims: We study galaxy populations and search for possible merging substructures in the rich galaxy cluster A2142 in the collapsing core of the supercluster SCl A2142, which may give rise to radio and X-ray structures in the cluster, and affect galaxy properties of this cluster. Methods: We used normal mixture modelling to select substructure of the cluster A2142. We compared alignments of the cluster, its brightest galaxies (hereafter BCGs), subclusters, and supercluster axes. The projected phase space (PPS) diagram and clustercentric distributions are used to analyse the dynamics of the cluster and study the distribution of various galaxy populations in the cluster and subclusters. Results: We find several infalling galaxy groups and subclusters. The cluster, supercluster, BCGs, and one infalling subcluster are all aligned. Their orientation is correlated with the alignment of the radio and X-ray haloes of the cluster. Galaxy populations in the main cluster and in the outskirts subclusters are different. Galaxies in the centre of the main cluster at the clustercentric distances 0.5 h⁻¹ Mpc (Dc/Rvir < 0.5, Rvir = 0.9 h⁻¹ Mpc) have older stellar populations (with the median age of 10-11 Gyr) than galaxies at larger clustercentric distances. Star-forming and recently quenched galaxies are located mostly at the clustercentric distances Dc ≈ 1.8 h⁻¹ Mpc, where subclusters fall into the cluster and the properties of galaxies change rapidly. In this region the median age of stellar populations of galaxies is about 2 Gyr. Galaxies in A2142 on average have higher stellar masses, lower star formation rates, and redder colours than galaxies in rich groups. 
The total mass in infalling groups and subclusters is M ≈ 6 × 10¹⁴ h⁻¹ M⊙, that is approximately half of the mass of the cluster. This mass is sufficient for the mass growth of the cluster from redshift z = 0.5 (half-mass epoch) to the present. Conclusions: Our analysis suggests that the cluster A2142 has formed as a result of past and present mergers and infallen groups, predominantly along the supercluster axis. Mergers cause complex radio and X-ray structure of the cluster and affect the properties of galaxies in the cluster, especially at the boundaries of the cluster in the infall region. Explaining the differences between galaxy populations, mass, and richness of A2142, and other groups and clusters may lead to better insight about the formation and evolution of rich galaxy clusters.
Spatial cluster for clustering the influence factor of birth and death child in Bogor Regency, West Java

NASA Astrophysics Data System (ADS)

Bekti, Rokhana Dwi; Rachmawati, Ro'fah

2014-03-01

The numbers of child births and deaths are benchmarks for determining and monitoring health and welfare in Indonesia. They can be used to identify groups of people who have a high mortality risk. Identifying such groups is important for comparing the characteristics of people at high and low risk. These characteristics can be seen in the factors that influence child births and deaths, such as economic conditions, health facilities, education, and others. The influencing factors differ between individuals, but individuals who live close together or in nearby locations tend to be similar, which indicates a spatial effect. To identify groups in this research, clustering is done with a spatial cluster method, which takes into account the influence of location or the relationship between locations. One such method is Spatial 'K'luster Analysis by Tree Edge Removal (SKATER). The research was conducted in Bogor Regency, West Java. The goal was to obtain clusters of districts based on the factors that influence child births and deaths. SKATER built four clusters consisting of 26, 7, 2, and 5 districts, respectively. SKATER performs well for clustering that includes a spatial effect. Compared with other cluster methods, K-means performs well according to the MANOVA test.
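The tree-edge-removal idea behind SKATER can be sketched in a few lines of Python. The version below is a simplified illustration, not the authors' implementation: it assumes a single numeric attribute per district, builds a minimum spanning tree over a hypothetical contiguity list, and then repeatedly removes the tree edge whose removal gives the smallest total within-cluster sum of squared deviations, using a brute-force search. All district values and neighbour pairs are invented.

```python
def minimum_spanning_tree(n, edges):
    """Kruskal's algorithm; `edges` holds (cost, u, v) for contiguous regions only."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    tree = []
    for _cost, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v))
    return tree

def components(n, tree_edges):
    """Connected components of the forest, each as a sorted list of region ids."""
    adjacency = {i: [] for i in range(n)}
    for u, v in tree_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, parts = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, part = [start], []
        while stack:
            node = stack.pop()
            part.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        parts.append(sorted(part))
    return parts

def ssd(values, members):
    """Sum of squared deviations from the cluster mean."""
    mean = sum(values[m] for m in members) / len(members)
    return sum((values[m] - mean) ** 2 for m in members)

def skater_sketch(values, neighbours, n_clusters):
    """Prune the spanning tree until it falls apart into `n_clusters` pieces."""
    n = len(values)
    edges = [(abs(values[u] - values[v]), u, v) for u, v in neighbours]
    tree = minimum_spanning_tree(n, edges)
    while len(components(n, tree)) < n_clusters:
        best = None
        for edge in tree:
            trial = [e for e in tree if e != edge]
            total = sum(ssd(values, part) for part in components(n, trial))
            if best is None or total < best[0]:
                best = (total, edge)
        tree.remove(best[1])
    return components(n, tree)

# Hypothetical example: six districts along a road, one risk score each.
values = [1.0, 1.2, 0.9, 5.0, 5.2, 9.0]
neighbours = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
clusters = skater_sketch(values, neighbours, n_clusters=3)
print(clusters)
```

Because only contiguous districts are linked in the tree, every resulting cluster is spatially connected, which is the property that distinguishes this approach from plain K-means.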
Genetic diversity and relationship analysis of Gossypium arboreum accessions.

PubMed

Liu, F; Zhou, Z L; Wang, C Y; Wang, Y H; Cai, X Y; Wang, X X; Zhang, Z S; Wang, K B

2015-11-19

Simple sequence repeat techniques were used to identify the genetic diversity of 101 Gossypium arboreum accessions collected from India, Vietnam, and the southwest of China (Guizhou, Guangxi, and Yunnan provinces). Twenty-six pairs of SSR primers produced a total of 103 polymorphic loci with an average of 3.96 polymorphic loci per primer. The average of the effective number of alleles, Nei's gene diversity, and Shannon's information index were 0.59, 0.2835, and 0.4361, respectively. The diversity varied among different geographic regions. The result of principal component analysis was consistent with that of unweighted pair group method with arithmetic mean clustering analysis. The 101 G. arboreum accessions were clustered into 2 groups.
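For readers unfamiliar with the diversity statistics named in the abstract above, the Python sketch below shows the textbook per-locus definitions of the effective number of alleles, Nei's gene diversity, and Shannon's information index. The allele frequencies are hypothetical, and the abstract does not say which software or exact formula variants the authors used, so this is only an assumed reading of those terms.

```python
import math

def effective_alleles(freqs):
    """Effective number of alleles at a locus: 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def nei_gene_diversity(freqs):
    """Nei's gene diversity (expected heterozygosity): 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)

def shannon_index(freqs):
    """Shannon's information index: -sum(p_i * ln p_i), skipping zero frequencies."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

freqs = [0.8, 0.2]  # hypothetical allele frequencies at one SSR locus
n_e = effective_alleles(freqs)
h = nei_gene_diversity(freqs)
i_shannon = shannon_index(freqs)
print(n_e, h, i_shannon)
```

Study-level values such as those quoted in the abstract are typically averages of per-locus quantities like these.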
Application of Factor Analysis on the Financial Ratios of Indian Cement Industry and Validation of the Results by Cluster Analysis

NASA Astrophysics Data System (ADS)

De, Anupam; Bandyopadhyay, Gautam; Chakraborty, B. N.

2010-10-01

Financial ratio analysis is an important and commonly used tool for analyzing the financial health of a firm. A large number of financial ratios, which can be categorized into different groups, are used for this analysis. However, to reduce the number of ratios used for financial analysis and to regroup them on the basis of empirical evidence, the Factor Analysis technique has been used successfully by different researchers during the last three decades. In this study Factor Analysis has been applied to audited financial data of Indian cement companies for a period of 10 years. The sample companies are listed on the stock exchanges of India (BSE and NSE). Factor Analysis, conducted over 44 variables (financial ratios) grouped in 7 categories, resulted in 11 underlying categories (factors). Each factor is named in an appropriate manner considering the factor loadings and constituent variables (ratios). Representative ratios are identified for each such factor. To validate the results of Factor Analysis and to reach a final conclusion regarding the representative ratios, Cluster Analysis was performed.
Clustering Methods with Qualitative Data: A Mixed Methods Approach for Prevention Research with Small Samples

PubMed Central

Henry, David; Dymnicki, Allison B.; Mohatt, Nathaniel; Allen, James; Kelly, James G.

2016-01-01

Qualitative methods potentially add depth to prevention research, but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data, but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-Means clustering, and latent class analysis produced similar levels of accuracy with binary data, and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a “real-world” example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities. PMID:25946969
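As a rough illustration of clustering binary coded data and scoring the recovered assignment, the sketch below applies average-linkage hierarchical clustering to a handful of invented code profiles. The Hamming distance, the average linkage, the toy profiles, and the function names are all assumptions made for this example; the abstract does not state which distance or linkage the simulations used.

```python
from itertools import combinations

def hamming(a, b):
    """Number of codes on which two binary profiles disagree."""
    return sum(x != y for x, y in zip(a, b))

def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Average pairwise Hamming distance between the two clusters.
            total = sum(hamming(profiles[i], profiles[j])
                        for i in clusters[a] for j in clusters[b])
            d = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def two_group_accuracy(clusters, truth):
    """Share of correctly assigned cases, allowing for swapped cluster labels."""
    predicted = [0] * len(truth)
    for label, members in enumerate(clusters):
        for i in members:
            predicted[i] = label
    agree = sum(p == t for p, t in zip(predicted, truth))
    return max(agree, len(truth) - agree) / len(truth)

# Hypothetical coded interviews: rows are participants, columns are codes (1 = present).
profiles = [
    (1, 1, 1, 0, 0), (1, 1, 0, 0, 0), (1, 1, 1, 0, 1),
    (0, 0, 0, 1, 1), (0, 0, 1, 1, 1), (0, 1, 0, 1, 1),
]
truth = [0, 0, 0, 1, 1, 1]

clusters = average_linkage(profiles, n_clusters=2)
accuracy = two_group_accuracy(clusters, truth)
```

In a simulation of the kind the abstract describes, the profiles would be generated repeatedly from known groups and the accuracy averaged over runs; the accuracy helper here only handles the two-group case.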
The association between mood state and chronobiological characteristics in bipolar I disorder: a naturalistic, variable cluster analysis-based study.

PubMed

Gonzalez, Robert; Suppes, Trisha; Zeitzer, Jamie; McClung, Colleen; Tamminga, Carol; Tohen, Mauricio; Forero, Angelica; Dwivedi, Alok; Alvarado, Andres

2018-02-19

Multiple types of chronobiological disturbances have been reported in bipolar disorder, including characteristics associated with general activity levels, sleep, and rhythmicity. Previous studies have focused on examining the individual relationships between affective state and chronobiological characteristics. The aim of this study was to conduct a variable cluster analysis in order to ascertain how mood states are associated with chronobiological traits in bipolar I disorder (BDI). We hypothesized that manic symptomatology would be associated with disturbances of rhythm. Variable cluster analysis identified five chronobiological clusters in 105 BDI subjects. Cluster 1, comprising subjective sleep quality, was associated with both mania and depression. Cluster 2, which comprised variables describing the degree of rhythmicity, was associated with mania. Significant associations between mood state and cluster analysis-identified chronobiological variables were noted. Disturbances of mood were associated with subjectively assessed sleep disturbances as opposed to objectively determined, actigraphy-based sleep variables. No associations with general activity variables were noted. Relationships between gender and medication classes in use and cluster analysis-identified chronobiological characteristics were noted. Exploratory analyses noted that medication class had a larger impact on these relationships than the number of psychiatric medications in use. In a BDI sample, variable cluster analysis was able to group related chronobiological variables. The results support our primary hypothesis that mood state, particularly mania, is associated with chronobiological disturbances. Further research is required in order to define these relationships and to determine the directionality of the associations between mood state and chronobiological characteristics.
Clustering Methods with Qualitative Data: a Mixed-Methods Approach for Prevention Research with Small Samples.

PubMed

Henry, David; Dymnicki, Allison B; Mohatt, Nathaniel; Allen, James; Kelly, James G

2015-10-01

Qualitative methods potentially add depth to prevention research but can produce large amounts of complex data even with small samples. Studies conducted with culturally distinct samples often produce voluminous qualitative data but may lack sufficient sample sizes for sophisticated quantitative analysis. Currently lacking in mixed-methods research are methods allowing for more fully integrating qualitative and quantitative analysis techniques. Cluster analysis can be applied to coded qualitative data to clarify the findings of prevention studies by aiding efforts to reveal such things as the motives of participants for their actions and the reasons behind counterintuitive findings. By clustering groups of participants with similar profiles of codes in a quantitative analysis, cluster analysis can serve as a key component in mixed-methods research. This article reports two studies. In the first study, we conduct simulations to test the accuracy of cluster assignment using three different clustering methods with binary data as produced when coding qualitative interviews. Results indicated that hierarchical clustering, K-means clustering, and latent class analysis produced similar levels of accuracy with binary data and that the accuracy of these methods did not decrease with samples as small as 50. Whereas the first study explores the feasibility of using common clustering methods with binary data, the second study provides a "real-world" example using data from a qualitative study of community leadership connected with a drug abuse prevention project. We discuss the implications of this approach for conducting prevention research, especially with small samples and culturally distinct communities.
HICOSMO - cosmology with a complete sample of galaxy clusters - I. Data analysis, sample selection and luminosity-mass scaling relation

NASA Astrophysics Data System (ADS)

Schellenberger, G.; Reiprich, T. H.

2017-08-01

The X-ray regime, where the most massive visible component of galaxy clusters, the intracluster medium, is visible, offers directly measured quantities, like the luminosity, and derived quantities, like the total mass, to characterize these objects. The aim of this project is to analyse a complete sample of galaxy clusters in detail and constrain cosmological parameters, like the matter density, Ωm, or the amplitude of initial density fluctuations, σ8. The purely X-ray flux-limited sample (HIFLUGCS) consists of the 64 X-ray brightest galaxy clusters, which are excellent targets to study the systematic effects that can bias results. We analysed in total 196 Chandra observations of the 64 HIFLUGCS clusters, with a total exposure time of 7.7 Ms. Here, we present our data analysis procedure (including an automated substructure detection and an energy band optimization for surface brightness profile analysis) that gives individually determined, robust total mass estimates. These masses are tested against dynamical and Planck Sunyaev-Zeldovich (SZ) derived masses of the same clusters, where good overall agreement is found with the dynamical masses. The Planck SZ masses seem to show a mass-dependent bias to our hydrostatic masses; possible biases in this mass-mass comparison are discussed including the Planck selection function. Furthermore, we show the results for the (0.1-2.4) keV luminosity versus mass scaling relation. The overall slope of the sample (1.34) is in agreement with expectations and values from literature. Splitting the sample into galaxy groups and clusters reveals, even after a selection bias correction, that galaxy groups exhibit a significantly steeper slope (1.88) compared to clusters (1.06).
GibbsCluster: unsupervised clustering and alignment of peptide sequences.

PubMed

Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten

2017-07-03

Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version incorporating insertions and deletions, accounting for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

The dynamics of cyclone clustering in re-analysis and a high-resolution climate model

NASA Astrophysics Data System (ADS)

Priestley, Matthew; Pinto, Joaquim; Dacre, Helen; Shaffrey, Len

2017-04-01

Extratropical cyclones have a tendency to occur in groups (clusters) in the exit of the North Atlantic storm track during wintertime, potentially leading to widespread socioeconomic impacts. The winter of 2013/14 was the stormiest on record for the UK and was characterised by the recurrent clustering of intense extratropical cyclones. This clustering was associated with a strong, straight and persistent North Atlantic 250 hPa jet with Rossby wave-breaking (RWB) on both flanks, pinning the jet in place. Here, we provide for the first time an analysis of all clustered events in 36 years of the ERA-Interim Re-analysis at three latitudes (45° N, 55° N, 65° N) encompassing various regions of Western Europe. The relationship between the occurrence of RWB and cyclone clustering is studied in detail. Clustering at 55° N is associated with an extended and anomalously strong jet flanked on both sides by RWB. However, clustering at 65(45)° N is associated with RWB to the south (north) of the jet, deflecting the jet northwards (southwards). A positive correlation was found between the intensity of the clustering and RWB occurrence to the north and south of the jet. However, there is considerable spread in these relationships. Finally, analysis has shown that the relationships identified in the re-analysis are also present in a high-resolution coupled global climate model (HiGEM). In particular, clustering is associated with the same dynamical conditions at each of our three latitudes in spite of the identified biases in frequency and intensity of RWB.
Cerebral and non-cerebral coenurosis: on the genotypic and phenotypic diversity of Taenia multiceps.

PubMed

Christodoulopoulos, Georgios; Dinkel, Anke; Romig, Thomas; Ebi, Dennis; Mackenstedt, Ute; Loos-Frank, Brigitte

2016-12-01

We characterised the causative agents of cerebral and non-cerebral coenurosis in livestock by determining the mitochondrial genotypes and morphological phenotypes of 52 Taenia multiceps isolates from a wide geographical range in Europe, Africa, and western Asia. Three studies were conducted: (1) a morphological comparison of the rostellar hooks of cerebral and non-cerebral cysts of sheep and goats, (2) a morphological comparison of adult worms experimentally produced in dogs, and (3) a molecular analysis of three partial mitochondrial genes (nad1, cox1, and 12S rRNA) of the same isolates. No significant morphological or genetic differences were associated with the species of the intermediate host. Adult parasites originating from cerebral and non-cerebral cysts differed morphologically, e.g. the shape of the small hooks and the distribution of the testes in the mature proglottids. The phylogenetic analysis of the mitochondrial haplotypes produced three distinct clusters: one cluster including both cerebral isolates from Greece and non-cerebral isolates from tropical and subtropical countries, and two clusters including cerebral isolates from Greece. The majority of the non-cerebral specimens clustered together but did not form a monophyletic group. No monophyletic groups were observed based on geography, although specimens from the same region tended to cluster. The clustering indicates high intraspecific diversity. The phylogenetic analysis suggests that all variants of T. multiceps can cause cerebral coenurosis in sheep (which may be the ancestral phenotype), and some variants, predominantly from one genetic cluster, acquired the additional capacity to produce non-cerebral forms in goats and more rarely in sheep.
Iterative Stable Alignment and Clustering of 2D Transmission Electron Microscope Images

PubMed Central

Yang, Zhengfan; Fang, Jia; Chittuluru, Johnathan; Asturias, Francisco J.; Penczek, Pawel A.

2012-01-01

SUMMARY Identification of homogeneous subsets of images in a macromolecular electron microscopy (EM) image data set is a critical step in single-particle analysis. The task is handled by iterative algorithms, whose performance is compromised by the compounded limitations of image alignment and K-means clustering. Here we describe an approach, iterative stable alignment and clustering (ISAC) that, relying on a new clustering method and on the concepts of stability and reproducibility, can extract validated, homogeneous subsets of images. ISAC requires only a small number of simple parameters and, with minimal human intervention, can eliminate bias from two-dimensional image clustering and maximize the quality of group averages that can be used for ab initio three-dimensional structural determination and analysis of macromolecular conformational variability. Repeated testing of the stability and reproducibility of a solution within ISAC eliminates heterogeneous or incorrect classes and introduces critical validation to the process of EM image clustering. PMID:22325773
Research on retailer data clustering algorithm based on Spark

NASA Astrophysics Data System (ADS)

Huang, Qiuman; Zhou, Feng

2017-03-01

Big data analysis is currently a hot topic in the IT field. Spark is a high-reliability, high-performance distributed parallel computing framework for large data sets. The k-means algorithm is one of the classical partitioning methods in cluster analysis. In this paper, we study the k-means clustering algorithm on Spark. First, the principle of the algorithm is analyzed; then a clustering analysis of supermarket customers is carried out experimentally to identify different shopping patterns. The paper also proposes a parallelization of the k-means algorithm on Spark's distributed computing framework and gives a concrete design and implementation scheme. Two years of sales data from a supermarket are used to validate the proposed clustering algorithm and to segment customers; the clustering results are then analyzed to help enterprises adopt different marketing strategies for different customer groups and improve sales performance.
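For readers unfamiliar with k-means, the following is a minimal single-machine sketch of the standard Lloyd iteration, arranged so that the assignment step resembles a "map" and the centroid update resembles a "reduce by key". It is not the paper's Spark implementation, which the abstract does not detail; the customer features, the starting centroids, and the function names are invented for illustration.

```python
def assign(point, centroids):
    """Index of the centroid closest to the point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
    )

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd iteration with a fixed number of passes."""
    for _ in range(iterations):
        # "Map" step: each point contributes to the running sum of its cluster.
        sums = {}
        for pt in points:
            c = assign(pt, centroids)
            total, count = sums.get(c, ([0.0] * len(pt), 0))
            sums[c] = ([t + x for t, x in zip(total, pt)], count + 1)
        # "Reduce" step: new centroid = mean per cluster; empty clusters keep the old one.
        centroids = [
            tuple(t / sums[c][1] for t in sums[c][0]) if c in sums else centroids[c]
            for c in range(len(centroids))
        ]
    return centroids, [assign(pt, centroids) for pt in points]

# Hypothetical customers described by (scaled yearly spend, scaled visit frequency).
customers = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
start = [(1.0, 1.0), (5.0, 5.0)]  # assumed starting centroids

centroids, labels = kmeans(customers, start)
```

In a distributed setting the per-cluster sums and counts are what would be combined across partitions, since they can be added in any order before the final division.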
Functional clustering of time series gene expression data by Granger causality

PubMed Central

2012-01-01

Background A common approach for time series gene expression data analysis includes the clustering of genes with similar expression patterns throughout time. Clustered gene expression profiles point to the joint contribution of groups of genes to a particular cellular process. However, since genes belong to intricate networks, other features, besides comparable expression patterns, should provide additional information for the identification of functionally similar genes. Results In this study we perform gene clustering through the identification of Granger causality between and within sets of time series gene expression data. Granger causality is based on the idea that the cause of an event cannot come after its consequence. Conclusions This kind of analysis can be used as a complementary approach for functional clustering, wherein genes would be clustered not solely based on their expression similarity but on their topological proximity built according to the intensity of Granger causality among them. PMID:23107425
Computational gene expression profiling under salt stress reveals patterns of co-expression

PubMed Central

Sanchita; Sharma, Ashok

2016-01-01

Plants respond differently to environmental conditions. Among various abiotic stresses, salt stress is a condition where excess salt in soil causes inhibition of plant growth. To understand the response of plants to the stress conditions, identification of the responsible genes is required. Clustering is a data mining technique used to group the genes with similar expression. The genes of a cluster show similar expression and function. We applied clustering algorithms on gene expression data of Solanum tuberosum showing differential expression in Capsicum annuum under salt stress. The clusters that were common to multiple algorithms were taken forward for analysis. Principal component analysis (PCA) further validated the findings of other cluster algorithms by visualizing their clusters in three-dimensional space. Functional annotation results revealed that most of the genes were involved in stress related responses. Our findings suggest that these algorithms may be helpful in the prediction of the function of co-expressed genes. PMID:26981411
Technical support for creating an artificial intelligence system for feature extraction and experimental design

NASA Technical Reports Server (NTRS)

Glick, B. J.

1985-01-01

Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.
Shape analysis of H II regions - I. Statistical clustering

NASA Astrophysics Data System (ADS)

Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

2018-07-01

We present here our shape analysis method for a sample of 76 Galactic H II regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorize H II regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionized by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilizing synthetic observations from numerical simulations of H II regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.
Shape Analysis of HII Regions - I. Statistical Clustering

NASA Astrophysics Data System (ADS)

Campbell-White, Justyn; Froebrich, Dirk; Kume, Alfred

2018-04-01

We present here our shape analysis method for a sample of 76 Galactic HII regions from MAGPIS 1.4 GHz data. The main goal is to determine whether physical properties and initial conditions of massive star cluster formation are linked to the shape of the regions. We outline a systematic procedure for extracting region shapes and perform hierarchical clustering on the shape data. We identified six groups that categorise HII regions by common morphologies. We confirmed the validity of these groupings by bootstrap re-sampling and the ordination technique multidimensional scaling. We then investigated associations between physical parameters and the assigned groups. Location is mostly independent of group, with a small preference for regions of similar longitudes to share common morphologies. The shapes are homogeneously distributed across Galactocentric distance and latitude. One group contains regions that are all younger than 0.5 Myr and ionised by low- to intermediate-mass sources. Those in another group are all driven by intermediate- to high-mass sources. One group was distinctly separated from the other five and contained regions at the surface brightness detection limit for the survey. We find that our hierarchical procedure is most sensitive to the spatial sampling resolution used, which is determined for each region from its distance. We discuss how these errors can be further quantified and reduced in future work by utilising synthetic observations from numerical simulations of HII regions. We also outline how this shape analysis has further applications to other diffuse astronomical objects.
Hierarchical cluster-tendency analysis of the group structure in the foreign exchange market

NASA Astrophysics Data System (ADS)

Wu, Xin-Ye; Zheng, Zhi-Gang

2013-08-01

A hierarchical cluster-tendency (HCT) method for analyzing the group structure of networks of the global foreign exchange (FX) market is proposed by combining the advantages of both the minimal spanning tree (MST) and the hierarchical tree (HT). The currencies of the 50 economies with the largest GDP in 2010, according to the World Bank's database, are chosen as the underlying system. By using the HCT method, all nodes in the FX market network can be "colored" and distinguished. We reveal that the FX networks can be divided into two groups, i.e., the Asia-Pacific group and the Pan-European group. The results given by the hierarchical cluster-tendency method agree well with the formerly observed geographical aggregation behavior in the FX market. Moreover, an oil-resource aggregation phenomenon is discovered by using our method. We find that gold could be a better numeraire for the weekly-frequency FX data.
Sensitivity Analysis of Multiple Informant Models When Data Are Not Missing at Random

ERIC Educational Resources Information Center

Blozis, Shelley A.; Ge, Xiaojia; Xu, Shu; Natsuaki, Misaki N.; Shaw, Daniel S.; Neiderhiser, Jenae M.; Scaramella, Laura V.; Leve, Leslie D.; Reiss, David

2013-01-01

Missing data are common in studies that rely on multiple informant data to evaluate relationships among variables for distinguishable individuals clustered within groups. Estimation of structural equation models using raw data allows for incomplete data, and so all groups can be retained for analysis even if only 1 member of a group contributes…
Visualizing Confidence in Cluster-Based Ensemble Weather Forecast Analyses.

PubMed

Kumpf, Alexander; Tost, Bianca; Baumgart, Marlene; Riemer, Michael; Westermann, Rudiger; Rautenhaus, Marc

2018-01-01

In meteorology, cluster analysis is frequently used to determine representative trends in ensemble weather predictions in a selected spatio-temporal region, e.g., to reduce a set of ensemble members to simplify and improve their analysis. Identified clusters (i.e., groups of similar members), however, can be very sensitive to small changes of the selected region, so that clustering results can be misleading and bias subsequent analyses. In this article, we, a team of visualization scientists and meteorologists, deliver visual analytics solutions to analyze the sensitivity of clustering results with respect to changes of a selected region. We propose an interactive visual interface that enables simultaneous visualization of a) the variation in composition of identified clusters (i.e., their robustness), b) the variability in cluster membership for individual ensemble members, and c) the uncertainty in the spatial locations of identified trends. We demonstrate that our solution shows meteorologists how representative a clustering result is, and with respect to which changes in the selected region it becomes unstable. Furthermore, our solution helps to identify those ensemble members which stably belong to a given cluster and can thus be considered similar. In a real-world application case we show how our approach is used to analyze the clustering behavior of different regions in a forecast of "Tropical Cyclone Karl", guiding the user towards the cluster robustness information required for subsequent ensemble analysis.
[Multimorbidity patterns in young adults in Catalonia: an analysis of clusters].

PubMed

Violán, Concepción; Foguet-Boreu, Quintí; Roso-Llorach, Albert; Rodriguez-Blanco, Teresa; Pons-Vigués, Mariona; Pujol-Ribera, Enriqueta; Valderas, Jose M

2016-01-01

The aim of this study was to identify multimorbidity patterns in patients aged 19 to 44 years attended in primary care in Catalonia in 2010. Cross-sectional study. 251 primary care centres. 530,798 people with multimorbidity, aged 19 to 44 years. Multimorbidity was defined as the coexistence of two or more International Classification of Diseases (ICD-10) diagnoses registered in the electronic health record. Multimorbidity patterns were identified using hierarchical cluster analysis, stratified by sex and age group (19-24 and 25-44). Of the 882,708 people in the initial population, 530,798 (60.1%) met the multimorbidity criterion. Mean age was 33.0 years (SD: 7.0) and 53.3% were women. Multimorbidity was higher in the 25- to 44-year-old group than in the younger group (60.5 vs. 58.1%, p<0.001), and higher in women. The most prevalent cluster in all groups included, among others, dental caries, smoking, dorsalgia, common cold and other anxiety disorders. In both sexes, the cardiovascular-endocrine-metabolic pattern (obesity, lipid disorders and arterial hypertension) appeared in the 25- to 44-year-old group. Multimorbidity affects more than half of persons aged 19 to 44 years. The most prevalent cluster is formed by grouping common diseases (dental caries, common cold, smoking, anxiety disorders and dorsalgias). Another pattern to highlight is the cardiovascular-endocrine-metabolic pattern in the 25- to 44-year-old group. Knowledge of patterns of multimorbidity in young adults could be used to design individualized preventive strategies. Copyright © 2015 Elsevier España, S.L.U. All rights reserved.
Genotyping and chlorine-resistance of Methylobacterium aquaticum isolated from water samples in Japan.

PubMed

Furuhata, Katsunori; Banzai, Azusa U; Kawakami, Yasushi; Ishizaki, Naoto; Yoshida, Yoshihiro; Goto, Keiichi; Fukuyama, Masafumi

2011-09-01

For microbial ecological analysis, 14 strains of Methylobacterium aquaticum isolated from water samples were subjected to clustering analysis on the basis of ribotyping and RAPD-PCR tests. The ribopatterns after digestion with EcoRI obtained from 14 strains of M. aquaticum were used to divide the strains into two groups (Groups I and II) with a similarity of 55%. From the analysis of RAPD patterns using primer 208, the 14 strains were divided into 3 groups (A-C) based on a homology of 45% or greater, and from that using primer 272, there were 4 groups (A-D) based on a homology of 50% or greater. The chlorine resistance (99.9% CT values) of these isolates was also experimentally confirmed, and we attempted to define the connection between chlorine resistance and the geno-cluster. The average CT value of group I was 0.89 mg•min/l and the average of group II was 0.69 mg•min/l. No remarkable differences in the CT values for the groups were found.
ADPROCLUS: a graphical user interface for fitting additive profile clustering models to object by variable data matrices.

PubMed

Wilderjans, Tom F; Ceulemans, Eva; Van Mechelen, Iven; Depril, Dirk

2011-03-01

In many areas of psychology, one is interested in disclosing the underlying structural mechanisms that generated an object by variable data set. Often, based on theoretical or empirical arguments, it may be expected that these underlying mechanisms imply that the objects are grouped into clusters that are allowed to overlap (i.e., an object may belong to more than one cluster). In such cases, analyzing the data with Mirkin's additive profile clustering model may be appropriate. In this model: (1) each object may belong to no, one or several clusters, (2) there is a specific variable profile associated with each cluster, and (3) the scores of the objects on the variables can be reconstructed by adding the cluster-specific variable profiles of the clusters the object in question belongs to. Until now, however, no software program has been publicly available to perform an additive profile clustering analysis. For this purpose, in this article, the ADPROCLUS program, steered by a graphical user interface, is presented. We further illustrate its use by means of the analysis of a patient by symptom data matrix.
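The reconstruction rule in point (3) can be made concrete with a small sketch: each object's model scores are the sum of the profiles of the clusters it belongs to. The patient-by-symptom numbers, the variable names, and the squared-error helper below are invented for illustration, and the fitting procedure used by the ADPROCLUS program is not shown.

```python
def reconstruct(memberships, profiles):
    """Model scores: for each object, add up the profiles of its clusters."""
    n_vars = len(profiles[0])
    return [
        [sum(a * profile[j] for a, profile in zip(row, profiles)) for j in range(n_vars)]
        for row in memberships
    ]

def squared_error(observed, fitted):
    """Sum of squared differences between observed and reconstructed scores."""
    return sum(
        (o - f) ** 2
        for obs_row, fit_row in zip(observed, fitted)
        for o, f in zip(obs_row, fit_row)
    )

# Hypothetical example: 4 patients, 2 overlapping clusters, 3 symptoms.
# A row of memberships may contain no, one, or several 1s.
memberships = [[1, 0], [0, 1], [1, 1], [0, 0]]
profiles = [[2, 0, 1], [0, 3, 1]]  # one symptom profile per cluster
observed = [[2, 0, 1], [0, 3, 1], [2, 3, 3], [0, 1, 0]]

fitted = reconstruct(memberships, profiles)
loss = squared_error(observed, fitted)
```

The third patient belongs to both clusters, so its fitted row is the element-wise sum of the two profiles; the fourth belongs to none and is reconstructed as all zeros.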
Model-based clustering for RNA-seq data.

PubMed

Si, Yaqing; Liu, Peng; Li, Pinghua; Brutnell, Thomas P

2014-01-15

RNA-seq technology has been widely adopted as an attractive alternative to microarray-based methods to study global gene expression. However, robust statistical tools to analyze these complex datasets are still lacking. By grouping genes with similar expression profiles across treatments, cluster analysis provides insight into gene functions and networks, and hence is an important technique for RNA-seq data analysis. In this manuscript, we derive clustering algorithms based on appropriate probability models for RNA-seq data. An expectation-maximization algorithm and another two stochastic versions of expectation-maximization algorithms are described. In addition, a strategy for initialization based on likelihood is proposed to improve the clustering algorithms. Moreover, we present a model-based hybrid-hierarchical clustering method to generate a tree structure that allows visualization of relationships among clusters as well as flexibility of choosing the number of clusters. Results from both simulation studies and analysis of a maize RNA-seq dataset show that our proposed methods provide better clustering results than alternative methods such as the K-means algorithm and hierarchical clustering methods that are not based on probability models. An R package, MBCluster.Seq, has been developed to implement our proposed algorithms. This R package provides fast computation and is publicly available at http://www.r-project.org
Personalized Medicine in Veterans with Traumatic Brain Injuries

DTIC Science & Technology

2013-05-01

Pair-Group Method using Arithmetic averages (UPGMA) based on cosine correlation of row mean centered log2 signal values; this was the top 50%-tile...clustering was performed by the UPGMA method using Cosine correlation as the similarity metric. For comparative purposes, clustered heat maps included...non-mTBI cases were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity
Personalized Medicine in Veterans with Traumatic Brain Injuries

DTIC Science & Technology

2014-07-01

9 control cases are subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity...in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on cosine correlation of row...of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity
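As a rough illustration of the technique named in the excerpt above, the following sketch runs UPGMA (average-linkage) clustering with a cosine-based distance. It is not the report's pipeline: the log2 transformation and row mean centering are omitted, the input vectors are made up, and the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def upgma(vectors):
    """Agglomerative clustering with unweighted average linkage.

    Returns the merge history as (members_a, members_b, distance) tuples,
    where members are tuples of row indices into `vectors`.
    """
    clusters = [(i,) for i in range(len(vectors))]
    merges = []

    def average_distance(a, b):
        # UPGMA distance: mean over all cross-cluster pairs of original items.
        total = sum(cosine_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        merges.append((a, b, average_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Made-up profiles: rows 0 and 1 point one way, rows 2 and 3 another.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.2, 0.8]]
merges = upgma(vectors)
```

The merge history is the information a dendrogram draws. The brute-force search over all cluster pairs keeps the sketch short but is only practical for small inputs.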
Molecular reclassification of Crohn's disease: a cautionary note on population stratification.

PubMed

Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

2013-01-01

Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn's disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn's disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals.
Molecular Reclassification of Crohn’s Disease: A Cautionary Note on Population Stratification

PubMed Central

Maus, Bärbel; Jung, Camille; Mahachie John, Jestinah M.; Hugot, Jean-Pierre; Génin, Emmanuelle; Van Steen, Kristel

2013-01-01

Complex human diseases commonly differ in their phenotypic characteristics, e.g., Crohn’s disease (CD) patients are heterogeneous with regard to disease location and disease extent. The genetic susceptibility to Crohn’s disease is widely acknowledged and has been demonstrated by identification of over 100 CD associated genetic loci. However, relating CD subphenotypes to disease susceptible loci has proven to be a difficult task. In this paper we discuss the use of cluster analysis on genetic markers to identify genetic-based subgroups while taking into account possible confounding by population stratification. We show that it is highly relevant to consider the confounding nature of population stratification in order to avoid that detected clusters are strongly related to population groups instead of disease-specific groups. Therefore, we explain the use of principal components to correct for population stratification while clustering affected individuals into genetic-based subgroups. The principal components are obtained using 30 ancestry informative markers (AIM), and the first two PCs are determined to discriminate between continental origins of the affected individuals. Genotypes on 51 CD associated single nucleotide polymorphisms (SNPs) are used to perform latent class analysis, hierarchical and Partitioning Around Medoids (PAM) cluster analysis within a sample of affected individuals with and without the use of principal components to adjust for population stratification. It is seen that without correction for population stratification clusters seem to be influenced by population stratification while with correction clusters are unrelated to continental origin of individuals. PMID:24147066

Evaluation of genetic diversity in jackfruit (Artocarpus heterophyllus Lam.) based on amplified fragment length polymorphism markers.

PubMed

Shyamalamma, S; Chandra, S B C; Hegde, M; Naryanswamy, P

2008-07-22

Artocarpus heterophyllus Lam., commonly called jackfruit, is a medium-sized evergreen tree that bears high yields of the largest known edible fruit. Yet, it has been little explored commercially due to wide variation in fruit quality. The genetic diversity and genetic relatedness of 50 jackfruit accessions were studied using amplified fragment length polymorphism markers. Of 16 primer pairs evaluated, eight were selected for screening of genotypes based on the number and quality of polymorphic fragments produced. These primer combinations produced 5976 bands, 1267 (22%) of which were polymorphic. Among the jackfruit accessions, the similarity coefficient ranged from 0.137 to 0.978; the accessions also shared a large number of monomorphic fragments (78%). Cluster analysis and principal component analysis grouped all jackfruit genotypes into three major clusters. Cluster I included the genotypes grown in a jackfruit region of Karnataka, called Tamaka, with very dry conditions; cluster II contained the genotypes collected from locations having medium to heavy rainfall in Karnataka; cluster III grouped the genotypes in distant locations with different environmental conditions. Strong coincidence of these amplified fragment length polymorphism-based groupings with geographical localities as well as morphological characters was observed. We found moderate genetic diversity in these jackfruit accessions. This information should be useful for tree breeding programs, as part of our effort to popularize jackfruit as a commercial crop.
Classification and discrimination of pediatric patients undergoing open heart surgery with and without methylprednisolone treatment by cytomics

NASA Astrophysics Data System (ADS)

Bocsi, Jozsef; Mittag, Anja; Pierzchalski, Arkadiusz; Osmancik, Pavel; Dähnert, Ingo; Tárnok, Attila

2011-02-01

Introduction: Methylprednisolone (MP) is frequently administered preoperatively to children undergoing open heart surgery. The aim of this medication is to inhibit overshooting immune responses. Earlier studies demonstrated cellular and humoral immunological changes in pediatric patients undergoing heart surgery with and without MP administration. Here, in a retrospective study, we investigated the modulation of the cellular immune response by MP. The aim was to identify suitable parameters characterizing MP effects by cluster analysis. Methods: Blood samples were analysed from two age-matched groups undergoing surgical correction of septum defects. The group without MP treatment consisted of 10 patients; MP was administered to 21 patients (median dose: 11 mg/kg) before cardiopulmonary bypass (CPB). EDTA-anticoagulated blood was obtained 24 h preoperatively, after anesthesia, at the beginning of CPB and at its end (CPB2), 4 h, 24 h, and 48 h after surgery, at discharge, and at out-patient follow-up (8.2; 3.3-12.2 months after surgery; median and IQR). Flow cytometry showed the biggest MP-relevant changes at CPB2 and 4 h postoperatively; these time points were used for clustering analysis. Classification was made by discriminant analysis and cluster analysis by means of Genes@work software. Results & conclusion: 146 parameters were obtained from the analysis. Cross-validation revealed several parameters able to discriminate between MP groups and to identify immune modulation. MP administration resulted in delayed activation of monocytes, an increased ratio of neutrophils, and reduced T-lymphocyte counts. Cluster analysis demonstrated that classification of patients is possible based on the identified cytomics parameters. Further investigation of these parameters might help to understand the MP effects in pediatric open heart surgery.
An Assessment of the Condition of Coral Reefs off the Former Navy Bombing Ranges at Isla De Culebra and Isla De Vieques, Puerto Rico

DTIC Science & Technology

2005-04-01

Bray-Curtis distance measure with an Unweighted Pair Group Method with Arithmetic Averages (UPGMA) linkage method to perform a cluster analysis of the...59 35 Comparison of reef condition indicators clustering by UPGMA analysis...Polyvinyl Chloride RBD Red-band Disease SACEX Supporting Arms Coordination Exercise SAV Submerged Aquatic Vegetation SD Standard Deviation UPGMA
Genetic diversity of allozymes in turnip (Brassica rapa L. var. rapa) from the Nordic area.

PubMed

Persson, K; Fält, A S; von Bothmer, R

2001-01-01

Genetic diversity and relationships based on isozymes were studied in 31 accessions of turnip (Brassica rapa L. var. rapa). The material included varieties, elite stocks, landraces and older turnip of slash-and-burn type from the Nordic area. A total of 9 isozyme loci and 26 alleles were studied. The isozyme systems were ACO, DIA, GPI, GOT, PGM, PGD and SKD. The level of heterozygosity was reduced in the landraces, but it was high for the variety group 'Ostersundom'. Turnip has a higher genetic variation than other crops within B. rapa and than in other species with the same breeding system. The genetic diversity showed that 18.7% of the genetic variation was within the accessions, and the total H tau value was 0.358. Gpi-I and Pgd-I showed the lowest variation compared with the other loci. The cluster analysis revealed five clusters, with one main cluster including 25 of the 31 accessions. The dendrogram indicated that the variety group 'Ostersundom' clustered together whereas the variety group 'Bortfelder' was associated with country of origin. The landraces were spread in different clusters. The 'slash-and-burn' type of turnip belonged to two groups.
Clinical Implications of Cluster Analysis-Based Classification of Acute Decompensated Heart Failure and Correlation with Bedside Hemodynamic Profiles.

PubMed

Ahmad, Tariq; Desai, Nihar; Wilson, Francis; Schulte, Phillip; Dunning, Allison; Jacoby, Daniel; Allen, Larry; Fiuzat, Mona; Rogers, Joseph; Felker, G Michael; O'Connor, Christopher; Patel, Chetan B

2016-01-01

Classification of acute decompensated heart failure (ADHF) is based on subjective criteria that crudely capture disease heterogeneity. Improved phenotyping of the syndrome may help improve therapeutic strategies. We aimed to derive cluster analysis-based groupings for patients hospitalized with ADHF and to compare their prognostic performance to hemodynamic classifications derived at the bedside. We performed a cluster analysis on baseline clinical variables and PAC measurements of 172 ADHF patients from the ESCAPE trial. Employing regression techniques, we examined associations between clusters and clinically determined hemodynamic profiles (warm/cold/wet/dry). We assessed association with clinical outcomes using Cox proportional hazards models. Likelihood ratio tests were used to compare the prognostic value of cluster data to that of hemodynamic data. We identified four advanced HF clusters: 1) male Caucasians with ischemic cardiomyopathy, multiple comorbidities, lowest B-type natriuretic peptide (BNP) levels; 2) females with non-ischemic cardiomyopathy, few comorbidities, most favorable hemodynamics; 3) young African American males with non-ischemic cardiomyopathy, most adverse hemodynamics, advanced disease; and 4) older Caucasians with ischemic cardiomyopathy, concomitant renal insufficiency, highest BNP levels. There was no association between clusters and bedside-derived hemodynamic profiles (p = 0.70). For all adverse clinical outcomes, Cluster 4 had the highest risk, and Cluster 2, the lowest. Compared to Cluster 4, Clusters 1-3 had 45-70% lower risk of all-cause mortality. Clusters were significantly associated with clinical outcomes, whereas hemodynamic profiles were not. By clustering patients with similar objective variables, we identified four clinically relevant phenotypes of ADHF patients, with no discernible relationship to hemodynamic profiles, but distinct associations with adverse outcomes. Our analysis suggests that ADHF classification using simultaneous considerations of etiology, comorbid conditions, and biomarker levels may be superior to bedside classifications.
Biochemical characterization and phylogenetic analysis based on 16S rRNA sequences for V-factor dependent members of Pasteurellaceae derived from laboratory rats.

PubMed

Hayashimoto, Nobuhito; Ueno, Masami; Tkakura, Akira; Itoh, Toshio

2007-06-01

Phylogenetic analysis based on 16S rRNA sequences with sequence data of some bacterial species of Pasteurellaceae related to rodents deposited in GenBank was performed along with biochemical characterization for the 20 strains of V-factor dependent members of Pasteurellaceae derived from laboratory rats to obtain basic information and to investigate the taxonomic positions. The results of biochemical tests for all strains were identical except for three tests, the ornithine decarboxylase test, and fermentation tests of D(+) mannose and D(+) xylose. The biochemical properties of 8 of 20 strains that showed negative results for the fermentation test of D(+) xylose agreed with those of Haemophilus parainfluenzae complex. By phylogenetic analysis, the strains were divided into two clusters that agreed with the results of the fermentation test of xylose (group I: negative reaction for xylose, group II: positive reaction for xylose). The clusters were independent of other bacterial species of Pasteurellaceae tested. The sequences of the strains in group I showed 99.7-99.8% similarity and the strains in group II showed 99.3-99.7% similarity. None of the strains in group I had a close relation with Haemophilus parainfluenzae by phylogenetic analysis, although they showed the same biochemical properties. In conclusion, the strains had characteristic biochemical properties and formed two independent groups within the "rodent cluster" of Pasteurellaceae that differed in the results of the fermentation test of xylose. Therefore, they seemed to be hitherto undescribed taxa in Pasteurellaceae.
Floral and Vegetative Morphometrics of Five Pleurothallis (Orchidaceae) Species: Correlation with Taxonomy, Phylogeny, Genetic Variability and Pollination Systems

PubMed Central

BORBA, EDUARDO L.; SHEPHERD, GEORGE J.; BERG, CÁSSIO VAN DEN; SEMIR, JOÃO

2002-01-01

Morphometric analyses of vegetative and floral characters were conducted in 21 populations of five Pleurothallis (Orchidaceae) species occurring in Brazilian ‘campo rupestre’ vegetation. A phylogenetic analysis of this species group was also carried out using nuclear ribosomal DNA internal transcribed spacers (ITS1 and ITS2). Results of the ordination and cluster analyses agree with species’ delimitation revealed by taxonomic and allozyme studies. The groups formed in ordination analysis correspond to the pollinator groups determined in a previous pollination study. Relationships among the species in the cluster analysis using only vegetative characters are similar to those found in a previous allozyme study, but those indicated by cluster analysis using only floral characters differ. These results support the hypothesis that floral similarities are due to convergence driven by similar pollination mechanisms, and therefore floral traits may not be good indicators of phylogenetic relationships in this group. The results of the phylogenetic analysis support this conclusion to some extent. There is no correlation between genetic (allozyme) and morphological variability in the populations nor in the way this variability is distributed among conspecific populations. We describe a new subspecies of Pleurothallis ochreata based on differences in vegetative and chemical characters as well as geographic distribution. Absence of differentiation in floral characters, attraction of the same pollinator species, interfertility and genetic similarity support the argument for subspecific rather than specific status. PMID:12197519
Identifying influential individuals on intensive care units: using cluster analysis to explore culture.

PubMed

Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson

2017-07-01

The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.
Measuring Systemic and Climate Diversity in Ontario's University Sector

ERIC Educational Resources Information Center

Piché, Pierre Gilles

2015-01-01

This article proposes a methodology for measuring institutional diversity and applies it to Ontario's university sector. This study first used hierarchical cluster analysis, which suggested there has been very little change in diversity between 1994 and 2010 as universities were clustered in three groups for both years. However, by adapting…
A Cluster Analytic Study of Clinical Orientations among Chemical Dependency Counselors.

ERIC Educational Resources Information Center

Thombs, Dennis L.; Osborn, Cynthia J.

2001-01-01

Three distinct clinical orientations were identified in a sample of chemical dependency counselors (N=406). Based on cluster analysis, the largest group, identified and labeled as "uniform counselors," endorsed a simple, moral-disease model with little interest in psychosocial interventions. (Contains 50 references and 4 tables.) (GCP)
SOMFlow: Guided Exploratory Cluster Analysis with Self-Organizing Maps and Analytic Provenance.

PubMed

Sacha, Dominik; Kraus, Matthias; Bernard, Jurgen; Behrisch, Michael; Schreck, Tobias; Asano, Yuki; Keim, Daniel A

2018-01-01

Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date; however, arriving at useful clusterings often requires several rounds of user interactions to fine-tune the data preprocessing and algorithms. We present a multi-stage Visual Analytics (VA) approach for iterative cluster refinement together with an implementation (SOMFlow) that uses Self-Organizing Maps (SOM) to analyze time series data. It supports exploration by offering the analyst a visual platform to analyze intermediate results, adapt the underlying computations, iteratively partition the data, and reflect on previous analytical activities. The history of previous decisions is explicitly visualized within a flow graph, allowing the analyst to compare earlier cluster refinements and to explore relations. We further leverage quality and interestingness measures to guide the analyst in the discovery of useful patterns, relations, and data partitions. We conducted two pair analytics experiments together with a subject matter expert in speech intonation research to demonstrate that the approach is effective for interactive data analysis, supporting enhanced understanding of clustering results as well as the interactive process itself.
Complete characterization of the stability of cluster synchronization in complex dynamical networks.

PubMed

Sorrentino, Francesco; Pecora, Louis M; Hagerstrom, Aaron M; Murphy, Thomas E; Roy, Rajarshi

2016-04-01

Synchronization is an important and prevalent phenomenon in natural and engineered systems. In many dynamical networks, the coupling is balanced or adjusted to admit global synchronization, a condition called Laplacian coupling. Many networks exhibit incomplete synchronization, where two or more clusters of synchronization persist, and computational group theory has recently proved to be valuable in discovering these cluster states based on the topology of the network. In the important case of Laplacian coupling, additional synchronization patterns can exist that would not be predicted from the group theory analysis alone. Understanding how and when clusters form, merge, and persist is essential for understanding collective dynamics, synchronization, and failure mechanisms of complex networks such as electric power grids, distributed control networks, and autonomous swarming vehicles. We describe a method to find and analyze all of the possible cluster synchronization patterns in a Laplacian-coupled network, by applying methods of computational group theory to dynamically equivalent networks. We present a general technique to evaluate the stability of each of the dynamically valid cluster synchronization patterns. Our results are validated in an optoelectronic experiment on a five-node network that confirms the synchronization patterns predicted by the theory.
Stalking: developing an empirical typology to classify stalkers.

PubMed

Del Ben, Kevin; Fremouw, W

2002-01-01

Stalking has received a great deal of attention from the media and its harmful effects on victims have been well documented. Stalking is also more common than previously thought, leading researchers to classify stalkers into groups in an attempt to predict future behavior. Previous research has grouped stalkers based on theoretical models rather than trying to empirically examine stalking behaviors along with other factors such as motivation, type of relationship, and attachment style in determining a typology of stalkers. Female college students (N = 108) who had experienced stalking behaviors responded to questions regarding their perceptions of those behaviors. First, these victim perceptions were factor analyzed. Then, cluster analysis grouped those factors to produce a four-cluster typology of stalkers. Cluster 1 (Harmless) appeared to reflect a more casual, less jealous pattern of behavior. Cluster 2 (Low Threat) appeared the least likely to become physically violent or threatening, or to engage in illegal behaviors. Cluster 3 (Violent Criminal) appeared to be the most likely to engage in physically threatening and illegal behaviors. Cluster 4 (High Threat) was characterized by a more serious type of relationship and may attempt to be more restrictive of their partner when first meeting them.
[Difficulties in emotion regulation and personal distress in young adults with social anxiety].

PubMed

Contardi, Anna; Farina, Benedetto; Fabbricatore, Mariantonietta; Tamburello, Stella; Scapellato, Paolo; Penzo, Ilaria; Tamburello, Antonino; Innamorati, Marco

2013-01-01

The aim of this study was to assess the association between social anxiety and difficulties in emotion regulation in a sample of Italian young adults. Our convenience sample was composed of 298 Italian young adults (184 women and 114 men) aged 18-34 years. Participants were administered the Interaction Anxiousness Scale (IAS), the Audience Anxiousness Scale (AAS), the Difficulties in Emotion Regulation Scale (DERS), and the Interpersonal Reactivity Index (IRI). A Two Step cluster analysis was used to group subjects according to their level of social anxiety. The cluster analysis indicated a two-cluster solution. The first cluster included 163 young adults with higher scores on the AAS and the IAS than those included in cluster 2 (n=135). A generalized linear model with groups as dependent variable indicated that people with higher social anxiety (compared to those with lower social anxiety) have higher scores on the dimension personal distress of the IRI (p<0.01), and on the DERS non acceptance of negative emotions (p<0.001) and lack of emotional clarity (p<0.05). The results are consistent with models of psychopathology, which hypothesize that people who cannot deal effectively with their emotions may develop depressive and anxious disorders.
The Psychology of Yoga Practitioners: A Cluster Analysis.

PubMed

Genovese, Jeremy E C; Fondran, Kristine M

2017-11-01

Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.
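For readers unfamiliar with the Kruskal-Wallis test cited above, the following is a minimal sketch of how its H statistic is computed from rank sums. The data are invented for illustration and are not the study's; the function names are hypothetical, and the correction for tied ranks is deliberately left out.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    pooled = [x for group in groups for x in group]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    weighted = 0.0
    offset = 0
    for group in groups:
        rank_sum = sum(ranks[offset:offset + len(group)])
        weighted += rank_sum ** 2 / len(group)
        offset += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3.0 * (n_total + 1)


# Invented years-of-practice values for three hypothetical clusters.
years_by_cluster = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_statistic = kruskal_wallis_h(years_by_cluster)
```

Under the null hypothesis, H is approximately chi-square distributed with (number of groups minus 1) degrees of freedom, which is how a p-value such as the one reported would be obtained.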
The Psychology of Yoga Practitioners: A Cluster Analysis.

PubMed

Genovese, Jeremy E C; Fondran, Kristine M

2017-03-30

Yoga practitioners (N = 261) completed the revised Expression of Spirituality Inventory (ESI) and the Multidimensional Body-Self Relations Questionnaire. Cluster analysis revealed three clusters: Cluster A scored high on all four spiritual constructs. They had high positive evaluations of their appearance, but a lower orientation towards their appearance. They tended to have a high evaluation of their fitness and health, and higher body satisfaction. Cluster B showed lower scores on the spiritual constructs. Like Cluster A, members of Cluster B tended to show high positive evaluations of appearance and fitness. They also had higher body satisfaction. Members of Cluster B had a higher fitness orientation and a higher appearance orientation than members of Cluster A. Members of Cluster C had low scores for all spiritual constructs. They had a low evaluation of, and unhappiness with, their appearance. They were unhappy with the size and appearance of their bodies. They tended to see themselves as overweight. There was a significant difference in years of practice between the three groups (Kruskal-Wallis, p = .0041). Members of Cluster A have the most years of yoga experience and members of Cluster B have more yoga experience than members of Cluster C. These results suggest the possible existence of a developmental trajectory for yoga practitioners. Such a developmental sequence may have important implications for yoga practice and instruction.
Determining the Optimal Number of Clusters with the Clustergram

NASA Technical Reports Server (NTRS)

Fluegemann, Joseph K.; Davies, Misty D.; Aguirre, Nathan D.

2011-01-01

Cluster analysis aids research in many different fields, from business to biology to aerospace. It consists of using statistical techniques to group objects in large sets of data into meaningful classes. However, this process of ordering data points presents much uncertainty because it involves several steps, many of which are subject to researcher judgment as well as inconsistencies depending on the specific data type and research goals. These steps include the method used to cluster the data, the variables on which the cluster analysis will be operating, the number of resulting clusters, and parts of the interpretation process. In most cases, the number of clusters must be guessed or estimated before employing the clustering method. Many remedies have been proposed, but none is unassailable and certainly not for all data types. Thus, the aim of current research for better techniques of determining the number of clusters is generally confined to demonstrating that the new technique excels other methods in performance for several disparate data types. Our research makes use of a new cluster-number-determination technique based on the clustergram: a graph that shows how the number of objects in the cluster and the cluster mean (the ordinate) change with the number of clusters (the abscissa). We use the features of the clustergram to make the best determination of the cluster-number.
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques

NASA Astrophysics Data System (ADS)

Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein

2017-10-01

Groundwater samples from the Rapur area were collected from different sites to evaluate the major ion chemistry. A large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness in classifying and identifying the geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of the water quality of a study area. From the PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of the similarity of their water quality.
Clustering change patterns using Fourier transformation with time-course gene expression data.

PubMed

Kim, Jaehee

2011-01-01

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a period of time because biologically related gene groups can share the same change patterns. In this study, the problem of finding similar change patterns is reduced to clustering the derivative Fourier coefficients. This work is aimed at discovering gene groups with similar change patterns which share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. We applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization. It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns.
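As a hedged illustration of what "derivative Fourier coefficients" can mean, the worked equations below differentiate a truncated Fourier series term by term. The symbols (period T, truncation order K, coefficients a_k and b_k) are assumptions introduced here; the abstract does not state the exact parametrization or the smoothing used in the paper.

```latex
% Truncated Fourier series for one gene's expression profile (assumed form)
x(t) = a_0 + \sum_{k=1}^{K} \bigl[ a_k \cos(k\omega t) + b_k \sin(k\omega t) \bigr],
\qquad \omega = \frac{2\pi}{T}

% Term-by-term derivative
x'(t) = \sum_{k=1}^{K} k\omega \bigl[ b_k \cos(k\omega t) - a_k \sin(k\omega t) \bigr]

% Hence the Fourier coefficients of the derivative are
a'_k = k\omega\, b_k, \qquad b'_k = -k\omega\, a_k, \qquad k = 1, \dots, K
```

Because the constant term a_0 vanishes on differentiation, two genes whose profiles differ only by a baseline shift receive the same vector (a'_1, b'_1, ..., a'_K, b'_K), which is one reason such coefficients are a natural input for clustering change patterns rather than expression levels.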
Genetic diversity of red-grained rice landraces in Hani's terraced fields based on phenotypic characteristics

NASA Astrophysics Data System (ADS)

Zhou, Xiaomei; Zheng, Yun; Zhang, Tingting; Zhang, Xiaoqian; Ma, Mengli; Meng, Hengling; Wang, Tiantao; Lu, Bingyue

2018-06-01

In order to provide useful information for protection and utilization of red-grained rice landraces from Hani's terraced fields, the phenotypic diversity of 61 red-grained rice landraces was assessed based on 20 quantitative traits. The results indicated that the phenotypic diversity was abundant in red-grained rice landraces. Coefficients of variation (CV) ranged from 4.878% to 72.878%; the largest CV was for panicle neck length and the smallest for grain width. The Shannon-Weaver diversity index (H') of the 20 traits ranged from 1.464 to 2.165; the largest and the smallest H' values were observed in filled grain number and chalkiness, respectively. Cluster analysis based on the unweighted pair group method showed that the 61 red-grained rice landraces grouped into eight clusters at a cut-off value of 6.2631. The first cluster included 11 landraces, the main cluster II involved 42 landraces, and cluster IV included 3 landraces. Laopinzhonghongmi, Chena2, Laojingnuo, Bianhao6 and Baimi were separated from the main clusters.
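The two diversity statistics named in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the CV uses the sample standard deviation, H' is computed from class labels (the abstract does not say how quantitative traits were binned into classes), and the trait values below are hypothetical.

```python
import math
from collections import Counter


def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return 100.0 * math.sqrt(variance) / mean


def shannon_weaver(classes):
    """H' = -sum(p_i * ln(p_i)) over the observed class frequencies."""
    counts = Counter(classes)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


# Hypothetical measurements for one trait, and hypothetical class labels
# obtained by binning another trait (the binning rule is an assumption).
grain_width_mm = [2.9, 3.0, 3.1, 3.0, 3.0]
chalkiness_class = ["low", "low", "mid", "mid", "high", "high"]

print(round(coefficient_of_variation(grain_width_mm), 3))
print(round(shannon_weaver(chalkiness_class), 3))
```

H' is largest when landraces are spread evenly over the classes and falls to zero when all of them share one class, which is why it is read as a diversity measure.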

Sequence determination and analysis of S-adenosyl-L-homocysteine hydrolase from yellow lupine (Lupinus luteus).

PubMed

Brzeziński, K; Janowski, R; Podkowiński, J; Jaskólski, M

2001-01-01

The coding sequences of two S-adenosyl-L-homocysteine hydrolases (SAHases) were identified in yellow lupine by screening of a cDNA library. One of them, corresponding to the complete protein, was sequenced and compared with 52 other SAHase sequences. Phylogenetic analysis of these proteins identified three groups of the enzymes. Group A comprises only bacterial sequences. Group B is subdivided into two subgroups, one of which (B1) is formed by animal sequences. Subgroup B2 consists of two distinct clusters, B2a and B2b. Cluster B2b comprises all known plant sequences, including the yellow lupine enzyme, which are distinguished by a 50-residue insert. Group C is heterogeneous and contains SAHases from Archaea as well as a new class of animal enzymes, distinctly different from those in group B1.
Inference With Difference-in-Differences With a Small Number of Groups: A Review, Simulation Study, and Empirical Application Using SHARE Data.

PubMed

Rokicki, Slawa; Cohen, Jessica; Fink, Günther; Salomon, Joshua A; Landrum, Mary Beth

2018-01-01

Difference-in-differences (DID) estimation has become increasingly popular as an approach to evaluate the effect of a group-level policy on individual-level outcomes. Several statistical methodologies have been proposed to correct for the within-group correlation of model errors resulting from the clustering of data. Little is known about how well these corrections perform with the often small number of groups observed in health research using longitudinal data. First, we review the most commonly used modeling solutions in DID estimation for panel data, including generalized estimating equations (GEE), permutation tests, clustered standard errors (CSE), wild cluster bootstrapping, and aggregation. Second, we compare the empirical coverage rates and power of these methods using a Monte Carlo simulation study in scenarios in which we vary the degree of error correlation, the group size balance, and the proportion of treated groups. Third, we provide an empirical example using the Survey of Health, Ageing, and Retirement in Europe. When the number of groups is small, CSE are systematically biased downwards in scenarios when data are unbalanced or when there is a low proportion of treated groups. This can result in over-rejection of the null even when data are composed of up to 50 groups. Aggregation, permutation tests, bias-adjusted GEE, and wild cluster bootstrap produce coverage rates close to the nominal rate for almost all scenarios, though GEE may suffer from low power. In DID estimation with a small number of groups, analysis using aggregation, permutation tests, wild cluster bootstrap, or bias-adjusted GEE is recommended.
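Two of the recommended approaches, aggregation and permutation testing, can be combined in a short sketch. It assumes a simple two-period design and hypothetical data; the function names are illustrative and this is not the simulation code used in the review. Individual outcomes are first collapsed to one post-minus-pre change per group, and the treated labels are then permuted across groups.

```python
from itertools import combinations
from statistics import mean


def group_changes(records):
    """Aggregate to one post-minus-pre mean change per group.

    records: iterable of (group_id, period, outcome), period in {"pre", "post"}.
    """
    by_group = {}
    for group_id, period, outcome in records:
        by_group.setdefault(group_id, {"pre": [], "post": []})[period].append(outcome)
    return {g: mean(v["post"]) - mean(v["pre"]) for g, v in by_group.items()}


def did_estimate(changes, treated):
    """Mean change in treated groups minus mean change in control groups."""
    treated_changes = [c for g, c in changes.items() if g in treated]
    control_changes = [c for g, c in changes.items() if g not in treated]
    return mean(treated_changes) - mean(control_changes)


def permutation_p_value(changes, treated):
    """Exact two-sided p-value over all relabelings with the same number treated."""
    observed = abs(did_estimate(changes, treated))
    groups = sorted(changes)
    stats = [abs(did_estimate(changes, set(combo)))
             for combo in combinations(groups, len(treated))]
    # Small tolerance so the observed labeling always counts itself.
    return sum(s >= observed - 1e-12 for s in stats) / len(stats)


# Hypothetical data: six groups, two individuals per group and period.
true_change = {"A": 5, "B": 4, "C": 1, "D": 0, "E": 1, "F": -1}
records = []
for g, c in true_change.items():
    records += [(g, "pre", 10), (g, "pre", 12), (g, "post", 10 + c), (g, "post", 12 + c)]

changes = group_changes(records)
print(did_estimate(changes, {"A", "B"}))
print(permutation_p_value(changes, {"A", "B"}))
```

Working at the group level means the unit of analysis matches the unit of treatment assignment, so within-group error correlation no longer inflates the apparent sample size. With only six groups and two treated there are just 15 possible relabelings, which caps how small the p-value can be and illustrates the power limits of very small designs.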
The Multiple Faces of Non-Cystic Fibrosis Bronchiectasis. A Cluster Analysis Approach.

PubMed

Martínez-García, Miguel Á; Vendrell, Montserrat; Girón, Rosa; Máiz-Carro, Luis; de la Rosa Carrillo, David; de Gracia, Javier; Olveira, Casilda

2016-09-01

The clinical presentation and prognosis of non-cystic fibrosis bronchiectasis are both very heterogeneous. To identify different clinical phenotypes for non-cystic fibrosis bronchiectasis and their impact on prognosis. Using a standardized protocol, we conducted a multicenter observational cohort study at six Spanish centers with patients diagnosed with non-cystic fibrosis bronchiectasis before December 31, 2005, with a 5-year follow-up from the bronchiectasis diagnosis. A cluster analysis was used to classify the patients into homogeneous groups by means of significant variables corresponding to different aspects of bronchiectasis (clinical phenotypes): age, sex, body mass index, smoking habit, dyspnea, macroscopic appearance of sputum, number of exacerbations, chronic colonization with Pseudomonas aeruginosa, FEV1, number of pulmonary lobes affected, idiopathic bronchiectasis, and associated chronic obstructive pulmonary disease. Survival analysis (Kaplan-Meier method and log-rank test) was used to evaluate the comparative survival of the different subgroups. A total of 468 patients with a mean age of 63 (15.9) years were analyzed. Of these, 58% were females, 39.7% had idiopathic bronchiectasis, and 29.3% presented with chronic Pseudomonas aeruginosa colonization. Cluster analysis showed four clinical phenotypes: (1) younger women with mild disease, (2) older women with mild disease, (3) older patients with severe disease who had frequent exacerbations, and (4) older patients with severe disease who did not have frequent exacerbations. The follow-up period was 54 months, during which there were 95 deaths. Mortality was low in the first and second groups (3.9% and 7.6%, respectively) and high for the third (37%) and fourth (40.8%) groups. The third cluster had a higher proportion of respiratory deaths than the fourth (77.8% vs. 34.4%; P < 0.001). 
Using cluster analysis, it is possible to separate patients with bronchiectasis into distinct clinical phenotypes with different prognoses.
An Efficient Data Compression Model Based on Spatial Clustering and Principal Component Analysis in Wireless Sensor Networks.

PubMed

Yin, Yihang; Liu, Fengzheng; Zhou, Xiang; Li, Quanzhong

2015-08-07

Wireless sensor networks (WSNs) have been widely used to monitor the environment, and sensors in WSNs are usually power constrained. Because inner-node communication consumes most of the power, efficient data compression schemes are needed to reduce the data transmission to prolong the lifetime of WSNs. In this paper, we propose an efficient data compression model to aggregate data, which is based on spatial clustering and principal component analysis (PCA). First, sensors with a strong temporal-spatial correlation are grouped into one cluster for further processing with a novel similarity measure metric. Next, sensor data in one cluster are aggregated in the cluster head sensor node, and an efficient adaptive strategy is proposed for the selection of the cluster head to conserve energy. Finally, the proposed model applies principal component analysis with an error bound guarantee to compress the data and retain the definite variance at the same time. Computer simulations show that the proposed model can greatly reduce communication and obtain a lower mean square error than other PCA-based algorithms.
The NGC 4839 group falling into the Coma cluster observed by XMM-Newton

NASA Astrophysics Data System (ADS)

Neumann, D. M.; Arnaud, M.; Gastaud, R.; Aghanim, N.; Lumb, D.; Briel, U. G.; Vestrand, W. T.; Stewart, G. C.; Molendi, S.; Mittaz, J. P. D.

2001-01-01

We present here the first analysis of the XMM-Newton EPIC-MOS data of the galaxy group around NGC 4839, which lies at a projected distance to the Coma cluster center of 1.6 h_50^-1 Mpc. In our analysis, which includes imaging, spectro-imaging and spectroscopy we find compelling evidence for the subgroup being on its first infall onto the Coma cluster. The complex temperature structure around NGC 4839 is consistent with simulations of galaxies falling into a cluster environment. We see indications of a bow shock and of ram pressure stripping around NGC 4839. Furthermore our data reveal a displacement between NGC 4839 and the center of the hot gas in the group of about 300 h_50^-1 kpc. With a simple approximation we can explain this displacement by the pressure force originating from the infall, which acts much stronger on the group gas than on the galaxies. Based on observations obtained with XMM-Newton, an ESA science mission with instruments and contributions directly funded by ESA Member States and the USA (NASA). EPIC was developed by the EPIC Consortium led by the Principal Investigator, Dr. M. J. L. Turner. The consortium comprises the following Institutes: University of Leicester, University of Birmingham, (UK); CEA/Saclay, IAS Orsay, CESR Toulouse, (France); IAAP Tuebingen, MPE Garching, (Germany); IFC Milan, ITESRE Bologna, IAUP Palermo, Italy. EPIC is funded by: PPARC, CEA, CNES, DLR and ASI.
Korean immigrants' knowledge of heart attack symptoms and risk factors.

PubMed

Hwang, Seon Y; Ryan, Catherine J; Zerwic, Julie Johnson

2008-02-01

This study assessed the knowledge of heart attack symptoms and risk factors in a convenience sample of Korean immigrants. A total of 116 Korean immigrants in a Midwestern metropolitan area were recruited through Korean churches and markets. Knowledge was assessed using both open-ended questions and a structured questionnaire. Latent class cluster analysis and Chi-square tests were used to analyze the data. About 76% of the sample had at least one self-reported risk factor for cardiovascular disease. Using an open-ended question, the majority of subjects could only identify one symptom. In the structured questionnaire, subjects identified a mean of 5 out of 10 heart attack symptoms and a mean of 5 out of 9 heart attack risk factors. Latent class cluster analysis showed that subjects clustered into two groups for both risk factors and symptoms: a high knowledge group and a low knowledge group. Subjects who clustered into the risk factor low knowledge group (48%) were more likely than the risk factor high knowledge group to be older than 65 years, to have lower education, to not know to use 911 when a heart attack occurred, and to not have a family history of heart attack. Korean immigrants' knowledge of heart attack symptoms and risk factors was variable, ranging from high to very low. Education should be focused on those at highest risk for a heart attack, which includes the elderly and those with risk factors.
GOClonto: an ontological clustering approach for conceptualizing PubMed abstracts.

PubMed

Zheng, Hai-Tao; Borchert, Charles; Kim, Hong-Gee

2010-02-01

Concurrent with progress in biomedical sciences, an overwhelming amount of textual knowledge is accumulating in the biomedical literature. PubMed is the most comprehensive database collecting and managing biomedical literature. To help researchers easily understand collections of PubMed abstracts, numerous clustering methods have been proposed to group similar abstracts based on their shared features. However, most of these methods do not explore the semantic relationships among groupings of documents, which could help better illuminate the groupings of PubMed abstracts. To address this issue, we proposed an ontological clustering method called GOClonto for conceptualizing PubMed abstracts. GOClonto uses latent semantic analysis (LSA) and gene ontology (GO) to identify key gene-related concepts and their relationships as well as allocate PubMed abstracts based on these key gene-related concepts. Based on two PubMed abstract collections, the experimental results show that GOClonto is able to identify key gene-related concepts and outperforms the STC (suffix tree clustering) algorithm, the Lingo algorithm, the Fuzzy Ants algorithm, and the clustering based TRS (tolerance rough set) algorithm. Moreover, the two ontologies generated by GOClonto show significant informative conceptual structures.
Molecular Characterization of Cryptosporidium spp., Giardia duodenalis, and Enterocytozoon bieneusi in Captive Wildlife at Zhengzhou Zoo, China.

PubMed

Li, Junqiang; Qi, Meng; Chang, Yankai; Wang, Rongjun; Li, Tongyi; Dong, Haiju; Zhang, Longxian

2015-01-01

Cryptosporidium spp., Giardia duodenalis, and Enterocytozoon bieneusi are common gastrointestinal protists in humans and animals. Two hundred and three fecal specimens from 80 wildlife species were collected in Zhengzhou Zoo and their genomic DNA extracted. Three intestinal pathogens were characterized with a DNA sequence analysis of different loci. Cryptosporidium felis, C. baileyi, and avian genotype III were identified in three specimens (1.5%), the manul, red-crowned crane, and cockatiel, respectively. Giardia duodenalis was also found in five specimens (2.5%) firstly: assemblage B in a white-cheeked gibbon and beaver, and assemblage F in a Chinese leopard and two Siberian tigers, respectively. Thirteen genotypes of E. bieneusi (seven previously reported genotypes and six new genotypes) were detected in 32 specimens (15.8%), of which most were reported for the first time. A phylogenetic analysis of E. bieneusi showed that five genotypes (three known and two new) clustered in group 1; three known genotypes clustered in group 2; one known genotype clustered in group 4; and the remaining four genotypes clustered in a new group. In conclusion, zoonotic Cryptosporidium spp., G. duodenalis, and E. bieneusi are maintained in wildlife and transmitted between them. Zoonotic disease outbreaks of these infectious agents possibly originate in wildlife reservoirs. © 2015 The Author(s) Journal of Eukaryotic Microbiology © 2015 International Society of Protistologists.
Clustering of Dietary Patterns, Lifestyles, and Overweight among Spanish Children and Adolescents in the ANIBES Study

PubMed Central

Pérez-Rodrigo, Carmen; Gil, Ángel; González-Gross, Marcela; Ortega, Rosa M.; Serra-Majem, Lluis; Varela-Moreiras, Gregorio; Aranceta-Bartrina, Javier

2015-01-01

Weight gain has been associated with behaviors related to diet, sedentary lifestyle, and physical activity. We investigated dietary patterns and possible meaningful clustering of physical activity, sedentary behavior, and sleep time in Spanish children and adolescents and whether the identified clusters could be associated with overweight. Analysis was based on a subsample (n = 415) of the cross-sectional ANIBES study in Spain. We performed exploratory factor analysis and subsequent cluster analysis of dietary patterns, physical activity, sedentary behaviors, and sleep time. Logistic regression analysis was used to explore the association between the cluster solutions and overweight. Factor analysis identified four dietary patterns, one reflecting a profile closer to the traditional Mediterranean diet. Dietary patterns, physical activity behaviors, sedentary behaviors and sleep time on weekdays in Spanish children and adolescents clustered into two different groups: a low physical activity-poorer diet lifestyle pattern, which included a higher proportion of girls, and a high physical activity, low sedentary behavior, longer sleep duration, healthier diet lifestyle pattern. Although increased risk of being overweight was not significant, the Prevalence Ratios (PRs) for the low physical activity-poorer diet lifestyle pattern were >1 in children and in adolescents. The healthier lifestyle pattern included lower proportions of children and adolescents from low socioeconomic status backgrounds. PMID:26729155
Ankle plantarflexion strength in rearfoot and forefoot runners: a novel clusteranalytic approach.

PubMed

Liebl, Dominik; Willwacher, Steffen; Hamill, Joseph; Brüggemann, Gert-Peter

2014-06-01

The purpose of the present study was to test for differences in ankle plantarflexion strengths of habitually rearfoot and forefoot runners. In order to approach this issue, we revisit the problem of classifying different footfall patterns in human runners. A dataset of 119 subjects running shod and barefoot (speed 3.5 m/s) was analyzed. The footfall patterns were clustered by a novel statistical approach, which is motivated by advances in the statistical literature on functional data analysis. We explain the novel statistical approach in detail and compare it to the classically used strike index of Cavanagh and Lafortune (1980). The two groups found by the new cluster approach are well interpretable as a forefoot and a rearfoot footfall group. The subsequent comparison study of the clustered subjects reveals that runners with a forefoot footfall pattern are capable of producing significantly higher joint moments in a maximum voluntary contraction (MVC) of their ankle plantarflexor muscle-tendon units; difference in means: 0.28 Nm/kg. This effect remains significant after controlling for an additional gender effect and for differences in training levels. Our analysis confirms the hypothesis that forefoot runners have a higher mean MVC plantarflexion strength than rearfoot runners. Furthermore, we demonstrate that our proposed stochastic cluster analysis provides a robust and useful framework for clustering foot strikes. Copyright © 2014 Elsevier B.V. All rights reserved.
Exploring Different Patterns of Love Attitudes among Chinese College Students

PubMed Central

Zeng, Xianglong; Pan, Yiqin; Zhou, Han; Yu, Shi; Liu, Xiangping

2016-01-01

Individual differences in love attitudes and the relationship between love attitudes and other variables in Asian culture lack in-depth exploration. This study conducted cluster analysis with data regarding love attitudes obtained from 389 college students in mainland China. The result of cluster analysis based on love-attitude scales distinguished four types of students: game players, rational lovers, emotional lovers, and absence lovers. These four groups of students showed significant differences in sexual attitudes and personality traits of deliberation and dutifulness but not self-discipline. The study’s implications for future studies on love attitudes in certain cultural groups were also discussed. PMID:27851784
Identification of complex metabolic states in critically injured patients using bioinformatic cluster analysis.

PubMed

Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T

2010-01-01

Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
Sleep stages identification in patients with sleep disorder using k-means clustering

NASA Astrophysics Data System (ADS)

Fadhlullah, M. U.; Resahya, A.; Nugraha, D. F.; Yulita, I. N.

2018-05-01

Data mining is a computational intelligence discipline in which a large dataset is processed using a certain method to look for patterns within it. These patterns are then used for real-time applications or to develop new knowledge. It is a valuable tool for solving complex problems, discovering new knowledge, analyzing data, and making decisions. To find the patterns that lie inside a large dataset, a clustering method is used. Clustering groups data that look similar so that a pattern can be seen in the large dataset, and several algorithms exist for assigning the data to the corresponding clusters. This research used data from patients who suffer from sleep disorders and aims to help medical practitioners reduce the time required to classify the sleep stages of such patients. This study used the K-Means algorithm and silhouette evaluation to find that 3 clusters are optimal for this dataset, which means the data can be divided into 3 sleep stages.
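The selection step described here (run K-Means for several candidate cluster counts and keep the one with the highest mean silhouette) can be sketched as follows. This is a minimal pure-Python illustration on hypothetical 2-D points, not the study's pipeline: the deterministic farthest-first initialization, the helper names, and the data are all assumptions, and the points are assumed distinct with at least two clusters requested.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with farthest-first initialization (an assumption)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


def silhouette_score(points, labels):
    """Mean of (b - a) / max(a, b); singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == c)
            / labels.count(c)
            for c in set(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def best_k(points, k_values):
    """Candidate k with the highest mean silhouette."""
    return max(k_values, key=lambda k: silhouette_score(points, kmeans(points, k)[0]))


# Hypothetical feature vectors forming three well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
print(best_k(points, range(2, 6)))
```

The silhouette compares how close each point is to its own cluster with how close it is to the nearest other cluster, so it penalizes both merging distinct groups and splitting a compact one. In practice a library implementation with multiple random restarts would normally replace this hand-written version.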
Reducing Earth Topography Resolution for SMAP Mission Ground Tracks Using K-Means Clustering

NASA Technical Reports Server (NTRS)

Rizvi, Farheen

2013-01-01

The K-means clustering algorithm is used to reduce Earth topography resolution for the SMAP mission ground tracks. As SMAP propagates in orbit, knowledge of the radar antenna footprints on Earth is required for the antenna misalignment calibration. Each antenna footprint contains a latitude and longitude location pair on the Earth surface. There are 400 pairs in one data set for the calibration model. It is computationally expensive to calculate corresponding Earth elevation for these data pairs. Thus, the antenna footprint resolution is reduced. Similar topographical data pairs are grouped together with the K-means clustering algorithm. The resolution is reduced to the mean of each topographical cluster called the cluster centroid. The corresponding Earth elevation for each cluster centroid is assigned to the entire group. Results show that 400 data points are reduced to 60 while still maintaining algorithm performance and computational efficiency. In this work, sensitivity analysis is also performed to show a trade-off between algorithm performance versus computational efficiency as the number of cluster centroids and algorithm iterations are increased.
Clustering Gene Expression Regulators: New Approach to Disease Subtyping

PubMed Central

Pyatnitskiy, Mikhail; Mazo, Ilya; Shkrob, Maria; Schwartz, Elena; Kotelnikova, Ekaterina

2014-01-01

One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms as to develop more personalized approach to therapy. Here we propose novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on Sub-Network Enrichment Analysis algorithm (SNEA) which identifies gene subnetworks with significant concordant changes in expression between two conditions. Subnetwork consists of central regulator and downstream genes connected by relations extracted from global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores which are used for final patients grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with complementary level of understanding of pathway-level biology behind a disease by identification of significant expression regulators. We have observed the reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms), that was not revealed using standard expression profile clustering. For another experiment we were able to suggest the clusters of regulators, responsible for colorectal carcinoma vs adenoma discrimination and identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. Proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. Obtained clusters of regulators make possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for individual patient. PMID:24416320
Identification and validation of asthma phenotypes in Chinese population using cluster analysis.

PubMed

Wang, Lei; Liang, Rui; Zhou, Ting; Zheng, Jing; Liang, Bing Miao; Zhang, Hong Ping; Luo, Feng Ming; Gibson, Peter G; Wang, Gang

2017-10-01

Asthma is a heterogeneous airway disease, so it is crucial to clearly identify clinical phenotypes to achieve better asthma management. To identify and prospectively validate asthma clusters in a Chinese population. Two hundred eighty-four patients were consecutively recruited and 18 sociodemographic and clinical variables were collected. Hierarchical cluster analysis was performed by the Ward method followed by k-means cluster analysis. Then, a prospective 12-month cohort study was used to validate the identified clusters. Five clusters were successfully identified. Clusters 1 (n = 71) and 3 (n = 81) were mild asthma phenotypes with slight airway obstruction and low exacerbation risk, but with a sex differential. Cluster 2 (n = 65) described an "allergic" phenotype, cluster 4 (n = 33) featured a "fixed airflow limitation" phenotype with smoking, and cluster 5 (n = 34) was a "low socioeconomic status" phenotype. Patients in clusters 2, 4, and 5 had distinctly lower socioeconomic status and more psychological symptoms. Cluster 2 had a significantly increased risk of exacerbations (risk ratio [RR] 1.13, 95% confidence interval [CI] 1.03-1.25), unplanned visits for asthma (RR 1.98, 95% CI 1.07-3.66), and emergency visits for asthma (RR 7.17, 95% CI 1.26-40.80). Cluster 4 had an increased risk of unplanned visits (RR 2.22, 95% CI 1.02-4.81), and cluster 5 had increased emergency visits (RR 12.72, 95% CI 1.95-69.78). Kaplan-Meier analysis confirmed that cluster grouping was predictive of time to the first asthma exacerbation, unplanned visit, emergency visit, and hospital admission (P < .0001 for all comparisons). We identified 3 clinical clusters as "allergic asthma," "fixed airflow limitation," and "low socioeconomic status" phenotypes that are at high risk of severe asthma exacerbations and that have management implications for clinical practice in developing countries. Copyright © 2017 American College of Allergy, Asthma & Immunology. Published by Elsevier Inc. 
Internet Gamblers Differ on Social Variables: A Latent Class Analysis.

PubMed

Khazaal, Yasser; Chatton, Anne; Achab, Sophia; Monney, Gregoire; Thorens, Gabriel; Dufour, Magali; Zullino, Daniele; Rothen, Stephane

2017-09-01

Online gambling has gained popularity in the last decade, leading to an important shift in how consumers engage in gambling and in the factors related to problem gambling and prevention. Indebtedness and loneliness have previously been associated with problem gambling. The current study aimed to characterize online gamblers in relation to indebtedness, loneliness, and several in-game social behaviors. The data set was obtained from 584 Internet gamblers recruited online through gambling websites and forums. Of these gamblers, 372 participants completed all study assessments and were included in the analyses. Questionnaires included those on sociodemographics and social variables (indebtedness, loneliness, in-game social behaviors), as well as the Gambling Motives Questionnaire, Gambling Related Cognitions Scale, Internet Addiction Test, Problem Gambling Severity Index, Short Depression-Happiness Scale, and UPPS-P Impulsive Behavior Scale. Social variables were explored with a latent class model. The clusters obtained were compared for psychological measures and three clusters were found: lonely indebted gamblers (cluster 1: 6.5%), not lonely not indebted gamblers (cluster 2: 75.4%), and not lonely indebted gamblers (cluster 3: 18%). Participants in clusters 1 and 3 (particularly in cluster 1) were at higher risk of problem gambling than were those in cluster 2. The three groups differed on most assessed variables, including the Problem Gambling Severity Index, the Short Depression-Happiness Scale, and the UPPS-P subscales (except the sensation seeking subscore). Results highlight significant between-group differences, suggesting that Internet gamblers are not a homogeneous group. Specific intervention strategies could be implemented for groups at risk.
Mothers of young children cluster into 4 groups based on psychographic food decision influencers.

PubMed

Byrd-Bredbenner, Carol; Abbot, Jaclyn Maurer; Cussler, Ellen

2008-08-01

This study explored how mothers grouped into clusters according to multiple psychographic food decision influencers and how the clusters differed in nutrient intake and nutrient content of their household food supply. Mothers (n = 201) completed a survey assessing basic demographic characteristics, food shopping and meal preparation activities, self and spouse employment, exposure to formal food or nutrition education, education level and occupation, weight status, nutrition and food preparation knowledge and skill, family member health and nutrition status, food decision influencer constructs, and dietary intake. In addition, an in-home inventory of 100 participants' household food supplies was conducted. Four distinct clusters presented when 26 psychographic food choice influencers were evaluated. These clusters appear to be valid and robust classifications of mothers in that they discriminated well on the psychographic variables used to construct the clusters as well as numerous other variables not used in the cluster analysis. In addition, the clusters appear to transcend demographic variables that often segment audiences (eg, race, mother's age, socioeconomic status), thereby adding a new dimension to the way in which this audience can be characterized. Furthermore, psychographically defined clusters predicted dietary quality. This study demonstrates that mothers are not a homogenous group and need to have their unique characteristics taken into consideration when designing strategies to promote health. These results can help health practitioners better understand factors affecting food decisions and tailor interventions to better meet the needs of mothers.
Functional grouping of similar genes using eigenanalysis on minimum spanning tree based neighborhood graph.

PubMed

Jothi, R; Mohanty, Sraban Kumar; Ojha, Aparajita

2016-04-01

Gene expression data clustering is an important biological process in DNA microarray analysis. Although there have been many clustering algorithms for gene expression analysis, finding a suitable and effective clustering algorithm is always a challenging problem due to the heterogeneous nature of gene profiles. Minimum Spanning Tree (MST) based clustering algorithms have been successfully employed to detect clusters of varying shapes and sizes. This paper proposes a novel clustering algorithm using Eigenanalysis on Minimum Spanning Tree based neighborhood graph (E-MST). As the MST of a set of points reflects the similarity of the points with their neighborhood, the proposed algorithm employs a similarity graph obtained from k′ rounds of MST (the k′-MST neighborhood graph). By studying the spectral properties of the similarity matrix obtained from the k′-MST graph, the proposed algorithm achieves improved clustering results. We demonstrate the efficacy of the proposed algorithm on 12 gene expression datasets. Experimental results show that the proposed algorithm performs better than the standard clustering algorithms.

An unsupervised classification approach for analysis of Landsat data to monitor land reclamation in Belmont county, Ohio

NASA Technical Reports Server (NTRS)

Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.

1981-01-01

Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.
Cluster analysis of dynamic contrast enhanced MRI reveals tumor subregions related to locoregional relapse for cervical cancer patients.

PubMed

Torheim, Turid; Groendahl, Aurora R; Andersen, Erlend K F; Lyng, Heidi; Malinen, Eirik; Kvaal, Knut; Futsaether, Cecilia M

2016-11-01

Solid tumors are known to be spatially heterogeneous. Detection of treatment-resistant tumor regions can improve clinical outcome, by enabling implementation of strategies targeting such regions. In this study, K-means clustering was used to group voxels in dynamic contrast enhanced magnetic resonance images (DCE-MRI) of cervical cancers. The aim was to identify clusters reflecting treatment resistance that could be used for targeted radiotherapy with a dose-painting approach. Eighty-one patients with locally advanced cervical cancer underwent DCE-MRI prior to chemoradiotherapy. The resulting image time series were fitted to two pharmacokinetic models, the Tofts model (yielding parameters K^trans and ν_e) and the Brix model (A_Brix, k_ep and k_el). K-means clustering was used to group similar voxels based on either the pharmacokinetic parameter maps or the relative signal increase (RSI) time series. The associations between voxel clusters and treatment outcome (measured as locoregional control) were evaluated using the volume fraction or the spatial distribution of each cluster. One voxel cluster based on the RSI time series was significantly related to locoregional control (adjusted p-value 0.048). This cluster consisted of low-enhancing voxels. We found that tumors with poor prognosis had this RSI-based cluster gathered into few patches, making this cluster a potential candidate for targeted radiotherapy. None of the voxel clusters based on Tofts or Brix parameter maps were significantly related to treatment outcome. We identified one group of tumor voxels significantly associated with locoregional relapse that could potentially be used for dose painting. This tumor voxel cluster was identified using the raw MRI time series rather than the pharmacokinetic maps.
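As an illustration of the kind of K-means grouping this abstract describes, the following is a minimal sketch, not the authors' pipeline. It assumes each voxel is a short list of RSI values; the function names, the toy data and the deterministic farthest-point initialisation are all hypothetical choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm with farthest-point initialisation (an assumption)."""
    # Start from the first point, then repeatedly add the point farthest
    # from all centers chosen so far.
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each voxel goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def volume_fractions(labels, k):
    """Fraction of voxels assigned to each cluster."""
    return [labels.count(c) / len(labels) for c in range(k)]


# Hypothetical RSI time series: three low-enhancing and two high-enhancing voxels.
rsi = [[0.1, 0.2, 0.2], [0.0, 0.1, 0.3], [0.2, 0.2, 0.1],
       [1.0, 1.6, 2.0], [1.1, 1.5, 1.9]]
labels, centers = kmeans(rsi, k=2)
print(labels, volume_fractions(labels, 2))
```

The per-cluster volume fraction is the simplest of the two summaries mentioned in the abstract; the spatial distribution of a cluster would additionally need voxel coordinates, which this sketch omits.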
An ensemble framework for clustering protein-protein interaction networks.

PubMed

Asur, Sitaram; Ucar, Duygu; Parthasarathy, Srinivasan

2007-07-01

Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. The presence of biologically relevant functional modules in these networks has been theorized by many researchers. However, the application of traditional clustering algorithms for extracting these modules has not been successful, largely due to the presence of noisy false positive interactions as well as specific topological challenges in the network. In this article, we propose an ensemble clustering framework to address this problem. For base clustering, we introduce two topology-based distance metrics to counteract the effects of noise. We develop a PCA-based consensus clustering technique, designed to reduce the dimensionality of the consensus problem and yield informative clusters. We also develop a soft consensus clustering variant to assign multifaceted proteins to multiple functional groups. We conduct an empirical evaluation of different consensus techniques using topology-based, information theoretic and domain-specific validation metrics and show that our approaches can provide significant benefits over other state-of-the-art approaches. Our analysis of the consensus clusters obtained demonstrates that ensemble clustering can (a) produce improved biologically significant functional groupings; and (b) facilitate soft clustering by discovering multiple functional associations for proteins. Supplementary data are available at Bioinformatics online.
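To make the idea of combining several base clusterings concrete, here is a small sketch of a co-association (co-clustering frequency) matrix. This is a generic ensemble-clustering building block and is not the PCA-based consensus technique proposed in the abstract; the function name and the toy labelings are assumptions made for illustration.

```python
def co_association(labelings):
    """Fraction of base clusterings in which each pair of items shares a cluster.

    labelings: list of label lists, one per base clustering, all the same length.
    Returns an n x n matrix (list of lists) with entries in [0, 1].
    """
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(labelings) for c in row] for row in counts]


# Three hypothetical base clusterings of four proteins.
base = [[0, 0, 1, 1],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
matrix = co_association(base)
print(matrix[0][1], matrix[2][3])
```

Entries strictly between 0 and 1 mark pairs on which the base clusterings disagree. A soft assignment of a protein to several groups, as the abstract describes, could start from such partial memberships.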
Study on transport infrastructure as mechanism of long-term urban planning strategies

NASA Astrophysics Data System (ADS)

Popova, Olga; Martynov, Kirill; Khusnutdinov, Rinat

2017-10-01

In this article, the authors study transport infrastructure. They have developed an algorithm for assessing the quality of transport networks and the connectivity of urban development areas. The results are presented using the example of several central city quarters of Arkhangelsk. The analysis was carried out by clustering objects (individual quarters of Arkhangelsk) with a SOM into comparable groups with a high level of similarity of characteristics inside each group. The clustering yielded 5 clusters with different levels of transport infrastructure. The novelty of the study lies in justifying the advantages of applying structural analysis for the qualitative ranking of areas. The advantage of the proposed methodology is that it makes it possible both to compare the transport infrastructure quality of different city quarters and to determine a strategy for its development with a list of specific activities.
[Genetic relationship analysis of Ephedra intermedia from different habitat in Gansu by ISSR analysis].

PubMed

Zhu, Tian-Tian; Jin, Ling; Du, Tao; Cui, Zhi-Jia; Zhang, Xian-Fei; Wu, Di

2013-09-01

To investigate the genetic relationship of Ephedra intermedia from different habitats in Gansu. The genetic diversity and genetic relationship of E. intermedia from different habitats in Gansu were studied by the ISSR molecular marker technique. Twelve ISSR primers were selected from 70 ISSR primers and used for ISSR amplification. A total of 112 loci were amplified, of which 81 were polymorphic; the average percentage of polymorphic bands (PPB) was 72.32%. Clustering results indicated that the wild species and cultivated species were clustered into different groups. The wild species, which had closer genetic distances, were clustered into one group. E. intermedia from different habitats in Gansu shows rich genetic diversity, which explains its strong adaptability and wide distribution. Further, the genetic distance of E. intermedia is associated with geographical distance; greater distance can hinder gene flow.
MMPI-2: Cluster Analysis of Personality Profiles in Perinatal Depression—Preliminary Evidence

PubMed Central

Grillo, Alessandra; Lauriola, Marco; Giacchetti, Nicoletta

2014-01-01

Background. To assess personality characteristics of women who develop perinatal depression. Methods. The study started with a screening of a sample of 453 women in their third trimester of pregnancy, who were administered a survey data form, the Edinburgh Postnatal Depression Scale (EPDS) and the Minnesota Multiphasic Personality Inventory 2 (MMPI-2). A clinical group of subjects with perinatal depression (PND, 55 subjects) was selected; clinical and validity scales of MMPI-2 were used as predictors in a hierarchical cluster analysis. Results. The analysis identified three clusters of personality profiles: two “clinical” clusters (1 and 3) and an “apparently common” one (cluster 2). The first cluster (39.5%) collects structures of personality with prevalent obsessive or dependent functioning tending to develop a “psychasthenic” depression; the third cluster (13.95%) includes women with prevalent borderline functioning tending to develop “dysphoric” depression; the second cluster (46.5%) shows a normal profile with a “defensive” attitude, probably due to the presence of defense mechanisms or to the fear of stigma. Conclusion. Characteristics of personality have a key role in clinical manifestations of perinatal depression; it is important to detect them to identify mothers at risk and to plan targeted therapeutic interventions. PMID:25574499
Zachary D. Barker: Final DHS HS-STEM Report

DOE Office of Scientific and Technical Information (OSTI.GOV)

Barker, Z D

Working at Lawrence Livermore National Laboratory (LLNL) this summer has provided a very unique and special experience for me. I feel that the research opportunities given to me have allowed me to significantly benefit my research group, the laboratory, the Department of Homeland Security, and the Department of Energy. The researchers in the Single Particle Aerosol Mass Spectrometry (SPAMS) group were very welcoming and clearly wanted me to get the most out of my time in Livermore. I feel that my research partner, Veena Venkatachalam of MIT, and I have been extremely productive in meeting our research goals throughout this summer, and have learned much about working in research at a national laboratory such as Lawrence Livermore. I have learned much about the technical aspects of research while working at LLNL; however, I have also gained important experience and insight into how research groups at national laboratories function. I believe that this internship has given me valuable knowledge and experience which will certainly help my transition to graduate study and a career in engineering. My work with Veena Venkatachalam in the SPAMS group this summer has focused on two major projects. Initially, we were tasked with an analysis of data collected by the group this past spring in a large public environment. The SPAMS instrument was deployed for over two months, collecting information on many of the ambient air particles circulating through the area. Our analysis of the particle data collected during this deployment concerned several aspects, including finding groups, or clusters, of particles that seemed to appear more during certain times of day, analyzing the mass spectral data of clusters and comparing them with mass spectral data of known substances, and comparing the real-time detection capability of the SPAMS instrument with that of a commercially available biological detection instrument.
This analysis was performed in support of a group report to the Department of Homeland Security on the results of the deployment. The analysis of the deployment data revealed some interesting applications of the SPAMS instrument to homeland security situations. Using software developed in-house by SPAMS group member Dr. Paul Steele, Veena and I were able to cluster a subset of data over a certain timeframe (ranging from a single hour to an entire week). The software used makes clusters based on the mass spectral characteristics of each particle in the data set, as well as other parameters. By looking more closely at the characteristics of individual clusters, including the mass spectra, conclusions could be made about what these particles are. This was achieved partially through examination and discussion of the mass spectral data with the members of the SPAMS group, as well as through comparison with known mass spectra collected from substances tested in the laboratory. In many cases, broad conclusions could be drawn about the identity of a cluster of particles.
Genetic diversity and population structure of Chinese natural bermudagrass [Cynodon dactylon (L.) Pers.] germplasm based on SRAP markers.

PubMed

Zheng, Yiqi; Xu, Shaojun; Liu, Jing; Zhao, Yan; Liu, Jianxiu

2017-01-01

Bermudagrass [Cynodon dactylon (L.) Pers.], an important turfgrass used in public parks, home lawns, golf courses and sports fields, is widely distributed in China. In the present study, sequence-related amplified polymorphism (SRAP) markers were used to assess genetic diversity and population structure among 157 indigenous bermudagrass genotypes from 20 provinces in China. The application of 26 SRAP primer pairs produced 340 bands, of which 328 (96.58%) were polymorphic. The polymorphic information content (PIC) ranged from 0.36 to 0.49 with a mean of 0.44. Genetic distance coefficients among accessions ranged from 0.04 to 0.61, with an average of 0.32. The results of STRUCTURE analysis suggested that 157 bermudagrass accessions can be grouped into three subpopulations. Moreover, according to clustering based on the unweighted pair-group method of arithmetic averages (UPGMA), accessions were divided into three major clusters. The UPGMA dendrogram revealed that accessions from identical or adjacent areas were generally, but not entirely, clustered into the same cluster. Comparison of the UPGMA dendrogram and the Bayesian STRUCTURE analysis showed general agreement between the population subdivisions and the genetic relationships among accessions. Principal coordinate analysis (PCoA) with SRAP markers revealed a similar grouping of accessions to the UPGMA dendrogram and STRUCTURE analysis. Analysis of molecular variance (AMOVA) indicated that 18% of total molecular variance was attributed to diversity among subpopulations, while 82% of variance was associated with differences within subpopulations. Our study represents the most comprehensive investigation of the genetic diversity and population structure of bermudagrass in China to date, and provides valuable information for the germplasm collection, genetic improvement, and systematic utilization of bermudagrass.
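The UPGMA step mentioned in this abstract can be sketched as average-linkage agglomeration over a matrix of pairwise genetic distances. The code below is a minimal illustration only: the accession names and distance values are hypothetical, and the between-cluster average is recomputed directly from the original matrix for clarity instead of being updated incrementally.

```python
from itertools import combinations


def upgma(dist):
    """Average-linkage (UPGMA) agglomeration.

    dist: symmetric dict of dicts, dist[a][b] = distance between accessions a and b.
    Returns the merge history as (members_a, members_b, mean_distance) tuples.
    """
    clusters = [(name,) for name in sorted(dist)]
    merges = []

    def mean_distance(a, b):
        # Unweighted mean over all member pairs, which is what UPGMA uses.
        return sum(dist[x][y] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: mean_distance(*pair))
        merges.append((a, b, mean_distance(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


def symmetric(pairs):
    """Build a symmetric dict of dicts from {(a, b): distance} (helper for the toy data)."""
    dist = {}
    for (a, b), d in pairs.items():
        dist.setdefault(a, {})[b] = d
        dist.setdefault(b, {})[a] = d
    return dist


# Hypothetical distances between four accessions.
toy = symmetric({("A", "B"): 0.04, ("A", "C"): 0.5, ("A", "D"): 0.6,
                 ("B", "C"): 0.5, ("B", "D"): 0.6, ("C", "D"): 0.2})
for a, b, height in upgma(toy):
    print(a, b, round(height, 3))
```

The merge heights give the branch points of a dendrogram. Cutting the dendrogram at a chosen height would yield major clusters of the kind reported above, although the cut level used in the study is not stated in the abstract.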
Genetic diversity and population structure of Chinese natural bermudagrass [Cynodon dactylon (L.) Pers.] germplasm based on SRAP markers

PubMed Central

Xu, Shaojun; Liu, Jing; Zhao, Yan; Liu, Jianxiu

2017-01-01

Bermudagrass [Cynodon dactylon (L.) Pers.], an important turfgrass used in public parks, home lawns, golf courses and sports fields, is widely distributed in China. In the present study, sequence-related amplified polymorphism (SRAP) markers were used to assess genetic diversity and population structure among 157 indigenous bermudagrass genotypes from 20 provinces in China. The application of 26 SRAP primer pairs produced 340 bands, of which 328 (96.58%) were polymorphic. The polymorphic information content (PIC) ranged from 0.36 to 0.49 with a mean of 0.44. Genetic distance coefficients among accessions ranged from 0.04 to 0.61, with an average of 0.32. The results of STRUCTURE analysis suggested that 157 bermudagrass accessions can be grouped into three subpopulations. Moreover, according to clustering based on the unweighted pair-group method of arithmetic averages (UPGMA), accessions were divided into three major clusters. The UPGMA dendrogram revealed that accessions from identical or adjacent areas were generally, but not entirely, clustered into the same cluster. Comparison of the UPGMA dendrogram and the Bayesian STRUCTURE analysis showed general agreement between the population subdivisions and the genetic relationships among accessions. Principal coordinate analysis (PCoA) with SRAP markers revealed a similar grouping of accessions to the UPGMA dendrogram and STRUCTURE analysis. Analysis of molecular variance (AMOVA) indicated that 18% of total molecular variance was attributed to diversity among subpopulations, while 82% of variance was associated with differences within subpopulations. Our study represents the most comprehensive investigation of the genetic diversity and population structure of bermudagrass in China to date, and provides valuable information for the germplasm collection, genetic improvement, and systematic utilization of bermudagrass. PMID:28493962
Star Formation in Undergraduate ALFALFA Team Galaxy Groups and Clusters

NASA Astrophysics Data System (ADS)

Koopmann, Rebecca A.; Durbala, Adriana; Finn, Rose; Haynes, Martha P.; Coble, Kimberly A.; Craig, David W.; Hoffman, G. Lyle; Miller, Brendan P.; Crone-Odekon, Mary; O'Donoghue, Aileen A.; Troischt, Parker; Undergraduate ALFALFA Team; ALFALFA Team

2017-01-01

The Undergraduate ALFALFA Team (UAT) Groups project is a coordinated study of gas and star formation properties of galaxies in and around 36 nearby (z<0.03) groups and clusters of varied richness, morphological type mix, and X-ray luminosity. By studying a large range of environments and considering the spatial distributions of star formation, we probe mechanisms of gas depletion and morphological transformation. The project uses ALFALFA HI observations, optical observations, and digital databases like SDSS, and incorporates work undertaken by faculty and students at different institutions within the UAT. Here we present results from our wide area Hα and broadband R imaging project carried out with the WIYN 0.9m+MOSAIC/HDI at KPNO, including an analysis of radial star formation rates and extents of galaxies in the NGC 5846, Abell 779, NRGb331, and HCG 69 groups/clusters. This work has been supported by NSF grant AST-1211005 and AST-1637339.
Improved Ant Colony Clustering Algorithm and Its Performance Study

PubMed Central

Gao, Wei

2016-01-01

Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
[Styles of interpersonal conflict in patients with panic disorder, alcoholism, rheumatoid arthritis and healthy controls: a cluster analysis study].

PubMed

Eher, R; Windhaber, J; Rau, H; Schmitt, M; Kellner, E

2000-05-01

Conflict and conflict resolution in intimate relationships are not only among the most important factors influencing relationship satisfaction but are also seen in association with clinical symptoms. Styles of conflict were assessed in patients suffering from panic disorder with and without agoraphobia, in alcoholics and in patients suffering from rheumatoid arthritis. 176 patients and healthy controls filled out the Styles of Conflict Inventory and questionnaires concerning severity of clinical symptoms. A cluster analysis revealed 5 types of conflict management. Healthy controls showed predominantly assertive and constructive styles, whereas patients with panic disorder showed high levels of cognitive and/or behavioral aggression. Alcoholics showed high levels of repressed aggression, and patients with rheumatoid arthritis often did not exhibit any aggression during conflict. Five clusters of conflict patterns were identified by cluster analysis. Each patient group showed considerably different patterns of conflict management.
Cluster Analysis of Childhood Temperament Data on Adoptees.

ERIC Educational Resources Information Center

Maurer, Ralph; And Others

1980-01-01

Results concur with the Thomas-Chess findings in identifying three main temperament groups: difficult, easy, and slow to warm up. Membership in the difficult group predicted later childhood behavior disorder in both sexes. (Author)
Cluster analysis of phytoplankton data collected from the National Stream Quality Accounting Network in the Tennessee River basin, 1974-81

USGS Publications Warehouse

Stephens, D.W.; Wangsgard, J.B.

1988-01-01

A computer program, Numerical Taxonomy System of Multivariate Statistical Programs (NTSYS), was used with interfacing software to perform cluster analyses of phytoplankton data stored in the biological files of the U.S. Geological Survey. The NTSYS software performs various types of statistical analyses and is capable of handling a large matrix of data. Cluster analyses were done on phytoplankton data collected from 1974 to 1981 at four National Stream Quality Accounting Network stations in the Tennessee River basin. Analysis of the changes in clusters of phytoplankton genera indicated possible changes in the water quality of the French Broad River near Knoxville, Tennessee. At this station, the most common diatom groups indicated a shift in dominant forms with some of the less common diatoms being replaced by green and blue-green algae. There was a reduction in genera variability between 1974-77 and 1979-81 sampling periods. Statistical analysis of chloride and dissolved solids confirmed that concentrations of these substances were smaller in 1974-77 than in 1979-81. At Pickwick Landing Dam, the furthest downstream station used in the study, there was an increase in the number of genera of 'rare' organisms with time. The appearance of two groups of green and blue-green algae indicated that an increase in temperature or nutrient concentrations occurred from 1974 to 1981, but this could not be confirmed using available water quality data. Associations of genera forming the phytoplankton communities at three stations on the Tennessee River were found to be seasonal. Nodal analysis of combined data from all four stations used in the study did not identify any seasonal or temporal patterns during 1974-81. Cluster analysis using the NTSYS programs was effective in reducing the large phytoplankton data set to a manageable size and provided considerable insight into the structure of phytoplankton communities in the Tennessee River basin. Problems encountered using cluster analysis were the subjectivity introduced in the definition of meaningful clusters, and the lack of taxonomic identification to the species level. (Author's abstract)
Understanding Teacher Users of a Digital Library Service: A Clustering Approach

ERIC Educational Resources Information Center

Xu, Beijie

2011-01-01

This research examined teachers' online behaviors while using a digital library service--the Instructional Architect (IA)--through three consecutive studies. In the first two studies, a statistical model called latent class analysis (LCA) was applied to cluster different groups of IA teachers according to their diverse online behaviors. The third…
Blooming Trees: Substructures and Surrounding Groups of Galaxy Clusters

NASA Astrophysics Data System (ADS)

Yu, Heng; Diaferio, Antonaldo; Serra, Ana Laura; Baldi, Marco

2018-06-01

We develop the Blooming Tree Algorithm, a new technique that uses spectroscopic redshift data alone to identify the substructures and the surrounding groups of galaxy clusters, along with their member galaxies. Based on the estimated binding energy of galaxy pairs, the algorithm builds a binary tree that hierarchically arranges all of the galaxies in the field of view. The algorithm searches for buds, corresponding to gravitational potential minima on the binary tree branches; for each bud, the algorithm combines the number of galaxies, their velocity dispersion, and their average pairwise distance into a parameter that discriminates between the buds that do not correspond to any substructure or group, and thus eventually die, and the buds that correspond to substructures and groups, and thus bloom into the identified structures. We test our new algorithm with a sample of 300 mock redshift surveys of clusters in different dynamical states; the clusters are extracted from a large cosmological N-body simulation of a ΛCDM model. We limit our analysis to substructures and surrounding groups identified in the simulation with mass larger than 10^13 h^-1 M_⊙. With mock redshift surveys with 200 galaxies within 6 h^-1 Mpc from the cluster center, the technique recovers 80% of the real substructures and 60% of the surrounding groups; in 57% of the identified structures, at least 60% of the member galaxies of the substructures and groups belong to the same real structure. These results improve by roughly a factor of two the performance of the best substructure identification algorithm currently available, the σ plateau algorithm, and suggest that our Blooming Tree Algorithm can be an invaluable tool for detecting substructures of galaxy clusters and investigating their complex dynamics.
Parity among interpretation methods of MLEE patterns and disparity among clustering methods in epidemiological typing of Candida albicans.

PubMed

Boriollo, Marcelo Fabiano Gomes; Rosa, Edvaldo Antonio Ribeiro; Gonçalves, Reginaldo Bruno; Höfling, José Francisco

2006-03-01

The typing of C. albicans by MLEE (multilocus enzyme electrophoresis) is dependent on the interpretation of enzyme electrophoretic patterns, and the study of the epidemiological relationships of these yeasts can be conducted by cluster analysis. Therefore, the aims of the present study were to first determine the discriminatory power of genetic interpretation (deduction of the allelic composition of diploid organisms) and numerical interpretation (mere determination of the presence and absence of bands) of MLEE patterns, and then to determine the concordance (Pearson product-moment correlation coefficient) and similarity (Jaccard similarity coefficient) of the groups of strains generated by three cluster analysis models, and the discriminatory power of such models as well [model A: genetic interpretation, genetic distance matrix of Nei (d(ij)) and UPGMA dendrogram; model B: genetic interpretation, Dice similarity matrix (S(D1)) and UPGMA dendrogram; model C: numerical interpretation, Dice similarity matrix (S(D2)) and UPGMA dendrogram]. MLEE was found to be a powerful and reliable tool for the typing of C. albicans due to its high discriminatory power (>0.9). Discriminatory power indicated that numerical interpretation is a method capable of discriminating a greater number of strains (47 versus 43 subtypes), but also pointed to model B as a method capable of providing a greater number of groups, suggesting its use for the typing of C. albicans by MLEE and cluster analysis. Very good agreement was only observed between the elements of the matrices S(D1) and S(D2), but a large majority of the groups generated in the three UPGMA dendrograms showed similarity S(J) between 4.8% and 75%, suggesting disparities in the conclusions obtained by the cluster assays.
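The two similarity coefficients named in this abstract are easy to state in code. The sketch below assumes a band pattern (or a group of strains) is represented as a non-empty set; the identifiers and toy values are hypothetical and are not taken from the study.

```python
def dice(a, b):
    """Dice coefficient: 2|A ∩ B| / (|A| + |B|). Assumes both sets are non-empty."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b))


def jaccard(a, b):
    """Jaccard coefficient: |A ∩ B| / |A ∪ B|. Assumes the union is non-empty."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


# Numerical interpretation: bands scored as present for two hypothetical strains.
strain_1 = {"band1", "band2", "band3"}
strain_2 = {"band2", "band3", "band4"}
print(dice(strain_1, strain_2))

# Comparing a group of strains produced by two different clustering models.
group_model_a = {"s1", "s2", "s3"}
group_model_b = {"s2", "s3", "s4"}
print(jaccard(group_model_a, group_model_b))
```

For the same pair of sets the Dice value is never smaller than the Jaccard value, so similarity matrices built from the two coefficients are not directly interchangeable.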
Multivariate statistical analysis: Principles and applications to coorbital streams of meteorite falls

NASA Technical Reports Server (NTRS)

Wolf, S. F.; Lipschutz, M. E.

1993-01-01

Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that, by a totally different criterion, labile trace element contents - hence thermal histories - of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Phenotypes determined by cluster analysis in severe or difficult-to-treat asthma.

PubMed

Schatz, Michael; Hsu, Jin-Wen Y; Zeiger, Robert S; Chen, Wansu; Dorenbaum, Alejandro; Chipps, Bradley E; Haselkorn, Tmirah

2014-06-01

Asthma phenotyping can facilitate understanding of disease pathogenesis and potential targeted therapies. To further characterize the distinguishing features of phenotypic groups in difficult-to-treat asthma. Children ages 6-11 years (n = 518) and adolescents and adults ages ≥12 years (n = 3612) with severe or difficult-to-treat asthma from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study were evaluated in this post hoc cluster analysis. Analyzed variables included sex, race, atopy, age of asthma onset, smoking (adolescents and adults), passive smoke exposure (children), obesity, and aspirin sensitivity. Cluster analysis used the hierarchical clustering algorithm with the Ward minimum variance method. The results were compared among clusters by χ² analysis; variables with significant (P < .05) differences among clusters were considered as distinguishing feature candidates. Associations among clusters and asthma-related health outcomes were assessed in multivariable analyses by adjusting for socioeconomic status, environmental exposures, and intensity of therapy. Five clusters were identified in each age stratum. Sex, atopic status, and nonwhite race were distinguishing variables in both strata; passive smoke exposure was distinguishing in children and aspirin sensitivity in adolescents and adults. Clusters were not related to outcomes in children, but 2 adult and adolescent clusters distinguished by nonwhite race and aspirin sensitivity manifested poorer quality of life (P < .0001), and the aspirin-sensitive cluster experienced more frequent asthma exacerbations (P < .0001). Distinct phenotypes appear to exist in patients with severe or difficult-to-treat asthma, which is related to outcomes in adolescents and adults but not in children. The study of the therapeutic implications of these phenotypes is warranted.
Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values.

PubMed

Bhattacharya, Anindya; De, Rajat K

2010-08-01

Distance-based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail in certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than that produced by some others. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. As in DCCA, we use the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using the C and Visual Basic languages, and can be executed on Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed. 
Two word files (included in the zip file) need to be consulted before installation and execution of the software. Copyright 2010 Elsevier Inc. All rights reserved.
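
The assignment criterion described in the abstract above (each gene belongs to the cluster with which it has the highest average correlation) can be illustrated with a minimal sketch. This is not the authors' ACCA software: the function names and the toy expression profiles are hypothetical, and only the assignment step is shown, not the full iterative algorithm.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_correlation(gene, cluster):
    """Mean Pearson correlation of one profile with every profile in a cluster."""
    return sum(pearson(gene, other) for other in cluster) / len(cluster)

def assign(gene, clusters):
    """Index of the cluster with the highest average correlation to `gene`."""
    scores = [average_correlation(gene, c) for c in clusters]
    return scores.index(max(scores))

# Hypothetical toy profiles (not data from the paper).
rising = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 41.0]]
falling = [[4.0, 3.0, 2.0, 1.0], [40.0, 31.0, 20.0, 10.0]]
query = [100.0, 210.0, 290.0, 400.0]  # large values, but a rising pattern

cluster_index = assign(query, [rising, falling])

# Euclidean distance from the query to the first rising profile, for contrast.
euclid_to_first = math.dist(query, rising[0])
```

In this toy case the query profile is far from every other profile in Euclidean terms, yet it is assigned to the rising cluster because correlation ignores the overall magnitude of the expression values.
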

A Model-Based Cluster Analysis of Maternal Emotion Regulation and Relations to Parenting Behavior.

PubMed

Shaffer, Anne; Whitehead, Monica; Davis, Molly; Morelen, Diana; Suveg, Cynthia

2017-10-15

In a diverse community sample of mothers (N = 108) and their preschool-aged children (M age = 3.50 years), this study conducted person-oriented analyses of maternal emotion regulation (ER) based on a multimethod assessment incorporating physiological, observational, and self-report indicators. A model-based cluster analysis was applied to five indicators of maternal ER: maternal self-report, observed negative affect in a parent-child interaction, baseline respiratory sinus arrhythmia (RSA), and RSA suppression across two laboratory tasks. Model-based cluster analyses revealed four maternal ER profiles, including a group of mothers with average ER functioning, characterized by socioeconomic advantage and more positive parenting behavior. A dysregulated cluster demonstrated the greatest challenges with parenting and dyadic interactions. Two clusters of intermediate dysregulation were also identified. Implications for assessment and applications to parenting interventions are discussed. © 2017 Family Process Institute.
Isomers and energy landscapes of micro-hydrated sulfite and chlorate clusters

NASA Astrophysics Data System (ADS)

Hey, John C.; Doyle, Emily J.; Chen, Yuting; Johnston, Roy L.

2018-03-01

We present putative global minima for the micro-hydrated sulfite SO₃²⁻(H₂O)_N and chlorate ClO₃⁻(H₂O)_N systems in the range 3≤N≤15 found using basin-hopping global structure optimization with an empirical potential. We present a structural analysis of the hydration of a large number of minimized structures for hydrated sulfite and chlorate clusters in the range 3≤N≤50. We show that sulfite is a significantly stronger net acceptor of hydrogen bonding within water clusters than chlorate, completely suppressing the appearance of hydroxyl groups pointing out from the cluster surface (dangling OH bonds), in low-energy clusters. We also present a qualitative analysis of a highly explored energy landscape in the region of the global minimum of the eight water hydrated sulfite and chlorate systems. This article is part of the theme issue 'Modern theoretical chemistry'.
Isomers and energy landscapes of micro-hydrated sulfite and chlorate clusters.

PubMed

Hey, John C; Doyle, Emily J; Chen, Yuting; Johnston, Roy L

2018-03-13

We present putative global minima for the micro-hydrated sulfite SO₃²⁻(H₂O)_N and chlorate ClO₃⁻(H₂O)_N systems in the range 3≤N≤15 found using basin-hopping global structure optimization with an empirical potential. We present a structural analysis of the hydration of a large number of minimized structures for hydrated sulfite and chlorate clusters in the range 3≤N≤50. We show that sulfite is a significantly stronger net acceptor of hydrogen bonding within water clusters than chlorate, completely suppressing the appearance of hydroxyl groups pointing out from the cluster surface (dangling OH bonds), in low-energy clusters. We also present a qualitative analysis of a highly explored energy landscape in the region of the global minimum of the eight water hydrated sulfite and chlorate systems. This article is part of the theme issue 'Modern theoretical chemistry'. © 2018 The Authors.
Descriptive Statistics and Cluster Analysis for Extreme Rainfall in Java Island

NASA Astrophysics Data System (ADS)

E Komalasari, K.; Pawitan, H.; Faqih, A.

2017-03-01

This study aims to describe the regional pattern of extreme rainfall based on maximum daily rainfall for the period 1983 to 2012 on Java Island. Descriptive statistical analysis was performed to obtain the central tendency, variation and distribution of the maximum precipitation data. The mean and median are used to measure central tendency, while the interquartile range (IQR) and standard deviation are used to measure variation. In addition, skewness and kurtosis are used to describe the shape of the distribution of the rainfall data. Cluster analysis using squared Euclidean distance and the Ward method is applied to perform regional grouping. The results show that the mean of maximum daily rainfall in the Java region during 1983-2012 is around 80-181 mm, with a median between 75 and 160 mm and a standard deviation between 17 and 82. Cluster analysis produces four clusters and shows that the western area of Java tends to have higher annual maxima of daily rainfall than the northern area, and more variation in the annual maximum value.
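
The descriptive measures named in this abstract can be computed for a single station along the following lines. This is a sketch under stated assumptions: the `describe` helper and the rainfall values are hypothetical, and the population (not sample) forms of the standard deviation, skewness and kurtosis are an arbitrary choice, since the abstract does not say which forms were used.

```python
import statistics

def describe(values):
    """Summary statistics named in the abstract, for one station's series."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD (an assumption)
    q1, _, q3 = statistics.quantiles(values, n=4)
    # Moment-based skewness and excess kurtosis (population form).
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "iqr": q3 - q1,
        "sd": sd,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4 - 3.0,
    }

# Hypothetical annual maxima of daily rainfall in mm (not data from the study).
annual_max_mm = [75.0, 80.0, 82.0, 90.0, 95.0, 101.0, 110.0, 160.0]
summary = describe(annual_max_mm)
```

A single large value such as the 160 mm entry pulls the mean above the median and gives a positive skewness, which is the kind of shape information the study reports alongside the measures of spread.
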
Cluster Analysis of Longidorus Species (Nematoda: Longidoridae), a New Approach in Species Identification

PubMed Central

Ye, Weimin; Robbins, R. T.

2004-01-01

Hierarchical cluster analysis based on female morphometric character means including body length, distance from vulva opening to anterior end, head width, odontostyle length, esophagus length, body width, tail length, and tail width was used to examine the morphometric relationships and create dendrograms for (i) 62 populations belonging to 9 Longidorus species from Arkansas, (ii) 137 published Longidorus species, and (iii) 137 published Longidorus species plus 86 populations of 16 Longidorus species from Arkansas and various other locations by using JMP 4.02 software (SAS Institute, Cary, NC). Cluster analysis dendrograms visually illustrated the grouping and morphometric relationships of the species and populations. The analysis provided a computerized statistical approach to help identify and distinguish species, by indicating morphometric relationships among species and by assisting with new species diagnosis. Preliminary species identification can be accomplished by running cluster analysis for unknown species together with the data matrix of known published Longidorus species. PMID:19262809
Subgroup Analysis in Burnout: Relations Between Fatigue, Anxiety, and Depression

PubMed Central

van Dam, Arno

2016-01-01

Several authors have suggested that burned out patients do not form a homogeneous group and that subgroups should be considered. The identification of these subgroups may contribute to a better understanding of the burnout construct and lead to more specific therapeutic interventions. Subgroup analysis may also help clarify whether burnout is a distinct entity and whether subgroups of burnout overlap with other disorders such as depression and chronic fatigue syndrome. In a group of 113 clinically diagnosed burned out patients, levels of fatigue, depression, and anxiety were assessed. In order to identify possible subgroups, we performed a two-step cluster analysis. The analysis revealed two clusters that differed from one another in terms of symptom severity on the three aforementioned measures. Depression appeared to be the strongest predictor of group membership. These results are considered in the light of the scientific debate on whether burnout can be distinguished from depression and whether burnout subtyping is useful. Finally, implications for clinical practice and future research are discussed. PMID:26869983
Phenotypes of sleeplessness: stressing the need for psychodiagnostics in the assessment of insomnia.

PubMed

van de Laar, Merijn; Leufkens, Tim; Bakker, Bart; Pevernagie, Dirk; Overeem, Sebastiaan

2017-09-01

Insomnia is too general a term for various subtypes that might have different etiologies and therefore require different types of treatment. In this explorative study we used cluster analysis to distinguish different phenotypes in 218 patients with insomnia, taking into account several factors including sleep variables and characteristics related to personality and psychiatric comorbidity. Three clusters emerged from the analysis. The 'moderate insomnia with low psychopathology'-cluster was characterized by relatively normal personality traits, as well as normal levels of anxiety and depressive symptoms in the presence of moderate insomnia severity. The 'severe insomnia with moderate psychopathology'-cluster showed relatively high scores on the Insomnia Severity Index and scores on the sleep log that were indicative of severe insomnia. Anxiety and depressive symptoms were slightly above the cut-off and the patients in this cluster were characterized by below average self-sufficiency and less goal-directed behavior. The 'early onset insomnia with high psychopathology'-cluster showed a much younger age and earlier insomnia onset than the other two groups. Anxiety and depressive symptoms were well above the cut-off score and the group consisted of a higher percentage of subjects with comorbid psychiatric disorders. This cluster showed a 'typical psychiatric' personality profile. Our findings stress the need for psychodiagnostic procedures next to a sleep-related diagnostic approach, especially in the younger insomnia patients. Specific treatment suggestions are given based on the three phenotypes.
The Impact of Clinical, Demographic and Risk Factors on Rates of HIV Transmission: A Population-based Phylogenetic Analysis in British Columbia, Canada

PubMed Central

Poon, Art F. Y.; Joy, Jeffrey B.; Woods, Conan K.; Shurgold, Susan; Colley, Guillaume; Brumme, Chanson J.; Hogg, Robert S.; Montaner, Julio S. G.; Harrigan, P. Richard

2015-01-01

Background. The diversification of human immunodeficiency virus (HIV) is shaped by its transmission history. We therefore used a population based province wide HIV drug resistance database in British Columbia (BC), Canada, to evaluate the impact of clinical, demographic, and behavioral factors on rates of HIV transmission. Methods. We reconstructed molecular phylogenies from 27 296 anonymized bulk HIV pol sequences representing 7747 individuals in BC—about half the estimated HIV prevalence in BC. Infections were grouped into clusters based on phylogenetic distances, as a proxy for variation in transmission rates. Rates of cluster expansion were reconstructed from estimated dates of HIV seroconversion. Results. Our criteria grouped 4431 individuals into 744 clusters largely separated with respect to risk factors, including large established clusters predominated by injection drug users and more-recently emerging clusters comprising men who have sex with men. The mean log10 viral load of an individual's phylogenetic neighborhood (composed of 5 other individuals with shortest phylogenetic distances) increased their odds of appearing in a cluster by >2-fold per log10 viruses per milliliter. Conclusions. Hotspots of ongoing HIV transmission can be characterized in near real time by the secondary analysis of HIV resistance genotypes, providing an important potential resource for targeting public health initiatives for HIV prevention. PMID:25312037
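
Grouping infections by phylogenetic distance, as described in the Methods above, is often illustrated with a simple distance cutoff. The sketch below is an assumption-laden illustration, not the authors' criteria: the cutoff value, the `threshold_clusters` helper, and the pairwise distances are all hypothetical. It links any two sequences closer than the cutoff and reports the resulting connected groups.

```python
def threshold_clusters(labels, distances, cutoff):
    """Connected components of the graph linking pairs closer than `cutoff`."""
    parent = {x: x for x in labels}

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        if dist < cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for x in labels:
        groups.setdefault(find(x), []).append(x)
    # Singletons are unclustered individuals, so keep only groups of 2 or more.
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Hypothetical individuals and pairwise phylogenetic distances.
ids = ["A", "B", "C", "D", "E"]
pairwise = {
    ("A", "B"): 0.010,
    ("B", "C"): 0.015,
    ("A", "C"): 0.030,
    ("D", "E"): 0.080,
    ("C", "D"): 0.090,
}
clusters = threshold_clusters(ids, pairwise, cutoff=0.02)
```

Here A, B and C form one cluster even though A and C are farther apart than the cutoff, because membership is transitive under this rule; D and E remain unclustered.
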
Detecting synchronization clusters in multivariate time series via coarse-graining of Markov chains.

PubMed

Allefeld, Carsten; Bialonski, Stephan

2007-12-01

Synchronization cluster analysis is an approach to the detection of underlying structures in data sets of multivariate time series, starting from a matrix R of bivariate synchronization indices. A previous method utilized the eigenvectors of R for cluster identification, analogous to several recent attempts at group identification using eigenvectors of the correlation matrix. All of these approaches assumed a one-to-one correspondence of dominant eigenvectors and clusters, which has however been shown to be wrong in important cases. We clarify the usefulness of eigenvalue decomposition for synchronization cluster analysis by translating the problem into the language of stochastic processes, and derive an enhanced clustering method harnessing recent insights from the coarse-graining of finite-state Markov processes. We illustrate the operation of our method using a simulated system of coupled Lorenz oscillators, and we demonstrate its superior performance over the previous approach. Finally we investigate the question of robustness of the algorithm against small sample size, which is important with regard to field applications.
Identifying clusters of falls-related hospital admissions to inform population targets for prioritising falls prevention programmes

PubMed Central

Finch, Caroline F; Stephan, Karen; Shee, Anna Wong; Hill, Keith; Haines, Terry P; Clemson, Lindy; Day, Lesley

2015-01-01

Background There has been limited research investigating the relationship between injurious falls and hospital resource use. The aims of this study were to identify clusters of community-dwelling older people in the general population who are at increased risk of being admitted to hospital following a fall and how those clusters differed in their use of hospital resources. Methods Analysis of routinely collected hospital admissions data relating to 45 374 fall-related admissions in Victorian community-dwelling older adults aged ≥65 years that occurred during 2008/2009 to 2010/2011. Fall-related admission episodes were identified based on being admitted from a private residence to hospital with a principal diagnosis of injury (International Classification of Diseases (ICD)-10-AM codes S00 to T75) and having a first external cause of a fall (ICD-10-AM codes W00 to W19). A cluster analysis was performed to identify homogeneous groups using demographic details of patients and information on the presence of comorbidities. Hospital length of stay (LOS) was compared across clusters using competing risks regression. Results Clusters based on area of residence, demographic factors (age, gender, marital status, country of birth) and the presence of comorbidities were identified. Clusters representing hospitalised fallers with comorbidities were associated with longer LOS compared with other cluster groups. Clusters delineated by demographic factors were also associated with increased LOS. Conclusions All patients with comorbidity, and older women without comorbidities, stay in hospital longer following a fall and hence consume a disproportionate share of hospital resources. These findings have important implications for the targeting of falls prevention interventions for community-dwelling older people. PMID:25618735
Making the most of missing values : object clustering with partial data in astronomy

NASA Technical Reports Server (NTRS)

Wagstaff, Kiri L.; Laidler, Victoria G.

2004-01-01

We demonstrate a clustering analysis algorithm, KSC, that a) uses all observed values and b) does not discard the partially observed objects. KSC uses soft constraints defined by the fully observed objects to assist in the grouping of objects with missing values. We present an analysis of objects taken from the Sloan Digital Sky Survey to demonstrate how imputing the values can be misleading and why the KSC approach can produce more appropriate results.
A classification of substance-dependent men on temperament and severity variables.

PubMed

Henderson, Melinda J; Galen, Luke W

2003-06-01

This study examined the validity of classifying substance abusers based on temperament and dependence severity, and expanded the scope of typology differences to proximal determinants of use (e.g., expectancies, motives). Patients were interviewed about substance use, depression, and family history of alcohol and drug abuse. Self-report instruments measuring temperament, expectancies, and motives were completed. Participants were 147 male veterans admitted to inpatient substance abuse treatment at a U.S. Department of Veterans Affairs medical center. Cluster analysis identified four types of users with two high substance problem severity and two low substance problem severity groups. Two, high problem severity, early onset groups differed only on the cluster variable of negative affectivity (NA), but showed differences on antisocial personality characteristics, hypochondriasis, and coping motives for alcohol. The two low problem severity groups were distinguished by age of onset and positive affectivity (PA). The late onset, low PA group had a higher incidence of depression, a greater tendency to use substances in solitary contexts, and lower enhancement motives for alcohol compared to the early onset, high PA cluster. The four-cluster solution yielded more distinctions on external criteria than the two-cluster solution. Such temperament variation within both high and low severity substance abusers may be important for treatment planning.
A novel symptom cluster analysis among ambulatory HIV/AIDS patients in Uganda.

PubMed

Namisango, Eve; Harding, Richard; Katabira, Elly T; Siegert, Richard J; Powell, Richard A; Atuhaire, Leonard; Moens, Katrien; Taylor, Steve

2015-01-01

Symptom clusters are gaining importance given HIV/AIDS patients experience multiple, concurrent symptoms. This study aimed to: determine clusters of patients with similar symptom combinations; describe symptom combinations distinguishing the clusters; and evaluate the clusters regarding patient socio-demographic, disease and treatment characteristics, quality of life (QOL) and functional performance. This was a cross-sectional study of 302 adult HIV/AIDS outpatients consecutively recruited at two teaching and referral hospitals in Uganda. Socio-demographic and seven-day period symptom prevalence and distress data were self-reported using the Memorial Symptom Assessment Schedule. QOL was assessed using the Medical Outcome Scale and functional performance using the Karnofsky Performance Scale. Symptom clusters were established using hierarchical cluster analysis with squared Euclidean distances using Ward's clustering methods based on symptom occurrence. Analysis of variance compared clusters on mean QOL and functional performance scores. Patient subgroups were categorised based on symptom occurrence rates. Five symptom occurrence clusters were identified: Cluster 1 (n=107), high-low for sensory discomfort and eating difficulties symptoms; Cluster 2 (n=47), high-low for psycho-gastrointestinal symptoms; Cluster 3 (n=71), high for pain and sensory disturbance symptoms; Cluster 4 (n=35), all high for general HIV/AIDS symptoms; and Cluster 5 (n=48), all low for mood-cognitive symptoms. The all high occurrence cluster was associated with worst functional status, poorest QOL scores and highest symptom-associated distress. Use of antiretroviral therapy was associated with all high symptom occurrence rate (Fisher's exact=4, P<0.001). CD4 count group below 200 was associated with the all high occurrence rate symptom cluster (Fisher's exact=41, P<0.001). 
Symptom clusters have a differential effect on HIV/AIDS patients' self-reported outcomes, with the subgroup experiencing high-symptom occurrence rates having a higher risk of poorer outcomes. Identification of symptom clusters could provide insights into commonly co-occurring symptoms that should be jointly targeted for management in patients with multiple complaints.
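
Hierarchical clustering with squared Euclidean distances and Ward's method, as used in this study, merges at each step the pair of clusters whose union gives the smallest increase in within-cluster sum of squares. The following minimal sketch shows that criterion on binary symptom-occurrence data. The patient rows, the symptom columns and the function names are hypothetical, and the naive search over all pairs is chosen for readability, not efficiency.

```python
def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Greedy agglomeration: merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical symptom-occurrence rows (1 = present).
# Columns: pain, numbness, sadness, worry.
patients = [
    (1, 1, 0, 0),
    (1, 1, 0, 1),
    (0, 0, 1, 1),
    (0, 1, 1, 1),
    (1, 1, 0, 0),
]
groups = ward_clusters(patients, 2)
```

With these toy rows the three patients dominated by pain and numbness end up in one group and the two dominated by sadness and worry in the other.
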
Self-concept differentiation and self-concept clarity across adulthood: associations with age and psychological well-being.

PubMed

Diehl, Manfred; Hay, Elizabeth L

2011-01-01

This study focused on the identification of conceptually meaningful groups of individuals based on their joint self-concept differentiation (SCD) and self-concept clarity (SCC) scores. Notably, we examined whether membership in different SCD-SCC groups differed by age and also was associated with differences in psychological well-being (PWB). Cluster analysis revealed five distinct SCD-SCC groups: a self-assured, unencumbered, fragmented-only, confused-only, and fragmented and confused group. Individuals in the self-assured group had the highest mean scores for positive PWB and the lowest mean scores for negative PWB, whereas individuals in the fragmented and confused group showed the inverse pattern. Findings showed that it was psychologically advantageous to belong to the self-assured group at all ages. As hypothesized, older adults were more likely than young adults to be in the self-assured cluster, whereas young adults were more likely to be in the fragmented and confused cluster. Thus, consistent with extant theorizing, age was positively associated with psychologically adaptive self-concept profiles.
Clustering of Variables for Mixed Data

NASA Astrophysics Data System (ADS)

Saracco, J.; Chavent, M.

2016-05-01

This chapter presents clustering of variables, the aim of which is to group together strongly related variables. The proposed approach works on a mixed data set, i.e. on a data set which contains numerical variables and categorical variables. Two algorithms for clustering of variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the calculation of the synthetic variables summarizing the obtained clusters of variables is based on this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix and ClustOfVar approaches are first used for dimension reduction (step 1) before applying in step 2 a standard clustering method to obtain groups of individuals.
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. 
The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.
Comparative genome analysis of Pseudomonas genomes including Populus-associated isolates

DOE PAGES

Jun, Se Ran; Wassenaar, Trudy; Nookaew, Intawat; ...

2016-01-01

The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches including the rhizosphere and endosphere of many plants influencing phylogenetic diversity and heterogeneity. In this study, comparative genome analysis was performed on over one thousand Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides. Based on average amino acid identity, genomic clusters were identified within the Pseudomonas genus, which showed agreements with clades by NCBI and cliques by IMG. The P. fluorescens group was organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. The species P. aeruginosa showed clear distinction in their genomic relatedness compared to other Pseudomonas species groups based on the pan and core genome analysis. The 19 isolates of our 21 Populus-associated isolates formed three distinct subgroups within the P. fluorescens major group, supported by pathway profiles analysis, while two isolates were more closely related to P. chlororaphis and P. putida. The specific genes to Populus-associated subgroups were identified where genes specific to subgroup 1 include several sensory systems such as proteins which act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor; specific genes to subgroup 2 contain unique hypothetical genes; and genes specific to subgroup 3 organisms have a different hydrolase activity. IMPORTANCE The comparative genome analyses of the genus Pseudomonas that included Populus-associated isolates resulted in novel insights into high diversity of Pseudomonas. Consistent and robust genomic clusters with phylogenetic homogeneity were identified, which resolved species-clades that are not clearly defined by 16S rRNA gene sequence analysis alone. 
The genomic clusters may be reflective of distinct ecological niches to which the organisms have adapted, but this needs to be experimentally characterized with ecologically relevant phenotype properties. This study justifies the need to sequence multiple isolates, especially from P. fluorescens group in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.
Effect of study design and setting on tuberculosis clustering estimates using Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR): a systematic review.

PubMed

Mears, Jessica; Abubakar, Ibrahim; Cohen, Theodore; McHugh, Timothy D; Sonnenberg, Pam

2015-01-21

The aim was to systematically review the evidence for the impact of study design and setting on the interpretation of tuberculosis (TB) transmission using clustering derived from Mycobacterial Interspersed Repetitive Units-Variable Number Tandem Repeats (MIRU-VNTR) strain typing. MEDLINE, EMBASE, CINAHL, Web of Science and Scopus were searched for articles published before 21st October 2014. Studies in humans that reported the proportion of clustering of TB isolates by MIRU-VNTR were included in the analysis. Univariable meta-regression analyses were conducted to assess the influence of study design and setting on the proportion of clustering. The search identified 27 eligible articles reporting clustering between 0% and 63%. The number of MIRU-VNTR loci typed, requiring consent to type patient isolates (as a proxy for sampling fraction), the TB incidence and the maximum cluster size explained 14%, 14%, 27% and 48% of between-study variation, respectively, and had a significant association with the proportion of clustering. Although MIRU-VNTR typing is being adopted worldwide, there is a paucity of data on how study design and setting may influence estimates of clustering. We have highlighted study design variables for consideration in the design and interpretation of future studies. Published by the BMJ Publishing Group Limited.
HIV transmission patterns among The Netherlands, Suriname, and The Netherlands Antilles: a molecular epidemiological study.

PubMed

Kramer, Merlijn A; Cornelissen, Marion; Paraskevis, Dimitrios; Prins, Maria; Coutinho, Roel A; van Sighem, Ard I; Sabajo, Lesley; Duits, Ashley J; Winkel, Cai N; Prins, Jan M; van der Ende, Marchina E; Kauffmann, Robert H; Op de Coul, Eline L

2011-02-01

We aimed to study patterns of HIV transmission among Suriname, The Netherlands Antilles, and The Netherlands. Fragments of env, gag, and pol genes of 55 HIV-infected Surinamese, Antillean, and Dutch heterosexuals living in The Netherlands and 72 HIV-infected heterosexuals living in Suriname and the Antilles were amplified and sequenced. We included 145 pol sequences of HIV-infected Surinamese, Antillean, and Dutch heterosexuals living in The Netherlands from an observational cohort. All sequences were phylogenetically analyzed by neighbor-joining. Additionally, HIV-1 mobility among ethnic groups was estimated. A phylogenetic tree of all pol sequences showed two Surinamese and three Antillean clusters of related strains, but no clustering between ethnic groups. Clusters included sequences of individuals living in Suriname and the Antilles as well as those who have migrated to The Netherlands. Similar clustering patterns were observed in env and gag. Analysis of HIV mobility among ethnic groups showed significantly lower migration between groups than expected under the hypothesis of panmixis, apart from higher HIV migration between Antilleans in The Netherlands and all other groups. Our study shows that HIV transmission mainly occurs within the ethnic group. This suggests that cultural factors could have a larger impact on HIV mobility than geographic distance.
The molecular epidemiology of HIV-1 in the Comunidad Valenciana (Spain): analysis of transmission clusters.

PubMed

Patiño-Galindo, Juan Ángel; Torres-Puente, Manoli; Bracho, María Alma; Alastrué, Ignacio; Juan, Amparo; Navarro, David; Galindo, María José; Ocete, Dolores; Ortega, Enrique; Gimeno, Concepción; Belda, Josefina; Domínguez, Victoria; Moreno, Rosario; González-Candelas, Fernando

2017-09-14

HIV infections are still a very serious concern for public health worldwide. We have applied molecular evolution methods to study the HIV-1 epidemics in the Comunidad Valenciana (CV, Spain) from a public health surveillance perspective. For this, we analysed 1804 HIV-1 sequences comprising protease and reverse transcriptase (PR/RT) coding regions, sampled between 2004 and 2014. These sequences were subtyped and subjected to phylogenetic analyses in order to detect transmission clusters. In addition, univariate and multinomial comparisons were performed to detect epidemiological differences between HIV-1 subtypes and risk groups. The HIV epidemic in the CV is dominated by subtype B infections among local men who have sex with men (MSM). 270 transmission clusters were identified (>57% of the dataset), 12 of which included ≥10 patients; 11 of subtype B (9 affecting MSMs) and one (n = 21) of CRF14, affecting predominantly intravenous drug users (IDUs). Dated phylogenies revealed these large clusters to have originated from the mid-1980s to the early 2000s. Subtype B is more likely to form transmission clusters than non-B variants, and MSMs are more likely to cluster than other risk groups. Multinomial analyses revealed an association between non-B variants, which are not established in the local population yet, and different foreign groups.

[Genetic polymorphism of Tulipa gesneriana L. evaluated on the basis of the ISSR marking data].

PubMed

Kashin, A S; Kritskaya, T A; Schanzer, I A

2016-10-01

Using the method of ISSR analysis, the genetic diversity of 18 natural populations of Tulipa gesneriana L. from the north of the Lower Volga region was examined. The ten ISSR primers used in the study provided identification of 102 PCR fragments, of which 50 were polymorphic (49.0%). According to the proportion of polymorphic markers, two population groups were distinguished: (1) the populations in which the proportion of polymorphic markers ranged from 0.35 to 0.41; (2) the populations in which the proportion of polymorphic markers ranged from 0.64 to 0.85. UPGMA clustering analysis provided subdivision of the sample into two large clusters. The unrooted tree constructed using the Neighbor Joining algorithm had similar topology. The first cluster included slightly variable populations and the second cluster included highly variable populations. The AMOVA analysis showed statistically significant differences (F_CT = 0.430; p = 0.000) between the two groups. Local populations are considerably genetically differentiated from each other (F_ST = 0.632) and have almost no links via modern gene flow, as evidenced by the results of the Mantel test (r = -0.118; p = 0.819). It is suggested that the degree of genetic similarities and differences between the populations depends on the time and the species dispersal patterns on these territories.
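The UPGMA step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the 0/1 band-presence profiles are hypothetical, Jaccard distance is an assumed (though common) choice for dominant markers, and the function names are invented for illustration.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - (bands shared by both profiles / bands present in either profile)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def upgma(profiles):
    """Average-linkage agglomeration; returns the merge history as
    (cluster_a, cluster_b, distance) tuples, earliest merge first."""
    names = list(profiles)
    clusters = [(n,) for n in names]  # each cluster is a tuple of sample names
    # Pairwise distances between individual samples.
    d = {frozenset((a, b)): jaccard_distance(profiles[a], profiles[b])
         for a, b in combinations(names, 2)}

    def avg_dist(c1, c2):
        # UPGMA: unweighted mean over all between-cluster sample pairs.
        total = sum(d[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        merges.append((c1, c2, avg_dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band-presence profiles for four populations (illustrative only).
profiles = {
    "P1": [1, 1, 0, 0, 1, 0],
    "P2": [1, 1, 0, 0, 1, 1],
    "P3": [0, 0, 1, 1, 0, 1],
    "P4": [0, 0, 1, 1, 1, 1],
}
merges = upgma(profiles)
```

In this toy input the two tight pairs merge first and the final merge joins them at a much larger average distance, which is the kind of two-cluster split the abstract reports.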
Raman spectroscopy of normal oral buccal mucosa tissues: study on intact and incised biopsies

NASA Astrophysics Data System (ADS)

Deshmukh, Atul; Singh, S. P.; Chaturvedi, Pankaj; Krishna, C. Murali

2011-12-01

Oral squamous cell carcinoma is among the top 10 malignancies. Optical spectroscopy, including Raman, is being actively pursued as an alternative/adjunct for cancer diagnosis. Earlier studies have demonstrated the feasibility of classifying normal, premalignant, and malignant oral ex vivo tissues. Spectral features showed predominance of lipids and proteins in normal and cancer conditions, respectively, which were attributed to membrane lipids and surface proteins. In view of recent developments in deep tissue Raman spectroscopy, we have recorded Raman spectra from superior and inferior surfaces of 10 normal oral tissues on intact, as well as incised, biopsies after separation of epithelium from connective tissue. Spectral variations and similarities among different groups were explored by unsupervised (principal component analysis) and supervised (linear discriminant analysis, factorial discriminant analysis) methodologies. Clusters of spectra from superior and inferior surfaces of intact tissues show a high overlap, whereas spectra from separated epithelium and connective tissue sections yielded clear clusters, though they also overlap with clusters of intact tissues. Spectra of all four groups of normal tissues gave exclusive clusters when tested against malignant spectra. Thus, this study demonstrates that spectra recorded from the superior surface of an intact tissue may have contributions from deeper layers, but this has no bearing on the classification of malignant tissues.
Classification of cassava genotypes based on qualitative and quantitative data.

PubMed

Oliveira, E J; Oliveira Filho, O S; Santos, V S

2015-02-02

We evaluated the genetic variation of cassava accessions based on qualitative (binomial and multicategorical) and quantitative traits (continuous). We characterized 95 accessions obtained from the Cassava Germplasm Bank of Embrapa Mandioca e Fruticultura; we evaluated these accessions for 13 continuous, 10 binary, and 25 multicategorical traits. First, we analyzed the accessions based only on quantitative traits; next, we conducted joint analysis (qualitative and quantitative traits) based on the Ward-MLM method, which performs clustering in two stages. According to the pseudo-F, pseudo-t2, and maximum likelihood criteria, we identified five and four groups based on quantitative trait and joint analysis, respectively. The smaller number of groups identified based on joint analysis may be related to the nature of the data. On the other hand, quantitative data are more subject to environmental effects in the phenotype expression; this results in the absence of genetic differences, thereby contributing to greater differentiation among accessions. For most of the accessions, the maximum probability of classification was >0.90, independent of the trait analyzed, indicating a good fit of the clustering method. Differences in clustering according to the type of data implied that analysis of quantitative and qualitative traits in cassava germplasm might explore different genomic regions. On the other hand, when joint analysis was used, the means and ranges of genetic distances were high, indicating that the Ward-MLM method is very useful for clustering genotypes when there are several phenotypic traits, such as in the case of genetic resources and breeding programs.
AMMI adjustment for statistical analysis of an international wheat yield trial.

PubMed

Crossa, J; Fox, P N; Pfeiffer, W H; Rajaram, S; Gauch, H G

1991-01-01

Multilocation trials are important for the CIMMYT Bread Wheat Program in producing high-yielding, adapted lines for a wide range of environments. This study investigated procedures for improving predictive success of a yield trial, grouping environments and genotypes into homogeneous subsets, and determining the yield stability of 18 CIMMYT bread wheats evaluated at 25 locations. Additive Main effects and Multiplicative Interaction (AMMI) analysis gave more precise estimates of genotypic yields within locations than means across replicates. This precision facilitated formation by cluster analysis of more cohesive groups of genotypes and locations for biological interpretation of interactions than occurred with unadjusted means. Locations were clustered into two subsets for which genotypes with positive interactions manifested in high, stable yields were identified. The analyses highlighted superior selections with both broad and specific adaptation.
Towards a realistic population of simulated galaxy groups and clusters

NASA Astrophysics Data System (ADS)

Le Brun, Amandine M. C.; McCarthy, Ian G.; Schaye, Joop; Ponman, Trevor J.

2014-06-01

We present a new suite of large-volume cosmological hydrodynamical simulations called cosmo-OWLS. They form an extension to the OverWhelmingly Large Simulations (OWLS) project, and have been designed to help improve our understanding of cluster astrophysics and non-linear structure formation, which are now the limiting systematic errors when using clusters as cosmological probes. Starting from identical initial conditions in either the Planck or WMAP7 cosmologies, we systematically vary the most important 'sub-grid' physics, including feedback from supernovae and active galactic nuclei (AGN). We compare the properties of the simulated galaxy groups and clusters to a wide range of observational data, such as X-ray luminosity and temperature, gas mass fractions, entropy and density profiles, Sunyaev-Zel'dovich flux, I-band mass-to-light ratio, dominance of the brightest cluster galaxy and central massive black hole (BH) masses, by producing synthetic observations and mimicking observational analysis techniques. These comparisons demonstrate that some AGN feedback models can produce a realistic population of galaxy groups and clusters, broadly reproducing both the median trend and, for the first time, the scatter in physical properties over approximately two decades in mass (10^13 M⊙ ≲ M_500 ≲ 10^15 M⊙) and 1.5 decades in radius (0.05 ≲ r/r_500 ≲ 1.5). However, in other models, the AGN feedback is too violent (even though they reproduce the observed BH scaling relations), implying that calibration of the models is required. The production of realistic populations of simulated groups and clusters, as well as models that bracket the observations, opens the door to the creation of synthetic surveys for assisting the astrophysical and cosmological interpretation of cluster surveys, as well as quantifying the impact of selection effects.
Child's positive and negative impacts on parents--a person-oriented approach to understanding temperament in preschool children with intellectual disabilities.

PubMed

Boström, P K; Broberg, M; Bodin, L

2011-01-01

Despite previous efforts to understand temperament in children with intellectual disability (ID), and how child temperament may affect parents, the approach has so far been unidimensional. Child temperament has been considered in relation to diagnosis, with the inherent risk of overlooking individual variation of children's temperament profiles within diagnostic groups. The aim of the present study was to identify temperamental profiles of children with ID, and investigate how these may affect parents in terms of positive and negative impacts. Parent-rated temperament in children with ID was explored through a person-oriented approach (cluster analysis). Children with ID (N=49) and typically developing (TD) children (N=82) aged between 4 and 6 years were clustered separately. Variation in temperament profiles was more prominent among children with ID than in TD children. Out of the three clusters found in the ID group, the disruptive, and passive/withdrawn clusters were distinctly different from clusters found in the TD group in terms of temperament, while the cluster active and outgoing was similar in shape and level of temperament ratings of TD children. Children within the disruptive cluster were described to have more negative and less positive impacts on mothers compared to children within the other clusters in the ID group. Mothers who describe their children as having disruptive temperament may be at particular risk for experiencing higher parenting stress as they report that the child has higher negative and lower positive impacts than other parents describe. The absence of a relationship between child temperament profile and positive or negative impact on fathers may indicate that fathers are less affected by child temperament. However, this relationship needs to be further explored. Copyright © 2011 Elsevier Ltd. All rights reserved.
Pain Sensitivity Subgroups in Individuals With Spine Pain: Potential Relevance to Short-Term Clinical Outcome

PubMed Central

Bialosky, Joel E.; Robinson, Michael E.

2014-01-01

Background: Cluster analysis can be used to identify individuals similar in profile based on response to multiple pain sensitivity measures. There are limited investigations into how empirically derived pain sensitivity subgroups influence clinical outcomes for individuals with spine pain. Objective: The purposes of this study were: (1) to investigate empirically derived subgroups based on pressure and thermal pain sensitivity in individuals with spine pain and (2) to examine subgroup influence on 2-week clinical pain intensity and disability outcomes. Design: A secondary analysis of data from 2 randomized trials was conducted. Methods: Baseline and 2-week outcome data from 157 participants with low back pain (n=110) and neck pain (n=47) were examined. Participants completed demographic, psychological, and clinical information and were assessed using pain sensitivity protocols, including pressure (suprathreshold pressure pain) and thermal pain sensitivity (thermal heat threshold and tolerance, suprathreshold heat pain, temporal summation). A hierarchical agglomerative cluster analysis was used to create subgroups based on pain sensitivity responses. Differences in data for baseline variables, clinical pain intensity, and disability were examined. Results: Three pain sensitivity cluster groups were derived: low pain sensitivity, high thermal static sensitivity, and high pressure and thermal dynamic sensitivity. There were differences in the proportion of individuals meeting a 30% change in pain intensity, where fewer individuals within the high pressure and thermal dynamic sensitivity group (adjusted odds ratio=0.3; 95% confidence interval=0.1, 0.8) achieved successful outcomes. Limitations: Only 2-week outcomes are reported. Conclusions: Distinct pain sensitivity cluster groups for individuals with spine pain were identified, with the high pressure and thermal dynamic sensitivity group showing worse clinical outcome for pain intensity. Future studies should aim to confirm these findings. PMID:24764070
Is patient-grouping on basis of condition on admission indicative for discharge destination in geriatric stroke patients after rehabilitation in skilled nursing facilities? The results of a cluster analysis.

PubMed

Buijck, Bianca I; Zuidema, Sytse U; Spruit-van Eijk, Monica; Bor, Hans; Gerritsen, Debby L; Koopmans, Raymond T C M

2012-12-04

Geriatric stroke patients are generally frail, have an advanced age and co-morbidity. It is yet unclear whether specific groups of patients might benefit differently from structured multidisciplinary rehabilitation programs. Therefore, the aims of our study are (1) to determine relevant patient characteristics to distinguish groups of patients based on their admission scores in skilled nursing facilities (SNFs), and (2) to study the course of these particular patient-groups in relation to their discharge destination. This is a longitudinal, multicenter, observational study. We collected data on patient characteristics, balance, walking ability, arm function, co-morbidity, activities of daily living (ADL), neuropsychiatric symptoms, and depressive complaints of 127 geriatric stroke patients admitted to skilled nursing facilities with specific units for geriatric rehabilitation after stroke. Cluster analyses revealed two groups: cluster 1 included patients in poor condition upon admission (n = 52), and cluster 2 included patients in fair/good condition upon admission (n = 75). Patients in both groups improved in balance, walking abilities, and arm function. Patients in cluster 1 also improved in ADL. Depressive complaints decreased significantly in patients in cluster 1 who were discharged to an independent- or assisted-living situation. Compared to 80% of the patients in cluster 2, a lower proportion (46%) of the patients in cluster 1 were discharged to an independent- or assisted-living situation. Stroke patients referred for rehabilitation to SNFs could be clustered on the basis of their condition upon admission. Although patients in poor condition on admission were more likely to be referred to a facility for long-term care, this was certainly not the case in all patients. Almost half of them could be discharged to an independent or assisted living situation, which implied that also in patients in poor condition on admission, discharge to an independent or assisted living situation was an attainable goal. It is important to put substantial effort into the rehabilitation of patients in poor condition at admission.
Convex Clustering: An Attractive Alternative to Hierarchical Clustering

PubMed Central

Chen, Gary K.; Chi, Eric C.; Ranola, John Michael O.; Lange, Kenneth

2015-01-01

The primary goal in cluster analysis is to discover natural groupings of objects. The field of cluster analysis is crowded with diverse methods that make special assumptions about data and address different scientific aims. Despite its shortcomings in accuracy, hierarchical clustering is the dominant clustering method in bioinformatics. Biologists find the trees constructed by hierarchical clustering visually appealing and in tune with their evolutionary perspective. Hierarchical clustering operates on multiple scales simultaneously. This is essential, for instance, in transcriptome data, where one may be interested in making qualitative inferences about how lower-order relationships like gene modules lead to higher-order relationships like pathways or biological processes. The recently developed method of convex clustering preserves the visual appeal of hierarchical clustering while ameliorating its propensity to make false inferences in the presence of outliers and noise. The solution paths generated by convex clustering reveal relationships between clusters that are hidden by static methods such as k-means clustering. The current paper derives and tests a novel proximal distance algorithm for minimizing the objective function of convex clustering. The algorithm separates parameters, accommodates missing data, and supports prior information on relationships. Our program CONVEXCLUSTER incorporating the algorithm is implemented on ATI and nVidia graphics processing units (GPUs) for maximal speed. Several biological examples illustrate the strengths of convex clustering and the ability of the proximal distance algorithm to handle high-dimensional problems. CONVEXCLUSTER can be freely downloaded from the UCLA Human Genetics web site at http://www.genetics.ucla.edu/software/ PMID:25965340
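The abstract refers to "the objective function of convex clustering" without stating it. As a hedged illustration, the sketch below evaluates the formulation commonly given in the convex clustering literature, a data-fidelity term plus a weighted sum of centroid differences scaled by a tuning parameter. The function name, the `weights` dictionary layout, and `gamma` are assumptions made for this example; the paper's proximal distance algorithm itself is not implemented here.

```python
import math


def convex_clustering_objective(x, u, weights, gamma):
    """Evaluate 0.5 * sum_i ||x_i - u_i||^2 + gamma * sum_{i<j} w_ij * ||u_i - u_j||.

    x       : list of data points (each a list of floats)
    u       : list of per-point centroids, same shape as x
    weights : dict mapping index pairs (i, j), i < j, to non-negative w_ij
    gamma   : regularization strength; larger values fuse more centroids
    """
    # Fidelity: how far each centroid has moved from its own data point.
    fidelity = 0.5 * sum(
        sum((xi - ui) ** 2 for xi, ui in zip(x_row, u_row))
        for x_row, u_row in zip(x, u)
    )
    # Fusion penalty: weighted Euclidean distances between centroid pairs.
    fusion = sum(w * math.dist(u[i], u[j]) for (i, j), w in weights.items())
    return fidelity + gamma * fusion


# Two hypothetical points; compare "no fusion" with "fully fused at the mean".
x = [[0.0, 0.0], [3.0, 4.0]]
weights = {(0, 1): 1.0}
unfused = convex_clustering_objective(x, x, weights, gamma=0.5)
fused = convex_clustering_objective(x, [[1.5, 2.0], [1.5, 2.0]], weights, gamma=0.5)
```

Sweeping `gamma` from zero upward and minimizing at each value traces the solution path the abstract mentions: centroids start at their own data points and progressively merge.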
Wing morphometrics as a possible tool for the diagnosis of the Ceratitis fasciventris, C. anonae, C. rosa complex (Diptera, Tephritidae).

PubMed

Van Cann, Joannes; Virgilio, Massimiliano; Jordaens, Kurt; De Meyer, Marc

2015-01-01

Previous attempts to resolve the Ceratitis FAR complex (Ceratitis fasciventris, Ceratitis anonae, Ceratitis rosa, Diptera, Tephritidae) showed contrasting results and revealed the occurrence of five microsatellite genotypic clusters (A, F1, F2, R1, R2). In this paper we explore the potential of wing morphometrics for the diagnosis of FAR morphospecies and genotypic clusters. We considered a set of 227 specimens previously morphologically identified and genotyped at 16 microsatellite loci. Seventeen wing landmarks and 6 wing band areas were used for morphometric analyses. Permutational multivariate analysis of variance detected significant differences both across morphospecies and genotypic clusters (for both males and females). Unconstrained and constrained ordinations did not properly resolve groups corresponding to morphospecies or genotypic clusters. However, posterior group membership probabilities (PGMPs) of the Discriminant Analysis of Principal Components (DAPC) allowed the consistent identification of a relevant proportion of specimens (but with performances differing across morphospecies and genotypic clusters). This study suggests that wing morphometrics and PGMPs might represent a possible tool for the diagnosis of species within the FAR complex. Here, we propose a tentative diagnostic method and provide a first reference library of morphometric measures that might be used for the identification of additional and unidentified FAR specimens.
Within-Group Differences in Sexual Orientation and Identity

ERIC Educational Resources Information Center

Worthington, Roger L.; Reynolds, Amy L.

2009-01-01

The purpose of this investigation was to examine within-group differences among self-identified sexual orientation and identity groups. To understand these within-group differences, 2 types of analysis were conducted. First, a sample of 2,732 participants completed the Sexual Orientation and Identity Scale. Cluster analyses were used to identify 3…
Framing life and death on YouTube: the strategic communication of organ donation messages by organ procurement organizations.

PubMed

VanderKnyff, Jeremy; Friedman, Daniela B; Tanner, Andrea

2015-01-01

Using a sample of YouTube videos posted on the YouTube channels of organ procurement organizations, a content analysis was conducted to identify the frames used to strategically communicate prodonation messages. A total of 377 videos were coded for general characteristics, format, speaker characteristics, organs discussed, structure, problem definition, and treatment. Principal components analysis identified message frames, and k-means cluster analysis established distinct groupings of videos on the basis of the strength of their relationship to message frames. Analysis of these frames and clusters found that organ procurement organizations present multiple, and sometimes competing, video types and message frames on YouTube. This study serves as important formative research that will inform future studies to measure the effectiveness of the distinct message frames and clusters identified.
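The second stage of the analysis described here, k-means clustering on frame scores, can be sketched as follows. This is a minimal illustration of Lloyd's algorithm, not the study's procedure: the two-dimensional component scores and the starting centroids are hypothetical, and the function name is invented for this example.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with caller-supplied starting centroids.

    Returns (labels, centroids) once assignments stop changing the centroids.
    """
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])))
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical scores of six videos on two message-frame components.
scores = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
labels, centroids = kmeans(scores, [(0.0, 0.0), (10.0, 10.0)])
```

Fixing the starting centroids keeps the example deterministic; in practice k-means is usually run from several random starts because the result depends on initialization.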
Dietary patterns by cluster analysis in pregnant women: relationship with nutrient intakes and dietary patterns in 7-year-old offspring.

PubMed

Freitas-Vilela, Ana Amélia; Smith, Andrew D A C; Kac, Gilberto; Pearson, Rebecca M; Heron, Jon; Emond, Alan; Hibbeln, Joseph R; Castro, Maria Beatriz Trindade; Emmett, Pauline M

2017-04-01

Little is known about how dietary patterns of mothers and their children track over time. The objectives of this study are to obtain dietary patterns in pregnancy using cluster analysis, to examine women's mean nutrient intakes in each cluster and to compare the dietary patterns of mothers to those of their children. Pregnant women (n = 12 195) from the Avon Longitudinal Study of Parents and Children reported their frequency of consumption of 47 foods and food groups. These data were used to obtain dietary patterns during pregnancy by cluster analysis. The absolute and energy-adjusted nutrient intakes were compared between clusters. Women's dietary patterns were compared with previously derived clusters of their children at 7 years of age. Multinomial logistic regression was performed to evaluate relationships comparing maternal and offspring clusters. Three maternal clusters were identified: 'fruit and vegetables', 'meat and potatoes' and 'white bread and coffee'. After energy adjustment, women in the 'fruit and vegetables' cluster had the highest mean nutrient intakes. Mothers in the 'fruit and vegetables' cluster were more likely than mothers in 'meat and potatoes' (adjusted odds ratio [OR]: 2.00; 95% Confidence Interval [CI]: 1.69-2.36) or 'white bread and coffee' (OR: 2.18; 95% CI: 1.87-2.53) clusters to have children in a 'plant-based' cluster. However, the majority of children were in clusters unrelated to their mothers' dietary patterns. Three distinct dietary patterns were obtained in pregnancy, the 'fruit and vegetables' pattern being the most nutrient dense. Mothers' dietary patterns were associated with but did not dominate offspring dietary patterns. © 2016 The Authors. Maternal & Child Nutrition published by John Wiley & Sons Ltd.
Cluster analysis of sputum cytokine-high profiles reveals diversity in T(h)2-high asthma patients.

PubMed

Seys, Sven F; Scheers, Hans; Van den Brande, Paul; Marijsse, Gudrun; Dilissen, Ellen; Van Den Bergh, Annelies; Goeminne, Pieter C; Hellings, Peter W; Ceuppens, Jan L; Dupont, Lieven J; Bullens, Dominique M A

2017-02-23

Asthma is characterized by a heterogeneous inflammatory profile and can be subdivided into T(h)2-high and T(h)2-low airway inflammation. Profiling of a broader panel of airway cytokines in large unselected patient cohorts is lacking. Patients (n = 205) were defined as being "cytokine-low/high" if sputum mRNA expression of a particular cytokine was outside the respective 10th/90th percentile range of the control group (n = 80). Unsupervised hierarchical clustering was used to determine clusters based on sputum cytokine profiles. Half of patients (n = 108; 52.6%) had a classical T(h)2-high ("IL-4-, IL-5- and/or IL-13-high") sputum cytokine profile. Unsupervised cluster analysis revealed 5 clusters. Patients with an "IL-4- and/or IL-13-high" pattern surprisingly did not cluster but were equally distributed among the 5 clusters. Patients with an "IL-5-, IL-17A-/F- and IL-25-high" profile were restricted to cluster 1 (n = 24) with increased sputum eosinophil as well as neutrophil counts and poor lung function parameters at baseline and 2 years later. Four other clusters were identified: "IL-5-high or IL-10-high" (n = 16), "IL-6-high" (n = 8), "IL-22-high" (n = 25). Cluster 5 (n = 132) consists of patients without "cytokine-high" pattern or patients with only high IL-4 and/or IL-13. We identified 5 unique asthma molecular phenotypes by biological clustering. Type 2 cytokines cluster with non-type 2 cytokines in 4 out of 5 clusters. Unsupervised analysis thus does not support a priori type 2 versus non-type 2 molecular phenotypes. www.clinicaltrials.gov NCT01224938. Registered 18 October 2010.
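The "cytokine-low/high" rule in this abstract is a threshold on the control group's 10th and 90th percentiles, which the sketch below illustrates. The control values are hypothetical, the percentile interpolation method is an assumption (the abstract does not say which was used), and `cytokine_status` is an invented name.

```python
import statistics


def cytokine_status(value, control_values):
    """Label a patient value relative to the control 10th/90th percentiles."""
    # quantiles(n=10) returns the nine decile cut points; the first is the
    # 10th percentile and the last is the 90th.
    deciles = statistics.quantiles(control_values, n=10, method="inclusive")
    p10, p90 = deciles[0], deciles[-1]
    if value < p10:
        return "low"
    if value > p90:
        return "high"
    return "normal"


# Hypothetical control expression values 0..100 (illustrative only).
controls = list(range(101))
statuses = [cytokine_status(v, controls) for v in (5, 50, 95)]
```

Applying this per cytokine yields one low/normal/high label per patient and cytokine, the kind of profile on which the hierarchical clustering step would then operate.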
Empirically Derived Learning Disability Subtypes: A Replication Attempt and Longitudinal Patterns over 15 Years.

ERIC Educational Resources Information Center

Spreen, Otfried; Haaf, Robert G.

1986-01-01

Test scores of two groups of learning disabled children (N=63 and N=96) were submitted to cluster analysis in an attempt to replicate previously described subtypes. All three subtypes (visuo-perceptual, linguistic, and articulo-graphomotor types) were identified along with minimally and severely impaired subtypes. Similar clusters in the same…
Structure-related clustering of gene expression fingerprints of thp-1 cells exposed to smaller polycyclic aromatic hydrocarbons.

PubMed

Wan, B; Yarbrough, J W; Schultz, T W

2008-01-01

This study was undertaken to test the hypothesis that structurally similar PAHs induce similar gene expression profiles. THP-1 cells were exposed to a series of 12 selected PAHs at 50 µM for 24 hours and gene expression profiles were analyzed using both unsupervised and supervised methods. Clustering analysis of gene expression profiles revealed that the 12 tested chemicals were grouped into five clusters. Within each cluster, the gene expression profiles are more similar to each other than to the ones outside the cluster. 1-Methylanthracene and 1-methylfluorene were found to have the most similar profiles; dibenzothiophene and dibenzofuran were found to share common profiles with fluorene. As expression pattern comparisons were expanded, similarity in genomic fingerprint dropped off dramatically. Prediction Analysis of Microarrays (PAM) based on the clustering pattern generated 49 predictor genes that can be used for sample discrimination. Moreover, Significance Analysis of Microarrays (SAM) identified 598 genes modulated by the tested chemicals, with a variety of biological processes (such as cell cycle, metabolism, and protein binding) and KEGG pathways significantly (p < 0.05) affected. It is feasible to distinguish structurally different PAHs based on their genomic fingerprints, which are mechanism based.
Collective Emotions Online and Their Influence on Community Life

PubMed Central

Chmiel, Anna; Sienkiewicz, Julian; Thelwall, Mike; Paltoglou, Georgios; Buckley, Kevan; Kappas, Arvid; Hołyst, Janusz A.

2011-01-01

Background E-communities, social groups interacting online, have recently become an object of interdisciplinary research. As with face-to-face meetings, Internet exchanges may not only include factual information but also emotional information – how participants feel about the subject discussed or other group members. Emotions in turn are known to be important in affecting interaction partners in offline communication in many ways. Could emotions in Internet exchanges affect others and systematically influence quantitative and qualitative aspects of the trajectory of e-communities? The development of automatic sentiment analysis has made large scale emotion detection and analysis possible using text messages collected from the web. However, it is not clear if emotions in e-communities primarily derive from individual group members' personalities or if they result from intra-group interactions, and whether they influence group activities. Methodology/Principal Findings Here, for the first time, we show the collective character of affective phenomena on a large scale as observed in four million posts downloaded from Blogs, Digg and BBC forums. To test whether the emotions of a community member may influence the emotions of others, posts were grouped into clusters of messages with similar emotional valences. The frequency of long clusters was much higher than it would be if emotions occurred at random. Distributions for cluster lengths can be explained by preferential processes because conditional probabilities for consecutive messages grow as a power law with cluster length. For BBC forum threads, average discussion lengths were higher for larger values of absolute average emotional valence in the first ten comments and the average amount of emotion in messages fell during discussions. 
Conclusions/Significance Overall, our results prove that collective emotional states can be created and modulated via Internet communication and that emotional expressiveness is the fuel that sustains some e-communities. PMID:21818302
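The grouping step described above (clusters of consecutive messages with similar valence) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each post has already been assigned a discrete valence label (-1, 0 or 1), and the function names and the toy thread are hypothetical.

```python
from collections import Counter
from itertools import groupby


def cluster_lengths(valences):
    """Return (valence, run_length) for every maximal run of equal consecutive labels."""
    return [(valence, sum(1 for _ in run)) for valence, run in groupby(valences)]


def length_histogram(valences):
    """Count how many clusters of each length occur in one thread."""
    return Counter(length for _, length in cluster_lengths(valences))


# Hypothetical thread: 1 = positive, 0 = neutral, -1 = negative
thread = [1, 1, 1, -1, -1, 0, -1, -1, -1, -1]
runs = cluster_lengths(thread)
hist = length_histogram(thread)
```

Comparing such a histogram with the one obtained after shuffling the labels is one way to check whether long clusters are more frequent than chance would predict.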
Alerts in electronic medical records to promote a colorectal cancer screening programme: a cluster randomised controlled trial in primary care.

PubMed

Guiriguet, Carolina; Muñoz-Ortiz, Laura; Burón, Andrea; Rivero, Irene; Grau, Jaume; Vela-Vallespín, Carmen; Vilarrubí, Mercedes; Torres, Miquel; Hernández, Cristina; Méndez-Boo, Leonardo; Toràn, Pere; Caballeria, Llorenç; Macià, Francesc; Castells, Antoni

2016-07-01

Participation rates in colorectal cancer screening are below recommended European targets. To evaluate the effectiveness of an alert in primary care electronic medical records (EMRs) to increase individuals' participation in an organised, population-based colorectal cancer screening programme when compared with usual care. Cluster randomised controlled trial in primary care centres of Barcelona, Spain. Participants were males and females aged 50-69 years, who were invited to the first round of a screening programme based on the faecal immunochemical test (FIT) (n = 41 042), and their primary care professional. The randomisation unit was the physician cluster (n = 130) and patients were blinded to the study group. The control group followed usual care as per the colorectal cancer screening programme. In the intervention group, as well as usual care, an alert to health professionals (cluster level) to promote screening was introduced in the individual's primary care EMR for 1 year. The main outcome was colorectal cancer screening participation at individual participant level. In total, 67 physicians and 21 619 patients (intervention group) and 63 physicians and 19 423 patients (control group) were randomised. In the intention-to-treat analysis screening participation was 44.1% and 42.2% respectively (odds ratio 1.08, 95% confidence interval [CI] = 0.97 to 1.20, P = 0.146). However, in the per-protocol analysis screening uptake in the intervention group showed a statistically significant increase, after adjusting for potential confounders (OR, 1.11; 95% CI = 1.02 to 1.22; P = 0.018). The use of an alert in an individual's primary care EMR is associated with a statistically significant increased uptake of an organised, FIT-based colorectal cancer screening programme in patients attending primary care centres. © British Journal of General Practice 2016.
Phenotyping asthma, rhinitis and eczema in MeDALL population-based birth cohorts: an allergic comorbidity cluster.

PubMed

Garcia-Aymerich, J; Benet, M; Saeys, Y; Pinart, M; Basagaña, X; Smit, H A; Siroux, V; Just, J; Momas, I; Rancière, F; Keil, T; Hohmann, C; Lau, S; Wahn, U; Heinrich, J; Tischer, C G; Fantini, M P; Lenzi, J; Porta, D; Koppelman, G H; Postma, D S; Berdel, D; Koletzko, S; Kerkhof, M; Gehring, U; Wickman, M; Melén, E; Hallberg, J; Bindslev-Jensen, C; Eller, E; Kull, I; Lødrup Carlsen, K C; Carlsen, K-H; Lambrecht, B N; Kogevinas, M; Sunyer, J; Kauffmann, F; Bousquet, J; Antó, J M

2015-08-01

Asthma, rhinitis and eczema often co-occur in children, but their interrelationships at the population level have been poorly addressed. We assessed co-occurrence of childhood asthma, rhinitis and eczema using unsupervised statistical techniques. We included 17 209 children at 4 years and 14 585 at 8 years from seven European population-based birth cohorts (MeDALL project). At each age period, children were grouped, using partitioning cluster analysis, according to the distribution of 23 variables covering symptoms 'ever' and 'in the last 12 months', doctor diagnosis, age of onset and treatments of asthma, rhinitis and eczema; immunoglobulin E sensitization; weight; and height. We tested the sensitivity of our estimates to subject and variable selections, and to different statistical approaches, including latent class analysis and self-organizing maps. Two groups were identified as the optimal way to cluster the data at both age periods and in all sensitivity analyses. The first (reference) group at 4 and 8 years (including 70% and 79% of children, respectively) was characterized by a low prevalence of symptoms and sensitization, whereas the second (symptomatic) group exhibited more frequent symptoms and sensitization. Ninety-nine percent of children with comorbidities (co-occurrence of asthma, rhinitis and/or eczema) were included in the symptomatic group at both ages. The children's characteristics in both groups were consistent in all sensitivity analyses. At 4 and 8 years, at the population level, asthma, rhinitis and eczema can be classified together as an allergic comorbidity cluster. Future research including time-repeated assessments and biological data will help in understanding the interrelationships between these diseases. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

Phenotypic and Genotypic Characterization of Non-Starter Lactic Acid Bacteria in Mature Cheddar Cheese

PubMed Central

Fitzsimons, N. A.; Cogan, T. M.; Condon, S.; Beresford, T.

1999-01-01

Non-starter lactic acid bacteria were isolated from 14 premium-quality and 3 sensorially defective mature Irish Cheddar cheeses, obtained from six manufacturers. From countable plates of Lactobacillus-selective agar, 20 single isolated colonies were randomly picked per cheese. All 331 viable isolates were biochemically characterized as mesophilic (i.e., group II) Lactobacillus spp. Phenotypically, the isolates comprised 96.4% L. paracasei, 2.1% L. plantarum, 0.3% L. curvatus, 0.3% L. brevis, and 0.9% unidentified species. Randomly amplified polymorphic DNA (RAPD) analysis was used to rapidly identify the dominant strain groups in nine cheeses from three of the factories, and through clustering by the unweighted pair group method with arithmetic averages, an average of seven strains were found per cheese. In general, strains isolated from cheese produced at the same factory clustered together. The majority of isolates associated with premium-quality cheese grouped together and apart from clusters of strains from defective-quality cheese. No correlation was found between the isomer of lactate produced and RAPD profiles, although isolates which did not ferment ribose clustered together. The phenotypic and genotypic methods employed were validated with a selection of 31 type and reference strains of mesophilic Lactobacillus spp. commonly found in Cheddar cheese. RAPD analysis was found to be a useful and rapid method for identifying isolates to the species level. The low homology exhibited between RAPD banding profiles for cheese isolates and collection strains demonstrated the heterogeneity of the L. paracasei complex. PMID:10427029
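The unweighted pair group method with arithmetic averages (UPGMA) named above is average-linkage agglomerative clustering. The sketch below assumes a pairwise distance matrix between isolates is already available (for RAPD data this would typically be derived from band sharing); the matrix values and the function name are hypothetical.

```python
from itertools import combinations


def upgma(dist, n_clusters):
    """Merge clusters by smallest average pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def average_distance(a, b):
        # Arithmetic mean of all distances between members of a and members of b
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical distances between four isolates (0 = identical banding profile)
distances = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
groups = upgma(distances, n_clusters=2)
```

The naive search over all cluster pairs is kept for readability; it is adequate for a few hundred isolates but not for large data sets.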
ASTM clustering for improving coal analysis by near-infrared spectroscopy.

PubMed

Andrés, J M; Bona, M T

2006-11-15

Multivariate analysis techniques have been applied to near-infrared (NIR) spectra of coals to investigate the relationship between nine coal properties (moisture (%), ash (%), volatile matter (%), fixed carbon (%), heating value (kcal/kg), carbon (%), hydrogen (%), nitrogen (%) and sulphur (%)) and the corresponding predictor variables. In this work, a whole set of coal samples was grouped into six more homogeneous clusters following the ASTM reference method for classification prior to the application of calibration methods to each coal set. The results obtained showed a considerable improvement in the determination error compared with the calibration for the whole sample set. For some groups, the established calibrations approached the quality required by the ASTM/ISO norms for laboratory analysis. To predict property values for a new coal sample, that sample must first be assigned to its respective group. Thus, the ability of Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) in the NIR range to discriminate and classify coal samples was also studied by applying Soft Independent Modelling of Class Analogy (SIMCA) and Linear Discriminant Analysis (LDA) techniques. Modelling of the groups by SIMCA led to overlapping models that cannot discriminate for unique classification. On the other hand, the application of Linear Discriminant Analysis improved the classification of the samples but not enough to be satisfactory for every group considered.
Principal component and clustering analysis on molecular dynamics data of the ribosomal L11·23S subdomain.

PubMed

Wolf, Antje; Kirschner, Karl N

2013-02-01

With improvements in computer speed and algorithm efficiency, MD simulations are sampling larger amounts of molecular and biomolecular conformations. Being able to qualitatively and quantitatively sift these conformations into meaningful groups is a difficult and important task, especially when considering the structure-activity paradigm. Here we present a study that combines two popular techniques, principal component (PC) analysis and clustering, for revealing major conformational changes that occur in molecular dynamics (MD) simulations. Specifically, we explored how clustering different PC subspaces affects the resulting clusters versus clustering the complete trajectory data. As a case example, we used the trajectory data from an explicitly solvated simulation of a bacterium's L11·23S ribosomal subdomain, which is a target of thiopeptide antibiotics. Clustering was performed, using K-means and average-linkage algorithms, on data involving the first two to the first five PC subspace dimensions. For the average-linkage algorithm we found that data-point membership, cluster shape, and cluster size depended on the selected PC subspace data. In contrast, K-means provided very consistent results regardless of the selected subspace. Since we present results on a single model system, generalization concerning the clustering of different PC subspaces of other molecular systems is currently premature. However, our hope is that this study illustrates a) the complexities in selecting the appropriate clustering algorithm, b) the complexities in interpreting and validating their results, and c) the valuable dynamic and conformational information that can be obtained by combining PC analysis with subsequent clustering.
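The comparison described above (K-means run on the first two, three, ... PC dimensions) can be sketched as below. This is an illustration under stated assumptions: the PC scores are taken as already computed, the toy scores and seed indices are hypothetical, and a plain Lloyd's K-means with fixed seeds stands in for whatever implementation the authors used.

```python
import math


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's algorithm; returns the final labels and centroids."""
    for _ in range(n_iter):
        # Assign every point to its nearest centroid
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members
        updated = []
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def cluster_in_subspace(pc_scores, n_dims, seeds):
    """Cluster frames using only their first n_dims principal-component scores."""
    projected = [tuple(row[:n_dims]) for row in pc_scores]
    labels, _ = kmeans(projected, [projected[i] for i in seeds])
    return labels


# Hypothetical PC scores for six trajectory frames (columns: PC1, PC2, PC3)
pc_scores = [
    (-2.0, 0.1, 0.3), (-1.8, -0.2, -0.4), (-2.2, 0.0, 0.1),
    (2.1, 0.2, -0.2), (1.9, -0.1, 0.5), (2.0, 0.0, 0.0),
]
labels_2d = cluster_in_subspace(pc_scores, n_dims=2, seeds=(0, 3))
labels_3d = cluster_in_subspace(pc_scores, n_dims=3, seeds=(0, 3))
```

Comparing `labels_2d` with `labels_3d` shows whether cluster membership is stable as further PC dimensions are added.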
[Autism Spectrum Disorder and DSM-5: Spectrum or Cluster?].

PubMed

Kienle, Xaver; Freiberger, Verena; Greulich, Heide; Blank, Rainer

2015-01-01

Within the new DSM-5, the currently differentiated subgroups of "Autistic Disorder" (299.0), "Asperger's Disorder" (299.80) and "Pervasive Developmental Disorder" (299.80) are replaced by the more general "Autism Spectrum Disorder". With regard to patient-oriented and expedient counselling and therapy planning, however, the issue of an empirically reproducible and clinically feasible differentiation into subgroups must still be raised. Based on two autism rating scales (ASDS and FSK), an exploratory two-step cluster analysis was conducted with N=103 children (age: 5-18) seen in our social-pediatric health care centre to examine potentially autistic symptoms. In the two-cluster solution of both rating scales, mainly the problems in social communication grouped the children into a cluster "with communication problems" (51% and 41%), and a cluster "without communication problems". Within the three-cluster solution of the ASDS, sensory hypersensitivity, adherence to routines and social-communicative problems generated an "autistic" subgroup (22%). The children of the second cluster ("communication problems", 35%) were only described by social-communicative problems, and the third group did not show any problems (38%). In the three-cluster solution of the FSK, the "autistic cluster" of the two-cluster solution differentiated into a subgroup with mainly social-communicative problems (cluster 1) and a second subgroup described by restrictive, repetitive behavior. The different cluster solutions are discussed with a view to the new DSM-5 diagnostic criteria. For future studies, a further specification of some of the ASDS and FSK items could be helpful.
Genetic diversity and variation of Chinese fir from Fujian province and Taiwan, China, based on ISSR markers

PubMed Central

Chen, Yu; Peng, Zhuqing; Wu, Chao; Ma, Zhihui; Ding, Guochang; Cao, Guangqiu; Ruan, Shaoning; Lin, Sizu

2017-01-01

Genetic diversity and variation among 11 populations of Chinese fir from Fujian province and Taiwan were assessed using inter-simple sequence repeat (ISSR) markers to reveal the evolutionary relationships across their distribution range. Analysis of genetic parameters of the different populations showed that populations in Fujian province exhibited a greater level of genetic diversity than did the populations in Taiwan. Compared to Taiwan populations, significantly limited gene flow was observed among Fujian populations. A UPGMA cluster analysis showed that most individuals of the Taiwan populations formed a single cluster, whereas 6 discrete clusters were formed by each population from Fujian. All populations were divided into 3 main groups; all 5 populations from Taiwan were gathered into a subgroup that, combined with 2 populations, Dehua and Liancheng, formed one of the 3 main groups, which indicated relatively stronger relatedness. This is supported by a genetic structure analysis. All these results suggest different levels of genetic diversity and variation of Chinese fir between Fujian and Taiwan, and indicate different patterns of evolutionary process and local environmental adaptation. PMID:28406956
Clusters, Groups, and Filaments in the Chandra Deep Field-South up to Redshift 1

NASA Astrophysics Data System (ADS)

Dehghan, S.; Johnston-Hollitt, M.

2014-03-01

We present a comprehensive structure detection analysis of the 0.3 deg² area of the MUSYC-ACES field, which covers the Chandra Deep Field-South (CDFS). Using a density-based clustering algorithm on the MUSYC and ACES photometric and spectroscopic catalogs, we find 62 overdense regions up to redshifts of 1, including clusters, groups, and filaments. We also present the detection of a relatively small void of ~10 Mpc² at z ~ 0.53. All structures are confirmed using the DBSCAN method, including the detection of nine structures previously reported in the literature. We present a catalog of all structures present, including their central position, mean redshift, velocity dispersions, and classification based on their morphological and spectroscopic distributions. In particular, we find 13 galaxy clusters and 6 large groups/small clusters. Comparison of these massive structures with published XMM-Newton imaging (where available) shows that 80% of these structures are associated with diffuse, soft-band (0.4-1 keV) X-ray emission, including 90% of all objects classified as clusters. The presence of soft-band X-ray emission in these massive structures (M200 ≥ 4.9 × 10¹³ M⊙) provides a strong independent confirmation of our methodology and classification scheme. In the closest two clusters identified (z < 0.13) high-quality optical imaging from the Deep2c field of the Garching-Bonn Deep Survey reveals the cD galaxies and demonstrates that they sit at the center of the detected X-ray emission. Nearly 60% of the clusters, groups, and filaments are detected in the known enhanced density regions of the CDFS at z ≈ 0.13, 0.52, 0.68, and 0.73. Additionally, all of the clusters, bar the most distant, are found in these overdense redshift regions.
Many of the clusters and groups exhibit signs of ongoing formation seen in their velocity distributions, position within the detected cosmic web, and in one case through the presence of tidally disrupted central galaxies exhibiting trails of stars. These results all provide strong support for hierarchical structure formation up to redshifts of 1.
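The density-based approach mentioned in this record (DBSCAN) can be illustrated with a textbook implementation on toy 2-D positions. This is a sketch, not the authors' pipeline: the point coordinates, `eps` and `min_pts` values are hypothetical, and a real catalog analysis would also have to handle redshift and sky geometry.

```python
import math

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it lies in a sparse region."""
    labels = [None] * len(points)
    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:  # core point: keep growing the cluster
                queue.extend(j_neighbours)
        cluster += 1
    return labels


# Two hypothetical overdensities plus one isolated object
positions = [(0, 0), (0, 1), (1, 0), (1, 1),
             (10, 10), (10, 11), (11, 10), (11, 11),
             (5, 5)]
labels = dbscan(positions, eps=1.5, min_pts=3)
```

Unlike K-means, this method does not need the number of structures in advance and leaves isolated objects unassigned, which suits the search for overdense regions.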
Exploratory analysis of textual data from the Mother and Child Handbook using a text mining method (II): Monthly changes in the words recorded by mothers.

PubMed

Tagawa, Miki; Matsuda, Yoshio; Manaka, Tomoko; Kobayashi, Makiko; Ohwada, Michitaka; Matsubara, Shigeki

2017-01-01

The aim of the study was to examine the possibility of converting subjective textual data written in the free column space of the Mother and Child Handbook (MCH) into objective information using text mining and to compare any monthly changes in the words written by the mothers. Pregnant women without complications (n = 60) were divided into two groups according to State-Trait Anxiety Inventory grade: low trait anxiety (group I, n = 39) and high trait anxiety (group II, n = 21). Exploratory analysis of the textual data from the MCH was conducted by text mining using the Word Miner software program. Using 1203 structural elements extracted after processing, a comparison of monthly changes in the words used in the mothers' comments was made between the two groups. The data was mainly analyzed by a correspondence analysis. The structural elements in groups I and II were divided into seven and six clusters, respectively, by cluster analysis. Correspondence analysis revealed clear monthly changes in the words used in the mothers' comments as the pregnancy progressed in group I, whereas the association was not clear in group II. The text mining method was useful for exploratory analysis of the textual data obtained from pregnant women, and the monthly change in the words used in the mothers' comments as pregnancy progressed differed according to their degree of unease. © 2016 Japan Society of Obstetrics and Gynecology.
[Differences in living conditions and health between cities: construction of a composite indicator].

PubMed

Luiz, Olinda do Carmo; Heimann, Luiza Sterman; Boaretto, Roberta Cristina; Pacheco, Adriana Galvão; Pessoto, Umberto Catarino; Ibanhes, Lauro Cesar; Castro, Iracema Ester do Nascimento; Kayano, Jorge; Junqueira, Virginia; Rocha, Jucilene Leite da; Cortizo, Carlos Tato; Telesi Junior, Emílio

2009-02-01

To describe an index to identify inequities in living conditions and health and its relationship with health planning. Variables and indicators reflecting demographic, economic, environmental and educational processes, as well as the supply and production of health services, were used for nondimensional scaling and clustering of 5,507 Brazilian municipalities. Data sources were the 2000 Census and the Brazilian Ministry of Health information systems. Z-score test statistic and cluster analysis were performed, allowing 4 groups of municipalities to be defined by living conditions. A polarization was seen between the group with the best living conditions and health (Group 1) and the group with the worst living conditions (Group 4). Group 1 consisted of municipalities with larger populations, while Group 4 comprised mainly the smallest municipalities. As for Brazilian macroregions, municipalities in Group 1 are clustered in the South and Southeast and those in Group 4 are in the Northeast. The living conditions and health index comprises dimensions of reality such as housing, environment and health, which allows the most vulnerable municipalities to be identified and can provide input for setting priorities and developing criteria for more equitable financing and resource allocation.
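Indicators measured in different units are usually put on a common scale before clustering. The sketch below shows z-score standardization as one plausible reading of the Z-score step mentioned above; the function names and the sample values are hypothetical, and the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one indicator to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant indicator carries no information
    return [(v - mu) / sigma for v in values]


def standardize_columns(table):
    """Apply z_scores to every column of a rows-by-indicators table."""
    scaled_columns = [z_scores(column) for column in zip(*table)]
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical values of one indicator across eight municipalities
indicator = [2, 4, 4, 4, 5, 5, 7, 9]
scaled_indicator = z_scores(indicator)
```

After this step, each row of standardized indicators can be passed to a clustering routine without any single indicator dominating the distance.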
Phylogeny of isolates of Prunus necrotic ringspot virus from the Ilarvirus Ringtest and identification of group-specific features.

PubMed

Hammond, R W

2003-06-01

Isolates of Prunus necrotic ringspot virus (PNRSV) were examined to establish the level of naturally occurring sequence variation in the coat protein (CP) gene and to identify group-specific genome features that may prove valuable for the generation of diagnostic reagents. Phylogenetic analysis of a 452 bp sequence of 68 virus isolates, 20 obtained from the European Union Ilarvirus Ringtest held in October 1998, confirmed the clustering of the isolates into three distinct groups. Although no correlation was found between the sequence and host or geographic origin, there was a general trend for severe isolates to cluster into one group. Group-specific features have been identified for discrimination between virus strains.
Preventive behaviors by the level of perceived infection sensitivity during the Korea outbreak of Middle East Respiratory Syndrome in 2015.

PubMed

Lee, Soon Young; Yang, Hee Jeong; Kim, Gawon; Cheong, Hae-Kwan; Choi, Bo Youl

2016-01-01

This study was performed to investigate the relationship between community residents' infection sensitivity and their levels of preventive behaviors during the 2015 Middle East Respiratory Syndrome (MERS) outbreak in Korea. A total of 7,281 participants from nine areas in Gyeonggi-do, including Pyeongtaek, the origin of the outbreak in 2015, agreed to participate in the survey, and the data from 6,739 participants were included in the final analysis. The data on the perceived infection sensitivity were subjected to cluster analysis. The levels of stress, reliability/practice of preventive behaviors, hand washing practice and policy credibility during the outbreak period were analyzed for each cluster. Cluster analysis of infection sensitivity due to the MERS outbreak resulted in classification of participants into four groups: the non-sensitive group (14.5%), social concern group (17.4%), neutral group (29.1%), and overall sensitive group (39.0%). A logistic regression analysis found that the overall sensitive group with high sensitivity had higher stress levels (17.80; 95% confidence interval [CI], 13.77 to 23.00), higher reliability on preventive behaviors (5.81; 95% CI, 4.84 to 6.98), higher practice of preventive behaviors (4.53; 95% CI, 3.83 to 5.37) and higher practice of hand washing (2.71; 95% CI, 2.13 to 3.43) during the outbreak period, compared to the non-sensitive group. Infection sensitivity of community residents during the MERS outbreak correlated with gender, age, occupation, and health behaviors. When there is an outbreak in the community, there is a need to maintain a certain level of sensitivity while reducing excessive stress, as well as to promote the practice of preventive behaviors among local residents. In particular, target groups need to be notified and policies need to be established with consideration of the socio-demographic characteristics of the community.
Manipulating measurement scales in medical statistical analysis and data mining: A review of methodologies

PubMed Central

Marateb, Hamid Reza; Mansourian, Marjan; Adibi, Peyman; Farina, Dario

2014-01-01

Background: selecting the correct statistical test and data mining method depends highly on the measurement scale of data, type of variables, and purpose of the analysis. Different measurement scales are studied in detail, and statistical comparison, modeling, and data mining methods are studied using several medical examples. We present two ordinal-variable clustering examples, ordinal variables being the more challenging type in analysis, using the Wisconsin Breast Cancer Data (WBCD). Ordinal-to-Interval scale conversion example: a breast cancer database of nine 10-level ordinal variables for 683 patients was analyzed by two ordinal-scale clustering methods. The performance of the clustering methods was assessed by comparison with the gold standard groups of malignant and benign cases that had been identified by clinical tests. Results: the sensitivity and accuracy of the two clustering methods were 98% and 96%, respectively. Their specificity was comparable. Conclusion: by using an appropriate clustering algorithm based on the measurement scale of the variables in the study, high performance can be achieved. Moreover, descriptive and inferential statistics, in addition to the modeling approach, must be selected based on the scale of the variables. PMID:24672565
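Assessing a clustering against gold-standard groups, as described above, comes down to a 2×2 confusion table. The sketch below assumes each cluster has already been mapped to "malignant" or "benign"; the function name and the ten toy cases are hypothetical and are not the WBCD data.

```python
def confusion_metrics(predicted, truth, positive="malignant"):
    """Sensitivity, specificity and accuracy of predicted labels against the gold standard."""
    pairs = list(zip(predicted, truth))
    tp = sum(p == positive and t == positive for p, t in pairs)
    tn = sum(p != positive and t != positive for p, t in pairs)
    fp = sum(p == positive and t != positive for p, t in pairs)
    fn = sum(p != positive and t == positive for p, t in pairs)
    return {
        "sensitivity": tp / (tp + fn),  # share of true positives that were found
        "specificity": tn / (tn + fp),  # share of true negatives that were found
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical gold standard and cluster-derived labels for ten cases
truth = ["malignant"] * 4 + ["benign"] * 6
predicted = ["malignant"] * 3 + ["benign"] + ["benign"] * 5 + ["malignant"]
metrics = confusion_metrics(predicted, truth)
```

Here three of the four malignant cases and five of the six benign cases are recovered, giving a sensitivity of 0.75 and an accuracy of 0.8.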
A Dimensionally Reduced Clustering Methodology for Heterogeneous Occupational Medicine Data Mining.

PubMed

Saâdaoui, Foued; Bertrand, Pierre R; Boudet, Gil; Rouffiac, Karine; Dutheil, Frédéric; Chamoux, Alain

2015-10-01

Clustering is a set of statistical learning techniques aimed at finding structures of heterogeneous partitions grouping homogeneous data called clusters. There are several fields in which clustering has been successfully applied, such as medicine, biology, finance and economics. In this paper, we introduce the notion of clustering in multifactorial data analysis problems. A case study is conducted for an occupational medicine problem with the purpose of analyzing patterns in a population of 813 individuals. To reduce the data set dimensionality, we base our approach on Principal Component Analysis (PCA), which is the statistical tool most commonly used in factorial analysis. However, real-world problems, especially in medicine, are often based on heterogeneous qualitative-quantitative measurements, whereas PCA only processes quantitative ones. Besides, qualitative data are originally unobservable quantitative responses that are usually binary-coded. Hence, we propose a new set of strategies that allow quantitative and qualitative data to be handled simultaneously. The principle of this approach is to perform a projection of the qualitative variables on the subspaces spanned by quantitative ones. Subsequently, an optimal model is allocated to the resulting PCA-regressed subspaces.
InCHlib - interactive cluster heatmap for web applications.

PubMed

Skuta, Ctibor; Bartůněk, Petr; Svozil, Daniel

2014-12-01

Hierarchical clustering is an exploratory data analysis method that reveals the groups (clusters) of similar objects. The result of the hierarchical clustering is a tree structure called a dendrogram that shows the arrangement of individual clusters. To investigate the row/column hierarchical cluster structure of a data matrix, a visualization tool called 'cluster heatmap' is commonly employed. In the cluster heatmap, the data matrix is displayed as a heatmap, a 2-dimensional array in which the colour of each element corresponds to its value. The rows/columns of the matrix are ordered such that similar rows/columns are near each other. The ordering is given by the dendrogram which is displayed on the side of the heatmap. We developed InCHlib (Interactive Cluster Heatmap Library), a highly interactive and lightweight JavaScript library for cluster heatmap visualization and exploration. InCHlib enables the user to select individual or clustered heatmap rows, to zoom in and out of clusters or to flexibly modify heatmap appearance. The cluster heatmap can be augmented with additional metadata displayed in a different colour scale. In addition, to further enhance the visualization, the cluster heatmap can be interconnected with external data sources or analysis tools. Data clustering and the preparation of the input file for InCHlib is facilitated by the Python utility script inchlib_clust. The cluster heatmap is one of the most popular visualizations of large chemical and biomedical data sets originating, e.g., in high-throughput screening, genomics or transcriptomics experiments. The presented JavaScript library InCHlib is a client-side solution for cluster heatmap exploration. InCHlib can be easily deployed into any modern web application and configured to cooperate with external tools and data sources.
Though InCHlib is primarily intended for the analysis of chemical or biological data, it is a versatile tool whose application domain is not limited to the life sciences.
Effects of Writing Instruction on Kindergarten Students' Writing Achievement: An Experimental Study

ERIC Educational Resources Information Center

Jones, Cindy D'On

2015-01-01

This full-year experimental study examined how methods of writing instruction contribute to kindergarten students' acquisition of foundational and compositional early writing skills. Multiple regression with cluster analysis was used to compare 3 writing instructional groups: an interactive writing group, a writing workshop group, and a…
An Analysis of Peer Feedback Exchanged in Group Supervision

ERIC Educational Resources Information Center

Wahesh, Edward; Kemer, Gulsah; Willis, Ben T.; Schmidt, Christopher D.

2017-01-01

The authors examined the peer feedback exchanged in 2 supervision groups of counselors-in-training (CITs). CITs generated 169 statements grouped into 10 clusters representing 5 regions of peer feedback: counselor focus and engagement, insight-oriented skills, exploratory skills, therapeutic alliance, and intervention activities. Both positive and…
Visual reconciliation of alternative similarity spaces in climate modeling

Treesearch

J Poco; A Dasgupta; Y Wei; William Hargrove; C.R. Schwalm; D.N. Huntzinger; R Cook; E Bertini; C.T. Silva

2015-01-01

Visual data analysis often requires grouping of data objects based on their similarity. In many application domains researchers use algorithms and techniques like clustering and multidimensional scaling to extract groupings from data. While extracting these groups using a single similarity criteria is relatively straightforward, comparing alternative criteria poses...
Automated rice leaf disease detection using color image analysis

NASA Astrophysics Data System (ADS)

Pugoy, Reinald Adrian D. L.; Mariano, Vladimir Y.

2011-06-01

In rice-related institutions such as the International Rice Research Institute, assessing the health condition of a rice plant through its leaves, which is usually done as a manual eyeball exercise, is important to come up with good nutrient and disease management strategies. In this paper, an automated system that can detect diseases present in a rice leaf using color image analysis is presented. In the system, the outlier region is first obtained from a rice leaf image to be tested using histogram intersection between the test and healthy rice leaf images. Upon obtaining the outlier, it is then subjected to a threshold-based K-means clustering algorithm to group related regions into clusters. Then, these clusters are subjected to further analysis to finally determine the suspected diseases of the rice leaf.
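The histogram intersection used above to compare a test leaf with a healthy leaf can be sketched as follows. This is a minimal illustration assuming colour histograms with identical binning have already been extracted; the three-bin counts and function names are hypothetical.

```python
def normalise(hist):
    """Scale bin counts so that they sum to 1."""
    total = sum(hist)
    return [count / total for count in hist]


def histogram_intersection(hist_a, hist_b):
    """Overlap of two histograms: 1.0 for identical shapes, 0.0 for disjoint ones."""
    a, b = normalise(hist_a), normalise(hist_b)
    return sum(min(x, y) for x, y in zip(a, b))


# Hypothetical three-bin colour histograms (pixel counts per bin)
healthy = [50, 30, 20]
test_leaf = [20, 30, 50]
score = histogram_intersection(healthy, test_leaf)
```

Bins where the test histogram exceeds the healthy one by a wide margin point to colours that are over-represented in the test leaf, which is one way to locate an outlier region before clustering it.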
Analysis of Genetic Diversity and Structure Pattern of Indigofera Pseudotinctoria in Karst Habitats of the Wushan Mountains Using AFLP Markers.

PubMed

Fan, Yan; Zhang, Chenglin; Wu, Wendan; He, Wei; Zhang, Li; Ma, Xiao

2017-10-16

Indigofera pseudotinctoria Mats is an agronomically and economically important perennial legume shrub with a high forage yield, protein content and strong adaptability, which is subject to natural habitat fragmentation and serious human disturbance. Until now, our knowledge of the genetic relationships and intraspecific genetic diversity for its wild collections is still poor, especially at small spatial scales. Here amplified fragment length polymorphism (AFLP) technology was employed for analysis of genetic diversity, differentiation, and structure of 364 genotypes of I. pseudotinctoria from 15 natural locations in Wushan Mountain, a highly structured mountain with typical karst landforms in Southwest China. We also tested whether eco-climate factors have affected genetic structure by correlating genetic diversity with habitat features. A total of 515 distinctly scoreable bands were generated, and 324 of them were polymorphic. The polymorphic information content (PIC) ranged from 0.694 to 0.890 with an average of 0.789 per primer pair. At the species level, Nei's gene diversity (Hj), the Bayesian genetic diversity index (HB) and the Shannon information index (I) were 0.2465, 0.2363 and 0.3772, respectively. High differentiation among all sampling sites was detected (FST = 0.2217, GST = 0.1746, G'ST = 0.2060, θB = 0.1844), and gene flow among accessions (Nm = 1.1819) was restricted. The population genetic structure resolved by the UPGMA tree, principal coordinate analysis, and Bayesian-based cluster analyses irrefutably grouped all accessions into two distinct clusters, i.e., lowland and highland groups. This structure pattern may indicate joint effects of neutral evolution and natural selection.
Restricted Nm was observed across all accessions, and genetic barriers were detected between adjacent accessions due to the specific geographical landform.
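
The reported gene-flow estimate can be related to the differentiation statistics with a short calculation. The sketch below assumes the classical island-model approximation Nm = (1 - GST) / (4 GST); the abstract does not say which formula or software the authors used, so this is an illustration only. With the reported GST of 0.1746 it yields a value of about 1.18, close to the reported Nm of 1.1819. The function name is hypothetical.

```python
def gene_flow_from_gst(gst):
    """Estimate Nm (effective migrants per generation) from GST.

    Assumes the island-model approximation Nm = (1 - GST) / (4 * GST).
    This formula is an assumption; the abstract does not name the one used.
    """
    if not 0 < gst < 1:
        raise ValueError("GST must lie strictly between 0 and 1")
    return (1 - gst) / (4 * gst)


nm = gene_flow_from_gst(0.1746)  # GST value reported in the abstract
print(f"Nm is roughly {nm:.2f}")
```

Under this approximation, larger GST values give smaller Nm, which matches the abstract's reading of high differentiation alongside restricted gene flow.
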
A K-means multivariate approach for clustering independent components from magnetoencephalographic data.

PubMed

Spadone, Sara; de Pasquale, Francesco; Mantini, Dante; Della Penna, Stefania

2012-09-01

Independent component analysis (ICA) is typically applied on functional magnetic resonance imaging, electroencephalographic and magnetoencephalographic (MEG) data due to its data-driven nature. In these applications, ICA needs to be extended from single to multi-session and multi-subject studies for interpreting and assigning a statistical significance at the group level. Here a novel strategy for analyzing MEG independent components (ICs) is presented, Multivariate Algorithm for Grouping MEG Independent Components K-means based (MAGMICK). The proposed approach is able to capture spatio-temporal dynamics of brain activity in MEG studies by running ICA at subject level and then clustering the ICs across sessions and subjects. Distinctive features of MAGMICK are: i) the implementation of an efficient set of "MEG fingerprints" designed to summarize properties of MEG ICs as they are built on spatial, temporal and spectral parameters; ii) the implementation of a modified version of the standard K-means procedure to improve its data-driven character. This algorithm groups the obtained ICs automatically estimating the number of clusters through an adaptive weighting of the parameters and a constraint on the ICs independence, i.e. components coming from the same session (at subject level) or subject (at group level) cannot be grouped together. The performances of MAGMICK are illustrated by analyzing two sets of MEG data obtained during a finger tapping task and median nerve stimulation. The results demonstrate that the method can extract consistent patterns of spatial topography and spectral properties across sessions and subjects that are in good agreement with the literature. In addition, these results are compared to those from a modified version of affinity propagation clustering method. The comparison, evaluated in terms of different clustering validity indices, shows that our methodology often outperforms the clustering algorithm. 
Finally, these results are confirmed by a comparison with a MEG-tailored version of the self-organizing group ICA, which is widely used for fMRI IC clustering. Copyright © 2012 Elsevier Inc. All rights reserved.
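
The independence constraint described above (components from the same session or subject cannot share a cluster) can be illustrated with a constrained assignment step. The following is a minimal sketch, not the MAGMICK algorithm itself: the greedy nearest-first rule, the function name, and the toy feature vectors are all assumptions made for illustration, and the adaptive weighting of parameters is not modelled.

```python
import math


def assign_with_session_constraint(points, sessions, centroids):
    """Assign each point to a centroid so that no two points from the
    same session land in the same cluster (greedy, nearest pairs first).

    Points left over when a session has more members than there are
    clusters keep the label None.
    """
    labels = [None] * len(points)
    by_session = {}
    for idx, session in enumerate(sessions):
        by_session.setdefault(session, []).append(idx)

    for members in by_session.values():
        # Every (distance, point index, cluster index) pair, closest first.
        pairs = sorted(
            (math.dist(points[i], centroid), i, k)
            for i in members
            for k, centroid in enumerate(centroids)
        )
        used_clusters = set()
        for _, i, k in pairs:
            if labels[i] is None and k not in used_clusters:
                labels[i] = k
                used_clusters.add(k)
    return labels


# Hypothetical 2-D "fingerprints": two components from session s1, one from s2.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
session_ids = ["s1", "s1", "s2"]
centers = [(0.0, 0.0), (5.0, 5.0)]
print(assign_with_session_constraint(features, session_ids, centers))
```

In this toy case the second component of session s1 is pushed to the farther cluster because its nearest cluster is already taken by a component from the same session.
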

Are pelvic adhesions associated with pain, physical, emotional and functional characteristics of women presenting with chronic pelvic pain? A cluster analysis.

PubMed

Cheong, Ying; Saran, Mili; Hounslow, James William; Reading, Isabel Claire

2018-01-08

Chronic pelvic pain is a debilitating condition. It is unknown if there is a clinical phenotype for adhesive disorders. This study aimed to determine if the presence or absence, nature, severity and extent of adhesions correlated with demographic and patient reported clinical characteristics of women presenting with CPP. Women undergoing a laparoscopy for the investigation of chronic pelvic pain were recruited prospectively; their pain and phenotypic characteristics were entered into a hierarchical cluster analysis. The groups with differing baseline clinical and operative characteristics in terms of adhesions involvement were analyzed. Sixty two women were recruited where 37 had adhesions. A low correlation was found between women's reported current pain scores and that of most severe (r = 0.34) or average pain experienced (r = 0.44) in the last 6 months. Three main groups of women with CPP were identified: Cluster 1 (n = 35) had moderate severity of pain, with poor average and present pain intensity; Cluster 2 (n = 14) had a long duration of symptoms/diagnosis, the worst current pain and worst physical, emotional and social functions; Cluster 3 (n = 11) had the shortest duration of pain and showed the best evidence of coping with low (good) physical, social and emotional scores. This cluster also had the highest proportion of women with adhesions (82%) compared to 51% in Cluster 1 and 71% in Cluster 2. In this study, we found that there is little or no correlation between patient-reported pain, physical, emotional and functional characteristics scores with the presence or absence of intra-abdominal/pelvic adhesions found during investigative laparoscopy. Most women who had adhesions had the lowest reported current pain scores.
Sleep, Dietary, and Exercise Behavioral Clusters among Truck Drivers with Obesity: Implications for Interventions

PubMed Central

Olson, Ryan; Thompson, Sharon V.; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L.; Anger, W. Kent; Bodner, Todd; Hammer, Leslie B.; Hohn, Elliot; Perrin, Nancy A.

2015-01-01

Objective: Our objectives were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health and psychosocial factors. Methods: Participants’ (n=452, BMI M=37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior co-variation. Cluster differences were tested with generalized estimating equations. Results: Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in BMI. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Conclusions: Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures. PMID:26949883
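
The general procedure of dichotomizing behaviors and then clustering hierarchically can be sketched as follows. This is an illustration under stated assumptions rather than the study's method: the cutoff, the simple-matching (Hamming) distance, the average-linkage rule, and the toy profiles are all hypothetical, since the abstract does not specify them.

```python
from itertools import combinations


def dichotomize(values, cutoff):
    """Turn raw scores into 0/1 indicators (1 means at or above the cutoff)."""
    return [1 if v >= cutoff else 0 for v in values]


def hamming(a, b):
    """Share of positions at which two binary profiles disagree."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def average_linkage(profiles, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(
                hamming(profiles[i], profiles[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            mean_dist = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean_dist < best[0]:
                best = (mean_dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return clusters


# Hypothetical sleep, diet, and exercise scores for four participants.
raw = [[8, 7, 2], [9, 6, 1], [3, 2, 8], [2, 1, 9]]
binary = [dichotomize(row, cutoff=5) for row in raw]
print(average_linkage(binary, n_clusters=2))
```
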
Sleep, Dietary, and Exercise Behavioral Clusters Among Truck Drivers With Obesity: Implications for Interventions.

PubMed

Olson, Ryan; Thompson, Sharon V; Wipfli, Brad; Hanson, Ginger; Elliot, Diane L; Anger, W Kent; Bodner, Todd; Hammer, Leslie B; Hohn, Elliot; Perrin, Nancy A

2016-03-01

The objectives of the study were to describe a sample of truck drivers, identify clusters of drivers with similar patterns in behaviors affecting energy balance (sleep, diet, and exercise), and test for cluster differences in health, safety, and psychosocial factors. Participants' (n = 452, body mass index M = 37.2, 86.4% male) self-reported behaviors were dichotomized prior to hierarchical cluster analysis, which identified groups with similar behavior covariation. Cluster differences were tested with generalized estimating equations. Five behavioral clusters were identified that differed significantly in age, smoking status, diabetes prevalence, lost work days, stress, and social support, but not in body mass index. Cluster 2, characterized by the best sleep quality, had significantly lower lost workdays and stress than other clusters. Weight management interventions for drivers should explicitly address sleep, and may be maximally effective after establishing socially supportive work environments that reduce stress exposures.
The co-occurrence of autistic traits and borderline personality disorder traits is associated to increased suicidal ideation in nonclinical young adults.

PubMed

Chabrol, Henri; Raynal, Patrick

2018-04-01

The co-occurrence of Autism Spectrum Disorder (ASD) and Borderline Personality Disorder (BPD) is not rare and has been linked to increased suicidality. Despite this significant comorbidity between ASD and BPD, no study had examined the co-occurrence of autistic traits and borderline personality disorder traits in the general population. The aim of the present study was to examine the co-occurrence of autistic and borderline traits in a non-clinical sample of young adults and its influence on the levels of suicidal ideation and depressive symptomatology. Participants were 474 college students who completed self-report questionnaires. Data were analysed using correlation and cluster analyses. Borderline personality traits and autistic traits were weakly correlated. However, cluster analysis yielded four groups: a low traits group, a borderline traits group, an autistic traits group, and a group characterized by high levels of both traits. Cluster analysis revealed that autistic and borderline traits can co-occur in a significant proportion of young adults. The high autistic and borderline traits group constituted 17% of the total sample and had a higher level of suicidal ideation than the borderline traits group, despite similar levels of depressive symptoms. This result suggests that the higher suicidality observed in patients with comorbid ASD and BPD may extend to non-clinical individuals with high levels of co-occurring autistic and borderline traits. Copyright © 2018 Elsevier Inc. All rights reserved.
The adiposity of children is associated with their lifestyle behaviours: a cluster analysis of school-aged children from 12 nations.

PubMed

Dumuid, Dorothea; Olds, T; Lewis, L K; Martin-Fernández, J A; Barreira, T; Broyles, S; Chaput, J-P; Fogelholm, M; Hu, G; Kuriyan, R; Kurpad, A; Lambert, E V; Maia, J; Matsudo, V; Onywera, V O; Sarmiento, O L; Standage, M; Tremblay, M S; Tudor-Locke, C; Zhao, P; Katzmarzyk, P; Gillison, F; Maher, C

2018-02-01

The relationship between children's adiposity and lifestyle behaviour patterns is an area of growing interest. The objectives of this study are to identify clusters of children based on lifestyle behaviours and compare children's adiposity among clusters. Cross-sectional data from the International Study of Childhood Obesity, Lifestyle and the Environment were used. The participants were children (9-11 years) from 12 nations (n = 5710). 24-h accelerometry and self-reported diet and screen time were clustering input variables. Objectively measured adiposity indicators were waist-to-height ratio, percent body fat and body mass index z-scores. Sex-stratified analyses were performed on the global sample and repeated on a site-wise basis. Cluster analysis (using isometric log ratios for compositional data) was used to identify common lifestyle behaviour patterns. Site representation and adiposity were compared across clusters using linear models. Four clusters emerged: (1) Junk Food Screenies, (2) Actives, (3) Sitters and (4) All-Rounders. Countries were represented differently among clusters. Chinese children were over-represented in Sitters and Colombian children in Actives. Adiposity varied across clusters, being highest in Sitters and lowest in Actives. Children from different sites clustered into groups of similar lifestyle behaviours. Cluster membership was linked with differing adiposity. Findings support the implementation of activity interventions in all countries, targeting both physical activity and sedentary time. © 2016 World Obesity Federation.
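
Because 24-h time-use data are compositional (the parts always sum to the same total), they are usually transformed before clustering. The sketch below shows one common form of the isometric log-ratio transform, the pivot-coordinate version, in which each part is compared with the geometric mean of the parts that follow it. The choice of pivot coordinates, the function name, and the example day are assumptions; the abstract does not state which ilr basis was used.

```python
import math


def ilr_pivot(composition):
    """Map a D-part composition to D-1 isometric log-ratio coordinates.

    Coordinate i compares part i with the geometric mean of the parts
    after it, scaled by sqrt(r / (r + 1)) where r is the number of
    parts remaining after position i.
    """
    parts = len(composition)
    if parts < 2 or any(x <= 0 for x in composition):
        raise ValueError("need at least two strictly positive parts")
    logs = [math.log(x) for x in composition]
    coords = []
    for i in range(parts - 1):
        remaining = parts - i - 1
        mean_log_rest = sum(logs[i + 1:]) / remaining
        scale = math.sqrt(remaining / (remaining + 1))
        coords.append(scale * (logs[i] - mean_log_rest))
    return coords


# Hypothetical day in hours: sleep, sedentary, light activity, vigorous activity.
print(ilr_pivot([9.0, 8.0, 6.0, 1.0]))
```

The transform depends only on ratios between parts, so expressing the same day in hours or in proportions of 24 h gives the same coordinates.
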
Baseline adjustments for binary data in repeated cross-sectional cluster randomized trials.

PubMed

Nixon, R M; Thompson, S G

2003-09-15

Analysis of covariance models, which adjust for a baseline covariate, are often used to compare treatment groups in a controlled trial in which individuals are randomized. Such analysis adjusts for any baseline imbalance and usually increases the precision of the treatment effect estimate. We assess the value of such adjustments in the context of a cluster randomized trial with repeated cross-sectional design and a binary outcome. In such a design, a new sample of individuals is taken from the clusters at each measurement occasion, so that baseline adjustment has to be at the cluster level. Logistic regression models are used to analyse the data, with cluster level random effects to allow for different outcome probabilities in each cluster. We compare the estimated treatment effect and its precision in models that incorporate a covariate measuring the cluster level probabilities at baseline and those that do not. In two data sets, taken from a cluster randomized trial in the treatment of menorrhagia, the value of baseline adjustment is only evident when the number of subjects per cluster is large. We assess the generalizability of these findings by undertaking a simulation study, and find that increased precision of the treatment effect requires both large cluster sizes and substantial heterogeneity between clusters at baseline, but baseline imbalance arising by chance in a randomized study can always be effectively adjusted for. Copyright 2003 John Wiley & Sons, Ltd.
Knowledge, attitudes towards and acceptability of genetic modification in Germany.

PubMed

Christoph, Inken B; Bruhn, Maike; Roosen, Jutta

2008-07-01

Genetic modification remains a controversial issue. The aim of this study is to analyse the attitudes towards genetic modification, the knowledge about it and its acceptability in different application areas among German consumers. Results are based on a survey from spring 2005. An exploratory factor analysis is conducted to identify the attitudes towards genetic modification. The identified factors are used in a cluster analysis that identified a cluster of supporters, of opponents and a group of indifferent consumers. Respondents' knowledge of genetics and biotechnology differs among the found clusters without revealing a clear relationship between knowledge and support of genetic modification. The acceptability of genetic modification varies by application area and cluster, and genetically modified non-food products are more widely accepted than food products. The perception of personal health risks has high explanatory power for attitudes and acceptability.
Cardiometabolic risk clustering in spinal cord injury: results of exploratory factor analysis.

PubMed

Libin, Alexander; Tinsley, Emily A; Nash, Mark S; Mendez, Armando J; Burns, Patricia; Elrod, Matt; Hamm, Larry F; Groah, Suzanne L

2013-01-01

Evidence suggests an elevated prevalence of cardiometabolic risks among persons with spinal cord injury (SCI); however, the unique clustering of risk factors in this population has not been fully explored. The purpose of this study was to describe unique clustering of cardiometabolic risk factors differentiated by level of injury. One hundred twenty-one subjects (mean 37 ± 12 years; range, 18-73) with chronic C5 to T12 motor complete SCI were studied. Assessments included medical histories, anthropometrics and blood pressure, and fasting serum lipids, glucose, insulin, and hemoglobin A1c (HbA1c). The most common cardiometabolic risk factors were overweight/obesity, high levels of low-density lipoprotein (LDL-C), and low levels of high-density lipoprotein (HDL-C). Risk clustering was found in 76.9% of the population. Exploratory principal component factor analysis using varimax rotation revealed a 3-factor model in persons with paraplegia (65.4% variance) and a 4-factor solution in persons with tetraplegia (73.3% variance). The differences between groups were emphasized by the varied composition of the extracted factors: Lipid Profile A (total cholesterol [TC] and LDL-C), Body Mass-Hypertension Profile (body mass index [BMI], systolic blood pressure [SBP], and fasting insulin [FI]); Glycemic Profile (fasting glucose and HbA1c), and Lipid Profile B (TG and HDL-C). BMI and SBP formed a separate factor only in persons with tetraplegia. Although the majority of the population with SCI has risk clustering, the composition of the risk clusters may be dependent on level of injury, based on a factor analysis group comparison. This is clinically plausible and relevant as tetraplegics tend to be hypo- to normotensive and more sedentary, resulting in lower HDL-C and a greater propensity toward impaired carbohydrate metabolism.
Association of Mediterranean diet and other health behaviours with barriers to healthy eating and perceived health among British adults of retirement age.

PubMed

Lara, Jose; McCrum, Leigh-Ann; Mathers, John C

2014-11-01

Health behaviours including diet, smoking, alcohol consumption, and physical activity, predict health risks at the population level. We explored health behaviours, barriers to healthy eating and self-rated health among individuals of retirement age. Study design: 82 men and 124 women participated in an observational, cross-sectional online survey. Main outcome measures: A 14-item Mediterranean diet score (MDPS), perceived barriers to healthy eating (PBHE), self-reported smoking, physical activity habits, and current and prior perceived health status (PHS) were assessed. A health behaviours score (HBS) including smoking, physical activity, body mass index (BMI) and MDPS was created to evaluate associations with PHS. Two-step cluster analysis identified natural groups based on PBHE. Analysis of variance was used to evaluate between group comparisons. PBHE number was associated with BMI (r=0.28, P<0.001), age (r=-0.19; P=0.006), and MDPS (r=-0.31; P<0.001). PBHE cluster analysis produced three clusters. Cluster-1 members (busy lifestyle) were significantly younger (57 years), more overweight (28kg/m(2)), scored lower on MDPS (4.7) and reported more PBHE (7). Cluster-3 members (no characteristic PBHE) were leaner (25kg/m(2)), reported the lowest number of PBHE (2), and scored higher on HBS (2.7) and MDPS (6.2). Those in PHS categories, bad/fair, good, and very good, reported mean HBS of 2.0, 2.4 and 3.0, respectively (P<0.001). Compared with the previous year, no significant associations between PHS and HBS were observed. PBHE clusters were associated with BMI, MDPS and PHS and could be a useful tool to tailor interventions for those of peri-retirement age. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
A phylogenetic study of ubiquinone-7 species of the genus Candida based on 18S ribosomal DNA sequence divergence.

PubMed

Suzuki, Motofumi; Nakase, Takashi

2002-02-01

To clarify phylogenetic relationships among ubiquinone 7 (Q7)-forming species of the genus Candida, we analyzed the nearly complete sequences of 18S ribosomal RNA genes (18S rDNAs) from fifty strains (including 46 type strains) of Candida species, and from 8 type strains of species/varieties of the genera Issatchenkia, Pichia and Saturnispora. Q7-forming Candida species were divided into three major groups (Group I, II, and III) and were phylogenetically distant from a group that includes the type species of the genus Candida. Group I included four clusters with basal branches that were weakly supported. The first cluster comprised C. vartiovaarae, C. maritima, C. utilis, C. freyschussii, C. odintsovae, C. melinii, C. quercuum, Williopsis saturnus var. saturnus, and W. mucosa. The second cluster comprised C. norvegica, C. montana, C. stellimalicola, C. solani, C. berthetii, and C. dendrica. Williopsis pratensis, W. californica, Pichia opuntiae and 2 related species, P. amethionina (two varieties), and P. caribaea were also included in this cluster. The third cluster comprised C. pelliculosa (anamorph of P. anomala), C. nitrativorans, and C. silvicultrix. The fourth cluster comprised C. wickerhamii and C. peltata, which were placed in the P. holstii - C. ernobii clade with Q8-containing species. Group II comprised C. pignaliae, C. nemodendra, C. methanolovescens, C. maris, C. sonorensis, C. pini, C. llanquihuensis, C. cariosilignicola, C. ovalis, C. succiphila (including its two synonyms), C. methanosorbosa, C. nitratophila, C. nanaspora, C. boidinii (including its two synonyms), W. salicorniae, and P. methanolica. Group III was composed of four clusters with strong bootstrap support. The first cluster comprised C. valida (anamorph of P. membranifaciens), C. ethanolica, C. pseudolambica, C. citrea, C. inconspicua, C. norvegensis, C. rugopelliculosa, and C. lambica. Three species and two varieties of the genus Issatchenkia were also included in this cluster. 
The second cluster comprised C. diversa, C. silvae, 4 Saturnispora species, and P. besseyi. The third comprised C. sorboxylosa, and the fourth comprised C. vini. Based on this 18S rDNA sequence analysis, it is evident that Q7-forming Candida species and the genera Pichia and Williopsis are polyphyletic. The genus Issatchenkia is suggested to be congeneric with the genus Pichia. The genus Saturnispora is phylogenetically definable.
Temperamental dimensions of the TEMPS-A in females with co-morbid bipolar disorder and bulimia.

PubMed

Rybakowski, Janusz K; Kaminska, Katarzyna; Charytonik, Jolanta; Akiskal, Kareen K; Akiskal, Hagop S

2014-08-01

We investigated the effect of co-morbid bipolar disorder and bulimia on temperamental dimensions measured by TEMPS-A, relative to "pure" bulimia and "pure" bipolar disorder, in female patients. The study was performed on 47 patients with bipolar disorder (BD) with a mean age of 36±10 years, 96 patients with bulimia or bulimic type of anorexia, mean age 26±9 years and 50 control healthy females (HC), mean age 29±6 years. Among bulimic patients, a group of 68 subjects with co-morbid bulimia with bipolarity (BD+B) was identified, based on positive score of the Mood Disorder Questionnaire (MDQ). The TEMPS-A questionnaire, 110 questions version, has been used, evaluating five temperament domains: depressive, cyclothymic, hyperthymic, irritable and anxious. Parametric analysis was performed for 4 groups (BD, "pure" bulimia (PB), BD+B and HC), with 28 subjects randomly chosen from each group, using analysis of variance and cluster analysis. All clinical groups significantly differed from the control group by having higher scores of depressive, cyclothymic, irritable and anxious temperaments and lower scores of the hyperthymic one. Among patients, significantly higher scores of cyclothymic and irritable temperaments were found in BD+B compared to both PB and BD. These differences were also reflected in cluster analysis, where two clusters were identified. Limitation: bipolarity in bulimic patients was assessed only by the MDQ. These results show that co-morbid bulimia and bipolar disorder is characterized by extreme dimensions of both cyclothymic and irritable temperaments, significantly higher than each single diagnosis. Possible clinical implications of this finding are discussed. Copyright © 2014 Elsevier B.V. All rights reserved.
Broad DNA methylation changes of spermatogenesis, inflammation and immune response-related genes in a subgroup of sperm samples for assisted reproduction.

PubMed

Schütte, B; El Hajj, N; Kuhtz, J; Nanda, I; Gromoll, J; Hahn, T; Dittrich, M; Schorsch, M; Müller, T; Haaf, T

2013-11-01

Aberrant sperm DNA methylation patterns, mainly in imprinted genes, have been associated with male subfertility and oligospermia. Here, we performed a genome-wide methylation analysis in sperm samples representing a wide range of semen parameters. Sperm DNA samples of 38 males attending a fertility centre were analysed with Illumina HumanMethylation27 BeadChips, which quantify methylation of >27 000 CpG sites in cis-regulatory regions of almost 15 000 genes. In an unsupervised analysis of methylation of all analysed sites, the patient samples clustered into a major and a minor group. The major group clustered with samples from normozoospermic healthy volunteers and, thus, may more closely resemble the normal situation. When correlating the clusters with semen and clinical parameters, the sperm counts were significantly different between groups with the minor group exhibiting sperm counts in the low normal range. A linear model identified almost 3000 CpGs with significant methylation differences between groups. Functional analysis revealed a broad gain of methylation in spermatogenesis-related genes and a loss of methylation in inflammation- and immune response-related genes. Quantitative bisulfite pyrosequencing validated differential methylation in three of five significant candidate genes on the array. Collectively, we identified a subgroup of sperm samples for assisted reproduction with sperm counts in the low normal range and broad methylation changes (affecting approximately 10% of analysed CpG sites) in specific pathways, most importantly spermatogenesis-related genes. We propose that epigenetic analysis can supplement traditional semen parameters and has the potential to provide new insights into the aetiology of male subfertility. © 2013 American Society of Andrology and European Academy of Andrology.
Intertumoral Heterogeneity within Medulloblastoma Subgroups.

PubMed

Cavalli, Florence M G; Remke, Marc; Rampasek, Ladislav; Peacock, John; Shih, David J H; Luu, Betty; Garzia, Livia; Torchia, Jonathon; Nor, Carolina; Morrissy, A Sorana; Agnihotri, Sameer; Thompson, Yuan Yao; Kuzan-Fischer, Claudia M; Farooq, Hamza; Isaev, Keren; Daniels, Craig; Cho, Byung-Kyu; Kim, Seung-Ki; Wang, Kyu-Chang; Lee, Ji Yeoun; Grajkowska, Wieslawa A; Perek-Polnik, Marta; Vasiljevic, Alexandre; Faure-Conter, Cecile; Jouvet, Anne; Giannini, Caterina; Nageswara Rao, Amulya A; Li, Kay Ka Wai; Ng, Ho-Keung; Eberhart, Charles G; Pollack, Ian F; Hamilton, Ronald L; Gillespie, G Yancey; Olson, James M; Leary, Sarah; Weiss, William A; Lach, Boleslaw; Chambless, Lola B; Thompson, Reid C; Cooper, Michael K; Vibhakar, Rajeev; Hauser, Peter; van Veelen, Marie-Lise C; Kros, Johan M; French, Pim J; Ra, Young Shin; Kumabe, Toshihiro; López-Aguilar, Enrique; Zitterbart, Karel; Sterba, Jaroslav; Finocchiaro, Gaetano; Massimino, Maura; Van Meir, Erwin G; Osuka, Satoru; Shofuda, Tomoko; Klekner, Almos; Zollo, Massimo; Leonard, Jeffrey R; Rubin, Joshua B; Jabado, Nada; Albrecht, Steffen; Mora, Jaume; Van Meter, Timothy E; Jung, Shin; Moore, Andrew S; Hallahan, Andrew R; Chan, Jennifer A; Tirapelli, Daniela P C; Carlotti, Carlos G; Fouladi, Maryam; Pimentel, José; Faria, Claudia C; Saad, Ali G; Massimi, Luca; Liau, Linda M; Wheeler, Helen; Nakamura, Hideo; Elbabaa, Samer K; Perezpeña-Diazconti, Mario; Chico Ponce de León, Fernando; Robinson, Shenandoah; Zapotocky, Michal; Lassaletta, Alvaro; Huang, Annie; Hawkins, Cynthia E; Tabori, Uri; Bouffet, Eric; Bartels, Ute; Dirks, Peter B; Rutka, James T; Bader, Gary D; Reimand, Jüri; Goldenberg, Anna; Ramaswamy, Vijay; Taylor, Michael D

2017-06-12

While molecular subgrouping has revolutionized medulloblastoma classification, the extent of heterogeneity within subgroups is unknown. Similarity network fusion (SNF) applied to genome-wide DNA methylation and gene expression data across 763 primary samples identifies very homogeneous clusters of patients, supporting the presence of medulloblastoma subtypes. After integration of somatic copy-number alterations, and clinical features specific to each cluster, we identify 12 different subtypes of medulloblastoma. Integrative analysis using SNF further delineates group 3 from group 4 medulloblastoma, which is not as readily apparent through analyses of individual data types. Two clear subtypes of infants with Sonic Hedgehog medulloblastoma with disparate outcomes and biology are identified. Medulloblastoma subtypes identified through integrative clustering have important implications for stratification of future clinical trials. Copyright © 2017 Elsevier Inc. All rights reserved.
Phylogenetic analysis of widely cultivated Ganoderma in China based on the mitochondrial V4-V6 region of SSU rDNA.

PubMed

Zhou, X W; Su, K Q; Zhang, Y M

2015-02-02

Ganoderma mushroom is one of the most prescribed traditional medicines and has been used for centuries, particularly in China, Japan, Korea, and other Asian countries. In this study, different strains of Ganoderma spp. and the genetic relationships of the closely related strains were identified and investigated based on the V4-V6 region of mitochondrial small subunit ribosomal DNA of the Ganoderma species. The sizes of the mitochondrial ribosomal DNA regions from different Ganoderma species showed 2 types of sequences, 2.0 or 0.5 kb. A phylogenetic tree was constructed, which revealed a high level of genetic diversity in Ganoderma species. Ganoderma lucidum G05 and G. eupense G09 strains were clustered into a G. resinaceum group. Ganoderma spp. G29 and G22 strains were clustered into a G. lucidum group. However, Ganoderma spp. G19, G20, and G21 strains were clustered into a single group; the G. lucidum AF214475, G. sinense, G. strum G17, G. strum G36, and G. sinense G10 strains contained an intron and were clustered into other groups.
Using Opinions and Knowledge to Identify Natural Groups of Gambling Employees.

PubMed

Gray, Heather M; Tom, Matthew A; LaPlante, Debi A; Shaffer, Howard J

2015-12-01

Gaming industry employees are at higher risk than the general population for health conditions including gambling disorder. Responsible gambling training programs, which train employees about gambling and gambling-related problems, might be a point of intervention. However, such programs tend to use a "one-size-fits-all" approach rather than multiple tiers of instruction. We surveyed employees of one Las Vegas casino (n = 217) and one online gambling operator (n = 178) regarding their gambling-related knowledge and opinions prior to responsible gambling training, to examine the presence of natural knowledge groups among recently hired employees. Using k-means cluster analysis, we observed four natural groups within the Las Vegas casino sample and two natural groups within the online operator sample. We describe these natural groups in terms of opinion/knowledge differences as well as distributions of demographic/occupational characteristics. Gender and language spoken at home were correlates of cluster group membership among the sample of Las Vegas casino employees, but we did not identify demographic or occupational correlates of cluster group membership among the online gambling operator employees. Gambling operators should develop more sophisticated training programs that include instruction that targets different natural knowledge groups.
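
For readers unfamiliar with the k-means procedure named above, the following minimal sketch shows the standard Lloyd iteration: assign each respondent to the nearest centroid, recompute centroids as cluster means, and repeat until nothing changes. The starting centroids, the two-dimensional toy scores, and the function name are assumptions for illustration; the study's variables, software, and initialization are not described in the abstract.

```python
import math


def kmeans(points, centroids, n_iter=20):
    """Plain Lloyd's k-means from caller-supplied starting centroids.

    Returns the final labels and centroids. A cluster that loses all
    its members keeps its previous centroid.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: mean of the members of each cluster.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[k])
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical (knowledge score, opinion score) pairs for four respondents.
scores = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels, centers = kmeans(scores, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers)
```

Because k-means requires the number of clusters up front, analyses like this one typically run several values of k and compare the solutions before settling on a grouping.
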
A pyrosequencing assay for the quantitative methylation analysis of the PCDHB gene cluster, the major factor in neuroblastoma methylator phenotype.

PubMed

Banelli, Barbara; Brigati, Claudio; Di Vinci, Angela; Casciano, Ida; Forlani, Alessandra; Borzì, Luana; Allemanni, Giorgio; Romani, Massimo

2012-03-01

Epigenetic alterations are hallmarks of cancer and powerful biomarkers, whose clinical utilization is made difficult by the absence of standardization and of common methods of data interpretation. The coordinate methylation of many loci in cancer is defined as 'CpG island methylator phenotype' (CIMP) and identifies clinically distinct groups of patients. In neuroblastoma (NB), CIMP is defined by a methylation signature, which includes different loci, but its predictive power on outcome is entirely recapitulated by the PCDHB cluster only. We have developed a robust and cost-effective pyrosequencing-based assay that could facilitate the clinical application of CIMP in NB. This assay permits the unbiased simultaneous amplification and sequencing of 17 out of 19 genes of the PCDHB cluster for quantitative methylation analysis, taking into account all the sequence variations. As some of these variations were at CpG doublets, we bypassed the data interpretation conducted by the methylation analysis software to assign the corrected methylation value at these sites. The final result of the assay is the mean methylation level of 17 gene fragments in the protocadherin B (PCDHB) cluster. We have utilized this assay to compare the methylation levels of the PCDHB cluster between high-risk and very low-risk NB patients, confirming the predictive value of CIMP. Our results demonstrate that the pyrosequencing-based assay herein described is a powerful instrument for the analysis of this gene cluster that may simplify the data comparison between different laboratories and, in perspective, could facilitate its clinical application. Furthermore, our results demonstrate that, in principle, pyrosequencing can be efficiently utilized for the methylation analysis of gene clusters with high internal homologies.
Do beef risk perceptions or risk attitudes have a greater effect on the beef purchase decisions of Canadian consumers?

PubMed

Yang, Jun; Goddard, Ellen

2011-01-01

Cluster analysis is applied in this study to group Canadian households by two characteristics, their risk perceptions and risk attitudes toward beef. There are some similarities in demographic profiles, meat purchases, and bovine spongiform encephalopathy (BSE) media recall between the cluster that perceives beef to be the most risky and the cluster that has little willingness to accept the risks of eating beef. There are similarities between the medium risk perception cluster and the medium risk attitude cluster, as well as between the cluster that perceives beef to have little risk and the cluster that is most willing to accept the risks of eating beef. Regression analysis shows that risk attitudes have a larger impact on household-level beef purchasing decisions than do risk perceptions for all consumer clusters. This implies that it may be more effective to undertake policies that reduce the risks associated with eating beef, instead of enhancing risk communication to improve risk perceptions. Only for certain clusters with higher willingness to accept the risks of eating beef might enhancing risk communication increase beef consumption significantly. The different role of risk perceptions and risk attitudes in beef consumption needs to be recognized during the design of risk management policies.
Chemical Fingerprint and Quantitative Analysis for the Quality Evaluation of Platycladi cacumen by Ultra-performance Liquid Chromatography Coupled with Hierarchical Cluster Analysis.

PubMed

Shan, Mingqiu; Li, Sam Fong Yau; Yu, Sheng; Qian, Yan; Guo, Shuchen; Zhang, Li; Ding, Anwei

2018-01-01

Platycladi cacumen (dried twigs and leaves of Platycladus orientalis (L.) Franco) is a frequently utilized Chinese medicinal herb. To evaluate the quality of the phytomedicine, an ultra-performance liquid chromatographic method with diode array detection was established for chemical fingerprinting and quantitative analysis. In this study, 27 batches of P. cacumen from different regions were collected for analysis. A chemical fingerprint with 20 common peaks was obtained using Similarity Evaluation System for Chromatographic Fingerprint of Traditional Chinese Medicine (Version 2004A). Among these 20 components, seven flavonoids (myricitrin, isoquercitrin, quercitrin, afzelin, cupressuflavone, amentoflavone and hinokiflavone) were identified and determined simultaneously. In the method validation, the seven analytes showed good regressions (R ≥ 0.9995) within linear ranges and good recoveries from 96.4% to 103.3%. Furthermore, with the contents of these seven flavonoids, hierarchical clustering analysis was applied to distinguish the 27 batches into five groups. The chemometric results showed that these groups were almost consistent with geographical positions and climatic conditions of the production regions. Integrating fingerprint analysis, simultaneous determination and hierarchical clustering analysis, the established method is rapid, sensitive, accurate and readily applicable, and also provides a significant foundation for efficient quality control of P. cacumen. © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Assessing Nutritional Differences in Household Level Production and Consumption in African Villages

NASA Astrophysics Data System (ADS)

Markey, K.; Palm, C.; Wood, S.

2015-12-01

Studies of agriculture often focus on yields and calories, but overlook the production of diverse nutrients needed for human health. Nutritional production is particularly important in low-income countries, where foods produced correspond largely to those consumed. Through an analysis of crops, livestock, and animal products, this study aims to quantify the nutritional differences between household-level production and consumption in the Millennium Village at Bonsaaso, Ghana. By converting food items into their nutritional components it became clear that certain nutritional disparities existed between the two categories. In Bonsaaso, 64-78% of households exhibited deficiencies in the consumption of Calcium, Fat, and/or Vitamin A despite less than 30% of households showing deficiencies on the production side. To better understand these differences, k-means clustering analysis was performed, placing households into groups characterized by nutritional means. By comparing the households in these groupings, it was clear that clusters formed around certain nutritional deficiencies. The socioeconomic characteristics of these groupings were then studied for correlations, concentrating on number of people in the household, sex and age of household head, and dependency ratio. It was found that clusters with high dependency ratios (the number of working persons in the household to non-working persons) exhibited a large variety of, and often drastic, nutritional deficiencies. In fact, the cluster with the highest average dependency ratio exhibited deficiencies in every nutrient. In light of these findings, regional policies may look to target households with a large number of dependents, and package nutrients for household distribution based on the characteristics of these clusters.
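The k-means step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not give the software, the number of clusters, or the input scaling, so the `kmeans` function, the choice of `k=2`, and the household values (hypothetical ratios of consumed to required calcium, fat, and vitamin A) are all invented for the example.

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:           # assignments stable: converged
            break
        labels = new_labels
    return labels, centroids

# Hypothetical households: (calcium, fat, vitamin A) adequacy ratios.
households = [
    (0.30, 0.40, 0.20), (0.35, 0.45, 0.25), (0.40, 0.35, 0.30),  # deficient
    (1.10, 1.00, 1.20), (1.00, 1.10, 1.10), (1.20, 1.05, 1.00),  # adequate
]
labels, centroids = kmeans(households, k=2)
print(labels)
print(centroids)
```

Each centroid is the per-nutrient mean of its cluster, which is what "groups characterized by nutritional means" suggests; cluster numbering is arbitrary and depends on the random start.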
Theory of mind predicts severity level in autism.

PubMed

Hoogenhout, Michelle; Malcolm-Smith, Susan

2017-02-01

We investigated whether theory of mind skills can indicate autism spectrum disorder severity. In all, 62 children with autism spectrum disorder completed a developmentally sensitive theory of mind battery. We used intelligence quotient, Diagnostic and Statistical Manual of Mental Disorders (4th ed.) diagnosis and level of support needed as indicators of severity level. Using hierarchical cluster analysis, we found three distinct clusters of theory of mind ability: early-developing theory of mind (Cluster 1), false-belief reasoning (Cluster 2) and sophisticated theory of mind understanding (Cluster 3). The clusters corresponded to severe, moderate and mild autism spectrum disorder. As an indicator of level of support needed, cluster grouping predicted the type of school children attended. All Cluster 1 children attended autism-specific schools; Cluster 2 was divided between autism-specific and special needs schools and nearly all Cluster 3 children attended general special needs and mainstream schools. Assessing theory of mind skills can reliably discriminate severity levels within autism spectrum disorder.
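A hierarchical cluster analysis of the kind named in this abstract can be sketched as below. The abstract does not report the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions, and the two-column scores are hypothetical values standing in for the theory of mind battery results.

```python
import math

def agglomerate(points, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of point indices."""
    clusters = [[i] for i in range(len(points))]   # start with singletons

    def linkage(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical (early-skill score, advanced-skill score) pairs for 8 children.
scores = [(2, 1), (3, 1), (2, 2), (8, 5), (9, 6), (8, 6), (14, 12), (15, 13)]
print(agglomerate(scores, 3))  # → [[0, 1, 2], [3, 4, 5], [6, 7]]
```

Stopping the merging at three clusters mirrors the three-cluster solution reported above; in practice the cut level is usually chosen from the dendrogram rather than fixed in advance.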

Genetic diversity analysis of cyanogenic potential (CNp) of root among improved genotypes of cassava using simple sequence repeat markers.

PubMed

Moyib, O K; Mkumbira, J; Odunola, O A; Dixon, A G

2012-12-01

Cyanogenic potential (CNp) of cassava constitutes a serious problem for over 500 million people who rely on the crop as their main source of calories. Genetic diversity is a key to successful crop improvement for breeding new improved variability for target traits. Forty-three improved genotypes of cassava developed by the International Institute of Tropical Agriculture (IITA), Ibadan, were characterized for the CNp trait using 35 Simple Sequence Repeat (SSR) markers. Essential colorimetry picric test was used for evaluation of CNp on a color scale of 1 to 14. The CNp scores obtained ranged from 3 to 9, with a mean score of 5.48 (+/- 0.09) based on Statistical Analysis System (SAS) package. TMS M98/0068 (4.0 +/- 0.25) was identified as the best genotype with low CNp while TMS M98/0028 (7.75 +/- 0.25) was the worst. The 43 genotypes were assigned into 7 phenotypic groups based on rank-sum analysis in SAS. Dissimilarity Analysis and Representation for Windows (DARwin) generated a phylogenetic tree with 5 clusters which represented hybridizing groups. Each of the clusters (except 4) contained low CNp genotypes that could be used for improving the high CNp genotypes in the same or near cluster. The scatter plot of the genotypes showed that there was little or no demarcation for phenotypic CNp groupings in the molecular groupings. The result of this study demonstrated that SSR markers are powerful tools for the assessment of genetic variability, and proper identification and selection of parents for genetic improvement of low CNp trait among the IITA cassava collection.
Diversity of lactic acid bacteria associated with fish and the fish farm environment, established by amplified rRNA gene restriction analysis.

PubMed

Michel, Christian; Pelletier, Claire; Boussaha, Mekki; Douet, Diane-Gaëlle; Lautraite, Armand; Tailliez, Patrick

2007-05-01

Lactic acid bacteria have become a major source of concern for aquaculture in recent decades. In addition to true pathogenic species of worldwide significance, such as Streptococcus iniae and Lactococcus garvieae, several species have been reported to produce occasional fish mortalities in limited geographic areas, and many unidentifiable or ill-defined isolates are regularly isolated from fish or fish products. To clarify the nature and prevalence of different fish-associated bacteria belonging to the lactic acid bacterium group, a collection of 57 isolates of different origins was studied and compared with a set of 22 type strains, using amplified rRNA gene restriction analysis (ARDRA). Twelve distinct clusters were delineated on the basis of ARDRA profiles and were confirmed by sequencing of sodA and 16S rRNA genes. These clusters included the following: Lactococcus raffinolactis, L. garvieae, Lactococcus lactis, S. iniae, S. dysgalactiae, S. parauberis, S. agalactiae, Carnobacterium spp., the Enterococcus "faecium" group, a heterogeneous Enterococcus-like cluster comprising indiscernible representatives of Vagococcus fluvialis or the recently recognized V. carniphilus, V. salmoninarum, and Aerococcus spp. Interestingly, the L. lactis and L. raffinolactis clusters appeared to include many commensals of fish, so opportunistic infections caused by these species cannot be disregarded. The significance for fish populations and fish food processing of three or four genetic clusters of uncertain or complex definition, namely, Aerococcus and Enterococcus clusters, should be established more accurately.
Patterns of psychological responses in parents of children that underwent stem cell transplantation.

PubMed

Riva, Roberto; Forinder, Ulla; Arvidson, Johan; Mellgren, Karin; Toporski, Jacek; Winiarski, Jacek; Norberg, Annika Lindahl

2014-11-01

Hematopoietic stem cell transplantation (HSCT) is curative in several life-threatening pediatric diseases but may affect children and their families inducing depression, anxiety, burnout symptoms, and post-traumatic stress symptoms, as well as post-traumatic growth (PTG). The aim of this study was to investigate the co-occurrence of different aspects of such responses in parents of children that had undergone HSCT. Questionnaires were completed by 260 parents (146 mothers and 114 fathers) 11-198 months after HSCT: the Hospital Anxiety and Depression Scale, the Shirom-Melamed Burnout Questionnaire, the post-traumatic stress disorders checklist, civilian version, and the PTG inventory. Additional variables were also investigated: perceived support, time elapsed since HSCT, job stress, partner-relationship satisfaction, trauma appraisal, and the child's health problems. A hierarchical cluster analysis and a k-means cluster analysis were used to identify patterns of psychological responses. Four clusters of parents with different psychological responses were identified. One cluster (n = 40) significantly differed from the other groups and reported levels of depression, anxiety, burnout symptoms, and post-traumatic stress symptoms above the cut-off. In contrast, another cluster (n = 66) reported higher levels of PTG than the other groups did. This study shows a subgroup of parents maintaining high levels of several aspects of distress years after HSCT. Differences between clusters might be explained by differences in perceived support, the child's health problems, job stress, and partner-relationship satisfaction. Copyright © 2014 John Wiley & Sons, Ltd.
Identifying a typology of men who use anabolic androgenic steroids (AAS).

PubMed

Zahnow, Renee; McVeigh, Jim; Bates, Geoff; Hope, Vivian; Kean, Joseph; Campbell, John; Smith, Josie

2018-05-01

Despite recognition that the Anabolic Androgenic Steroid (AAS) using population is diverse, empirical studies to develop theories to conceptualise this variance in use have been limited. In this study, using cluster analysis and multinomial logistic regression, we identify typologies of people who use AAS and examine variations in motivations for AAS use across types in a sample of 611 men who use AAS. The cluster analysis identified four groups in the data with different risk profiles. These groups largely reflect the ideal types of people who use AAS proposed by Christiansen et al. (2016): Cluster 1 (You Only Live Once (YOLO) type, n = 68, 11.1%) were younger and motivated by fat loss; Cluster 2 (Well-being type, n = 236, 38.6%) were concerned with getting fit; Cluster 3 (Athlete type, n = 155, 25.4%) were motivated by muscle and strength gains; Cluster 4 (Expert type, n = 152, 24.9%) were focused on specific goals (i.e. not 'getting fit'). The results of this study demonstrate the need to make information about AAS accessible to the general population and to inform health service providers about variations in motivations and associated risk behaviours. Attention should also be given to ensuring existing harm minimisation services are equipped to disseminate information about safe intra-muscular injecting and ensuring needle disposal sites are accessible to the different types. Copyright © 2018 Elsevier B.V. All rights reserved.
Psychological effects of chemical weapons: a follow-up study of First World War veterans.

PubMed

Jones, E; Everitt, B; Ironside, S; Palmer, I; Wessely, S

2008-10-01

Chemical weapons exercise an enduring and often powerful psychological effect. This had been recognized during the First World War when it was shown that the symptoms of stress mimicked those of mild exposure to gas. Debate about long-term effects followed the suggestion that gassing triggered latent tuberculosis. A random sample of 103 First World War servicemen awarded a war pension for the effects of gas, but without evidence of chronic respiratory pathology, were subjected to cluster analysis using 25 common symptoms. The consistency of symptom reporting was also investigated across repeated follow-ups. Cluster analysis identified four groups: one (n=56) with a range of somatic symptoms, a second (n=30) with a focus on the respiratory system, a third (n=12) with a predominance of neuropsychiatric symptoms, and a fourth (n=5) with a narrow band of symptoms related to the throat and breathing difficulties. Veterans from the neuropsychiatric cluster had multiple diagnoses including neurasthenia and disordered action of the heart, and reported many more symptoms than those in the three somatic clusters. Mild or intermittent respiratory disorders in the post-war period supported beliefs about the damaging effects of gas in the three somatic clusters. By contrast, the neuropsychiatric group did not report new respiratory illnesses. For this cluster, the experience of gassing in a context of extreme danger may have been responsible for the intensity of their symptoms, which showed no sign of diminution over the 12-year follow-up.
The impact of clinical, demographic and risk factors on rates of HIV transmission: a population-based phylogenetic analysis in British Columbia, Canada.

PubMed

Poon, Art F Y; Joy, Jeffrey B; Woods, Conan K; Shurgold, Susan; Colley, Guillaume; Brumme, Chanson J; Hogg, Robert S; Montaner, Julio S G; Harrigan, P Richard

2015-03-15

The diversification of human immunodeficiency virus (HIV) is shaped by its transmission history. We therefore used a population-based, province-wide HIV drug resistance database in British Columbia (BC), Canada, to evaluate the impact of clinical, demographic, and behavioral factors on rates of HIV transmission. We reconstructed molecular phylogenies from 27,296 anonymized bulk HIV pol sequences representing 7747 individuals in BC (about half the estimated HIV prevalence in BC). Infections were grouped into clusters based on phylogenetic distances, as a proxy for variation in transmission rates. Rates of cluster expansion were reconstructed from estimated dates of HIV seroconversion. Our criteria grouped 4431 individuals into 744 clusters largely separated with respect to risk factors, including large established clusters predominated by injection drug users and more-recently emerging clusters comprising men who have sex with men. The mean log10 viral load of an individual's phylogenetic neighborhood (composed of 5 other individuals with shortest phylogenetic distances) increased their odds of appearing in a cluster by >2-fold per log10 viruses per milliliter. Hotspots of ongoing HIV transmission can be characterized in near real time by the secondary analysis of HIV resistance genotypes, providing an important potential resource for targeting public health initiatives for HIV prevention. © The Author 2014. Published by Oxford University Press on behalf of the Infectious Diseases Society of America. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
An application of cluster detection to scene analysis

NASA Technical Reports Server (NTRS)

Rosenfeld, A. H.; Lee, Y. H.

1971-01-01

Certain arrangements of local features in a scene tend to group together and to be seen as units. It is suggested that in some instances, this phenomenon might be interpretable as a process of cluster detection in a graph-structured space derived from the scene. This idea is illustrated using a class of scenes that contain only horizontal and vertical line segments.
The Application of Clustering Techniques to Citation Data. Research Reports Series B No. 6.

ERIC Educational Resources Information Center

Arms, William Y.; Arms, Caroline

This report describes research carried out as part of the Design of Information Systems in the Social Sciences (DISISS) project. Cluster analysis techniques were applied to a machine readable file of bibliographic data in the form of cited journal titles in order to identify groupings which could be used to structure bibliographic files. Practical…
Multivariate Statistical Analysis of MSL APXS Bulk Geochemical Data

NASA Astrophysics Data System (ADS)

Hamilton, V. E.; Edwards, C. S.; Thompson, L. M.; Schmidt, M. E.

2014-12-01

We apply cluster and factor analyses to bulk chemical data of 130 soil and rock samples measured by the Alpha Particle X-ray Spectrometer (APXS) on the Mars Science Laboratory (MSL) rover Curiosity through sol 650. Multivariate approaches such as principal components analysis (PCA), cluster analysis, and factor analysis complement more traditional approaches (e.g., Harker diagrams), with the advantage of simultaneously examining the relationships between multiple variables for large numbers of samples. Principal components analysis has been applied with success to APXS, Pancam, and Mössbauer data from the Mars Exploration Rovers. Factor analysis and cluster analysis have been applied with success to thermal infrared (TIR) spectral data of Mars. Cluster analyses group the input data by similarity, where there are a number of different methods for defining similarity (hierarchical, density, distribution, etc.). For example, without any assumptions about the chemical contributions of surface dust, preliminary hierarchical and K-means cluster analyses clearly distinguish the physically adjacent rock targets Windjana and Stephen as being distinctly different than lithologies observed prior to Curiosity's arrival at The Kimberley. In addition, they are separated from each other, consistent with chemical trends observed in variation diagrams but without requiring assumptions about chemical relationships. We will discuss the variation in cluster analysis results as a function of clustering method and pre-processing (e.g., log transformation, correction for dust cover) and implications for interpreting chemical data. Factor analysis shares some similarities with PCA, and examines the variability among observed components of a dataset so as to reveal variations attributable to unobserved components. Factor analysis has been used to extract the TIR spectra of components that are typically observed in mixtures and only rarely in isolation; there is the potential for similar results with data from APXS. These techniques offer new ways to understand the chemical relationships between the materials interrogated by Curiosity, and potentially their relation to materials observed by APXS instruments on other landed missions.
The bacterial species definition in the genomic era

PubMed Central

Konstantinidis, Konstantinos T; Ramette, Alban; Tiedje, James M

2006-01-01

The bacterial species definition, despite its eminent practical significance for identification, diagnosis, quarantine and diversity surveys, remains a very difficult issue to advance. Genomics now offers novel insights into intra-species diversity and the potential for emergence of a more soundly based system. Although we share the excitement, we argue that it is premature for a universal change to the definition because current knowledge is based on too few phylogenetic groups and too few samples of natural populations. Our analysis of five important bacterial groups suggests, however, that more stringent standards for species may be justifiable when a solid understanding of gene content and ecological distinctiveness becomes available. Our analysis also reveals what is actually encompassed in a species according to the current standards, in terms of whole-genome sequence and gene-content diversity, and shows that this does not correspond to coherent clusters for the environmental Burkholderia and Shewanella genera examined. In contrast, the obligatory pathogens, which have a very restricted ecological niche, do exhibit clusters. Therefore, the idea of biologically meaningful clusters of diversity that applies to most eukaryotes may not be universally applicable in the microbial world, or if such clusters exist, they may be found at different levels of distinction. PMID:17062412
Maternal Styles of Talking about Child Feeding across Sociodemographic Groups

PubMed Central

Pesch, Megan H.; Harrell, Kristina J.; Kaciroti, Niko; Rosenblum, Kate; Lumeng, Julie C.

2011-01-01

This study sought to identify maternal styles of talking about child feeding from a semi-structured interview and to evaluate associated maternal and child characteristics. Mothers of preschool-aged children (n = 133) of diverse race/ethnicity and socioeconomic status (SES) (45 lower SES black, 29 lower SES white, 32 lower SES Hispanic, 15 middle to upper SES white, 12 middle to upper SES Asian) participated in a semi-structured interview about feeding. Interviews were audio-taped and transcribed. Themes were identified, and individual interviews were coded within these themes: authority (high/low), confidence (confident/conflicted/unopinionated), and investment (deep/mild/removed). Demographic characteristics were collected and a subset of children had measured weights and heights. Cluster analysis was used to identify narrative styles. Participant characteristics were compared across clusters using Fisher’s exact test and analysis of variance. Six narrative styles were identified: Easy-Going, Practical No-Nonsense, Disengaged, Effortful No-Nonsense, Indulgent Worry, and Conflicted Control. Cluster membership differed significantly based on maternal demographic group (P < .001) and child weight status (P < .05). More than half (60%) of children of mothers in the Conflicted Control cluster were obese. Maternal styles of talking about feeding are associated with maternal and child characteristics. PMID:22117662
THE STRUCTURE OF THE MERGING RCS 231953+00 SUPERCLUSTER AT z ≈ 0.9

DOE Office of Scientific and Technical Information (OSTI.GOV)

Faloon, A. J.; Webb, T. M. A.; Geach, J. E.

2013-05-10

The RCS 2319+00 supercluster is a massive supercluster at z = 0.9 comprising three optically selected, spectroscopically confirmed clusters separated by <3 Mpc on the plane of the sky. This supercluster is one of a few known examples of the progenitors of present-day massive clusters (10^15 M☉ by z ≈ 0.5). We present an extensive spectroscopic campaign carried out on the supercluster field resulting, in conjunction with previously published data, in 1961 high-confidence galaxy redshifts. We find 302 structure members spanning three distinct redshift walls separated from one another by ≈65 Mpc (Δz = 0.03). The component clusters have spectroscopic redshifts of 0.901, 0.905, and 0.905. The velocity dispersions are consistent with those predicted from X-ray data, giving estimated cluster masses of ≈10^14.5-10^14.9 M☉. The Dressler-Shectman test finds evidence of substructure in the supercluster field and a friends-of-friends analysis identified five groups in the supercluster, including a filamentary structure stretching between two cluster cores previously identified in the infrared by Coppin et al. The galaxy colors further show this filamentary structure to be a unique region of activity within the supercluster, comprised mainly of blue galaxies compared to the ≈43%-77% red-sequence galaxies present in the other groups and cluster cores. Richness estimates from stacked luminosity function fits result in average group mass estimates consistent with ≈10^13 M☉ halos. Currently, 22% of our confirmed members reside in ≳10^13 M☉ groups/clusters destined to merge onto the most massive cluster, in agreement with the massive halo galaxy fractions important in cluster galaxy pre-processing in N-body simulation merger tree studies.
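The friends-of-friends idea mentioned in this abstract can be illustrated with a short sketch: two galaxies are "friends" if they lie within a linking length of each other, and a group is any set of galaxies connected through chains of friends. The version below is a simplification under stated assumptions. It uses a single Euclidean linking length on three-dimensional positions, whereas redshift-survey analyses typically use separate transverse and line-of-sight linking lengths; the function name, the positions, and the linking length of 1.0 Mpc are all hypothetical and are not taken from the paper.

```python
import math

def friends_of_friends(positions, linking_length):
    """Group points connected by chains of neighbors closer than linking_length."""
    n = len(positions)
    parent = list(range(n))                # union-find forest, one tree per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= linking_length:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri        # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical comoving positions in Mpc.
galaxies = [(0.0, 0.0, 0.0), (0.8, 0.0, 0.0), (1.6, 0.0, 0.0),
            (10.0, 0.0, 0.0), (10.5, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(friends_of_friends(galaxies, linking_length=1.0))
# → [[0, 1, 2], [3, 4], [5]]
```

Galaxies 0 and 2 are 1.6 Mpc apart, farther than the linking length, yet they share a group because galaxy 1 links them. This chaining is what lets the method trace elongated structures such as filaments.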
A Cyber-Attack Detection Model Based on Multivariate Analyses

NASA Astrophysics Data System (ADS)

Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi

In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and cluster analysis. We quantify the observed qualitative audit event sequences via quantification method IV, and collect similar audit event sequences into the same groups by cluster analysis. Simulation experiments show that our model can improve cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.
HICOSMO: cosmology with a complete sample of galaxy clusters - II. Cosmological results

NASA Astrophysics Data System (ADS)

Schellenberger, G.; Reiprich, T. H.

2017-10-01

The X-ray bright, hot gas in the potential well of a galaxy cluster enables systematic X-ray studies of samples of galaxy clusters to constrain cosmological parameters. HIFLUGCS consists of the 64 X-ray brightest galaxy clusters in the Universe, building up a local sample. Here, we utilize this sample to determine, for the first time, individual hydrostatic mass estimates for all the clusters of the sample and, by making use of the completeness of the sample, we quantify constraints on the two interesting cosmological parameters, Ωm and σ8. We apply our total hydrostatic and gas mass estimates from the X-ray analysis to a Bayesian cosmological likelihood analysis and leave several parameters free to be constrained. We find Ωm = 0.30 ± 0.01 and σ8 = 0.79 ± 0.03 (statistical uncertainties, 68 per cent credibility level) using our default analysis strategy combining both a mass function analysis and the gas mass fraction results. The main sources of biases that we correct here are (1) the influence of galaxy groups (incompleteness in parent samples and differing behaviour of the Lx-M relation), (2) the hydrostatic mass bias, (3) the extrapolation of the total mass (comparing various methods), (4) the theoretical halo mass function and (5) other physical effects (non-negligible neutrino mass). We find that galaxy groups introduce a strong bias, since their number density seems to be over predicted by the halo mass function. On the other hand, incorporating baryonic effects does not result in a significant change in the constraints. The total (uncorrected) systematic uncertainties (∼20 per cent) clearly dominate the statistical uncertainties on cosmological parameters for our sample.
A baseline record of trace elements concentration along the beach placer mining areas of Kanyakumari coast, South India.

PubMed

Simon Peter, T; Chandrasekar, N; John Wilson, J S; Selvakumar, S; Krishnakumar, S; Magesh, N S

2017-06-15

Trace element concentration in the beach placer mining areas of Kanyakumari coast, South India was assessed. Sewage and contaminated sediments from mining sites have contaminated the surface sediments. Enrichment factor indicates moderately severe enrichment for Pb, minor enrichment for Mn, Zn, Ni, Fe and no enrichment for Cr and Cu. The Igeo values show higher concentration of Pb ranging in the scale of 3-4, which shows strong contamination due to high anthropogenic activity such as mining and terrestrial influences into the coastal regions. Correlation coefficient shows that most of the elements are associated with each other except Ni and Pb. Factor analysis reveals that Mn, Zn, Fe, Cr, Pb and Cu are having a significant loading and it indicates that these elements are mainly derived from similar origin. The cluster analysis clearly indicated that the mining areas are grouped under cluster 2 and non-mining areas are clustered under group 1. Copyright © 2017 Elsevier Ltd. All rights reserved.
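The two indices named in this abstract have standard textbook definitions, sketched below. The geoaccumulation index is Igeo = log2(C / (1.5 × B)), where C is the measured concentration and B the geochemical background, and the enrichment factor is the sample's element-to-reference ratio divided by the same ratio in the background. The abstract does not say which background values or reference element the authors used, so the function names and all numbers here are hypothetical.

```python
import math

def igeo(concentration, background):
    """Geoaccumulation index: log2(C / (1.5 * B)); 1.5 allows for natural variation."""
    return math.log2(concentration / (1.5 * background))

def enrichment_factor(element, reference, element_bg, reference_bg):
    """(element / reference) in the sample over the same ratio in the background."""
    return (element / reference) / (element_bg / reference_bg)

# Hypothetical values in mg/kg, chosen only to illustrate the arithmetic.
print(igeo(240.0, 20.0))  # → 3.0
print(enrichment_factor(40.0, 20000.0, 20.0, 40000.0))
```

With these made-up inputs the Igeo comes out at 3.0, the lower edge of the 3-4 band that the abstract describes as strong contamination.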
Chemical Composition and Crystal Morphology of Epicuticular Wax in Mature Fruits of 35 Pear (Pyrus spp.) Cultivars

PubMed Central

Wu, Xiao; Yin, Hao; Shi, Zebin; Chen, Yangyang; Qi, Kaijie; Qiao, Xin; Wang, Guoming; Cao, Peng; Zhang, Shaoling

2018-01-01

An evaluation of fruit wax components will provide us with valuable information for pear breeding and enhancing fruit quality. Here, we dissected the epicuticular wax concentration, composition and structure of mature fruits from 35 pear cultivars belonging to five different species and hybrid interspecies. A total of 146 epicuticular wax compounds were detected, and the wax composition and concentration varied dramatically among species, with the highest level of 1.53 mg/cm2 in Pyrus communis and the lowest level of 0.62 mg/cm2 in Pyrus pyrifolia. Field emission scanning electron microscopy (FESEM) analysis showed amorphous structures of the epicuticular wax crystals of different pear cultivars. Cluster analysis revealed that the Pyrus bretschneideri cultivars were grouped much closer to Pyrus pyrifolia and Pyrus ussuriensis, and the Pyrus sinkiangensis cultivars were clustered into a distant group. Based on the principal component analysis (PCA), the cultivars could be divided into three groups and five groups according to seven main classes of epicuticular wax compounds and 146 wax compounds, respectively. PMID:29875784
Investigating the limitations of tree species classification using the Combined Cluster and Discriminant Analysis method for low density ALS data from a dense forest region in Aggtelek (Hungary)

NASA Astrophysics Data System (ADS)

Koma, Zsófia; Deák, Márton; Kovács, József; Székely, Balázs; Kelemen, Kristóf; Standovár, Tibor

2016-04-01

Airborne Laser Scanning (ALS) is a widely used technology for forestry classification applications. However, single tree detection and species classification from low density ALS point cloud is limited in a dense forest region. In this study we investigate the division of a forest into homogeneous groups at stand level. The study area is located in the Aggtelek karst region (Northeast Hungary) with a complex relief topography. The ALS dataset contained only 4 discrete echoes (at 2-4 pt/m2 density) from the study area during leaf-on season. Ground-truth measurements about canopy closure and proportion of tree species cover are available every 70 meters in 500 square meter circular plots. In the first step, ALS data were processed and geometrical and intensity based features were calculated into a 5×5 meter raster based grid. The derived features contained: basic statistics of relative height, canopy RMS, echo ratio, openness, pulse penetration ratio, basic statistics of radiometric features. In the second step the data were investigated using Combined Cluster and Discriminant Analysis (CCDA, Kovács et al., 2014). The CCDA method first determines a basic grouping for the multiple circle shaped sampling locations using hierarchical clustering and then for the arising grouping possibilities a core cycle is executed comparing the goodness of the investigated groupings with random ones. Out of these comparisons difference values arise, yielding information about the optimal grouping out of the investigated ones. If sub-groups are then further investigated, one might even find homogeneous groups. We found that the classification of low density ALS data into homogeneous groups is highly dependent on canopy closure and the proportion of the dominant tree species. The presented results show the high potential of CCDA for determining homogeneous separable groups in LiDAR based tree species classification.
Aggtelek Karst/Slovakian Karst Caves" (HUSK/1101/221/0180, Aggtelek NP), data evaluation: 'Multipurpose assessment serving forest biodiversity conservation in the Carpathian region of Hungary', Swiss-Hungarian Cooperation Programme (SH/4/13 Project). BS contributed as an Alexander von Humboldt Research Fellow. J. Kovács, S. Kovács, N. Magyar, P. Tanos, I. G. Hatvani, and A. Anda (2014), Classification into homogeneous groups using combined cluster and discriminant analysis, Environmental Modelling & Software, 57, 52-59.
Usage of K-cluster and factor analysis for grouping and evaluation the quality of olive oil in accordance with physico-chemical parameters

NASA Astrophysics Data System (ADS)

Milev, M.; Nikolova, Kr.; Ivanova, Ir.; Dobreva, M.

2015-11-01

Twenty-five olive oils, differing in origin and method of extraction, were studied with respect to 17 physico-chemical parameters: the color parameters a and b, light, fluorescence peaks, the pigments chlorophyll and β-carotene, and fatty-acid content. The goals of the study were: (1) to conduct a correlation analysis to find the inner relations between the studied indices; (2) to apply factor analysis by the method of principal components (PCA) in order to reduce the large number of variables to a few factors that are of main importance for distinguishing the different types of olive oil; and (3) to use K-means clustering to compare and group the tested olive oils based on their similarity. The inner relations between the studied indices were found by correlation analysis. A factor analysis using PCA was then applied on the basis of the resulting correlation matrix. The number of studied indices was thus reduced to 4 factors, which explained 79.3% of the total variation. The first factor combined the color parameters, β-carotene and the fluorescence peak at about 520 nm, which is related to oxidation products. The second was determined mainly by the chlorophyll content and the related fluorescence peak at about 670 nm. The third and fourth factors were determined by the fatty-acid content of the samples. The third combined the fatty acids that make it possible to distinguish olive oil from other plant oils: oleic, linoleic and stearic acids. The fourth factor included fatty acids with a much lower content in the studied samples. K-means cluster analysis requires the number of clusters to be specified in advance. The variant K = 3 was used because there were three types of olive oil. The first cluster combined all salad and pomace olive oils; the second combined the samples of extra virgin oils taken as controls from producers, which were bought from the trade network. The third cluster combined samples of pomace and extra virgin oils whose parameters differ from those of natural olive oils because of the presence of plant-oil impurities.
Implementation of spectral clustering on microarray data of carcinoma using k-means algorithm

NASA Astrophysics Data System (ADS)

Frisca; Bustamam, Alhadi; Siswantining, Titin

2017-03-01

Clustering is a data analysis method that aims to place data with similar characteristics in the same group. Spectral clustering is one of the most popular modern clustering algorithms. As an effective clustering technique, the spectral clustering method emerged from the concepts of spectral graph theory. Spectral clustering needs a partitioning algorithm. Available partitioning methods include PAM, SOM, fuzzy c-means, and k-means. Based on the research done by Capital and Choudhury in 2013, the k-means algorithm provides better accuracy than the PAM algorithm when using Euclidean distance. So in this paper we use k-means as our partitioning algorithm. The major advantage of spectral clustering is that it reduces the data dimension, in this case the dimension of a large microarray dataset. A microarray is a small chip made of a glass plate containing thousands or even tens of thousands of kinds of genes in DNA fragments derived from doubling cDNA. Microarray data are widely used to detect cancer, for example carcinoma, in which cancer cells express abnormalities in their genes. The purpose of this research is to place data that have high similarity in the same group and data that have low similarity in different groups. In this research, the carcinoma microarray data comprise 7457 genes. The partitioning with the k-means algorithm yields two clusters.
Characterization of esculin-positive Pseudomonas fluorescens strains isolated from an underground brook.

PubMed

Svec, P; Stegnerová, H; Durnová, E; Sedlácek, I

2004-01-01

A group of sixteen esculin-positive fluorescent pseudomonads isolated from an underground brook flowing through a cave complex was characterized by biotyping, multiple enzyme restriction fragment length polymorphism analysis of 16S rDNA (MERFLP), ribotyping and whole-cell fatty-acid methyl-esters analysis (FAME). All strains were phenotypically close to Pseudomonas fluorescens, but they revealed high biochemical variability as well as some reactions atypical for P. fluorescens species. Because identification of pseudomonads by biochemical testing is often unclear, further techniques were employed. Fingerprints obtained by MERFLP clearly showed that all strains represent P. fluorescens species. Ribotyping separated the strains analyzed into four groups corresponding almost completely (with the exception of one strain) to the clustering based on biochemical profiles. FAME analysis grouped all the strains into one cluster together with the P. putida (biotype A, B), P. chlororaphis and P. fluorescens biotype F representatives, but differentiated them from other FAME profiles of all pseudomonads included in the standard library TSBA 40 provided by MIDI, Inc.

Analysis of intra-host genetic diversity of Prunus necrotic ringspot virus (PNRSV) using amplicon next generation sequencing.

PubMed

Kinoti, Wycliff M; Constable, Fiona E; Nancarrow, Narelle; Plummer, Kim M; Rodoni, Brendan

2017-01-01

PCR amplicon next generation sequencing (NGS) analysis offers a broadly applicable and targeted approach to detect populations of both high- and low-frequency virus variants in one or more plant samples. In this study, amplicon NGS was used to explore the diversity of the tripartite genome virus, Prunus necrotic ringspot virus (PNRSV) from 53 PNRSV-infected trees using amplicons from conserved gene regions of each of PNRSV RNA1, RNA2 and RNA3. Sequencing of the amplicons from 53 PNRSV-infected trees revealed differing levels of polymorphism across the three different components of the PNRSV genome with a total number of 5040, 2083 and 5486 sequence variants observed for RNA1, RNA2 and RNA3 respectively. The RNA2 had the lowest diversity of sequences compared to RNA1 and RNA3, reflecting the lack of flexibility tolerated by the replicase gene that is encoded by this RNA component. Distinct PNRSV phylo-groups, consisting of closely related clusters of sequence variants, were observed in each of PNRSV RNA1, RNA2 and RNA3. Most plant samples had a single phylo-group for each RNA component. Haplotype network analysis showed that smaller clusters of PNRSV sequence variants were genetically connected to the largest sequence variant cluster within a phylo-group of each RNA component. Some plant samples had sequence variants occurring in multiple PNRSV phylo-groups in at least one of each RNA and these phylo-groups formed distinct clades that represent PNRSV genetic strains. Variants within the same phylo-group of each Prunus plant sample had ≥97% similarity and phylo-groups within a Prunus plant sample and between samples had ≤97% similarity. Based on the analysis of diversity, a definition of a PNRSV genetic strain was proposed. The proposed definition was applied to determine the number of PNRSV genetic strains in each of the plant samples and the complexity in defining genetic strains in multipartite genome viruses was explored.
Genetic structure in four West African population groups

PubMed Central

Adeyemo, Adebowale A; Chen, Guanjie; Chen, Yuanxiu; Rotimi, Charles

2005-01-01

Background: Africa contains the most genetically divergent group of continental populations and several studies have reported that African populations show a high degree of population stratification. In this regard, it is important to investigate the potential for population genetic structure or stratification in genetic epidemiology studies involving multiple African populations. The presence of genetic sub-structure, if not properly accounted for, has been reported to lead to spurious association between a putative risk allele and a disease. Within the context of the Africa America Diabetes Mellitus (AADM) Study (a genetic epidemiologic study of type 2 diabetes mellitus in West Africa), we have investigated population structure or stratification in four ethnic groups in two countries (Akan and Gaa-Adangbe from Ghana, Yoruba and Igbo from Nigeria) using data from 372 autosomal microsatellite loci typed in 493 unrelated persons (986 chromosomes). Results: There was no significant population genetic structure in the overall sample. The smallest probability is associated with an inferred cluster of 1 and little of the posterior probability is associated with a higher number of inferred clusters. The distribution of members of the sample to inferred clusters is consistent with this finding; roughly the same proportion of individuals from each group is assigned to each cluster with little variation between the ethnic groups. Analysis of molecular variance (AMOVA) showed that the between-population component of genetic variance is less than 0.1% in contrast to 99.91% for the within-population component. Pair-wise genetic distances between the four ethnic groups were also very similar. Nonetheless, the small between-population genetic variance was sufficient to distinguish the two Ghanaian groups from the two Nigerian groups.
Conclusion: There was little evidence for significant population substructure in the four major West African ethnic groups represented in the AADM study sample. Ethnicity apparently did not introduce differential allele frequencies that may affect analysis and interpretation of linkage and association studies. These findings, although not entirely surprising given the geographical proximity of these groups, provide important insights into the genetic relationships between the ethnic groups studied and confirm previous results that showed close genetic relationship between most studied West African groups. PMID:15978124
Clustering of change patterns using Fourier coefficients.

PubMed

Kim, Jaehee; Kim, Haseong

2008-01-15

To understand the behavior of genes, it is important to explore how the patterns of gene expression change over a time period because biologically related gene groups can share the same change patterns. Many clustering algorithms have been proposed to group observation data. However, because of the complexity of the underlying functions there have not been many studies on grouping data based on change patterns. In this study, the problem of finding similar change patterns is reduced to clustering with the derivative Fourier coefficients. The sample Fourier coefficients not only provide information about the underlying functions, but also reduce the dimension. In addition, as their limiting distribution is a multivariate normal, a model-based clustering method incorporating statistical properties would be appropriate. This work is aimed at discovering gene groups with similar change patterns that share similar biological properties. We developed a statistical model using derivative Fourier coefficients to identify similar change patterns of gene expression. We used a model-based method to cluster the Fourier series estimation of derivatives. The model-based method is advantageous over other methods in our proposed model because the sample Fourier coefficients asymptotically follow the multivariate normal distribution. Change patterns are automatically estimated with the Fourier representation in our model. Our model was tested in simulations and on real gene data sets. The simulation results showed that the model-based clustering method with the sample Fourier coefficients has a lower clustering error rate than K-means clustering. Even when the number of repeated time points was small, the same results were obtained. We also applied our model to cluster change patterns of yeast cell cycle microarray expression data with alpha-factor synchronization.
It showed that, as the method clusters with the probability-neighboring data, the model-based clustering with our proposed model yielded biologically interpretable results. We expect that our proposed Fourier analysis with suitably chosen smoothing parameters could serve as a useful tool in classifying genes and interpreting possible biological change patterns. The R program is available upon request.
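The feature-extraction idea in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' R program: the function names, the sampling grid on [0, 1) and the toy series are assumptions made for illustration. For a series written as f(t) = a0 + Σ (a_k cos 2πkt + b_k sin 2πkt), the derivative has cosine coefficients 2πk·b_k and sine coefficients −2πk·a_k, so a constant offset in expression level does not affect the features.

```python
import math

def fourier_coefficients(y, n_harmonics):
    """Sample Fourier coefficients (a_k, b_k), k = 1..n_harmonics, of a series
    observed at n equally spaced points t_j = j/n on [0, 1) (assumed grid)."""
    n = len(y)
    coeffs = []
    for k in range(1, n_harmonics + 1):
        a_k = 2.0 / n * sum(v * math.cos(2 * math.pi * k * j / n) for j, v in enumerate(y))
        b_k = 2.0 / n * sum(v * math.sin(2 * math.pi * k * j / n) for j, v in enumerate(y))
        coeffs.append((a_k, b_k))
    return coeffs

def derivative_coefficients(coeffs):
    """Coefficients of f'(t): cosine term 2*pi*k*b_k, sine term -2*pi*k*a_k."""
    return [(2 * math.pi * k * b_k, -2 * math.pi * k * a_k)
            for k, (a_k, b_k) in enumerate(coeffs, start=1)]

def change_pattern_features(y, n_harmonics=2):
    """Flat feature vector that could be handed to a clustering method."""
    flat = []
    for cos_part, sin_part in derivative_coefficients(fourier_coefficients(y, n_harmonics)):
        flat.extend([cos_part, sin_part])
    return flat

# Toy series (invented): same change pattern, different baseline level.
n = 16
base = [math.sin(2 * math.pi * j / n) for j in range(n)]
shifted = [v + 5.0 for v in base]
features_base = change_pattern_features(base)
features_shifted = change_pattern_features(shifted)
```

Both toy series produce the same feature vector, which is the property that makes derivative coefficients suitable for grouping by change pattern. The model-based clustering step itself (for example, fitting a Gaussian mixture to such vectors) is not shown.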
Fall episodes in elderly patients with asthma and COPD - a pilot study.

PubMed

Bozek, Andrzej; Jarzab, Jerzy; Hadas, Ewa; Jakalski, Marek; Canonica, Giorgio Walter

2018-05-08

Evidence of an increased risk of falls in patients with chronic obstructive pulmonary disease (COPD) exists; however, this has not been studied in elderly asthmatic patients. The aim of the study was to determine the incidence of falls in elderly patients who were diagnosed with bronchial asthma compared to subjects with COPD. A 12-month prospective observational study in elderly outpatients with a diagnosis of either asthma or COPD was conducted. All of the participants were monitored on the following parameters: falls, comorbidities, drug therapy and the Berg Balance Scale. The rate of falls was shown as an incidence ratio. Cluster analysis for subgroups with similar features was performed on all patients included in the study. Two clusters of frequent fallers were determined. The fall incidence rate in falls per person per year was 1.41 (95% CI: 0.86-1.96) in asthmatic patients and 1.49 (95% CI: 1.05-2.11) in the COPD group. Frequent fallers were more prevalent in the COPD group, with 32% in this group compared to 28% in the group of patients with asthma. In cluster analysis, frequent fallers were grouped into two models characterized by polytherapy, depression symptoms, hospitalizations, coronary disease, dementia and diagnosis of COPD or asthma. Elderly asthmatic patients presented a high rate of falls, which is comparable to that of patients with COPD.
Cluster analysis of particulate matter (PM10) and black carbon (BC) concentrations

NASA Astrophysics Data System (ADS)

Žibert, Janez; Pražnikar, Jure

2012-09-01

The monitoring of air-pollution constituents like particulate matter (PM10) and black carbon (BC) can provide information about air quality and the dynamics of emissions. Air quality depends on natural and anthropogenic sources of emissions as well as the weather conditions. For a one-year period the diurnal concentrations of PM10 and BC in the Port of Koper were analysed by clustering days into similar groups according to the similarity of the BC and PM10 hourly derived day-profiles without any prior assumptions about working and non-working days, weather conditions or hot and cold seasons. The analysis was performed by using k-means clustering with the squared Euclidean distance as the similarity measure. The analysis showed that 10 clusters in the BC case produced 3 clusters with just one member day and 7 clusters that encompass more than one day with similar BC profiles. Similar results were found in the PM10 case, where one cluster has a single-member day, while 7 clusters contain several member days. The clustering analysis revealed that the clusters with less pronounced bimodal patterns and low hourly and average daily concentrations for both types of measurements include the most days in the one-year analysis. A typical day profile of the BC measurements includes a bimodal pattern with morning and evening peaks, while the PM10 measurements reveal a less pronounced bimodality. There are also clusters with single-peak day-profiles. The BC data in such cases exhibit morning peaks, while the PM10 data consist of noon or afternoon single peaks. Single pronounced peaks can be explained by appropriate cluster wind speed profiles. The analysis also revealed some special day-profiles. The BC cluster with a high midnight peak on 30/04/2010 and the PM10 cluster with the highest observed concentration of PM10 on 01/05/2010 (208.0 μg m⁻³) coincide with 1 May, which is a national holiday in Slovenia and has a very strong tradition of bonfire parties.
The clustering of the diurnal concentrations showed that a variety of day-profiles are present in the cold period, while this is not the case for the hot season. Additional analysis of ship traffic and rainfall data showed that there is no statistically significant difference between the ship gross (bruto) registered tonnage (BRT) values in the case of the BC and PM10 clusters, but that there are statistically significant differences between the rainfall in the BC and PM10 clusters. The wind-rose for the clusters that included most days in the sampling period indicated that PM10 and BC emitted from the Port of Koper were mainly transported in the west direction over the sea and in the east direction, where there is no populated area. The presented analysis showed that both BC and PM10 concentrations were driven by rain intensity and wind speed.
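The clustering step described above (k-means over day-profiles with the squared Euclidean distance) can be sketched in plain Python. The six-value "day profiles" below are invented toy numbers standing in for 24 hourly concentrations, and the choice of starting centroids is an assumption; the sketch shows the standard Lloyd iteration, not the authors' implementation.

```python
def sq_euclidean(p, q):
    """Squared Euclidean distance between two equally long profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(profiles, centroids, n_iter=100):
    """Lloyd's k-means: assign each profile to its nearest centroid, then
    recompute each centroid as the mean of its members, until labels settle."""
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(len(centroids)), key=lambda c: sq_euclidean(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(len(centroids)):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy day-profiles (invented): two bimodal days, two single-peak days.
days = [
    [1, 5, 2, 2, 6, 1],
    [1, 6, 2, 1, 5, 1],
    [1, 1, 6, 1, 1, 1],
    [1, 2, 7, 2, 1, 1],
]
labels, centroids = kmeans(days, [days[0], days[2]])
```

With these toy inputs the two bimodal days share one label and the two single-peak days share the other. In practice the result depends on the starting centroids, so several random restarts are commonly used.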
Clustering stock market companies via chaotic map synchronization

NASA Astrophysics Data System (ADS)

Basalto, N.; Bellotti, R.; De Carlo, F.; Facchi, P.; Pascazio, S.

2005-01-01

A pairwise clustering approach is applied to the analysis of the Dow Jones index companies, in order to identify similar temporal behavior of the traded stock prices. To this end, the chaotic map clustering algorithm is used, where a map is associated to each company and the correlation coefficients of the financial time series to the coupling strengths between maps. The simulation of a chaotic map dynamics gives rise to a natural partition of the data, as companies belonging to the same industrial branch are often grouped together. The identification of clusters of companies of a given stock market index can be exploited in the portfolio optimization strategies.
A Search for Ram-pressure Stripping in the Hydra I Cluster

NASA Technical Reports Server (NTRS)

Brown, B.

2005-01-01

Ram-pressure stripping is a method by which hot interstellar gas can be removed from a galaxy moving through a group or cluster of galaxies. Indirect evidence of ram-pressure stripping includes lowered X-ray brightness in a galaxy due to less X-ray emitting gas remaining in the galaxy. Here we present the initial results of our program to determine whether cluster elliptical galaxies have lower hot gas masses than their counterparts in less rich environments. This test requires the use of the high-resolution imaging of the Chandra Observatory and we present our analysis of the galaxies in the nearby cluster Hydra I.
A Search for Ram-pressure Stripping in the Hydra I Cluster

NASA Technical Reports Server (NTRS)

Brown, B. A.

2005-01-01

Ram-pressure stripping is a method by which hot interstellar gas can be removed from a galaxy moving through a group or cluster of galaxies. Indirect evidence of ram-pressure stripping includes lowered X-ray brightness in a galaxy due to less X-ray emitting gas remaining in the galaxy. Here we present the initial results of our program to determine whether cluster elliptical galaxies have lower hot gas masses than their counterparts in less rich environments. This test requires the use of the high-resolution imaging of the Chandra Observatory and we present our analysis of the galaxies in the nearby cluster Hydra I.
Comparing the performance of biomedical clustering methods.

PubMed

Wiwie, Christian; Baumbach, Jan; Röttger, Richard

2015-11-01

Identifying groups of similar objects is a popular first step in biomedical data analysis, but it is error-prone and impossible to perform manually. Many computational methods have been developed to tackle this problem. Here we assessed 13 well-known methods using 24 data sets ranging from gene expression to protein domains. Performance was judged on the basis of 13 common cluster validity indices. We developed a clustering analysis platform, ClustEval (http://clusteval.mpi-inf.mpg.de), to promote streamlined evaluation, comparison and reproducibility of clustering results in the future. This allowed us to objectively evaluate the performance of all tools on all data sets with up to 1,000 different parameter sets each, resulting in a total of more than 4 million calculated cluster validity indices. We observed that there was no universal best performer, but on the basis of this wide-ranging comparison we were able to develop a short guideline for biomedical clustering tasks. ClustEval allows biomedical researchers to pick the appropriate tool for their data type and allows method developers to compare their tool to the state of the art.
Morphology and luminosity segregation of galaxies in nearby loose groups

NASA Astrophysics Data System (ADS)

Girardi, M.; Rigoni, E.; Mardirossian, F.; Mezzetti, M.

2003-08-01

We study morphology and luminosity segregation of galaxies in loose groups. We analyze the two catalogs of groups identified in the Nearby Optical Galaxy (NOG) sample, by means of hierarchical and percolation "friends-of-friends" methods (HG and PG catalogs, respectively). In the first part of our analysis we consider 387 and 436 groups of HG and PG and compare morphology- (luminosity-) weighted to unweighted group properties: velocity dispersion, mean pairwise distance, and mean groupcentric distance of member galaxies. The second part of our analysis is based on two ensemble systems, one for each catalog, built by suitably combining together galaxies of all groups (1584 and 1882 galaxies for HG and PG groups). We find that earlier-type (brighter) galaxies are more clustered and lie closer to the group centers, both in position and in velocity, than later-type (fainter) galaxies. Spatial segregations are stronger than kinematical segregations. These effects are generally detected at the ≳3-sigma level. Luminosity segregation is shown to be independent of morphology segregation. Our main conclusions are strengthened by the detection of segregation in both hierarchical and percolation catalogs. Our results agree with a continuum of segregation properties of galaxies in systems, from low-mass groups to massive clusters.
Investigation of Spatial and Temporal Trends in Water Quality in Daya Bay, South China Sea

PubMed Central

Wu, Mei-Lin; Wang, You-Shao; Dong, Jun-De; Sun, Cui-Ci; Wang, Yu-Tu; Sun, Fu-Lin; Cheng, Hao

2011-01-01

The objective is to identify the spatial and temporal variability of the hydrochemical quality of the water column in a subtropical coastal system, Daya Bay, China. Water samples were collected in four seasons at 12 monitoring sites. The Southeast Asian monsoons, northeasterly from October to the following April and southwesterly from May to September, also have an important influence on water quality in Daya Bay. In the spatial pattern, two groups were identified with the help of multidimensional scaling analysis and cluster analysis. Cluster I consisted of the sites S3, S8, S10 and S11 in the west and north coastal parts of Daya Bay. Cluster I is mainly related to anthropogenic activities such as fish-farming. Cluster II consisted of the rest of the stations in the center, east and south parts of Daya Bay. Cluster II is mainly related to seawater exchange from the South China Sea. PMID:21776234
Who Visits a National Park and What do They Get Out of It?: A Joint Visitor Cluster Analysis and Travel Cost Model for Yellowstone National Park

NASA Astrophysics Data System (ADS)

Benson, Charles; Watson, Philip; Taylor, Garth; Cook, Philip; Hollenhorst, Steve

2013-10-01

Yellowstone National Park visitor data were obtained from a survey collected for the National Park Service by the Park Studies Unit at the University of Idaho. Travel cost models have been conducted for national parks in the United States; however, this study builds on these studies and investigates how benefits vary by types of visitors who participate in different activities while at the park. Visitor clusters were developed based on activities in which a visitor participated while at the park. The clusters were analyzed and then incorporated into a travel cost model to determine the economic value (consumer surplus) that the different visitor groups received from visiting the park. The model was estimated using a zero-truncated negative binomial regression corrected for endogenous stratification. The travel cost price variable was estimated using both 1/3 and 1/4 the wage rate to test for sensitivity to opportunity cost specification. The average benefit across all visitor cluster groups was estimated at between $235 and $276 per person per trip. However, per trip benefits varied substantially across clusters: from $90 to $103 for the "value picnickers," to $185-$263 for the "backcountry enthusiasts," $189-$278 for the "do it all adventurists," $204-$303 for the "windshield tourists," and $323-$714 for the "creature comfort" cluster group.
Analysis of temporal gene expression profiles: clustering by simulated annealing and determining the optimal number of clusters.

PubMed

Lukashin, A V; Fuchs, R

2001-05-01

Cluster analysis of genome-wide expression data from DNA microarray hybridization studies has proved to be a useful tool for identifying biologically relevant groupings of genes and samples. In the present paper, we focus on several important issues related to clustering algorithms that have not yet been fully studied. We describe a simple and robust algorithm for the clustering of temporal gene expression profiles that is based on the simulated annealing procedure. In general, this algorithm is guaranteed to eventually find the globally optimal distribution of genes over clusters. We introduce an iterative scheme that serves to evaluate quantitatively the optimal number of clusters for each specific data set. The scheme is based on standard approaches used in regular statistical tests. The basic idea is to organize the search for the optimal number of clusters simultaneously with the optimization of the distribution of genes over clusters. The efficiency of the proposed algorithm has been evaluated by means of a reverse engineering experiment, that is, a situation in which the correct distribution of genes over clusters is known a priori. This statistically rigorous test has shown that our algorithm places more than 90% of genes into the correct clusters. Finally, the algorithm has been tested on real gene expression data (expression changes during the yeast cell cycle) for which the fundamental patterns of gene expression and the assignment of genes to clusters are well understood from numerous previous studies.
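A generic version of clustering by simulated annealing can be sketched as follows. The cost function (within-cluster sum of squared distances), the single-reassignment move, the geometric cooling schedule and all parameter values are assumptions chosen for illustration; the abstract does not specify the authors' choices, and the scheme for selecting the number of clusters is not reproduced here.

```python
import math
import random

def within_cluster_cost(profiles, labels, k):
    """Sum of squared distances from each profile to its cluster centroid."""
    cost = 0.0
    for c in range(k):
        members = [p for p, lab in zip(profiles, labels) if lab == c]
        if not members:
            continue
        centroid = [sum(col) / len(members) for col in zip(*members)]
        cost += sum(sum((a - b) ** 2 for a, b in zip(p, centroid)) for p in members)
    return cost

def anneal_clusters(profiles, k, t_start=1.0, t_end=1e-3, cooling=0.95,
                    moves_per_temp=50, seed=0):
    """Metropolis search over assignments; returns the best labels seen."""
    rng = random.Random(seed)
    labels = [rng.randrange(k) for _ in profiles]
    cost = within_cluster_cost(profiles, labels, k)
    best_labels, best_cost = list(labels), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            i = rng.randrange(len(profiles))
            old = labels[i]
            labels[i] = rng.randrange(k)        # propose moving one profile
            new_cost = within_cluster_cost(profiles, labels, k)
            delta = new_cost - cost
            # Always accept improvements; accept worse moves with prob exp(-delta/t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_labels, best_cost = list(labels), cost
            else:
                labels[i] = old                 # reject: undo the move
        t *= cooling                            # geometric cooling (assumed)
    return best_labels, best_cost

# Toy temporal profiles (invented): two obvious groups.
profiles = [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0], [1.0, 0.0, 1.0], [0.9, 0.1, 1.0]]
best_labels, best_cost = anneal_clusters(profiles, k=2)
```

Accepting occasional uphill moves at high temperature is what lets the search escape poor local optima. With a finite cooling schedule such as this one, reaching the global optimum is likely on small problems but not guaranteed.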
Assessment of sediment quality in the Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia): GIS approach-based chemometric methods.

PubMed

Kharroubi, Adel; Gargouri, Dorra; Baati, Houda; Azri, Chafai

2012-06-01

Concentrations of selected heavy metals (Cd, Pb, Zn, Cu, Mn, and Fe) in surface sediments from 66 sites in both northern and eastern Mediterranean Sea-Boughrara lagoon exchange areas (southeastern Tunisia) were studied in order to understand current metal contamination due to the urbanization and economic development of several nearby coastal regions of the Gulf of Gabès. Multiple approaches were applied for the sediment quality assessment. These approaches were based on GIS coupled with chemometric methods (enrichment factors, geoaccumulation index, principal component analysis, and cluster analysis). Enrichment factors and principal component analysis revealed two distinct groups of metals. The first group corresponded to Fe and Mn derived from natural sources, and the second group contained Cd, Pb, Zn, and Cu originating from man-made sources. For these latter metals, cluster analysis showed two distinct distributions in the selected areas. They were attributed to temporal and spatial variations of contaminant source inputs. The geoaccumulation index (I_geo) values showed that only Cd, Pb, and Cu can be considered as moderate to extreme pollutants in the studied sediments.
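The two indices named in the abstract have widely used textbook definitions, sketched below in Python. The abstract does not state which background values or which reference element the authors used, so the function arguments and the numbers in the example are placeholders, not data from the study.

```python
import math

def geoaccumulation_index(concentration, background):
    """Geoaccumulation index in its common form:
    I_geo = log2(C_n / (1.5 * B_n)), where C_n is the measured concentration
    and B_n the geochemical background; 1.5 allows for natural variability."""
    return math.log2(concentration / (1.5 * background))

def enrichment_factor(metal, reference, metal_background, reference_background):
    """Enrichment factor: the metal-to-reference-element ratio in the sample
    divided by the same ratio in the background material."""
    return (metal / reference) / (metal_background / reference_background)

# Placeholder values (invented), all in the same concentration units.
igeo_example = geoaccumulation_index(concentration=3.0, background=1.0)
ef_example = enrichment_factor(metal=4.0, reference=2.0,
                               metal_background=1.0, reference_background=1.0)
```

An I_geo at or below zero is usually read as unpolluted, with higher values indicating progressively stronger pollution. An enrichment factor near 1 points to a natural origin, and values well above 1 to anthropogenic input.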
Integration K-Means Clustering Method and Elbow Method For Identification of The Best Customer Profile Cluster

NASA Astrophysics Data System (ADS)

Syakur, M. A.; Khotimah, B. K.; Rochman, E. M. S.; Satoto, B. D.

2018-04-01

Clustering is a data mining technique used to analyse large volumes of varied data. Clustering is the process of grouping data into clusters so that each cluster contains data that are as similar as possible to one another and different from the objects in other clusters. SMEs in Indonesia have a variety of customers, but they have no mapping of these customers and so do not know which customers are loyal and which are not. Customer mapping is a grouping of customer profiles that facilitates the analysis and policy of SMEs in the production of goods, especially batik sales. The researchers use a combination of the K-means method and the elbow method to improve the efficiency and effectiveness of K-means when processing large amounts of data. K-means clustering is a local optimization method that is sensitive to the choice of the initial cluster centers, so a bad choice of initial centers leads to high errors and poor clustering results. The K-means algorithm also has difficulty determining the best number of clusters, so the elbow method is used to find the best number of clusters for K-means. According to the results, the elbow method can produce the same number of clusters K for different amounts of data. The number of clusters determined by the elbow method serves as the default for the characterization process based on the case study. The best clustering was determined from the SSE values on 500 clusters of batik visitors. The results show that the sharp decrease occurs at K = 3, so K = 3 is taken as the cut-off point for the best clustering.
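The elbow idea can be illustrated with a small, self-contained Python sketch: run K-means for several values of K, record the sum of squared errors (SSE), and stop at the K beyond which one more cluster buys little. The one-dimensional toy data, the evenly spaced initialization and the 50% gain threshold are all assumptions for illustration; the elbow is often judged by eye from a plot rather than by a fixed rule.

```python
def kmeans_sse(values, k, n_iter=50):
    """1-D k-means; returns the SSE at convergence. Centroids start at evenly
    spaced positions in the sorted data (an assumed, deterministic choice)."""
    data = sorted(values)
    centroids = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: (v - centroids[c]) ** 2)
            groups[nearest].append(v)
        updated = [sum(g) / len(g) if g else centroids[c]
                   for c, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return sum(min((v - c) ** 2 for c in centroids) for v in data)

def elbow(sse_by_k, min_gain=0.5):
    """Smallest k for which adding one more cluster reduces the SSE by less
    than min_gain (as a fraction of the current SSE). Heuristic rule."""
    ks = sorted(sse_by_k)
    for k, nxt in zip(ks, ks[1:]):
        current = sse_by_k[k]
        if current == 0 or (current - sse_by_k[nxt]) / current < min_gain:
            return k
    return ks[-1]

# Toy customer spending values (invented) with three obvious groups.
spend = [1, 2, 3, 21, 22, 23, 41, 42, 43]
sse = {k: kmeans_sse(spend, k) for k in range(1, 6)}
best_k = elbow(sse)
```

For this toy data the SSE falls steeply up to K = 3 and only slightly afterwards, so the rule selects K = 3. The threshold is a tuning choice, and real data rarely show such a clean bend.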
Stable isotope phenotyping via cluster analysis of NanoSIMS data as a method for characterizing distinct microbial ecophysiologies and sulfur-cycling in the environment

NASA Astrophysics Data System (ADS)

Dawson, K.; Scheller, S.; Dillon, J. G.; Orphan, V. J.

2016-12-01

Stable isotope probing (SIP) is a valuable tool for gaining insights into ecophysiology and biogeochemical cycling of environmental microbial communities by tracking isotopically labeled compounds into cellular macromolecules as well as into byproducts of respiration. SIP, in conjunction with nanoscale secondary ion mass spectrometry (NanoSIMS), allows for the visualization of isotope incorporation at the single cell level. In this manner, both active cells within a diverse population as well as heterogeneity in metabolism within a homogeneous population can be observed. The ecophysiological implications of these single cell stable isotope measurements are often limited to the taxonomic resolution of paired fluorescence in situ hybridization (FISH) microscopy. Here we introduce a taxonomy-independent method using multi-isotope SIP and NanoSIMS for identifying and grouping phenotypically similar microbial cells by their chemical and isotopic fingerprint. This method was applied to SIP experiments in a sulfur-cycling biofilm collected from sulfidic intertidal vents amended with 13C-acetate, 15N-ammonium, and 33S-sulfate. Using a cluster analysis technique based on fuzzy c-means to group cells according to their isotope (13C/12C, 15N/14N, and 33S/32S) and elemental ratio (C/CN and S/CN) profiles, our analysis partitioned 2200 cellular regions of interest (ROIs) into 5 distinct groups. These isotope phenotype groupings are reflective of the variation in labeled substrate uptake by cells in a multispecies metabolic network dominated by Gamma- and Deltaproteobacteria. Populations independently grouped by isotope phenotype were subsequently compared with paired FISH data, demonstrating a single coherent deltaproteobacterial cluster and multiple gammaproteobacterial groups, highlighting the distinct ecophysiologies of spatially-associated microbes within the sulfur-cycling biofilm from White Point Beach, CA.
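Fuzzy c-means, the clustering technique named above, gives each object a graded membership in every cluster instead of a single hard label. The following Python sketch shows the standard update rules under assumed settings (fuzzifier m = 2, user-supplied starting centers); the two-value "ratio profiles" are invented toy numbers, not NanoSIMS measurements.

```python
def fuzzy_c_means(points, centers, m=2.0, n_iter=100, tol=1e-9):
    """Standard fuzzy c-means. Returns (memberships, centers), where
    memberships[j][i] is the degree to which point j belongs to cluster i,
    as computed in the last update step."""
    centers = [list(c) for c in centers]
    dims = len(points[0])
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for p in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            if min(d2) == 0.0:
                # Point sits exactly on a center: give it full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in d2]
            else:
                # u_i proportional to (squared distance)^(-1/(m-1)).
                row = [d ** (-1.0 / (m - 1.0)) for d in d2]
            total = sum(row)
            memberships.append([r / total for r in row])
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in memberships]   # weights u_ij^m
            new_centers.append(
                [sum(wj * p[d] for wj, p in zip(w, points)) / sum(w)
                 for d in range(dims)]
            )
        shift = max(abs(a - b)
                    for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return memberships, centers

# Toy ratio profiles (invented): two tight groups of "cells".
cells = [[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]]
memberships, centers = fuzzy_c_means(cells, [[0.0, 0.0], [5.0, 5.0]])
```

Each membership row sums to 1, and a hard grouping can be obtained by taking the largest membership per cell. The fuzzifier m controls how soft the assignments are; values near 1 approach ordinary k-means.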
Stable Isotope Phenotyping via Cluster Analysis of NanoSIMS Data As a Method for Characterizing Distinct Microbial Ecophysiologies and Sulfur-Cycling in the Environment

PubMed Central

Dawson, Katherine S.; Scheller, Silvan; Dillon, Jesse G.; Orphan, Victoria J.

2016-01-01

Stable isotope probing (SIP) is a valuable tool for gaining insights into ecophysiology and biogeochemical cycling of environmental microbial communities by tracking isotopically labeled compounds into cellular macromolecules as well as into byproducts of respiration. SIP, in conjunction with nanoscale secondary ion mass spectrometry (NanoSIMS), allows for the visualization of isotope incorporation at the single cell level. In this manner, both active cells within a diverse population and heterogeneity in metabolism within a homogeneous population can be observed. The ecophysiological implications of these single cell stable isotope measurements are often limited to the taxonomic resolution of paired fluorescence in situ hybridization (FISH) microscopy. Here we introduce a taxonomy-independent method using multi-isotope SIP and NanoSIMS for identifying and grouping phenotypically similar microbial cells by their chemical and isotopic fingerprint. This method was applied to SIP experiments in a sulfur-cycling biofilm collected from sulfidic intertidal vents amended with 13C-acetate, 15N-ammonium, and 33S-sulfate. Using a cluster analysis technique based on fuzzy c-means to group cells according to their isotope (13C/12C, 15N/14N, and 33S/32S) and elemental ratio (C/CN and S/CN) profiles, our analysis partitioned ~2200 cellular regions of interest (ROIs) into five distinct groups. These isotope phenotype groupings are reflective of the variation in labeled substrate uptake by cells in a multispecies metabolic network dominated by Gamma- and Deltaproteobacteria. Populations independently grouped by isotope phenotype were subsequently compared with paired FISH data, demonstrating a single coherent deltaproteobacterial cluster and multiple gammaproteobacterial groups, highlighting the distinct ecophysiologies of spatially-associated microbes within the sulfur-cycling biofilm from White Point Beach, CA. PMID:27303371
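As an illustration of the fuzzy c-means grouping mentioned in the abstract above, the following is a minimal sketch only, not the authors' pipeline. It assumes each ROI is already reduced to a tuple of numeric features (for example isotope and elemental ratios); the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the iteration count, and the synthetic feature values are all hypothetical choices made for this example.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Return (centers, memberships) for a list of equal-length feature tuples.

    memberships[i][k] is the degree (0..1) to which point i belongs to
    cluster k; each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means, with weights u**m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )
        # Memberships are proportional to distance**(-2/(m-1)).
        for i, p in enumerate(points):
            dist = [
                max(sum((p[d] - c[d]) ** 2 for d in range(dim)) ** 0.5, 1e-12)
                for c in centers
            ]
            inv = [d_k ** (-2.0 / (m - 1.0)) for d_k in dist]
            total = sum(inv)
            u[i] = [v / total for v in inv]
    return centers, u


# Synthetic two-feature "ROIs" (made-up values, not data from the study).
rois = [(0.0, 0.1), (0.1, 0.0), (0.05, 0.05),
        (5.0, 5.1), (5.1, 5.0), (5.05, 5.05)]
centers, memberships = fuzzy_c_means(rois, n_clusters=2)
# A hard label per ROI is the cluster with the largest membership.
hard_labels = [row.index(max(row)) for row in memberships]
```

Unlike hard clustering, the membership rows keep information about ROIs that sit between groups, which may be useful when phenotypes overlap. In practice the features would likely need scaling to comparable ranges before clustering.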
Determination of Arctic sea ice variability modes on interannual timescales via nonhierarchical clustering

NASA Astrophysics Data System (ADS)

Fučkar, Neven-Stjepan; Guemas, Virginie; Massonnet, François; Doblas-Reyes, Francisco

2015-04-01

Over the modern observational era, the northern hemisphere sea ice concentration, age and thickness have experienced a sharp long-term decline superimposed with strong internal variability. Hence, there is a crucial need to identify robust patterns of Arctic sea ice variability on interannual timescales and disentangle them from the long-term trend in noisy datasets. Principal component analysis (PCA) is a versatile and broadly used method for the study of climate variability. However, PCA has several limitations because it assumes that all modes of variability are symmetric between positive and negative phases, and it suppresses nonlinearities by using a linear covariance matrix. Clustering methods offer an alternative set of dimension reduction tools that are more robust and capable of taking into account possible nonlinear characteristics of a climate field. Cluster analysis aggregates data into groups or clusters based on their distance, to simultaneously minimize the distance between data points in a given cluster and maximize the distance between the centers of the clusters. We extract modes of Arctic interannual sea-ice variability with nonhierarchical K-means cluster analysis and investigate the mechanisms leading to these modes. Our focus is on the sea ice thickness (SIT) as the base variable for clustering because SIT holds most of the climate memory for variability and predictability on interannual timescales. We primarily use global reconstructions of sea ice fields with a state-of-the-art ocean-sea-ice model, but we also verify the robustness of the determined clusters in other Arctic sea ice datasets. Cluster analysis applied over the 1958-2013 period shows that the optimal number of detrended SIT clusters is K=3. The resulting SIT cluster patterns and their time series of occurrence are rather similar between different seasons and months. Two opposite thermodynamic modes are characterized by prevailing negative or positive SIT anomalies over the Arctic basin. The intermediate mode, with negative anomalies centered on the East Siberian shelf and positive anomalies along the North American side of the basin, has predominantly dynamic characteristics. The associated sea ice concentration (SIC) clusters vary more between different seasons and months, but the SIC patterns are physically framed by the SIT cluster patterns.
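The detrend-then-cluster idea in the abstract above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the study's method: the linear least-squares detrending, the farthest-point initialisation (chosen here only to make the example deterministic), the function names `detrend` and `k_means`, and the synthetic anomaly values are all hypothetical.

```python
def detrend(series):
    """Remove the least-squares linear trend from a 1D series (len >= 2)."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean)
                for t, y in enumerate(series)) / denom
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(rows, k, n_iter=50):
    """Plain K-means; returns (centers, labels).

    Initial centers use a farthest-point rule: start from the first row,
    then repeatedly add the row farthest from all centers chosen so far.
    """
    centers = [list(rows[0])]
    while len(centers) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(rows)
    for _ in range(n_iter):
        # Assignment step: each row goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j]))
                  for r in rows]
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


# Synthetic example: one row per year, one column per grid point
# (made-up anomaly values, not data from the study).
anomalies = [(-1.0, -1.1), (-0.9, -1.0), (0.0, 0.1),
             (0.1, -0.1), (1.0, 0.9), (1.1, 1.0)]
centers, labels = k_means(anomalies, k=3)
```

In a real workflow `detrend` would be applied to the time series at each grid point before the yearly maps are flattened into rows, and the number of clusters would be chosen with a validity criterion rather than fixed in advance.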
Passion and intrinsic motivation in digital gaming.

PubMed

Wang, Chee Keng John; Khoo, Angeline; Liu, Woon Chia; Divaharan, Shanti

2008-02-01

Digital gaming is fast becoming a favorite activity all over the world. Yet very few studies have examined the underlying motivational processes involved in digital gaming. One motivational force that receives little attention in psychology is passion, which could help us understand the motivation of gamers. The purpose of the present study was to identify subgroups of young people with distinctive passion profiles on self-determined regulations, flow dispositions, affect, and engagement time in gaming. One hundred fifty-five students from two secondary schools in Singapore participated in the survey. There were 134 males and 8 females (13 unspecified). The participants completed a questionnaire to measure harmonious passion (HP), obsessive passion (OP), perceived locus of causality, dispositional flow, positive and negative affect, and engagement time in gaming. Cluster analysis found three clusters with distinct passion profiles. The first cluster had an average HP/OP profile, the second cluster had a low HP/OP profile, and the third cluster had a high HP/OP profile. The three clusters displayed different levels of cognitive, affective, and behavioral outcomes. Cluster analysis, as this study shows, is useful in identifying groups of gamers with different passion profiles. It has helped us gain a deeper understanding of motivation in digital gaming.
HPLC-DAD-ESI-MS Analysis of Flavonoids from Leaves of Different Cultivars of Sweet Osmanthus.

PubMed

Wang, Yiguang; Fu, Jianxin; Zhang, Chao; Zhao, Hongbo

2016-09-14

Osmanthus fragrans Lour. has traditionally been a popular ornamental plant in China. In this study, ethanol extracts of the leaves of four cultivar groups of O. fragrans were analyzed by high-performance liquid chromatography coupled with diode array detection (HPLC-DAD) and high-performance liquid chromatography with electrospray ionization and mass spectrometry (HPLC-ESI-MS). The results suggest that variation in flavonoids among O. fragrans cultivars is quantitative, rather than qualitative. Fifteen components were detected and separated, among which the structures of 11 flavonoids and two coumarins were identified or tentatively identified. According to principal component analysis (PCA) and hierarchical cluster analysis (HCA) based on the abundance of these components (expressed as rutin equivalents), 22 selected cultivars were classified into four clusters. The seven cultivars from Cluster III ('Xiaoye Sugui', 'Boye Jingui', 'Wuyi Dangui', 'Yingye Dangui', 'Danzhuang', 'Foding Zhu', and 'Tianxiang Taige'), which are enriched in rutin and total flavonoids, and 'Sijigui' from Cluster II, which contained the highest amounts of kaempferol glycosides and apigenin 7-O-glucoside, could be selected as potential pharmaceutical resources. However, the chemotaxonomy in this paper does not correlate with the distribution of the existing cultivar groups, demonstrating that the distribution of flavonoids in O. fragrans leaves does not provide an effective means of classification for O. fragrans cultivars based on flower color.
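For readers unfamiliar with hierarchical cluster analysis, the sketch below shows one common variant, agglomerative clustering with average linkage on Euclidean distances. The abstract does not state which linkage or distance the authors used, so both are assumptions, as are the function name `hierarchical_clusters` and the made-up abundance values.

```python
def hierarchical_clusters(rows, n_clusters):
    """Average-linkage agglomerative clustering.

    rows: list of equal-length abundance vectors (one per sample).
    Returns a list of clusters, each a list of row indices.
    """
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    # Start with every sample in its own cluster.
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(dist(rows[i], rows[j])
                            for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        # Merge the closest pair of clusters.
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Made-up abundances of two components for four hypothetical samples.
abundances = [(1.0, 0.2), (1.1, 0.25), (3.0, 2.0), (3.2, 2.1)]
groups = hierarchical_clusters(abundances, n_clusters=2)
```

Stopping the merging at a chosen number of clusters corresponds to cutting the dendrogram at one level. This brute-force version rescans all pairs at every merge and is only suitable for small sample counts such as the 22 cultivars described above.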
